Rate limits

Per-token, fixed-window.

Storylayer rate-limits every token on a fixed 60-second window across both the REST API and the MCP server. Limits are per-token, not per-IP.
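
To make the fixed-window behaviour concrete, here is a minimal in-memory sketch of a per-token counter. It is illustrative only and is not Storylayer's implementation: the class name, the `clock` parameter, and the choice to start each window at a token's first request are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple

WINDOW_SECONDS = 60  # window length stated in the text above


class FixedWindowLimiter:
    """Hypothetical per-token fixed-window counter (in-memory sketch)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # token -> (window start time, requests counted in that window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, token: str, limit: int) -> bool:
        now = self._clock()
        start, count = self._windows.get(token, (now, 0))
        # Assumption: a window opens at the token's first request and
        # is replaced by a fresh one once 60 seconds have elapsed.
        if now - start >= WINDOW_SECONDS:
            start, count = now, 0
        if count >= limit:
            self._windows[token] = (start, count)
            return False
        self._windows[token] = (start, count + 1)
        return True
```

Because the key is the token and nothing else, two clients sharing a token share one budget, while two tokens behind the same IP address are counted separately.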

Defaults

| Token type | Default | Notes |
|---|---|---|
| Personal Access Token (sl_pat_…) | 120 req/min | Issued from the dashboard. |
| OAuth access token (sl_oat_…) | 60 req/min | Issued via /oauth/token to hosted AI tools. |
| Legacy admin key (sl_…) | 600 req/min | Back-compat only; new integrations should use PATs. |

Need higher limits? Per-token overrides are stored on the token row (rate_limit_per_minute) and can be raised on request.
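
The way a default and an override might combine into one effective limit can be sketched as below. This is an assumption-laden illustration: the function name `effective_limit` is hypothetical, and detecting the token type from its prefix is a guess at how the service classifies tokens. Only the prefixes, the default values, and the `rate_limit_per_minute` field name come from the text above.

```python
from typing import Optional

# Defaults from the table above, most specific prefix first so that
# "sl_pat_" and "sl_oat_" are matched before the legacy "sl_" prefix.
DEFAULT_LIMITS = (
    ("sl_pat_", 120),  # Personal Access Token
    ("sl_oat_", 60),   # OAuth access token
    ("sl_", 600),      # Legacy admin key
)


def effective_limit(token: str, rate_limit_per_minute: Optional[int] = None) -> int:
    """Return the per-minute limit, preferring a per-token override."""
    if rate_limit_per_minute is not None:
        return rate_limit_per_minute
    for prefix, limit in DEFAULT_LIMITS:
        if token.startswith(prefix):
            return limit
    raise ValueError("unrecognised token prefix")
```

Whatever the real lookup is, the result is the value reported back in X-RateLimit-Limit.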

Response headers

Every successful response — and every 429 — includes:

| Header | Meaning |
|---|---|
| X-RateLimit-Limit | The token's effective limit for the current window. |
| X-RateLimit-Remaining | Requests left before the next reset. |
| X-RateLimit-Reset | Unix epoch seconds when the window resets. |
| Retry-After | (429 only) Seconds until the next window. |
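
A client can collect these headers into one small structure. The sketch below assumes the headers arrive as a plain string-to-string mapping; `RateLimitInfo` and `parse_rate_limit` are hypothetical names, not part of any Storylayer SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_epoch: int            # X-RateLimit-Reset (Unix epoch seconds)
    retry_after: Optional[int]  # Retry-After, present on 429 responses only


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry = lowered.get("retry-after")
    return RateLimitInfo(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
        retry_after=int(retry) if retry is not None else None,
    )
```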

429 response

HTTP/1.1 429 Too Many Requests
Retry-After: 14
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1745948240

{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded (120/min). Retry after 14s.",
    "limit": 120,
    "retry_after_seconds": 14
  }
}
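
One way a client might turn a response like this into a wait time is sketched below. The function name `retry_delay_seconds` and the one-second fallback are assumptions. The sketch prefers the Retry-After header and falls back to the body's `retry_after_seconds` field.

```python
import json
from typing import Mapping


def retry_delay_seconds(headers: Mapping[str, str], body: str,
                        default: float = 1.0) -> float:
    """Pick a wait time from a 429 response (illustrative sketch)."""
    # Assumes exact-case header keys and a Retry-After value in seconds,
    # as in the example response above.
    header_value = headers.get("Retry-After")
    if header_value is not None:
        return float(header_value)
    try:
        error = json.loads(body)["error"]
        if error["code"] == "rate_limited":
            return float(error["retry_after_seconds"])
    except (ValueError, KeyError, TypeError):
        pass  # malformed or unexpected body: fall through to the default
    return default
```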

Strategy

  • Read X-RateLimit-Remaining on every response and back off when it gets close to zero.
  • Honour Retry-After. Don't poll faster than the header says — bursts of 429s slow your token down further.
  • Spread bursts across more than one second. The window is fixed, so only the total per 60-second window counts: on a 120 req/min token, a 60-request burst at :00 followed by 60 more at :01 stays within the limit.
  • For agents that retry on errors, surface 429s with their Retry-After so the agent waits instead of retrying immediately.
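
The points above can be combined into a small client-side wrapper. This is a sketch under stated assumptions, not an official client: `send` is a hypothetical callable that performs one request and returns `(status, headers)`, headers are a dict with the exact-case names shown earlier, and the one-second fallback and five-attempt cap are arbitrary choices.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers)


def call_with_rate_limit(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    now: Callable[[], float] = time.time,
) -> Response:
    for _ in range(max_attempts):
        status, headers = send()
        if status == 429:
            # Honour Retry-After instead of retrying immediately.
            sleep(float(headers.get("Retry-After", 1)))
            continue
        # Back off proactively once the window's budget is spent.
        if int(headers.get("X-RateLimit-Remaining", 1)) == 0:
            wait = int(headers.get("X-RateLimit-Reset", 0)) - now()
            if wait > 0:
                sleep(wait)
        return status, headers
    raise RuntimeError("still rate limited after retries")
```

Injecting `sleep` and `now` keeps the wrapper testable and lets an agent framework substitute its own scheduler for a blocking sleep.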

Failure mode

Rate limiting is best-effort. If the counter store is briefly unavailable, requests are allowed rather than denied so a transient database hiccup never takes the API down. Counters resume on the next request.
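
The fail-open behaviour described here can be sketched as follows. The exception type `CounterStoreUnavailable`, the `increment` callable, and the function name are hypothetical stand-ins for whatever the real counter store exposes.

```python
from typing import Callable


class CounterStoreUnavailable(Exception):
    """Hypothetical error raised when the counter store cannot be reached."""


def is_allowed(increment: Callable[[str], int], token: str, limit: int) -> bool:
    """Return True if the request may proceed.

    `increment` is assumed to bump the token's counter for the current
    window and return the new count.
    """
    try:
        count = increment(token)
    except CounterStoreUnavailable:
        # Fail open: a store outage must not block API traffic.
        return True
    return count <= limit
```

The trade-off is that a token can briefly exceed its limit while the store is down, so clients should not treat the limit as a hard guarantee.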