Rate limits

Per-token, fixed-window.

Storylayer rate-limits every token on a fixed 60-second window across both the REST API and the MCP server. Limits are per-token, not per-IP.
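To make the fixed-window model concrete, here is a minimal sketch of a per-token counter. It illustrates the general technique and is not Storylayer's implementation; the class name `FixedWindowLimiter` is hypothetical, and aligning windows to multiples of 60 seconds since the Unix epoch is an assumption the text above does not confirm.

```python
import time
from collections import defaultdict
from typing import Optional

WINDOW_SECONDS = 60  # window length stated above


class FixedWindowLimiter:
    """Illustrative per-token fixed-window counter (hypothetical, not Storylayer's code)."""

    def __init__(self, limit_per_minute: int):
        self.limit = limit_per_minute
        # (token, window index) -> number of requests seen in that window
        self.counts = defaultdict(int)

    def allow(self, token: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Assumption: windows are aligned to multiples of 60 s since the epoch.
        window = int(now // WINDOW_SECONDS)
        key = (token, window)
        if self.counts[key] >= self.limit:
            return False  # the caller would answer with a 429
        self.counts[key] += 1
        return True
```

Because the key includes the token and never the client address, two tokens used from the same IP are counted independently, which is what "per-token, not per-IP" implies.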

Defaults

Token type	Default	Notes
Personal Access Token (`sl_pat_…`)	120 req/min	Issued from the dashboard.
OAuth access token (`sl_oat_…`)	60 req/min	Issued via `/oauth/token` to hosted AI tools.
Legacy admin key (`sl_…`)	600 req/min	Back-compat only; new integrations should use PATs.

Need higher limits? Per-token overrides are stored on the token row (`rate_limit_per_minute`) and can be raised on request.

Response headers

Every successful response — and every 429 — includes:

Header	Meaning
`X-RateLimit-Limit`	The token's effective limit for the current window.
`X-RateLimit-Remaining`	Requests left before the next reset.
`X-RateLimit-Reset`	Unix epoch seconds when the window resets.
`Retry-After`	(429 only) seconds until the next window.
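A client might collect these headers into one small structure. The sketch below assumes the headers arrive as a plain string-to-string mapping; `RateLimitState` and `parse_rate_limit_headers` are hypothetical names introduced for illustration.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_epoch: int            # X-RateLimit-Reset (Unix epoch seconds)
    retry_after: Optional[int]  # Retry-After, present on 429 responses only

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        now = time.time() if now is None else now
        return max(0.0, self.reset_epoch - now)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitState:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = lowered.get("retry-after")
    return RateLimitState(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```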

429 response

HTTP/1.1 429 Too Many Requests
Retry-After: 14
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1745948240

{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded (120/min). Retry after 14s.",
    "limit": 120,
    "retry_after_seconds": 14
  }
}
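
The wait time appears both in the `Retry-After` header and in the body's `retry_after_seconds` field. A minimal sketch of reading it, preferring the header and falling back to the body; the function name and the 1-second default are assumptions made for this example, not documented behaviour.

```python
import json
from typing import Mapping


def retry_delay_from_429(
    headers: Mapping[str, str], body: str, default: float = 1.0
) -> float:
    """Return how many seconds to wait after a 429 (hypothetical helper)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if "retry-after" in lowered:
        try:
            return float(lowered["retry-after"])
        except ValueError:
            pass  # unparseable header: fall through to the JSON body
    try:
        payload = json.loads(body)
        return float(payload["error"]["retry_after_seconds"])
    except (ValueError, KeyError, TypeError):
        # Body missing, malformed, or not shaped as expected.
        return default  # assumed fallback, not specified by the docs
```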

Strategy

Read `X-RateLimit-Remaining` on every response and back off when it gets close to zero.
Honour `Retry-After`. Don't retry sooner than the header says; bursts of 429s slow your token down further.
Spread bursts across more than one second. The limit is fixed-window, so only the total within each 60-second window counts: on a 120 req/min token, 60 requests at :00 followed by 60 at :01 stays within the limit.
For agents that retry on errors, surface 429s with their `Retry-After` so the agent waits instead of retrying immediately.
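
The points above can be combined into one client-side wrapper. This is a sketch under stated assumptions: `call_with_rate_limit` is a hypothetical helper, `send` is any callable you supply that performs the request and returns `(status, headers, body)`, and the pacing rule (divide the time left in the window by the requests left) is one possible way to "back off", not a documented recommendation.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def call_with_rate_limit(
    send: Callable[[], Response],
    max_attempts: int = 5,
    low_water_mark: int = 5,  # assumed threshold for "close to zero"
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> Response:
    for _ in range(max_attempts):
        status, headers, body = send()
        h = {k.lower(): v for k, v in headers.items()}

        if status == 429:
            # Honour Retry-After instead of retrying immediately.
            sleep(float(h.get("retry-after", 1)))
            continue

        remaining = h.get("x-ratelimit-remaining")
        reset = h.get("x-ratelimit-reset")
        if (
            remaining is not None
            and reset is not None
            and int(remaining) <= low_water_mark
        ):
            # Pace the next call: spread what is left over the rest of the window.
            time_left = max(0.0, int(reset) - clock())
            sleep(time_left / (int(remaining) + 1))
        return status, headers, body

    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")
```

Injecting `sleep` and `clock` keeps the wrapper testable without real delays; an agent framework could pass its own scheduler in their place.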

Failure mode

Rate limiting is best-effort and fails open. If the counter store is briefly unavailable, requests are allowed rather than denied, so a transient database hiccup never takes the API down. Counting resumes on the next request that reaches the store.
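
The fail-open behaviour can be sketched as follows. This illustrates the general pattern only: `CounterStoreUnavailable`, `is_allowed`, and the `increment` callable are hypothetical names, and the real service may detect store failures differently.

```python
from typing import Callable


class CounterStoreUnavailable(Exception):
    """Hypothetical error raised when the counter store cannot be reached."""


def is_allowed(increment: Callable[[], int], limit: int) -> bool:
    """Increment the token's counter for this window and compare it with the limit.

    `increment` is assumed to return the request count after incrementing.
    """
    try:
        count = increment()
    except CounterStoreUnavailable:
        # Fail open: a store outage lets the request through instead of blocking it.
        return True
    return count <= limit
```

The trade-off is that limits are not enforced during an outage, so clients should not rely on the limiter as a hard guarantee.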