Rate limits
Per-token, fixed-window.
Storylayer rate-limits every token on a fixed 60-second window across both the REST API and the MCP server. Limits are per-token, not per-IP.
Defaults
| Token type | Default | Notes |
|---|---|---|
| Personal Access Token (`sl_pat_…`) | 120 req/min | Issued from the dashboard. |
| OAuth access token (`sl_oat_…`) | 60 req/min | Issued via `/oauth/token` to hosted AI tools. |
| Legacy admin key (`sl_…`) | 600 req/min | Back-compat only; new integrations should use PATs. |
Need higher limits? Per-token overrides are stored on the token row (`rate_limit_per_minute`) and can be raised on request.
Response headers
Every successful response — and every 429 — includes:
| Header | Meaning |
|---|---|
| `X-RateLimit-Limit` | The token's effective limit for the current window. |
| `X-RateLimit-Remaining` | Requests left before the next reset. |
| `X-RateLimit-Reset` | Unix epoch seconds when the window resets. |
| `Retry-After` | (429 only) Seconds until the next window. |
429 response
```
HTTP/1.1 429 Too Many Requests
Retry-After: 14
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1745948240

{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded (120/min). Retry after 14s.",
    "limit": 120,
    "retry_after_seconds": 14
  }
}
```

Strategy
- Read `X-RateLimit-Remaining` on every response and back off as it approaches zero.
- Honour `Retry-After`. Don't poll faster than the header says; bursts of 429s slow your token down further.
- Spread bursts across more than one second; the bucket is fixed-window, so a 60-request burst at `:00` followed by 60 more at `:01` never trips the limit.
- For agents that retry on errors, surface 429s with their `Retry-After` so the agent waits instead of retrying immediately.
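A minimal retry loop that follows this strategy might look like the sketch below. The `send` callable is a hypothetical stand-in for your HTTP client call; any object exposing `.status_code` and `.headers` works:

```python
import time

def request_with_backoff(send, max_attempts: int = 5):
    """Call `send()` until it succeeds, sleeping for Retry-After on 429s.

    `send` is a zero-argument callable returning a response-like object
    with `.status_code` and `.headers` (illustrative, not a real SDK).
    """
    for attempt in range(max_attempts):
        resp = send()
        if resp.status_code != 429:
            return resp
        # Honour Retry-After rather than hammering the current window.
        wait = int(resp.headers.get("Retry-After", 1))
        time.sleep(wait)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Surfacing the raised error (rather than silently looping) gives agent frameworks a clear signal to wait instead of retrying immediately.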
Failure mode
Rate limiting is best-effort. If the counter store is briefly unavailable, requests are allowed rather than denied so a transient database hiccup never takes the API down. Counters resume on the next request.
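The fail-open behaviour amounts to one try/except around the counter lookup. In this sketch, `store_incr` is a hypothetical counter-store call that increments and returns the token's count for the current window, raising if the store is unavailable; it is not a real Storylayer API:

```python
def check_rate_limit(store_incr, token: str, limit: int) -> bool:
    """Return True if the request should be allowed.

    Fail-open: if the counter store errors, allow the request so a
    transient store outage never takes the API down with it.
    """
    try:
        count = store_incr(token)
    except Exception:
        return True  # counter store unavailable: allow rather than deny
    return count <= limit
```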