
Rate limits

Per-plan request and daily token caps, and how throttling is signalled.

Foundation Machines applies two complementary limits to every API key: a sliding-window cap on requests per minute (RPM) and a daily cap on total tokens. The RPM window slides continuously; the daily token counter resets at 00:00 UTC.

Per-plan limits

Plan         RPM   Daily tokens
free           5         50,000
pro           20        500,000
team          60      3,000,000
enterprise   120     10,000,000

These limits apply to all gateway endpoints: /v1/chat/completions, /v1/completions, and /v1/audit share the same per-key budget.
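Because the two caps interact, it can help to estimate how quickly sustained traffic would use up the daily budget. The sketch below copies the plan figures from the table; the helper name minutes_to_exhaust and the avg_tokens_per_request parameter are illustrative assumptions, not part of the API.

```python
# Per-plan caps copied from the table above: (requests per minute, daily tokens).
PLAN_LIMITS = {
    "free": (5, 50_000),
    "pro": (20, 500_000),
    "team": (60, 3_000_000),
    "enterprise": (120, 10_000_000),
}


def minutes_to_exhaust(plan: str, avg_tokens_per_request: int) -> float:
    """Minutes of sustained max-RPM traffic before the daily token cap is reached.

    avg_tokens_per_request is a hypothetical workload figure, not an API value.
    """
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    rpm, daily_tokens = PLAN_LIMITS[plan]
    # Whole requests that fit in the daily token budget.
    requests_in_budget = daily_tokens // avg_tokens_per_request
    return requests_in_budget / rpm
```

For example, minutes_to_exhaust("pro", 1000) returns 25.0: 500 requests fit in the daily budget, and at 20 RPM they are spent in 25 minutes.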

Reset cadence

  • RPM: 60-second sliding window. The counter decays continuously.
  • Daily tokens: 00:00 UTC. The day boundary is fixed regardless of your timezone.
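A client can mirror both reset rules locally to avoid sending requests that would be throttled. The following is a minimal sketch, assuming the server's sliding window can be approximated by a log of request timestamps; the class and function names are hypothetical, and the server's own counter remains authoritative.

```python
import time
from collections import deque
from datetime import datetime, timedelta, timezone
from typing import Optional


class SlidingWindowLimiter:
    """Client-side approximation of a requests-per-window cap."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of requests still inside the window

    def _evict(self, now: float) -> None:
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Record a request and return True if it fits under the cap."""
        now = time.monotonic() if now is None else now
        self._evict(now)
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False

    def seconds_until_slot(self, now: Optional[float] = None) -> float:
        """Seconds until the oldest request leaves the window (0.0 if a slot is free)."""
        now = time.monotonic() if now is None else now
        self._evict(now)
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])


def seconds_until_daily_reset(now: Optional[datetime] = None) -> float:
    """Seconds until the next 00:00 UTC boundary, whatever timezone `now` carries."""
    now = datetime.now(timezone.utc) if now is None else now.astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()
```

A caller on the free plan might create SlidingWindowLimiter(5) and sleep for seconds_until_slot() whenever try_acquire() returns False.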

When you hit a limit

When throttled, you receive an HTTP 429 response with one of two error.type values:

  • rate_limit: your per-minute request rate exceeded the plan cap. Back off briefly.
  • quota_exceeded: your daily token budget is exhausted. Upgrade or wait for the UTC reset.

Both responses include a human-readable error.message describing the cap that was hit. The response body shape matches OpenAI's error format so existing SDK error handling works unchanged.
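The two error.type values call for different client reactions, so it is worth branching on them explicitly. The sketch below assumes the body follows the nested {"error": {"type": ..., "message": ...}} shape of OpenAI's error format; the function names, return labels, and backoff constants are illustrative choices, not values specified by the service.

```python
import json


def classify_throttle(status_code: int, body: str) -> str:
    """Map a response to 'not_throttled', 'retry_soon', 'wait_for_reset', or 'unknown_429'."""
    if status_code != 429:
        return "not_throttled"
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    # Defensive lookups: tolerate a missing or malformed error object.
    error = payload.get("error") if isinstance(payload, dict) else None
    error_type = error.get("type") if isinstance(error, dict) else None
    if error_type == "rate_limit":
        return "retry_soon"  # per-minute cap: back off briefly
    if error_type == "quota_exceeded":
        return "wait_for_reset"  # daily budget spent: retrying before 00:00 UTC will not help
    return "unknown_429"


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff in seconds; base and cap are assumed values."""
    return min(cap, base * (2 ** attempt))
```

On "retry_soon" a client could sleep for backoff_delay(attempt) and try again; on "wait_for_reset" it should stop retrying until the UTC reset or a plan upgrade.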

Checking your usage

Call GET /v1/usage/me at any time to see your current usage and the remaining headroom for the day. This endpoint is not rate-limited.
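A usage check could be scripted with the standard library as sketched below. The base URL is a placeholder and the Bearer authorization header is an assumption, since neither is stated here; the response fields are also undocumented in this section, so the sketch returns the parsed JSON without interpreting it.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical placeholder; substitute the real gateway host


def build_usage_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /v1/usage/me request without sending it."""
    return urllib.request.Request(
        f"{base_url}/v1/usage/me",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )


def fetch_usage(api_key: str, base_url: str = BASE_URL, timeout: float = 10.0):
    """Send the request and return the decoded JSON body as-is."""
    request = build_usage_request(api_key, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the helper easy to inspect or test without contacting the service.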