Rate Limits

The Voka API uses a sliding-window rate limit, applied at two scopes:

Bucket	Limit	Scope
Per customer	60 requests / minute	Aggregated across all your API keys
Per API key	60 requests / minute	Each key has its own bucket

Both buckets are checked on every request. Your available budget is the lower of the two remaining counts.
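To make the "lower of the two" rule concrete, here is a minimal sketch of a two-scope sliding-window check. It is an illustration only, not Voka's server implementation: the `SlidingWindow` class, the `check` function, and the timestamp-log approach are all assumptions chosen for clarity.

```python
from collections import deque

WINDOW_SECONDS = 60.0
LIMIT = 60  # both scopes use 60 requests / minute per the table above


class SlidingWindow:
    """Hypothetical sliding-window log: one timestamp per accepted request."""

    def __init__(self, limit=LIMIT, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self.hits = deque()

    def remaining(self, now):
        # Drop timestamps that have slid out of the window.
        while self.hits and now - self.hits[0] >= self.window:
            self.hits.popleft()
        return self.limit - len(self.hits)

    def record(self, now):
        self.hits.append(now)


def check(customer, key, now):
    """Return (allowed, remaining) after consulting both buckets."""
    available = min(customer.remaining(now), key.remaining(now))
    if available <= 0:
        return False, 0
    customer.record(now)
    key.record(now)
    return True, available - 1


# Two keys share one customer bucket. key_a sends 40 requests, one per second.
customer = SlidingWindow()
key_a, key_b = SlidingWindow(), SlidingWindow()
for second in range(40):
    check(customer, key_a, now=float(second))

# key_b is untouched, but the shared customer bucket has only 20 left,
# so this call is allowed and reports 19 remaining.
allowed, remaining = check(customer, key_b, now=40.0)
```

Under these assumptions, a fresh key never sees more budget than the customer bucket has left, which matches the value reported in `X-RateLimit-Remaining`.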

Why two buckets?

The per-customer ceiling protects against a runaway integration consuming all your throughput. The per-key bucket lets you isolate a leaked or abusive key without throttling the rest of your traffic.

Headers

Every response — successful and 429 — carries:

Header	Value
`X-RateLimit-Limit`	Total requests allowed in the window (60)
`X-RateLimit-Remaining`	Lower of (per-customer remaining, per-key remaining)
`X-RateLimit-Reset`	ISO-8601 timestamp when the bucket refills

On 429 responses we additionally send:

Header	Value
`Retry-After`	Integer seconds until you can retry

Zapier's runtime auto-honors Retry-After on 429. n8n and Make wrappers should manually back off using this header.
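For wrappers that back off manually, the headers above can be turned into a wait time roughly as follows. This is a sketch under assumptions: `retry_delay_seconds` and `call_with_retry` are hypothetical helper names, `send` stands in for whatever HTTP call your wrapper makes and is assumed to return a `(status, headers, body)` tuple, and `headers` is assumed to be a plain dict keyed with the exact header casing shown in the tables.

```python
import time
from datetime import datetime, timezone


def retry_delay_seconds(headers, now=None):
    """Seconds to wait before the next attempt, derived from response headers."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        # Retry-After is documented as integer seconds.
        return max(0.0, float(int(retry_after)))
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return 0.0
    # Older Python versions reject a trailing "Z" in fromisoformat().
    reset_at = datetime.fromisoformat(reset.replace("Z", "+00:00"))
    if reset_at.tzinfo is None:
        # Assumption: a timestamp without an offset is UTC.
        reset_at = reset_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (reset_at - now).total_seconds())


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it stops returning 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        sleep(retry_delay_seconds(headers))
```

Falling back to `X-RateLimit-Reset` when `Retry-After` is absent is a design choice of this sketch, not documented behavior; on a 429 the `Retry-After` header should always be present.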

429 response shape

{
  "type": "about:blank",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Rate limit exceeded. Retry after the indicated interval.",
  "code": "rate.exceeded",
  "request_id": "..."
}

The code field tells you which bucket tripped:

`code`	Meaning
`rate.exceeded`	Per-customer bucket is empty
`rate.exceeded_key`	Per-key bucket is empty (other keys still have budget)
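A client can branch on the `code` field to decide how to react. The sketch below assumes the 429 body arrives as JSON text in the shape shown above; `tripped_bucket` and `BUCKET_BY_CODE` are hypothetical names introduced for illustration.

```python
import json

# Mapping taken from the table above.
BUCKET_BY_CODE = {
    "rate.exceeded": "customer",
    "rate.exceeded_key": "key",
}


def tripped_bucket(body_text):
    """Return "customer", "key", or None if the body is not a recognised 429."""
    try:
        problem = json.loads(body_text)
    except ValueError:
        # Not valid JSON, so there is nothing to interpret.
        return None
    if not isinstance(problem, dict) or problem.get("status") != 429:
        return None
    code = problem.get("code")
    return BUCKET_BY_CODE.get(code) if isinstance(code, str) else None
```

A result of `"key"` means other keys still have budget, so traffic on a different key can continue. A result of `"customer"` means every key has to wait.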

Best practices

Honor Retry-After. Don't retry sooner: an early retry is rejected again and only consumes more budget.
Spread bursts. Adding 1000 webhook subscriptions at startup will trip the limit. Stagger by 1-2 seconds between calls.
Cache. GET /assistants returns the same data for hours at a time — cache it locally and refresh on demand, not on every action invocation.
Use webhooks instead of polling. If you find yourself polling GET /calls every 10 seconds, subscribe to call.completed instead. Webhook delivery doesn't count against your rate limit.
Multiple keys for different workloads. A key dedicated to bulk ingestion and a separate key for real-time integrations isolate failures across the per-key buckets.
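Two of the practices above, staggering bursts and caching slow-changing data, can be sketched in a few lines. Everything here is illustrative: `paced` and `TTLCache` are hypothetical helpers, the 1.5-second interval is simply a value inside the suggested 1-2 second range, and the one-hour TTL is an assumption based on the "hours at a time" guidance.

```python
import time


def paced(items, interval=1.5, sleep=time.sleep):
    """Yield items one at a time, pausing `interval` seconds between them."""
    for index, item in enumerate(items):
        if index:
            # No pause before the first item.
            sleep(interval)
        yield item


class TTLCache:
    """Hold one fetched value and refetch once it is older than `ttl` seconds."""

    def __init__(self, fetch, ttl=3600.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self, refresh=False):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if refresh or stale:
            # Only this branch spends rate-limit budget.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

In use, a startup routine would loop over `paced(subscriptions)` and make one API call per item, while action handlers would call `cache.get()` on a `TTLCache` whose `fetch` function wraps GET /assistants, passing `refresh=True` only when fresh data is explicitly needed.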

What doesn't count

Webhook deliveries (we send to you, you don't request)
Connection-test calls to /auth/whoami, by contrast, do count toward the limit, but most wrappers call this only once per session

Need higher limits?

Contact support@vokaai.com with:

Your customer ID (from /auth/whoami)
Use case description (volume, burst pattern, peak time of day)
A budget you'd accept for sustained over-quota usage

We typically grant up to 600 requests/minute on commercial plans without architectural changes.
