Rate Limits
The Voka API uses a sliding-window rate limit, applied at two scopes:
| Bucket | Limit | Scope |
|---|---|---|
| Per customer | 60 requests / minute | Aggregated across all your API keys |
| Per API key | 60 requests / minute | Each key has its own bucket |
Both limits are checked on every request. Your effective budget is whichever of the two remaining counts is lower.
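To make the "lower of the two" rule concrete, here is a minimal client-side model of two sliding windows. It is only a sketch for reasoning about your own traffic: the class names, the 60-second window constant, and the timestamp-deque approach are assumptions for illustration, not a description of how the Voka API implements its limiter.

```python
import time
from collections import deque

WINDOW_SECONDS = 60.0
LIMIT = 60  # both limits are 60 requests / minute per the table above


class SlidingWindow:
    """Tracks request timestamps that fall inside the last `window` seconds."""

    def __init__(self, limit=LIMIT, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self.stamps = deque()

    def remaining(self, now):
        # Drop timestamps that have slid out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        return self.limit - len(self.stamps)

    def record(self, now):
        self.stamps.append(now)


class DualLimiter:
    """Hypothetical local model: one customer-wide window plus one per key."""

    def __init__(self):
        self.customer = SlidingWindow()
        self.keys = {}

    def remaining(self, key, now):
        per_key = self.keys.setdefault(key, SlidingWindow())
        # The usable budget is the lower of the two remaining counts.
        return min(self.customer.remaining(now), per_key.remaining(now))

    def try_acquire(self, key, now=None):
        now = time.monotonic() if now is None else now
        if self.remaining(key, now) <= 0:
            return False
        self.customer.record(now)
        self.keys[key].record(now)
        return True
```

In this model, a second key that has barely been used can still be refused, because the customer-wide window is shared by every key.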
Why two buckets?
The per-customer ceiling protects against a runaway integration consuming all your throughput. The per-key bucket lets you isolate a leaked or abusive key without throttling the rest of your traffic.
Headers
Every response, whether successful or a 429, carries these headers:
| Header | Value |
|---|---|
| X-RateLimit-Limit | Total requests allowed in the window (60) |
| X-RateLimit-Remaining | Lower of (per-customer remaining, per-key remaining) |
| X-RateLimit-Reset | ISO-8601 timestamp when the bucket refills |
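As an illustration of reading these three headers, the sketch below parses them from a plain mapping. The `RateLimitInfo` type and both function names are hypothetical. It also assumes the timestamp is either offset-qualified or UTC, and that the mapping uses the exact header casing shown above (real HTTP clients usually offer case-insensitive lookup).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class RateLimitInfo:
    limit: int
    remaining: int
    reset_at: datetime


def parse_rate_limit(headers):
    """Build a RateLimitInfo from a mapping of response headers."""
    raw_reset = headers["X-RateLimit-Reset"]
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalise it to an explicit UTC offset first.
    if raw_reset.endswith("Z"):
        raw_reset = raw_reset[:-1] + "+00:00"
    reset_at = datetime.fromisoformat(raw_reset)
    if reset_at.tzinfo is None:
        # Assumption: a timestamp without an offset is UTC.
        reset_at = reset_at.replace(tzinfo=timezone.utc)
    return RateLimitInfo(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=reset_at,
    )


def seconds_until_reset(info, now=None):
    """Seconds from `now` (default: current UTC time) until the reset."""
    now = now or datetime.now(timezone.utc)
    return max(0.0, (info.reset_at - now).total_seconds())
```

A client could check `remaining` after each call and pause for `seconds_until_reset(info)` once it reaches zero, rather than waiting to receive a 429.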
On 429 responses we additionally send:
| Header | Value |
|---|---|
| Retry-After | Integer seconds until you can retry |
Zapier's runtime auto-honors Retry-After on 429. n8n and Make wrappers should manually back off using this header.
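For wrappers that have to back off manually, one possible shape is sketched below. The `send` callable, the `max_attempts` default, and the 1-second fallback when the header is missing are all assumptions made for illustration. Plug in whichever HTTP client you use.

```python
import time


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning
    (status, headers, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # Retry-After is documented as integer seconds. Fall back to 1s
        # (an assumption) if the header is missing or malformed.
        try:
            delay = int(headers.get("Retry-After", ""))
        except (TypeError, ValueError):
            delay = 1
        sleep(max(delay, 0))
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. If every attempt is rate limited, the final 429 is returned to the caller.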
429 response shape
    {
      "type": "about:blank",
      "title": "Too Many Requests",
      "status": 429,
      "detail": "Rate limit exceeded. Retry after the indicated interval.",
      "code": "rate.exceeded",
      "request_id": "..."
    }
The code field tells you which bucket tripped:
| code | Meaning |
|---|---|
| rate.exceeded | Per-customer bucket is empty |
| rate.exceeded_key | Per-key bucket is empty (other keys still have budget) |
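A client can branch on the two code values in the table above. The sketch below uses a hypothetical function name and return labels, and treats any other code as unknown.

```python
import json


def tripped_bucket(body):
    """Map the code field of a 429 body to the bucket that ran out."""
    problem = json.loads(body)
    code = problem.get("code")
    if code == "rate.exceeded":
        return "customer"  # every key is blocked until the window moves on
    if code == "rate.exceeded_key":
        return "key"  # only this key is out of budget
    return "unknown"
```

On "key", traffic could be routed through another of your keys. On "customer", waiting for the interval in Retry-After is the only option.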
Best practices
- Honor Retry-After. Don't retry sooner; you'll just consume more budget on the next response.
- Spread bursts. Adding 1000 webhook subscriptions at startup will trip the limit. Stagger by 1-2 seconds between calls.
- Cache. GET /assistants returns the same data for hours at a time, so cache it locally and refresh on demand, not on every action invocation.
- Use webhooks instead of polling. If you find yourself polling GET /calls every 10 seconds, subscribe to call.completed instead. Webhook delivery doesn't count against your rate limit.
- Multiple keys for different workloads. A key dedicated to bulk ingestion and a separate key for real-time integrations isolate failures across the per-key buckets.
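Two of the practices above, caching and staggering, can be sketched in a few lines. `TTLCache`, `staggered`, and their parameters are hypothetical helpers. The `fetch` callable stands in for whatever code performs GET /assistants in your integration, and the time-to-live value is yours to choose.

```python
import time


class TTLCache:
    """Caches one fetched value and refetches only after `ttl` seconds."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._expires = None

    def get(self):
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._value = self._fetch()
            self._expires = now + self._ttl
        return self._value

    def invalidate(self):
        # Force the next get() to refetch (refresh on demand).
        self._expires = None


def staggered(calls, delay=1.5, sleep=time.sleep):
    """Run callables one by one, pausing `delay` seconds between them."""
    results = []
    for i, call in enumerate(calls):
        if i:
            sleep(delay)
        results.append(call())
    return results
```

The default delay of 1.5 seconds sits inside the 1-2 second range suggested above. Both helpers accept an injectable clock or sleep function so they can be exercised without waiting.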
What doesn't count
- Webhook deliveries (we send to you, you don't request)
- Connection-test calls to /auth/whoami are not exempt: they count normally, but most wrappers only call this endpoint once per session
Need higher limits?
Contact support@vokaai.com with:
- Your customer ID (from /auth/whoami)
- Use case description (volume, burst pattern, peak time of day)
- A budget you'd accept for sustained over-quota usage
We typically grant up to 600 requests/minute on commercial plans without architectural changes.