How-to

Rate limits

Default limit: 1,000 requests per minute, enforced at the FastAPI middleware layer (RateLimitMiddleware). The limit is tracked per API key, or per IP for unauthenticated routes.

What rate limits apply to the API?

Every authenticated request counts against the same per-key budget, regardless of endpoint. There is no per-route override in v1. If you’re hitting the limit, the issue is almost always a tight loop calling a single-item endpoint where a bulk endpoint exists.
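If a loop can't be replaced by a bulk call, one option is to pace it on the client so it stays under the budget. The sketch below is a sliding-window counter. RequestBudget and tryAcquire are hypothetical names, and the sliding window is an assumption: the server-side algorithm behind RateLimitMiddleware isn't documented here, so treat this as a conservative approximation rather than a mirror of the server.

```typescript
// Client-side sliding-window budget (hypothetical helper, not part of the API).
class RequestBudget {
  private readonly limit: number
  private readonly windowMs: number
  private sent: number[] = [] // timestamps (ms) of requests inside the window

  constructor(limit = 1000, windowMs = 60000) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Returns 0 if a request may be sent at `now` (and records it),
  // otherwise the number of milliseconds to wait before asking again.
  tryAcquire(now: number = Date.now()): number {
    // Drop timestamps that have aged out of the window.
    this.sent = this.sent.filter(t => t > now - this.windowMs)
    if (this.sent.length < this.limit) {
      this.sent.push(now)
      return 0
    }
    const oldest = this.sent[0] ?? now
    return oldest + this.windowMs - now
  }
}

// Tiny limit (2 requests per second) to make the behaviour visible.
const budget = new RequestBudget(2, 1000)
console.log(budget.tryAcquire(0))    // 0
console.log(budget.tryAcquire(10))   // 0
console.log(budget.tryAcquire(20))   // 980
console.log(budget.tryAcquire(1000)) // 0
```

Call tryAcquire before each request and sleep for the returned number of milliseconds when it is non-zero. The 429 handling still applies, because other clients sharing the key spend the same budget.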

When you exceed it

The API responds with 429 Too Many Requests. v1 does not send a Retry-After header, so retry with exponential backoff: wait min(2^n, 60) seconds, where n is the number of consecutive 429s you have received so far.
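To make the formula concrete, here is the wait schedule it produces. backoffSeconds is a hypothetical helper name, and n is assumed to start at 1 for the first 429.

```typescript
// Wait time in seconds after the n-th consecutive 429 (n starts at 1).
function backoffSeconds(n: number, capSeconds = 60): number {
  return Math.min(2 ** n, capSeconds)
}

// Schedule for the first seven consecutive failures.
const schedule = [1, 2, 3, 4, 5, 6, 7].map(n => backoffSeconds(n))
console.log(schedule.join(", ")) // 2, 4, 8, 16, 32, 60, 60
```

The cap is reached on the sixth consecutive failure, since 2^6 = 64 exceeds 60.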

Sample 429 body:


{
  "error": "rate_limited",
  "message": "Too many requests. Please retry after a backoff."
}
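If you want to check the body as well as the status code, a type guard along these lines works. It assumes the body has exactly the two fields shown in the sample; RateLimitBody and isRateLimitBody are hypothetical names. The 429 status remains the primary signal.

```typescript
// Shape of the sample 429 body (assumed complete; real responses may carry more fields).
interface RateLimitBody {
  error: "rate_limited"
  message: string
}

// Type guard: true only when the parsed JSON matches the sample body.
function isRateLimitBody(value: unknown): value is RateLimitBody {
  if (typeof value !== "object" || value === null) return false
  const body = value as Record<string, unknown>
  return body.error === "rate_limited" && typeof body.message === "string"
}

const sample: unknown = JSON.parse(
  '{"error":"rate_limited","message":"Too many requests. Please retry after a backoff."}'
)
console.log(isRateLimitBody(sample)) // true
```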

A reasonable client-side handler:


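// Retries on 429 with exponential backoff: min(2^n, 60) seconds,
// where n is the number of consecutive 429s seen so far.
// Note: this retries indefinitely; add an attempt cap if you need to give up.
async function call(url: string, init: RequestInit, attempt = 0): Promise<Response> {
  const res = await fetch(url, init)
  if (res.status !== 429) return res
  const failures = attempt + 1 // this 429 is consecutive failure number `failures`
  const wait = Math.min(2 ** failures, 60) * 1000 // milliseconds
  await new Promise(r => setTimeout(r, wait))
  return call(url, init, attempt + 1)
}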

MCP tools

MCP tool calls hit the same limit. Each tool call is one request. For high-throughput workflows such as importing leads, prefer the bulk tool (bulk_import_leads) over a loop of single-item calls. Bulk endpoints accept up to 500 items per call and count as one request.
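A minimal sketch of batching against the 500-item cap. chunk and importAll are hypothetical helpers, sendBatch stands in for whatever client function invokes bulk_import_leads (its signature is an assumption), and the lead shape is made up for illustration.

```typescript
// Documented cap: bulk endpoints accept up to 500 items per call.
const MAX_BULK_ITEMS = 500

// Split items into consecutive batches of at most `size` elements.
function chunk<T>(items: readonly T[], size: number = MAX_BULK_ITEMS): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer")
  }
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// Sends every batch sequentially and returns the number of requests spent.
// `sendBatch` is a placeholder for your bulk_import_leads call.
async function importAll<T>(
  items: readonly T[],
  sendBatch: (batch: T[]) => Promise<void>
): Promise<number> {
  const batches = chunk(items)
  for (const batch of batches) {
    await sendBatch(batch) // one request per batch
  }
  return batches.length
}

// Illustrative data only: 10,000 leads fit in 20 bulk calls.
const leads = Array.from({ length: 10000 }, (_, i) => ({ email: `lead${i}@example.com` }))
console.log(chunk(leads).length) // 20
```

Sending batches sequentially keeps the request count predictable: 20 requests for 10,000 leads, against a budget of 1,000 per minute.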

Provider-side limits

When 10ex calls third-party providers (Gmail, Google Ads, OpenAI), those providers enforce their own limits. Long-running agents handle provider 429s internally with retries, so you don’t need to handle them yourself.

Common mistakes

Looping create_lead 10,000 times instead of sending the same leads through bulk_import_leads in 20 calls of 500.
Hammering after a 429 with no backoff. You’ll get 429s back-to-back forever.
Sharing one API key across staging and production traffic. Mint one per environment so each has its own budget.
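For the last point, one way to keep keys separate is to resolve the key from the environment name at startup and fail loudly when it is missing. The variable names API_KEY_STAGING and API_KEY_PRODUCTION and the helper apiKeyFor are hypothetical; substitute whatever your deployment defines.

```typescript
type Environment = "staging" | "production"

// Hypothetical variable names, one per environment.
const KEY_VARS: Record<Environment, string> = {
  staging: "API_KEY_STAGING",
  production: "API_KEY_PRODUCTION",
}

// Looks up the key for `env` in `vars` (for example, process.env in Node).
// Throws instead of silently falling back to another environment's key.
function apiKeyFor(env: Environment, vars: Record<string, string | undefined>): string {
  const name = KEY_VARS[env]
  const key = vars[name]
  if (!key) throw new Error(`Missing ${name}: mint a separate key for ${env}`)
  return key
}

// Example values only.
const exampleVars = { API_KEY_STAGING: "stg-example", API_KEY_PRODUCTION: "prod-example" }
console.log(apiKeyFor("staging", exampleVars)) // stg-example
```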