
Default limits

Plan         Requests per minute    Concurrent requests
Free tier    60                     10
Rate limits are applied per API key. If you need higher limits, contact us.
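
To stay under the per-minute limit in the first place, a client can throttle itself before sending. The following is a minimal sketch, not part of the API: `SlidingWindowLimiter` and its methods are hypothetical names, and it assumes one limiter instance per API key. It allows at most `limit` requests in any 60-second span, which is at least as strict as a limit that resets once per minute.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: at most `limit` requests in any `window` seconds.

    Hypothetical helper for illustration; the server remains the source of truth.
    """

    def __init__(self, limit=60, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self._sent = deque()    # timestamps of requests still inside the window

    def wait_time(self):
        """Return seconds to wait before the next request (0.0 if one can go now)."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        # The oldest request in the window decides when a slot frees up.
        return self.window - (now - self._sent[0])

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self.clock())
```

A caller would sleep for `wait_time()` seconds, send the request, then call `record()`. This sketch does not cover the concurrent request limit, which would need a separate bound such as a semaphore.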

Rate limit headers

When you approach or exceed limits, responses include:
Header                   Description
X-RateLimit-Limit        Maximum requests per minute
X-RateLimit-Remaining    Remaining requests in current window
X-RateLimit-Reset        Unix timestamp when the window resets
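
These headers can be used to decide how long to pause before the next request. The sketch below is illustrative only: `seconds_until_reset` and `is_exhausted` are hypothetical helper names, and it assumes `X-RateLimit-Reset` is expressed in seconds (not milliseconds) and that the headers arrive as a plain mapping of strings with exactly this capitalization.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return seconds until the window resets, or None if the header is absent or invalid.

    Assumes X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    raw = headers.get("X-RateLimit-Reset")
    if raw is None:
        return None
    try:
        reset_at = float(raw)
    except ValueError:
        return None
    current = time.time() if now is None else now
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - current)


def is_exhausted(headers):
    """Return True when the headers report no remaining requests in the window."""
    raw = headers.get("X-RateLimit-Remaining")
    if raw is None:
        return False
    try:
        return int(raw) <= 0
    except ValueError:
        return False
```

For example, when `is_exhausted(headers)` is true, sleeping for `seconds_until_reset(headers)` seconds before the next call avoids a request that would likely be rejected.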

Handling rate limits

When rate limited, you’ll receive a 429 status code with the following response body:
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error"
  }
}
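
In Python, you can catch the error and retry with exponential backoff:
import time
from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://www.opencompress.ai/api/v1",
    api_key="sk-occ-your-key-here",
)

def call_with_retry(messages, max_retries=3):
    """Call the chat completions endpoint, retrying when a 429 is returned."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
            )
        except RateLimitError as err:
            if attempt == max_retries - 1:
                # Out of attempts: fail now instead of sleeping one more time.
                raise RuntimeError("Max retries exceeded") from err
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    raise ValueError("max_retries must be at least 1")
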
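The OpenAI SDK also has built-in retry logic. By default, it retries rate-limited requests up to two times with exponential backoff, and you can change that count with the `max_retries` option when constructing the client.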
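
If many clients share a key, retrying on a fixed 1, 2, 4 second schedule can make them all retry at the same moment. A common remedy is to add random jitter and cap the delay. The sketch below shows one variant ("full jitter"); `backoff_delay` and its parameters are hypothetical, and this is not a description of how the SDK schedules its own retries.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Return a randomized delay in seconds for a 0-indexed retry attempt.

    The delay is drawn uniformly from [0, min(cap, base * 2**attempt)).
    `rng` must return a float in [0, 1); it is injectable for testing.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

In a manual retry loop, `time.sleep(backoff_delay(attempt))` would take the place of `time.sleep(2 ** attempt)`.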