Rate limiting refers to the constraints our API enforces on how frequently a user or client can access our services within a given timeframe. Requests that exceed a rate limit are rejected with HTTP status code 429.
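A common way for clients to handle a 429 is to retry with exponential backoff and jitter. The following is a minimal sketch rather than an official client: `send` is a hypothetical zero-argument callable that performs one request and returns a `(status_code, body)` pair, and the retry count and delay values are illustrative assumptions, not documented recommendations.

```python
import random
import time


def call_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-429 status or retries run out.

    `send` is a hypothetical callable returning (status_code, body).
    `sleep` is injectable so the function can be exercised without waiting.
    """
    attempt = 0
    while True:
        status, body = send()
        # Return on success, on any non-rate-limit error, or when out of retries.
        if status != 429 or attempt >= max_retries:
            return status, body
        # Exponential backoff with full jitter: wait a random amount of time
        # between 0 and base_delay * 2^attempt seconds, capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, delay))
        attempt += 1
```

Jitter spreads retries from many clients over time so they do not all hit the API again at the same instant.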
What is the purpose of rate limits?
Rate limits are standard practice for APIs. They safeguard against abuse or misuse and help ensure equitable access with consistent performance.
Tier-based rate limits
We now offer tiered rate limits based on consistent spend on the platform.
You can view your rate limit by navigating to Settings > Billing. As your usage of and spend on the Together API increase, we will automatically increase your rate limits.
Tier | Qualification criteria | Chat, language & code | Embeddings | Image |
Free | User must be in an allowed geography | 60 RPM | 3,000 RPM | 1 IMG |
Tier 1 | Valid credit card added | 600 RPM | 3,000 RPM | 5 IMG |
Tier 2 | $50 paid | 1,800 RPM | 5,000 RPM | 10 IMG |
Tier 3 | $100 paid | 3,000 RPM | 5,000 RPM | 15 IMG |
Tier 4 | $250 paid | 4,500 RPM | 10,000 RPM | 20 IMG |
Tier 5 | $1,000 paid | 6,000 RPM | 10,000 RPM | 50 IMG |
Note: Spend counts as "paid" only once it has been invoiced and charged to your credit card, NOT when the spend is incurred.
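The tier table can be read as a simple lookup from paid spend to a chat rate limit. The sketch below encodes the thresholds from the table. It assumes that the dollar amounts are inclusive lower bounds on total paid spend, which the table does not state explicitly, and it does not model the geography requirement for the Free tier. The names `TIERS` and `tier_for` are hypothetical.

```python
# (tier name, minimum paid spend in USD, chat/language/code RPM),
# highest tier first. Values are copied from the tier table.
TIERS = [
    ("Tier 5", 1000, 6000),
    ("Tier 4", 250, 4500),
    ("Tier 3", 100, 3000),
    ("Tier 2", 50, 1800),
]


def tier_for(paid_usd, has_card):
    """Return (tier name, chat RPM) for an account.

    Assumption: thresholds are inclusive and refer to total paid spend.
    """
    for name, min_paid, chat_rpm in TIERS:
        if paid_usd >= min_paid:
            return name, chat_rpm
    # Below $50 paid: Tier 1 needs a valid credit card, otherwise Free.
    if has_card:
        return "Tier 1", 600
    return "Free", 60
```

For example, under these assumptions an account with a card on file and $120 paid falls in Tier 3, even if its incurred but not yet invoiced spend is higher.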
Rate limits in headers
Field | Description |
x-ratelimit-limit | The maximum number of requests that are permitted before exhausting the rate limit. |
x-ratelimit-remaining | The remaining number of requests that are permitted before exhausting the rate limit. |
x-ratelimit-reset | The time until the rate limit (based on requests) resets to its initial state. |
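A client can read these headers to decide whether to pause before sending the next request. The sketch below is one possible approach, not an official helper. It assumes that `x-ratelimit-reset` is a number of seconds, which the table above does not specify, and that the headers arrive as a plain string-to-string mapping. The function names are hypothetical.

```python
def parse_rate_limit(headers):
    """Extract the three rate limit fields from a response header mapping.

    Header names are matched case-insensitively. Missing or malformed
    values are returned as None.
    """
    lower = {key.lower(): value for key, value in headers.items()}

    def read(name, cast):
        try:
            return cast(lower[name])
        except (KeyError, ValueError):
            return None

    return {
        "limit": read("x-ratelimit-limit", int),
        "remaining": read("x-ratelimit-remaining", int),
        # Assumption: the reset value is expressed in seconds.
        "reset_seconds": read("x-ratelimit-reset", float),
    }


def seconds_to_wait(info):
    """Return how long to pause before the next request.

    If requests remain in the current window, no wait is needed. Otherwise
    wait for the reset interval, or 0.0 if it is unknown.
    """
    if info["remaining"] is not None and info["remaining"] > 0:
        return 0.0
    return info["reset_seconds"] or 0.0
```

Checking `x-ratelimit-remaining` before each request lets a client slow down ahead of time instead of waiting to receive a 429.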
If you're interested in a higher rate limit immediately, contact our sales team to ask about credit packages or fill out this form.