Rate limits

1 query per second (QPS) for free users, 10 QPS for paid users.

Updated today

Rate limiting refers to the constraints our API enforces on how frequently a user or client can access our services within a given timeframe. When a request exceeds your rate limit, the API responds with HTTP status code 429.
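A common way for a client to handle a 429 is to wait and retry with exponential backoff. The following is a minimal sketch, not an official client: `call_with_backoff` and its `send` parameter are hypothetical names, and `send` stands in for whatever function performs your request and returns a `(status_code, body)` pair. The retry count and delays are illustrative assumptions, not documented values.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry with exponential backoff while it returns HTTP 429.

    `send` is a hypothetical callable that performs one request and returns
    (status_code, body). The defaults here are assumptions, not documented values.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        # Return on any non-429 response, or when retries are used up.
        if status != 429 or attempt == max_retries:
            return status, body
        # Double the wait on each attempt and add random jitter so that
        # many clients do not all retry at the same moment.
        delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        sleep(delay)
    raise ValueError("max_retries must be >= 0")
```

If the last attempt still returns 429, the function hands that response back to the caller rather than raising, so the caller decides whether to give up or queue the work for later.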

What is the purpose of rate limits?

Rate limits are standard practice for APIs. They safeguard against abuse or misuse and help ensure equitable access with consistent performance for all users.

Tier-based rate limits

We now offer rate limits that scale with your spend on the platform.

You can view your current rate limit by navigating to Settings > Billing. As your usage of and spend on the Together API increase, we automatically increase your rate limits.

| Tier | Qualification criteria | Chat, language & code | Embeddings | Image |
|---|---|---|---|---|
| Free | User must be in an allowed geography | 60 RPM | 3,000 RPM | 1 IMG |
| Tier 1 | Valid credit card added | 600 RPM | 3,000 RPM | 5 IMG |
| Tier 2 | $50 paid | 1,800 RPM | 5,000 RPM | 10 IMG |
| Tier 3 | $100 paid | 3,000 RPM | 5,000 RPM | 15 IMG |
| Tier 4 | $250 paid | 4,500 RPM | 10,000 RPM | 20 IMG |
| Tier 5 | $1,000 paid | 6,000 RPM | 10,000 RPM | 50 IMG |

Note: Spend counts as "paid" only once it has been invoiced and charged to your credit card, NOT when the spend is incurred.
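To stay under a requests-per-minute (RPM) limit, a client can space its requests evenly: the minimum gap between requests is 60 / RPM seconds, so 60 RPM allows one request per second and 600 RPM allows one every 0.1 seconds. The sketch below uses the chat, language and code limits from the table above. `CHAT_RPM`, `min_interval_seconds`, and `Pacer` are hypothetical names, and even spacing is a conservative assumption, since this page does not say how the server measures its window.

```python
import time
from typing import Callable

# Chat, language & code limits in requests per minute, copied from the tier table.
CHAT_RPM = {
    "Free": 60,
    "Tier 1": 600,
    "Tier 2": 1800,
    "Tier 3": 3000,
    "Tier 4": 4500,
    "Tier 5": 6000,
}


def min_interval_seconds(rpm: int) -> float:
    """Smallest gap between requests that keeps an evenly paced client at or under rpm."""
    return 60.0 / rpm


class Pacer:
    """Blocks in wait() so that consecutive calls are at least one interval apart."""

    def __init__(
        self,
        rpm: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = min_interval_seconds(rpm)
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()  # the first request may go out immediately

    def wait(self) -> None:
        now = self._clock()
        if now < self._next_allowed:
            # Too early: sleep until the next permitted send time.
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

A client would call `pacer.wait()` immediately before each request, for example with `pacer = Pacer(CHAT_RPM["Tier 1"])`.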

Rate limits in headers

| Field | Description |
|---|---|
| x-ratelimit-limit | The maximum number of requests that are permitted before exhausting the rate limit. |
| x-ratelimit-remaining | The remaining number of requests that are permitted before exhausting the rate limit. |
| x-ratelimit-reset | The time until the rate limit (based on requests) resets to its initial state. |
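A client can read these headers from each response to see how much of its limit is left. The following is a rough sketch under stated assumptions: `read_rate_limit` and `RateLimitStatus` are hypothetical names, the header values are assumed to be plain numbers, and the unit of `x-ratelimit-reset` is assumed to be seconds, which this page does not confirm. Any header that is missing or not numeric is returned as `None`.

```python
from typing import Callable, Mapping, NamedTuple, Optional


class RateLimitStatus(NamedTuple):
    limit: Optional[int]       # x-ratelimit-limit
    remaining: Optional[int]   # x-ratelimit-remaining
    reset: Optional[float]     # x-ratelimit-reset (unit assumed to be seconds)


def read_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    """Extract the rate limit headers from a response's header mapping."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def parse(name: str, cast: Callable[[str], float]):
        raw = lowered.get(name)
        if raw is None:
            return None
        try:
            return cast(raw)
        except ValueError:
            # Assumption violated: the value was not a plain number.
            return None

    return RateLimitStatus(
        limit=parse("x-ratelimit-limit", int),
        remaining=parse("x-ratelimit-remaining", int),
        reset=parse("x-ratelimit-reset", float),
    )
```

For example, when `remaining` reaches 0, a client could pause for roughly `reset` seconds before sending the next request, provided the unit assumption holds for your responses.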

If you're interested in a higher rate limit immediately, contact our sales team to ask about credit packages or fill out this form.
