To keep performance consistent for all customers, Snowflake limits usage of Cortex LLM functions. Requests that exceed these limits may be throttled. The limits may change over time. The table below lists the current limits, which apply per account.
Function (Model) | Tokens processed per minute (TPM) | Rows processed per minute (RPM) |
---|---|---|
COMPLETE (mistral-large) | 200,000 | 100 |
COMPLETE (mixtral-8x7b) | 300,000 | 400 |
COMPLETE (llama2-70b-chat) | 300,000 | 400 |
COMPLETE (mistral-7b) | 300,000 | 500 |
COMPLETE (gemma-7b) | 300,000 | 500 |
EXTRACT_ANSWER | 1,000,000 | 3,000 |
SENTIMENT | 1,000,000 | 5,000 |
SUMMARIZE | 300,000 | 500 |
TRANSLATE | 1,000,000 | 2,000 |
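
To get a feel for what these limits mean for a batch job, the arithmetic can be sketched in a few lines of Python. This is only a rough planning aid, not a Snowflake API: the `LIMITS` dictionary and the `min_minutes` helper are hypothetical names introduced here, the figures are copied from the table above, and the calculation assumes both limits act as simple per-minute ceilings. The average token count per row is an estimate you supply, and how throttling behaves in practice is not specified here.

```python
import math

# Per-account limits copied from the table above:
# (tokens processed per minute, rows processed per minute).
# These values may change over time.
LIMITS = {
    "COMPLETE (mistral-large)": (200_000, 100),
    "COMPLETE (mixtral-8x7b)": (300_000, 400),
    "COMPLETE (llama2-70b-chat)": (300_000, 400),
    "COMPLETE (mistral-7b)": (300_000, 500),
    "COMPLETE (gemma-7b)": (300_000, 500),
    "EXTRACT_ANSWER": (1_000_000, 3_000),
    "SENTIMENT": (1_000_000, 5_000),
    "SUMMARIZE": (300_000, 500),
    "TRANSLATE": (1_000_000, 2_000),
}


def min_minutes(function: str, rows: int, avg_tokens_per_row: int) -> int:
    """Estimate the fewest whole minutes needed to stay within both limits.

    Assumes (hypothetically) that TPM and RPM are independent per-minute
    ceilings, so the tighter of the two determines the duration.
    """
    tpm, rpm = LIMITS[function]
    minutes_for_rows = math.ceil(rows / rpm)
    minutes_for_tokens = math.ceil(rows * avg_tokens_per_row / tpm)
    return max(minutes_for_rows, minutes_for_tokens)


# 2,000 rows of about 400 tokens each through SUMMARIZE:
# the row limit (500 RPM) is the bottleneck here, not the token limit.
print(min_minutes("SUMMARIZE", 2_000, 400))  # → 4
```

With long inputs the token limit tends to dominate instead: 100 rows of roughly 5,000 tokens each through COMPLETE (mistral-large) fit within the row limit in one minute but need three minutes under the token limit.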
Note
On-demand Snowflake accounts without a valid payment method (such as trial accounts) are limited to roughly one credit per day of Snowflake Cortex LLM function usage.