This is a list of services that provide free access to, or credits for, API-based LLM usage.
> [!NOTE]
> Please don’t abuse these services, or we might lose them.

> [!WARNING]
> This list explicitly excludes services that are not legitimate (e.g. ones that reverse-engineer an existing chatbot).
Limits:
- 20 requests/minute
- 50 requests/day
- 1,000 requests/day with a $10 credit balance
Models share a common quota.
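To stay under limits like the ones above, a client can track its own request timestamps. The following is a minimal sketch of a sliding-window limiter, assuming the provider counts requests over rolling windows (it may use fixed windows instead). The class name `MultiWindowLimiter` and its methods are hypothetical and not part of any provider SDK.

```python
import time
from collections import deque


class MultiWindowLimiter:
    """Client-side limiter enforcing several (max_requests, window_seconds) rules.

    Hypothetical helper for illustration; not part of any provider's API.
    """

    def __init__(self, rules, clock=time.monotonic):
        # rules: list of (max_requests, window_seconds) pairs
        self.rules = list(rules)
        self.clock = clock
        self.longest = max(window for _, window in self.rules)
        self.sent = deque()  # timestamps of accepted requests, oldest first

    def try_acquire(self):
        """Return True and record the request if every rule still has room."""
        now = self.clock()
        # Forget requests that fall outside the longest window.
        while self.sent and now - self.sent[0] >= self.longest:
            self.sent.popleft()
        for max_requests, window in self.rules:
            recent = sum(1 for t in self.sent if now - t < window)
            if recent >= max_requests:
                return False  # rejected requests are not recorded
        self.sent.append(now)
        return True


# Limits from the list above: 20 requests/minute and 50 requests/day.
free_tier = MultiWindowLimiter([(20, 60), (50, 86_400)])
```

Call `free_tier.try_acquire()` before each API request and wait or skip the request when it returns `False`. Because models share a common quota, one limiter instance should cover all models used with the same key.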
Data is used for training when the service is used outside the UK/CH/EEA/EU.
Model Name | Model Limits |
---|---|
Gemini 2.5 Pro (Experimental) | 1,000,000 tokens/day<br>250,000 tokens/minute<br>25 requests/day<br>5 requests/minute |
Gemini 2.5 Flash (Preview) | 250,000 tokens/minute<br>500 requests/day<br>10 requests/minute |
Gemini 2.0 Flash | 1,000,000 tokens/minute<br>1,500 requests/day<br>15 requests/minute |
Gemini 2.0 Flash-Lite | 1,000,000 tokens/minute<br>1,500 requests/day<br>30 requests/minute |
Gemini 2.0 Flash (Experimental) | 4,000,000 tokens/minute<br>1,500 requests/day<br>10 requests/minute |
Gemini 1.5 Flash | 1,000,000 tokens/minute<br>1,500 requests/day<br>15 requests/minute |
Gemini 1.5 Flash-8B | 1,000,000 tokens/minute<br>1,500 requests/day<br>15 requests/minute |
Gemini 1.5 Pro | 32,000 tokens/minute<br>50 requests/day<br>2 requests/minute |
LearnLM 1.5 Pro (Experimental) | 1,500 requests/day<br>15 requests/minute |
Gemma 3 27B Instruct | 15,000 tokens/minute<br>14,400 requests/day<br>30 requests/minute |
Gemma 3 12B Instruct | 15,000 tokens/minute<br>14,400 requests/day<br>30 requests/minute |
Gemma 3 4B Instruct | 15,000 tokens/minute<br>14,400 requests/day<br>30 requests/minute |
Gemma 3 1B Instruct | 15,000 tokens/minute<br>14,400 requests/day<br>30 requests/minute |
text-embedding-004 | 150 batch requests/minute<br>1,500 requests/minute<br>100 content/batch<br>Shared Quota |
embedding-001 | |
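The limit cells in these tables follow a regular `<number> <unit>/<period>` pattern, so they can be turned into structured data for scripts that pick a model or configure a limiter. The sketch below assumes that pattern holds; the function name `parse_limits` is hypothetical, and entries whose unit contains a space (such as `150 batch requests/minute`) are deliberately skipped by the regular expression.

```python
import re

# Matches e.g. "250,000 tokens/minute" or "7,200 audio-seconds/minute".
LIMIT_RE = re.compile(r"([\d,]+)\s+([a-z-]+/[a-z]+)")


def parse_limits(cell):
    """Turn a limits cell into a dict mapping 'unit/period' to an int."""
    return {
        unit: int(number.replace(",", ""))
        for number, unit in LIMIT_RE.findall(cell)
    }


limits = parse_limits(
    "1,000,000 tokens/day 250,000 tokens/minute 25 requests/day 5 requests/minute"
)
# Average token budget per request if both per-minute limits are fully used.
tokens_per_request = limits["tokens/minute"] // limits["requests/minute"]
```

For the Gemini 2.5 Pro (Experimental) row this gives a per-minute budget of 50,000 tokens per request on average, although the 25 requests/day cap will usually be reached first.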
Phone number verification required. Models tend to have limited context windows.
Limits: 40 requests/minute
Limits (per-model): 1 request/second, 500,000 tokens/minute, 1,000,000,000 tokens/month
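When a per-minute and a per-month token limit apply together, it helps to check which one binds for a given workload. The arithmetic below is a rough sketch using only the per-model figures above; the 30-day month and the function names are assumptions made for illustration.

```python
REQUESTS_PER_SECOND = 1
TOKENS_PER_MINUTE = 500_000
TOKENS_PER_MONTH = 1_000_000_000


def minutes_until_monthly_cap(tokens_per_minute, tokens_per_month):
    """Minutes of full-speed usage before the monthly quota is exhausted."""
    return tokens_per_month / tokens_per_minute


def sustainable_tokens_per_minute(tokens_per_month, days=30):
    """Average rate that spreads the monthly quota over the whole month."""
    return tokens_per_month / (days * 24 * 60)


full_speed_minutes = minutes_until_monthly_cap(TOKENS_PER_MINUTE, TOKENS_PER_MONTH)
full_speed_hours = full_speed_minutes / 60
steady_rate = sustainable_tokens_per_minute(TOKENS_PER_MONTH)
# Average tokens available per request when sending 1 request/second.
tokens_per_request = TOKENS_PER_MINUTE / (REQUESTS_PER_SECOND * 60)
```

At the full 500,000 tokens/minute the monthly quota lasts 2,000 minutes (roughly 33 hours), so a workload that runs all month needs to average about 23,000 tokens/minute or less.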
Limits: 30 requests/minute, 2,000 requests/day
HuggingFace Serverless Inference is limited to models smaller than 10GB. Some popular models are supported even if they exceed 10GB.
Limits: $0.10/month in credits
Free tier restricted to 8K context.
Model Name | Model Limits |
---|---|
Llama 4 Scout | 30 requests/minute<br>60,000 tokens/minute<br>900 requests/hour<br>1,000,000 tokens/hour<br>14,400 requests/day<br>1,000,000 tokens/day |
Llama 3.1 8B | 30 requests/minute<br>60,000 tokens/minute<br>900 requests/hour<br>1,000,000 tokens/hour<br>14,400 requests/day<br>1,000,000 tokens/day |
Llama 3.3 70B | 30 requests/minute<br>60,000 tokens/minute<br>900 requests/hour<br>1,000,000 tokens/hour<br>14,400 requests/day<br>1,000,000 tokens/day |
Model Name | Model Limits |
---|---|
Allam 2 7B | 7,000 requests/day<br>6,000 tokens/minute |
DeepSeek R1 Distill Llama 70B | 1,000 requests/day<br>6,000 tokens/minute |
Distil Whisper Large v3 | 7,200 audio-seconds/minute<br>2,000 requests/day |
Gemma 2 9B Instruct | 14,400 requests/day<br>15,000 tokens/minute |
Groq compound-beta | 200 requests/day<br>70,000 tokens/minute |
Groq compound-beta-mini | 200 requests/day<br>70,000 tokens/minute |
Llama 3 70B | 14,400 requests/day<br>6,000 tokens/minute |
Llama 3 8B | 14,400 requests/day<br>6,000 tokens/minute |
Llama 3.1 8B | 14,400 requests/day<br>6,000 tokens/minute |
Llama 3.3 70B | 1,000 requests/day<br>12,000 tokens/minute |
Llama 4 Maverick 17B 128E Instruct | 1,000 requests/day<br>6,000 tokens/minute |
Llama 4 Scout Instruct | 1,000 requests/day<br>30,000 tokens/minute |
Llama Guard 3 8B | 14,400 requests/day<br>15,000 tokens/minute |
Mistral Saba 24B | 1,000 requests/day<br>6,000 tokens/minute |
Qwen QwQ 32B | 1,000 requests/day<br>6,000 tokens/minute |
Whisper Large v3 | 7,200 audio-seconds/minute<br>2,000 requests/day |
Whisper Large v3 Turbo | 7,200 audio-seconds/minute<br>2,000 requests/day |
Model Name | Model Limits |
---|---|
DeepSeek R1 Distill Llama 70B | 12 requests/minute |
Llama 3.1 70B Instruct | 12 requests/minute |
Llama 3.1 8B Instruct | 12 requests/minute |
Llama 3.3 70B Instruct | 12 requests/minute |
Llava Next Mistral 7B | 12 requests/minute |
Mamba Codestral 7B v0.1 | 12 requests/minute |
Mistral 7B Instruct v0.3 | 12 requests/minute |
Mistral Nemo 2407 | 12 requests/minute |
Mixtral 8x7B Instruct v0.1 | 12 requests/minute |
Qwen 2.5 VL 72B Instruct | 12 requests/minute |
Qwen2.5 Coder 32B Instruct | 12 requests/minute |
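A flat per-minute limit like the 12 requests/minute in this table translates directly into a minimum gap between requests. The helper names below are hypothetical, and the sketch assumes evenly spaced requests are acceptable for the workload (bursts are not needed).

```python
def min_interval_seconds(requests_per_minute):
    """Smallest gap between requests that keeps within the per-minute limit."""
    return 60.0 / requests_per_minute


def send_offsets(n, requests_per_minute):
    """Offsets in seconds, from the start of a batch, at which to send n requests."""
    gap = min_interval_seconds(requests_per_minute)
    return [i * gap for i in range(n)]


offsets = send_offsets(3, 12)  # [0.0, 5.0, 10.0]
```

Sleeping for `min_interval_seconds(12)` (5 seconds) between calls keeps a single-threaded client at or below the limit without any further bookkeeping.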
Limits: Up to 60 requests/minute
Limits:
- 20 requests/minute
- 1,000 requests/month
Models share a common quota.
Extremely restrictive input/output token limits.
Limits: Dependent on Copilot subscription tier (Free/Pro/Pro+/Business/Enterprise)
Distributed, decentralized crypto-based compute. Data is sent to individual hosts.
Limits: 10,000 neurons/day
Very stringent payment verification for Google Cloud.
Model Name | Model Limits |
---|---|
Gemini 2.5 Pro (Experimental) | 10 requests/minute<br>Shared Quota |
Gemini 2.0 Flash (Experimental) | |
Gemini 2.0 Flash Thinking (Experimental) | |
Gemini 2.0 Pro (Experimental) | |
Llama 4 Maverick Instruct | 60 requests/minute<br>Free during preview |
Llama 4 Scout Instruct | 60 requests/minute<br>Free during preview |
Llama 3.3 70B Instruct | 30 requests/minute<br>Free during preview |
Llama 3.2 90B Vision Instruct | 30 requests/minute<br>Free during preview |
Llama 3.1 70B Instruct | 60 requests/minute<br>Free during preview |
Llama 3.1 8B Instruct | 60 requests/minute<br>Free during preview |
Credits: $1 when you add a payment method
Models: Various open models
Credits: $1
Models: Various open models
Credits: $5 when you add a payment method
Models: Routes to other providers; various open and proprietary models (OpenAI, Gemini, Anthropic, Mistral, Perplexity, etc.)
Credits: $30
Models: Any supported model - pay by compute time
Credits: $1
Models: Various open models
Credits: $0.50 for 1 year; $10 for 3 months for LLMs with a referral code and a connected GitHub account
Models: Various open models
Credits: $10 for 3 months
Models: Jamba family of models
Credits: $10 for 3 months
Models: Solar Pro/Mini
Credits: $15
Requirements: Phone number verification
Models: Various open models
Credits: Token/time-limited trials on a per-model basis
Models: Various open and proprietary Qwen models
Credits: $5/month upon sign up, $30/month with payment method added
Models: Any supported model - pay by compute time
Credits: $1
Models:
Credits: $5 for 3 months
Models:
Credits: 1,000,000 free tokens
Models: