Groq Provider
Groq's ultra-fast LPU inference with various models
Available Models
GPT OSS 120B (openai)
Model ID: gpt-oss-120b
Features: Streaming, Tools, Reasoning, JSON Output
Provider: Groq
Context: 131.1k
Input: $0.15 /M tokens
Cached: —
Output: $0.75 /M tokens

GPT OSS 20B (openai)
Model ID: gpt-oss-20b
Features: Streaming, Tools, Reasoning, JSON Output
Provider: Groq
Context: 131.1k
Input: $0.1 /M tokens
Cached: —
Output: $0.5 /M tokens

Kimi K2 (moonshot)
Model ID: kimi-k2
Features: Streaming, Tools, JSON Output
Provider: Groq
Context: 131.1k
Input: $1 /M tokens
Cached: $0.5 /M tokens
Output: $3 /M tokens

Llama Guard 4 12B (meta)
Status: Model deactivated since Mar 29, 2026
Model ID: llama-guard-4-12b
Features: Streaming
Provider: Groq
Context: 131.1k
Input: $0.2 /M tokens
Cached: —
Output: $0.2 /M tokens

DeepSeek R1 Distill Llama 70B (deepseek)
Status: Model deactivated since Oct 9, 2025
Model ID: deepseek-r1-distill-llama-70b
Features: Streaming, Tools, JSON Output
Provider: Groq
Context: 131.1k
Input: $0.75 /M tokens
Cached: —
Output: $0.99 /M tokens

Gemma2 9B IT (google)
Status: Model deactivated since Oct 8, 2025
Model ID: gemma2-9b-it
Features: Streaming, Tools
Provider: Groq
Context: 8.1k
Input: $0.2 /M tokens
Cached: —
Output: $0.2 /M tokens
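
Because every price in this listing is quoted per million tokens, a request's cost can be roughly estimated from its token counts. The sketch below is illustrative only: the `PRICES` table copies the figures listed for the three active models, while the function name `estimate_cost`, the treatment of cached tokens as a subset of input tokens, and the fallback to the normal input rate when no cached price is listed are all assumptions, not documented billing rules.

```python
# Prices in USD per million tokens, copied from the listing above.
# None means the listing shows no cached-input price.
PRICES = {
    "gpt-oss-120b": {"input": 0.15, "cached": None, "output": 0.75},
    "gpt-oss-20b": {"input": 0.10, "cached": None, "output": 0.50},
    "kimi-k2": {"input": 1.00, "cached": 0.50, "output": 3.00},
}

TOKENS_PER_UNIT = 1_000_000  # prices are quoted per million tokens


def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Return an estimated cost in USD (hypothetical helper, not an official API)."""
    if cached_tokens < 0 or cached_tokens > input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    price = PRICES[model]
    # Assumption: with no cached price listed, cached tokens cost the input rate.
    cached_rate = price["cached"] if price["cached"] is not None else price["input"]

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * price["input"]
        + cached_tokens * cached_rate
        + output_tokens * price["output"]
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Example: 200k input tokens (50k of them cached) and 10k output tokens on Kimi K2.
    print(f"${estimate_cost('kimi-k2', 200_000, 10_000, cached_tokens=50_000):.4f}")
```

Actual billing may round differently or treat cached tokens in another way, so the result should be read as an approximation.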