Groq Provider

Low-latency inference on Groq's LPU (Language Processing Unit) hardware, available for the models listed below.

Available Models

GPT OSS 120B

Family: openai
Model ID: gpt-oss-120b
Capabilities: Streaming, Tools, Reasoning, JSON Output
Provider: Groq
Context: 131.1k
Input: $0.15 /M tokens
Cached: not listed
Output: $0.75 /M tokens

GPT OSS 20B

Family: openai
Model ID: gpt-oss-20b
Capabilities: Streaming, Tools, Reasoning, JSON Output
Provider: Groq
Context: 131.1k
Input: $0.1 /M tokens
Cached: not listed
Output: $0.5 /M tokens

Kimi K2

Family: moonshot
Model ID: kimi-k2
Capabilities: Streaming, Tools, JSON Output
Provider: Groq
Context: 131.1k
Input: $1 /M tokens
Cached: $0.5 /M tokens
Output: $3 /M tokens

Llama Guard 4 12B

Family: meta
Model ID: llama-guard-4-12b
Status: Deactivated since Mar 29, 2026
Capabilities: Streaming
Provider: Groq
Context: 131.1k
Input: $0.2 /M tokens
Cached: not listed
Output: $0.2 /M tokens

DeepSeek R1 Distill Llama 70B

Family: deepseek
Model ID: deepseek-r1-distill-llama-70b
Status: Deactivated since Oct 9, 2025
Capabilities: Streaming, Tools, JSON Output
Provider: Groq
Context: 131.1k
Input: $0.75 /M tokens
Cached: not listed
Output: $0.99 /M tokens

Gemma2 9B IT

Family: google
Model ID: gemma2-9b-it
Status: Deactivated since Oct 8, 2025
Capabilities: Streaming, Tools
Provider: Groq
Context: 8.1k
Input: $0.2 /M tokens
Cached: not listed
Output: $0.2 /M tokens
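
Estimating Cost

The per-million-token prices listed above turn into a per-request cost with simple arithmetic. The following is a minimal Python sketch covering only the three active models. The names `PRICES` and `estimate_cost` are illustrative and not part of any Groq API. Two assumptions are built in: cached tokens are counted as a subset of the input tokens, and when the listing shows no cached price, cached tokens are billed at the normal input rate. Actual billing may differ.

```python
from decimal import Decimal

# USD per million tokens, copied from the listing above.
# "cached" is None where the listing shows no cached price.
PRICES = {
    "gpt-oss-120b": {"input": Decimal("0.15"), "cached": None, "output": Decimal("0.75")},
    "gpt-oss-20b": {"input": Decimal("0.1"), "cached": None, "output": Decimal("0.5")},
    "kimi-k2": {"input": Decimal("1"), "cached": Decimal("0.5"), "output": Decimal("3")},
}

MILLION = Decimal(1_000_000)


def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Return an estimated request cost in USD as a Decimal.

    cached_tokens is the portion of input_tokens served from cache
    (an assumption about how usage is reported).
    """
    if cached_tokens < 0 or cached_tokens > input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[model]
    # Assumption: with no listed cached price, fall back to the input rate.
    cached_rate = price["cached"] if price["cached"] is not None else price["input"]
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * price["input"]
        + cached_tokens * cached_rate
        + output_tokens * price["output"]
    )
    return total / MILLION


if __name__ == "__main__":
    # 1,000 input tokens (400 of them cached) and 200 output tokens on Kimi K2.
    print(estimate_cost("kimi-k2", 1000, 200, cached_tokens=400))
```

`Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding error when many requests are summed.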