Llama 3.1 8B Instruct

Compact Llama 3.1 for efficient text generation.

llama-3.1-8b-instruct
STABLE
128,000 context
Starting at $0.02/M input tokens
Starting at $0.05/M output tokens
Streaming
Tools
JSON Output

All Providers for Llama 3.1 8B Instruct

LLM Gateway routes each request to the best available provider that can handle your prompt size and parameters.
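As a rough illustration of that routing step, the sketch below filters providers by context window and availability before picking one. The `Provider` class, the `eligible` and `pick` helpers, and the cheapest-input-price ranking are assumptions made for illustration; this page does not say how LLM Gateway actually ranks providers. Context sizes and input prices come from the provider listings in this section, with 128k read as 128,000 tokens and 16.4k read as 16,400 tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    context_tokens: int   # context window, from the listing
    input_price: float    # USD per million input tokens
    active: bool = True


# Figures copied from the provider listings on this page.
PROVIDERS = [
    Provider("AWS Bedrock", 128_000, 0.22),
    Provider("Nebius AI", 128_000, 0.02),
    Provider("Inference.net", 128_000, 0.07),
    Provider("Together AI", 128_000, 0.06, active=False),  # listed as deactivated
    Provider("Cerebras", 128_000, 0.10),
    Provider("NovitaAI", 16_400, 0.02),  # "16.4k" read as 16,400 (assumption)
]


def eligible(providers, prompt_tokens, max_output_tokens):
    """Keep active providers whose context window fits prompt plus output."""
    needed = prompt_tokens + max_output_tokens
    return [p for p in providers if p.active and p.context_tokens >= needed]


def pick(providers, prompt_tokens, max_output_tokens):
    """Hypothetical ranking: cheapest input price among eligible providers."""
    candidates = eligible(providers, prompt_tokens, max_output_tokens)
    if not candidates:
        raise ValueError("no provider can handle this request")
    return min(candidates, key=lambda p: p.input_price)
```

Under these assumptions, a 50,000-token prompt with up to 1,000 output tokens rules out NovitaAI (context too small) and Together AI (deactivated), and `pick` returns Nebius AI.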

AWS Bedrock
Context: 128k
Input: $0.22/M tokens
Cached: no price listed
Output: $0.22/M tokens

Nebius AI
Context: 128k
Input: $0.02/M tokens
Cached: no price listed
Output: $0.06/M tokens

Inference.net
Context: 128k
Input: $0.07/M tokens
Cached: no price listed
Output: $0.33/M tokens

Together AI
Context: 128k
Deactivated since Mar 27, 2026
Input: $0.06/M tokens
Cached: no price listed
Output: $0.06/M tokens

Cerebras
Context: 128k
Input: $0.10/M tokens
Cached: no price listed
Output: $0.10/M tokens

NovitaAI
Context: 16.4k
Input: $0.02/M tokens
Cached: no price listed
Output: $0.05/M tokens
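
To compare the listed prices on a concrete workload, the following sketch computes the cost of a single request from per-million-token prices. The `request_cost` helper and the example token counts are illustrative assumptions; the prices are the input and output figures listed above, and cached-token pricing is left out because no cached price is shown.

```python
# (input, output) prices in USD per million tokens, from the listings above.
PRICES = {
    "AWS Bedrock": (0.22, 0.22),
    "Nebius AI": (0.02, 0.06),
    "Inference.net": (0.07, 0.33),
    "Together AI": (0.06, 0.06),  # listed as deactivated
    "Cerebras": (0.10, 0.10),
    "NovitaAI": (0.02, 0.05),
}


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD, billing input and output tokens at separate per-million rates."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example workload (assumed): 10,000 input tokens and 2,000 output tokens.
for name, (inp, out) in PRICES.items():
    print(f"{name}: ${request_cost(10_000, 2_000, inp, out):.6f}")
```

On this assumed workload, NovitaAI comes to about $0.0003 per request and Nebius AI to about $0.00032, so the gap between the two cheapest listings only matters at high request volume.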