Llama 3.1 8B Instruct

Compact Llama 3.1 for efficient text generation.

llama-3.1-8b-instruct

STABLE

128,000-token context

Starting at $0.02/M input tokens

Starting at $0.05/M output tokens
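As a rough illustration of what these per-million-token rates mean for a single request, the sketch below multiplies token counts by the headline "starting at" prices. The function name and the example token counts are hypothetical, and the actual charge depends on which provider serves the request.

```python
from decimal import Decimal

# Headline "starting at" rates from this page, in USD per million tokens.
INPUT_USD_PER_M = Decimal("0.02")
OUTPUT_USD_PER_M = Decimal("0.05")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in USD of one request at the headline rates.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_M
        + Decimal(output_tokens) * OUTPUT_USD_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: 10,000 prompt tokens and 2,000 completion tokens.
    print(estimate_cost_usd(10_000, 2_000))
```

At these rates, 10,000 input tokens and 2,000 output tokens come to $0.0003. Decimal is used instead of float so that very small amounts are not distorted by binary rounding.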

Streaming

Tools

JSON Output

All Providers for Llama 3.1 8B Instruct

LLM Gateway routes requests to the best providers that can handle your prompt size and parameters.
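The page does not say how "best" is determined, so the following is only a sketch of one plausible policy: drop providers whose context window is too small for the request, then pick the cheapest of the rest. The `Provider` class and `pick_provider` function are hypothetical. The figures are the ones listed for each provider on this page, with "128k" taken as 128,000 tokens and "16.4k" as 16,400 tokens, and with the deactivated provider treated as ineligible (all assumptions).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Provider:
    name: str
    context_tokens: int      # maximum context window, in tokens
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens
    active: bool = True


# Figures copied from the provider list on this page.
PROVIDERS = [
    Provider("AWS Bedrock", 128_000, 0.22, 0.22),
    Provider("Nebius AI", 128_000, 0.02, 0.06),
    Provider("Inference.net", 128_000, 0.07, 0.33),
    Provider("Together AI", 128_000, 0.06, 0.06, active=False),  # deactivated
    Provider("Cerebras", 128_000, 0.10, 0.10),
    Provider("NovitaAI", 16_400, 0.02, 0.05),  # "16.4k" read as 16,400 (assumption)
]


def pick_provider(
    prompt_tokens: int,
    max_output_tokens: int,
    providers: Sequence[Provider] = PROVIDERS,
) -> Optional[Provider]:
    """Return the cheapest active provider whose context fits the request.

    Assumed policy for illustration only; not LLM Gateway's real algorithm.
    """
    needed = prompt_tokens + max_output_tokens
    candidates = [p for p in providers if p.active and p.context_tokens >= needed]
    if not candidates:
        return None

    def request_cost(p: Provider) -> float:
        # Relative cost of this request; the per-million scale cancels out.
        return prompt_tokens * p.input_usd_per_m + max_output_tokens * p.output_usd_per_m

    return min(candidates, key=request_cost)


if __name__ == "__main__":
    for prompt, output in [(1_000, 500), (50_000, 1_000), (200_000, 0)]:
        chosen = pick_provider(prompt, output)
        print(prompt, output, chosen.name if chosen else None)
```

Under this assumed policy, a short request (1,000 prompt and 500 output tokens) goes to NovitaAI, a 50,000-token prompt goes to Nebius AI because NovitaAI's window is too small, and a 200,000-token prompt matches no provider.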

AWS Bedrock

Context: 128k

Input: $0.22/M tokens
Cached: n/a
Output: $0.22/M tokens


Nebius AI

Context: 128k

Input: $0.02/M tokens
Cached: n/a
Output: $0.06/M tokens


Inference.net

Context: 128k

Input: $0.07/M tokens
Cached: n/a
Output: $0.33/M tokens


Together AI

Context: 128k

Deactivated since Mar 27, 2026

Input: $0.06/M tokens
Cached: n/a
Output: $0.06/M tokens


Cerebras

Context: 128k

Input: $0.10/M tokens
Cached: n/a
Output: $0.10/M tokens


NovitaAI

Context: 16.4k

Input: $0.02/M tokens
Cached: n/a
Output: $0.05/M tokens
