Llama 3.1 Nemotron Ultra 253B

NVIDIA's 253B-parameter model, derived from Llama 3.1 405B and tuned for maximum capability.

llama-3.1-nemotron-ultra-253b

STABLE

128,000 context

Starting at $0.60/M input tokens

Starting at $1.80/M output tokens

Streaming

JSON Output

All Providers for Llama 3.1 Nemotron Ultra 253B

LLM Gateway routes each request to the best available provider that can handle your prompt size and parameters.
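The routing rule above can be illustrated with a small sketch. This is not LLM Gateway's actual routing logic, which this page does not describe. The `Provider` class, the `pick_provider` function, the per-provider feature flags, and the "cheapest eligible provider wins" tie-break are all hypothetical assumptions made for illustration. Only the provider name, context size, and prices come from this page.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Provider:
    """Hypothetical record for one provider of a model."""
    name: str
    context_tokens: int        # maximum prompt + completion size
    input_price_per_m: float   # USD per million input tokens
    output_price_per_m: float  # USD per million output tokens
    streaming: bool = True     # assumed to be a per-provider capability
    json_output: bool = True   # assumed to be a per-provider capability


# The only provider listed on this page for this model.
PROVIDERS = [Provider("Nebius AI", 128_000, 0.60, 1.80)]


def pick_provider(
    providers: Sequence[Provider],
    prompt_tokens: int,
    max_output_tokens: int,
    need_streaming: bool = False,
    need_json: bool = False,
) -> Optional[Provider]:
    """Return an eligible provider, or None if no provider fits the request."""
    eligible = [
        p for p in providers
        if prompt_tokens + max_output_tokens <= p.context_tokens
        and (p.streaming or not need_streaming)
        and (p.json_output or not need_json)
    ]
    if not eligible:
        return None
    # Assumed ranking: lowest input price first, then lowest output price.
    return min(eligible, key=lambda p: (p.input_price_per_m, p.output_price_per_m))
```

Calling `pick_provider(PROVIDERS, 100_000, 4_000)` selects Nebius AI, while a request whose prompt and completion together exceed 128,000 tokens returns `None` under these assumptions.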

Nebius AI

Context: 128k

Input: $0.60/M tokens

Cached: not listed

Output: $1.80/M tokens
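
To make the per-million-token pricing concrete, here is a minimal sketch that estimates the cost of a single request at the rates listed on this page. The function name `estimate_cost` and the constants are illustrative, not part of any LLM Gateway SDK. The sketch assumes billing is strictly linear in token count, with no cached-token discount, minimum charge, or gateway fee.

```python
from decimal import Decimal

# Listed prices from this page, in USD per million tokens.
INPUT_PRICE_PER_M = Decimal("0.60")
OUTPUT_PRICE_PER_M = Decimal("1.80")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, assuming purely linear billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token completion.
    print(estimate_cost(10_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many requests.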