Nebius AI Provider
Nebius AI Studio - OpenAI-compatible API for large language models
Available Models
Qwen3 Coder 30B A3B Instruct
alibaba
qwen3-coder-30b-a3b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 262k
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Qwen3 30B A3B Instruct 2507
alibaba
qwen3-30b-a3b-instruct-2507Streaming
Tools
JSON Output
Nebius AI
Context: 262k
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Qwen3 30B A3B Thinking 2507
alibaba
qwen3-30b-a3b-thinking-2507Streaming
Tools
Reasoning
JSON Output
Nebius AI
Context: 262k
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Qwen3 235B A22B Thinking 2507
alibaba
qwen3-235b-a22b-thinking-2507Streaming
Tools
Reasoning
JSON Output
Nebius AI
Context: 262k
Input
$0.2
/M tokens
Cached
—
/M tokens
Output
$0.6
/M tokens
Qwen3 235B A22B Instruct 2507
alibaba
qwen3-235b-a22b-instruct-2507Streaming
Tools
JSON Output
Nebius AI
Context: 262k
Input
$0.2
/M tokens
Cached
—
/M tokens
Output
$0.6
/M tokens
Kimi K2
moonshot
kimi-k2Streaming
Tools
JSON Output
Nebius AI
Context: 131.1k
Input
$0.5
/M tokens
Cached
—
/M tokens
Output
$2.4
/M tokens
DeepSeek R1 (0528)
deepseek
deepseek-r1-0528Streaming
Nebius AI
Context: 64k
Input
$0.8
/M tokens
Cached
—
/M tokens
Output
$2.4
/M tokens
Qwen3 14B
alibabaModel Deactivated
qwen3-14bStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Nov 3, 2025
Input
$0.08
/M tokens
Cached
—
/M tokens
Output
$0.24
/M tokens
Qwen3 32B
alibaba
qwen3-32bStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Qwen3 30B A3B
alibabaModel Deactivated
qwen3-30b-a3bStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Nov 3, 2025
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Llama 3.1 Nemotron Ultra 253B
meta
llama-3.1-nemotron-ultra-253bStreaming
JSON Output
Nebius AI
Context: 128k
Input
$0.6
/M tokens
Cached
—
/M tokens
Output
$1.8
/M tokens
Gemma 3 27B
google
gemma-3-27bStreaming
Vision
Nebius AI
Context: 128k
Input
$0.27
/M tokens
Cached
—
/M tokens
Output
$0.27
/M tokens
Qwen QwQ 32B
alibabaModel Deactivated
qwen-qwq-32bStreaming
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Nov 3, 2025
Input
$0.15
/M tokens
Cached
—
/M tokens
Output
$0.45
/M tokens
Qwen3 Coder 480B A35B Instruct
alibaba
qwen3-coder-480b-a35b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 262k
Input
$0.4
/M tokens
Cached
—
/M tokens
Output
$1.8
/M tokens
Qwen2.5 VL 72B Instruct
alibaba
qwen2-5-vl-72b-instructStreaming
Vision
JSON Output
Nebius AI
Context: 32.8k
Input
$0.13
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
DeepSeek V3
deepseekModel Deactivated
deepseek-v3Streaming
Nebius AI
Context: 64k
Deactivated since Nov 3, 2025
Input
$0.5
/M tokens
Cached
—
/M tokens
Output
$1.5
/M tokens
Llama 3.3 70B Instruct
meta
llama-3.3-70b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 128k
Input
$0.13
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
Qwen2.5 Coder 7B
alibaba
qwen25-coder-7bStreaming
JSON Output
Nebius AI
Context: 32.8k
Input
$0.01
/M tokens
Cached
—
/M tokens
Output
$0.03
/M tokens
Qwen2.5 32B Instruct
alibabaModel Deactivated
qwen25-32b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Sep 10, 2025
Input
$0.06
/M tokens
Cached
—
/M tokens
Output
$0.2
/M tokens
Qwen2.5 72B Instruct
alibabaModel Deactivated
qwen25-72b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Nov 3, 2025
Input
$0.13
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
Qwen2 VL 72B Instruct
alibabaModel Deactivated
qwen2-vl-72b-instructStreaming
Vision
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Sep 10, 2025
Input
$0.13
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
Hermes 3 Llama 405B
nousresearchModel Deactivated
hermes-3-llama-405bStreaming
JSON Output
Nebius AI
Context: 131.1k
Deactivated since Nov 3, 2025
Input
$1
/M tokens
Cached
—
/M tokens
Output
$3
/M tokens
Llama 3.1 8B Instruct
meta
llama-3.1-8b-instructStreaming
Nebius AI
Context: 128k
Input
$0.02
/M tokens
Cached
—
/M tokens
Output
$0.06
/M tokens
Llama 3.1 405B Instruct
metaModel Deactivated
llama-3.1-405b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 128k
Deactivated since Nov 3, 2025
Input
$1
/M tokens
Cached
—
/M tokens
Output
$3
/M tokens