NovitaAI Provider
Large language models served by NovitaAI through an OpenAI-compatible API
Available Models
Qwen3.5 397B A17B
alibaba
qwen35-397b-a17b
Streaming
Vision
Tools
Reasoning
JSON Output
NovitaAI
Context: 262.1k
Input
$0.6
/M tokens
Cached
—
/M tokens
Output
$3.6
/M tokens
MiniMax M2.5
minimax
minimax-m2.5
Streaming
Tools
Reasoning
JSON Output
NovitaAI
Context: 204.8k
Input
$0.3
/M tokens
Cached
$0.03
/M tokens
Output
$1.2
/M tokens
GLM-5
glm
glm-5
Streaming
Tools
Reasoning
JSON Output
NovitaAI
Context: 202.8k
Input
$1
/M tokens
Cached
$0.2
/M tokens
Output
$3.2
/M tokens
MiniMax M2.1
minimax
minimax-m2.1
Streaming
Tools
Reasoning
JSON Output
NovitaAI
Context: 204.8k
Input
$0.3
/M tokens
Cached
$0.03
/M tokens
Output
$1.2
/M tokens
GLM-4.7
glm
glm-4.7
Streaming
Tools
Reasoning
JSON Output
NovitaAI
Context: 204.8k
Input
$0.6
/M tokens
Cached
$0.11
/M tokens
Output
$2.2
/M tokens
GLM-4.6V
glm
glm-4.6v
Streaming
Vision
Tools
Reasoning
JSON Output
NovitaAI
Context: 131.1k
Input
$0.3
/M tokens
Cached
$0.055
/M tokens
Output
$0.9
/M tokens
Qwen3 VL 8B Instruct
alibaba
qwen3-vl-8b-instruct
Streaming
Vision
JSON Output
NovitaAI
Context: 131.1k
Input
$0.08
/M tokens
Cached
—
/M tokens
Output
$0.5
/M tokens
Qwen3 VL 30B A3B Thinking
alibaba
qwen3-vl-30b-a3b-thinking
Streaming
Vision
Tools
Reasoning
JSON Output
NovitaAI
Context: 131.1k
Input
$0.2
/M tokens
Cached
—
/M tokens
Output
$1
/M tokens
Qwen3 VL 30B A3B Instruct
alibaba
qwen3-vl-30b-a3b-instruct
Streaming
Vision
Tools
NovitaAI
Context: 131.1k
Input
$0.2
/M tokens
Cached
—
/M tokens
Output
$0.7
/M tokens
GLM-4.6
glm
glm-4.6
Streaming
Tools
Reasoning
JSON Output
NovitaAI
Context: 204.8k
Input
$0.55
/M tokens
Cached
$0.11
/M tokens
Output
$2.2
/M tokens
DeepSeek V3.2
deepseek
deepseek-v3.2
Streaming
Tools
JSON Output
NovitaAI
Context: 163.8k
Input
$0.269
/M tokens
Cached
$0.1345
/M tokens
Output
$0.4
/M tokens
Qwen3 Max
alibaba
qwen3-max
Streaming
Tools
JSON Output
NovitaAI
Context: 262.1k
Input
$0.845
/M tokens
Cached
—
/M tokens
Output
$3.38
/M tokens
Qwen3 VL 235B A22B Instruct
alibaba
qwen3-vl-235b-a22b-instruct
Streaming
Vision
Tools
JSON Output
NovitaAI
Context: 131.1k
Input
$0.3
/M tokens
Cached
—
/M tokens
Output
$1.5
/M tokens
Qwen3 VL 235B A22B Thinking
alibaba
qwen3-vl-235b-a22b-thinking
Streaming
Vision
Reasoning
NovitaAI
Context: 131.1k
Input
$0.98
/M tokens
Cached
—
/M tokens
Output
$3.95
/M tokens
Qwen3 Next 80B A3B Thinking
alibaba
qwen3-next-80b-a3b-thinking
Streaming
Tools
Reasoning
NovitaAI
Context: 131.1k
Input
$0.15
/M tokens
Cached
—
/M tokens
Output
$1.5
/M tokens
Qwen3 Next 80B A3B Instruct
alibaba
qwen3-next-80b-a3b-instruct
Streaming
Tools
JSON Output
NovitaAI
Context: 131.1k
Input
$0.15
/M tokens
Cached
—
/M tokens
Output
$1.5
/M tokens
GLM-4.5V
glm
glm-4.5v
Streaming
Vision
Tools
Reasoning
JSON Output
NovitaAI
Context: 65.5k
Input
$0.6
/M tokens
Cached
$0.11
/M tokens
Output
$1.8
/M tokens
Qwen3 Coder 30B A3B Instruct
alibaba
qwen3-coder-30b-a3b-instruct
Streaming
Tools
JSON Output
NovitaAI
Context: 160k
Input
$0.07
/M tokens
Cached
—
/M tokens
Output
$0.27
/M tokens
Qwen3 235B A22B Thinking 2507
alibaba
qwen3-235b-a22b-thinking-2507
Streaming
Tools
NovitaAI
Context: 131.1k
Input
$0.3
/M tokens
Cached
—
/M tokens
Output
$3
/M tokens
Qwen3 235B A22B Instruct 2507
alibaba
qwen3-235b-a22b-instruct-2507
Streaming
Tools
JSON Output
NovitaAI
Context: 131.1k
Input
$0.09
/M tokens
Cached
—
/M tokens
Output
$0.58
/M tokens
Kimi K2
moonshot
kimi-k2
Streaming
Tools
NovitaAI
Context: 131.1k
Input
$0.57
/M tokens
Cached
—
/M tokens
Output
$2.3
/M tokens
Qwen3 235B A22B FP8
alibaba
qwen3-235b-a22b-fp8
Streaming
JSON Output
NovitaAI
Context: 41.0k
Input
$0.2
/M tokens
Cached
—
/M tokens
Output
$0.8
/M tokens
Qwen3 32B FP8
alibaba
qwen3-32b-fp8
Streaming
NovitaAI
Context: 41.0k
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.45
/M tokens
Qwen3 30B A3B FP8
alibaba
qwen3-30b-a3b-fp8
Streaming
NovitaAI
Context: 41.0k
Input
$0.09
/M tokens
Cached
—
/M tokens
Output
$0.45
/M tokens
Qwen3 4B FP8
alibaba
qwen3-4b-fp8
Streaming
NovitaAI
Context: 128k
Input
$0.03
/M tokens
Cached
—
/M tokens
Output
$0.03
/M tokens
Llama 4 Scout 17B Instruct
meta
llama-4-scout-17b-instruct
Streaming
Vision
JSON Output
NovitaAI
Context: 131.1k
Input
$0.18
/M tokens
Cached
—
/M tokens
Output
$0.59
/M tokens
Llama 4 Maverick 17B Instruct
meta
llama-4-maverick-17b-instruct
Streaming
Vision
JSON Output
NovitaAI
Context: 1.0M
Input
$0.27
/M tokens
Cached
—
/M tokens
Output
$0.85
/M tokens
Llama 3 8B Instruct
meta
llama-3-8b-instruct
Streaming
JSON Output
NovitaAI
Context: 8.2k
Input
$0.04
/M tokens
Cached
—
/M tokens
Output
$0.04
/M tokens
Qwen3 Coder 480B A35B Instruct
alibaba
qwen3-coder-480b-a35b-instruct
Streaming
Tools
JSON Output
NovitaAI
Context: 262.1k
Input
$0.3
/M tokens
Cached
—
/M tokens
Output
$1.3
/M tokens
Llama 3.3 70B Instruct
meta
llama-3.3-70b-instruct
Streaming
Tools
NovitaAI
Context: 131.1k
Input
$0.135
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
Llama 3.2 3B Instruct
meta
llama-3.2-3b-instruct
Streaming
JSON Output
NovitaAI
Context: 32.8k
Input
$0.03
/M tokens
Cached
—
/M tokens
Output
$0.05
/M tokens
Llama 3.1 8B Instruct
meta
llama-3.1-8b-instruct
Streaming
JSON Output
NovitaAI
Context: 16.4k
Input
$0.02
/M tokens
Cached
—
/M tokens
Output
$0.05
/M tokens
Hermes 2 Pro Llama 3 8B
nousresearch
hermes-2-pro-llama-3-8b
Streaming
NovitaAI
Context: 8.2k
Input
$0.14
/M tokens
Cached
—
/M tokens
Output
$0.14
/M tokens
Llama 3 70B Instruct
meta
llama-3-70b-instruct
Streaming
JSON Output
NovitaAI
Context: 8.2k
Input
$0.51
/M tokens
Cached
—
/M tokens
Output
$0.74
/M tokens
MiniMax M2.7
minimax
minimax-m2.7
Streaming
Tools
Reasoning
JSON Output
NovitaAI
Context: 204.8k
Input
$0.3
/M tokens
Cached
$0.06
/M tokens
Output
$1.2
/M tokens
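
Estimating request cost

The prices above are quoted per million tokens, so the cost of a single request is a weighted sum of its token counts. The sketch below shows that arithmetic in Python. The names Pricing and estimate_cost are hypothetical helpers, not part of any NovitaAI SDK. It also assumes that cached tokens are a subset of the input tokens and are billed at the Cached rate, and that a model with no listed Cached price bills all input at the Input rate; the listing does not state either billing rule, so check the provider's own billing documentation before relying on the numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    """Prices in USD per million tokens, copied from a model card."""
    input_per_m: float
    output_per_m: float
    cached_per_m: Optional[float] = None  # None when the card lists no cached price


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Return an estimated request cost in USD.

    Assumption: cached_tokens counts the part of input_tokens served from cache.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    # Assumption: with no cached price listed, cached input costs the normal input rate.
    cached_rate = (pricing.cached_per_m
                   if pricing.cached_per_m is not None
                   else pricing.input_per_m)
    fresh_tokens = input_tokens - cached_tokens

    total_per_million = (fresh_tokens * pricing.input_per_m
                         + cached_tokens * cached_rate
                         + output_tokens * pricing.output_per_m)
    return total_per_million / 1_000_000


# Prices taken from the MiniMax M2.7 card above.
minimax_m27 = Pricing(input_per_m=0.3, output_per_m=1.2, cached_per_m=0.06)
cost = estimate_cost(minimax_m27, input_tokens=1_000_000,
                     output_tokens=200_000, cached_tokens=500_000)
print(f"${cost:.4f}")  # prints $0.4200
```

In the example, 500,000 uncached input tokens cost $0.15, 500,000 cached tokens cost $0.03, and 200,000 output tokens cost $0.24, for a total of $0.42.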