# Together AI Provider

Together AI is a cloud platform for serving large language models with fast inference.
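The model IDs listed below can be used with any OpenAI-compatible chat-completions client. The sketch here only builds a request payload; the function name, parameter defaults, and payload shape are illustrative assumptions, and the actual base URL and client library depend on how you access the provider.

```python
import json


def build_chat_request(model: str, prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-style chat-completions payload (assumed format)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


# Example: a non-streaming request against one of the listed model IDs.
payload = build_chat_request("llama-4-scout", "Summarize this paragraph.")
print(json.dumps(payload, indent=2))
```

Setting `stream=True` would request incremental tokens instead of a single response; every model in the table below advertises streaming support.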
## Available Models

| Model | Vendor | Model ID | Capabilities | Context | Input ($/M tokens) | Cached ($/M tokens) | Output ($/M tokens) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MiniMax M2.5 | minimax | `minimax-m2.5` | Streaming, Reasoning, JSON Output, JSON Schema | 228.7k | $0.30 | — | $1.20 |
| GLM-5 | glm | `glm-5` | Streaming, Tools, Reasoning, JSON Output, JSON Schema | 202.8k | $1.00 | — | $3.20 |
| Kimi K2.5 | moonshot | `kimi-k2.5` | Streaming, Vision, Reasoning | 262.1k | $0.50 | — | $2.80 |
| GLM-4.7 | glm | `glm-4.7` | Streaming, Reasoning | 202.8k | $0.45 | — | $2.00 |
| Llama 4 Scout | meta | `llama-4-scout` | Streaming, Tools | 32.8k | $0.18 | — | $0.59 |
| Llama 3.1 8B Instruct† | meta | `llama-3.1-8b-instruct` | Streaming, Tools | 128k | $0.06 | — | $0.06 |
| Gemma 2 27B IT | google | `gemma-2-27b-it-together` | Streaming | 8.2k | $0.08 | — | $0.08 |
| Mixtral 8x7B Instruct | mistral | `mixtral-8x7b-instruct-together` | Streaming, JSON Output | 32.8k | $0.06 | — | $0.06 |
| Mistral 7B Instruct‡ | mistral | `mistral-7b-instruct-together` | Streaming, JSON Output | 8.2k | $0.06 | — | $0.06 |

All models above are served by Together AI. Cached input pricing is not listed (—) for any of these models.

† Deactivated since Mar 27, 2026.
‡ Deactivated since Nov 13, 2025.
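Several of the models above advertise JSON Output and JSON Schema support. With an OpenAI-compatible client, structured output is typically requested through a `response_format` field. The sketch below only constructs such a payload; the exact `response_format` shape is an assumption and may differ per endpoint.

```python
import json


def build_json_schema_request(model: str, prompt: str, schema: dict) -> dict:
    """Build a chat payload requesting schema-constrained JSON output
    (OpenAI-compatible style; field names are assumptions)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "result", "schema": schema},
        },
    }


# Hypothetical schema: ask the model to return a single sentiment label.
schema = {
    "type": "object",
    "properties": {"sentiment": {"type": "string"}},
    "required": ["sentiment"],
}
payload = build_json_schema_request("glm-5", "Classify: 'Great service!'", schema)
print(json.dumps(payload, indent=2))
```

Only models listing the JSON Schema capability (here, MiniMax M2.5 and GLM-5) would be expected to honor a schema constraint; models listing only JSON Output would be limited to a plain JSON-object mode.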