All Models - Kyma API

Kyma currently serves 68 models across language, image, video, and audio, all verified working and accessible through the same /v1 API. Billing depends on modality: language models are pay-per-token; image and video models are billed flat per call or per second, depending on the SKU; audio is billed per character (TTS), per minute (transcription and understanding), or flat per call (music, voice clone, voice design). The live catalog is available from GET /v1/models (model metadata) and GET /v1/pricing (full pricing for text, image, video, and audio in one round-trip).

curl https://kymaapi.com/v1/models

How to choose quickly

Start with qwen-3.6-plus if you want the best default for general work, coding, and reasoning.
Step up to qwen-3.7-max when you want the highest-quality Qwen for hard reasoning and multilingual work and cost is secondary.
Use grok-4.3 for frontier-quality general work and agentic coding at a balanced price.
Use gemini-3.5-flash when you need 1M context with multimodal input (image, audio, video) and fast responses.
Use kimi-k2.6 for tool-heavy agents, long coding sessions, and image-aware workflows.
Use deepseek-v4-pro for top reasoning and complex coding with 1M context.
Use deepseek-v4-flash when you want V4-family quality at the cheapest price.
Use gemini-2.5-flash when you need 1M context or cheap long-context throughput.
Use qwen-3-32b when latency matters and you still want strong coding quality.
Use glm-5.1 when you need a long-running coding agent for repo-scale engineering work.
Use sonar when the answer needs live web data with citations — current events, prices, releases (sonar-pro for deeper research).

Filter the catalog

The live GET /v1/models endpoint now supports capability filters so agents can select models programmatically instead of hardcoding a shortlist.

# Tool-capable models for coding agents
curl "https://kymaapi.com/v1/models?recommended_for=coding&tools=true&supported_parameters=tools,structured_outputs"

# Fast, cheap models with at least 128K context
curl "https://kymaapi.com/v1/models?latency_tier=fast&cost_tier=cheap&min_context_window=128000"

# Vision-capable models
curl "https://kymaapi.com/v1/models?vision=true&input_modalities=text,image"
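For agents that assemble these queries in code rather than by hand, the following is a minimal Python sketch that builds the same URLs as the curl examples above. The helper name `build_models_url` is hypothetical, and the sketch assumes only the filter names and value formats shown in those examples (lowercase booleans, comma-separated lists); it does not send a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://kymaapi.com/v1/models"


def build_models_url(**filters):
    """Return a /v1/models URL with the given filters as query parameters.

    Hypothetical helper: it only formats values the way the curl
    examples do and does not validate filter names.
    """
    params = {}
    for name, value in filters.items():
        if isinstance(value, bool):
            # The curl examples spell booleans as lowercase true/false.
            params[name] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # Multi-value filters are comma-separated in the curl examples.
            params[name] = ",".join(str(v) for v in value)
        else:
            params[name] = str(value)
    if not params:
        return BASE_URL
    # safe="," keeps commas literal so the URL matches the documented form.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(build_models_url(latency_tier="fast", cost_tier="cheap",
                           min_context_window=128000))
    print(build_models_url(vision=True, input_modalities=["text", "image"]))
```

The resulting string can be passed to any HTTP client; keeping URL construction separate from the request makes the filter logic easy to unit test.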

Recommended Models

Qwen 3.6 Plus

#1 most popular. Closed-weight, highest quality overall. 131K context.

model="qwen-3.6-plus"

DeepSeek V4 Flash

Best value V4. 1M context, MIT, native reasoning. $0.19/M input.

model="deepseek-v4-flash"

DeepSeek V4 Pro

Top reasoning. 1.6T MoE flagship, 1M context, complex coding.

model="deepseek-v4-pro"

Kimi K2.6

Best for agents. Multimodal agentic model. 262K context.

model="kimi-k2.6"

Capability Guide

Need	Best first pick	Why
General default	`qwen-3.6-plus`	Best overall quality, strong multilingual reasoning
Max Qwen quality	`qwen-3.7-max`	Highest-quality Qwen, 1M context — premium step-up from 3.6 Plus
Frontier agentic (balanced cost)	`grok-4.3`	Frontier reasoning + tool use at a balanced price, 1M context
Multimodal 1M context	`gemini-3.5-flash`	Image/audio/video input, fast, 1M context
Tool-heavy agents	`kimi-k2.6`	Strong tool use, long context, multimodal
Top reasoning	`deepseek-v4-pro`	1.6T MoE flagship, 1M context, native reasoning
Best value	`deepseek-v4-flash`	V4-tier quality at the lowest price, 1M context
Long-running coding agents	`glm-5.1`	Better fit for repo-scale engineering and multi-step execution
Fast coding	`qwen-3-32b`	Lower latency while staying strong on code and math
1M context	`gemini-2.5-flash`	Cheapest long-context option on Kyma
Vision	`gemma-4-31b`	Reliable image + text workflows
Live web search	`sonar`	Real-time web search with citations; `sonar-pro` for deeper research
Image generation	`gpt-image-2`	OpenAI flagship — best text-in-image, multilingual typography. Quality dropdown low/medium/high.
Design-quality images	`recraft-v4`	#1 HF Arena. Brand and illustration default at $0.054.
Video generation	`kling-3-pro`	Premium cinematic clips, hero brand video
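If an application needs a static fallback when the live catalog is unavailable, the table above can be encoded as a small lookup. This is only a sketch: the model IDs are copied from the table, while the need labels used as keys and the `pick_model` helper are hypothetical names chosen for illustration.

```python
# First picks copied from the Capability Guide table.
# The dictionary keys are hypothetical labels, not API values.
FIRST_PICK = {
    "general": "qwen-3.6-plus",
    "max-qwen-quality": "qwen-3.7-max",
    "frontier-agentic": "grok-4.3",
    "multimodal-1m": "gemini-3.5-flash",
    "tool-heavy-agents": "kimi-k2.6",
    "top-reasoning": "deepseek-v4-pro",
    "best-value": "deepseek-v4-flash",
    "long-running-coding": "glm-5.1",
    "fast-coding": "qwen-3-32b",
    "1m-context": "gemini-2.5-flash",
    "vision": "gemma-4-31b",
    "web-search": "sonar",
    "image-generation": "gpt-image-2",
    "design-images": "recraft-v4",
    "video-generation": "kling-3-pro",
}


def pick_model(need, default="qwen-3.6-plus"):
    """Return the first-pick model ID for a need, or the general default."""
    return FIRST_PICK.get(need.strip().lower(), default)


if __name__ == "__main__":
    print(pick_model("top-reasoning"))
    print(pick_model("something-else"))
```

Because models are updated regularly, a hardcoded table like this will drift; treat it as a fallback and prefer the capability filters on GET /v1/models where possible.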

Tier 1 — Highest Quality

Model ID	Name	Context	Speed	Best For
`qwen-3.6-plus`	Qwen 3.6 Plus	131K	Medium	General, #1 traffic
`qwen-3.7-max`	Qwen 3.7 Max	1M	Medium	Max Qwen quality, reasoning, multilingual
`grok-4.3`	Grok 4.3	1M	Medium	Frontier general, agentic coding
`gemini-3.5-flash`	Gemini 3.5 Flash	1M	Fast	Multimodal input, fast long-context
`deepseek-v4-pro`	DeepSeek V4 Pro	1M	Medium	Top reasoning, complex coding
`deepseek-v4-flash`	DeepSeek V4 Flash	1M	Fast	Best value, long context
`deepseek-v3`	DeepSeek V3	160K	Medium	Previous-gen flagship, stable
`deepseek-r1`	DeepSeek R1	64K	Slow	Reasoning, analysis
`kimi-k2.6`	Kimi K2.6	262K	Medium	Agentic coding, multimodal
`gemma-4-31b`	Gemma 4 31B	128K	Medium	Multimodal, vision
`qwen-3-32b`	Qwen 3 32B	32K	Fast	Code, math, multilingual
`llama-3.3-70b`	Llama 3.3 70B	128K	Fast	General, most popular open model
`minimax-m2.5`	MiniMax M2.5	196K	Medium	Agentic coding (SWE-bench 80.2%)
`glm-5.1`	GLM 5.1	203K	Medium	Long-running coding agents, repo-scale engineering

Tier 2 — High Quality

Model ID	Name	Context	Speed	Best For
`minimax-m2.7`	MiniMax M2.7	205K	Medium	Agentic coding, productivity, debugging
`gpt-oss-120b`	GPT-OSS 120B	128K	Medium	Writing, general intelligence
`qwen-3-coder`	Qwen 3 Coder	131K	Medium	Code generation
`gemini-2.5-flash`	Gemini 2.5 Flash	1M	Fast	Long context
`gemini-3-flash`	Gemini 3 Flash	1M	Fast	Ultra-long context
`glm-4.5-air`	GLM 4.5 Air	131K	Fast	Cheap agentic bulk tasks
`glm-4.7-flash`	GLM 4.7 Flash	203K	Fast	Cheap long-context throughput

Web Search

Live, web-grounded models that search the web on every request and return cited answers. They bill a small flat web-search fee on top of token cost (see Pricing → Web search). They do not support tool calling: the search runs internally, so you just ask a question. The alias search resolves to sonar.

Model ID	Name	Context	Speed	Best For
`sonar`	Sonar	127K	Medium	Quick cited answers, current events, live lookups
`sonar-pro`	Sonar Pro	200K	Medium	Deep multi-step research, longer cited reports

Image Generation

Async endpoint at POST /v1/images/generations — pay per image, no token billing. See the Image Generation guide for prompting tips and full examples.

Model	Best For	Cost / image	Input
`gpt-image-2`	Text-in-image — multilingual typography, logos, posters	$0.014 / $0.081 / $0.297 (low/medium/high)	text + image
`flux-2-pro`	Photoreal, multi-reference blend (up to 10 sources)	$0.041–$0.101	text + image(s)
`recraft-v4-pro`	4MP print-ready design	$0.338	text
`ideogram-v3`	Typography, logos, packaging	$0.108	text
`recraft-v4`	Design-quality, brand assets	$0.054	text
`flux-kontext-pro`	Image edit, inpaint, refine	$0.054	text + image
`recraft-v4-vector`	Native SVG — logos, icons	$0.108	text
`recraft-v4-vector-pro`	4MP SVG, print-ready	$0.405	text
`minimax-image-01`	Sub-cent budget tier, bulk	$0.005	text
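Since images are billed per image, a budget translates directly into an image count. The sketch below uses the fixed prices from the table above; the `model:quality` key notation for gpt-image-2 and the `images_within_budget` helper are hypothetical, and flux-2-pro is left out because the table gives it only as a price range.

```python
from decimal import Decimal

# Flat per-image prices in USD, copied from the table.
# "gpt-image-2:<quality>" is a hypothetical key format for this sketch.
PRICE_PER_IMAGE = {
    "gpt-image-2:low": Decimal("0.014"),
    "gpt-image-2:medium": Decimal("0.081"),
    "gpt-image-2:high": Decimal("0.297"),
    "recraft-v4-pro": Decimal("0.338"),
    "ideogram-v3": Decimal("0.108"),
    "recraft-v4": Decimal("0.054"),
    "flux-kontext-pro": Decimal("0.054"),
    "recraft-v4-vector": Decimal("0.108"),
    "recraft-v4-vector-pro": Decimal("0.405"),
    "minimax-image-01": Decimal("0.005"),
}


def images_within_budget(sku, budget_usd):
    """Return how many whole images a USD budget covers for one SKU."""
    price = PRICE_PER_IMAGE[sku]
    # Decimal avoids float rounding errors on small prices.
    return int(Decimal(str(budget_usd)) // price)


if __name__ == "__main__":
    for sku in ("minimax-image-01", "recraft-v4", "gpt-image-2:high"):
        print(sku, images_within_budget(sku, 10))
```

For current prices, GET /v1/pricing remains the source of truth; the hardcoded values here only mirror this page.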

Video Generation

Async endpoint at POST /v1/videos/generations — pay per second of generated footage. See the Video Generation guide for prompting tips and the full per-SKU breakdown.

Model	Best For	Cost / sec	5s clip	Audio	Input
`kling-2.5-pro`	Budget cinematic, b-roll	$0.0945	$0.4725	—	text + image
`kling-3-pro`	Premium cinematic, hero video	$0.1512	$0.7560	—	text + image
`kling-3-pro-audio`	Cinematic w/ diegetic sound	$0.2268	$1.1340	native	text + image
`seedance-2-pro`	Action, multi-shot, social	$0.40959	$2.04795	bundled	text + image
`seedance-2-fast`	Social shorts, rapid iteration	$0.326565	$1.63283	bundled	text + image
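The "5s clip" column is the per-second price multiplied by five, so the same arithmetic gives an estimate for other durations. The following sketch assumes billing is strictly linear in seconds, with no minimum charge or rounding, which this page does not confirm; `clip_cost` is a hypothetical helper and the rates are copied from the table above.

```python
from decimal import Decimal

# Per-second prices in USD, copied from the table.
PRICE_PER_SECOND = {
    "kling-2.5-pro": Decimal("0.0945"),
    "kling-3-pro": Decimal("0.1512"),
    "kling-3-pro-audio": Decimal("0.2268"),
    "seedance-2-pro": Decimal("0.40959"),
    "seedance-2-fast": Decimal("0.326565"),
}


def clip_cost(model, seconds):
    """Estimate the USD cost of a clip, assuming linear per-second billing."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Decimal keeps the sub-cent prices exact.
    return PRICE_PER_SECOND[model] * Decimal(str(seconds))


if __name__ == "__main__":
    for model in PRICE_PER_SECOND:
        print(model, clip_cost(model, 5))
```

For seedance-2-fast the exact product is 1.632825, which the table shows rounded to $1.63283.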

Models are updated regularly. For the live canonical list, use GET /v1/models. Disabled models are intentionally omitted from this page.
