Kyma currently serves 68 models across language, image, video, and audio. All models are verified working and accessible through the same /v1 API. Language models are pay-per-token; image and video are flat per-call or pay-per-second depending on SKU; audio is per-character (TTS), per-minute (transcription / understand), or flat per call (music, voice clone, voice design). Live catalog: GET /v1/models (model metadata) and GET /v1/pricing (full pricing — text + image + video + audio in one round-trip).
curl https://kymaapi.com/v1/models

How to choose quickly

  • Start with qwen-3.6-plus if you want the best default for general work, coding, and reasoning.
  • Step up to qwen-3.7-max when you want the highest-quality Qwen for hard reasoning and multilingual work and cost is secondary.
  • Use grok-4.3 for frontier-quality general work and agentic coding at a balanced price.
  • Use gemini-3.5-flash when you need 1M context with multimodal input (image, audio, video) and fast responses.
  • Use kimi-k2.6 for tool-heavy agents, long coding sessions, and image-aware workflows.
  • Use deepseek-v4-pro for top reasoning and complex coding with 1M context.
  • Use deepseek-v4-flash when you want V4-family quality at the cheapest price.
  • Use gemini-2.5-flash when you need 1M context or cheap long-context throughput.
  • Use qwen-3-32b when latency matters and you still want strong coding quality.
  • Use glm-5.1 when you need a long-running coding agent for repo-scale engineering work.
  • Use sonar when the answer needs live web data with citations — current events, prices, releases (sonar-pro for deeper research).

Filter the catalog

The live GET /v1/models endpoint now supports capability filters so agents can select models programmatically instead of hardcoding a shortlist.
# Tool-capable models for coding agents
curl "https://kymaapi.com/v1/models?recommended_for=coding&tools=true&supported_parameters=tools,structured_outputs"

# Fast, cheap models with at least 128K context
curl "https://kymaapi.com/v1/models?latency_tier=fast&cost_tier=cheap&min_context_window=128000"

# Vision-capable models
curl "https://kymaapi.com/v1/models?vision=true&input_modalities=text,image"
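
The same filter queries can be assembled in code. Below is a minimal Python sketch using only the standard library. `build_models_url` and `fetch_json` are hypothetical helper names, not part of any Kyma SDK. The sketch assumes, as the curl examples above suggest, that booleans are sent as lowercase `true`/`false` and list-valued filters as comma-separated strings. The JSON response schema is not documented on this page, so `fetch_json` returns the decoded body as-is.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the curl examples above.
BASE_URL = "https://kymaapi.com/v1"


def build_models_url(**filters):
    """Build a GET /v1/models URL from keyword filters (hypothetical helper)."""
    params = {}
    for name, value in filters.items():
        if isinstance(value, bool):
            # Assumption: booleans are serialized as lowercase true/false.
            params[name] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # Assumption: list filters are comma-separated, as in the curl examples.
            params[name] = ",".join(str(v) for v in value)
        else:
            params[name] = str(value)
    # safe="," keeps commas unescaped so the URL matches the curl form.
    query = urlencode(params, safe=",")
    return f"{BASE_URL}/models?{query}" if query else f"{BASE_URL}/models"


def fetch_json(url, timeout=30):
    """GET a URL and decode the JSON body; no response schema is assumed."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `build_models_url(latency_tier="fast", cost_tier="cheap", min_context_window=128000)` produces the second curl URL above, and the result can be passed to `fetch_json`.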

Qwen 3.6 Plus

#1 most popular. Closed-weight, highest quality overall. 131K context.
model="qwen-3.6-plus"

DeepSeek V4 Flash

Best value V4. 1M context, MIT, native reasoning. $0.19/M input.
model="deepseek-v4-flash"

DeepSeek V4 Pro

Top reasoning. 1.6T MoE flagship, 1M context, complex coding.
model="deepseek-v4-pro"

Kimi K2.6

Best for agents. Multimodal agentic model. 262K context.
model="kimi-k2.6"

Capability Guide

| Need | Best first pick | Why |
| --- | --- | --- |
| General default | qwen-3.6-plus | Best overall quality, strong multilingual reasoning |
| Max Qwen quality | qwen-3.7-max | Highest-quality Qwen, 1M context; premium step-up from 3.6 Plus |
| Frontier agentic (balanced cost) | grok-4.3 | Frontier reasoning + tool use at a balanced price, 1M context |
| Multimodal 1M context | gemini-3.5-flash | Image/audio/video input, fast, 1M context |
| Tool-heavy agents | kimi-k2.6 | Strong tool use, long context, multimodal |
| Top reasoning | deepseek-v4-pro | 1.6T MoE flagship, 1M context, native reasoning |
| Best value | deepseek-v4-flash | V4-tier quality at the lowest price, 1M context |
| Long-running coding agents | glm-5.1 | Better fit for repo-scale engineering and multi-step execution |
| Fast coding | qwen-3-32b | Lower latency while staying strong on code and math |
| 1M context | gemini-2.5-flash | Cheapest long-context option on Kyma |
| Vision | gemma-4-31b | Reliable image + text workflows |
| Live web search | sonar | Real-time web search with citations; sonar-pro for deeper research |
| Image generation | gpt-image-2 | OpenAI flagship: best text-in-image, multilingual typography. Quality dropdown low/medium/high. |
| Design-quality images | recraft-v4 | #1 HF Arena. Brand and illustration default at $0.054. |
| Video generation | kling-3-pro | Premium cinematic clips, hero brand video |

Tier 1 — Highest Quality

| Model ID | Name | Context | Speed | Best For |
| --- | --- | --- | --- | --- |
| qwen-3.6-plus | Qwen 3.6 Plus | 131K | Medium | General, #1 traffic |
| qwen-3.7-max | Qwen 3.7 Max | 1M | Medium | Max Qwen quality, reasoning, multilingual |
| grok-4.3 | Grok 4.3 | 1M | Medium | Frontier general, agentic coding |
| gemini-3.5-flash | Gemini 3.5 Flash | 1M | Fast | Multimodal input, fast long-context |
| deepseek-v4-pro | DeepSeek V4 Pro | 1M | Medium | Top reasoning, complex coding |
| deepseek-v4-flash | DeepSeek V4 Flash | 1M | Fast | Best value, long context |
| deepseek-v3 | DeepSeek V3 | 160K | Medium | Previous-gen flagship, stable |
| deepseek-r1 | DeepSeek R1 | 64K | Slow | Reasoning, analysis |
| kimi-k2.6 | Kimi K2.6 | 262K | Medium | Agentic coding, multimodal |
| gemma-4-31b | Gemma 4 31B | 128K | Medium | Multimodal, vision |
| qwen-3-32b | Qwen 3 32B | 32K | Fast | Code, math, multilingual |
| llama-3.3-70b | Llama 3.3 70B | 128K | Fast | General, most popular open model |
| minimax-m2.5 | MiniMax M2.5 | 196K | Medium | Agentic coding (SWE-bench 80.2%) |
| glm-5.1 | GLM 5.1 | 203K | Medium | Long-running coding agents, repo-scale engineering |

Tier 2 — High Quality

| Model ID | Name | Context | Speed | Best For |
| --- | --- | --- | --- | --- |
| minimax-m2.7 | MiniMax M2.7 | 205K | Medium | Agentic coding, productivity, debugging |
| gpt-oss-120b | GPT-OSS 120B | 128K | Medium | Writing, general intelligence |
| qwen-3-coder | Qwen 3 Coder | 131K | Medium | Code generation |
| gemini-2.5-flash | Gemini 2.5 Flash | 1M | Fast | Long context |
| gemini-3-flash | Gemini 3 Flash | 1M | Fast | Ultra-long context |
| glm-4.5-air | GLM 4.5 Air | 131K | Fast | Cheap agentic bulk tasks |
| glm-4.7-flash | GLM 4.7 Flash | 203K | Fast | Cheap long-context throughput |

Web Search

Live, web-grounded models that search the web on every request and return cited answers. They bill a small flat web-search fee on top of token cost; see Pricing → Web search. These models do not support tool calling: the search runs internally, so you just ask a question. The alias search maps to sonar.
| Model ID | Name | Context | Speed | Best For |
| --- | --- | --- | --- | --- |
| sonar | Sonar | 127K | Medium | Quick cited answers, current events, live lookups |
| sonar-pro | Sonar Pro | 200K | Medium | Deep multi-step research, longer cited reports |

Image Generation

Async endpoint at POST /v1/images/generations — pay per image, no token billing. See the Image Generation guide for prompting tips and full examples.
| Model | Best For | Cost / image | Input |
| --- | --- | --- | --- |
| gpt-image-2 | Text-in-image: multilingual typography, logos, posters | $0.014 / $0.081 / $0.297 (low/medium/high) | text + image |
| flux-2-pro | Photoreal, multi-reference blend (up to 10 sources) | $0.041–$0.101 | text + image(s) |
| recraft-v4-pro | 4MP print-ready design | $0.338 | text |
| ideogram-v3 | Typography, logos, packaging | $0.108 | text |
| recraft-v4 | Design-quality, brand assets | $0.054 | text |
| flux-kontext-pro | Image edit, inpaint, refine | $0.054 | text + image |
| recraft-v4-vector | Native SVG: logos, icons | $0.108 | text |
| recraft-v4-vector-pro | 4MP SVG, print-ready | $0.405 | text |
| minimax-image-01 | Sub-cent budget tier, bulk | $0.005 | text |
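
Because image models bill a flat price per image, budgeting reduces to a division. The sketch below is illustrative only: `images_within_budget` is a hypothetical helper, the prices are a snapshot of a few flat-priced rows from the table above (GET /v1/pricing is the live source), and it assumes no minimum charge or volume discount applies.

```python
from decimal import Decimal

# Snapshot of flat per-image prices (USD) copied from the table above.
# These are illustrative; query GET /v1/pricing for current values.
IMAGE_PRICE_USD = {
    "recraft-v4": Decimal("0.054"),
    "ideogram-v3": Decimal("0.108"),
    "recraft-v4-pro": Decimal("0.338"),
    "minimax-image-01": Decimal("0.005"),
}


def images_within_budget(model, budget_usd):
    """Return how many whole images a USD budget covers (hypothetical helper)."""
    price = IMAGE_PRICE_USD[model]
    # Decimal avoids float rounding; // is floor division for positive values.
    return int(Decimal(str(budget_usd)) // price)
```

Under these assumptions, a $1 budget covers 200 images on minimax-image-01 and 18 on recraft-v4.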

Video Generation

Async endpoint at POST /v1/videos/generations — pay per second of generated footage. See the Video Generation guide for prompting tips and the full per-SKU breakdown.
| Model | Best For | Cost / sec | 5s clip | Audio | Input |
| --- | --- | --- | --- | --- | --- |
| kling-2.5-pro | Budget cinematic, b-roll | $0.0945 | $0.4725 | | text + image |
| kling-3-pro | Premium cinematic, hero video | $0.1512 | $0.7560 | | text + image |
| kling-3-pro-audio | Cinematic w/ diegetic sound | $0.2268 | $1.1340 | native | text + image |
| seedance-2-pro | Action, multi-shot, social | $0.40959 | $2.04795 | bundled | text + image |
| seedance-2-fast | Social shorts, rapid iteration | $0.326565 | $1.63283 | bundled | text + image |
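
The "5s clip" column is the per-second rate multiplied by five. A minimal Python sketch of that arithmetic follows, for estimating other durations. `clip_cost_usd` is a hypothetical helper, the rates are a snapshot of the table above, and the sketch assumes cost scales linearly with duration with no minimum length or billing-side rounding, which this page does not confirm.

```python
from decimal import Decimal

# Snapshot of per-second rates (USD) copied from the table above.
VIDEO_PRICE_PER_SEC_USD = {
    "kling-2.5-pro": Decimal("0.0945"),
    "kling-3-pro": Decimal("0.1512"),
    "kling-3-pro-audio": Decimal("0.2268"),
    "seedance-2-pro": Decimal("0.40959"),
    "seedance-2-fast": Decimal("0.326565"),
}


def clip_cost_usd(model, seconds):
    """Estimate the cost of a clip: per-second rate times duration (hypothetical helper)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Convert via str so float durations such as 7.5 stay exact in Decimal.
    return VIDEO_PRICE_PER_SEC_USD[model] * Decimal(str(seconds))
```

`clip_cost_usd("kling-3-pro", 5)` reproduces the $0.7560 in the table. For seedance-2-fast the exact product is 1.632825, which the table shows rounded to $1.63283.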

Models are updated regularly. For the live canonical list, use GET /v1/models. Disabled models are intentionally omitted from this page.