# Kyma API

## Docs

- [Audio Music](https://docs.kymaapi.com/api-reference/audio-music.md): Prompt-driven music generation. Synchronous, returns audio bytes, billed per second (ElevenLabs) or per song (MiniMax).
- [Audio Sound Effects](https://docs.kymaapi.com/api-reference/audio-sfx.md): Generate non-speech audio from a text prompt. Synchronous, flat per-generation pricing.
- [Audio Speech (TTS)](https://docs.kymaapi.com/api-reference/audio-speech.md): Text-to-speech. Synchronous, returns audio bytes, billed per character. Several voice models, 3,000+ pre-made voices.
- [Audio Transcriptions](https://docs.kymaapi.com/api-reference/audio-transcriptions.md): Speech-to-text. Two input modes — multipart file (25 MB, OpenAI Whisper compatible) or JSON audio_url (100 MB, Kyma extension). Synchronous, billed per minute.
- [Audio Understand](https://docs.kymaapi.com/api-reference/audio-understand.md): Audio scene Q&A. Hears tone, music, SFX, language, speaker emotion. Custom Kyma endpoint, no OpenAI equivalent.
- [Voice Clone (MiniMax)](https://docs.kymaapi.com/api-reference/audio-voice-clone.md): Clone a voice from a 10-second to 5-minute reference recording. Returns a voice_id usable in /v1/audio/speech with any MiniMax HD or Turbo SKU.
- [Voice Design (MiniMax)](https://docs.kymaapi.com/api-reference/audio-voice-design.md): Generate a synthesized voice profile from a natural-language description. No reference recording needed. Returns a voice_id usable in /v1/audio/speech.
- [Audio Voices](https://docs.kymaapi.com/api-reference/audio-voices.md): Browse the available voice library across both providers. Pick a voice_id for /v1/audio/speech.
- [Manage API Keys](https://docs.kymaapi.com/api-reference/auth-keys.md): Create, list, and delete API keys.
- [GET /v1/auth/limits](https://docs.kymaapi.com/api-reference/auth-limits.md): Check your tier, rate limits, and current usage.
- [Login](https://docs.kymaapi.com/api-reference/auth-login.md): Sign in and get a session token.
- [Get User Info](https://docs.kymaapi.com/api-reference/auth-me.md): Get your account info and API keys.
- [Register](https://docs.kymaapi.com/api-reference/auth-register.md): Create a new account and get an API key.
- [Chat Completions](https://docs.kymaapi.com/api-reference/chat-completions.md): Generate a chat response from a model.
- [Image Generations](https://docs.kymaapi.com/api-reference/images-generations.md): Generate or edit images. Asynchronous — returns 202 with a job_id; poll GET /v1/jobs/{id} for the result.
- [Tier Matrix](https://docs.kymaapi.com/api-reference/limits-tiers.md): Public tier matrix. Returns the full TIERS array — deposit threshold, RPM, TPM, image/video/audio concurrency caps per tier. Cached 5 min.
- [List Models](https://docs.kymaapi.com/api-reference/models-list.md): List all available models.
- [Pricing Catalog](https://docs.kymaapi.com/api-reference/pricing-catalog.md): Public pricing catalog. Single round-trip returns every active SKU across text, image, video, and audio with per-call cost detail. Cached 60s.
- [Video Generations](https://docs.kymaapi.com/api-reference/videos-generations.md): Generate video clips. Asynchronous — returns 202 with a job_id; poll GET /v1/jobs/{id} for the result.
- [Changelog](https://docs.kymaapi.com/changelog.md): Latest updates to Kyma API.
- [FAQ](https://docs.kymaapi.com/faq.md): Frequently asked questions about Kyma API.
- [Kyma Agent](https://docs.kymaapi.com/guides/agent.md): Terminal-first AI coding agent powered by Kyma API. Install kyma once, sign in once, and switch between coding, reasoning, fast, cheap, vision, and long-context modes.
- [Agent Auto-Setup](https://docs.kymaapi.com/guides/agent-setup.md): Universal setup guide for any AI agent or OpenAI-compatible client. Just provide your API key.
- [Aider](https://docs.kymaapi.com/guides/aider.md): Use Kyma with Aider — AI pair programming in your terminal.
- [Anthropic SDK](https://docs.kymaapi.com/guides/anthropic.md): Use Kyma with the Anthropic Python/JS SDK — drop-in compatible.
- [Google Antigravity](https://docs.kymaapi.com/guides/antigravity.md): Use Kyma API with Google Antigravity — the agentic IDE from Google DeepMind.
- [Any OpenAI Client](https://docs.kymaapi.com/guides/any-openai-client.md): Use Kyma with any tool that supports OpenAI-compatible APIs.
- [Authentication](https://docs.kymaapi.com/guides/authentication.md): How to get and use your Kyma API key.
- [Cline](https://docs.kymaapi.com/guides/cline.md): Use Kyma as your AI backend in Cline (VS Code agent).
- [Cloudflare Workers](https://docs.kymaapi.com/guides/cloudflare-workers.md): Deploy a Kyma-powered AI API on Cloudflare Workers.
- [GitHub Copilot](https://docs.kymaapi.com/guides/copilot.md): Use Kyma models with GitHub Copilot in VS Code.
- [cURL](https://docs.kymaapi.com/guides/curl.md): Use Kyma API directly from the terminal.
- [Cursor IDE](https://docs.kymaapi.com/guides/cursor.md): Use Kyma API as your AI backend in Cursor.
- [Error Handling](https://docs.kymaapi.com/guides/error-handling.md): Handle errors and rate limits gracefully.
- [GitHub Actions](https://docs.kymaapi.com/guides/github-actions.md): Use Kyma API in GitHub Actions for AI-powered code review and automation.
- [JavaScript / Node.js](https://docs.kymaapi.com/guides/javascript.md): Integrate Kyma API with JavaScript using the OpenAI SDK.
- [Kilo Code](https://docs.kymaapi.com/guides/kilo-code.md): Use Kyma with Kilo Code VS Code extension.
- [Kyma Ter + Kyma Agent](https://docs.kymaapi.com/guides/kyma-ter/kyma-integration.md): How Kyma Ter and Kyma Agent work together: one package, two commands, shared setup, and a local multi-agent workflow.
- [Kyma Ter Overview](https://docs.kymaapi.com/guides/kyma-ter/overview.md): Kyma Ter is the local multi-agent terminal workspace for running Kyma Agent and shell sessions side-by-side in a browser UI.
- [Kyma Ter Quickstart](https://docs.kymaapi.com/guides/kyma-ter/quickstart.md): Install Kyma Agent, launch Kyma Ter, complete setup, and open your first Kyma Agent and shell panes.
- [LangChain](https://docs.kymaapi.com/guides/langchain.md): Use Kyma API with LangChain for building AI applications, chains, and agents.
- [MCP Server](https://docs.kymaapi.com/guides/mcp-server.md): Access 13+ AI models from Claude Code, Cursor, or any MCP client via Kyma's MCP server.
- [Model Aliases](https://docs.kymaapi.com/guides/model-aliases.md): Use simple aliases like 'best', 'fast', 'code' instead of memorizing model IDs.
- [Choosing a Model](https://docs.kymaapi.com/guides/models.md): How to pick the right Kyma model without memorizing the whole catalog.
- [n8n](https://docs.kymaapi.com/guides/n8n.md): Use Kyma API as an AI provider in n8n automation workflows.
- [OpenAffiliate](https://docs.kymaapi.com/guides/openaffiliate.md): Use Kyma API with OpenAffiliate to generate affiliate content from real program data.
- [OpenClaw Setup](https://docs.kymaapi.com/guides/openclaw.md): Use OpenClaw with Kyma API — active open models, auto-failover, 2-minute setup.
- [OpenCode](https://docs.kymaapi.com/guides/opencode.md): Use Kyma with OpenCode — open source coding assistant.
- [Prompt Caching](https://docs.kymaapi.com/guides/prompt-caching.md): Save up to 90% on repeated prompts with automatic prompt caching.
- [Python](https://docs.kymaapi.com/guides/python.md): Full Python integration guide with OpenAI SDK.
- [Rate Limits](https://docs.kymaapi.com/guides/rate-limits.md): Tier-based limits across text, image, video, audio, and realtime endpoints. Sub-pool isolation per audio provider so saturating one capability doesn't starve another.
- [Realtime Audio](https://docs.kymaapi.com/guides/realtime-audio.md): Build voice agents on Gemini Live — 5000 concurrent sessions, 30 voices, 24 languages, billed per minute via WebSocket.
- [Roo Code](https://docs.kymaapi.com/guides/roo-code.md): Use Kyma with Roo Code VS Code extension.
- [Streaming](https://docs.kymaapi.com/guides/streaming.md): Stream responses token-by-token for real-time UX.
- [Structured Outputs](https://docs.kymaapi.com/guides/structured-outputs.md): Get JSON responses from models. Supports json_object and json_schema modes.
- [Tool Calling](https://docs.kymaapi.com/guides/tool-calling.md): Use function calling with Kyma API. Same format as OpenAI.
- [Agent Backend](https://docs.kymaapi.com/guides/use-cases/agent-backend.md): Build an AI agent backend with tool calling, multi-step reasoning, and conversation loops.
- [Automation Workflows](https://docs.kymaapi.com/guides/use-cases/automation-workflows.md): Automate repetitive tasks with AI — email drafts, data processing, notifications, and more.
- [Build a Chatbot](https://docs.kymaapi.com/guides/use-cases/chatbot.md): Streaming multi-turn chatbot with conversation history.
- [Build a Coding Agent](https://docs.kymaapi.com/guides/use-cases/coding-agent.md): Agent that writes and executes code using tool calling.
- [Content Generation Pipeline](https://docs.kymaapi.com/guides/use-cases/content-generation.md): Batch content generation with structured JSON output.
- [Data Extraction & Structured Output](https://docs.kymaapi.com/guides/use-cases/data-extraction.md): Extract structured data from unstructured text using JSON schemas.
- [Internal Copilot](https://docs.kymaapi.com/guides/use-cases/internal-copilot.md): Build an internal AI assistant that answers questions using your company's knowledge base.
- [RAG & Search Applications](https://docs.kymaapi.com/guides/use-cases/rag-search.md): Retrieval-augmented generation — inject context, get grounded answers.
- [Vercel AI SDK](https://docs.kymaapi.com/guides/vercel-ai-sdk.md): Use Kyma API with the Vercel AI SDK for streaming, chat UIs, and structured output.
- [Windows Setup](https://docs.kymaapi.com/guides/windows-setup.md): Copy-paste-first setup guide for Kyma Agent and Kyma Ter on Windows.
- [Windsurf](https://docs.kymaapi.com/guides/windsurf.md): Use Kyma with Windsurf (Codeium) IDE.
- [What is Kyma API?](https://docs.kymaapi.com/introduction.md): LLM API gateway — active models, one endpoint, built-in redundancy.
- [Audio](https://docs.kymaapi.com/models/audio.md): Two audio models on two endpoints. Speech-to-text and audio-scene Q&A — pay per minute.
- [DeepSeek](https://docs.kymaapi.com/models/deepseek.md): DeepSeek's frontier models — V4 Pro and V4 Flash for the latest tier, R1 for reasoning, V3 for stable baseline.
- [DeepSeek R1](https://docs.kymaapi.com/models/deepseek-r1.md): Kyma's strongest pure reasoning model for hard analysis, math, and multi-step logic.
- [DeepSeek V3](https://docs.kymaapi.com/models/deepseek-v3.md): Kyma's best value frontier-class model for general work, coding, and analysis.
- [DeepSeek V4 Flash](https://docs.kymaapi.com/models/deepseek-v4-flash.md): Fast, cheap V4-tier model. 1M context, MoE, MIT licensed.
- [DeepSeek V4 Pro](https://docs.kymaapi.com/models/deepseek-v4-pro.md): DeepSeek's V4 flagship — 1.6T MoE for top reasoning, complex coding, and long-context work.
- [Gemini 2.5 Flash](https://docs.kymaapi.com/models/gemini-2.5-flash.md): Kyma's best long-context value pick with 1M context and multimodal input support.
- [Gemini 3 Flash](https://docs.kymaapi.com/models/gemini-3-flash.md): A newer 1M-context Gemini option on Kyma for long-context reasoning and multimodal analysis.
- [Gemini 3 Flash (Audio)](https://docs.kymaapi.com/models/gemini-3-flash-audio.md): Gemini 3 Flash audio understanding on Kyma. Tone, music, SFX, language, speaker emotion — $0.000648/min.
- [Gemini 3.5 Flash](https://docs.kymaapi.com/models/gemini-3.5-flash.md): Google's newest Gemini Flash — 1M context, multimodal input, fast reasoning.
- [Gemma 4 31B](https://docs.kymaapi.com/models/gemma-4-31b.md): A strong low-cost multimodal model on Kyma for image-aware tasks and general work.
- [Gemma & Gemini (Google)](https://docs.kymaapi.com/models/gemma-gemini.md): Google's open and proprietary models.
- [GLM 4.5 Air](https://docs.kymaapi.com/models/glm-4.5-air.md): A cheap agentic GLM model on Kyma for bulk automation, long context, and cost-sensitive workflows.
- [GLM 4.7 Flash](https://docs.kymaapi.com/models/glm-4.7-flash.md): A fast, efficient GLM model on Kyma for cheap long-context throughput and routine workloads.
- [GLM 5.1](https://docs.kymaapi.com/models/glm-5.1.md): A flagship engineering model on Kyma for long-running coding agents, repo-scale work, and multi-step execution.
- [GPT-4o mini Transcribe](https://docs.kymaapi.com/models/gpt-4o-mini-transcribe-2025-12-15.md): OpenAI premium STT on Kyma. Best accuracy on conversational audio, noisy backgrounds, and code-switching languages. $0.00405/min via the `transcribe-quality` alias.
- [GPT Image 2](https://docs.kymaapi.com/models/gpt-image-2.md): OpenAI's flagship image model. Near-perfect text-in-image (multilingual), reasoning-augmented composition, photoreal output. Quality dropdown low/medium/high.
- [GPT-OSS (OpenAI)](https://docs.kymaapi.com/models/gpt-oss.md): OpenAI's open source models.
- [GPT-OSS 120B](https://docs.kymaapi.com/models/gpt-oss-120b.md): A low-cost, broad-utility model on Kyma for writing, general tasks, and budget-sensitive workloads.
- [Grok 4.3](https://docs.kymaapi.com/models/grok-4.3.md): xAI's frontier model — 1M context, strong reasoning and tool use for agentic coding.
- [Hailuo 02 (1080p)](https://docs.kymaapi.com/models/hailuo-02-1080p.md): MiniMax Hailuo 02 video at 1080p — premium tier, full HD output. Flat $0.780 per clip.
- [Hailuo 02 (512p)](https://docs.kymaapi.com/models/hailuo-02-512p.md): MiniMax Hailuo 02 video at 512p — cheapest video tier on Kyma. Flat $0.140 per clip.
- [Hailuo 02 (768p)](https://docs.kymaapi.com/models/hailuo-02-768p.md): MiniMax Hailuo 02 video at 768p — mid tier balanced quality vs cost. Flat $0.420 per clip.
- [Image Generation](https://docs.kymaapi.com/models/image-generation.md): Sixteen image models on one async endpoint. Photo, edit, typography, vector, native edit-mode — pay per image, per megapixel, or per quality tier.
- [Imagen 4](https://docs.kymaapi.com/models/imagen-4.md): Google's default Imagen tier — photoreal humans, sharp text, rich composition. Balanced quality and cost at $0.054/image. Sync API.
- [Imagen 4 Fast](https://docs.kymaapi.com/models/imagen-4-fast.md): Google Imagen 4 fast tier — quickest gen, lower fidelity. Cheapest Imagen on Kyma at $0.027/image. Sync API, blob-hosted output.
- [Imagen 4 Ultra](https://docs.kymaapi.com/models/imagen-4-ultra.md): Google Imagen 4 highest fidelity — best detail, slowest gen. Print-ready hero assets at $0.081/image. Sync API.
- [Kimi (Moonshot)](https://docs.kymaapi.com/models/kimi.md): Moonshot's Kimi K2.6 agentic flagship and K2.5 previous-gen on Kyma.
- [Kimi K2.5](https://docs.kymaapi.com/models/kimi-k2.5.md): Kyma's top pick for tool-heavy agents, multimodal workflows, and long coding sessions.
- [Kimi K2.6](https://docs.kymaapi.com/models/kimi-k2.6.md): Moonshot's newest agentic flagship on Kyma. Best for tool-heavy agents, multimodal workflows, and long coding sessions.
- [Kling 2.5 Pro](https://docs.kymaapi.com/models/kling-2.5-pro.md): Kuaishou's budget cinematic video model on Kyma. Photoreal humans, smooth motion, 5–10s clips, pay per second.
- [Kling 3 Pro](https://docs.kymaapi.com/models/kling-3-pro.md): Kuaishou's flagship cinematic video model on Kyma. Sharper than 2.5 Pro, 5–10s clips, pay per second.
- [Kling 3 Pro (Audio)](https://docs.kymaapi.com/models/kling-3-pro-audio.md): Kuaishou's flagship Kling with native audio on Kyma. Same visuals as Kling 3 Pro plus synchronized sound, pay per second.
- [Llama (Meta)](https://docs.kymaapi.com/models/llama.md): Meta's open source language models.
- [Llama 3.3 70B](https://docs.kymaapi.com/models/llama-3.3-70b.md): A balanced open model on Kyma for general work, coding, and dependable day-to-day usage.
- [MiniMax](https://docs.kymaapi.com/models/minimax.md): MiniMax M2.5 and M2.7 — strong agentic coding and productivity models.
- [MiniMax Image 01](https://docs.kymaapi.com/models/minimax-image-01.md): Sub-cent image generation. Cheapest tier on Kyma — $0.005 per image flat regardless of resolution.
- [MiniMax M2.5](https://docs.kymaapi.com/models/minimax-m2.5.md): A strong engineering model on Kyma for coding agents, debugging, and long multi-step tasks.
- [MiniMax M2.7](https://docs.kymaapi.com/models/minimax-m2.7.md): A newer MiniMax model on Kyma focused on agentic productivity and debugging workflows.
- [MiniMax Music](https://docs.kymaapi.com/models/minimax-music.md): Lyrics-driven music generation on Kyma. Music-2.0 family. ~90× cheaper than ElevenLabs Music for non-hero use cases.
- [MiniMax Music Pro](https://docs.kymaapi.com/models/minimax-music-pro.md): Music-2.5+ generation on Kyma. Higher fidelity than Music-2.0, richer arrangements, ~19× cheaper than ElevenLabs Music.
- [MiniMax Speech HD](https://docs.kymaapi.com/models/minimax-speech-hd.md): MiniMax HD voice on Kyma. Multilingual TTS at production quality, ~2.9× cheaper than ElevenLabs Multilingual v2.
- [MiniMax Speech Turbo](https://docs.kymaapi.com/models/minimax-speech-turbo.md): MiniMax low-latency voice on Kyma. ~2.2× cheaper than ElevenLabs Flash v2.5. Best for real-time voice agents.
- [MiniMax Voice Clone](https://docs.kymaapi.com/models/minimax-voice-clone.md): Clone a voice from a 10s-5min reference recording. Returns a voice_id usable in /v1/audio/speech with any MiniMax HD or Turbo SKU.
- [MiniMax Voice Design](https://docs.kymaapi.com/models/minimax-voice-design.md): Generate a synthesized voice from a natural-language description. No reference audio needed.
- [Nano Banana](https://docs.kymaapi.com/models/nano-banana.md): Google's Gemini image-gen with native edit-mode — image-in + prompt → image-out. Flat $0.046/image. Sync API.
- [Nano Banana 3 Flash (preview)](https://docs.kymaapi.com/models/nano-banana-3-flash.md): Newer Gemini 3.1 image-gen, preview tier. Same edit-mode + pricing as nano-banana; sharper output. Routed to Vertex global region.
- [Other Models](https://docs.kymaapi.com/models/others.md): Specialized models on Kyma.
- [All Models](https://docs.kymaapi.com/models/overview.md): Browse Kyma's active models by capability, cost, latency, and use case.
- [Qwen (Alibaba)](https://docs.kymaapi.com/models/qwen.md): Alibaba's Qwen series — #1 most popular model family on Kyma.
- [Qwen 3 32B](https://docs.kymaapi.com/models/qwen-3-32b.md): Kyma's fast Qwen option for coding loops, math, and lower-latency workflows.
- [Qwen 3 Coder](https://docs.kymaapi.com/models/qwen-3-coder.md): A code-specialized Qwen model for generation, debugging, and structured engineering tasks.
- [Qwen 3.6 Plus](https://docs.kymaapi.com/models/qwen-3.6-plus.md): Kyma's strongest default model for general work, coding, and multilingual reasoning.
- [Qwen 3.7 Max](https://docs.kymaapi.com/models/qwen-3.7-max.md): Alibaba's newest closed-weight flagship — 1M context, top-tier reasoning and multilingual quality.
- [Which model should I use?](https://docs.kymaapi.com/models/recommended.md): Choose the right Kyma model by task, constraint, and tradeoff.
- [Seedance 2 Fast](https://docs.kymaapi.com/models/seedance-2-fast.md): ByteDance's fast Seedance tier on Kyma. Quick generation, ~20% cheaper than Pro, native audio bundled, pay per second.
- [Seedance 2 Pro](https://docs.kymaapi.com/models/seedance-2-pro.md): ByteDance's flagship video model on Kyma. Multi-shot, dynamic camera, native audio bundled, pay per second.
- [Sonar](https://docs.kymaapi.com/models/sonar.md): Perplexity's live web-search model on Kyma — current, cited answers grounded in a real-time search of the web.
- [Sonar Pro](https://docs.kymaapi.com/models/sonar-pro.md): Perplexity's pro web-search model on Kyma — deeper multi-step search, 200K context, longer cited reports.
- [Veo 3](https://docs.kymaapi.com/models/veo-3.md): Google Veo 3 flagship — 1080p with native audio (dialogue + ambient + lip-sync). Top-quality cinematic at $0.540/sec. Async LRO.
- [Veo 3 Fast](https://docs.kymaapi.com/models/veo-3-fast.md): Google Veo 3 fast tier — 720p, no audio. Cheapest Veo on Kyma at $0.135/sec. Async LRO ~30–60s.
- [Video Generation](https://docs.kymaapi.com/models/video-generation.md): Ten video models on one async endpoint. Cinematic, social, native-audio Veo + Seedance + Kling + Hailuo — pay per second or per call.
- [Whisper Large v3 Turbo](https://docs.kymaapi.com/models/whisper-v3-turbo.md): OpenAI Whisper Large v3 Turbo on Kyma. Speech-to-text at 228x realtime, $0.0009/min, OpenAI Whisper API compatible.
- [Pricing](https://docs.kymaapi.com/pricing.md): Pay-per-token pricing, credits, and rate limits. Start with $0.50 free.
- [Quickstart](https://docs.kymaapi.com/quickstart.md): Get from zero to working API call in 30 seconds.
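
The Image Generations and Video Generations entries describe an asynchronous pattern: the request returns 202 with a job_id, and the client polls GET /v1/jobs/{id} for the result. The following is a minimal sketch of such a polling loop, not Kyma's own client code. The helper name `poll_job`, its parameters, and the `status` field used in the usage note are all hypothetical; the real response shape is defined in the linked API reference pages, so the HTTP call and the completion check are passed in as functions instead of being hard-coded.

```python
import time
from typing import Any, Callable, Dict


def poll_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    is_finished: Callable[[Dict[str, Any]], bool],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_job(job_id) until is_finished(job) is true or the timeout elapses.

    fetch_job is expected to wrap GET /v1/jobs/{id} and return the parsed JSON body.
    is_finished decides, from that body, whether the job has reached a final state.
    Both are supplied by the caller because the response fields are an assumption here.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job(job_id)
        if is_finished(job):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
        sleep(interval_s)
```

In use, `fetch_job` would send an authenticated GET to /v1/jobs/{id} with your API key, and `is_finished` might look like `lambda job: job.get("status") in {"done", "failed"}`, where both the field name and the values are placeholders to be replaced with whatever the jobs endpoint actually returns.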