POST /v1/audio/music
Audio Music
curl --request POST \
  --url https://kymaapi.com/v1/audio/music \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "prompt": "<string>",
  "lyrics": "<string>",
  "music_length_ms": 123,
  "response_format": "<string>"
}
'
Synchronous endpoint serving three models across two providers:
  • ElevenLabs Music (elevenlabs-music) — caller supplies music_length_ms, billed per second.
  • MiniMax Music 2.0 (minimax-music) — provider derives length from lyrics, flat per-song price.
  • MiniMax Music Pro (minimax-music-pro) — same flat-per-song shape, higher fidelity. Currently backed by music-2.6 (upgraded from music-2.5 on 2026-05-17; API contract and pricing unchanged).
# ElevenLabs — pure description
curl -X POST https://kymaapi.com/v1/audio/music \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs-music",
    "prompt": "warm lo-fi hip-hop, jazz piano, soft snare, late evening city ambience",
    "music_length_ms": 30000
  }' \
  --output bed.mp3

# MiniMax — style prompt + structured lyrics
curl -X POST https://kymaapi.com/v1/audio/music \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-music-pro",
    "prompt": "uplifting indie pop, female vocals, anthemic chorus",
    "lyrics": "[verse]\nWalking through the morning light\n[chorus]\nWe will rise again tonight\n[Instrumental]"
  }' \
  --output song.mp3
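
The same two calls can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper names `build_music_request` and `generate_to_file` are hypothetical, and only the URL, headers, and body fields come from the curl examples above.

```python
import json
import urllib.request

KYMA_MUSIC_URL = "https://kymaapi.com/v1/audio/music"


def build_music_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the music endpoint."""
    return urllib.request.Request(
        KYMA_MUSIC_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate_to_file(req: urllib.request.Request, path: str) -> None:
    """Send the request and write the returned audio bytes to `path`."""
    with urllib.request.urlopen(req) as resp, open(path, "wb") as out:
        # Copy in chunks so a multi-minute track is never held fully in memory.
        while chunk := resp.read(64 * 1024):
            out.write(chunk)
```

To reproduce the first curl call, pass `{"model": "elevenlabs-music", "prompt": "...", "music_length_ms": 30000}` to `build_music_request` together with the key from `KYMA_API_KEY`, then hand the result to `generate_to_file(request, "bed.mp3")`.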

Request

application/json body.
model
string
default:"elevenlabs-music"
One of elevenlabs-music, minimax-music, minimax-music-pro.
prompt
string
Description of the music. Max 2000 characters for elevenlabs-music and 200 for the MiniMax models (see Char limits). Required for elevenlabs-music. For MiniMax models the prompt drives style/arrangement; lyrics drive vocals.
lyrics
string
MiniMax music only. Vocal content with optional structural tags: [verse], [chorus], [bridge], [Instrumental]. Pass [Instrumental] alone for a vocal-less track. ElevenLabs ignores this field.
music_length_ms
integer
default:"30000"
Output duration in milliseconds. Range: 1000 (1 second) to 300000 (5 minutes). ElevenLabs honors this; MiniMax derives length from the lyrics. Also accepts duration_ms as an alias.
response_format
string
default:"mp3_44100_128"
Audio format. Same options as audio/speech.
save_to_blob
string
Optional query param. Set to 1 to have Kyma upload the resulting MP3 to Vercel Blob, write a multimodal_jobs row (so the call appears in your gallery + share pages), and return JSON { job_id, kind: "audio", url, duration_sec, cost_usd, balance_usd } instead of streaming bytes. Used by the Canvas and Muse audio kinds — most direct API callers don’t need this.
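
The field rules above can be checked on the client before a request is sent. The sketch below is one possible reading of them, not the gateway's own validation: `build_body` and `endpoint_url` are hypothetical helpers, and dropping `music_length_ms` for MiniMax models is a choice made here because those models derive length from the lyrics.

```python
from urllib.parse import urlencode

MUSIC_MODELS = {"elevenlabs-music", "minimax-music", "minimax-music-pro"}
MIN_LENGTH_MS, MAX_LENGTH_MS = 1000, 300000
DEFAULT_LENGTH_MS = 30000


def build_body(model="elevenlabs-music", prompt=None, lyrics=None, music_length_ms=None):
    """Return a JSON-ready body holding only the fields the chosen model uses."""
    if model not in MUSIC_MODELS:
        raise ValueError(f"{model!r} is not a music model")
    body = {"model": model}
    if model == "elevenlabs-music":
        if not prompt:
            raise ValueError("prompt is required for elevenlabs-music")
        length = DEFAULT_LENGTH_MS if music_length_ms is None else music_length_ms
        if not MIN_LENGTH_MS <= length <= MAX_LENGTH_MS:
            raise ValueError(f"music_length_ms must be in [{MIN_LENGTH_MS}, {MAX_LENGTH_MS}]")
        body["music_length_ms"] = length
        # lyrics is left out: ElevenLabs ignores the field.
    elif lyrics is not None:
        # MiniMax: lyrics drive vocals, and the provider derives the length.
        body["lyrics"] = lyrics
    if prompt:
        body["prompt"] = prompt
    return body


def endpoint_url(save_to_blob=False):
    """Append ?save_to_blob=1 only when a JSON response is wanted."""
    base = "https://kymaapi.com/v1/audio/music"
    return f"{base}?{urlencode({'save_to_blob': 1})}" if save_to_blob else base
```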

Response

Default (streaming)

200 OK with audio bytes. Headers:
| Header | What |
| --- | --- |
| X-Kyma-Model | resolved model id |
| X-Kyma-Duration-Sec | clip duration used for billing |
| X-Kyma-Cost-USD | actual cost charged |
| X-Kyma-Balance-USD | remaining balance |
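
A caller that streams the audio can still record what the call cost by reading these headers. The following is a sketch under assumptions: the header names are taken from the list above, but the value formats (plain decimal strings) are assumed, and `MusicBilling` and `parse_billing_headers` are hypothetical names.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Mapping


@dataclass(frozen=True)
class MusicBilling:
    model: str
    duration_sec: float
    cost_usd: Decimal
    balance_usd: Decimal


def parse_billing_headers(headers: Mapping[str, str]) -> MusicBilling:
    """Extract the X-Kyma-* billing headers, matching names case-insensitively."""
    h = {k.lower(): v for k, v in headers.items()}
    return MusicBilling(
        model=h["x-kyma-model"],
        duration_sec=float(h["x-kyma-duration-sec"]),
        # Decimal keeps dollar amounts exact (assumes plain decimal strings).
        cost_usd=Decimal(h["x-kyma-cost-usd"]),
        balance_usd=Decimal(h["x-kyma-balance-usd"]),
    )
```

With `urllib.request`, the response object's `headers` attribute can be passed in directly, since it exposes `items()`.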

With ?save_to_blob=1

200 OK with JSON:
{
  "object": "audio.generation",
  "job_id": "mmj_a1b2c3d4e5f6...",
  "kind": "audio",
  "model": "minimax-music-pro",
  "url": "https://blob.vercel-storage.com/mmj/.../mmj_a1b2....mp3",
  "duration_sec": 86,
  "cost_usd": 0.21,
  "balance_usd": 47.79
}
The url is a permanent Vercel Blob URL — safe to embed in apps, store in a DB, share publicly.

Pricing

| SKU | Price | Pricing mode |
| --- | --- | --- |
| elevenlabs-music | $0.135 / sec | Per-second (caller-controlled length) |
| minimax-music | $0.045 / song | Flat per generation |
| minimax-music-pro | $0.21 / song | Flat per generation |
ElevenLabs is duration-proportional, MiniMax is flat per song. Cost comparison for typical lengths:
| Length | ElevenLabs Music | MiniMax Music Pro | MiniMax Music |
| --- | --- | --- | --- |
| 30 seconds | $4.05 | $0.21 | $0.045 |
| 60 seconds | $8.10 | $0.21 | $0.045 |
| 3 minutes | $24.30 | $0.21 | $0.045 |
| 5 minutes (max) | $40.50 | $0.21 | $0.045 |
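Across the lengths above, MiniMax Music Pro is roughly 19–193× cheaper than ElevenLabs, and MiniMax Music roughly 90–900× cheaper. Pick ElevenLabs only when you need fine-grained length control on a very short bed: the break-even against MiniMax Music Pro is about 1.5 seconds ($0.21 ÷ $0.135 per second), and against MiniMax Music it falls below the 1-second minimum. Otherwise the flat MiniMax pricing wins. ElevenLabs’s worst-case rate is locked at the Pro tier overage to stay safe across subscription levels.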
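
The comparison figures follow directly from the listed prices. As a worked sketch, assuming ElevenLabs cost is exactly proportional to the requested length with no rounding (the real billed duration is whatever X-Kyma-Duration-Sec reports), and with hypothetical function names:

```python
from decimal import Decimal

ELEVENLABS_PER_SEC = Decimal("0.135")
FLAT_PER_SONG = {
    "minimax-music": Decimal("0.045"),
    "minimax-music-pro": Decimal("0.21"),
}


def estimate_cost_usd(model: str, length_sec: float = 0) -> Decimal:
    """Per-second price for ElevenLabs, flat per-song price for MiniMax."""
    if model == "elevenlabs-music":
        return ELEVENLABS_PER_SEC * Decimal(str(length_sec))
    return FLAT_PER_SONG[model]


def break_even_sec(flat_model: str) -> Decimal:
    """Length at which ElevenLabs costs the same as one flat-priced song."""
    return FLAT_PER_SONG[flat_model] / ELEVENLABS_PER_SEC


for seconds in (30, 60, 180, 300):
    print(seconds, estimate_cost_usd("elevenlabs-music", seconds))
print(round(break_even_sec("minimax-music-pro"), 2))  # 1.56
print(round(break_even_sec("minimax-music"), 2))      # 0.33
```

Since the shortest allowed clip is 1 second ($0.135), ElevenLabs can only undercut minimax-music-pro, never minimax-music.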

Char limits

Limits are set per upstream provider. The gateway rejects oversized inputs before forwarding, so an over-limit request costs you neither bandwidth nor a hold on your balance. Counts are NFC-normalized so Vietnamese / accented Latin / other diacritic-heavy scripts measure by visible characters, not UTF-16 code units.
| SKU | prompt | lyrics | Why |
| --- | --- | --- | --- |
| elevenlabs-music | 2000 chars | (ignored) | Free-form description, vocals included via prompt cues |
| minimax-music | 200 chars | 600 chars | Provider splits style cue from vocal content |
| minimax-music-pro | 200 chars | 600 chars | Same shape as minimax-music |
Errors include the SKU name and the field that overflowed:
prompt too long for minimax-music-pro: 245 chars (max 200).
lyrics too long for minimax-music-pro: 612 chars (max 600). Use [Instrumental] for vocal-less.
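
To catch an overflow before sending, a client can approximate the same count locally. The sketch below normalizes to NFC and counts code points, which is an assumption about how the gateway measures: its exact counting may differ for inputs such as emoji sequences. `nfc_length` and `check_limits` are hypothetical helpers, and the messages only imitate the format shown above.

```python
import unicodedata

CHAR_LIMITS = {
    "elevenlabs-music": {"prompt": 2000},
    "minimax-music": {"prompt": 200, "lyrics": 600},
    "minimax-music-pro": {"prompt": 200, "lyrics": 600},
}


def nfc_length(text: str) -> int:
    """Count code points after NFC, so 'e' + combining accent counts once."""
    return len(unicodedata.normalize("NFC", text))


def check_limits(model: str, prompt: str = "", lyrics: str = "") -> list:
    """Return one message per field that exceeds the model's limit."""
    problems = []
    for field, value in (("prompt", prompt), ("lyrics", lyrics)):
        limit = CHAR_LIMITS[model].get(field)
        if limit is None:
            continue  # e.g. lyrics is ignored by elevenlabs-music
        count = nfc_length(value)
        if count > limit:
            problems.append(f"{field} too long for {model}: {count} chars (max {limit}).")
    return problems
```

For example, the decomposed string `"Tie\u0302\u0301ng"` has seven code points as typed but measures five after normalization.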

Errors

| Status | error.code | When |
| --- | --- | --- |
| 400 | not_a_music_model | model is not a music SKU |
| 400 | prompt_too_long | prompt > 2000 chars |
| 400 | invalid_duration | music_length_ms outside [1000, 300000] |
| 401 | auth_error | missing or invalid API key |
| 402 | billing_error | balance too low |
| 502 | provider_error | upstream provider failure |
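
A client will usually want to separate errors worth retrying from errors the caller has to fix. The sketch below assumes the error body is JSON shaped like `{"error": {"code": "...", "message": "..."}}`, which is inferred from the `error.code` column and not confirmed here. The class and function names are hypothetical, and treating only `provider_error` as retryable is a design choice of this example.

```python
import json

# Only upstream failures are treated as retryable in this sketch.
RETRYABLE_CODES = {"provider_error"}


class KymaMusicError(Exception):
    def __init__(self, status: int, code: str, message: str = ""):
        super().__init__(f"{status} {code}" + (f": {message}" if message else ""))
        self.status = status
        self.code = code
        self.retryable = code in RETRYABLE_CODES


def parse_error(status: int, body: bytes) -> KymaMusicError:
    """Turn a non-200 response into an exception, tolerating non-JSON bodies."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        err = {}  # assumed shape not present; fall back to "unknown"
    return KymaMusicError(
        status,
        str(err.get("code", "unknown")),
        str(err.get("message", "")),
    )
```

With `urllib.request`, a non-2xx reply raises `urllib.error.HTTPError`; its `code` attribute and `read()` result can be passed to `parse_error`.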

See also