Voice Clone (MiniMax)

Synchronous endpoint. Upload a reference audio clip as a multipart form and receive a voice_id that you can pass to /v1/audio/speech with any MiniMax voice model.

curl -X POST https://kymaapi.com/v1/audio/voice-clone \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -F file=@reference.mp3 \
  -F name="brand-narrator"
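
For callers not using curl, the same request can be sketched in Python with only the standard library. The URL, the Authorization header, and the form fields come from the curl example above; the helper names `build_multipart` and `clone_voice` are illustrative and not part of any Kyma SDK.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

API_URL = "https://kymaapi.com/v1/audio/voice-clone"  # from the curl example


def build_multipart(fields, filename, file_bytes):
    """Encode text fields plus one `file` part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for key, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{key}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         f"Content-Type: {ctype}\r\n\r\n").encode()
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def clone_voice(path, name=None, api_key=None):
    """POST the reference clip and return the decoded JSON response."""
    api_key = api_key or os.environ["KYMA_API_KEY"]
    with open(path, "rb") as fh:
        data = fh.read()
    fields = {"name": name} if name else {}
    body, content_type = build_multipart(fields, os.path.basename(path), data)
    req = urllib.request.Request(API_URL, data=body, method="POST", headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": content_type,
    })
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `clone_voice("reference.mp3", name="brand-narrator")` would mirror the curl command. Each successful call is billed, so it is worth validating the file before sending it.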

Request

multipart/form-data body.

file

required

Reference audio in MP3, WAV, or M4A format, up to 10 MB. Duration must be between 10 seconds and 5 minutes. Kyma parses the audio header locally and rejects out-of-range files with 400 reference_too_short or 400 reference_too_long before any charge is applied. Longer clips don’t improve clone quality and waste upload bandwidth.

name

string

Optional human-readable label, max 64 chars. Surfaced in the response and stored alongside the ownership row for your reference.

model

string

default:"minimax-voice-clone"

Voice clone SKU. Currently only minimax-voice-clone is supported.
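
The limits on `file` described above can also be checked on the client before uploading. The following is a minimal sketch, not Kyma's own validation: `preflight` is a hypothetical helper, the 10 MB threshold is assumed to mean 10 × 1024 × 1024 bytes, and duration is only measured for WAV files because the standard library cannot parse MP3 or M4A headers.

```python
import os
import wave

MAX_BYTES = 10 * 1024 * 1024        # "Max 10 MB"; exact byte threshold is an assumption
MIN_SECONDS, MAX_SECONDS = 10, 300  # 10 seconds to 5 minutes
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a"}


def preflight(path):
    """Return the (status, error.code) the API would likely send, or None if OK."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return 415, "invalid_request"
    if os.path.getsize(path) > MAX_BYTES:
        return 413, "invalid_request"
    if ext == ".wav":
        try:
            with wave.open(path, "rb") as wav:
                seconds = wav.getnframes() / wav.getframerate()
        except (wave.Error, EOFError):
            return 400, "audio_unreadable"
        if seconds < MIN_SECONDS:
            return 400, "reference_too_short"
        if seconds > MAX_SECONDS:
            return 400, "reference_too_long"
    # MP3/M4A duration needs a third-party parser; the server check covers it.
    return None
```

A `None` result only means that no local check failed; the server remains the authority on whether the clip is accepted.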

Response

200 OK with a JSON body.

{
  "voice_id": "kyma_3f8e2a1b4c9d7e60",
  "name": "brand-narrator",
  "model": "minimax-voice-clone",
  "cost_usd": 2.10,
  "balance_usd": 47.90
}

Field	Description
`voice_id`	Pass this value in the `voice_id` field of `/v1/audio/speech`. Namespaced as `kyma_<rand>`.
`name`	Echo of the label you sent (or `null`).
`model`	The voice clone SKU that was used.
`cost_usd`	Flat charge applied (`$2.10`).
`balance_usd`	Remaining balance after the charge settles.

X-Kyma-Model, X-Kyma-Cost-USD, and X-Kyma-Balance-USD headers are also set.
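
A client might decode the response body into a typed record so that a missing field fails loudly. This is a sketch under the assumption that the five fields in the sample response are always present; `CloneResult` and `parse_clone_response` are illustrative names, not part of a Kyma SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CloneResult:
    voice_id: str
    name: Optional[str]   # null when no label was sent
    model: str
    cost_usd: float
    balance_usd: float


def parse_clone_response(body: str) -> CloneResult:
    """Decode the 200 OK JSON body; raises KeyError if a required field is absent."""
    data = json.loads(body)
    return CloneResult(
        voice_id=data["voice_id"],
        name=data.get("name"),
        model=data["model"],
        cost_usd=float(data["cost_usd"]),
        balance_usd=float(data["balance_usd"]),
    )
```

The resulting `voice_id` is the value to store and later send to `/v1/audio/speech`.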

Pricing

Flat $2.10 per cloned voice. This is a one-time charge: once cloned, the voice_id can be reused in unlimited TTS calls.
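
Because the price is flat, budgeting is simple arithmetic. The sketch below uses `Decimal` to avoid floating-point rounding on currency; the function names are illustrative.

```python
from decimal import Decimal

CLONE_PRICE_USD = Decimal("2.10")  # flat price stated above


def clones_affordable(balance_usd: str) -> int:
    """Number of voices a given balance can pay for."""
    return int(Decimal(balance_usd) // CLONE_PRICE_USD)


def balance_after(balance_usd: str, clones: int) -> Decimal:
    """Balance remaining after cloning `clones` voices."""
    return Decimal(balance_usd) - CLONE_PRICE_USD * clones
```

A starting balance of $50.00 and one clone leaves $47.90, which is consistent with the sample response. A balance of $2.09 affords zero clones, the case that corresponds to a 402 insufficient_credits error.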

Ownership

Cloned voice IDs are gated per Kyma user. If user A passes user B’s voice_id to /v1/audio/speech, the request returns 403 voice_not_owned. Voice IDs that have no ownership record on file are assumed to be MiniMax system voices, which are available to everyone.
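
The gating rule above can be modelled as a small lookup. This is an illustrative model of the documented behaviour, not Kyma's server code: the `owners` mapping stands in for the ownership rows, and the labels `"owned"` and `"system_voice"` are invented for the example.

```python
from typing import Dict, Tuple


def check_voice_access(voice_id: str, user_id: str,
                       owners: Dict[str, str]) -> Tuple[int, str]:
    """Return (HTTP status, reason) for a speech request using `voice_id`."""
    owner = owners.get(voice_id)
    if owner is None:
        # No ownership row: treated as a MiniMax system voice, open to all users.
        return 200, "system_voice"
    if owner != user_id:
        return 403, "voice_not_owned"
    return 200, "owned"
```

With `owners = {"kyma_3f8e2a1b4c9d7e60": "user_b"}`, a request from `user_a` for that ID yields `(403, "voice_not_owned")`, while any ID missing from the mapping is allowed through as a system voice.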

Errors

Status	`error.code`	When
`400`	`not_a_voice_clone_model`	`model` is not a clone SKU
`400`	`invalid_request`	missing `file` or invalid form data
`400`	`reference_too_short`	audio duration < 10 seconds
`400`	`reference_too_long`	audio duration > 5 minutes
`400`	`audio_unreadable`	could not parse duration; the file may be corrupt or use an unsupported codec
`402`	`insufficient_credits`	balance below $2.10
`413`	`invalid_request`	audio file > 10 MB
`415`	`invalid_request`	audio format not MP3/WAV/M4A
`500`	`ownership_write_failed`	clone succeeded but ownership row insert failed (no charge applied; safe to retry)
`502`	`provider_error`	upstream MiniMax failure
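
One way a client could act on the table above is to map each failure to a next step. This sketch makes two assumptions that the page does not confirm: that the error body has the shape `{"error": {"code": ...}}` (inferred from the `error.code` column header), and that a 502 warrants a balance check before retrying, since only `ownership_write_failed` is documented as uncharged and safe to retry. The action labels are illustrative.

```python
import json


def classify_error(status: int, body: str) -> str:
    """Map an error response to a suggested client action."""
    try:
        code = json.loads(body)["error"]["code"]  # assumed body shape
    except (ValueError, KeyError, TypeError):
        code = None
    if (status, code) == (500, "ownership_write_failed"):
        return "retry"             # documented: no charge applied, safe to retry
    if (status, code) == (502, "provider_error"):
        return "check_then_retry"  # assumption: confirm balance before retrying
    if status == 402:
        return "top_up"            # balance below $2.10
    if status in (400, 413, 415):
        return "fix_request"       # problem with the file or form data
    return "unknown"
```

Keeping the 400-family errors in a single `fix_request` bucket is a simplification; a real client would probably surface the specific `error.code` to the user.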
