Voice Design (MiniMax)

curl --request POST \
  --url https://kymaapi.com/v1/audio/voice-design \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "description": "<string>",
  "model": "<string>",
  "name": "<string>",
  "preview_text": "<string>",
  "gender": "<string>",
  "age_group": "<string>"
}
'

POST /v1/audio/voice-design

Synchronous endpoint. Describe a voice in plain English and get back a voice_id that you can use immediately in /v1/audio/speech with any MiniMax voice model. Use this when you don’t have voice talent, when you’re prototyping a fictional character, or when you want a brand-safe persona voice from scratch.

curl -X POST https://kymaapi.com/v1/audio/voice-design \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Warm female narrator with a slight British accent, mid-30s, calm cadence",
    "gender": "female",
    "age_group": "young"
  }'
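
For callers who prefer Python over curl, the same request can be sketched with the standard library alone. The helper names build_request and design_voice and the 60-second timeout are illustrative assumptions, not part of the API. The URL, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

API_URL = "https://kymaapi.com/v1/audio/voice-design"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the voice-design endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def design_voice(payload: dict, api_key: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for non-2xx responses.
    The timeout value is an assumption; tune it for your environment.
    """
    request = build_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a real, billable request, so it is left commented out):
# import os
# voice = design_voice(
#     {
#         "description": "Warm female narrator with a slight British accent, "
#                        "mid-30s, calm cadence",
#         "gender": "female",
#         "age_group": "young",
#     },
#     os.environ["KYMA_API_KEY"],
# )
# print(voice["voice_id"])
```

Keeping request construction separate from sending makes it possible to inspect or unit-test the request without spending credits.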

Request

The request body is application/json.

description (string, required)

Natural-language voice description. Max 1000 characters. The key text is accepted as an alias for description.

model (string, default: "minimax-voice-design")

Voice design SKU. Currently only minimax-voice-design is supported.

name (string)

Optional human-readable label. Max 64 characters.

preview_text (string)

Optional sample text that MiniMax renders in the new voice for an internal preview. Max 500 characters. The preview audio is not returned in the response; call /v1/audio/speech afterward to render audio in the new voice.

gender (string)

Optional hint: male or female.

age_group (string)

Optional hint: child, young, middle-aged, or elderly.
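
The documented limits can be checked client-side before spending a request. The sketch below is one way to do that. The function name build_payload is hypothetical, len() counts Unicode code points (the server may count characters differently), and rejecting unknown gender or age_group values locally is an assumption, since the page describes them only as hints.

```python
ALLOWED_GENDERS = {"male", "female"}
ALLOWED_AGE_GROUPS = {"child", "young", "middle-aged", "elderly"}


def build_payload(description, *, name=None, preview_text=None,
                  gender=None, age_group=None,
                  model="minimax-voice-design"):
    """Validate fields against the documented limits and return the JSON body."""
    if not description:
        raise ValueError("description is required")
    if len(description) > 1000:
        raise ValueError("description exceeds 1000 characters")
    if name is not None and len(name) > 64:
        raise ValueError("name exceeds 64 characters")
    if preview_text is not None and len(preview_text) > 500:
        raise ValueError("preview_text exceeds 500 characters")
    # Assumption: treat values outside the documented hints as client errors.
    if gender is not None and gender not in ALLOWED_GENDERS:
        raise ValueError("gender must be 'male' or 'female'")
    if age_group is not None and age_group not in ALLOWED_AGE_GROUPS:
        raise ValueError("age_group must be one of the documented values")

    payload = {"description": description, "model": model}
    optional = {
        "name": name,
        "preview_text": preview_text,
        "gender": gender,
        "age_group": age_group,
    }
    # Omit optional keys that were not supplied.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Catching an over-long description locally avoids a round trip that would end in 400 description_too_long.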

Response

Returns 200 OK with a JSON body in the same shape as the /v1/audio/voice-clone response.

{
  "voice_id": "kyma_a91f4d2e7c8b5301",
  "name": null,
  "model": "minimax-voice-design",
  "cost_usd": 4.20,
  "balance_usd": 45.80
}
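
A minimal sketch of decoding this body into a typed value follows. The DesignedVoice class and parse_design_response function are hypothetical names introduced for illustration, and the field types are inferred from the sample above rather than from a published schema.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DesignedVoice:
    voice_id: str
    name: Optional[str]  # null in the sample when no name was supplied
    model: str
    cost_usd: float
    balance_usd: float


def parse_design_response(body: str) -> DesignedVoice:
    """Decode the 200 OK JSON body into a DesignedVoice."""
    data = json.loads(body)
    return DesignedVoice(
        voice_id=data["voice_id"],
        name=data.get("name"),
        model=data["model"],
        cost_usd=float(data["cost_usd"]),
        balance_usd=float(data["balance_usd"]),
    )


SAMPLE_BODY = """
{
  "voice_id": "kyma_a91f4d2e7c8b5301",
  "name": null,
  "model": "minimax-voice-design",
  "cost_usd": 4.20,
  "balance_usd": 45.80
}
"""

voice = parse_design_response(SAMPLE_BODY)
print(voice.voice_id)
```

Persist the voice_id after a successful call, since it is the handle used in later /v1/audio/speech requests.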

Pricing

Flat $4.20 per designed voice. This is a one-time charge: once a voice is designed, its voice_id can be reused in unlimited TTS calls. Voice design costs about 2× voice clone because synthesizing timbre from text is more compute-intensive than reproducing a captured voice.
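
The flat fee makes budgeting simple arithmetic. The sketch below uses Decimal to avoid floating-point rounding on currency. The function names are hypothetical, and the amortized figure covers only the design fee: per-call TTS charges are not described on this page and are excluded.

```python
from decimal import Decimal

DESIGN_COST_USD = Decimal("4.20")  # flat fee stated above


def affordable_designs(balance_usd: Decimal) -> int:
    """How many voices the given balance can pay for."""
    return int(balance_usd // DESIGN_COST_USD)


def amortized_design_cost(tts_calls: int) -> Decimal:
    """Design fee spread over a number of TTS calls (TTS charges excluded)."""
    if tts_calls <= 0:
        raise ValueError("tts_calls must be positive")
    return DESIGN_COST_USD / tts_calls


# The sample response shows balance_usd 45.80 after a 4.20 charge,
# which implies a balance of 50.00 before the call.
balance_before = Decimal("45.80") + DESIGN_COST_USD
print(balance_before)
print(affordable_designs(balance_before))
```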

Ownership

Same gating as voice clone: a designed voice_id is owned by the user who requested it. A request that uses the voice_id from another account is rejected with 403 voice_not_owned.

Errors

| Status | `error.code` | When |
| --- | --- | --- |
| `400` | `not_a_voice_design_model` | `model` is not a design SKU |
| `400` | `description_too_long` | `description` exceeds 1000 characters |
| `400` | `invalid_request` | `description` is missing |
| `402` | `insufficient_credits` | balance is below $4.20 |
| `500` | `ownership_write_failed` | design succeeded but the ownership row insert failed |
| `502` | `provider_error` | upstream MiniMax failure |
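
One possible way to act on these errors in client code is sketched below. It assumes the error body has the shape {"error": {"code": "..."}}, which is inferred from the error.code column header and is not confirmed by this page. The action labels and the retry policy are illustrative choices, not documented behavior.

```python
import json


def classify_error(status: int, body: str) -> str:
    """Map a failed response to a suggested client action.

    Assumes a body shaped like {"error": {"code": "..."}}; falls back to
    the HTTP status alone when the body cannot be parsed that way.
    """
    try:
        code = json.loads(body)["error"]["code"]
    except (ValueError, KeyError, TypeError):
        code = None

    if status == 502 and code == "provider_error":
        return "retry"        # upstream failure; a later attempt may succeed
    if status == 402:
        return "top_up"       # balance is below the design fee
    if status == 400:
        return "fix_request"  # bad model, long or missing description
    if status == 500 and code == "ownership_write_failed":
        # The design itself succeeded, so a blind retry could create a
        # second voice. Treating this as needing investigation is an assumption.
        return "investigate"
    return "unknown"


print(classify_error(502, '{"error": {"code": "provider_error"}}'))
```

With urllib, the status and body for this function are available from the caught urllib.error.HTTPError as exc.code and exc.read().decode().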
