GPT-4o mini Transcribe

Overview

gpt-4o-mini-transcribe-2025-12-15 is OpenAI’s premium speech-to-text model, surfaced on Kyma through the transcribe-quality alias. It’s the right pick when accuracy on real-world audio matters more than raw cost — conversational dialogue, noisy environments, and multilingual code-switching (Vietnamese ↔ English mixing, Mandarin ↔ English, etc.) where Whisper Turbo’s distilled decoder gives ground. Ships alongside the default transcribe alias (which resolves to Whisper Large v3 Turbo). Two aliases, one decision per request: pick transcribe for high-volume bulk transcription, transcribe-quality when accuracy is the constraint.

Specs

Field	Value
Model ID	`gpt-4o-mini-transcribe-2025-12-15`
Alias	`transcribe-quality`
Creator / Provider	OpenAI
Best for	Noisy / conversational / code-switching audio, multilingual dictation
Max file size	25 MB (multipart upload)
Input modalities	Audio (`mp3`, `wav`, `m4a`, `ogg`, `webm`, `flac`)
Output modalities	Text
Pricing mode	Per minute
Min billable	1 minute (rounded up)

Pricing

	Cost
Per minute	$0.00405
1-hour file	$0.243
5-second clip	$0.00405 (rounds up to 1 min)

~4.5× the cost of default Whisper Turbo ($0.0009/min) — that’s the premium you pay for OpenAI’s accuracy on hard audio. For high-volume bulk transcription, default to transcribe; reserve transcribe-quality for the cases that need it.

Use this when

Audio contains code-switching (e.g. Vietnamese + English in the same utterance) and Whisper Turbo is producing garbled output on the non-primary language.
Background noise, low-quality recording, or far-field microphones — accuracy matters more than throughput.
Conversational dictation where capitalization, punctuation, and proper nouns must be right first time.

Pick something else when

High-volume bulk transcription where ~99% accuracy is fine — use whisper-v3-turbo at ~4.5× cheaper.
The audio is over 25 MB — Quality tier currently only supports multipart upload. Use transcribe with the JSON audio_url mode for files up to 100 MB.
You need full per-segment timestamps reliably under outage — Quality tier does not fall back to Vertex Gemini (the default transcribe does), so a 5xx from OpenAI surfaces directly.

No fallback

The default transcribe alias has an automatic Whisper → Vertex Gemini fallback chain on upstream outage. The Quality tier opts out by design: you explicitly chose OpenAI for accuracy, and silently swapping to a different provider would defeat that contract. OpenAI outages return as 502 transcription_failed so your client knows to retry or downgrade. See the Fallback chain section on the endpoint reference for the full rules.

Concurrency limits

Routed through the openai audio sub-pool. Per-tier caps:

Tier	Concurrent slots
Tier 0	1
Tier 1	2
Tier 2	4
Tier 3	8
Tier 4	18

Saturating the OpenAI sub-pool does not affect the transcribe (whisper-v3-turbo) sub-pool or any other audio capability. See Rate Limits — Audio limits.

Example

curl -X POST https://kymaapi.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -F "file=@interview.mp3" \
  -F "model=transcribe-quality" \
  -F "response_format=verbose_json"

Python (raw HTTP — OpenAI SDK doesn’t expose Kyma aliases)

import os
import requests

with open("interview.mp3", "rb") as f:
    resp = requests.post(
        "https://kymaapi.com/v1/audio/transcriptions",
        headers={"Authorization": f"Bearer {os.environ['KYMA_API_KEY']}"},
        files={"file": f},
        data={"model": "transcribe-quality", "response_format": "verbose_json"},
    )
result = resp.json()
print(result["text"])

The Python OpenAI SDK pins to whisper-1 internally for transcription so passing model="transcribe-quality" through the SDK won’t reach Kyma’s alias resolver. Raw requests works fine.

Aliases that resolve here

transcribe-quality — premium ASR tier on Kyma.

If you want stable behavior across alias changes, pin gpt-4o-mini-transcribe-2025-12-15 directly. If you want to ride future Quality-tier upgrades (e.g. if Kyma promotes a newer OpenAI STT SKU), use the alias.

​Overview

​Specs

​Pricing

​Use this when

​Pick something else when

​No fallback

​Concurrency limits

​Example

​Python (raw HTTP — OpenAI SDK doesn’t expose Kyma aliases)

​Aliases that resolve here

​See also