Capabilities
Audio & speech
Planned: speech synthesis, transcription, dialogue audio, voices.
Status: Planned. Not callable yet. The shape below is design direction, not contract.
Audio & speech will cover narration, accessibility, transcription, dialogue audio, and voice catalog workflows.
Planned shape
Two operations are planned: speech synthesis (text-to-speech) and transcription (speech-to-text).
Both will return 202 Accepted with a task_id. Voice IDs will come from a future GET /v1/voices discovery endpoint.
How delivery will work
Audio jobs will use the async task model: a request is accepted immediately, and the result is retrieved later by task_id. Source media for transcription will be provided via uploads.
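Since audio jobs are planned to follow the async task model, client code can be structured around a polling loop before any endpoint exists. The following is a minimal sketch, not a contract: the status names, the shape of the task dictionary, and the fetch_task callable are all assumptions made for illustration. The network call is injected so the loop can be tested today against a stub.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal status names; the real task model may differ.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_task(
    task_id: str,
    fetch_task: Callable[[str], Dict[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_task(task_id) until the task reaches a terminal status.

    fetch_task is a caller-supplied function (an assumption here) that
    returns the current task state as a dict with a "status" key.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_task(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s}s")
        sleep(interval_s)
```

Injecting fetch_task and sleep keeps the loop independent of any HTTP client, so only the stub needs replacing once the real task endpoint is published.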
Prepare now
- Classify transcript and voice data as sensitive unless your policy says otherwise.
- Validate file formats before upload once uploads become available.
- Keep source media, generated audio, and review state in your own records.
- Check the Roadmap for status changes before wiring client calls.
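Format validation from the list above can be prepared ahead of time, because it depends only on the files themselves. The sketch below identifies a few common audio containers by their leading bytes rather than by file extension. The set of accepted formats has not been published, so the allowed argument is a placeholder assumption, and check_upload_candidate is a hypothetical helper name.

```python
from pathlib import Path
from typing import Iterable, Optional


def sniff_audio_format(header: bytes) -> Optional[str]:
    """Guess an audio container from the first bytes of a file."""
    # WAV: RIFF container whose form type is WAVE.
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:4] == b"OggS":
        return "ogg"
    # MP3 with a leading ID3v2 tag.
    if header[:3] == b"ID3":
        return "mp3"
    # Bare MPEG Layer III frame: 11 sync bits set, layer bits equal to 01.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE6) == 0xE2:
        return "mp3"
    return None


def check_upload_candidate(path: Path, allowed: Iterable[str]) -> str:
    """Return the detected format, or raise ValueError if it is not allowed.

    `allowed` is an assumption: substitute the published list when available.
    """
    with path.open("rb") as f:
        detected = sniff_audio_format(f.read(12))
    if detected is None or detected not in set(allowed):
        raise ValueError(f"{path.name}: unsupported or unrecognized format ({detected})")
    return detected
```

Checking leading bytes catches renamed or truncated files that an extension check would let through. It is a heuristic, not a full decode, so a file that passes may still be corrupt further in.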