beatra

Audio & speech

Planned: speech synthesis, transcription, dialogue audio, voices.

Status: Planned. Not callable yet. The shape below is design direction, not contract.

Audio & speech will cover narration, accessibility, transcription, dialogue audio, and voice catalog workflows.

Planned shape

Speech synthesis (text-to-speech):

POST /v1/audio/speech
Authorization: Bearer <api_key>
Content-Type: application/json
Idempotency-Key: <uuid>
 
{
  "model": "auto",
  "input": "...",
  "voice": "voice_..."
}
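
Because the endpoint is not callable yet, the following Python sketch only builds the planned request and does not send it. The path, headers, and body fields follow the planned shape above and may change. The base URL, the function name build_speech_request, the optional idempotency_key parameter, and the sample voice id are assumptions made for illustration.

```python
import json
import uuid
import urllib.request
from typing import Optional

# Hypothetical base URL; the real host is not stated on this page.
BASE_URL = "https://api.example.invalid"


def build_speech_request(
    api_key: str,
    text: str,
    voice: str,
    idempotency_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a planned POST /v1/audio/speech request."""
    body = json.dumps({"model": "auto", "input": text, "voice": voice})
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/audio/speech",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Assumption: one UUID per logical request, passed in again
            # by the caller when the same request is retried.
            "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
        },
    )


# "voice_demo" is a placeholder, not a real voice id.
request = build_speech_request("<api_key>", "Hello, world.", "voice_demo")
```

Keeping construction separate from sending lets you unit-test the payload now and plug in a transport once the endpoint ships.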

Transcription (speech-to-text):

POST /v1/audio/transcriptions
Authorization: Bearer <api_key>
Content-Type: application/json
Idempotency-Key: <uuid>
 
{
  "model": "auto",
  "upload_id": "upl_..."
}

Both endpoints are planned to return 202 Accepted with a task_id. Voice IDs will come from a future GET /v1/voices discovery endpoint.
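
A client will need to pull the task_id out of the 202 response before it can follow the job. The sketch below assumes the response body is a JSON object with a top-level task_id string; this page does not specify the body layout, so treat the field location and the helper name extract_task_id as hypothetical.

```python
import json


def extract_task_id(status_code: int, body: str) -> str:
    """Return the task_id from a planned 202 Accepted response."""
    if status_code != 202:
        raise ValueError(f"expected 202 Accepted, got {status_code}")
    payload = json.loads(body)
    # Assumption: task_id is a top-level string field in a JSON object.
    task_id = payload.get("task_id") if isinstance(payload, dict) else None
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response has no task_id")
    return task_id


# "task_demo" is a made-up id used only to exercise the helper.
print(extract_task_id(202, '{"task_id": "task_demo"}'))
```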

How delivery will work

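Audio jobs will use the async task model: a request is accepted first and its result is delivered later under the returned task_id. Source media for transcription will be provided via uploads and referenced in the request by upload_id.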
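
This page does not describe how task status will be read, so the sketch below shows only one possible client pattern: a bounded polling loop. The status lookup is injected as a callable, and the state names "queued" and "running" are hypothetical stand-ins, not documented values. The real model may use different states or a push mechanism instead.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_status: Callable[[str], str],
    task_id: str,
    *,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task leaves the assumed in-progress states."""
    pending = {"queued", "running"}  # hypothetical state names
    for _ in range(max_attempts):
        status = fetch_status(task_id)
        if status not in pending:
            return status
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} still pending after {max_attempts} checks")


# Usage with a fake lookup that finishes on the third check.
fake_states = iter(["queued", "running", "succeeded"])
final = wait_for_task(lambda _id: next(fake_states), "task_demo", sleep=lambda s: None)
print(final)
```

Injecting fetch_status and sleep keeps the loop testable today and independent of whatever status endpoint is eventually published.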

Prepare now

  • Classify transcript and voice data as sensitive unless your policy says otherwise.
  • Validate file formats before upload once uploads become available.
  • Keep source media, generated audio, and review state in your own records.
  • Track the Roadmap for status changes before wiring client calls.
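
The preparation steps above can be started locally without any API call. The following sketch keeps a record per media file that defaults to sensitive, and checks the file suffix against an allowlist. The supported formats have not been published, so ALLOWED_SUFFIXES, the AudioRecord fields, and the "pending" review state are all assumptions to replace with your own policy.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Hypothetical allowlist; the supported formats are not published yet.
ALLOWED_SUFFIXES = {".wav", ".mp3", ".flac"}


@dataclass
class AudioRecord:
    """Local bookkeeping for one source file and its derived audio."""
    source_path: str
    sensitive: bool = True            # sensitive unless policy says otherwise
    review_state: str = "pending"     # hypothetical state name
    generated_audio_path: Optional[str] = None


def has_allowed_format(path: str) -> bool:
    """Check the file suffix only; this does not inspect file contents."""
    return Path(path).suffix.lower() in ALLOWED_SUFFIXES


record = AudioRecord(source_path="interviews/call-01.wav")
print(record.sensitive, has_allowed_format(record.source_path))
```

A suffix check is a cheap first filter; verifying the actual container or codec would need a media library and is outside this sketch.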
