
Text & chat

Build assistants, drafting, classification, and decision tools with the OpenAI-compatible chat API.

Status: Available. POST /v1/chat/completions, including stream: true.

Use text & chat for assistants, drafting, classification, extraction, summarization, and any workflow that produces text from text.

Basic request

curl:

curl https://api.beatra.ai/v1/chat/completions \
  -H "Authorization: Bearer $BEATRA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "system", "content": "You are a concise product copywriter."},
      {"role": "user", "content": "Write three onboarding email subject lines."}
    ]
  }'

Python:

from openai import OpenAI
 
client = OpenAI(
    base_url="https://api.beatra.ai/v1",
    api_key="<BEATRA_KEY>",
)
 
resp = client.chat.completions.create(
    model="auto",
    messages=[
        {"role": "system", "content": "You are a concise product copywriter."},
        {"role": "user", "content": "Write three onboarding email subject lines."},
    ],
)
print(resp.choices[0].message.content)

Node.js:

import OpenAI from "openai";
 
const client = new OpenAI({
  baseURL: "https://api.beatra.ai/v1",
  apiKey: process.env.BEATRA_API_KEY,
});
 
const resp = await client.chat.completions.create({
  model: "auto",
  messages: [
    { role: "system", content: "You are a concise product copywriter." },
    { role: "user", content: "Write three onboarding email subject lines." },
  ],
});
console.log(resp.choices[0].message.content);

Response shape

{
  "id": "chatcmpl_01J5...",
  "model": "<resolved model id>",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "1. Welcome aboard\n2. ..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 24, "completion_tokens": 38, "total_tokens": 62 }
}

Two things to keep per request:

  • The response body's model field — what actually handled the call.
  • The response header X-Request-Id — your fastest support lookup key.
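
A minimal sketch of capturing those two values for logging. The helper name `call_record` and the sample values are hypothetical; the function works on a plain header mapping and a parsed response body, so it does not depend on any particular HTTP client.

```python
from typing import Any, Mapping, Optional, TypedDict


class CallRecord(TypedDict):
    request_id: Optional[str]
    resolved_model: Optional[str]


def call_record(headers: Mapping[str, str], body: Mapping[str, Any]) -> CallRecord:
    """Pull out the two values worth keeping for every call (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "request_id": lowered.get("x-request-id"),
        "resolved_model": body.get("model"),
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample_headers = {"X-Request-Id": "req_example_123"}
    sample_body = {"id": "chatcmpl_example", "model": "example-model"}
    print(call_record(sample_headers, sample_body))
```

With the OpenAI Python SDK, one way to get at the headers is `client.chat.completions.with_raw_response.create(...)`, which returns an object exposing `.headers` alongside `.parse()` for the usual completion object.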

Key parameters

Field        Type     Notes
model        string   "auto" or an account-enabled model id.
messages     array    role + content. system first for instructions, then alternating user and assistant.
stream       boolean  See below. Default false.
temperature  number   0.0 to 2.0. Lower = more deterministic.
max_tokens   integer  Cap on response length.
top_p        number   Nucleus sampling. Pass either temperature or top_p, not both.

Full field reference: Chat API.
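
The parameter rules above can be checked on the client before a request is sent. The sketch below is illustrative only: `build_chat_payload` is a hypothetical helper, and the checks simply mirror the notes in the table (the temperature range and the temperature/top_p exclusivity). How the server itself responds to a violation is not covered here.

```python
from typing import Any, Optional, Sequence


def build_chat_payload(
    messages: Sequence[dict],
    model: str = "auto",
    *,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stream: bool = False,
) -> dict:
    """Assemble a request body, rejecting combinations the table advises against."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    if temperature is not None and top_p is not None:
        raise ValueError("pass either temperature or top_p, not both")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")

    payload: dict[str, Any] = {"model": model, "messages": list(messages), "stream": stream}
    # Only include optional fields the caller actually set.
    if temperature is not None:
        payload["temperature"] = temperature
    if top_p is not None:
        payload["top_p"] = top_p
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


if __name__ == "__main__":
    print(build_chat_payload([{"role": "user", "content": "Hello"}], temperature=0.2, max_tokens=64))
```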

Streaming

Set stream: true for token-by-token rendering in UIs. The response is text/event-stream, emits OpenAI-compatible chunks, and ends with data: [DONE].

Python:

from openai import OpenAI
 
client = OpenAI(base_url="https://api.beatra.ai/v1", api_key="<BEATRA_KEY>")
 
stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Count to five."}],
    stream=True,
)
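for chunk in stream:
    # Some chunks may carry no choices; skip them rather than index an empty list.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)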

Node.js:

import OpenAI from "openai";
 
const client = new OpenAI({
  baseURL: "https://api.beatra.ai/v1",
  apiKey: process.env.BEATRA_API_KEY,
});
 
const stream = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Count to five." }],
  stream: true,
});
 
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Treat any disconnect before data: [DONE] as an incomplete response. Don't take irreversible action on partial output. See Sync vs streaming.
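
One way to enforce that rule is to track whether the terminating data: [DONE] line was actually seen. The sketch below works on an iterable of already-decoded event-stream lines, so it can be tried without a network connection; `collect_stream` is a hypothetical helper, and a real disconnect would usually surface as an exception from the HTTP client rather than as the iterator simply ending.

```python
import json
from typing import Iterable, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, bool]:
    """Return (text, complete); complete is True only if 'data: [DONE]' was seen."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return "".join(parts), True
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if choices:
            parts.append(choices[0].get("delta", {}).get("content") or "")
    # The stream ended without the terminator: treat the text as partial.
    return "".join(parts), False


if __name__ == "__main__":
    sample = [
        'data: {"choices":[{"delta":{"content":"1, "}}]}',
        "",
        'data: {"choices":[{"delta":{"content":"2"}}]}',
        "",
        "data: [DONE]",
    ]
    text, complete = collect_stream(sample)
    print(repr(text), complete)
    # Dropping the final line simulates a disconnect before [DONE].
    print(collect_stream(sample[:-1]))
```

Callers can then gate any irreversible step on the `complete` flag and discard or re-request partial text.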

Billing

Chat is postpaid: each successful completion is debited from your wallet when the call finishes. The charge is the USD cost that LiteLLM reports for the call, multiplied by beatra's per-model markup (credits_per_litellm_usd), and is expressed in credits.

A worked example using the current pr1-seed defaults:

Component                                                 Value
LiteLLM cost (model MiniMax-M2.7)                         $0.00200
beatra markup (credits_per_litellm_usd for this model)    1500
Charged                                                   0.00200 × 1500 = 3.0000 credits
USD reference (default rate 1000 cr/$)                    ≈ $0.0030

The response carries the charge under usage.credits and usage.credits_usd_reference:

{
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 42,
    "total_tokens": 60,
    "credits": "3.0000",
    "credits_usd_reference": "0.0030"
  }
}

The USD reference is display-only: it reflects the wallet's credits_per_usd_rate setting at the time of the response and is not a price commitment. See billing-model for the full rules.
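
The worked example can be reproduced with decimal arithmetic. This is a sketch of the calculation as described on this page, not beatra's billing code: the function names are hypothetical, and rounding half-up to four decimal places is an assumption chosen to match the four-decimal strings shown in the response.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: four decimal places, rounded half-up, to match "3.0000" and "0.0030".
FOUR_PLACES = Decimal("0.0001")


def credits_charged(litellm_cost_usd: Decimal, credits_per_litellm_usd: Decimal) -> Decimal:
    """LiteLLM-reported USD cost multiplied by the per-model markup, in credits."""
    return (litellm_cost_usd * credits_per_litellm_usd).quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)


def usd_reference(credits: Decimal, credits_per_usd_rate: Decimal) -> Decimal:
    """Display-only USD figure derived from the wallet's credits-per-USD rate."""
    return (credits / credits_per_usd_rate).quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    credits = credits_charged(Decimal("0.00200"), Decimal("1500"))
    print(credits)                                  # 3.0000
    print(usd_reference(credits, Decimal("1000")))  # 0.0030
```

Using Decimal rather than float keeps the intermediate values exact, which matters when many small charges are summed for reconciliation.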

Error handling

All errors use the standard envelope; branch on retryable. Most common for this endpoint:

  • invalid_request (400) — fix the body and resend
  • rate_limited (429) — back off, retry with the same Idempotency-Key
  • model_unavailable (503) — retry with "auto" or another approved model id

Full handling rules: Errors & retries.
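
A minimal retry sketch that branches on retryable and reuses one Idempotency-Key across attempts. Several details are assumptions made for illustration: the envelope shape (`{"error": {"code": ..., "retryable": ...}}`), the `send` callable standing in for whatever HTTP client is used, and the exponential backoff schedule. Check Errors & retries for the actual envelope and recommended timings.

```python
import time
import uuid
from typing import Callable, Tuple

# Hypothetical transport: takes (payload, headers), returns (status_code, parsed_json_body).
Send = Callable[[dict, dict], Tuple[int, dict]]


def post_with_retry(
    send: Send,
    payload: dict,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Send a request, retrying only when the error envelope says it is retryable."""
    # One key for the whole logical request, reused unchanged on every retry.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(1, max_attempts + 1):
        status, body = send(payload, headers)
        if status < 400:
            return body
        error = body.get("error", {})  # assumed envelope shape
        if not error.get("retryable", False) or attempt == max_attempts:
            raise RuntimeError(f"request failed: {status} {error.get('code')}")
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real delays. A fuller version might also switch the model to "auto" or another approved id after a model_unavailable response, as the list above suggests.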
