# Get an Omni call transcript

> The full conversation transcript for the call. The endpoint resolves the transcript whether it is stored inline or offloaded to an external URL, so the response is always the transcript document itself. Returns `404` if no transcript exists (e.g. the call failed before any speech, or the engine has not yet pushed the record). Requires the `omni:read` scope.
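
As a minimal sketch of calling this endpoint, the Python below fetches a transcript with a bearer key, treats `404` as "no transcript (yet)", and normalizes the two documented `transcript` shapes (an object that typically has a `turns` array, or a plain string for legacy sessions). The function names and the `opener` parameter are hypothetical, and the `turns`/`role`/`text` layout is only the typical shape described in the schema, which is engine-defined.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.pyai.com"  # production server listed in the spec


def fetch_transcript(call_id, api_key, opener=urllib.request.urlopen):
    """Fetch the transcript body, or return None when the API answers 404.

    `opener` is a hypothetical injection point so the function can be
    exercised without network access.
    """
    quoted_id = urllib.parse.quote(call_id, safe="")
    request = urllib.request.Request(
        f"{BASE_URL}/v1/omni/calls/{quoted_id}/transcript",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    try:
        with opener(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # No transcript exists (yet): the call may have failed before any
            # speech, or the engine has not pushed the record.
            return None
        raise


def transcript_turns(body):
    """Normalize `transcript` into a list of (role, text) pairs.

    Assumes the typical engine shape ({"turns": [{"role", "text"}, ...]});
    a plain string (legacy sessions) becomes a single turn with role None.
    """
    transcript = body.get("transcript")
    if isinstance(transcript, str):
        return [(None, transcript)]
    if isinstance(transcript, dict):
        return [
            (turn.get("role"), turn.get("text", ""))
            for turn in transcript.get("turns", [])
        ]
    return []
```

Because a `404` can also mean the engine has not yet pushed the record, a caller that polls shortly after a call ends may want to retry a few times before concluding that no transcript exists.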



## OpenAPI

````yaml https://api.pyai.com/openapi.json get /v1/omni/calls/{call_id}/transcript
openapi: 3.1.0
info:
  title: PyAI API
  version: 1.0.0
  description: >-
    Telephony-native Voice AI behind one bearer key:


    - **Hear**: speech-to-text · `POST /v1/audio/transcriptions` (streaming +
    batch)

    - **Speak**: text-to-speech, stock voices & voice cloning · `POST
    /v1/audio/speech`, `GET /v1/voices`, `/v1/voice/clones`

    - **Cue**: streaming turn detection + knowledge-base context for your own
    LLM/voice pipeline · `GET /v1/audio/transcriptions/stream` with grounding

    - **Omni**: full-duplex agentic voice (speech-to-speech, grounded in your
    knowledge bases + tools) · `/v1/omni` (and the OpenAI-compatible
    `/v1/realtime`)

    - **AMD API** (answering-machine detection): know *who or what* answered a
    call (human, voicemail, IVR, iPhone/Google screening, dead number, fax) in a
    fraction of the dead-air dwell (under 300 ms for a human, under 800 ms for a
    machine, in-region), with the reason it decided · `wss …/v1/amd/stream`
    (Twilio Media Streams drop-in), `POST /v1/amd/config`, `GET
    /v1/amd/calls/{id}`

    - **Agents**: PyAI's feature to create, manage & track your Omni voice
    agents (no-code builder, hosted knowledge, evals, monitoring). _Coming
    soon._


    ## Authentication


    Create a key in the [console](https://console.pyai.com) (it is shown once)
    and send it as a bearer token:


    ```

    Authorization: Bearer pyai_live_...

    ```


    Keys are environment-scoped: `pyai_live_...` (production) and
    `pyai_test_...` (sandbox). New accounts receive up to $50 of prepaid credits
    after phone verification (graduated signup); `pyai_test_` keys skip the
    credit gate entirely.


    Keys are self-validating signed tokens: they work on every PyAI surface the
    instant they are created, with no activation or propagation delay. Treat
    them as opaque strings (up to 512 chars) and never parse their contents.


    WebSocket endpoints can't use request headers from a browser, so pass the
    key as a **subprotocol** instead:


    ```

    Sec-WebSocket-Protocol: pyai-key.pyai_live_...

    ```


    (Server-side clients may instead append `?api_key=...` to the URL.) The
    gateway authenticates the key on the WebSocket upgrade and swaps it for the
    internal upstream credential: your key never reaches the model, and the
    model's key never reaches you.


    ## Quickstart: Hear (speech-to-text)


    ```

    curl https://api.pyai.com/v1/audio/transcriptions \
      -H "Authorization: Bearer $PYAI_API_KEY" \
      -F file=@audio.wav -F model=pyai-hear
    # -> { "text": "..." }

    ```


    ## Quickstart: Speak (text-to-speech)


    ```

    curl https://api.pyai.com/v1/audio/speech \
      -H "Authorization: Bearer $PYAI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model":"pyai-voice","input":"Hello from PyAI.","voice":"voice_abc"}' \
      --output speech.wav
    ```


    `voice` is a stock voice id from `GET /v1/voices` (a curated set of 12
    prebuilt voices with personas, avatars, and previews) or a cloned voice id
    from `/v1/voice/clones`. Omit it to use your account's default voice.


    ## Quickstart: Omni (realtime voice agent)


    Omni is **zero-state: there is nothing to create first.** Open a WebSocket,
    pass your key as a subprotocol, and send the agent's behavior (voice,
    persona, knowledge endpoint) in the first `configure` frame:


    ```

    wss://api.pyai.com/v1/omni?session_label=support&format=pcm16&rate=24000
      Sec-WebSocket-Protocol: pyai-key.$PYAI_API_KEY
    ```


    The session is authorized by your key's **organization**; `session_label` is
    an **optional, opaque** tag (echoed to your own knowledge endpoint for
    correlation); omit it or use any value. When `session_label` equals a
    **`/v1/agents` profile id**, the engine loads persona, voice, and **greeting
    message** from that profile (turn-0 playback). `format` and `rate` are
    load-bearing on the connect URL (the SDK sets them). Send PCM16 audio as
    binary frames and receive the agent's speech the same way. **Optional
    convenience:** pre-store config via `POST /v1/agents` (including `greeting`,
    `consent_line`, `recordings_enabled`) and pass its id as `session_label`, or
    send everything inline in the post-handshake `configure` frame. A stored
    profile is not required to connect. The OpenAI-realtime-compatible surface
    (`/v1/realtime`) is
    served by the same Omni engine; new integrations should prefer `/v1/omni`.
    (_Flow, the legacy voice-duplex engine, is retired for new customers; its
    `/v1/realtime` alias now routes to Omni._)


    For reproducible eval runs, determinism controls (`seed`/`temperature`) ride
    the Omni session's `configure` frame, which the gateway passes through
    unchanged. They are honored once the engine supports them; no platform
    change is required.


    ## Scopes


    | Scope | Grants |

    | --- | --- |

    | `hear:transcribe` | `POST /v1/audio/transcriptions` |

    | `hear:stream` | `GET /v1/audio/transcriptions/stream` (WebSocket) |

    | `voice:synthesize` | `POST /v1/audio/speech` (Speak) |

    | `voice:clone` | `/v1/voice/clones` (Speak) |

    | `voice:design` | `/v1/voice/design` (Speak) |

    | `flow:session` | _legacy_: `/v1/realtime` for existing Flow customers; new
    traffic on `/v1/realtime` uses Omni |

    | `omni:session` | `/v1/omni` (native), `/v1/realtime` (Omni), and `POST
    /v1/omni/sessions` (mint a browser session token) |

    | `omni:read` | `/v1/omni/calls` (Omni post-call records) |

    | `transcribe:jobs` | `/v1/transcription/jobs` |

    | `trace:configure` | `/v1/trace/config`, `/v1/trace/rule-packs` (Trace
    management) |

    | `trace:read` | `/v1/trace/interactions`, `/violations`, `/findings`,
    `/exposure` (Trace reads) |

    | `recap:configure` | `/v1/recap/config` (Recap management) and
    `/v1/recap/crm-config` (Salesforce field mapping) |

    | `recap:read` | `/v1/recap/calls` (Recap reads) |

    | `amd:detect` | `wss …/v1/amd/stream` (AMD realtime detection, Twilio
    drop-in) |

    | `amd:configure` | `/v1/amd/config` (AMD operating-point dial + webhook) |

    | `amd:read` | `/v1/amd/calls` (AMD decision records) |

    | `telephony:manage` | `/v1/telephony/*` (managed numbers) |


    `GET /v1/models`, `GET /v1/voices`, and `GET /v1/me` need no specific scope;
    any active key may call them. Wildcards (`hear:*`, `voice:*`, …, and the
    global `*`) grant every scope in their family.


    ## Canonical endpoints


    One row per product surface, listing endpoint, auth, required scope, and
    lifecycle status. **live** = generally available; **deprecated** = works
    during a migration window (don't build new on it); **legacy** = supported
    for existing customers only; **planned** = not yet available.


    | Product | Endpoint | Auth | Scope | Status |

    | --- | --- | --- | --- | --- |

    | Identity | `GET /v1/me` | Bearer | _any active key_ | live |

    | Models | `GET /v1/models` | Bearer | _any active key_ | live |

    | Voices | `GET /v1/voices`, `GET /v1/voices/{id}` | Bearer | _any active
    key_ | live |

    | Hear (batch) | `POST /v1/audio/transcriptions` | Bearer |
    `hear:transcribe` | live |

    | Hear (streaming) | `GET /v1/audio/transcriptions/stream` (WS) |
    Subprotocol | `hear:stream` | live |

    | Cue | `GET /v1/audio/transcriptions/stream` + grounding (WS) | Subprotocol
    | `hear:stream` | live |

    | Hear (async batch) | `POST`/`GET /v1/transcription/jobs` | Bearer |
    `transcribe:jobs` | live |

    | Speak (TTS) | `POST /v1/audio/speech` | Bearer | `voice:synthesize` | live
    |

    | Speak (cloning) | `GET`/`POST /v1/voice/clones` | Bearer | `voice:clone` |
    live |

    | Speak (design) | `/v1/voice/design` | Bearer | `voice:design` | live |

    | Omni (native) | `wss …/v1/omni?agent_id=` | Subprotocol | `omni:session` |
    live |

    | Omni (OpenAI-compat) | `wss …/v1/realtime?model=pyai-omni-realtime` |
    Subprotocol | `omni:session` | live |

    | Omni (alias) | `wss …/v2/omni/chat` | Subprotocol | `omni:session` |
    deprecated |

    | Flow | `wss …/v1/realtime?model=pyai-flow-realtime` | Subprotocol |
    `flow:session` | legacy |

    | Agent profiles (optional config) | `/v1/agents`, `/v1/agents/{id}` |
    Bearer | `omni:session` | live |

    | Trace (config) | `/v1/trace/config`, `/v1/trace/rule-packs` | Bearer |
    `trace:configure` | live |

    | Trace (reads) | `/v1/trace/interactions`, `/violations`, `/findings`,
    `/exposure` | Bearer | `trace:read` | live |

    | Recap (config) | `/v1/recap/config` | Bearer | `recap:configure` | live |

    | Recap (CRM) | `/v1/recap/crm-config` | Bearer | `recap:configure` | live |

    | Recap (reads) | `/v1/recap/calls` | Bearer | `recap:read` | live |

    | Omni call records | `/v1/omni/calls`, `/v1/omni/calls/{id}` | Bearer |
    `omni:read` | live |

    | AMD (stream) | `wss …/v1/amd/stream` (Twilio Media Streams drop-in) |
    Subprotocol | `amd:detect` | live |

    | AMD (config) | `GET`/`POST /v1/amd/config` | Bearer | `amd:configure` |
    live |

    | AMD (reads) | `GET /v1/amd/calls`, `/v1/amd/calls/{id}` | Bearer |
    `amd:read` | live |

    | Telephony | `/v1/telephony/*` | Bearer | `telephony:manage` | live |

    | Agents (create/manage/track feature) | _coming soon_ | n/a | n/a | planned |


    WebSocket surfaces authenticate with the `Sec-WebSocket-Protocol:
    pyai-key.<API_KEY>` subprotocol (or `?api_key=` server-side); everything
    else takes the `Authorization: Bearer` key. Telephony's carrier-backed calls
    (search/provision/release) return 404 until a carrier is configured for the
    account.


    ## Rate limits & billing


    Every key has a per-second rate limit (with burst) and a cap on concurrent
    realtime sessions. Exceeding either returns `429` with a `Retry-After`
    header.


    Usage is metered per minute of audio (transcription minutes for Hear,
    synthesized audio minutes for Speak, and realtime session minutes for Cue
    and Omni) and billed against your plan and credits. List prices: Hear
    $0.001/min (async Transcribe $0.0005/min), Speak $0.04/min streaming
    ($0.04/min async), Cue $0.015/min, Omni $0.05/min, Agents $0.08/min (the
    create/manage/track feature; rolling out).


    The AMD API bills per **answered** call: the first 5,000 answered calls each
    month are free, then $0.004/answered call (no-answers, busies, and failed
    calls are free; AMD bundled with PyAI telephony/Omni is included at no
    charge). Because AMD decides in a fraction of the incumbent's dead-air
    dwell, its all-in cost is lower than that of legacy answering-machine
    detection.


    AI products (Hear, Speak, Cue, Omni) bill **per second by default**: the
    pulse is applied once to each meter's invoice-period total, so many short
    sessions are summed and rounded a single time (never minute-rounded per
    call), and an empty/failed call bills nothing. Coarser pulses are available
    as an optional enterprise override. Managed telephony minutes keep a
    1-minute pulse (carrier economics). Per-character Speak billing is available
    on enterprise contracts.
  contact:
    name: PyAI
    url: https://pyai.com
servers:
  - url: https://api.pyai.com
    description: Production
security:
  - apiKey: []
  - xApiKey: []
tags:
  - name: Identity
    description: >-
      Introspect the calling key: org/project, env, granted scopes, and
      limits/credit posture. Use it to self-diagnose a 401/403/402.
  - name: Hear
    description: Speech-to-text (streaming + batch)
  - name: Speak
    description: Text-to-speech and voice cloning
  - name: Realtime
    description: >-
      Full-duplex WebSocket sessions: Omni (agentic speech-to-speech with
      knowledge bases + tools). The legacy Flow engine is retired for new
      customers; its /v1/realtime alias routes to Omni.
  - name: Models
    description: Model catalog
  - name: Sandbox
    description: >-
      Zero-friction onboarding for coding agents: mint a free, instant, no-card
      sandbox key with no human steps.
  - name: Transcription Jobs
    description: Async batch transcription
  - name: Agents
    description: >-
      Agent profiles: OPTIONAL pre-stored Omni session config (persona,
      greeting, voice, conversation knobs) you can reference by id instead of
      sending a `configure` frame each call. Optional convenience over the
      zero-state `/v1/omni` primitive; NOT required to connect. (Distinct from
      the upcoming Agents feature, the no-code create/manage/track surface.)
  - name: Trace
    description: >-
      Compliance & guardrails: per-agent config, rule packs, and the exposure /
      violations / interaction-evidence read views
  - name: AMD
    description: >-
      Answering-machine detection: know who or what answered a call (human,
      voicemail, IVR, iPhone/Google screening, dead number, fax) in a fraction
      of the dead-air dwell (under 300 ms for a human, under 800 ms for a
      machine, in-region), with the reason it decided. Twilio Media Streams
      drop-in over `wss …/v1/amd/stream`; one operating-point dial; billed per
      answered call.
  - name: Telephony
    description: >-
      Managed phone numbers: search, provision, route to an agent, and release.
      Call minutes bill on telephony.minutes ($0.01/min).
paths:
  /v1/omni/calls/{call_id}/transcript:
    get:
      tags:
        - Omni Calls
      summary: Get an Omni call transcript
      description: >-
        The full conversation transcript for the call. The endpoint resolves the
        transcript whether it is stored inline or offloaded to an external URL,
        so the response is always the transcript document itself. Returns `404`
        if no transcript exists (e.g. the call failed before any speech, or the
        engine has not yet pushed the record). Requires the `omni:read` scope.
      operationId: getOmniCallTranscript
      parameters:
        - name: call_id
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Call transcript
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OmniCallTranscript'
        '404':
          $ref: '#/components/responses/NotFound'
components:
  schemas:
    OmniCallTranscript:
      type: object
      properties:
        object:
          type: string
          example: omni.call.transcript
        call_id:
          type: string
        transcript:
          description: >-
            The transcript document. Shape is engine-defined: typically an
            object with a `turns` array, each turn carrying `role` (`user` or
            `assistant`), `text`, and optional `start_ms`/`end_ms` timestamps.
            May be a plain string for legacy sessions.
          oneOf:
            - type: object
            - type: string
    Problem:
      type: object
      description: >-
        RFC 7807 problem+json returned by the control plane (request-validation
        and resource errors such as 400/404/409). The stable code is the last
        path segment of `type`.
      required:
        - title
        - status
      properties:
        type:
          type: string
          description: Problem type URI; ends with the stable code.
        title:
          type: string
        status:
          type: integer
        detail:
          type: string
        request_id:
          type: string
  responses:
    NotFound:
      description: 'Resource not found (`code: not_found`)'
      content:
        application/problem+json:
          schema:
            $ref: '#/components/schemas/Problem'
  securitySchemes:
    apiKey:
      type: http
      scheme: bearer
      description: 'Use `Authorization: Bearer pyai_live_...` (or `pyai_test_...`).'
    xApiKey:
      type: apiKey
      in: header
      name: x-api-key
      description: >-
        Header alias for bearer auth on HTTP endpoints. WebSocket auth uses
        subprotocol `pyai-key.<API_KEY>`.

````
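
The `Problem` schema in the spec says the stable error code is the last path segment of `type`. A small sketch of reading that code from a `404` body could look like the following; the helper names and the example `type` URI are assumptions for illustration, since the spec does not show a concrete URI.

```python
import json
from urllib.parse import urlparse


def problem_code(problem):
    """Return the stable code: the last path segment of the `type` URI, or None."""
    path = urlparse(problem.get("type") or "").path.rstrip("/")
    return path.rsplit("/", 1)[-1] or None


def explain_problem(raw_body):
    """Turn a problem+json body into a short, log-friendly message."""
    problem = json.loads(raw_body)
    # `title` and `status` are the two required Problem fields.
    message = f"{problem['status']} {problem['title']}"
    code = problem_code(problem)
    if code:
        message += f" [{code}]"
    if problem.get("request_id"):
        message += f" (request_id={problem['request_id']})"
    return message
```

Branching on the code from `problem_code` rather than on `title` or `detail` keeps error handling tied to the field the spec describes as stable.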