# PyAI

> Telephony-native Voice AI — Hear, Speak, Cue, and Omni behind one API key.

## Docs

- [Create an agent profile](https://docs.pyai.com/api-reference/agents/create-an-agent-profile.md): Create an OPTIONAL agent profile (persona, greeting, voice, conversation knobs) so you can reference it by id instead of sending a full `configure` frame each call. Drive it with `wss://api.pyai.com/v1/omni?session_label=agent_…` (the profile id doubles as the opaque session label; `agent_id` is the…
- [Delete an agent](https://docs.pyai.com/api-reference/agents/delete-an-agent.md)
- [Get an agent](https://docs.pyai.com/api-reference/agents/get-an-agent.md)
- [List agent profiles](https://docs.pyai.com/api-reference/agents/list-agent-profiles.md): All active agent profiles in your organization. Agent profiles are optional pre-stored Omni session config; they are not required to open an Omni session. Requires the `omni:session` scope.
- [Update an agent](https://docs.pyai.com/api-reference/agents/update-an-agent.md): Partial update: present fields are set, `null` clears a field, absent fields are untouched. Config edits are live on the agent's next call.
- [Stream transcription (WebSocket)](https://docs.pyai.com/api-reference/hear/stream-transcription-websocket.md): Upgrade to a WebSocket for low-latency streaming speech-to-text with eager partials (first partial typically within ~300 ms). Transcription-only — distinct from `/v1/realtime` (Omni duplex). Requires the `hear:stream` scope. With knowledge-base grounding enabled this surface becomes **Cue** (turn de…
- [Transcribe audio](https://docs.pyai.com/api-reference/hear/transcribe-audio.md): Whisper-compatible transcription. Requires the `hear:transcribe` scope.
- [Introspect the calling key (whoami)](https://docs.pyai.com/api-reference/identity/introspect-the-calling-key-whoami.md): Return the identity the gateway resolved for your key: `key_id`, `org_id`, `project_id`, environment (`test`/`live`), `status`, the granted `scopes`, and the rate-limit/credit posture. Any active key may call it (no special scope), so it is the fastest way to self-diagnose a `403 insufficient_scope`…
- [List available models](https://docs.pyai.com/api-reference/models/list-available-models.md)
- [Get an Omni call record](https://docs.pyai.com/api-reference/omni-calls/get-an-omni-call-record.md): Full record for one Omni session: transcript (inline or via `transcript_url`), `recording_url` when available, and `summary` when generated. Requires `omni:read`.
- [Get an Omni call recording URL](https://docs.pyai.com/api-reference/omni-calls/get-an-omni-call-recording-url.md): A short-lived URL to the call's audio recording, when one exists (recording is off by default). `404` if there is no recording for the call. Requires `omni:read`.
- [Get an Omni call summary](https://docs.pyai.com/api-reference/omni-calls/get-an-omni-call-summary.md): The structured post-call summary, when one was generated. `404` if there is no summary for the call. Requires `omni:read`.
- [List Omni call records](https://docs.pyai.com/api-reference/omni-calls/list-omni-call-records.md): Recent Omni realtime sessions for this org, newest first. Each record carries the transcript, an optional recording URL, and an optional summary. Requires the `omni:read` scope. Filter by `session_label` (the opaque tag you passed on the connect URL).
- [Mint an ephemeral Omni session token](https://docs.pyai.com/api-reference/omni/mint-an-ephemeral-omni-session-token.md): Mint a short-lived, origin-locked token a browser can use to open ONE Omni realtime session **directly** — the public/private split for realtime. Call this from your server with your secret key (which must hold `omni:session`); never ship the secret key to a page. The returned `token` carries only `…
- [Open a realtime duplex session (WebSocket)](https://docs.pyai.com/api-reference/realtime/open-a-realtime-duplex-session-websocket.md): Upgrade to a WebSocket for full-duplex STT + turn-taking + TTS in one connection.
- [Open an Omni voice-agent session (WebSocket)](https://docs.pyai.com/api-reference/realtime/open-an-omni-voice-agent-session-websocket.md): The native Omni API: a full-duplex WebSocket to a voice agent grounded in its knowledge bases + tools. Omni runs on a single engine tier (Ultra), so this is the canonical surface for agentic voice.
- [Get a recap record](https://docs.pyai.com/api-reference/recap/get-a-recap-record.md)
- [Get Recap config](https://docs.pyai.com/api-reference/recap/get-recap-config.md): Org Recap enablement, customer webhook URL, and default pack. Requires `recap:configure`.
- [Get Recap CRM config](https://docs.pyai.com/api-reference/recap/get-recap-crm-config.md): Salesforce field mapping and credentials (secrets redacted on GET). Requires `recap:configure`.
- [List recap records](https://docs.pyai.com/api-reference/recap/list-recap-records.md): Recent post-call recaps for this org. Requires `recap:read` and the Recap add-on enabled.
- [Manually trigger recap for a call](https://docs.pyai.com/api-reference/recap/manually-trigger-recap-for-a-call.md)
- [Update Recap config](https://docs.pyai.com/api-reference/recap/update-recap-config.md): Enable Recap and set the customer webhook + default pack. Requires `recap:configure`.
- [Update Recap CRM config](https://docs.pyai.com/api-reference/recap/update-recap-crm-config.md): Hand-configured Salesforce mapping for design partners. Omit secret fields on update to preserve existing values.
- [Create a cloned voice](https://docs.pyai.com/api-reference/speak/create-a-cloned-voice.md): Enroll a custom voice from reference audio. EN-only today. Requires the `voice:clone` scope.
- [Delete a cloned voice](https://docs.pyai.com/api-reference/speak/delete-a-cloned-voice.md): Remove a cloned voice. Voices are tenant-isolated — you can only delete your own (otherwise `403`). Requires the `voice:clone` scope.
- [Delete a designed voice](https://docs.pyai.com/api-reference/speak/delete-a-designed-voice.md): Remove a saved designed (prompt-to-voice) voice, freeing a library-cap slot. Tenant-isolated — you can only delete your own (otherwise `404`). Stock voices can't be deleted; cloned voices delete via `DELETE /v1/voice/clones/{id}`.
- [Design a voice from a prompt](https://docs.pyai.com/api-reference/speak/design-a-voice-from-a-prompt.md): Generate a brand-new **synthetic** voice from a text description (distinct from cloning, which copies a real person). Async: returns `202` with a `design_id`; poll `GET /v1/voice/design/{id}` for candidate previews, then `POST /v1/voice/design/{id}/save` to keep one. Requires the `voice:design` scop…
- [Get a stock voice](https://docs.pyai.com/api-reference/speak/get-a-stock-voice.md)
- [Get design candidates](https://docs.pyai.com/api-reference/speak/get-design-candidates.md): Poll a design job. When `status` is `completed`, `candidates` carries signed preview URLs (24h TTL). Candidates below the quality gate are omitted. Requires the `voice:design` scope.
- [List cloned voices](https://docs.pyai.com/api-reference/speak/list-cloned-voices.md)
- [List voices](https://docs.pyai.com/api-reference/speak/list-voices.md): Your unified voice library: the prebuilt PyAI catalog (each with a persona, avatar, and audio preview) merged with your saved **designed** voices, each tagged by `source` (`stock` | `design`). Any active key may read it (no specific scope). Use a `voice_id` from here as the `voice` in `POST /v1/audi…
- [Save a designed voice](https://docs.pyai.com/api-reference/speak/save-a-designed-voice.md): Enroll the chosen candidate as a permanent `voice_id`, usable immediately in `POST /v1/audio/speech` (`voice: vd_…`) and in Omni agents. Requires the `voice:design` scope.
- [Synthesize speech](https://docs.pyai.com/api-reference/speak/synthesize-speech.md): OpenAI-compatible text-to-speech. Returns audio bytes. Requires the `voice:synthesize` scope.
- [List your numbers](https://docs.pyai.com/api-reference/telephony/list-your-numbers.md): Your org's managed numbers, newest first. Active only unless `include_released=true`. Requires the `telephony:manage` scope.
- [Provision (buy) a number](https://docs.pyai.com/api-reference/telephony/provision-buy-a-number.md): Buy a specific available number and attach it to your org, optionally binding it to an `agent_id` for inbound routing. Carrier-side recording is never enabled — calls are recorded in PyAI's media bridge. Connected minutes bill on `telephony.minutes` ($0.01/min). Requires the `telephony:manage` scope…
- [Release a number](https://docs.pyai.com/api-reference/telephony/release-a-number.md): Release the number back to the carrier (stops the monthly rental). Idempotent. Requires the `telephony:manage` scope.
- [Route a number to an agent](https://docs.pyai.com/api-reference/telephony/route-a-number-to-an-agent.md): Bind the number to an `agent_id` (or pass `null` to unassign) so inbound calls open that agent's Omni session. Requires the `telephony:manage` scope.
- [Search available numbers](https://docs.pyai.com/api-reference/telephony/search-available-numbers.md): Search the carrier's available US local numbers to buy. Filter by `area_code` (NPA) or a `contains` digit pattern. Requires the `telephony:manage` scope.
- [Compliance exposure summary](https://docs.pyai.com/api-reference/trace/compliance-exposure-summary.md): The dashboard headline / Exposure Scan: interactions scanned, the share with a compliance gap, a per-rule exposure ranking, and the verdict mix over a trailing window. Requires the `trace:read` scope.
- [Get a rule pack](https://docs.pyai.com/api-reference/trace/get-a-rule-pack.md): Resolve a pack by `pack_id` (latest active by default; pass `version` to pin a specific version). Requires the `trace:configure` scope.
- [Get an interaction (the evidence view)](https://docs.pyai.com/api-reference/trace/get-an-interaction-the-evidence-view.md): The full per-call scorecard (findings with plain-English reasons + cited regulations, satisfied requirements, redactions, gate health, verdict) plus the tamper-evident `audit_hash`. With scorecard-v1 the response also carries the optional per-call `timeline` and `quality_metrics` eval blocks (empty…
- [Get Trace config for an agent (or the org default)](https://docs.pyai.com/api-reference/trace/get-trace-config-for-an-agent-or-the-org-default.md): Returns the per-agent Trace config (spec §5.1). Pass `agent_id` to read a specific agent's config; omit it for the org-wide default a new agent inherits. When nothing has been configured yet, returns the safe default (`enabled:false`, `mode:warn`, `fail_open:true`). Requires the `trace:configure` sc…
- [List rule packs](https://docs.pyai.com/api-reference/trace/list-rule-packs.md): Built-in packs (TCPA, HIPAA, PII, brand-voice) plus this tenant's custom uploads. Requires the `trace:configure` scope.
- [List scanned interactions (scorecards)](https://docs.pyai.com/api-reference/trace/list-scanned-interactions-scorecards.md): Cursor-paginated, newest first. Each row is one call's Tier-0 compliance scorecard. Filter by `verdict` (PASS/WARN/FAIL) or `agent_id`. Requires the `trace:read` scope.
- [List Tier-2 semantic findings](https://docs.pyai.com/api-reference/trace/list-tier-2-semantic-findings.md): Cursor-paginated Tier-2 (async semantic) findings — the model-judged concerns deterministic rules can't catch (HIPAA minimum-necessary, brand tone, hallucination-vs-knowledge-base, indirect opt-out, context-dependent PII). These are advisory and non-blocking, and are kept separate from the hash-chai…
- [List violations (findings)](https://docs.pyai.com/api-reference/trace/list-violations-findings.md): Cursor-paginated drill-down of every fired rule across scorecards. Filter by `rule_id`, `severity`, or `interaction_id`. Requires the `trace:read` scope.
- [Set Trace config for an agent (or the org default)](https://docs.pyai.com/api-reference/trace/set-trace-config-for-an-agent-or-the-org-default.md): Upsert the per-agent Trace config (spec §5.1). The body is the §5.1 object (optionally wrapped as `{ agent_id, config }`). Modes: `warn` (log only, never blocks) · `modify` (redact PII / inject disclosures) · `block` · `human_handoff`. Always fail-open; the deterministic inline gate runs models-side…
- [Upload a custom rule pack](https://docs.pyai.com/api-reference/trace/upload-a-custom-rule-pack.md): Register a custom rule pack (spec §5.3) in the Trace DSL. Structural validation only here (`pack_id`, `version`, non-empty `rules`); the kernel compiles + deep-validates it models-side at pull time, and citations/wording are attorney-curated out of band. Requires the `trace:configure` scope.
- [Cancel a transcription job](https://docs.pyai.com/api-reference/transcription-jobs/cancel-a-transcription-job.md): Cancels a `queued`/`running` job; idempotent on terminal jobs (returns them unchanged).
- [Create an async transcription job](https://docs.pyai.com/api-reference/transcription-jobs/create-an-async-transcription-job.md): Submit audio for batch transcription. Provide **exactly one** source: either `audio_url` (an https URL we fetch — privacy-cleanest, the input is never stored) **or** a multipart upload (`multipart/form-data` with an `audio` file part and the same fields as form fields).
- [Get a transcription job](https://docs.pyai.com/api-reference/transcription-jobs/get-a-transcription-job.md)
- [List transcription jobs](https://docs.pyai.com/api-reference/transcription-jobs/list-transcription-jobs.md): Cursor-paginated, newest first. Pass `limit` (1–100, default 20) and the `next_cursor` from the previous page as `cursor` to continue. `next_cursor` is null on the last page.
- [Authentication](https://docs.pyai.com/authentication.md): Authenticate PyAI requests with bearer API keys: `pyai_test_` vs `pyai_live_` environments, per-product scopes, WebSocket subprotocol auth, and key rotation.
- [Changelog](https://docs.pyai.com/changelog.md): Release notes for the PyAI API, SDKs, and docs: protocol updates for Hear, Speak, and Omni, new telephony audio support, and developer-experience improvements.
- [Errors & limits](https://docs.pyai.com/errors-and-limits.md): PyAI error envelopes (OpenAI-style and RFC 7807 problem+json), stable error codes, rate limits, retry guidance, and Idempotency-Key semantics.
- [Build a browser voice agent](https://docs.pyai.com/guides/browser-voice-agent.md): Capture the mic, stream PCM16 to Omni over WebSocket, and play the agent's voice back — a full talking agent in the browser in about 10 minutes.
- [Build a conversation-intelligence feature](https://docs.pyai.com/guides/conversation-intelligence.md): Build a Gong-style call analytics pipeline with PyAI Hear: batch-transcribe with speaker diarization, compute talk ratio and keyword hits, then LLM-summarize.
- [Integrate Omni with FreeSWITCH](https://docs.pyai.com/guides/freeswitch-voice-agent.md): Fork channel audio (L16/16 kHz) to an Omni agent over WebSocket, with barge-in via `uuid_break`, live transfer via `uuid_transfer`, and DTMF over ESL.
- [Stream speech-to-text in real time](https://docs.pyai.com/guides/streaming-stt.md): Build live captions on Hear's streaming WebSocket: send PCM16 audio, render greyed interim partials, force-finalize with commit, and lock in finals on the fly.
- [Build a phone voice agent with Twilio](https://docs.pyai.com/guides/twilio-voice-agent.md): Bridge Twilio Media Streams to an Omni agent: a deployable Node server that handles μ-law↔PCM16, barge-in, DTMF, and live transfer to a human.
- [Clone a voice end to end](https://docs.pyai.com/guides/voice-cloning.md): Enroll a custom voice from a short reference clip, preview it, synthesize speech with it, drop it into an Omni agent, and debug why clips get rejected.
- [Pricing & metering](https://docs.pyai.com/pricing-and-metering.md): How PyAI meters Hear, Speak, and Reason usage: the `x-pyai-units` response header, prepaid credits, spend caps, and `402 Payment Required` semantics.
- [Quickstart](https://docs.pyai.com/quickstart.md): Get a PyAI key, verify it with `GET /v1/me`, synthesize your first WAV with Speak, and connect a realtime Omni voice agent, all in under 60 seconds.
- [Omni wire protocol (v2)](https://docs.pyai.com/realtime/omni-protocol.md): Reference for the Omni realtime WebSocket protocol: connect URL, optional `session_label`, `configure` frame, and zero-state speech-to-speech audio.
- [Telephony audio (8 kHz μ-law)](https://docs.pyai.com/reference/telephony-audio.md): Move phone audio in and out of Speak, Hear, and Omni cleanly: native G.711 μ-law/A-law at 8 kHz, μ-law companding, and exact integer resample ratios.
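
The telephony audio entry above mentions μ-law companding and exact integer resample ratios. As a rough illustration only, the sketch below uses the textbook continuous μ-law curve (μ = 255), not the bit-exact segmented G.711 codec, and the helper names (`mulaw_compress`, `mulaw_expand`, `resample_ratio`) are hypothetical rather than part of any PyAI SDK. It also shows why 8 kHz to 16 kHz is an exact integer ratio.

```python
import math
from fractions import Fraction

MU = 255  # mu-law parameter; the textbook value associated with G.711


def mulaw_compress(x: float) -> float:
    """Continuous mu-law curve: maps a sample in [-1, 1] to [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress on [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def resample_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Output samples per input sample, reduced to lowest terms."""
    return Fraction(dst_hz, src_hz)


# 8 kHz phone audio to 16 kHz PCM: every input sample becomes two.
ratio = resample_ratio(8000, 16000)
print(ratio)

# Compress then expand a quiet sample; the round trip is lossless here
# only because this sketch skips the 8-bit quantization a real codec does.
roundtrip = mulaw_expand(mulaw_compress(0.25))
print(roundtrip)
```

A ratio whose denominator is 1 means upsampling needs no fractional interpolation positions; the reverse direction (16 kHz to 8 kHz) reduces to 1/2, a plain decimation by two after low-pass filtering.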

## OpenAPI Specs

- [openapi](https://api.pyai.com/openapi.json)