
# OpenRouter provider model catalog

> Machine-readable catalog in the OpenRouter provider 'List Models' format (https://openrouter.ai/docs/guides/community/for-providers). Polled by OpenRouter's provider monitor. Models with `is_ready: false` are staged but hidden until their inference surface is ready.
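
As a minimal sketch of consuming this catalog, the snippet below filters out staged entries (`is_ready: false`) and parses the string-typed prices with `Decimal`. The helper names (`fetch_catalog`, `ready_models`, `parse_pricing`) and the sample payload are illustrative assumptions, not part of the PyAI API. Sending a bearer key is assumed from the spec's global `security` block.

```python
import json
import urllib.request
from decimal import Decimal

CATALOG_URL = "https://api.pyai.com/v1/providers/openrouter/models"


def fetch_catalog(api_key: str, url: str = CATALOG_URL) -> dict:
    """GET the catalog (hypothetical helper; bearer auth is an assumption)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def ready_models(catalog: dict) -> list[dict]:
    """Keep entries whose is_ready flag is exactly True.

    Treating a missing flag as "not ready" is a conservative assumption.
    """
    return [m for m in catalog.get("data", []) if m.get("is_ready") is True]


def parse_pricing(model: dict) -> dict[str, Decimal]:
    """Convert string-typed prices to Decimal so no float drift is introduced."""
    return {key: Decimal(value) for key, value in model.get("pricing", {}).items()}


# Illustrative payload shaped like OpenRouterModelList; the values are made up.
SAMPLE = {
    "data": [
        {"id": "pyai-hear", "is_ready": True,
         "pricing": {"prompt": "0", "request": "0"}},
        {"id": "example-staged-model", "is_ready": False,
         "pricing": {"prompt": "0"}},
    ]
}

visible = ready_models(SAMPLE)
print([m["id"] for m in visible])   # ['pyai-hear']
print(parse_pricing(visible[0]))
```

To run it against the live endpoint, replace `SAMPLE` with `fetch_catalog(os.environ["PYAI_API_KEY"])` (and `import os`).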


## OpenAPI

````yaml https://api.pyai.com/openapi.json get /v1/providers/openrouter/models
openapi: 3.1.0
info:
  title: PyAI API
  version: 1.0.0
  description: >-
    Telephony-native Voice AI behind one bearer key:


    - **Hear**: speech-to-text · `POST /v1/audio/transcriptions` (streaming +
    batch)

    - **Speak**: text-to-speech, stock voices & voice cloning · `POST
    /v1/audio/speech`, `GET /v1/voices`, `/v1/voice/clones`

    - **Cue**: streaming turn detection + knowledge-base context for your own
    LLM/voice pipeline · `GET /v1/audio/transcriptions/stream` with grounding

    - **Omni**: full-duplex agentic voice (speech-to-speech, grounded in your
    knowledge bases + tools) · `/v1/omni` (and the OpenAI-compatible
    `/v1/realtime`)

    - **AMD API**: answering-machine detection that tells you *who or what*
    answered a call (human, voicemail, IVR, iPhone/Google screening, dead
    number, fax) in a fraction of the dead-air dwell (under 300 ms for a human,
    under 800 ms for a machine, in-region), with the reason it decided · `wss
    …/v1/amd/stream` (Twilio Media Streams drop-in), `POST /v1/amd/config`,
    `GET /v1/amd/calls/{id}`

    - **Agents**: PyAI's feature to create, manage & track your Omni voice
    agents (no-code builder, hosted knowledge, evals, monitoring). _Coming
    soon._


    ## Authentication


    Create a key in the [console](https://console.pyai.com) (it is shown once)
    and send it as a bearer token:


    ```

    Authorization: Bearer pyai_live_...

    ```


    Keys are environment-scoped: `pyai_live_...` (production) and
    `pyai_test_...` (sandbox). New accounts receive up to $50 of prepaid credits
    after phone verification (graduated signup); `pyai_test_` keys skip the
    credit gate entirely.


    Keys are self-validating signed tokens: they work on every PyAI surface the
    instant they are created, with no activation or propagation delay. Treat
    them as opaque strings (up to 512 chars) and never parse their contents.


    WebSocket endpoints can't use request headers from a browser, so pass the
    key as a **subprotocol** instead:


    ```

    Sec-WebSocket-Protocol: pyai-key.pyai_live_...

    ```


    Server-side clients may instead append `?api_key=...` to the URL. The
    gateway authenticates the key on the WebSocket upgrade and swaps it for the
    internal upstream credential, so your key never reaches the model and the
    model's key never reaches you.


    ## Quickstart: Hear (speech-to-text)


    ```

    curl https://api.pyai.com/v1/audio/transcriptions \
      -H "Authorization: Bearer $PYAI_API_KEY" \
      -F file=@audio.wav -F model=pyai-hear
    # -> { "text": "..." }

    ```


    ## Quickstart: Speak (text-to-speech)


    ```

    curl https://api.pyai.com/v1/audio/speech \
      -H "Authorization: Bearer $PYAI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model":"pyai-voice","input":"Hello from PyAI.","voice":"voice_abc"}' \
      --output speech.wav
    ```


    `voice` is a stock voice id from `GET /v1/voices` (a curated set of 12
    prebuilt voices with personas, avatars, and previews) or a cloned voice id
    from `/v1/voice/clones`. Omit it to use your account's default voice.


    ## Quickstart: Omni (realtime voice agent)


    Omni is **zero-state: there is nothing to create first.** Open a WebSocket,
    pass your key as a subprotocol, and send the agent's behavior (voice,
    persona, knowledge endpoint) in the first `configure` frame:


    ```

    wss://api.pyai.com/v1/omni?session_label=support&format=pcm16&rate=24000
      Sec-WebSocket-Protocol: pyai-key.$PYAI_API_KEY
    ```


    The session is authorized by your key's **organization**; `session_label` is
    an **optional, opaque** tag (echoed to your own knowledge endpoint for
    correlation), so you can omit it or use any value. When `session_label`
    equals a **`/v1/agents` profile id**, the engine loads persona, voice, and
    **greeting message** from that profile (turn-0 playback). `format` and
    `rate` are load-bearing on the connect URL (the SDK sets them). Send PCM16
    audio as binary frames and receive the agent's speech the same way.


    **Optional convenience:** pre-store config via `POST /v1/agents` (including
    `greeting`, `consent_line`, `recordings_enabled`) and pass its id as
    `session_label`, or send everything inline in the post-handshake `configure`
    frame. A stored profile is not required to connect.


    The OpenAI-realtime-compatible surface (`/v1/realtime`) is served by the
    same Omni engine; new integrations should prefer `/v1/omni`. (_Flow, the
    legacy voice-duplex engine, is retired for new customers; its `/v1/realtime`
    alias now routes to Omni._)


    For reproducible eval runs, determinism controls (`seed`/`temperature`) ride
    the Omni session's `configure` frame, which the gateway passes through
    unchanged. They are honored once the engine supports them; no platform
    change is required.


    ## Scopes


    | Scope | Grants |

    | --- | --- |

    | `hear:transcribe` | `POST /v1/audio/transcriptions` |

    | `hear:stream` | `GET /v1/audio/transcriptions/stream` (WebSocket) |

    | `voice:synthesize` | `POST /v1/audio/speech` (Speak) |

    | `voice:clone` | `/v1/voice/clones` (Speak) |

    | `voice:design` | `/v1/voice/design` (Speak) |

    | `flow:session` | _legacy_: `/v1/realtime` for existing Flow customers; new
    traffic on `/v1/realtime` uses Omni |

    | `omni:session` | `/v1/omni` (native), `/v1/realtime` (Omni), and `POST
    /v1/omni/sessions` (mint a browser session token) |

    | `omni:read` | `/v1/omni/calls` (Omni post-call records) |

    | `transcribe:jobs` | `/v1/transcription/jobs` |

    | `trace:configure` | `/v1/trace/config`, `/v1/trace/rule-packs` (Trace
    management) |

    | `trace:read` | `/v1/trace/interactions`, `/violations`, `/findings`,
    `/exposure` (Trace reads) |

    | `recap:configure` | `/v1/recap/config` (Recap management) |

    | `recap:configure` | `/v1/recap/crm-config` (Salesforce field mapping) |

    | `recap:read` | `/v1/recap/calls` (Recap reads) |

    | `amd:detect` | `wss …/v1/amd/stream` (AMD realtime detection, Twilio
    drop-in) |

    | `amd:configure` | `/v1/amd/config` (AMD operating-point dial + webhook) |

    | `amd:read` | `/v1/amd/calls` (AMD decision records) |

    | `telephony:manage` | `/v1/telephony/*` (managed numbers) |


    `GET /v1/models`, `GET /v1/voices`, and `GET /v1/me` need no specific scope;
    any active key may call them. Wildcards (`hear:*`, `voice:*`, …, and the
    global `*`) grant every scope in their family.


    ## Canonical endpoints


    One row per product surface: endpoint, auth, required scope, and lifecycle
    status. **live** = generally available; **deprecated** = works during a
    migration window (don't build new on it); **legacy** = supported for
    existing customers only; **planned** = not yet available.


    | Product | Endpoint | Auth | Scope | Status |

    | --- | --- | --- | --- | --- |

    | Identity | `GET /v1/me` | Bearer | _any active key_ | live |

    | Models | `GET /v1/models` | Bearer | _any active key_ | live |

    | Voices | `GET /v1/voices`, `GET /v1/voices/{id}` | Bearer | _any active
    key_ | live |

    | Hear (batch) | `POST /v1/audio/transcriptions` | Bearer |
    `hear:transcribe` | live |

    | Hear (streaming) | `GET /v1/audio/transcriptions/stream` (WS) |
    Subprotocol | `hear:stream` | live |

    | Cue | `GET /v1/audio/transcriptions/stream` + grounding (WS) | Subprotocol
    | `hear:stream` | live |

    | Hear (async batch) | `POST`/`GET /v1/transcription/jobs` | Bearer |
    `transcribe:jobs` | live |

    | Speak (TTS) | `POST /v1/audio/speech` | Bearer | `voice:synthesize` | live
    |

    | Speak (cloning) | `GET`/`POST /v1/voice/clones` | Bearer | `voice:clone` |
    live |

    | Speak (design) | `/v1/voice/design` | Bearer | `voice:design` | live |

    | Omni (native) | `wss …/v1/omni?agent_id=` | Subprotocol | `omni:session` |
    live |

    | Omni (OpenAI-compat) | `wss …/v1/realtime?model=pyai-omni-realtime` |
    Subprotocol | `omni:session` | live |

    | Omni (alias) | `wss …/v2/omni/chat` | Subprotocol | `omni:session` |
    deprecated |

    | Flow | `wss …/v1/realtime?model=pyai-flow-realtime` | Subprotocol |
    `flow:session` | legacy |

    | Agent profiles (optional config) | `/v1/agents`, `/v1/agents/{id}` |
    Bearer | `omni:session` | live |

    | Trace (config) | `/v1/trace/config`, `/v1/trace/rule-packs` | Bearer |
    `trace:configure` | live |

    | Trace (reads) | `/v1/trace/interactions`, `/violations`, `/findings`,
    `/exposure` | Bearer | `trace:read` | live |

    | Recap (config) | `/v1/recap/config` | Bearer | `recap:configure` | live |

    | Recap (CRM) | `/v1/recap/crm-config` | Bearer | `recap:configure` | live |

    | Recap (reads) | `/v1/recap/calls` | Bearer | `recap:read` | live |

    | Omni call records | `/v1/omni/calls`, `/v1/omni/calls/{id}` | Bearer |
    `omni:read` | live |

    | AMD (stream) | `wss …/v1/amd/stream` (Twilio Media Streams drop-in) |
    Subprotocol | `amd:detect` | live |

    | AMD (config) | `GET`/`POST /v1/amd/config` | Bearer | `amd:configure` |
    live |

    | AMD (reads) | `GET /v1/amd/calls`, `/v1/amd/calls/{id}` | Bearer |
    `amd:read` | live |

    | Telephony | `/v1/telephony/*` | Bearer | `telephony:manage` | live |

    | Agents (create/manage/track feature) | _coming soon_ | n/a | n/a | planned |


    WebSocket surfaces authenticate with the `Sec-WebSocket-Protocol:
    pyai-key.<API_KEY>` subprotocol (or `?api_key=` server-side); everything
    else takes the `Authorization: Bearer` key. Telephony's carrier-backed calls
    (search/provision/release) return 404 until a carrier is configured for the
    account.


    ## Rate limits & billing


    Every key has a per-second rate limit (with burst) and a cap on concurrent
    realtime sessions. Exceeding either returns `429` with a `Retry-After`
    header.


    Usage is metered per minute of audio (transcription minutes for Hear,
    synthesized audio minutes for Speak, and realtime session minutes for Cue
    and Omni) and billed against your plan and credits. List prices: Hear
    $0.001/min (async Transcribe $0.0005/min), Speak $0.04/min streaming
    ($0.04/min async), Cue $0.015/min, Omni $0.05/min, Agents $0.08/min (the
    create/manage/track feature; rolling out).


    The AMD API bills per **answered** call: the first 5,000 answered calls each
    month are free, then $0.004/answered call (no-answers, busies, and failed
    calls are free; AMD bundled with PyAI telephony/Omni is included at no
    charge). Because AMD decides in a fraction of the incumbent's dead-air
    dwell, its all-in cost is lower than that of legacy answering-machine
    detection.


    AI products (Hear, Speak, Cue, Omni) bill **per second by default**. The
    pulse is applied once to each meter's invoice-period total, so many short
    sessions are summed and rounded a single time (never minute-rounded per
    call), and an empty/failed call bills nothing. Coarser pulses are available
    as an optional enterprise override. Managed telephony minutes keep a
    1-minute pulse (carrier economics). Per-character Speak billing is available
    on enterprise contracts.
  contact:
    name: PyAI
    url: https://pyai.com
servers:
  - url: https://api.pyai.com
    description: Production
security:
  - apiKey: []
  - xApiKey: []
tags:
  - name: Identity
    description: >-
      Introspect the calling key: org/project, env, granted scopes, and
      limits/credit posture. Use it to self-diagnose a 401/403/402.
  - name: Hear
    description: Speech-to-text (streaming + batch)
  - name: Speak
    description: Text-to-speech and voice cloning
  - name: Realtime
    description: >-
      Full-duplex WebSocket sessions: Omni (agentic speech-to-speech with
      knowledge bases + tools). The legacy Flow engine is retired for new
      customers; its /v1/realtime alias routes to Omni.
  - name: Models
    description: Model catalog
  - name: Sandbox
    description: >-
      Zero-friction onboarding for coding agents: mint a free, instant, no-card
      sandbox key with no human steps.
  - name: Transcription Jobs
    description: Async batch transcription
  - name: Agents
    description: >-
      Agent profiles: OPTIONAL pre-stored Omni session config (persona,
      greeting, voice, conversation knobs) you can reference by id instead of
      sending a `configure` frame each call. Optional convenience over the
      zero-state `/v1/omni` primitive; NOT required to connect. (Distinct from
      the upcoming Agents feature, the no-code create/manage/track surface.)
  - name: Trace
    description: >-
      Compliance & guardrails: per-agent config, rule packs, and the exposure /
      violations / interaction-evidence read views
  - name: AMD
    description: >-
      Answering-machine detection: know who or what answered a call (human,
      voicemail, IVR, iPhone/Google screening, dead number, fax) in a fraction
      of the dead-air dwell (under 300 ms for a human, under 800 ms for a
      machine, in-region), with the reason it decided. Twilio Media Streams
      drop-in over `wss …/v1/amd/stream`; one operating-point dial; billed per
      answered call.
  - name: Telephony
    description: >-
      Managed phone numbers: search, provision, route to an agent, and release.
      Call minutes bill on telephony.minutes ($0.01/min).
paths:
  /v1/providers/openrouter/models:
    get:
      tags:
        - Models
      summary: OpenRouter provider model catalog
      description: >-
        Machine-readable catalog in the OpenRouter provider 'List Models' format
        (https://openrouter.ai/docs/guides/community/for-providers). Polled by
        OpenRouter's provider monitor. Models with `is_ready: false` are staged
        but hidden until their inference surface is ready.
      operationId: listOpenRouterModels
      responses:
        '200':
          description: OpenRouter-format model catalog
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OpenRouterModelList'
components:
  schemas:
    OpenRouterModelList:
      type: object
      properties:
        data:
          type: array
          items:
            $ref: '#/components/schemas/OpenRouterModel'
    OpenRouterModel:
      type: object
      description: A model entry in the OpenRouter provider 'List Models' format.
      properties:
        id:
          type: string
          example: pyai-hear
          description: Exact id OpenRouter calls this provider with.
        hugging_face_id:
          type: string
          example: ''
        name:
          type: string
          example: 'PyAI: Hear (Speech-to-Text)'
        created:
          type: integer
          description: Unix epoch seconds.
        description:
          type: string
        input_modalities:
          type: array
          items:
            type: string
          example:
            - audio
        output_modalities:
          type: array
          items:
            type: string
          description: >-
            OpenRouter modalities: e.g. transcription (STT), speech (TTS),
            audio, text.
          example:
            - transcription
        context_length:
          type: integer
          description: Token window (text/chat models); omitted for audio.
        max_output_length:
          type: integer
        pricing:
          type: object
          description: >-
            USD per unit, string-typed to avoid float drift. Per-second audio
            pricing is configured with OpenRouter during onboarding.
          properties:
            prompt:
              type: string
              example: '0'
            completion:
              type: string
              example: '0'
            request:
              type: string
              example: '0'
            image:
              type: string
              example: '0'
            input_cache_read:
              type: string
              example: '0'
        supported_sampling_parameters:
          type: array
          items:
            type: string
          example:
            - temperature
        supported_features:
          type: array
          items:
            type: string
        is_ready:
          type: boolean
          description: false keeps the model staged-but-hidden on OpenRouter.
        is_free:
          type: boolean
        openrouter:
          type: object
          properties:
            slug:
              type: string
              example: pyai/hear
        datacenters:
          type: array
          items:
            type: object
            properties:
              country_code:
                type: string
                example: US
  securitySchemes:
    apiKey:
      type: http
      scheme: bearer
      description: 'Use `Authorization: Bearer pyai_live_...` (or `pyai_test_...`).'
    xApiKey:
      type: apiKey
      in: header
      name: x-api-key
      description: >-
        Header alias for bearer auth on HTTP endpoints. WebSocket auth uses
        subprotocol `pyai-key.<API_KEY>`.

````
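
The billing rules in the spec's description (a pulse applied once to the invoice-period total, and AMD's free monthly tier) can be sketched as arithmetic. This is a rough illustration under stated assumptions, not PyAI's billing implementation: the function names are hypothetical, and rounding *up* to the pulse is an assumption, since the description only says the total is "rounded".

```python
from decimal import Decimal, ROUND_CEILING

OMNI_RATE_PER_MIN = Decimal("0.05")    # list price quoted in the description
AMD_FREE_CALLS = 5000                  # free answered calls per month
AMD_RATE_PER_CALL = Decimal("0.004")   # price per answered call after that


def pulsed_cost(session_seconds, rate_per_min, pulse_seconds=1):
    """Sum every session first, then round the total up to the pulse once.

    Rounding up (ceiling) is an assumption made for this sketch.
    """
    total = sum((Decimal(str(s)) for s in session_seconds), Decimal(0))
    pulses = (total / pulse_seconds).to_integral_value(rounding=ROUND_CEILING)
    billed_seconds = pulses * pulse_seconds
    return billed_seconds * rate_per_min / 60


def per_call_minute_cost(session_seconds, rate_per_min):
    """Contrast case: round each call up to a whole minute before summing."""
    return sum(
        (pulsed_cost([s], rate_per_min, pulse_seconds=60) for s in session_seconds),
        Decimal(0),
    )


def amd_cost(answered_calls):
    """Only answered calls beyond the free monthly tier are billable."""
    billable = max(0, answered_calls - AMD_FREE_CALLS)
    return billable * AMD_RATE_PER_CALL


sessions = [10.2, 20.5, 5.1]  # three short Omni sessions, in seconds
print(pulsed_cost(sessions, OMNI_RATE_PER_MIN))           # 35.8 s -> 36 s -> 0.03
print(per_call_minute_cost(sessions, OMNI_RATE_PER_MIN))  # 3 x 1 min -> 0.15
print(amd_cost(12000))                                    # 7000 billable -> 28.000
```

With these inputs, summing before rounding bills 36 seconds, whereas rounding each call to a minute would bill 3 minutes, which is the difference the description calls out.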