Credentials
API keys are the only secret you hold. Treat them as opaque bearer tokens and keep them server-side wherever possible.
- Keys are never stored in the clear. Only a salted HMAC-SHA256 digest of each key is persisted; the raw secret is shown once at creation and cannot be recovered afterward. Authentication looks up the digest and compares in constant time, so a database read never exposes a usable key.
- The signing pepper lives in a managed secret store, separate from the database, and is never logged.
- Logs redact secrets. Request logging strips Authorization headers, cookies, and any field named secret or password before anything is written.
- The gateway never forwards your key upstream. At the edge, your Authorization header is removed and replaced with an internal token before the request reaches an inference engine, so your key never leaves the trust boundary.
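
The digest-and-compare flow for stored keys can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not PyAI's implementation: the function names, the per-key salt layout, and the key length are hypothetical, and in practice the pepper would be loaded from a managed secret store rather than generated in process.

```python
import hashlib
import hmac
import secrets


def new_key(prefix: str = "pyai_test_") -> str:
    """Generate a raw key; it is shown to the user once and never stored."""
    return prefix + secrets.token_urlsafe(32)


def digest_key(raw_key: str, salt: bytes, pepper: bytes) -> bytes:
    """HMAC-SHA256 over salt + key, keyed with the pepper (assumed layout)."""
    return hmac.new(pepper, salt + raw_key.encode("utf-8"), hashlib.sha256).digest()


def verify_key(presented: str, salt: bytes, stored_digest: bytes, pepper: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    candidate = digest_key(presented, salt, pepper)
    return hmac.compare_digest(candidate, stored_digest)


if __name__ == "__main__":
    pepper = secrets.token_bytes(32)  # assumption: fetched from a secret store in practice
    salt = secrets.token_bytes(16)
    raw = new_key()
    stored = digest_key(raw, salt, pepper)  # only salt + digest would be persisted
    print(verify_key(raw, salt, stored, pepper))
    print(verify_key(raw + "x", salt, stored, pepper))
```

Because only the salt and digest are persisted and the pepper lives elsewhere, a copy of the database alone is not enough to recompute or verify a key.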
Rotation & revocation
Every key in the console has rotate and revoke controls. Revocation propagates fleet-wide within ~60 seconds: a cache-invalidation signal is sent immediately and evicts the key from every gateway, rather than waiting for cached entries to expire. After revocation, calls with the old key return 401 unauthorized. See Authentication for the full flow.
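
To illustrate the difference between eviction on a signal and waiting for expiry, here is a small Python sketch of a per-gateway key cache. Everything in it is hypothetical: the KeyCache class, its method names, and the 300-second TTL are assumptions made for the example and do not describe the actual gateway.

```python
import time
from typing import Dict, Optional, Tuple


class KeyCache:
    """Hypothetical per-gateway cache of validated key digests."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds  # assumed value, for illustration only
        self._entries: Dict[str, Tuple[str, float]] = {}  # digest -> (org_id, expires_at)

    def put(self, digest: str, org_id: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._entries[digest] = (org_id, now + self.ttl_seconds)

    def get(self, digest: str, now: Optional[float] = None) -> Optional[str]:
        """Return the cached org id, or None if the entry is missing or expired."""
        now = time.monotonic() if now is None else now
        entry = self._entries.get(digest)
        if entry is None:
            return None
        org_id, expires_at = entry
        if now >= expires_at:
            del self._entries[digest]
            return None
        return org_id

    def on_revoked(self, digest: str) -> None:
        """Invalidation signal handler: evict immediately, regardless of the TTL."""
        self._entries.pop(digest, None)
```

With a handler like on_revoked, a revoked key stops resolving as soon as the signal arrives, even though its cache entry would otherwise have stayed valid until the TTL ran out.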
Encryption in transit
All customer-facing traffic is encrypted:
- REST and WebSocket terminate at an HTTPS/WSS load balancer in front of the edge gateway; api.pyai.com is HTTPS-only.
- Internal hops between the gateway and inference services run on a private Google network using Private Service Connect; that traffic never traverses the public internet.
- Database connections require encryption (ENCRYPTED_ONLY).
Tenancy isolation
PyAI is multi-tenant with a strict organization → project → key hierarchy. Every key belongs to a project, and every project belongs to exactly one org.
- Management routes scope by org membership. Console and management endpoints enforce org / project / key ownership on every resource id; a request for a resource your org doesn’t own returns a uniform 404 (it never reveals whether the id exists). This is covered by cross-tenant denial tests in CI.
- Org identity is unspoofable on the data plane. The gateway strips any client-supplied identity headers (such as x-pyai-org-id) and overwrites them from the authenticated key’s projection. Inference routes refuse a request that arrives without a gateway-stamped org. A client cannot impersonate another tenant by setting a header.
- Stored records are org-scoped. Transcription jobs and call records are keyed by org id; a lookup with a mismatched org returns nothing.
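
The strip-and-overwrite rule and the org-scoped lookup can be sketched in Python as below. This is a simplified illustration, not the gateway's code: the function names and the in-memory record store are hypothetical, and the real set of identity headers may be larger than the single x-pyai-org-id header named in the text.

```python
from typing import Dict, Mapping, Optional, Tuple

ORG_HEADER = "x-pyai-org-id"     # header name taken from the text above
IDENTITY_HEADERS = {ORG_HEADER}  # assumption: the real list may include more headers


def stamp_identity(client_headers: Mapping[str, str], authenticated_org_id: str) -> Dict[str, str]:
    """Drop client-supplied identity headers, then set the org from the authenticated key."""
    cleaned = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in IDENTITY_HEADERS
    }
    cleaned[ORG_HEADER] = authenticated_org_id
    return cleaned


def lookup_record(
    records: Mapping[Tuple[str, str], dict], org_id: str, record_id: str
) -> Optional[dict]:
    """Org-scoped lookup: a mismatched org yields None, exactly like a missing id."""
    return records.get((org_id, record_id))
```

Keying records by the (org id, record id) pair means a cross-tenant lookup and a lookup for a nonexistent id are indistinguishable to the caller, which matches the uniform 404 behavior described for management routes.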
Data retention
How long data lives depends on the surface. PyAI’s bias is to keep as little as possible.

| Surface | What is stored | Retention |
|---|---|---|
| Async transcription jobs (uploaded audio) | Input audio + result files in private, tenant-scoped storage | Inputs deleted after 7 days, results after 30 days |
| Async transcription jobs (your own URL) | Nothing — the input is fetched and not persisted | n/a |
| Streaming Hear / Cue | Audio is relayed for inference and metering | Not written to durable storage |
| Omni realtime | No per-session agent config is stored; the platform keeps zero per-agent state | Live audio is relayed, not persisted |
| Omni call records (transcript + metadata) | Post-call transcript + call metadata (GET /v1/omni/calls) | Retained 90 days, then deleted |
| Omni recordings | Stereo call audio, only when you set recordings_enabled | Off by default (no recording URL stored). When enabled, retained 30 days, then deleted |
| Recap (call intelligence) | Post-call summaries/extractions for the Recap add-on (GET /v1/recap/calls) | Retained 90 days, then deleted |
| Voice clones | Your enrollment audio + the trained voice | Persisted so the voice is reusable; delete the clone to remove it |
If your workload is privacy-sensitive, prefer streaming transcription or
your-own-URL batch jobs: in both cases PyAI does not retain the input audio.
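
If you mirror these retention windows in your own system, for example to fetch results before they are deleted, a lookup like the following Python sketch may help. The dictionary keys and the deletion_due function are illustrative names, and the sketch assumes the window is counted from the artifact's creation time, which the table does not state.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Windows copied from the table above; the keys are illustrative labels.
RETENTION = {
    "async_upload_input": timedelta(days=7),
    "async_upload_result": timedelta(days=30),
    "omni_call_record": timedelta(days=90),
    "omni_recording": timedelta(days=30),
    "recap": timedelta(days=90),
}


def deletion_due(kind: str, created_at: datetime) -> Optional[datetime]:
    """Return when an artifact becomes due for deletion, or None if no window is listed.

    Assumption: the window starts at creation time.
    """
    window = RETENTION.get(kind)
    return None if window is None else created_at + window


if __name__ == "__main__":
    uploaded = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(deletion_due("async_upload_input", uploaded))
```

Surfaces without a fixed window, such as voice clones that persist until you delete them, return None here.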
Voice cloning & consent
Voice is biometric data. PyAI’s cloning enrollment validates audio quality, but establishing consent is your responsibility: only clone voices you have explicit permission to use. For telephony, agents support an optional consent_line and recording is off by default; you decide what disclosure
your application plays and whether calls are recorded. See the
voice cloning guide.
Abuse & availability controls
Defense-in-depth limits protect both your account and the platform. When a limit is hit you get a specific, branchable error code (see Errors & limits):
- Rate limits per key → 429 rate_limit_exceeded (honor Retry-After).
- Concurrency caps on realtime sessions → 429 concurrency_limit_exceeded (WebSocket close code 4429).
- Daily caps on sandbox and publishable tokens → 429 daily_cap_exceeded (resets 00:00 UTC).
- Spend gates → 402 credit_exhausted, 402 key_budget_exceeded, or 402 insufficient_quota, so a runaway client can’t overspend silently.
- Publishable tokens add a per-IP fair-use limit and are origin-locked, and ephemeral realtime tokens are short-lived (default 60s, single-use concurrency).
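
Because the codes are branchable, a client can decide what to do from the status and code alone. The Python sketch below shows one possible mapping. The plan_action function, the action names, and the 1-second fallback delay are assumptions for illustration; it also assumes you have already extracted the error code from the response body and that Retry-After carries a number of seconds.

```python
from typing import Mapping, Optional, Tuple

RETRYABLE = {"rate_limit_exceeded", "concurrency_limit_exceeded"}
WAIT_FOR_RESET = {"daily_cap_exceeded"}
SPEND_GATES = {"credit_exhausted", "key_budget_exceeded", "insufficient_quota"}

DEFAULT_DELAY = 1.0  # assumed fallback when Retry-After is absent or not numeric


def plan_action(status: int, code: str, headers: Mapping[str, str]) -> Tuple[str, Optional[float]]:
    """Map an error response to (action, delay_seconds)."""
    if status == 429 and code in RETRYABLE:
        raw = headers.get("Retry-After")
        try:
            delay = float(raw) if raw is not None else DEFAULT_DELAY
        except ValueError:
            delay = DEFAULT_DELAY  # e.g. an HTTP-date value, which this sketch does not parse
        return ("retry", max(delay, 0.0))
    if status == 429 and code in WAIT_FOR_RESET:
        return ("wait_for_daily_reset", None)  # caps reset at 00:00 UTC
    if status == 402 and code in SPEND_GATES:
        return ("stop_and_alert", None)  # retrying will not help until billing changes
    return ("raise", None)
```

The main design point is that 402 responses are never retried automatically: they signal a spend condition that only a human or a billing change can resolve.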
Sandbox vs production
pyai_test_ (sandbox) and pyai_live_ (production) keys share the same API and
models; the difference is policy, not a separate system. Sandbox keys skip the
credit gate (so a fresh integration never 402s) and carry hard daily and
concurrency caps, with no billing. Use sandbox keys for prototypes, evals,
and CI; switch to a live key for real traffic. See
Authentication for the prefix semantics.
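
Since the environment is encoded in the key prefix, a CI job can guard against accidentally running with a live key. The following Python sketch shows one way to do that; the function names are hypothetical and only the two prefixes come from the text above.

```python
def key_environment(api_key: str) -> str:
    """Classify a key by its documented prefix."""
    if api_key.startswith("pyai_test_"):
        return "sandbox"
    if api_key.startswith("pyai_live_"):
        return "production"
    raise ValueError("unrecognized key prefix")


def require_sandbox(api_key: str) -> str:
    """Guard for CI and eval jobs: refuse to proceed with a non-sandbox key."""
    if key_environment(api_key) != "sandbox":
        raise RuntimeError("refusing to run tests with a non-sandbox key")
    return api_key
```

Calling require_sandbox at the start of a test run turns a misconfigured secret into an immediate failure instead of billable production traffic.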
Reporting a vulnerability
Found a security issue? Email security@pyai.com with steps to reproduce. Please don’t open a public issue or test against other tenants’ data.

Pursuing formal attestations? Reach out via the console to discuss your compliance requirements; we’ll tell you exactly where each program stands rather than overstate it here.