If your telephony already runs on FreeSWITCH, you can drop an Omni agent onto any channel without leaving your dialplan. FreeSWITCH forks the call’s audio to a small WebSocket bridge, the bridge relays it to Omni and plays the agent’s reply back into the channel, and you drive call control (barge-in, transfer, DTMF) over the Event Socket.

How it fits together

mod_audio_stream (or mod_audio_fork) forks channel audio as L16 (signed linear 16-bit PCM) at 16 kHz, and plays audio you send back into the same channel. As long as the byte order matches, L16 at 16 kHz is byte-for-byte the same as Omni’s PCM16 at 16 kHz, so run Omni with rate=16000 and the audio path is a straight passthrough with no resampling. Call control rides a separate ESL (Event Socket) connection.
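To make the format concrete: L16 mono at 16 kHz is 2 bytes per sample, or 32,000 bytes per second. The sketch below computes the size of a single audio frame. The 20 ms frame duration is an assumption for illustration only (the fork module decides the real packetization), and bytesPerFrame is a hypothetical helper, not part of the bridge.

```javascript
// Hypothetical helper: size in bytes of one PCM16 frame.
// Assumes 16-bit (2-byte) samples; the frame duration is illustrative.
function bytesPerFrame(sampleRateHz, frameMs, channels = 1) {
  const BYTES_PER_SAMPLE = 2; // signed 16-bit linear PCM
  const samples = (sampleRateHz * frameMs) / 1000;
  return samples * channels * BYTES_PER_SAMPLE;
}

console.log(bytesPerFrame(16000, 20)); // 640
console.log(bytesPerFrame(8000, 20));  // 320
```

If the frames your bridge receives are a different size, that alone is not a problem; what matters is that both sides agree on the sample rate and sample width.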
Treat this guide as accurate on transport, the fork codec and rate, and the Omni event behaviors. The exact JSON payloads of Omni events (notably the dtmf frame you forward) are defined by the Omni wire protocol, so the code that builds them lives in one helper and updating it is a one-line change.

Prerequisites

1. FreeSWITCH with a fork module

mod_audio_stream (bidirectional; recommended) or mod_audio_fork, listed in modules.conf.xml and loaded. Confirm with fs_cli -x "module_exists mod_audio_stream", which should print true.
2. Event Socket access

The inbound Event Socket enabled (default 127.0.0.1:8021, password ClueCon). Restrict it to localhost or your bridge host, and change the default password before exposing it beyond localhost.
3. An Omni agent + key

An agent_id and a pyai_test_ key, supplied to the bridge as environment variables.

Project layout

freeswitch-omni-bridge/
├── server.js               # /fork WebSocket bridge + ESL control
├── package.json
├── .env                    # PYAI_API_KEY, PYAI_AGENT, FS_HOST, FS_PASSWORD, ...
└── dialplan/omni-agent.xml # the extension that starts the fork
package.json
{
  "name": "freeswitch-omni-bridge",
  "type": "module",
  "scripts": { "start": "node server.js" },
  "dependencies": {
    "@fastify/websocket": "^11.0.0",
    "dotenv": "^16.4.0",
    "fastify": "^5.0.0",
    "modesl": "^1.2.1",
    "ws": "^8.18.0"
  }
}
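Since the bridge reads all of its settings from the environment, a startup check can catch a missing variable before the first call does. This is a minimal sketch, assuming the variable names used in this guide (including HUMAN_EXTENSION, which the transfer handler reads); missingEnv is a hypothetical helper, not something the bridge requires.

```javascript
// Hypothetical startup check: list required variables that are unset or blank.
const REQUIRED = [
  "PYAI_API_KEY",
  "PYAI_AGENT",
  "FS_HOST",
  "FS_PASSWORD",
  "HUMAN_EXTENSION",
];

function missingEnv(env, required = REQUIRED) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

const missing = missingEnv(process.env);
if (missing.length > 0) {
  console.error(`Missing environment variables: ${missing.join(", ")}`);
  // In server.js you would probably stop here, e.g. process.exit(1).
}
```

Run it after dotenv has loaded, so values from .env are already present in process.env.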

Build it

1. Dialplan: fork the channel to your bridge

Answer the call, then start mod_audio_stream toward your bridge at 16 kHz mono, and park the channel so it stays up while audio streams. Pass the channel UUID on the URL so the bridge can issue ESL commands for it.
dialplan/omni-agent.xml
<include>
  <extension name="omni-agent">
    <condition field="destination_number" expression="^9000$">
      <action application="answer"/>
      <action application="audio_stream"
              data="start wss://${bridge_host}/fork?uuid=${uuid} mono 16k"/>
      <action application="park"/>
    </condition>
  </extension>
</include>
With mod_audio_fork the app is uuid_audio_fork/audio_fork; the arguments (mix type mono, rate 16k) are the same. Keep the rate at 16k so it lines up with Omni’s rate=16000 and no resampling is needed.
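Because the UUID passed on the URL later ends up inside ESL commands, it is worth validating on arrival. The sketch below is one way to do that; channelUuidFrom is a hypothetical helper, and it assumes channels carry FreeSWITCH's default UUID format (if you set custom channel UUIDs, loosen the pattern accordingly).

```javascript
// Hypothetical helper: extract and validate the channel UUID from the
// request URL that FreeSWITCH connects with (path + query string).
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function channelUuidFrom(requestUrl) {
  // requestUrl has no scheme or host, so URL needs a placeholder base.
  const { searchParams } = new URL(requestUrl, "http://placeholder.invalid");
  const uuid = searchParams.get("uuid");
  return uuid && UUID_RE.test(uuid) ? uuid : null;
}

console.log(channelUuidFrom("/fork?uuid=3f2b1c4e-8a7d-4e2f-9b1a-0c5d6e7f8a9b"));
console.log(channelUuidFrom("/fork")); // null
```

A bridge could close the socket immediately when this returns null instead of opening an Omni session for an unidentifiable call.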
2. Bridge: relay fork audio ↔ Omni

FreeSWITCH connects to /fork and streams binary L16 frames. Forward them to Omni untouched, and stream Omni’s binary frames straight back into the channel.
server.js (audio bridge)
import "dotenv/config";
import Fastify from "fastify";
import websocket from "@fastify/websocket";
import WebSocket from "ws";
import esl from "modesl";

const { PYAI_API_KEY, PYAI_AGENT, FS_HOST, FS_PASSWORD } = process.env;

const app = Fastify();
await app.register(websocket);

const calls = new Map(); // uuid → { omni, fsWS }

const omniURL =
  `wss://api.pyai.com/v1/omni?agent_id=${PYAI_AGENT}` +
  `&format=pcm16&rate=16000`;

app.get("/fork", { websocket: true }, (fsWS, req) => {
  const uuid = new URL(`http://x${req.url}`).searchParams.get("uuid");
  const omni = new WebSocket(omniURL, [`pyai-key.${PYAI_API_KEY}`]);
  calls.set(uuid, { omni, fsWS });

  // caller audio (L16 16 kHz) → Omni (PCM16 16 kHz): identical bytes
  fsWS.on("message", (data, isBinary) => {
    if (isBinary && omni.readyState === WebSocket.OPEN) omni.send(data);
  });

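  // agent audio → back into the channel; text frames are session events
  omni.on("message", (data, isBinary) => {
    if (isBinary) {
      if (fsWS.readyState === WebSocket.OPEN) fsWS.send(data);
      return;
    }
    try {
      handleOmniEvent(JSON.parse(data.toString()), uuid);
    } catch (err) {
      // A malformed frame or failed handler must not crash the bridge.
      console.error(`[${uuid}] failed to handle Omni event:`, err.message);
    }
  });

  // ws emits "error" when the Omni connection fails; with no listener
  // Node treats it as an uncaught exception and the process exits.
  omni.on("error", (err) => console.error(`[${uuid}] Omni socket:`, err.message));

  const cleanup = () => { calls.delete(uuid); omni.close(); };
  fsWS.on("close", cleanup);
  omni.on("close", () => fsWS.close());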
});
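L16 and PCM16 are both signed 16-bit, but endianness must match. If you hear static or white noise, the two sides most likely disagree on byte order (typically the fork is emitting big-endian): byte-swap each 16-bit sample (buf.swap16()) before forwarding, and again on the way back. Note that buf.swap16() swaps in place and throws if the buffer length is not a multiple of 2.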
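The byte swap can be sketched as a small helper that leaves the incoming frame untouched. swapEndianness is a hypothetical name, and dropping a trailing odd byte is an assumption made here to keep swap16() from throwing; a well-formed PCM16 frame should always have an even length.

```javascript
// Returns a byte-swapped copy of a PCM16 buffer, leaving the input untouched.
function swapEndianness(pcm) {
  // swap16() throws on odd lengths, so ignore a trailing partial sample.
  const evenLength = pcm.length - (pcm.length % 2);
  // Buffer.from(buffer) copies the bytes rather than sharing memory.
  const copy = Buffer.from(pcm.subarray(0, evenLength));
  return copy.swap16(); // swaps in place and returns the same buffer
}

const frame = Buffer.from([0x01, 0x02, 0x03, 0x04]);
console.log(swapEndianness(frame)); // <Buffer 02 01 04 03>
```

In the bridge you would apply it in both directions, for example omni.send(swapEndianness(data)) for caller audio and fsWS.send(swapEndianness(data)) for agent audio.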
3. ESL: barge-in, transfer, and DTMF

One Event Socket connection drives every call. Subscribe to DTMF events and forward digits to Omni; react to Omni events with uuid_break (barge-in) and uuid_transfer (escalate to a human).
server.js (control plane)
const control = new esl.Connection(FS_HOST, 8021, FS_PASSWORD, () => {
  control.subscribe(["DTMF"]);
});

control.on("esl::event::DTMF::*", (e) => {
  const uuid = e.getHeader("Unique-ID");
  const digit = e.getHeader("DTMF-Digit");
  const call = calls.get(uuid);
  if (call) forwardDtmf(call.omni, digit);
});

function forwardDtmf(omni, digit) {
  // Localized: exact dtmf frame per the Omni wire protocol reference.
  if (omni.readyState === WebSocket.OPEN) {
    omni.send(JSON.stringify({ type: "dtmf", digit }));
  }
}

function handleOmniEvent(evt, uuid) {
  switch (evt.type) {
    case "hello":
    case "session_started":
      break;

    case "flush":
      // Barge-in: stop the agent audio currently playing into the channel.
      control.api("uuid_break", uuid);
      break;

    case "transfer_to_human":
      // Hand the live channel off to a human extension.
      control.api("uuid_transfer", `${uuid} ${process.env.HUMAN_EXTENSION} XML default`);
      break;

    case "session_ending":
      calls.get(uuid)?.fsWS.close();
      break;
  }
}

app.listen({ port: 8080, host: "0.0.0.0" });
The flush, dtmf, transfer_to_human, and session_ending event names are stable; exact payload fields come from the Omni wire protocol. forwardDtmf is the only spot that constructs an Omni control frame.
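One thing the transfer handler above does not guard against is an unset HUMAN_EXTENSION, which would send the literal string "undefined" as the destination. A minimal sketch of a stricter argument builder follows; transferArgs and its character allowlist are assumptions for illustration, so adjust the pattern to whatever your extensions and contexts actually look like.

```javascript
// Hypothetical helper: build the uuid_transfer argument string, rejecting
// missing values and anything that could add extra tokens to the command.
function transferArgs(uuid, extension, context = "default") {
  const safe = /^[A-Za-z0-9_.+-]+$/; // illustrative allowlist, no whitespace
  for (const [name, value] of Object.entries({ uuid, extension, context })) {
    if (typeof value !== "string" || !safe.test(value)) {
      throw new Error(`Refusing uuid_transfer: invalid ${name}`);
    }
  }
  return `${uuid} ${extension} XML ${context}`;
}

console.log(transferArgs("3f2b1c4e-8a7d-4e2f-9b1a-0c5d6e7f8a9b", "1000"));
// 3f2b1c4e-8a7d-4e2f-9b1a-0c5d6e7f8a9b 1000 XML default
```

The returned string can be passed as the second argument to control.api("uuid_transfer", ...), with the thrown error logged instead of issuing a malformed command.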

Run it

npm install
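FS_HOST=127.0.0.1 FS_PASSWORD=ClueCon npm start
bridge_host is expanded by the FreeSWITCH dialplan, not by Node, so define it on the FreeSWITCH side (for example as a global variable in vars.xml) and point it at wherever your Node process is reachable. The dialplan URL uses wss://, while server.js listens for plain WebSocket traffic on port 8080, so either terminate TLS in front of the bridge or use ws://<bridge-host>:8080 while testing. Then reload the dialplan (fs_cli -x "reloadxml") and dial extension 9000. The agent should greet the caller within a second. Talk over it to confirm uuid_break cuts the agent off; press a DTMF key and watch it reach Omni; trigger a transfer to confirm the channel reaches the human extension.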

Codec & rate notes

  • No resampling in the bridge. Fork at 16k and run Omni at rate=16000: L16/16 kHz ↔ PCM16/16 kHz is a passthrough. If the caller’s leg is 8 kHz, FreeSWITCH resamples it to 16 kHz for the fork (8000 → 16000 is a 2× upsample done inside FreeSWITCH), so Omni always receives a 16 kHz stream.
  • Endianness, not rate, is the usual culprit for distorted audio — see the warning above.
  • park keeps the channel alive. Without it the call can tear down before the fork is established.
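The rate note above comes down to simple arithmetic: when audio produced at one rate is interpreted at another, speed and pitch scale by the ratio of the two. The sketch below is illustrative only; playbackSpeed is a hypothetical helper, not something the bridge needs at runtime.

```javascript
// Speed factor heard when audio produced at one sample rate is
// interpreted at another (1 = normal, 2 = twice as fast, 0.5 = half speed).
function playbackSpeed(producedRateHz, consumedRateHz) {
  return consumedRateHz / producedRateHz;
}

console.log(playbackSpeed(16000, 16000)); // 1
console.log(playbackSpeed(8000, 16000));  // 2
console.log(playbackSpeed(16000, 8000));  // 0.5
```

So a fork left at 8k feeding an Omni session opened with rate=16000 would make the caller sound twice as fast to the agent, which is a different symptom from the static caused by an endianness mismatch.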

Troubleshooting

Symptom | Likely cause | Fix
--- | --- | ---
Static / white noise both ways | Endian mismatch | buf.swap16() on the fork audio in both directions
Agent sounds slow or fast | Fork rate ≠ Omni rate | Fork at 16k and connect Omni at rate=16000
Channel hangs up before audio | Missing park | Add <action application="park"/> after starting the fork
Agent talks over the caller | flush not wired to uuid_break | Call uuid_break <uuid> on every flush event
DTMF never reaches Omni | Not subscribed to DTMF, or wrong UUID | control.subscribe(["DTMF"]); map by Unique-ID
Transfer fails | Bad extension/context | Verify the uuid_transfer <uuid> <ext> XML <context> args route in your dialplan
Omni WS closes immediately | Bad key/agent | Check the close code in Errors & limits
module_exists returns false | Fork module not loaded | Add mod_audio_stream to modules.conf.xml, then run fs_cli -x "load mod_audio_stream" or restart FreeSWITCH (reloadxml alone does not load modules)

Next steps

Omni wire protocol

Exact event payloads and close codes.

Browser voice agent

The same agent in the browser via WebRTC.

Phone agent with Twilio

Media Streams bridge at μ-law 8 kHz.

Errors & limits

Rate limits, concurrency, and retries.