What “voice control” means here

While a call is live, the agent can:
  1. Open a form (optionally with pre-filled values) when the conversation calls for it
  2. Read every field value as the user fills it in — debounced ~250 ms
  3. Guide the user verbally (“It looks like your email is incomplete”)
  4. Advance a stepper or skip a step based on what the user said
  5. Receive a confirmation when the form is submitted, so it can continue the conversation naturally

All of this happens over the LiveKit data channel, in the same room that carries your audio, using topic-keyed JSON messages. No extra HTTP polling is needed.

Lifecycle

Agent decides to open form

        │   topic: form.{form-id}     ←── Agent → Widget
        ▼   payload: { prefill values }
Widget opens panel

        │   topic: form.state         ←── Widget → Agent (every ~250ms)
        │   payload: { is_open, step_index, values, fields[] }

Agent reads state, speaks guidance

User clicks Submit

        │   POST submit_url           (HTTP — not data channel)

Submission delivered

        │   topic: voice.user_text    ←── Widget → Agent
        │   payload: { type: "{form-id}_submitted", form: {...} }

Agent continues conversation

Topics

| Topic | Direction | When | Payload type |
|---|---|---|---|
| form.{form-id} | Agent → Widget | Agent opens the form | Pre-fill values object |
| form.state | Widget → Agent | Every ~250 ms while open; once on close | form_state |
| form.state | Widget → Agent | Submit failure | form_submit_failed |
| voice.user_text (default) | Widget → Agent | After successful submit | {form_id}_submitted |
| voice.agent_handoff | Agent → Widget | Mid-call agent transfer | { agent_name } |

Agent opens the form

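The LLM calls a tool named after the form’s id. In this example the form id is book-demo and the tool name is book_demo, with the hyphen replaced by an underscore: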
{
  "name": "book_demo",
  "arguments": { "name": "Alice", "email": "alice@acme.com" }
}
The agent worker republishes the arguments on the data channel:

| Topic | form.book-demo (hyphens preserved here) |
| Payload | { "name": "Alice", "email": "alice@acme.com" } |

Any argument matching a field name is pre-filled. Empty arguments ({}) open the form blank.
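The republishing step can be sketched as a small pure function. This is an illustration, not the worker's actual code: toolNameFor, buildToolIndex and toolCallToFormMessage are hypothetical names, and the rule "tool name = form id with hyphens turned into underscores" is an assumption drawn from the book_demo / book-demo example above.

```javascript
// Assumption: a tool name is the form id with hyphens replaced by underscores.
function toolNameFor(formId) {
  return formId.replace(/-/g, "_");
}

// Build a lookup from tool name back to the original form id, so that ids
// which legitimately contain underscores are never rewritten by mistake.
function buildToolIndex(formIds) {
  const index = new Map();
  for (const id of formIds) index.set(toolNameFor(id), id);
  return index;
}

// Returns { topic, payload } for a known form tool, or null for any other tool.
// payload is a Uint8Array, ready to hand to a data-channel publish call.
function toolCallToFormMessage(toolIndex, call) {
  const formId = toolIndex.get(call.name);
  if (formId === undefined) return null;
  const json = JSON.stringify(call.arguments ?? {}); // {} opens the form blank
  return { topic: `form.${formId}`, payload: new TextEncoder().encode(json) };
}
```

Looking the form id up in an index, rather than converting underscores back to hyphens, keeps the mapping unambiguous when a form id mixes both characters.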

To trigger this from the system prompt

When the user expresses interest in scheduling a demo, open the 'book-demo' form.
Pre-fill any details you already know — name, email, company — from the conversation
or from session metadata. Then guide the user verbally through the remaining fields.

Live state from widget to agent

Topic: form.state. Published every ~250 ms while open:
{
  "type": "form_state",
  "form_id": "book-demo",
  "is_open": true,
  "step_index": 1,
  "total_steps": 3,
  "values": {
    "first_name": "Alice",
    "email": "alice@"
  },
  "fields": [
    { "name": "first_name", "label": "First name", "type": "text",  "required": true },
    { "name": "email",      "label": "Email",       "type": "email", "required": true }
  ]
}
| Field | Meaning |
|---|---|
| is_open | true while panel visible; false once closed |
| step_index / total_steps | Stepper position (0-based); 0 / 1 for a single-page form |
| values | Current field state, keyed by field name |
| fields | Schema of fields on the current step only: labels, types, options, required flags |
The LLM uses fields to phrase contextual prompts (“What’s your last name?” rather than “next field?”), and values to detect blanks or invalid entries.
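To make that concrete, here is a minimal sketch of how an agent-side helper might combine fields and values before handing a summary to the LLM. The function name fieldsNeedingAttention, the "missing" / "incomplete" reason strings, and the loose email pattern are all assumptions for illustration; the protocol itself only delivers the raw form_state payload.

```javascript
// Deliberately loose shape check, good enough for spoken guidance only.
const EMAIL_SHAPE = /^[^@\s]+@[^@\s]+\.[^@\s]+$/;

// Returns [{ label, reason }] for fields on the current step that are
// required but blank ("missing") or typed email but not yet email-shaped
// ("incomplete").
function fieldsNeedingAttention(state) {
  const values = state.values ?? {};
  const issues = [];
  for (const field of state.fields ?? []) {
    const raw = values[field.name];
    const text = raw == null ? "" : String(raw).trim();
    if (text === "") {
      if (field.required) issues.push({ label: field.label, reason: "missing" });
    } else if (field.type === "email" && !EMAIL_SHAPE.test(text)) {
      issues.push({ label: field.label, reason: "incomplete" });
    }
  }
  return issues;
}
```

With the sample payload above, the helper reports the Email field as incomplete, which is the kind of signal behind a prompt such as “It looks like your email is incomplete”.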

Prompt instructions that play well with form state

While a form is open you will receive form_state messages. Use them to:
 - Acknowledge the current step briefly when step_index changes
 - Ask for missing required fields by their label
 - Stop prompting once a field has a valid value
 - If is_open becomes false without a submit, ask whether to continue
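
If you would rather detect these transitions in code than leave them entirely to the prompt, a small tracker can compare consecutive form_state messages. This is a sketch under assumptions: createFormStateTracker and the event names step_changed and closed_without_submit are hypothetical, and the caller is expected to invoke markSubmitted() when the submission confirmation arrives.

```javascript
// Tracks consecutive form_state payloads and reports notable transitions.
function createFormStateTracker() {
  let previous = null;   // last form_state seen
  let submitted = false; // set by the caller when a confirmation arrives

  return {
    markSubmitted() {
      submitted = true;
    },
    update(state) {
      const wasOpen = previous !== null && previous.is_open === true;
      // A freshly opened form starts a new cycle.
      if (state.is_open && !wasOpen) submitted = false;

      const events = [];
      // Only compare steps while the panel is open; a closing message may
      // carry a reset step_index.
      if (wasOpen && state.is_open &&
          previous.form_id === state.form_id &&
          previous.step_index !== state.step_index) {
        events.push("step_changed");
      }
      if (wasOpen && !state.is_open && !submitted) {
        events.push("closed_without_submit");
      }
      previous = state;
      return events;
    },
  };
}
```

The returned event names could be appended to the LLM context as short system notes, so the model does not have to diff raw payloads itself.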

Submission confirmation

Topic: voice.user_text (default — overridable per form via confirmation_topic).
{
  "type": "book-demo_submitted",
  "form_id": "book-demo",
  "text": "I have confirmed the form submission.",
  "form": {
    "first_name": "Alice",
    "email": "alice@acme.com",
    "date": "2026-06-15"
  }
}
The LLM treats text as user speech and continues naturally, e.g. “Perfect, I’ve got your details. Someone will reach out to alice@acme.com before June 15th.”

Override the type if your prompt keys off a specific event:
{ "id": "book-demo", "confirmation_type": "demo_booking_confirmed" }

Submission failure

If submit_url returns a non-2xx status or the network request fails, the widget publishes:

| Topic | form.state |
{
  "type": "form_submit_failed",
  "form_id": "book-demo",
  "text": "The form submission failed. Please try again or continue via voice."
}
The agent can offer retry, fall back to collecting verbally, or escalate.
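
For a custom front-end, the success and failure paths can be sketched in one function. Everything named here is an assumption for illustration: submitAndNotify is hypothetical, the JSON request body is a guess at what submit_url expects, and publish stands in for whatever sends bytes on the data channel (for example a thin wrapper around room.localParticipant.publishData).

```javascript
// POSTs the form, then publishes either the confirmation or the failure
// message described above. Resolves to true on success, false on failure.
async function submitAndNotify({ submitUrl, formId, values, publish, fetchImpl = fetch }) {
  const encode = (obj) => new TextEncoder().encode(JSON.stringify(obj));

  let ok = false;
  try {
    const res = await fetchImpl(submitUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json" }, // assumed body format
      body: JSON.stringify(values),
    });
    ok = res.ok; // true only for 2xx responses
  } catch {
    ok = false;  // network failure is treated like a non-2xx response
  }

  if (ok) {
    await publish(encode({
      type: `${formId}_submitted`,
      form_id: formId,
      text: "I have confirmed the form submission.",
      form: values,
    }), { topic: "voice.user_text" });
  } else {
    await publish(encode({
      type: "form_submit_failed",
      form_id: formId,
      text: "The form submission failed. Please try again or continue via voice.",
    }), { topic: "form.state" });
  }
  return ok;
}
```

Passing publish and fetchImpl in as parameters keeps the function testable without a live room or network.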

Implementing the protocol in a custom UI

If you’re building your own front-end without the widget, subscribe to the same topics from the LiveKit room:
import { Room, RoomEvent } from "livekit-client";

const room = new Room();
await room.connect(livekitUrl, token);

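room.on(RoomEvent.DataReceived, (payload, _participant, _kind, topic) => {
  // Ignore packets on topics this UI does not handle; they may not be JSON.
  const opensForm = topic?.startsWith("form.") && topic !== "form.state";
  if (!opensForm && topic !== "voice.agent_handoff") return;

  let msg;
  try {
    msg = JSON.parse(new TextDecoder().decode(payload));
  } catch {
    return; // malformed payload: skip it rather than crash the handler
  }

  if (opensForm) {
    const formId = topic.slice("form.".length);
    openFormUI(formId, msg);          // msg = pre-fill values; openFormUI is your own UI code
    return;
  }

  showHandoffScreen(msg.agent_name);  // showHandoffScreen is your own UI code
});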

Publish state while the user fills the form

function publishFormState({ formId, stepIndex, totalSteps, values, fields }) {
  const payload = JSON.stringify({
    type: "form_state",
    form_id: formId,
    is_open: true,
    step_index: stepIndex,
    total_steps: totalSteps,
    values,
    fields,
  });
  room.localParticipant.publishData(
    new TextEncoder().encode(payload),
    { topic: "form.state" }
  );
}

// Call this on every input change, debounced to ~250ms
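
The debounce itself is not shown in the snippet, so here is one minimal way to write it. The debounce helper and its flush method are illustrative rather than part of any SDK; a utility library's debounce would work just as well.

```javascript
// Trailing-edge debounce: fn runs once, waitMs after the most recent call,
// with the arguments of that most recent call.
function debounce(fn, waitMs = 250) {
  let timer = null;
  let lastArgs = null;

  const debounced = (...args) => {
    lastArgs = args;
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn(...lastArgs);
    }, waitMs);
  };

  // Send any pending call immediately, e.g. right before the panel closes.
  debounced.flush = () => {
    if (timer === null) return;
    clearTimeout(timer);
    timer = null;
    fn(...lastArgs);
  };

  return debounced;
}

// Usage sketch (publishFormState is the function defined earlier):
//   const publishDebounced = debounce(publishFormState, 250);
//   input.addEventListener("input", () => publishDebounced(currentState()));
```

Calling flush() before sending the final is_open: false message avoids a stale state update arriving after the close.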

Publish confirmation after submit

async function publishSubmitConfirmation(formId, values) {
  const payload = JSON.stringify({
    type: `${formId}_submitted`,
    form_id: formId,
    text: "I have confirmed the form submission.",
    form: values,
  });
  await room.localParticipant.publishData(
    new TextEncoder().encode(payload),
    { topic: "voice.user_text" }
  );
}

On close without submit

Send one final form.state with is_open: false:
function publishFormClosed(formId, lastValues, lastFields) {
  const payload = JSON.stringify({
    type: "form_state",
    form_id: formId,
    is_open: false,
    step_index: 0,
    total_steps: 1,
    values: lastValues,
    fields: lastFields,
  });
  room.localParticipant.publishData(
    new TextEncoder().encode(payload),
    { topic: "form.state" }
  );
}

Customising data-channel behaviour

Override topics and event types per form:
| Field on form definition | Default | Purpose |
|---|---|---|
| topics | [] | Extra topics that should also open this form |
| confirmation_topic | "voice.user_text" | Where to publish submission confirmation |
| confirmation_type | "{form_id}_submitted" | type field on confirmation payload |
Example — route confirmations to a dedicated topic instead of treating them as user speech:
{
  "id": "book-demo",
  "confirmation_topic": "form.confirmed",
  "confirmation_type": "demo_booking_confirmed"
}
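
A custom front-end that honours these overrides needs to resolve the defaults somewhere. The sketch below shows one way to do it; resolveFormChannels and the shape of its return value are assumptions, and only the three field names and their defaults come from the table above.

```javascript
// Resolves the topics and event type for one form definition, applying the
// documented defaults when a field is absent.
function resolveFormChannels(formDef) {
  return {
    // The form's own topic always opens it; topics[] adds extra triggers.
    openTopics: [`form.${formDef.id}`, ...(formDef.topics ?? [])],
    confirmationTopic: formDef.confirmation_topic ?? "voice.user_text",
    confirmationType: formDef.confirmation_type ?? `${formDef.id}_submitted`,
  };
}
```

For the example definition above, this yields confirmationTopic "form.confirmed" and confirmationType "demo_booking_confirmed", while a definition containing only an id falls back to "voice.user_text" and "book-demo_submitted".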