The Voice API allows you to integrate Exei’s real-time AI Voice Agent directly into your own applications using APIs and WebSockets.
This channel is designed for developers who want full control over the voice experience outside of Exei’s default UI, for example when building custom voice assistants, call flows, kiosks, mobile apps, or internal tools.
What the Voice API Enables
Using the Voice API, you can:
Connect to Exei’s AI Voice Agent programmatically
Send real-time audio input (speech)
Receive real-time audio output (AI responses)
Handle speech-to-text (STT) and text-to-speech (TTS)
Manage interruptions and transcripts
Build custom voice experiences on top of Exei
All conversations created through the Voice API are tracked inside Exei.
Where to Find Voice API Configuration
To access the Voice API details:
Open your AI agent from My Agents
Go to Channels
Select Voice API
This page provides everything required to integrate the Voice API.
Voice API Credentials
The Voice API configuration page provides:
API Endpoint: The base endpoint used to connect to Exei’s Voice API.
Client ID: A unique identifier used to authenticate your Voice API requests.
Both values are required to establish a connection and should be kept secure.
How the Voice API Works
The Voice API uses WebSocket-based real-time communication.
A typical flow looks like this:
Establish a WebSocket connection using the API endpoint
Authenticate using the Client ID
Generate a session ID for the conversation
Send audio data (user speech) to Exei
Receive audio responses (AI speech) in real time
Handle transcripts and interruption events
This enables natural, low-latency voice conversations.
Step-by-Step Voice API Flow
1. Establish a WebSocket Connection
Create a WebSocket connection using the provided API endpoint.
This connection is used for sending and receiving real-time audio data.
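As a minimal sketch of this step, the Python snippet below validates the endpoint and opens the connection. It assumes the third-party websockets package; the endpoint URL is a hypothetical placeholder, and the helper names is_websocket_url and open_connection are illustrative, not part of Exei's API.

```python
from urllib.parse import urlparse

# Hypothetical placeholder: copy the real value from the Voice API page.
API_ENDPOINT = "wss://voice.example.invalid/ws"


def is_websocket_url(url: str) -> bool:
    """Return True if the URL has a ws:// or wss:// scheme and a host."""
    parts = urlparse(url)
    return parts.scheme in ("ws", "wss") and bool(parts.netloc)


async def open_connection(url: str):
    """Open a WebSocket connection that carries audio in both directions."""
    if not is_websocket_url(url):
        raise ValueError(f"not a WebSocket URL: {url!r}")
    # Imported here so the validation helper works without the package.
    import websockets  # third-party: pip install websockets

    return await websockets.connect(url)
```

Checking the URL before connecting turns a mistyped endpoint into an immediate, readable error rather than a connection failure.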
2. Generate a Session ID
Each voice conversation requires a unique session ID.
The session ID:
Identifies the conversation
Keeps audio streams and transcripts in sync
Is required for tracking the session in Exei
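One simple way to produce a unique session ID is a random UUID, sketched below. The guide does not state a required ID format, so UUID4 is an assumption, and new_session_id is a hypothetical helper name.

```python
import uuid


def new_session_id() -> str:
    """Return a fresh random identifier for one voice conversation."""
    # Assumption: any unique string is accepted as a session ID.
    return str(uuid.uuid4())
```

Calling the helper once per conversation avoids accidental reuse of an earlier ID.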
3. Initialize the Voice Session
Once connected, initialize the session by sending:
Session ID
Client ID
Any required configuration parameters
This tells Exei to start a new voice interaction.
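The initialization message might be assembled as in the sketch below. The JSON shape and the field names (type, session_id, client_id, config) are assumptions made for illustration; check the Voice API configuration page for the actual message format.

```python
import json
from typing import Optional


def build_init_message(
    session_id: str, client_id: str, config: Optional[dict] = None
) -> str:
    """Serialize a session-initialization message as JSON text."""
    if not session_id or not client_id:
        raise ValueError("session_id and client_id are both required")
    payload = {
        "type": "init",            # hypothetical message type
        "session_id": session_id,  # hypothetical field name
        "client_id": client_id,    # hypothetical field name
        "config": config or {},    # optional configuration parameters
    }
    return json.dumps(payload)
```

The resulting string would be sent as the first message on the open WebSocket connection.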
4. Send Audio for Speech-to-Text (STT)
Capture microphone audio from the user and send it to the Voice API.
The API:
Converts speech to text
Uses the text to generate an AI response
Supports real-time streaming audio
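Streaming usually means sending captured audio in small frames instead of one large upload. The sketch below splits raw PCM bytes into fixed-duration frames and sends each as a binary WebSocket message. The audio parameters (16 kHz, 16-bit mono, 20 ms frames) and the use of binary messages are assumptions, not documented Exei requirements.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz audio
BYTES_PER_SAMPLE = 2     # assumption: 16-bit mono PCM
FRAME_MS = 20            # assumption: 20 ms per frame


def frame_size_bytes(
    sample_rate: int = SAMPLE_RATE_HZ,
    bytes_per_sample: int = BYTES_PER_SAMPLE,
    frame_ms: int = FRAME_MS,
) -> int:
    """Return the number of bytes in one frame of audio."""
    return sample_rate * bytes_per_sample * frame_ms // 1000


def split_into_frames(pcm: bytes, frame_bytes: int) -> Iterator[bytes]:
    """Yield consecutive frames; the last one may be shorter."""
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


async def stream_audio(ws, pcm: bytes) -> None:
    """Send each frame over an open WebSocket as a binary message."""
    for frame in split_into_frames(pcm, frame_size_bytes()):
        await ws.send(frame)
```

Smaller frames lower latency at the cost of more messages; the frame length is a tuning choice rather than a fixed rule.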
5. Receive Audio Responses (TTS)
The Voice API streams back:
AI-generated audio responses
Partial or complete speech output
You can play this audio directly in your application.
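Because audio arrives in pieces, a small queue between the socket and the audio player keeps playback smooth. The sketch below assumes audio is delivered as binary WebSocket messages; PlaybackBuffer and receive_audio are hypothetical names, and actual playback is left to your application's audio library.

```python
from collections import deque
from typing import Optional


class PlaybackBuffer:
    """FIFO queue of audio chunks waiting to be played."""

    def __init__(self) -> None:
        self._chunks = deque()

    def push(self, chunk: bytes) -> None:
        """Queue a chunk, ignoring empty ones."""
        if chunk:
            self._chunks.append(chunk)

    def pop(self) -> Optional[bytes]:
        """Return the oldest chunk, or None when nothing is queued."""
        return self._chunks.popleft() if self._chunks else None

    def pending_bytes(self) -> int:
        """Return the total size of all queued chunks."""
        return sum(len(chunk) for chunk in self._chunks)


async def receive_audio(ws, buffer: PlaybackBuffer) -> None:
    """Queue every binary message from the socket as audio."""
    async for message in ws:
        if isinstance(message, bytes):  # assumption: audio is binary
            buffer.push(message)
```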
6. Handle Interrupts and Transcripts
The Voice API supports interruption handling.
If a user speaks while the AI is responding:
The API sends an interrupt event
Current audio playback can be stopped
The new input is processed immediately
Transcript events are also sent, allowing you to:
Display live text
Store conversation logs
Debug voice interactions
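Interrupt and transcript handling can be sketched as a small event dispatcher. The event format below (JSON text with a type field set to "interrupt" or "transcript", plus role and text fields) is an assumption for illustration; the real event names and payloads may differ.

```python
import json
from typing import Optional


class ConversationState:
    """Holds queued audio and the running transcript for one session."""

    def __init__(self) -> None:
        self.playback = []    # audio chunks waiting to be played
        self.transcript = []  # (role, text) pairs in arrival order


def handle_event(raw: str, state: ConversationState) -> Optional[str]:
    """Apply one JSON event to the state and return its type."""
    event = json.loads(raw)
    kind = event.get("type")  # hypothetical field name
    if kind == "interrupt":
        # The user spoke over the AI: drop audio that has not played yet.
        state.playback.clear()
    elif kind == "transcript":
        role = event.get("role", "unknown")  # hypothetical field name
        state.transcript.append((role, event.get("text", "")))
    return kind
```

Clearing the queue on an interrupt stops stale speech from playing after the user has already moved on.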
Voice Settings Used by the API
The Voice API respects the agent’s voice configuration.
Voice behavior such as language, accent, speech style, and voice model is controlled from Channels → Voice in Exei.
Any changes made there automatically apply to Voice API interactions.
Conversations & Tracking
All conversations created via the Voice API:
Appear in the Conversations section
Include full transcripts
Support feedback and Instant Retrain
Can be reviewed for analytics and debugging
Are included in Insights (based on plan availability)
Voice API conversations are treated the same as conversations from other voice channels.
Security Best Practices
Keep API endpoints and Client IDs private
Do not expose credentials in client-side code
Use secure server-side handling where possible
Rotate credentials if compromised
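One common way to keep credentials out of source code is to read them from environment variables on the server. In the sketch below the variable names EXEI_VOICE_API_ENDPOINT and EXEI_VOICE_CLIENT_ID are hypothetical; use whatever naming your deployment follows.

```python
import os
from typing import Mapping, Optional, Tuple

ENDPOINT_VAR = "EXEI_VOICE_API_ENDPOINT"  # hypothetical variable name
CLIENT_ID_VAR = "EXEI_VOICE_CLIENT_ID"    # hypothetical variable name


def load_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (endpoint, client_id) from the environment."""
    env = os.environ if env is None else env
    missing = [
        name for name in (ENDPOINT_VAR, CLIENT_ID_VAR) if not env.get(name)
    ]
    if missing:
        raise RuntimeError(
            "missing environment variables: " + ", ".join(missing)
        )
    return env[ENDPOINT_VAR], env[CLIENT_ID_VAR]
```

Failing at startup when a value is missing is easier to diagnose than a failed connection later.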
Best Practices for Voice API Integration
Generate a new session ID for each conversation
Handle interruptions gracefully
Monitor transcripts for accuracy
Test with different accents and languages
Log errors and fallback events
Common Mistakes to Avoid
Reusing session IDs across conversations
Not handling interrupt events
Sending unsupported audio formats
Hardcoding credentials in public clients
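A format check before sending can catch unsupported audio early. This guide does not list the supported formats, so the expected values below (mono, 16-bit, 16 kHz WAV) are placeholders to replace with the formats Exei actually accepts; wav_format_problems is a hypothetical helper.

```python
import io
import wave

# Assumed target format; replace with the values Exei documents.
EXPECTED_FORMAT = {"channels": 1, "sample_width": 2, "frame_rate": 16_000}


def wav_format_problems(wav_bytes: bytes, expected: dict = EXPECTED_FORMAT) -> list:
    """Return a list of mismatches between a WAV file and the expected format."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        actual = {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),
            "frame_rate": wav.getframerate(),
        }
    return [
        f"{key}: expected {expected[key]}, got {actual[key]}"
        for key in expected
        if actual[key] != expected[key]
    ]
```

An empty list means the file matches the expected format; otherwise each entry names one mismatch.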
When to Use the Voice API
Use the Voice API when:
You are building a custom voice application
You need full control over the UI and flow
You are integrating Exei into external systems
Default website or VoIP voice channels are not sufficient
What’s Next?
Once the Voice API is integrated, you can enhance and optimize the experience.
Recommended next guides: