🎙️ Voice Agents: Agents with custom personalities, knowledge bases, and tools. Deploy to phone calls, WebSocket, or WhatsApp.
🎬 Video Generation & Avatars: Text-to-video, talking-head avatars, lipsync, and dubbing, all async. Frontier and open-source models, same API.
🔊 Pluggable TTS & STT: Swap providers per request. BYOK skips metering. Streaming, batch, diarization, word-level timestamps.
🧬 Voice Cloning: Clone any voice from a short sample. Use the same voice across TTS, video avatars, and dubbing.
🧠 Frontier LLMs: Frontier and open-weight models, swappable per turn. Rotate for resilience or pin one for consistency.
📚 Knowledge (RAG): Ingest documents, transcripts, and audio into vector collections. Agents auto-retrieve per turn.
⚡ Real-Time Sessions: Sub-second voice loop. Twilio, WebSocket streaming, barge-in, interruption handling.