ph0ny

Voice + Video AI API

Build voice agents, clone voices, generate video. One API, every major provider.

Voice Agents

Agents with custom personalities, knowledge bases, and tools. Deploy to phone calls, WebSocket, or WhatsApp.

Text-to-video, talking-head avatars, lip sync, and dubbing, all asynchronous. Frontier and open-source models behind the same API.

Swap providers per request. Bring your own key (BYOK) to skip metering. Streaming, batch, diarization, and word-level timestamps.

Clone any voice from a short sample. Use the same voice across TTS, video avatars, and dubbing.

Frontier and open-weight models, swappable per turn. Rotate for resilience or pin one for consistency.
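To make the rotate-or-pin choice concrete, here is a minimal sketch in plain Python. It does not use the ph0ny API: the `complete` function, the `PROVIDERS` table, and the `flaky` and `stable` providers are all hypothetical stand-ins, and a real integration would likely select the provider through a request parameter instead.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical provider type: takes a prompt, returns a reply, raises on failure.
Provider = Callable[[str], str]


def flaky(prompt: str) -> str:
    """Stand-in for a provider that is currently down."""
    raise ConnectionError("provider unavailable")


def stable(prompt: str) -> str:
    """Stand-in for a healthy provider."""
    return f"stable says: {prompt}"


# Illustrative registry; the names are assumptions, not real model IDs.
PROVIDERS: Dict[str, Provider] = {"flaky": flaky, "stable": stable}


def complete(
    prompt: str,
    providers: Dict[str, Provider],
    order: Sequence[str],
    pinned: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (provider_name, reply) for one turn.

    With `pinned` set, only that provider is tried (consistency).
    Otherwise providers are tried in `order` until one succeeds (resilience).
    """
    candidates = [pinned] if pinned is not None else list(order)
    errors = []
    for name in candidates:
        try:
            return name, providers[name](prompt)
        except Exception as exc:  # fall through to the next candidate
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Calling `complete("hi", PROVIDERS, order=["flaky", "stable"])` falls through to the second provider, while passing `pinned="flaky"` raises instead of silently switching models. That is the trade-off between the two modes.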

Ingest documents, transcripts, and audio into vector collections. Agents auto-retrieve per turn.
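Per-turn retrieval can be pictured with a toy example. The sketch below is an assumption-laden illustration, not the service's implementation: it replaces a real embedding model with bag-of-words counts, and `embed`, `retrieve`, and `build_turn_prompt` are hypothetical names. It only shows the shape of the loop: score the collection against the user's message on each turn and prepend the best match to the prompt.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system would use a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, collection: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(collection, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_turn_prompt(user_message: str, collection: List[str], k: int = 1) -> str:
    """Assemble the prompt for one turn: retrieved context, then the user message."""
    context = "\n".join(retrieve(user_message, collection, k))
    return f"Context:\n{context}\n\nUser: {user_message}"


# Illustrative collection; the contents are made up for this example.
COLLECTION = [
    "Refunds are issued within 14 days.",
    "Our office is open Monday to Friday.",
]
```

With this collection, `build_turn_prompt("when are refunds issued", COLLECTION)` places the refunds sentence in the context, because it shares the most words with the question.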

Sub-second voice loop. Twilio, WebSocket streaming, barge-in, interruption handling.

Free credits to start. Pay-as-you-go after that. BYOK skips our metering entirely.