Video
ph0ny treats video the same way it treats voice: one API, many providers. You don't bind to Sora or Runway or HeyGen up front — you call our endpoint and pin a provider per request, or let us route by quality / latency / price.
What you can do
| Capability | Use case | Endpoint |
|---|---|---|
| Generate | Marketing b-roll, demo clips, hero animations | /v1/video/generate |
| Avatar | Talking-head from a single photo + script | /v1/video/avatar |
| Lipsync | Re-voice existing footage cleanly | /v1/video/lipsync |
| Dub | End-to-end translation with lipsync | /v1/video/dub |
| Analyze | Search, embed, Q&A over recorded video | /v1/video/analyze |
Quickstart — generate
import { ph0ny } from '@ph0ny/sdk'

// Submit the generation job; the call returns right away with a job handle.
const job = await ph0ny.video.generate({
  prompt: 'Aerial shot of a mountain pass at golden hour, slow drift',
  duration_seconds: 6,
  width: 1280,
  height: 720,
  provider: 'runway',
})

// Wait for the job to finish, then read the rendered clip's URL.
const out = await ph0ny.video.waitForJob(job.id)
console.log(out.result.video_url)

Quickstart — talking-head avatar
A single photo + script becomes a video your agent can use as outbound media — sales decks, voicemails, onboarding clips.
const job = await ph0ny.video.avatar({
  image_url: 'https://uploads.example.com/founder.jpg',
  text: 'Hey, thanks for booking a demo. Quick intro before we meet.',
  voice_id: 'rachel',
  tts_provider: 'elevenlabs',
  provider: 'hallo3',
})

If you already have audio rendered, pass audio_url instead of text + voice_id. ph0ny handles either path.
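One way to keep the two input paths from mixing is to model them as a union type on your side. The sketch below is illustrative only: the field names come from the quickstart above, but the type names and the buildAvatarRequest helper are hypothetical and not part of the SDK, and treating tts_provider as optional is an assumption.

```typescript
// Hypothetical types: the field names mirror the quickstart above, but the
// SDK's real type definitions may differ.
type AvatarBase = { image_url: string; provider?: string };
type AvatarFromText = AvatarBase & { text: string; voice_id: string; tts_provider?: string };
type AvatarFromAudio = AvatarBase & { audio_url: string };
type AvatarRequest = AvatarFromText | AvatarFromAudio;

// Illustrative helper (not an SDK function): prefer pre-rendered audio when
// it exists, otherwise fall back to text + voice_id.
function buildAvatarRequest(
  imageUrl: string,
  source: { audioUrl?: string; text?: string; voiceId?: string },
  provider?: string,
): AvatarRequest {
  const base: AvatarBase = provider
    ? { image_url: imageUrl, provider }
    : { image_url: imageUrl };
  if (source.audioUrl) {
    return { ...base, audio_url: source.audioUrl };
  }
  if (source.text && source.voiceId) {
    return { ...base, text: source.text, voice_id: source.voiceId };
  }
  throw new Error('Provide either audioUrl, or both text and voiceId');
}
```

The returned object could then be passed to ph0ny.video.avatar(...) in place of a hand-written literal, so a request never carries both a script and an audio file.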
Picking a provider
If you omit provider, ph0ny routes by use case:
| Use case | Default | Why |
|---|---|---|
| Cinematic generation | runway | Strongest motion + composition |
| Sora-class realism | sora | Highest fidelity short clips (BYOK) |
| Photorealistic avatar | heygen | Best commercial avatar (BYOK) |
| Open-source avatar | hallo3 | Best of the self-hosted set |
| Lipsync dubbing | sync-labs | Tightest mouth alignment |
| Open-source lipsync | latentsync | High fidelity, self-hosted |
| Video Q&A / search | twelve-labs | Marengo + Pegasus |
Always pass provider for production — provider defaults can shift as we add new engines. The full matrix lives on Providers → Video.
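A small lookup table in your own code is one way to pin providers in a single place. This is a minimal sketch: the provider ids are the ones from the table above, while the use-case keys and the pinProvider helper are hypothetical names invented for illustration.

```typescript
// Use-case keys are hypothetical labels for this sketch; the provider ids
// are the ones listed in the table above.
const PINNED_PROVIDERS = {
  cinematic: 'runway',
  realism: 'sora',
  avatar: 'heygen',
  avatarOpenSource: 'hallo3',
  lipsync: 'sync-labs',
  lipsyncOpenSource: 'latentsync',
  analyze: 'twelve-labs',
} as const;

type UseCase = keyof typeof PINNED_PROVIDERS;

// Illustrative helper (not part of the SDK): return the caller's explicit
// choice when given, otherwise the provider pinned for this use case.
function pinProvider(useCase: UseCase, explicit?: string): string {
  return explicit ?? PINNED_PROVIDERS[useCase];
}
```

A request would then set provider: pinProvider('cinematic'), so a change to ph0ny's defaults does not silently change which engine your production traffic uses.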
Async by default
Every video op returns a VideoJob immediately. Wait via:
- Webhook: pass webhook_url, we POST a signed payload on completion.
- Polling: GET /v1/video/:jobId every 1–5s.
- SDK helper: ph0ny.video.waitForJob(id) does long-poll for you.
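If you poll by hand rather than using the SDK helper, the loop might look like the sketch below. Several things here are assumptions: the status field and its values are guesses (only id and result.video_url appear in the examples above), the timeout default is arbitrary, and fetchJob is a hypothetical function you would supply to wrap GET /v1/video/:jobId with your own HTTP client and credentials.

```typescript
// Assumed job shape: only `id` and `result.video_url` appear in the examples
// above; the `status` field and its values are guesses for this sketch.
type VideoJob = {
  id: string;
  status: 'queued' | 'running' | 'succeeded' | 'failed';
  result?: { video_url: string };
};

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the job reaches a terminal state or the deadline passes.
// `fetchJob` is injected so the loop stays independent of any HTTP client.
async function pollVideoJob(
  jobId: string,
  fetchJob: (id: string) => Promise<VideoJob>,
  opts: { intervalMs?: number; timeoutMs?: number } = {},
): Promise<VideoJob> {
  const intervalMs = opts.intervalMs ?? 2000; // inside the suggested 1 to 5 s range
  const timeoutMs = opts.timeoutMs ?? 10 * 60 * 1000; // arbitrary default
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const job = await fetchJob(jobId);
    if (job.status === 'succeeded') return job;
    if (job.status === 'failed') throw new Error(`Video job ${jobId} failed`);
    // Give up before sleeping if the next attempt would pass the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Timed out waiting for video job ${jobId}`);
    }
    await sleep(intervalMs);
  }
}
```

Injecting fetchJob also makes the loop easy to exercise with a fake in tests. For long renders, a webhook avoids holding a process open for the whole wait.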
Pricing
Video is metered in seconds of output. Self-hosted engines (hallo3, liveportrait, latentsync, etc.) run on ph0ny GPUs at our flat per-second rate. Commercial providers (runway, heygen, sora, sync-labs, twelve-labs) pass through the provider's price. BYOK skips ph0ny metering entirely — bring your own Runway / HeyGen key and only the provider charges you.
See Pricing for the per-second tiers.
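The three billing paths reduce to simple arithmetic on output seconds. The sketch below is a rough estimator only: the Billing labels and the estimatePh0nyCharge function are hypothetical, and both rates are inputs you would take from the Pricing page or the provider's price list, since no real figures are given here.

```typescript
// Hypothetical billing categories derived from the paragraph above.
type Billing = 'self_hosted' | 'commercial' | 'byok';

// Estimate what ph0ny would bill for one clip. Both rates are inputs because
// the real per-second tiers are not stated in this section.
function estimatePh0nyCharge(
  outputSeconds: number,
  billing: Billing,
  rates: { flatPerSecond: number; providerPerSecond: number },
): number {
  switch (billing) {
    case 'self_hosted':
      return outputSeconds * rates.flatPerSecond; // flat ph0ny GPU rate
    case 'commercial':
      return outputSeconds * rates.providerPerSecond; // provider price passed through
    case 'byok':
      return 0; // ph0ny metering is skipped; the provider bills you directly
  }
}
```

Note that a BYOK result of 0 covers only the ph0ny side of the bill. The provider still charges your own account for the same seconds.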
Where to next
- Video API reference: every field, every shape.
- Providers → Video: engine matrix with capabilities.
- Voice cloning: pair avatars with cloned voices.