YouTube is taking its auto-dubbing feature to the next level with AI-powered lip-syncing, and early tests suggest it may eventually come with a price tag.
The platform is reportedly piloting a system that uses artificial intelligence to sync creators’ lips with translated audio, making dubs look more natural and engaging.
In an interview with Digital Trends, YouTube Product Lead for Autodubbing Buddhika Kottahachchi explained that the system works by “[modifying] the pixels on the screen to match the translated speech,” noting that the tech requires a 3D understanding of the world, including lip shapes, teeth, posture, and the face.
New YouTube features could cost users more money
Right now, the system is optimized for 1080p resolution and not yet tuned for 4K. Kottahachchi confirmed that the team is working to scale the feature across more languages, eventually covering the same 20+ languages supported by YouTube’s auto-dubbing.
However, he also warned that it’s still early days. “We are not ready to make any broad statements about how broadly we will make it available,” he said. “We want to make it available to more creators and understand the compute constraints and the quality.”
Kottahachchi hinted that the feature could carry an extra cost. While no figures are finalized, YouTube is reportedly evaluating how much compute power the system demands and whether creators will be charged for access.
That’s why the feature is currently being tested with a small group of trusted creators, similar to how auto-dubbing began before its wider rollout.
The move fits squarely into YouTube’s growing AI strategy. The platform is already using AI to automatically create Shorts clips, help place ads at peak engagement moments, and generate video with Veo 3, Google’s advanced video generation model.
Lip-syncing could be the next major step in making content more globally accessible, but creators may have to pay to use it. A full launch timeline hasn’t been announced.