AI Thread

Re: AI Thread
by Xeno » Fri Apr 19, 2024 2:22 pm

https://arstechnica.com/information-tec ... dio-track/

https://www.microsoft.com/en-us/researc ... ct/vasa-1/

On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future, it could power virtual avatars that render locally and don't require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.

"It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors," reads the abstract of the accompanying research paper titled, "VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time." It's the work of Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, and Baining Guo.

The VASA framework (short for "Visual Affective Skills Animator") uses machine learning to analyze a static image along with a speech audio clip. It is then able to generate a realistic video with precise facial expressions, head movements, and lip-syncing to the audio. It does not clone or simulate voices (like other Microsoft research) but relies on an existing audio input that could be specially recorded or spoken for a particular purpose.


The new Clippy?

The videos are definitely a little off, but if these are the early stages then things may well be gooseberry fool for a lot of us.

