Microsoft researchers have unveiled a new artificial intelligence tool capable of creating deeply lifelike human avatars, but they have offered no timeline for making it available to the public, citing concerns about facilitating deepfake content.
The AI model, known as VASA-1 for “visual affective skills,” can generate an animated video of a person talking, with synchronized lip movements, from a single image and a speech audio clip.
Disinformation researchers are concerned about the widespread use of AI-powered programs to create “deepfake” images, videos, and audio samples during a critical election year.
“We are opposed to any behavior that creates misleading or harmful content about real people,” stated the authors of the VASA-1 report, which was published this week by Microsoft Research Asia.
“We are dedicated to developing AI responsibly, with the goal of advancing human well-being,” said the company.
“We have no plans to release an online demo, API, product, additional implementation details, or any related offerings until we are certain that the technology will be used responsibly and in accordance with proper regulations.”
Microsoft researchers said the technology can capture a wide range of facial nuances and natural head motions.
“It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors,” researchers wrote in their study.
According to Microsoft, VASA can also handle artistic photos, singing audio, and non-English speech.
Researchers emphasized the potential benefits of the technology, such as offering virtual teachers to kids and therapeutic help to those in need.
“It is not intended to create content that is used to mislead or deceive,” they went on to say.
According to the report, VASA videos still include “artifacts” indicating that they were created using AI.
Ben Werdmuller, ProPublica’s technology head, stated that he would be “excited to hear about someone using it to represent themselves in a Zoom meeting for the first time.”
“So, how did it go? Did anyone notice?” he asked on the social network Threads.
In March, OpenAI, the creator of ChatGPT, announced a voice-cloning tool dubbed “Voice Engine” that can closely reproduce someone’s speech from a 15-second audio sample.
However, it added that it was “taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse.”
Earlier this year, a consultant for a long-shot Democratic presidential candidate admitted to being behind a robocall impersonating Joe Biden that was sent to New Hampshire voters, saying he was trying to illustrate the dangers of artificial intelligence.
The call featured what sounded like Biden’s voice urging people not to vote in the state’s January primary, raising alarm among experts who predict a flood of AI-powered deepfake misinformation in the 2024 presidential election.