By Stephen Nellis
(Reuters) - Nvidia on Monday unveiled a new artificial intelligence model for generating music and audio that can modify voices and produce novel sounds. The technology is aimed at producers of music, films, and video games.
Nvidia, the world’s largest supplier of chips and software used to build AI systems, said it has no immediate plans to release the technology to the public. It calls the model Fugatto, short for Foundational Generative Audio Transformer Opus 1.
It joins other technologies demonstrated by startups like Runway and larger companies like Meta Platforms, which can generate audio and video from text prompts.
Santa Clara, California-based Nvidia’s version generates sound effects and music from text descriptions, including novel sounds such as a dog making trumpet noises.
What sets it apart from other AI technologies is that it can take existing audio and modify it. For example, it can convert a line played on a piano into a line sung by a human voice, or take a recording of spoken words and change the accent used and the mood expressed.
“If you think about synthesized audio over the last 50 years, music sounds different now because of computers and synthesizers,” said Bryan Catanzaro, vice president of applied deep learning research at Nvidia. “I think generative AI will bring new capabilities to music, video games, and just people in general who want to create things.”
Companies such as OpenAI are negotiating with Hollywood studios over whether and how AI can be used in the entertainment industry. The relationship between the technology industry and Hollywood has been strained, particularly since Hollywood star Scarlett Johansson expressed concerns that OpenAI could imitate her voice.
Nvidia’s new model is trained on open source data, and the company said it is still evaluating whether and how to make it available to the public.
“There’s always some risk with generative technology, because people can use it to generate things we don’t want,” Catanzaro said. “We have to be careful about that, so we don’t have any plans to release this right away.”
Creators of generative AI models have not yet determined how to prevent users from misusing the technology, such as by generating false information or infringing copyrights by generating copyrighted characters.
OpenAI and Meta similarly haven’t said when they plan to make their audio and video-generating models publicly available.