Google Creates AI that Can Generate Music from User Text

Google researchers have created an AI that can generate music based on a text description provided by a user.
Research on the Noise2music program, which was published last week and is still in its early days, adds a new dimension to what conversational AIs can do.
Users can already request AIs like DALL·E 2 to generate images based on simple text descriptions. ChatGPT can generate full answers, write essays or even generate code based on a user’s request.
The Noise2music AI can now generate customized sound at a user’s request.
It has been a busy month for Google, which is rushing to commercialize its AI research after being upstaged last week by Microsoft, which is putting AI technology from OpenAI into its products. OpenAI created the ChatGPT conversational AI and the image-generating DALL·E 2.
The Noise2music research uses LaMDA, the same large language model behind Google’s Bard conversational AI, which the company plans to incorporate into its search engine.
Google announced Bard last week after Microsoft’s surprise announcement that it was incorporating a ChatGPT-style chatbot in its Bing search engine. Google had largely kept its major AI projects away from public view, though it has published research papers.
Some other major Google AI projects include Imagen, which can generate images and video based on text descriptions, and PaLM, which is a large language model with 540 billion parameters.
Large language models run on the Transformer architecture, introduced by Google in 2017, which helps tie together relations between parts of sentences, images, and other data points. By comparison, convolutional neural networks look at only immediate neighboring relationships.
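The contrast between the two architectures can be illustrated with a minimal Python sketch. It is a simplified illustration, not code from the research: the attention function uses a single head with no learned projections, and the function names `self_attention` and `conv_neighbors` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Toy single-head attention with identity projections (an assumption).

    Every output vector is a weighted mix of ALL input vectors,
    so distant positions can influence each other directly.
    """
    dim = len(seq[0])
    outputs, all_weights = [], []
    for query in seq:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in seq
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        outputs.append(
            [sum(w * vec[d] for w, vec in zip(weights, seq)) for d in range(dim)]
        )
    return outputs, all_weights


def conv_neighbors(length, kernel_size=3):
    """Positions a single convolution layer can see from each position."""
    half = kernel_size // 2
    return [
        list(range(max(0, i - half), min(length, i + half + 1)))
        for i in range(length)
    ]


# Toy sequence of three 2-dimensional vectors.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed, weights = self_attention(sequence)
print(weights[0])          # one weight per position in the sequence
print(conv_neighbors(5))   # each position sees only itself and adjacent ones
```

In the attention case every position gets a nonzero weight for every other position, while a single convolution layer with a kernel of three sees only a position and its two neighbors.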
The Google researchers fed Noise2music hundreds of thousands of hours of music, and attached to each music clip multiple labels that best described the audio. That involved using a large language model to generate descriptions that could be attached as captions to audio clips, and then using another pre-trained model to label the audio clip.
The LaMDA large language model generated 4 million long-form sentences to describe hundreds of thousands of popular songs. One description included “a light, atmospheric drum groove provides a tropical feel.”
“We use LaMDA as our LM of choice because it is trained for dialogue applications, and expect the generated text to be closer to user prompts for generating music,” the researchers wrote.
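The article does not say how the second pre-trained model matches captions to clips. One plausible approach, shown here purely as an assumption, is to embed clips and captions in a shared vector space and keep the captions closest to each clip. The embeddings below are made-up three-dimensional values, the function `top_k_captions` is hypothetical, and only the first caption comes from the research.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_captions(clip_embedding, caption_embeddings, k=2):
    """Return the k captions whose embeddings best match the clip."""
    ranked = sorted(
        caption_embeddings.items(),
        key=lambda item: cosine(clip_embedding, item[1]),
        reverse=True,
    )
    return [caption for caption, _ in ranked[:k]]


# Made-up embeddings for illustration only.
clip = [0.9, 0.1, 0.2]
candidates = {
    "a light, atmospheric drum groove provides a tropical feel": [0.8, 0.2, 0.1],
    "a slow solo piano ballad": [0.1, 0.9, 0.3],
    "distorted guitars over a fast beat": [0.2, 0.1, 0.9],
}

labels = top_k_captions(clip, candidates, k=1)
print(labels)
```

With these toy values the drum-groove caption scores highest, so it becomes the clip's label.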
The researchers used a diffusion model, the same type of model DALL·E 2 uses to generate high-quality images. In this case, the model goes through an upscaling process that produces 30-second clips of higher-quality 24kHz audio. The researchers ran the AI processing on TPU v4 chips in the Google Cloud infrastructure.
The researchers posted sample audio generated by the Noise2music AI on Google Research’s GitHub site. EnterpriseAI ran the audio clips through the song-recognition application Shazam, and the app couldn’t recognize any clip as an existing song.
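To put those figures in perspective, the Python sketch below works out how many audio samples a 30-second, 24kHz clip contains and shows the noising half of a diffusion process on a toy tone. It is an illustration under assumptions, not the researchers' code: the function `add_noise` and the 440 Hz test tone are hypothetical, the sample count is per audio channel, and the trained network that learns to reverse the noise is not shown.

```python
import math
import random

SAMPLE_RATE_HZ = 24_000  # from the article
CLIP_SECONDS = 30        # from the article
NUM_SAMPLES = SAMPLE_RATE_HZ * CLIP_SECONDS  # 720,000 values per channel


def add_noise(clean, signal_keep, rng):
    """Forward diffusion step: blend a clean signal with Gaussian noise.

    signal_keep is the share of the signal's variance that survives
    (1.0 leaves the signal untouched, 0.0 gives pure noise).
    """
    signal_scale = math.sqrt(signal_keep)
    noise_scale = math.sqrt(1.0 - signal_keep)
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in clean]


rng = random.Random(0)

# One second of a 440 Hz tone as a stand-in for real music.
tone = [
    math.sin(2 * math.pi * 440 * t / SAMPLE_RATE_HZ)
    for t in range(SAMPLE_RATE_HZ)
]

slightly_noisy = add_noise(tone, 0.99, rng)  # early step: mostly signal
mostly_noise = add_noise(tone, 0.01, rng)    # late step: mostly noise

print(NUM_SAMPLES)
```

A diffusion model is trained to undo this corruption step by step, so generation starts from noise and works back toward a clean waveform, guided in this case by the text prompt.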
The researchers noted that there is much work to be done to improve music generation based on text prompts. One direction for this AI could be “to fine-tune the models trained in this work for diverse audio tasks including music completion and modification,” they wrote.
The improvements may need to be made with the help of musicians and others to develop a co-creation tool, the researchers said.
“We believe our work has the potential to grow into a useful tool for artists and content creators that can further enrich their creative pursuits,” the researchers wrote.
February 15, 2023