ChatGPT, the free natural language processing tool that OpenAI launched on Nov. 30, 2022, is all the rage in today’s AI world. The AI-driven chatbot writes human-like answers on any topic faster than a subject matter expert could, which is, to some extent, scary.
Last week, Google published a research paper describing MusicLM, AI-driven software that can create music in any genre from a text description. The generative AI system can also transform a hummed or whistled melody into a song in the music style described in the text caption.
AI music and song generators are not new: Amper Music, AIVA, Soundraw, Amadeus Code, and OpenAI’s Jukebox already exist. However, MusicLM appears to deliver better music with more variety than existing software, thanks to a model trained on a large “dataset containing five million audio clips, amounting to 280,000 hours of music”.
Google does not plan to release its AI music generator but aims to support future research by releasing MusicCaps, its own hand-curated, high-quality dataset of 5.5k music-text pairs provided by musicians. The Google researchers used MusicCaps to “demonstrate that [their] method outperforms baselines” (page 8 of the paper). The team acknowledges that there is a long way to go to improve the quality of bot-generated music, including the “modeling of high-level song structure like introduction, verse, and chorus”.
The authors of the paper assessed some of the risks involved in developing such an AI generator, including “cultural appropriation […and] potential misappropriation of creative content associated to the use-case”.