Musicians are up in arms about generative AI. And Stability AI’s new music generator shows why they are right to be


Hello and welcome to Eye on AI.

Things are heating up in the world of AI-generated music. Stability AI, whose founder and CEO resigned late last month amid increasing turbulence at the company, unveiled Stable Audio 2.0, the latest version of its AI music generation model. The model lets users create songs by uploading their own audio samples or by writing simple text prompts, and unlike the initial version, which could only produce 90-second clips, it can generate full-length songs up to three minutes long. The other difference with the 2.0 model, Stability AI told The Verge, is that its outputs actually sound like complete songs, with differentiated intros, progressions, and outros.

If anyone’s excited about this, it’s not musical artists. The release comes just days after more than 200 artists came together to sign an open letter urging tech companies to cease creating AI technologies that “sabotage creativity and undermine artists, songwriters, musicians, and rightsholders.” Signatories include Billie Eilish, Nicki Minaj, Elvis Costello, Katy Perry, Smokey Robinson, Sheryl Crow, Pearl Jam, and the estates of Bob Marley and Frank Sinatra, among many other notable artists.

“Unchecked, AI will set in motion a race to the bottom that will degrade the value of our work and prevent us from being fairly compensated for it,” reads the letter, which my colleague Chloe Berger also wrote about in Fortune prior to the news of Stable Audio 2.0’s release. “This assault on human creativity must be stopped. We must protect against the predatory use of AI to steal professional artists’ voices and likeness, violate creators’ rights, and destroy the music ecosystem.”

The music industry has a history of resisting new technologies. The electric guitar was initially met with skepticism from some musicians and audiences, partly because of early technical challenges and partly because some questioned the untraditional sound it produced. The instrument, of course, grew wildly popular and ignited the creation of new genres like rock and roll. More recently, some guitar-playing rock musicians have criticized EDM artists, who create music using digital audio workstation software and other digital technologies, for not playing "actual instruments."

Aside from the reception to new tools for making music, this moment has parallels to the Napster era, when file-sharing technology let people download songs en masse for free, without regard for copyright or compensation. That free-for-all didn't last long, but the streaming model that followed further entrenched a system in which individual artists make very little from the digital distribution of their music. Most musicians are now paid less than a penny per stream, and in some cases they make nothing at all. Now generative AI seeks to take this a step further, using artists' own music to build tools that do the very thing artists do, without them.

The issue of generative AI models being trained on copyrighted material without consent or compensation has been at the center not only of debates about AI, but also of some of Stability AI's own high-level departures. Last November, the company's former vice president for audio, Ed Newton-Rex, resigned following the release of Stable Audio over disagreement with the company's stance that training generative AI models on copyrighted works constitutes "fair use."

“Companies worth billions of dollars are, without permission, training generative AI models on creators’ works, which are then being used to create new content that in many cases can compete with the original works,” he wrote at the time in an op-ed explaining his resignation. “I don’t see how this can be acceptable in a society that has set up the economics of the creative arts such that creators rely on copyright.”

Copyright aside, musicians are rightfully worried that the artistry of music will be lost in a sea of junk, as is already happening with various types of AI-generated content online. As I wrote earlier this year in reaction to the widespread fear around OpenAI's text-to-video model Sora, creating works like films and music is something people immensely enjoy. Many would go as far as to say music is what makes them feel alive, that it's their calling or their reason for living. It's one thing to delegate organizing our emails or supercharging our spreadsheets to AI; it's another thing entirely to let AI into the driver's seat of our passions.

Now, this technology generally isn't very good (yet). The "rock song with a chorus that gets stuck in your head, a guitar solo, and lots of keys" I prompted Stable Audio 2.0 to make sounded like nails on a chalkboard. Its output for "a reggae song with slow verses and a more energized chorus" resembled a reggae tune, but one that seemed to be playing on a warped vinyl during a windstorm. Both outputs were vaguely disturbing and lacked any soul or feeling, and neither had anything resembling distinguishable parts. But as we've seen, these models tend only to get better, and they've already come a long way in a short time.

While some pieces of AI-generated content will likely capture the public consciousness, I'm betting there will always be more demand for music that we create to share with one another and that tells stories about our experiences as humans. The question becomes how to ensure AI doesn't lead to the exploitation of artists and the further deterioration of the business model that supports their work.

And with that, here’s more AI news.

Sage Lazzaro
sage.lazzaro@consultant.fortune.com
sagelazzaro.com

