Songs can be generated in seconds; Suno and Udio are sued by the three major rec
AI music suddenly faces a make-or-break moment. On June 24th, Suno and Udio, two leading AI music startups, were sued by the three major record companies. The tools these startups have developed can generate complete songs from a prompt in just a few seconds.
Sony Music, Warner Music, and Universal Music claim that these startups have used copyrighted music on a "scale almost unimaginable" in their training data, enabling AI models to generate songs that "imitate the quality of real human records."
Two days later, the Financial Times reported that YouTube is taking a relatively aboveboard approach.
According to reports, instead of training AI music models on secret datasets, YouTube is considering paying an undisclosed one-time fee to top record companies in exchange for permission to use their copyrighted works for training.
In response to the lawsuit, both Suno and Udio issued statements saying they are working to ensure that their models do not imitate copyrighted works, but neither company has specified whether their training datasets contain these works.
Udio states that its model "has 'listened to' a vast amount of music and learned from it."
Before being sued, Suno's CEO Mikey Shulman told me that its training dataset "conforms to industry standards and is legal," but specific information could not be disclosed.
Although the situation is changing rapidly, these measures are not unexpected: the "training data war" starting with lawsuits is already a challenge that generative artificial intelligence companies must face.
This trend leads many companies, including OpenAI, to deal with lawsuits while also paying for data licensing agreements.
However, the risks of AI-generated music are higher than those of image generation tools or chatbots. Generative AI companies targeting text or photos may have ways to circumvent lawsuits. For example, they can train models with texts that are not subject to copyright restrictions. In contrast, music that is not protected by copyright is much less common, and has a smaller audience.
Other AI companies can also more easily reach licensing agreements with interested publishers and creators, with a wide range of options available.
However, industry experts have said that music copyright is much more centralized than film, image, and text copyrights.
Music copyrights are mainly managed by the three major record companies, which are also the three plaintiffs mentioned at the beginning. The publishing departments of these three record companies collectively own the copyrights to more than 10 million songs and most of the music from the 20th century.
The lawsuit names a long list of artists whose work, the record companies claim, was improperly included in the training data, including the Swedish band ABBA and artists from the Hamilton musical soundtrack.
On top of this, creating pleasant music is an even harder problem. Generating acceptable poetry or illustrations using artificial intelligence is a technical challenge, but infusing the models with our preferred musical tastes presents another challenge altogether.
Of course, it is possible for AI companies to win this lawsuit, but that is of little consequence. Their desire to use copyrighted music for free to train their models is nothing more than an empty promise.
Experts have stated that the record companies have a strong case. If AI companies wish to survive, they may soon have to pay, and the sum could be substantial.
Should the court rule that AI music companies cannot train on these record companies' music for free, then expensive licensing agreements, such as those reportedly being sought by YouTube, seem to be the only way out. Ultimately, this will allow the companies with the deepest pockets to take the lead.
More so than any training data lawsuit seen thus far, the outcome of this case will determine whether a significant application of generative artificial intelligence still has room to develop.
The Origin and Progression of the Case
Suno's music generation tool, launched less than a year ago, has already garnered 12 million users. The company secured $125 million in financing last month and established a partnership with Microsoft Copilot.
Udio is even younger. Its tool only went online in April 2024, backed by $10 million in seed funding from music investors such as will.i.am and Common.
Record companies claim that both of these startups have infringed on copyrights in their model training and output.
James Grimmelmann, a professor of digital and information law at Cornell Law School, said: "Of all the lawsuits so far, the plaintiffs in this case have the highest chance of defeating the artificial intelligence companies."
He compared it with the ongoing lawsuit by The New York Times against OpenAI, which he said had so far been the best example of a rights holder bringing a strong case against an artificial intelligence company. However, for many reasons, the lawsuit against Suno and Udio is "worse."
The New York Times accused OpenAI of using its published articles without authorization and using them for model training, which infringed on copyright.
Grimmelmann said that OpenAI can "exploit loopholes" in dealing with this accusation, because the company can say that it scraped a large training corpus from the Internet and that copies of The New York Times articles appeared in places unknown to the company.
For Suno and Udio, this defense is less credible.
Grimmelmann said: "It cannot say, 'We searched all the audio on the internet, but we were unable to distinguish commercial production songs from other songs.' It is clear that they must have collected a database of major commercial records."
In addition to complaints about training, the new case also claims that tools like Suno and Udio are more imitative than generative.
This means that their output content imitates the style of copyrighted artists and songs.
Grimmelmann pointed out that The New York Times cited examples of ChatGPT copying its entire articles, but record companies claim that they can generate problematic responses from artificial intelligence music models with much simpler prompts.
For example, the plaintiff said that prompting Udio with "my tempting 1964 girl smokey sing hitsville soul pop" produces a song that "any listener familiar with the Temptations will immediately recognize its similarity to the classic song 'My Girl'."
The court documents include examples from Udio, but these songs appear to have been removed. The plaintiff also cited similar examples from Suno, including a song that resembles the style of ABBA. The song was generated with the prompt of "1970s pop music" and the lyrics of "Dancing Queen."
More importantly, Grimmelmann explained that there is more copyright-protected information in songs than in news articles.
He said, "In capturing Mariah Carey's singing and voice, the information density is much higher than in text." This may be one of the reasons why past lawsuits involving music copyright have been so lengthy and complex.
Shulman wrote in a statement that Suno prioritizes originality, and the model "is designed to generate completely new outputs, rather than memorizing and repeating previously existing content."
He added, "This is why we do not allow users' prompts to mention specific artists."
Udio's statement also mentioned that it uses "the most advanced filters to ensure that our models do not replicate copyrighted works or the voices of artists."
In fact, if a request specifies an artist, these tools will block it. However, record companies claim that there are significant loopholes in these safeguards.
For instance, examples shared by social media users after news of the lawsuit broke indicate that if the letters in the artist's name are separated by spaces, the request may pass.
My own request for a "song sung like Kendrick" was blocked by Suno, but "a song sung like k e n d r i c k" passed successfully, producing a "hip-hop rhythm-driven" track.
To be fair, the generated result does not reproduce the artist's distinctive style, but the fact that the model can produce something in a similar vein suggests that it is familiar with the works of many well-known artists. Similar workarounds failed on Udio.
Possible Outcomes
Grimmelmann stated that there are three possible directions this case could take. One of them is the failure of the lawsuit, where the court fully sides with the artificial intelligence startups, potentially determining that their training qualifies as fair use and that the models' output does not imitate copyrighted works.
If these models are considered fair use, this would mean that songwriters and rights holders would need to seek compensation through different legal mechanisms.
Another possibility is a mixed outcome. The court finds that the AI companies' training qualifies as fair use but that they must better control the output of their models to ensure they do not casually imitate copyrighted works.
Grimmelmann stated that this would be similar to one of the initial rulings against Napster. In that ruling, the company was forced to prohibit the search for copyrighted works in its library, although users quickly found workarounds.
The third direction is the most devastating, where the court considers that the company is at fault in both the training of artificial intelligence models and their output.
This means that these companies cannot use copyrighted works for training without permission, nor can they allow the output of content that imitates copyrighted works.
The companies may be ordered to pay damages for the infringement, with each company's compensation potentially amounting to hundreds of millions of dollars.
Even if such a ruling does not bankrupt them, licensing agreements would force them to rebuild their training datasets from scratch, which could also be prohibitively expensive.
Licensing or No Licensing
Although the plaintiff's direct goal is to make the artificial intelligence company stop training and pay compensation, the chairman of the Recording Industry Association of America, Mitch Glazier, is already looking forward to a future with licensing.
He wrote in a column article: "As in the past, music creators will exercise their rights to protect the creative engine of human art and promote the development of a healthy, sustainable licensing market that recognizes the value of creativity and technology."
Such a licensing market may be similar to the current situation faced by text generation tools. OpenAI has reached licensing agreements with several news publishers, including Politico, The Atlantic, and The Wall Street Journal.
These agreements allow OpenAI's products to access the publishers' content, although the models remain extremely limited in how transparently they cite their sources.
If artificial intelligence music companies follow this pattern, then the only companies capable of creating powerful music models may be those with the most funding. This may be exactly what YouTube is thinking.
YouTube did not immediately respond to MIT Technology Review's questions about the details of its negotiations, but considering the vast amount of data required to train artificial intelligence models and the concentration of music copyrights, the final agreement price may be astronomical.
In theory, artificial intelligence companies could completely avoid obtaining permission and only build models on music not protected by copyright, but this would be a daunting task.
There are similar efforts in the field of text and image generation. A legal consulting firm in Chicago, USA, created a model based on regulatory documents, and Hugging Face created a model trained with images of Mickey Mouse from the 1920s.
However, these models are quite small and inconspicuous. If Suno and Udio are forced to train only on public-domain music (think of the free songs in military propaganda films and corporate promotional videos), the resulting models will be far from what we see today.
Grimmelmann said that if artificial intelligence companies do pursue licensing agreements, negotiations may be tricky.
Music licensing is complicated by two different copyrights: one is the copyright in the song, which usually covers the composition, such as the music and lyrics, and the other is the copyright in the master, which covers the recording and mastering. (The songs you hear are usually the mastered recordings.)
Some artists, such as Taylor Swift and Frank Ocean, have gained control over the master copyrights of their music libraries after a long legal struggle, so they will have the final say in any potential licensing agreements.
However, many other singers only retain the copyright in the song, while the record company retains the master copyright.
In these circumstances, record companies may in theory be able to grant artificial intelligence companies the right to use music without the artist's permission, but this could lead to a break with the artist and trigger more legal disputes.
Whether to license their music to these companies has caused divisions within the music community. Under the contract rules passed in April 2024 by the American SAG-AFTRA union, which represents artists and actors, the voices of union members can be cloned by artificial intelligence, but minimum payment standards apply.
As early as December 2023, an organization called the Indie Musicians Caucus expressed disappointment with the American Federation of Musicians (AFM), which has 70,000 members, because it believes that the AFM has not done enough in the face of artificial intelligence infringement and has failed to protect its ordinary members.
The group wrote that it will vote against any agreement that "requires AFM members to participate in training generative artificial intelligence (our permanent replacements) without the right to consent, compensation, or recognition, thus digging their own graves."
At this point, however, AFM does not seem to be in a hurry to facilitate any transactions.
I asked Kenneth Shirk, the International Financial Secretary of AFM, whether he thinks musicians should cooperate with artificial intelligence companies to promote fair compensation, whatever that means, or to completely boycott licensing agreements.
He told me: "These issues have given me a lot to think about. Would you rather have a swarm of fire ants crawling all over your body, or roll around on a bed full of broken glass? We want musicians to be compensated, but we also want to ensure that our future generations can continue to work in music."
Support: Ren
Operation/Typesetting: He Chenlong