Traditional copyright laws attribute authorship to humans, not machines. Since AI music generators are a relatively new technology, authorship laws have not been updated to address them. So how do we go about attributing copyright when human input to an AI generator creates new music?
For a work to be copyrighted, it traditionally needs to be created by a human and exhibit a degree of originality. AI music generators are criticized for struggling to create something that sounds original, given that their output is generated by pulling from the datasets used to train them.
The concerns surrounding MusicGen and its use of extensive training data from licensed music collections highlight a growing debate at the intersection of AI, copyright, and creative rights. MusicGen, through its sophisticated algorithm, analyses and learns from a vast array of musical patterns, structures, and nuances. It then applies this knowledge to generate new music that matches the input prompts, essentially using its learned understanding of musical relationships to create original compositions.
However, the foundation of MusicGen's learning process—the data it is trained on—poses significant ethical and legal questions. The model's training on 20,000 hours of licensed music, which includes a mix of complete tracks and instrument-only versions from platforms like Shutterstock and Pond5, is legally backed by agreements with rights holders (1). This approach by Meta, the parent company, is part of a larger trend where tech companies leverage existing creative works to train their AI systems, asserting that such usage is covered under legal contracts with content providers.
This practice, however, has not been universally accepted by the artist community. The partnership between Shutterstock and OpenAI for DALL-E, and Shutterstock's development of its own AI image generator, serve as precedents, showing how AI is being trained on a wide range of creative content. While these technological advancements promise innovative ways to generate content, they also raise concerns about the commodification of individual creativity for corporate AI training purposes without explicit consent from the original creators.
The resulting legal battles, such as those faced by Stability AI and Midjourney (2), underscore the tension between the AI industry's appetite for large datasets to improve its models and the rights of artists to control the use of their work. These lawsuits rest on the premise that AI companies are infringing copyrights by indiscriminately incorporating copyrighted material into their training datasets without obtaining permission from every content creator.
For end-users and consumers of AI-generated content, there is an underlying risk that the output may inadvertently mirror existing copyrighted works too closely, leading to accusations of plagiarism. This risk is especially pronounced in contexts where tech giants can afford extensive licensing deals, potentially creating an uneven playing field that favors well-resourced companies over individual creators and smaller entities.
The development and use of AI like MusicGen in creative fields are pioneering yet fraught with complex challenges. They necessitate a careful balancing act between harnessing AI's potential for innovation and respecting the copyrights and creative inputs of individual artists. As the technology advances, ongoing dialogue, clearer regulations, and more transparent practices will be crucial in navigating these ethical and legal landscapes.