I remember similar news about ML services that generate music: they are able to reproduce melodies and lyrics from copyrighted songs (if you find a way around the filters on song or artist titles) and even producer tags in Hip-Hop tracks.
All this latest ML growth is built on massive copyright violations.
I wanted to note that you cannot compare learning by a human with "ML learning", which is basically calculating coefficients from copyrighted materials. Don't those coefficients fall under the definition of a "derivative work", by the way?
ML models are not "learning" in the same way as humans do, and while the field uses the misleading word "learning", it has a completely different meaning; also, ML models are not humans and therefore they are not subject to laws; the engineers who perform the calculations are.
So comparing the calculation of ML model parameters to a person studying art is incorrect; you should compare engineers performing calculations with data from copyrighted material to a person studying art. It is immediately obvious that these cases are not equivalent. And those engineers are not learning anything in the process, so they cannot use the analogy as an excuse.
The fact that those services can reproduce copyrighted content proves that it was used during training. And was it legally obtained? Do you think services like Udio bought millions of CDs? Or did they get the training material somewhere else? You cannot legally download content from streaming services, for example.
There's no difference between producing and copying. Let's take a real world example: music samples. There's an entire clearinghouse process for music sampling, more or less forged after the 80s/90s when sampling blew up. Record companies and artists were like, "hey that's my song", courts agreed, and a market was created.
This is pretty analogous to what's happening now, which is code samples. Developers are like, "hey that's my code". But that's where we're diverging, and this is probably because big companies aren't involved. People were sampling Atlantic Records' stuff. People aren't sampling Microsoft's stuff, they're sampling random GitHub OSS project guy's stuff.
But to your point, you're basically arguing that it's fine as long as no one listens to "Bitter Sweet Symphony". Most people think it's not the end user (listener) who's infringing copyright, but the party doing the copying (The Verve). Even if we accept your principle here, you're putting way too heavy a burden on people who use services like Copilot. Am I supposed to check that everything I autocomplete is properly licensed? You more or less said "shut these services down" in so many words.
That isn't true, strictly speaking. The right to reproduce work is covered by copyright law, irrespective of whether the reproduction is commercial or not.
So what? Most uses when we're talking about code or artwork are going to involve someone taking the generated result and publishing it somewhere.
> But if I use it, in a commercial manner, then it becomes a copyright violation.
No, that's incorrect. Commercial use has nothing to do with it. Any act of distribution, regardless of whether or not it's for commercial or personal use, regardless of whether you charge $100,000 or $0, falls under copyright law.
You can usually dismiss anyone who talks about the law in such absolute terms, especially when it comes to copyright.
The U.S. Copyright Office has some guidelines about fair use, and non-commercial as well as personal use are listed as considerations that courts take into account when judging whether an unlicensed copy constitutes an infringement:
You're technically correct, but this ignores some unfortunate realities of DMCA for the smaller fry.
The bigger fry are fighting, and will keep fighting, to change the ideas of copyright, now that it's inconvenient for them after spending decades strengthening it.
I pretty much said that, yes. But the argument of "personal use" ends when you post on the internet. Which is what most "big fries" are doing as some endpoint.
But those are more "medium fries" anyway. The "big fries" aren't gonna take any risk whatsoever unless they are going the edgy parody route (someone like Adult Swim). There's little upside to Bungie or Microsoft or Laika posting Mickey Mouse to begin with.
In summary, you probably can and Disney won't care, but it is technically not allowed outside of fair use constraints. It's illegal in the same sense that it's illegal to ride a bike in a swimming pool in Baldwin Park, CA. Key point: don't make money, don't be stupid, and don't be unlikeable.
Even then, most platforms control the content and they probably won't defend you and would take it down. That's not a legal matter so much as a platform policy.
>people aren't getting sued for drawing Mickey Mouse for personal use because Disney would rather go after the big guys.
It's even simpler than that. If a lawyer costs $10k to run a small claims case, and they have a shaky chance (fair use) or a low payout, it's not profitable to go after you.
That's why other factors need to come into play, like potential brand damage, scaring off imitators, or simply pissing off the wrong lawyer somehow.
Yes and no. Strictly speaking, you need a license to use copyrighted material, even if it's non-commercial. (Obvious example: if you post, say, Jasmine making out with Hitler on some non-monetized account, Disney can take that down. Or try. It really comes down to whether the platform wants to argue fair use or parody or whatnot. Most won't defend you, though.)
But enforcement-wise, Disney won't bother going after every potential copyright violation. They will focus on the biggest money makers or the biggest potential for brand damage. So it's not a worry for most people who will just draw some Mickey Mouse for a friend, or even a private client, as long as they aren't stupid.