Machines are allowed to sample more freely than humans
- KVRian
- 1112 posts since 26 Jun, 2008 from Czech Republic
So this video just came out:
...and it got me thinking. If I sample "DJ KHALED!!!!" into one of my songs, UMG, Warner or Sony can come in and sue my butt off to oblivion. (Especially here, in Europe, where "Fair Use" does not exist as a legal construct.) But if Suno or Udio spit out this very same tag in a track, it's perfectly legal, because as of right now, the output of AI models is not subject to copyright.
(Source: https://copyrightalliance.org/faqs/is-a ... copyright/)
I'm aware that there's the big RIAA lawsuit happening. But that concerns the training phase of the model more than the user-facing, so-called "inference" phase. So even when that gets settled, even when Suno and Udio negotiate some licensing models with labels, it won't change anything about the logic. You sample as a human being? Bad. Machine samples the same thing? ...yeah, that's alright.
Evovled into noctucat...
http://www.noctucat.com/
- Beware the Quoth
- 35433 posts since 4 Sep, 2001 from R'lyeh Oceanic Amusement Park and Funfair
FarleyCZ wrote: Thu Jul 04, 2024 9:00 am But even when that gets settled, even when Suno and Udio negotiate some licensing models with labels, it won't change anything on the logic.
That's not 'the logic' that the law applies though. The license is pivotal to that logic.
You sample as a human being? Bad. Machine samples the same thing? ...yeah, that's alright.
Legally, it would actually be:
"You sample as a human being who hasn't licensed the content? Bad.
You sample as a human being who has licensed the content? Yeah, that's alright.
Machine samples the same thing after a license is negotiated? Yeah, that's alright.
Machine samples the same thing before a license is negotiated? Probably bad, plenty of ongoing legal cases, but no legal precedent set yet."
That legal precedent will almost certainly hinge on whether AI generation counts as 'sampling'. As I understand it, in essence the argument these companies will make is going to be based on two things:
First, that creating an AI model does not entail 'reuse' of the original work; the material is analysed to build the model, but the resulting data is a set of measurements of the work, not something derived from it.
Second, that the generation of the final output is based solely on the model, not on any original work.
How that holds up in court is anyone's guess.
An idiot on Set Theory:
"In some cases there is an object called red that contains everything that is red. In much the same way a pot is a plate."
- KVRian
- Topic Starter
- 1112 posts since 26 Jun, 2008 from Czech Republic
I absolutely understand and agree with everything you wrote. But here's the catch: They did not have the license, yet bits and pieces of the original works are appearing in the generated material.
Yes. It comes down to the question: "Is training a set of weights and storing those weights a form of copying/storage?" That is not decided yet. But here's my 2 cents on how it should go: When you train a model, that model usually has a parameter called "temperature" that dictates how closely the output follows the input data. This parameter is not baked in during the training phase. It can be adjusted during inference. If you set it to 0, theoretically it will spit out 1:1 copies of the original data. For it to be able to do that, it has to possess the ability to reconstruct that data, hence the data is embedded in the weights, hence I would suggest it is indeed a form of storage. ...whether the judges in those lawsuits come to the same conclusion remains to be seen.
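For reference, a minimal sketch of how temperature is usually applied in a sampler: the model's raw scores (logits) are divided by the temperature before the softmax. The logits below are made up for illustration and the function names are hypothetical, not taken from Suno, Udio or any particular library. In this standard formulation a temperature near 0 makes the sampler always pick the single most likely continuation; whether that continuation reproduces a training example depends on the model, not on the sampler alone.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat T=0 as plain argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(logits, temperature, rng):
    """Pick one index: greedy at T=0, random draw from the softmax otherwise."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
rng = random.Random(0)

greedy_choice = sample_next(logits, 0, rng)     # always the top-scoring index
cold = softmax_with_temperature(logits, 0.1)    # sharply peaked
hot = softmax_with_temperature(logits, 100.0)   # close to uniform
print(greedy_choice, cold, hot)
```

Low temperature concentrates the probability on the top-scoring option and high temperature flattens it toward a uniform draw. The scale is a positive number rather than a 0-100% dial.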
- Beware the Quoth
- 35433 posts since 4 Sep, 2001 from R'lyeh Oceanic Amusement Park and Funfair
Oh, it's not like I subscribe to that argument. Personally, I'm not one of those people who think artists have no right to control the reuse of their work. I believe that even if it doesn't currently count as 'reuse' (i.e. 'derived from') in the way existing copyright law defines that term, the appropriate laws need to be unequivocally updated with an equivalent level of protection for the original artist that covers this.
(Although I suspect that would just effectively mean many AI companies will move their servers to those places where copyright law is not as strong)
- KVRian
- Topic Starter
- 1112 posts since 26 Jun, 2008 from Czech Republic
I don't really care how they decide. I have no influence over it anyway. I just don't like the idea of misalignment. The double standard. If a human is subject to the original artist's control while trying to create a derivative work, a machine should be as well. If the machine is freed from that control, a human remixer should be as well.
- Beware the Quoth
- 35433 posts since 4 Sep, 2001 from R'lyeh Oceanic Amusement Park and Funfair
Well, it's not going to make a difference to me what they decide... I don't reuse work where I don't have permission to do so from the copyright owner.
- KVRian
- Topic Starter
- 1112 posts since 26 Jun, 2008 from Czech Republic
Yes, but you have whole genres like early hip-hop, where crate-digging and creative re-use of old material in new contexts were pivotal for the actual genre. Or just inventive remixes in any genre. All of these have to be cleared. And it was always disproportionate. If you were big, you were more likely to acquire the license. But small artists?
Here are two videos that just get my blood boiling every time the algorithm slaps them in my face:
...both of them pretend that derivative work is something extremely normal. Neither mentions that in order to be able to do this, they either had to be asked by the label to make a remix, or had to go through looooooong negotiations with the label. Essentially telling kids: "Yeah do this. It's awesome. We'll sue you later, don't worry."
...and now (unless the RIAA lawsuit succeeds), the thing you SHOULD be worried about when attempting remix work like that is not even applicable to a commercially paid machine that does it on its own? Come on.
- KVRAF
- 14138 posts since 20 Nov, 2003 from Lost and Spaced
I don't know much about Suno, but if you try that with Udio it won't accept those instructions. I've tried it, and it was an 'in the style of' type of thing. It wasn't even a big mainstream artist either.
- KVRian
- Topic Starter
- 1112 posts since 26 Jun, 2008 from Czech Republic
Yes, because they smell trouble coming, so they blacklisted pretty much every artist name ever. Even in the RIAA lawsuit there are workarounds mentioned to get around this. (If I remember correctly, they for example input the whole lyrics of a certain song and surprise surprise, it spat out an almost 1:1 copy of the song including structure, chords, melodies and the singer's voice.) And the funny part is, once Udio or Suno read about those workarounds, they blacklist them as well. Half of the "look they steal" videos are not replicable, because they took care of it.
- KVRist
- 215 posts since 5 Jun, 2002 from corpus christi tx
I scrub the sound from whatever video I get. I do use text-to-speech, but not singing. Until the video I just put out, I was finding that, for whatever reason, Google seemed to suppress the amount of advertising whenever any description of A.I. is used. It could be that just some of them are being suppressed. Almost 900 views (page views, not plays) since 5pm on a new release, when I have been releasing on YouTube every day, is about normal. The 2% click-through rate would just make it around 40 people max, so that may be what I am getting on the last one. The time per song has seemed fairly low, as low as 8 seconds on average for some, so we will see if the story brings them in.
- KVRAF
- 16802 posts since 8 Mar, 2005 from Utrecht, Holland
Technically the AI does not sample. It recreates a sound (or voice or melody) with a huge likeness, from things it has 'heard before'.
Legally this is a very grey (uncharted) area.
We are the KVR collective. Resistance is futile. You will be assimilated. 
My MusicCalc is served over https!!
- KVRian
- Topic Starter
- 1112 posts since 26 Jun, 2008 from Czech Republic
Yes, but there's a big body of research comparing LLMs and lossy compression.
For example:
https://learnandburn.ai/p/an-elegant-eq ... tween-llms
Or:
...so if you use that logic, every lossy compression would mean the data is not copied, but "described". No MP3 would be subject to copyright.
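To make the "described, not copied" point concrete, here is a toy lossy codec, offered as a sketch only. It is plain uniform quantization, not how MP3 actually works (MP3 uses a psychoacoustic model and a frequency transform), and the function names and the 4-bit setting are assumptions chosen for illustration. The stored codes are not the original samples, yet an approximation of the original can be rebuilt from them.

```python
import math


def quantize(samples, bits):
    """Lossy encode: map floats in [-1, 1] to integer codes using `bits` bits."""
    levels = 2 ** bits - 1
    return [round((s + 1.0) / 2.0 * levels) for s in samples]


def dequantize(codes, bits):
    """Decode: map integer codes back to approximate floats in [-1, 1]."""
    levels = 2 ** bits - 1
    return [c / levels * 2.0 - 1.0 for c in codes]


# One cycle of a sine wave stands in for the "original recording".
signal = [math.sin(2 * math.pi * i / 16) for i in range(16)]

codes = quantize(signal, 4)       # the stored "description": 16 small integers
restored = dequantize(codes, 4)   # close to the original, but not identical

max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(codes)
print(max_error)  # bounded by half a quantization step
```

The reconstruction error never exceeds half a quantization step, so the codes work as a compact stand-in for the signal even though none of the original values are stored verbatim.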
Also, most models have something called a "temperature" parameter. When it's 100%, the output is essentially just noise. When it's 0%, the output is pretty much the source data. And that parameter is not baked in during learning. It's a variable you can change while using an already trained model. Most models we use have it set around 20-30% afaik. So if there is a possibility of the model outputting the original data, are we sure that it shouldn't be considered a form of storage?
- KVRAF
- 7001 posts since 20 Mar, 2012 from Babbleon
huh?
humans own/operate machines
the owner/operator is to blame for any wrongdoing of his/her machine
but if a machine owns another machine and says,
"say hello to my little friend"
then that's maybe the end of humans
ah böwakawa poussé poussé