KVR Audio

Borbolactic · Post by **Borbolactic** » Wed Sep 27, 2023 6:01 am

What do artificial intelligence systems use (forms of synthesis, etc.?) to clone sounds, instruments and/or people's voices, and how do they output them?

If we can clone a human voice and get it to sing, we should be able to clone an instrument-- electronic or acoustic-- and get it to play.

Implications?

Here's a video I might watch, ostensibly about building a real virtual synth using ChatGPT as a tool-- so this kind of thing is apparently already beginning to happen-- and, following it, a video of an April Fools' joke that might nevertheless be very close to impending reality:

DCrown · Post by **DCrown** » Thu Sep 28, 2023 3:41 am

I remember that about 2 years ago there was a thread here about people developing software for voice imitations or clones, and most KVR members were laughing and stating it won't ever happen or no one would use something like this. Some were even angry when they imagined that someone could use the voice of Freddie Mercury or whomever, but development won't stop; we are in the era of technology and innovation. Now it's 2023 and AI will control everything in music production, and no one will be upset except the people who keep talking about the good old times. The young generation is the one that mainly produces music and is the main target, and today it's the TikTok-AI generation, whether one likes it or not.
AI writes the lyrics, composes and arranges the songs, chooses and clones instruments, AI is able to imitate or clone every voice, AI will mix and master and more.
If you use AI, you just have to decide which of the AI's suggestions you like best.
I suppose in about 10 years, songs will be fully made by AI. Maybe in 25 years producers will make their own things again in some kind of revival, maybe add a little creativity and change two or three words of song lyrics, maybe play two or three notes with a synth by themselves. Or maybe there won't be human producers any more and the whole music biz will have been taken over by AI!?

rod_zero · Post by **rod_zero** » Thu Sep 28, 2023 5:58 am

The only thing that I doubt will be easy is training AI to learn to mix, not because I think it is impossible but because the data sets required to train AI on that would be very difficult to produce, since mixing is context-dependent and involves a lot of little steps along the way. Or would they synthesize the whole song already mixed? I can't imagine AI going through the same process humans do now.

Borbolactic · Post by **Borbolactic** » Thu Sep 28, 2023 9:08 am

DCrown wrote: Thu Sep 28, 2023 3:41 am I remember that about 2 years ago there was a thread here about people developing software for voice imitations or clones, and most KVR members were laughing and stating it won't ever happen or no one would use something like this. Some were even angry when they imagined that someone could use the voice of Freddie Mercury or whomever, but development won't stop; we are in the era of technology and innovation...

I haven't seen the full video (yet), but some guy on YouTube may have created his own virtual/software synth using AI. Apparently AI is already very good, with some help, at computer coding and has access to code all over the internet-- presumably including code for different kinds of virtual synths and their algorithms and so on.

Already, too, one can download some AI software-- maybe from Discord-- that can be trained to learn a human voice and then to have it say or sing whatever in that voice.
So if AI can code a synth (with some human guidance for now)-- let's say even modeled after a favorite synth ('cloned')-- and be trained on a particular kind of synth's sound like a favorite or two (even a 'morph' between different synths and their character sounds), then that should go a ways toward each person creating their own instruments and orchestras. In a sense, then, all one might need to do is wave a wand like a conductor, so to speak, and different 'virtual people' would 'sing' and 'play' different 'instruments'.

(I wonder if there was a drop in the number of people playing chess when computers finally started winning.)

So what might also happen, along with your points, is that live human-played music in live venues might make a comeback, especially where it would be difficult or undesirable to get robots/androids to play things like the guitar. (The piano might be easier.)

Borbolactic · Post by **Borbolactic** » Thu Sep 28, 2023 9:17 am

rod_zero wrote: Thu Sep 28, 2023 5:58 am The only thing that I doubt will be easy is training AI to learn to mix, not because I think it is impossible but because the data sets required to train AI on that would be very difficult to produce, since mixing is context-dependent and involves a lot of little steps along the way. Or would they synthesize the whole song already mixed? I can't imagine AI going through the same process humans do now.

Well if AI is used more as a tool, at least for now if not forever, then I can imagine that it could learn the processes humans would have it apply until it could do them itself with only minimal input-- basically perhaps by talking to it like an assistant; "Ok, that's good, but compress track 17 a little more and fade it out at the 2 minute mark...".
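To make the "talk to it like an assistant" idea a bit more concrete, here is a minimal sketch of what the structured result of such a request might look like. Everything in it is an assumption for illustration: the `MixOp` record, the `parse_request` function, and the fixed step sizes are hypothetical, and the parsing is plain keyword matching rather than any kind of AI. A real assistant would presumably use a language model for this step and then hand the operations to a DAW.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MixOp:
    """One hypothetical mixing instruction aimed at a single track."""
    track: int
    action: str   # "compress" or "fade_out" in this sketch
    value: float  # compression step (arbitrary units) or fade time in seconds


def parse_request(text: str) -> List[MixOp]:
    """Turn a spoken-style request into a list of MixOp records.

    Keyword matching only; the track number carries over between
    clauses, so "it" refers to the most recently named track.
    """
    ops: List[MixOp] = []
    track: Optional[int] = None
    for clause in re.split(r"\band\b|,", text.lower()):
        found = re.search(r"track (\d+)", clause)
        if found:
            track = int(found.group(1))
        if track is None:
            continue  # nothing to act on yet
        if "compress" in clause:
            # Assumed convention: "a little" means a smaller step.
            step = 0.5 if "a little" in clause else 1.0
            ops.append(MixOp(track, "compress", step))
        fade = re.search(r"fade it out at the (\d+) minute mark", clause)
        if fade:
            ops.append(MixOp(track, "fade_out", int(fade.group(1)) * 60.0))
    return ops


request = ("Ok, that's good, but compress track 17 a little more "
           "and fade it out at the 2 minute mark")
for op in parse_request(request):
    print(op)
```

The point of the sketch is only that a free-form request can be reduced to a short list of explicit operations; how those operations would actually be applied inside a DAW is left open.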

Anyway, I think I might like to see if I can clone a particular voice I have in mind and apply it to a personally-composed song... or even add it to a song already existing by replacing its previous vocalist with my cloned one.

DCrown · Post by **DCrown** » Thu Sep 28, 2023 9:42 am

Borbolactic wrote: Thu Sep 28, 2023 9:08 am
So what might also happen, along with your points, is that live human-played music in live venues might make a comeback, especially where it would be difficult or undesirable to get robots/androids to play things like the guitar. (The piano might be easier.)

I rather think there will be fewer people who can play instruments, because AI will make the music / complete songs.
Knowing how to use AI efficiently will be the most important skill.
Performers will be replaced by holograms and the only venue will be the computer screen. The big advantage could be that you won't have to wear a mask at home in front of the screen if something similar to COVID, or even worse, comes along.

Borbolactic · Post by **Borbolactic** » Thu Sep 28, 2023 10:35 am

DCrown wrote: Thu Sep 28, 2023 9:42 am
I rather think there will be fewer people who can play instruments, because AI will make the music / complete songs.
Knowing how to use AI efficiently will be the most important skill.
Performers will be replaced by holograms and the only venue will be the computer screen. The big advantage could be that you won't have to wear a mask at home in front of the screen if something similar to COVID, or even worse, comes along.

I think a lot depends on what people are willing to accept and reject and how long society lasts. There are people who don't even agree with the current modus operandi of 'crony capitalism' and would rather have a gift economy.

But right now, we have some global energy and sociogeopolitical problems that could make a lot of pipe dreams and nightmares moot if it all comes tumbling down and we're left scrambling to grow our own food and hunt and gather again. And I guess make music using the most musical things we can find in nature.

But what I speak about in part WRT AI is already possible or at least possible within the next year or two. It's moving quite fast.

And virtual instrument software developers have already been 'cloning' ('emulating') hardware synths for around 20 years. So the precedent has already been set.

It might be nice to clone a favorite synth and maybe improve on it, such as where the dev has abandoned it or hasn't the time, money or inclination to do so. Even if only to get the AI to make it 64-bit so it can work on one's current system.

rod_zero · Post by **rod_zero** » Thu Sep 28, 2023 2:32 pm

Borbolactic wrote: Thu Sep 28, 2023 9:17 am
Well if AI is used more as a tool, at least for now if not forever, then I can imagine that it could learn the processes humans would have it apply until it could do them itself with only minimal input-- basically perhaps by talking to it like an assistant; "Ok, that's good, but compress track 17 a little more and fade it out at the 2 minute mark...".

Anyway, I think I might like to see if I can clone a particular voice I have in mind and apply it to a personally-composed song... or even add it to a song already existing by replacing its previous vocalist with my cloned one.

The data sets to train AI have to be enormous, even more so for music. You can feed them thousands of tracks so they learn progressions, melody shapes, rhythm and so on.

But if you want to teach it to mix as humans do now, within a DAW, adding effects, tweaking, using busses, etc., how are you going to produce a data set of each of those things?

If you make some tools, let's say a channel strip that is designed just to collect data, and then maybe a DAW company can gather other info on other stuff, and you get enough people who are good at mixing to participate, it will still take a good chunk of time.
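As a rough illustration of what a data-collecting channel strip might record, here is a minimal sketch. All names in it (`ParamChange`, `SessionLog`, the `comp_threshold_db` parameter, the genre tag) are hypothetical and not taken from any real plugin or DAW; it only shows that each tweak would need to be logged together with some context, which is part of why such a data set would be slow to build.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class ParamChange:
    """One logged tweak on a hypothetical data-collecting channel strip."""
    time_s: float  # position in the song when the change was made
    track: str     # which track the strip sits on
    param: str     # parameter name, e.g. "comp_threshold_db"
    old: float     # value before the tweak
    new: float     # value after the tweak


@dataclass
class SessionLog:
    """All tweaks from one mixing session, plus a little context."""
    genre: str
    changes: List[ParamChange] = field(default_factory=list)

    def record(self, time_s: float, track: str, param: str,
               old: float, new: float) -> None:
        self.changes.append(ParamChange(time_s, track, param, old, new))

    def to_jsonl(self) -> str:
        """One JSON object per line, each tagged with the session context."""
        return "\n".join(
            json.dumps({"genre": self.genre, **asdict(change)})
            for change in self.changes
        )


log = SessionLog(genre="rock")
log.record(12.5, "vocals", "comp_threshold_db", -12.0, -18.0)
log.record(40.0, "bass", "eq_low_shelf_db", 0.0, 2.5)
print(log.to_jsonl())
```

Even this toy log leaves out what the mixer was hearing when they made each change, which is the context-dependence problem raised above.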

I think it will be done from the top down, like synthesizing the full song, without really a "mixing session".

Borbolactic · Post by **Borbolactic** » Thu Sep 28, 2023 6:35 pm

rod_zero wrote: Thu Sep 28, 2023 2:32 pm
The data sets to train AI have to be enormous, even more so for music. You can feed them thousands of tracks so they learn progressions, melody shapes, rhythm and so on.

But if you want to teach it to mix as humans do now, within a DAW, adding effects, tweaking, using busses, etc., how are you going to produce a data set of each of those things?

If you make some tools, let's say a channel strip that is designed just to collect data, and then maybe a DAW company can gather other info on other stuff, and you get enough people who are good at mixing to participate, it will still take a good chunk of time.

I think it will be done from the top down, like synthesizing the full song, without really a "mixing session".

The idea-- at least for some people-- would not be to let the AI create music entirely on its own, but to have AI help humans to create music-- so, a tool-- again, at least for some people.

Other people might want to just lie down on the couch and tell the AI to create an entire song from scratch, already 'mixed' (at least so to speak) and so on, for them to listen to.
(Perhaps that would be more like the AI 'singing' to them than any notion of mixing/recording/producing.)
That's probably what you mean at the end of your comment, yes? That's entirely possible too, such as if AIs already know about songs, singers, instruments, their general volumes, pans, reverbs, etc. (and/or at least sound and auditory-field spatial concepts), and can just throw something together, maybe with some coaxing, like 'That's not bad, but give me a voice more like that woman from the Faroe Islands, what's her name-- Eivor?-- but with a slight merge with Grimes.'

(I don't know Grimes' work at all, but have heard of her, so am just taking a guess.)

As for datasets, AI already has the code for DAWs, and so it would seem trivial to get them to understand what each part of the code actually means and does.
It's possible that many datasets, if not most, are created by the AI, itself, perhaps leveraging neural net learning, with little intervention by humans and I've heard before that even the human creators don't fully understand what is going on under the hood.

So once whatever needs to be done is done, then all a human would need to do is get the AI to operate the DAW for them, or at least operate its 'understanding of the DAW, or how to mix music' in conjunction with what the human requests.

DCrown · Post by **DCrown** » Fri Sep 29, 2023 12:47 pm

Borbolactic wrote: Thu Sep 28, 2023 6:35 pm
The idea-- at least for some people-- would not be to let the AI create music entirely on its own, but to have AI help humans to create music-- so, a tool-- again, at least for some people.

They won't use AI just to help; I can see it in my office. The company suggested using ChatGPT to help with writing or to get some inspiration. The truth is, they use it for 100% of the writing, and it reads like everything is written by some strange robot person, but it's accepted and it will get more and more accepted everywhere. Handwriting is also something that won't be needed any more. It's mostly older people with a good-old-times mindset who dislike it; the younger generation will use it for sure.
It's not a big deal to make AI learn mixing and mastering, especially when you look at how simple, repetitive and short most songs are, and they will get shorter and simpler. TikTok rules!

Artificial Intelligence & Implications: Acoustic/Electronic Sound, Instrument & Human Voice Cloning