Thanks for this explanation, Unaspected. Better than mine! And I've coded a few deep neural networks myself...

Unaspected wrote: ↑Thu Oct 22, 2020 9:42 pm
The task would be beyond daunting if you were to approach it as a human using this method - also noting that a human has the advantage of a complex understanding of pitch before he approaches the synth. This isn't tuning an additive bank of sine waves via FFT; this is tuning each partial/oscillator by trial and error until best guesses are arrived at - whilst also tuning every other aspect of the synth and calculating how to play it so it best matches the input data.

fmr wrote: ↑Thu Oct 22, 2020 3:11 pm
That's where your line of thinking fails. Synthesizing the sound isn't "that" difficult. Actually, it was done in the sixties by Jean-Claude Risset, using acoustic analysis and additive synthesis via the Music language created by Max Mathews. It has nothing to do with human limitations, since we can program each element one at a time, and then trigger most of them using relatively simple controls.

perpetual3 wrote: ↑Thu Oct 22, 2020 2:52 pm
.../... if the sound is complex enough, eventually it becomes really difficult, for nearly everyone, to synthesize a physical sound using elementary modules or DSP elements, because humans have limitations (two eyes, two arms, ten fingers, difficulty paying attention to more than one thing at once, etc.).
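To make the trial-and-error partial tuning described just above concrete, here is a deliberately naive Python/NumPy sketch: an additive bank of sine partials whose frequencies and amplitudes are jittered at random, keeping only guesses that better match a target recording. The target tone, the constants, and the raw waveform error measure are all invented for illustration - a real system would need a perceptually smarter distance:

import numpy as np

SR = 16000                      # sample rate (Hz); all constants here are invented
N_PARTIALS = 8
N = SR // 4                     # a quarter second of audio to compare

def render(freqs, amps):
    """Render an additive bank of sine partials."""
    t = np.arange(N) / SR
    return sum(a * np.sin(2 * np.pi * f * t) for f, a in zip(freqs, amps))

# Hypothetical "input data": a quasi-harmonic tone we pretend was recorded.
harmonics = np.arange(1, N_PARTIALS + 1)
target = render(220.0 * harmonics * 1.003, 1.0 / harmonics)

# Trial and error: jitter every partial, keep the guess if the match improves.
rng = np.random.default_rng(0)
freqs = rng.uniform(100.0, 2000.0, N_PARTIALS)
amps = rng.uniform(0.0, 1.0, N_PARTIALS)
best = np.mean((render(freqs, amps) - target) ** 2)

for _ in range(5000):
    f_try = freqs + rng.normal(0.0, 5.0, N_PARTIALS)
    a_try = np.clip(amps + rng.normal(0.0, 0.02, N_PARTIALS), 0.0, 1.0)
    err = np.mean((render(f_try, a_try) - target) ** 2)
    if err < best:              # keep what worked, scrap what didn't
        freqs, amps, best = f_try, a_try, err

print(f"best mean-squared error after 5000 trials: {best:.4f}")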
This is why we have sustained sounds which seem to be made up of scraps of audio. The machine hasn't finished learning, but that is the best match at that point.
Perpetual3 does seem to understand the process well.
If we take a Gaussian distribution: we could start with a cluster of sound - seemingly random - but as time progresses, the sound converges on a single pitch and floats around it. The difference is that whilst an orchestra generally understands its instruments and has a reasonable grasp of the starting and end points, the machine only knows a reality of what previously worked and what didn't.
As a slightly meta example, consider how the Fourier transform uses correlation - except imagine the machine having no algorithm. It generates the audio sample value by sample value, making comparisons to the original audio, keeping the best matches and scrapping the worst. It would be a much slower process in human terms. However, at some point we might find that the machine learns how to generate and match sine waves. Within that data there will be some understanding. Just as with Google's audio engine, the machine will work out tones and articulations that sound realistic.
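A toy version of that "keep the best matches, scrap the worst" loop, in Python/NumPy - the 440 Hz stand-in, population size, and mutation width are all invented for illustration:

import numpy as np

SR, N, POP = 8000, 512, 64      # toy sizes, chosen only for illustration
rng = np.random.default_rng(1)

t = np.arange(N) / SR
original = np.sin(2 * np.pi * 440.0 * t)    # stand-in for "the original audio"

# No algorithm, only candidates: start from pure noise.
pop = rng.uniform(-1.0, 1.0, (POP, N))

for _ in range(2000):
    # Compare every candidate to the original, sample value by sample value.
    errs = np.mean((pop - original) ** 2, axis=1)
    survivors = pop[np.argsort(errs)[: POP // 4]]   # keep the best matches
    # Scrap the worst and refill the population with mutated survivors.
    pop = np.repeat(survivors, 4, axis=0) + rng.normal(0.0, 0.01, (POP, N))
    pop[0] = survivors[0]                           # never lose the best so far

best = np.mean((pop - original) ** 2, axis=1).min()
print(f"best mean-squared error after 2000 generations: {best:.5f}")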
Consider Google's proof of concept - it sounds damn good to me. Very impressive work, with plenty of possible audio applications for both generating and processing.
Haha, this thread is a mess. Machine learning/neural network/AI-oriented DSP is still in its infancy. Take a look at what other ML-oriented projects have been able to achieve outside of audio - at what Topaz, Adobe, and Nvidia have done for images using ML. There's no need to be afraid; accept and kneel down to your AI overlords. The future of ML and DSP is exciting!
Afaik there wasn't an option for that. But when it's available, I'm interested.

audiot wrote: ↑Thu Oct 22, 2020 10:23 pm
And what about the other way around? Turning instruments into farts? This could be the next big thing! I'm imagining new genres like fart-step or fart-core.

SneakyBeats wrote: ↑Thu Oct 22, 2020 2:22 pm
Nice idea... But I just can't figure out any situation where I'd want to turn my fart into a bunch of violins.
https://www.youtube.com/watch?v=sfOzTl1pVyg

audiot wrote: ↑Thu Oct 22, 2020 10:23 pm
And what about the other way around? Turning instruments into farts? [...]
apologies, i read it back and realised i'd basically described resynthesis.

Unaspected wrote: ↑Thu Oct 22, 2020 10:04 pm
Aw. I just quoted what you wrote previously and was going to reply. It turned into this.
But also imagine the synth having all this data for natural variation within it. Every key press could sound varied in a natural manner. Eventually we could sample playing styles. Then mix and match.
An instant, obvious application of turning one instrument into another: only ever needing one crappy guitar that can sound like any other guitar. Maybe never even tune the thing. Have a parameter for playing style being slightly more or less drunk. Adjust the amount of THC, etc.
but it's that that i'd want: access to the sound engine, to play the sounds
not so interested in changing one instrument into another personally, in the sense of playing the guitar to get a trumpet phrase
but yes, minor random variations of some parameters per note - not enough to change the sound, but enough that in a fast run of 16ths on one note, each note sounds a little more natural. that could be cool.
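That kind of per-note jitter is simple to sketch. A minimal Python illustration, where every field name and spread is hypothetical rather than any particular synth's API:

import random

def humanize(note, rng=random):
    """Return a copy of a note with small, natural-feeling jitter.
    Field names and spreads are illustrative only."""
    return {
        "pitch":    note["pitch"] + rng.gauss(0.0, 0.03),   # ~3 cents
        "velocity": min(127, max(1, note["velocity"] + rng.gauss(0.0, 4.0))),
        "start":    note["start"] + rng.gauss(0.0, 0.004),  # a few ms
        "bright":   note["bright"] + rng.gauss(0.0, 0.01),  # tiny timbre drift
    }

# A fast run of 16ths on one note, each repetition slightly different.
base = {"pitch": 60.0, "velocity": 100, "start": 0.0, "bright": 0.5}
sixteenth = 60.0 / 120.0 / 4.0      # 16th-note duration at 120 BPM
run = [humanize({**base, "start": i * sixteenth}) for i in range(16)]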
Ah, but they already are.
And talking about lunacy, here's a little rhyme:
Google fart, google fart, go away,
Come again another day,
Little Johnny wants to play (his guitar)
You joke, but Adobe has actually created something similar to this. The last time I checked they weren't going to release it to the public, but there's a stage demo of it.
Leonard Bowman:
I attempted to train the Timbre Transfer script on solo classical guitar - it didn't turn out too well. Everything turned into full chord strums. You can download the .zip of the training result if you're curious about it.
While this was a fun exercise (that is, an exercise in patience as I let it train for 8 hours or so), the end result isn't particularly useful to me.
A couple takeaways from my experience:
- It is surprisingly easy to set up the training, as the script walks you through each step.
- Machine Learning still has some major milestones to conquer - at the very least polyphony and non-tonal/drum sounds. I'm looking forward to the days when I can translate my vocal percussion into a clean drumkit without having to own, store, and practice a drumset.
- Magenta is doing a great job of making these tools accessible. That speaks volumes to me.
- Karplus-Strong is better than my weak attempt at training a model for guitar (see the sketch just after this list). That's partly a limitation of the material I used for training, though.
- This is more of a competitor to the big samplers than to physical modeling. And I think it's a great step towards building expression into synthesized sounds.
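For contrast with hours of model training, here is roughly how little code a classic Karplus-Strong pluck takes - a toy Python/NumPy sketch, with tuning and damping values chosen arbitrarily:

import numpy as np

def karplus_strong(freq, dur, sr=44100, damp=0.996):
    """Classic Karplus-Strong pluck: a noise burst circulating through a
    delay line, averaged and damped a little on every pass."""
    period = int(sr / freq)                     # delay-line length sets the pitch
    buf = np.random.uniform(-1.0, 1.0, period)  # the initial "pluck"
    out = np.empty(int(sr * dur))
    for i in range(len(out)):
        out[i] = buf[i % period]
        buf[i % period] = damp * 0.5 * (buf[i % period] + buf[(i + 1) % period])
    return out

# A short E minor arpeggio; frequencies in Hz.
audio = np.concatenate([karplus_strong(f, 0.5)
                        for f in (82.41, 123.47, 164.81, 196.00, 246.94, 329.63)])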
mostly here for the One Synth Challenge
you can hear some of my newest music at: https://wrenharmonic.bandcamp.com/ or https://www.youtube.com/@wrenharmonic
Thank you for confirming my grasp of the theory. I've yet to dive into practical applications. Think I might set such a program the task of tuning filter/reverb topologies when I do. Though I have been having fun blowing things up, and unlike the machine, I'm learning other principles indirectly.

perpetual3 wrote: ↑Thu Oct 22, 2020 11:22 pm
Thanks for this explanation, Unaspected. Better than mine! And I've coded a few deep neural networks myself...
There are a number of methods that are easy enough to employ using basic synthesis techniques, but they cost more CPU, as you're adding modules rather than adjusting parameters in the modules already provided. So my thinking should save CPU at a cost in RAM.
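As a toy version of setting such a program loose on a filter: a random search in Python/NumPy that recovers a single one-pole coefficient from a "measured" impulse response. Everything here is invented for illustration; searching a real filter or reverb topology would be far harder:

import numpy as np

def one_pole(x, a):
    """One-pole lowpass: y[n] = (1 - a) * x[n] + a * y[n - 1]."""
    y = np.empty_like(x)
    acc = 0.0
    for i, v in enumerate(x):
        acc = (1.0 - a) * v + a * acc
        y[i] = acc
    return y

# Pretend this impulse response was measured from a filter we want to copy.
impulse = np.zeros(256)
impulse[0] = 1.0
target = one_pole(impulse, 0.9)

# Let the search loop find the coefficient: keep whichever guess works best.
rng = np.random.default_rng(2)
best_a, best_err = 0.5, np.inf
for _ in range(200):
    a = float(np.clip(best_a + rng.normal(0.0, 0.05), 0.0, 0.999))
    err = np.sum((one_pole(impulse, a) - target) ** 2)
    if err < best_err:
        best_a, best_err = a, err

print(f"recovered coefficient: {best_a:.3f} (true value 0.9)")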
Thank you for this.

Leonard Bowman wrote: ↑Sun Oct 25, 2020 12:06 am
I attempted to train the Timbre Transfer script on solo classical guitar - it didn't turn out too well. [...]