Physical modeling is dead! Google tone transfer is here ...

Post

highkoo wrote: Thu Oct 22, 2020 5:52 am Literally does not understand the technology wtfsoever, but it makes him vomit. :roll:
Huh? What are you talking about?

Post

psynical wrote: Thu Oct 22, 2020 5:45 am wow.

> It's not art unless i say it is!

go back to your academics and please stay there.
Huh? What are you talking about?

Post

I was talking about how you seem to be totally out of place in this thread, I guess.
No offense.
For most people remotely knowledgeable about, interested in, or experienced with sound design, this tech is obviously the seed of some kind of amazing shit. But somehow it makes you vomit, and you don't understand why it even exists.

Post

highkoo wrote: Thu Oct 22, 2020 6:24 am I was talking about how you seem to be totally out of place in this thread, I guess.
No offense.
For most people remotely knowledgeable about, interested in, or experienced with sound design, this tech is obviously the seed of some kind of amazing shit. But somehow it makes you vomit, and you don't understand why it even exists.
Anyone can join this thread. How can you possibly know what I do or don't know about sound design? I'm only nauseated by the use of technology to evade understanding what makes music music. If you think you can whistle a tune into a microphone, push a button, and expect something artistically wonderful to emerge from some piece of software, that is delusional.

Post

Try travel sickness pills; they can work wonders.
I lost my heart in Taumatawhakatangihangakoauauotamateaturipukakapikimaungahoronukupokaiwhenuakitanatahu

Post

deastman wrote: Wed Oct 21, 2020 11:42 pm MIDI files with sample playback are rarely very convincing on their own without extensive controller modulation. That's why MPE controllers are so interesting. But the goal of something like this research is to allow someone who doesn't play any instrument to simply hum, sing, or whistle, and create realistic instrumental performances. Likewise, if you are proficient on one instrument, you can use this to create realistic sounds of other instruments without dedicating decades of practice to them. Maybe I want some violin or oboe in my song, but I only play guitar. This is really no different from what people have been doing for ages, trying to create the most realistic string sections and piano patches for their MIDI keyboards. Beyond this, what is wrong with doing basic research, even if there isn't an immediately obvious practical application?
^^^ Very much this ^^^
Fernando (FMR)

Post

highkoo wrote: Thu Oct 22, 2020 5:33 am I mean, ffs, this is tech that literally creates sounds never heard before. Impossible sounds.
Sound designer shit.
Actually, it doesn't. It mimics the spectra you feed it, as far as I can see. Nothing revolutionary. It's called spectral synthesis, and it has been done for quite some time. Physical modeling is more controllable, and much more capable of creating "sounds never heard before. Impossible sounds."
Last edited by fmr on Thu Oct 22, 2020 10:23 am, edited 1 time in total.
Fernando (FMR)

Post

audio to MIDI has been out for close to 20 yrs...because it has always been the holy grail to be able to play any instrument without having to learn a new interface...but it is difficult to track pitch, timing, and velocity...some tools may track one well, but I've never seen any track all three well...there was even a commercial hardware device, a neck collar, that measured sensory data from your vocal tract to improve the transcription
Academics have also been trying to objectively quantify the variables that encompass timbre for decades, since it has been known for ages that additive synthesis can theoretically reproduce any instrument if you fully understand the structure.
But the problem with both of these is that they only analyze the input; they don't refine the analysis based on the intended output...that's why audio to MIDI doesn't immediately sound good on any random VSTi preset you pull up...it's also why Sampleson's additive emulations of specific instruments sound so much more realistic than the results of their flexible general-purpose additive synth
Google is combining both of these methods to get more convincing results by knowing the intended destination in advance and refining the analysis...that's why there are only four destination models...pitch and timbre tracking, additive resynthesis, and spectral morphing are not new...using machine learning to gain as much insight as possible into the characteristics of the input and the intended output to improve the resynthesis algorithm is the novel part...
Even without the AI, creating a timbre-characteristics database of traditional instruments would allow the creation of heuristics that could be implemented in an additive synth, where the user selects from a table of source types and a table of desired destinations before performing the resynthesis
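For instance, a rough sketch of that non-AI version (my own toy example; the harmonic profiles below are invented for illustration, not measured from any real database):

```python
import numpy as np

SR = 16000  # sample rate in Hz

# Stand-in for the "timbre characteristics database": relative harmonic
# amplitudes per destination instrument. Real entries would come from
# analyzing recordings; these numbers are made up.
TIMBRE_DB = {
    "flute":  np.array([1.0, 0.3, 0.1, 0.05]),
    "violin": np.array([1.0, 0.8, 0.6, 0.5, 0.4, 0.3]),
}

def resynthesize(f0_track, loudness_track, destination):
    """Additive resynthesis: the source's per-sample pitch and loudness
    drive the stored harmonic profile of the destination instrument."""
    amps = TIMBRE_DB[destination]
    # accumulate phase so pitch glides stay continuous
    phase = 2.0 * np.pi * np.cumsum(f0_track) / SR
    out = np.zeros(len(f0_track))
    for k, a in enumerate(amps, start=1):
        out += a * np.sin(k * phase)
    return out * loudness_track

# Usage: a two-second glide from A3 to A4, rendered as "violin".
# In practice f0/loudness would come from a pitch tracker, not np.linspace.
n = SR * 2
f0 = np.linspace(220.0, 440.0, n)
loudness = np.hanning(n)
audio = resynthesize(f0, loudness, "violin")
```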
Music had a one night stand with sound design.....And the condom broke

Post

I feel that it slightly misses the point of instrumental performance to map one instrument to another. Different instruments provide different kinds of expression; the way I perform vibrato on a guitar is not the same as the way I would choose to use vibrato on a trombone. There is no trombone equivalent to hammer-on/pull-off articulations, and the concept of "tremolo" is entirely different. How can bow pressure be inferred from whistling?

My suspicion is that such transfer technology, like most ROMplers, does not really imitate the whole scope of an instrument, but only a particular limited way of playing the instrument. Of course, like ROMpling, this would be sufficient for many commercial purposes.

Post

fmr wrote: Thu Oct 22, 2020 8:33 am
highkoo wrote: Thu Oct 22, 2020 5:33 am I mean, ffs, this is tech that literally creates sounds never heard before. Impossible sounds.
Sound designer shit.
Actually, it doesn't. It mimics the spectra you feed it, as far as I can see. Nothing revolutionary. It's called spectral synthesis, and it has been done for quite some time. Physical modeling is more controllable, and much more capable of creating "sounds never heard before. Impossible sounds."
Please read my earlier posts, or better, read the blog (where I got my info; I searched for "google tone match"). The principle is that any sound can be synthesized if you have enough of the fundamental DSP elements: oscillators, filters, delays, reverbs, etc. Each of these DSP elements has its own parameters. And "enough" here means a lot, far more than a human is capable of handling with two arms, two hands, and ten fingers.

In other words, imagine you had enough arms, hands, and fingers to adjust the hundreds of oscillators, filters, etc. (and their corresponding parameters) required to "realistically" synthesize the sound (violin, sax). Now also imagine that your brain knows exactly the precise combination of DSP elements and parameter values needed to generate that sound (you have a precise model of the sound, in terms of those DSP elements, in your head), but when you sit down to program the synth, all the parameters are wrong, for example totally randomized. And then you adjust all the parameters at once, hit a note, make some changes, hit a note, make some more changes, until the patch sounds like a violin or sax or whatever.

That is what this software is doing. Even though it's based on how a human could synthesize a sound, it extends the paradigm in such a way that it would be impossible for a human to do.
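To make that less abstract, here is a toy version of that adjust-everything-and-listen loop (my own sketch, not Google's actual code; it fits only harmonic amplitudes, by nudging each knob and measuring the resulting spectral error):

```python
import numpy as np

SR, F0, N_HARM, N = 16000, 220.0, 20, 4096
t = np.arange(N) / SR
# one sine oscillator per harmonic; the "parameters" are their amplitudes
HARMONICS = np.sin(2 * np.pi * F0 * np.arange(1, N_HARM + 1)[:, None] * t)
WINDOW = np.hanning(N)

def spectrum(amps):
    """Magnitude spectrum of the synth output for a given parameter set."""
    return np.abs(np.fft.rfft((amps @ HARMONICS) * WINDOW)) / N

# Pretend this is the analyzed target instrument (sawtooth-like rolloff).
target = spectrum(1.0 / np.arange(1, N_HARM + 1))

def loss(amps):
    return np.sum((spectrum(amps) - target) ** 2)

amps = np.random.rand(N_HARM)   # "all the parameters are wrong"
lr, eps = 2.0, 1e-4
for step in range(300):
    # "hit a note, make some changes": estimate each knob's effect on the
    # error by nudging it slightly (finite differences), then step downhill
    base = loss(amps)
    grad = np.zeros(N_HARM)
    for i in range(N_HARM):
        bumped = amps.copy()
        bumped[i] += eps
        grad[i] = (loss(bumped) - base) / eps
    amps -= lr * grad

print("final spectral error:", loss(amps))
```

The real system replaces the hand-rolled nudging with gradients computed through a neural network, and the loss compares spectra at several time scales, but the shape of the procedure is the same: render, compare, adjust, repeat.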

Post

perpetual3 wrote: Thu Oct 22, 2020 11:37 am .../... imagine you had enough arms, hands, and fingers to adjust the hundreds of oscillators, filters, etc. (and their corresponding parameters) required to "realistically" synthesize the sound (violin, sax). .../... And then you adjust all the parameters at once, hit a note, make some changes, hit a note, make some more changes, until the patch sounds like a violin or sax or whatever. That is what this software is doing. Even though it's based on how a human could synthesize a sound, it extends the paradigm in such a way that it would be impossible for a human to do.
Bunch of bullshit. Have you ever actually used a synth, or a sampler? Any sampler can realistically reproduce the sound of a violin, a sax, a trumpet, or whatever. And a violin player doesn't need to be an octopus to play a violin. That's nonsense.

Actually, what usually fails when trying to mimic an acoustic instrument is the real-time variation of pitch, timbre, volume, attack articulations, phrasing, etc. that a violinist performs. No AI will be able to do that, because it changes from note to note, from piece to piece, etc. And the examples on the site are something to laugh about. They don't sound at all like a real instrument. They "resemble" the real thing, but much more distantly than any good sample library does.

All the rest you wrote is plain mumbo jumbo. No, it's not like you say, at all. I can synthesize a realistic violin sound, or sax sound. If I use a sampler, I can get a VERY realistic sound of a violin or a sax. The problem is, it plays realistically only if I respect all the idiosyncrasies of THAT instrument. Otherwise, it will sound synthetic and artificial (as the examples on the site did).
Fernando (FMR)

Post

Nice idea... but I just can't figure out any situation where I'd want to turn my fart into a bunch of violins.

At least, any situation that has something to do with music.

Post

fmr wrote: Thu Oct 22, 2020 2:00 pm Bunch of bullshit. Have you ever actually used a synth, or a sampler? Any sampler can realistically reproduce the sound of a violin, a sax, a trumpet, or whatever. And a violin player doesn't need to be an octopus to play a violin. That's nonsense.

Actually, what usually fails when trying to mimic an acoustic instrument is the real-time variation of pitch, timbre, volume, attack articulations, phrasing, etc. that a violinist performs. No AI will be able to do that, because it changes from note to note, from piece to piece, etc. And the examples on the site are something to laugh about. They don't sound at all like a real instrument. They "resemble" the real thing, but much more distantly than any good sample library does.

All the rest you wrote is plain mumbo jumbo. No, it's not like you say, at all. I can synthesize a realistic violin sound, or sax sound. If I use a sampler, I can get a VERY realistic sound of a violin or a sax. The problem is, it plays realistically only if I respect all the idiosyncrasies of THAT instrument. Otherwise, it will sound synthetic and artificial (as the examples on the site did).
First - don't shoot the messenger.

Second - are you familiar with the way a deep neural network works? Input layers, hidden layers, weights, backpropagation, cost functions, activation functions, etc.? The linear algebra, calculus, differential equations, etc.? All I was trying to do was explain, with a metaphor, roughly what is going on (I've put a tiny worked example at the end of this post), because without understanding the mathematics and computer science that make tone match possible, it's hard to explain it clearly in ordinary language. That's one of the "problems" of artificial intelligence and machine learning, and many scientists and mathematicians argue it can never be made as explainable as we would wish, even if you do understand the math. And the crazy thing is, your own description of how you synthesize a violin or sax is, in essence, what I described: you have a model of the violin sound in your head, and, given a synth with enough modules (whatever that means in your case), you can generate that sound even if all the parameters of your modules start in a random state. But if the sound is complex enough, it eventually becomes really difficult, for nearly everyone, to synthesize a physical sound from elementary modules or DSP elements, because humans have limitations (two eyes, two arms, ten fingers, difficulty paying attention to more than one thing at once, etc.).

Third - with all due respect, your opinion on whether your synthesized violin or sax sounds "natural" or "realistic" is not representative of a generally accepted (or valid) judgment. Your opinion and experience are not sufficient to form a valid generalization.

Fourth - sure, go ahead and use a sample when you want something as close as possible to the physical instrument; it's a recording, after all.

Fifth and finally - I wrote this message and my previous one in good faith. I've avoided profanity and haven't tried to imply, even indirectly, that you're an idiot. Good day.
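P.S. As promised in my second point, here is a minimal toy example of those terms: one hidden layer, tanh activation functions, a mean-squared-error cost function, and the backpropagation updates written out by hand. It has nothing to do with the actual tone transfer model; the task and the numbers are invented purely to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: map a (pitch, loudness) pair to 8 harmonic amplitudes.
# The "ground truth" mapping is made up just to have something to fit.
X = rng.uniform(0.0, 1.0, size=(256, 2))
Y = np.tanh(X @ rng.normal(size=(2, 8)))

W1 = rng.normal(scale=0.5, size=(2, 16))   # input layer -> hidden layer
b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 8))   # hidden layer -> output layer
b2 = np.zeros(8)

lr = 0.1
for epoch in range(2000):
    # forward pass through the activation function
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - Y
    cost = np.mean(err ** 2)                # the cost function

    # backpropagation: apply the chain rule layer by layer
    d_pred = 2.0 * err / err.size
    dW2, db2 = h.T @ d_pred, d_pred.sum(axis=0)
    d_h = (d_pred @ W2.T) * (1.0 - h ** 2)  # derivative of tanh
    dW1, db1 = X.T @ d_h, d_h.sum(axis=0)

    # gradient-descent weight updates
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final cost:", cost)
```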

Post

perpetual3 wrote: Thu Oct 22, 2020 2:52 pm .../... if the sound is complex enough, it eventually becomes really difficult, for nearly everyone, to synthesize a physical sound from elementary modules or DSP elements, because humans have limitations (two eyes, two arms, ten fingers, difficulty paying attention to more than one thing at once, etc.).
That's where your line of thinking fails. Synthesizing the sound isn't "that" difficult. Actually, it was done in the sixties by Jean-Claude Risset, using acoustic analysis and additive synthesis via the MUSIC language created by Max Mathews. It has nothing to do with human limitations, since we can program one element at a time and then trigger most of them with relatively simple controls. That's exactly what we do when we play a real musical instrument. It seems to me you're not a musician and don't play a real instrument. A real instrument is "programmed" in real time every time a note is produced, and lots of "elementary modules" are activated, with just our own arms and hands, by means of the instrument's specific playing techniques (as a crude illustration, see the sketch at the end of this post).
perpetual3 wrote: Thu Oct 22, 2020 2:52 pm Third - with all due respect, your opinion on whether your synthesized violin or sax sounds "natural" or "realistic" is not representative of a generally accepted (or valid) judgment. Your opinion and experience are not sufficient to form a valid generalization.
My opinion is informed, as it comes from someone with long experience using, and performing with, real instruments. It certainly is representative. You may value it more or less; I don't care.
perpetual3 wrote: Thu Oct 22, 2020 2:52 pm I wrote this message and my previous one in good faith. I've avoided profanity and haven't tried to imply, even indirectly, that you're an idiot. Good day.
And I am not implying you're an idiot either. Sorry if it looked that way. It's just that you're missing some information, for example the playing techniques of each instrument. You see, I don't need to understand neural networks to know how a violin or a sax works and produces sound, and certainly not to recognize whether a result is realistic. You are valuing the process, about which I couldn't care less. I am interested solely in the results.
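P.S. A crude illustration of one simple control "programming" many elementary modules at once (my own toy code, not Risset's actual MUSIC V patch): a single velocity value sets the level, brightness, and attack of sixteen partials together, the way one bow stroke or key strike does on a real instrument.

```python
import numpy as np

SR = 16000
N_PART = 16

def strike(f0, velocity, dur=1.0):
    """One 'note on': a single velocity control programs all partials.

    Harder strikes come out brighter (slower harmonic rolloff) and
    snappier (shorter attack), loosely like a struck or plucked string.
    """
    n = int(SR * dur)
    t = np.arange(n) / SR
    rolloff = 2.0 - velocity                 # velocity in 0..1
    attack = 0.05 * (1.0 - velocity) + 0.002
    out = np.zeros(n)
    for k in range(1, N_PART + 1):
        amp = velocity / k ** rolloff        # per-partial level
        # higher partials decay faster, as in most acoustic instruments
        env = np.minimum(t / attack, 1.0) * np.exp(-3.0 * k * t / dur)
        out += amp * env * np.sin(2 * np.pi * k * f0 * t)
    return out / N_PART

note = strike(f0=220.0, velocity=0.8)  # one gesture, many "modules"
```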
Fernando (FMR)
