KVR Audio

TW5011 · Post by **TW5011** » Tue Oct 27, 2020 5:16 am

There's a lot of potential in this, though I realize it's basically a tech demo of where things currently stand.

But since someone mentioned farting in music, such a thing does exist, albeit in a more "musical" way than the video above.

There's actually a whole EP of that, with the genre of "Stank". I wonder about it, but there's probably not much to say about it -- either you find it funny or you don't. I think it's funny at the right time, but it's not going to be in my regular playlist...

elnn · Post by **elnn** » Tue Oct 27, 2020 11:12 am

If the sole aim of physical modelling were exact reproduction of physical instruments, it would be in trouble, doomed from the get-go. Fortunately, 'physical modelling' is just a moniker for a methodology: it takes the physical properties involved in how real objects produce sound as inspiration for building mathematical systems that calculate interactions between virtual objects, approaching equivalence to physical interactions. Its aim is not the reproduction of timbre, and its capabilities lie elsewhere.

Now a machine learning program that would learn to somehow calculate and model these interactions, rather than just the imprint in the frequency domain (which does not involve a physical instrument per se, but only its audio sample), and, as a further step, to allow the user to manipulate some of the parameters, that would be the real end to physical modelling synthesis. And its new birth.

I'd make an analogy between Deepfakes and 3D modelling. Deepfakes involve restructuring certain data from an input, while 3D modelling is entirely generative, i.e. it needs no external input. I get the impression that the illusion of Deepfakes feels more authentic than 3D modelling. These two methods, I guess, could be combined, so a 3D-modelled face could be swapped with a Deepfake for a more authentic feeling. And I'd guess the same would be possible in the audio realm too. What I want to say is that this tone transfer thing could go very well together with physical modelling: generating a sound through physical modelling and then adding a further dimension of nuance and control through tone transfer.
What I imagine is new methods in convolution.

teilo · Post by **teilo** » Tue Oct 27, 2020 7:38 pm

elnn wrote: Tue Oct 27, 2020 11:12 am If the sole aim of physical modelling was exact reproduction of physical instruments, then it would be in some trouble. It would be doomed from the get-go. Gladly, 'physical modelling' is just a moniker for a methodology that takes physical properties involved in the production of a sonic imprint of real objects as an inspiration for a method for making mathematical systems that could calculate interactions between virtual objects that would approach equivalence to physical interactions. Its aim is not reproduction of timbre and its capacities are different from that.

Tell that to Modart.

elnn · Post by **elnn** » Tue Oct 27, 2020 9:01 pm

teilo wrote: Tue Oct 27, 2020 7:38 pm
Tell that to Modart.

I did not say it can't model a sonic imprint. It can. But you can't hit a Modart piano with a hammer. That ugly situation would merit modelling too, and their current products are hardly of much help there.

Furthermore, physical modelling can just as well not model a real imprint at all. You can disregard actual physical possibility.

cron · Post by **cron** » Wed Oct 28, 2020 7:20 am

A bit late to the party here, but I've had my eye on this since someone here linked to it with a less baity thread title that didn't pull half as much attention/vitriol.

Neural networks are notoriously difficult to analyse or learn from because you can't really look inside them to see what they're doing, but this project seems to sidestep some of those problems. By using the network to drive a 'known' framework (Xavier Serra's SMS Tools in this case), i.e. one where the neural network drives known parameters rather than itself deciding what the parameters should be, it becomes easier to disentangle and modify various aspects of the synthesis, as spectacularly demonstrated in the voice to violin audio examples.
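
To make the "network drives known parameters" idea a bit more concrete, here's a rough sketch of the kind of synth that could sit at the receiving end. It is not the project's actual code or the SMS Tools API; the function name `harmonic_synth`, the 16 kHz sample rate and the hand-written control curves are all my own assumptions, standing in for whatever a trained network would predict.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumed value for this sketch


def harmonic_synth(f0_hz, amplitude, harmonic_weights, sample_rate=SAMPLE_RATE):
    """Render a sum of harmonically related sinusoids (hypothetical helper).

    f0_hz:            (n_samples,) fundamental frequency in Hz
    amplitude:        (n_samples,) overall loudness envelope
    harmonic_weights: (n_harmonics,) relative level of each harmonic
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    weights = np.asarray(harmonic_weights, dtype=float)
    weights = weights / weights.sum()  # normalise so `amplitude` sets the level

    # Integrate instantaneous frequency to get the phase of the fundamental
    phase = 2.0 * np.pi * np.cumsum(f0_hz) / sample_rate
    harmonic_numbers = np.arange(1, len(weights) + 1)

    # Silence any harmonic that would land above Nyquist
    audible = (harmonic_numbers[:, None] * f0_hz[None, :]) < sample_rate / 2
    partials = np.sin(harmonic_numbers[:, None] * phase[None, :]) * audible
    return amplitude * (weights[:, None] * partials).sum(axis=0)


# Hand-written controls standing in for what a network would predict
n = SAMPLE_RATE                       # one second of audio
f0 = np.linspace(220.0, 440.0, n)     # glide up an octave
amp = np.linspace(1.0, 0.0, n)        # linear fade out
weights = 1.0 / np.arange(1, 11)      # 10 harmonics with a 1/k rolloff
audio = harmonic_synth(f0, amp, weights)
```

The point is that pitch, loudness and harmonic balance are separate, readable inputs, so you can swap or edit any one of them without retraining anything.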

It's also a lot lighter when the parameter space is known and well suited to the task at hand, which matters particularly when asking neural networks to deal with audio. Although OpenAI's Jukebox has shown spectacular results in 'directly' synthesising complete tracks since this work was last posted, neural networks struggle with long, serial, largely unpredictable chains of values (e.g. the 44100 samples per second of CD quality audio; IIRC Jukebox employed a multiresolution approach).

It's only really in the age of music information retrieval (MIR) that scientific classification of timbre has become a mainstream research concern. MIR underpins, for instance, Siri or YouTube's Content ID algorithm being able to identify tracks even if they've been transformed in some way (e.g. sped up or pitch shifted). According to Roads, MPEG-7 from the mid-00s was the first serious attempt to exhaustively describe timbre in purely scientific, measurable terms. While such research has mainly been focused on analysis (which can still be useful to the music producer, as recent plug-ins like XO and Atlas have shown), my hope is that projects like this will start moving those ideas into synthesis and audio processing.

As such, for me the most interesting aspect of this project relates to parameterisation: how might we actually apply all this research into timbre in a controllable way? We could attempt to scientifically define the perceptual quality of brightness as being "the difference between the fundamental frequency and the spectral centroid" but this doesn't hold for all sounds (even for similar sounds produced by the same instrument) because perception is messy and stubbornly human. A neural network trained on a corpus whose perceptual qualities have been rated by humans may, ironically, do a much better job of mapping subjective descriptions of timbre to scientific descriptions of timbre than humans can. We're quite a way from it now, but projects combining timbre research and machine learning point toward a future where, with the right synthesis framework under the hood, perceptual descriptors like "brightness" or "warmth" become single-knob modifiers that produce plausible results on any sound.
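
For anyone who wants to poke at that naive brightness definition, here's a minimal sketch. The helper names `spectral_centroid` and `brightness`, the Hann window and the two test tones are my own assumptions, and it assumes the fundamental is already known rather than estimated. As said above, a number like this won't track perceived brightness on every sound.

```python
import numpy as np


def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency of the signal, in Hz."""
    windowed = signal * np.hanning(len(signal))
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float((freqs * magnitudes).sum() / magnitudes.sum())


def brightness(signal, sample_rate, f0_hz):
    """Naive brightness: spectral centroid minus a known fundamental, in Hz."""
    return spectral_centroid(signal, sample_rate) - f0_hz


sample_rate = 44100
t = np.arange(sample_rate) / sample_rate  # one second
f0 = 220.0

# A pure sine versus a tone with eight harmonics at 1/k amplitude
dull = np.sin(2 * np.pi * f0 * t)
bright = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 9))

print("dull:  ", brightness(dull, sample_rate, f0))
print("bright:", brightness(bright, sample_rate, f0))
```

The sine comes out near zero and the harmonically rich tone well above it, which is the easy case. The interesting failures start when you feed it noisy or inharmonic material.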

nunrgguy · Post by **nunrgguy** » Fri Mar 12, 2021 1:53 pm

"it's not real music , all you do is push a button and let the computer do it all for you.

blah blah blah"

some boring old c*nt - 1976

WasteLand · Post by **WasteLand** » Fri Mar 12, 2021 2:11 pm

didn't read all, read the beginning, and see how this thread, well, see for yourself. so i glanced at things..

ok, last post. nice.

o well nietzsche and spectralayers pro in one thread, about physical modelling...

(by the way, it is always attributed to nietzsche, but it was already said by hegel, and of course by authors, i think french ones, of the enlightenment... nietzsche only states the meaning of it... and in a way immanuel kant did the trick already, intentionally???)

mmmmmh, how do; oh, i have visitors; physical, i hope i can model them...

Erisian · Post by **Erisian** » Fri Mar 12, 2021 2:23 pm

When I saw the thread title, my heart leapt at the thought of taking a real instrument sample and somehow converting it into instructions for an oscillator. Having read the first two pages, I have lost interest in what seems to be a pointless waste of time. It's a long thread and I haven't read it all, so forgive me if I am wrong.

Physical modeling is dead! Google tone transfer is here ...