KVR Audio

Music Engineer · Post by **Music Engineer** » Fri Jun 10, 2022 10:09 pm

mystran wrote: Fri Jun 10, 2022 8:42 pm I think you can get better result with one of the smooth functions like swish.

Aha! An interesting function:

https://www.desmos.com/calculator/i5md2q79ax

...swish - I've never heard about that before. I scaled it by 2 such that at b=0, it becomes the identity. Could be useful for general waveshaping purposes, too.
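For reference, a minimal C++ sketch of the scaled version as described above: the usual swish x * sigmoid(b * x), multiplied by 2 so that b = 0 gives the identity. The name `scaledSwish` is made up for illustration, and the formula is my reading of the description, not copied from the Desmos graph.

```cpp
#include <cmath>

// Swish scaled by 2: f(x) = 2 * x * sigmoid(b * x) = 2x / (1 + exp(-b x)).
// At b = 0 the sigmoid is exactly 0.5, so f(x) = x (identity).
// For large positive b it approaches 2 * max(x, 0), i.e. a scaled ReLU.
inline float scaledSwish(float x, float b)
{
    return 2.0f * x / (1.0f + std::exp(-b * x));
}
```

Used as a waveshaper, b would then act as a single "amount" control between clean (b = 0) and a rectifier-like curve.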

The little I've played around with NNs I just wrote the training code from scratch in C++ (partially to gain better understanding of what is going on)

Yes - same here

I just wanted to prove to myself that I actually understand backprop well enough. I generally tend to think that the best way to actually understand a math concept is to "explain" it to a computer. Its unforgivingness really forces you to get all the details right.

QuadrupleA · Post by **QuadrupleA** » Fri Jun 10, 2022 10:23 pm

I owe some other replies but quick update on the phasor / sin+cos, just hacked it into the trainer & synth and it definitely solves the discontinuity nicely:

Can also confirm feeding -1..1 inputs is no prob - the neural net stuff on this one is my own C/C++ code, had the same feeling as you guys there to try and "understand by implementing" first.
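A rough sketch of the phasor idea, assuming the oscillator keeps a wrapped phase in [0, 1): instead of feeding the raw phase to the net, feed the sine and cosine of it, so the start and end of the cycle land on the same input point. `PhasorInputs` and `phasorInputs` are hypothetical names, not taken from the actual synth code.

```cpp
#include <cmath>

// Two network inputs that trace the unit circle as the phase runs 0 -> 1.
// Both stay within -1..1, and phase 0 and phase 1 map to the same point,
// so the input itself has no jump when the phase wraps.
struct PhasorInputs { float s; float c; };

inline PhasorInputs phasorInputs(float phase)
{
    const float twoPi = 6.28318530717958647692f;
    return { std::sin(twoPi * phase), std::cos(twoPi * phase) };
}
```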

Haven't experimented with it extensively (I should get some other stuff done today) but so far the damages seem a little more subjectively "boring", about 90% just pushing parts of the wave up and down:

mystran · Post by **mystran** » Fri Jun 10, 2022 11:06 pm

[...]

mystran · Post by **mystran** » Fri Jun 10, 2022 11:07 pm

If the phasor turns out to be too boring (though looking at the animations they look like the sort of thing that might sound quite musical anyway), perhaps there's another continuous parameterization that might work better (eg. perhaps have even more inputs and have them cycle with a looping B-spline basis or something like that).

ahanysz · Post by **ahanysz** » Sat Jun 11, 2022 12:16 am

Fascinating concept! I'd be very interested in a Linux version. (I tried running the dll on Linux using LinVST, and it crashed without showing any useful error messages.)

QuadrupleA wrote: Wed Jun 08, 2022 4:59 pm ...So think I'm going to shelve the project.

This would be a shame, but your position makes a lot of sense.

Some options to consider:

If you're not going to develop it further, would you consider releasing the source code?
Contact the Surge Synth Team and see if there's a way of building a "braindamage oscillator" into Surge?
Develop it as a VCV Rack module. This can be done as closed source if you want, either free or paid. The benefit is that you only need to develop it as a waveform generator, and people can plug in other modules for whatever type of modulations and effects they want. So it could be used as part of a fully featured synth without you spending effort on developing those other features.

imrae · Post by **imrae** » Sat Jun 11, 2022 2:21 am

QuadrupleA wrote: Fri Jun 10, 2022 5:28 pm
imrae wrote: I suggested this on page 1, as well as morphing between pre-trained coefficients
You're right, sorry

Not offended, just thought it was funny that it happened twice

Thanks again for sharing your very cool work at an early stage.

imrae wrote: I’ve been thinking it would be a lot of fun to try and implement this algorithm as a Eurorack VCO.
Cool idea. I did come across this Eurorack module while researching other neural net music projects:

https://www.analogueresearch.com/produc ... al-network

The whole module is just two analog neurons though so you'd need a big rack

Well I was thinking digital but... hmm... maybe a small network of digipots and diode clippers can be trained

I have some other projects to finish first but will post here if I do try something related!

mystran · Post by **mystran** » Sat Jun 11, 2022 9:37 am

mystran wrote: Fri Jun 10, 2022 8:15 pm The papers suggest just branching when abs(x1-x0) < epsilon, though another possibility is to let a=copysign(epsilon,x1-x0) and rewrite as (F(x1)-F(x0)+a*f((x1+x0)/2))/(x1-x0+a) which is a bit more expensive, but will reach the limit smoothly.

Actually upon thinking about this for about 5 more minutes, here's another (better) strategy that simply forces the integration interval to a non-zero value (observing that it doesn't matter which direction we integrate):

let x=(x1+x0)/2, delta=(x1-x0)/2 and alpha=sqrt(delta^2+eps^2),
then rewrite as (F(x+alpha)-F(x-alpha))/(2*alpha).

Practically speaking alpha=abs(delta)+eps also works (probably just as well when eps is small), but the sqrt() version kinda feels more elegant (in the theoretical sense at least) to me.
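As a sketch of how that might look in C++, here it is with tanh as the example nonlinearity (antiderivative log(cosh(x))). The function names and the default eps = 1e-3 are assumptions picked for illustration; a real implementation would want eps tuned against the working precision.

```cpp
#include <cmath>

// Antiderivative of tanh(x): F(x) = log(cosh(x)), written in an overflow-safe form.
inline double tanhAD(double x)
{
    const double ax = std::fabs(x);
    return ax + std::log1p(std::exp(-2.0 * ax)) - std::log(2.0);
}

// First-order ADAA of tanh with a forced non-zero integration half-width.
// x0 = previous input sample, x1 = current input sample, eps = small positive constant.
inline double tanhADAA(double x0, double x1, double eps = 1e-3)
{
    const double x     = 0.5 * (x1 + x0);                       // interval midpoint
    const double delta = 0.5 * (x1 - x0);                       // signed half-width
    const double alpha = std::sqrt(delta * delta + eps * eps);  // never zero
    return (tanhAD(x + alpha) - tanhAD(x - alpha)) / (2.0 * alpha);
}
```

When x0 == x1 this reduces to a central difference of F with step eps, so it returns approximately tanh(x) without any branch.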

QuadrupleA · Post by **QuadrupleA** » Sat Jun 11, 2022 10:28 pm

mystran wrote: The basic idea with the 1st order version is that if we "reconstruct" the signal in continuous time with a triangular kernel (which results in linear interpolation between the current and previous sample values) and we put the resulting linear slope through a non-linear function f(x) and then filter the result with a box-filter before resampling (ie. average over the sampling period) then we can compute the whole thing "analytically" from the two sample values by taking a definite integral as (F(x1)-F(x0))/(x1-x0) where F(x) is the antiderivative (=indefinite integral) of f(x) and x0, x1 are the previous and current sample values.

Thanks mystran, helpful. Had some time to dig into ADAA a bit today and work through that notebook. I was confused earlier thinking it applies to time-domain functions that map time to amplitude, e.g. volume = Sawtooth(t), but looks like it's more about waveshaping functions that map amplitude -> amplitude, like hard clipping, tanh saturation, etc.
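To make the amplitude -> amplitude case concrete, here is a rough sketch of the 1st order formula quoted above applied to a hard clipper, with the simple epsilon branch for the near-equal case. The names and the epsilon default are mine, chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>

// Hard clipper f(x) = clamp(x, -1, 1).
inline double hardClip(double x) { return std::max(-1.0, std::min(1.0, x)); }

// Its antiderivative F(x): x^2/2 inside the linear region, |x| - 1/2 outside.
inline double hardClipAD(double x)
{
    const double ax = std::fabs(x);
    return (ax <= 1.0) ? 0.5 * x * x : ax - 0.5;
}

// First-order ADAA: average of f along the straight line from x0 to x1.
// Falls back to f at the midpoint when the two samples are nearly equal.
inline double hardClipADAA(double x0, double x1, double epsilon = 1e-6)
{
    const double dx = x1 - x0;
    if (std::fabs(dx) < epsilon)
        return hardClip(0.5 * (x1 + x0));
    return (hardClipAD(x1) - hardClipAD(x0)) / dx;
}
```

Here x0 and x1 are the previous and current input amplitudes, not times, which is the distinction described above.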

While under the time delusion I tried it with a simple descending sawtooth function, but aliasing noise seemed the same with or without the "ADAA":

Code: Select all

    // Sawtooth func:   1.0f - 2.0f * phase;
    // SawtoothAD func: phase - phase * phase;
    oscSample = (SawtoothAD(x1) - SawtoothAD(x0)) / (x1 - x0);

Not sure the AD there properly accounts for the discontinuity when phase wraps (sort of a dirac delta / impulse, not sure if that's even integrable).

So yeah, still curious but a little stumped how / whether it'd apply to the neural net with its time input. The activation functions are more mapping time to amplitude in this configuration than amplitude to amplitude like a waveshaper, at least in the earlier neurons.

Intuitively it makes sense in time-domain though, a definite integral is the "area under the graph" for a pair of t's / samples, so if you divide that by dt (length between samples) you get the average of the infinitesimal timeslices under the graph, which should give you the blurring/averaging you want for eliminating aliasing at larger sample intervals. But doesn't seem to work in practice with my saw experiment. Maybe because of the discontinuity when phase wraps from 0 to 1? Not sure what the AD of that impulse / discontinuity would be.

mystran wrote: Also.. ReLU might not be the best choice for activation when doing function approximation, because it'll essentially force your network to learn a piecewise linear approximation.

You're right - I always assumed ReLU could result in quadratic curvature with 2 layers, cubic with 3 etc. but that's not the case, weights just multiply by a constant (change slope) and biases just adjust the y offset, so linear stays linear no matter how many passes.

My original experiments were all sigmoid, but the exp() is a little expensive (and the two exp()'s in the derivative during training) and the damage characteristics for ReLU seemed more dramatic / interesting. Curious to experiment with tanh, swish, etc.
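On the sigmoid cost: the derivative can be written in terms of the output itself, so if the trainer keeps each neuron's forward activation around (an assumption about how it's structured), backprop needs no extra exp() calls. A minimal sketch with hypothetical names:

```cpp
#include <cmath>

// Logistic sigmoid: one exp() per call in the forward pass.
inline float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Derivative expressed through the already-computed output y = sigmoid(x):
// d/dx sigmoid(x) = y * (1 - y), so no exp() is needed here.
inline float sigmoidDerivFromOutput(float y) { return y * (1.0f - y); }
```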

QuadrupleA · Post by **QuadrupleA** » Sun Jun 12, 2022 3:07 am

imrae wrote: Thanks again for sharing your very cool work at an early stage.

No prob

Thanks for the ideas & checking it out.

ahanysz wrote: Fascinating concept! I'd be very interested in a Linux version. (I tried running the dll on Linux using LinVST, and it crashed without showing any useful error messages.)

Thanks - yeah, I should try that, I've got a few Linux machines around. On crashes it should dump to an Error.log file alongside the DLL, if you see that in there let me know. Suspect it's the OpenGL context creation (like RunBeerRun was probably running into, no pun intended). I'm using Windows raw wgl* functions there but there are more compatible approaches, might switch to SDL or maybe Juce on that.

Good suggestions for open sourcing or Surge / VCV rack module, would be good to do that before I abandon it - been having fun noodling / learning / trying ideas from the pros here, so might keep it going in hobby-mode for a while, but don't know yet how far I'll take it.

ahanysz · Post by **ahanysz** » Sun Jun 12, 2022 3:31 am

QuadrupleA wrote: Sun Jun 12, 2022 3:07 am On crashes it should dump to an Error.log file alongside the DLL, if you see that in there let me know.

No Error.log file for me, but I figured out how to view some error text on the screen. I've emailed you the details. Thanks!

QuadrupleA · Post by **QuadrupleA** » Sun Jun 12, 2022 3:43 am

ahanysz wrote: No Error.log file for me, but I figured out how to view some error text on the screen. I've emailed you the details. Thanks!

Awesome, thanks! That email was super helpful. Looks like it's crashing on a null pointer dereference in the OpenGL DLL. I think I'll switch the synth over to SDL, they've already sorted out the best ways to do context creation stuff cross-platform.

My past Windows releases with the underlying game engine were all DirectX (GL was just for a web build) so got some GL compatibility lessons to learn

2DaT · Post by **2DaT** » Sun Jun 12, 2022 1:43 pm

QuadrupleA wrote: Sat Jun 11, 2022 10:28 pm My original experiments were all sigmoid, but the exp() is a little expensive (and the two exp()'s in the derivative during training) and the damage characteristics for ReLU seemed more dramatic / interesting. Curious to experiment with tanh, swish, etc.

If performance becomes a problem, you could use vectorization. It is especially effective when you need to compute a lot of transcendental functions in parallel.

I wrote some vector functions that can be helpful in that.

exp
tanh

QuadrupleA · Post by **QuadrupleA** » Mon Jun 13, 2022 3:22 pm

2DaT wrote: Sun Jun 12, 2022 1:43 pm If performance becomes a problem, you could use vectorization. It is especially effective when you need to compute a lot of transcendental functions in parallel.

I wrote some vector functions that can be helpful in that.

exp
tanh

Nice! exp_mp() compiles successfully under LLVM (release build). Not MSVC (debug build) but probably just some compiler options / #defines to enable those intrinsics.

Curious to experiment with more optimizations. LLVM is pretty good about auto-vectorizing in the IR / ASM output, e.g. a loop with variable length gets split into 4 or 8-at-a-time sections when length >= 4 and then a wrapup branch for the last 1-3. But would probably pay to explicitly structure things that way and maybe bring in intrinsics for SSE instructions etc.
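Roughly what "explicitly structuring things that way" could look like for a neuron's weighted sum, in plain C++ without intrinsics: a 4-at-a-time main loop with independent accumulators, then a scalar wrap-up. `dotChunked` is a hypothetical name, and whether this actually beats the straightforward loop depends on the compiler and flags, so it would need measuring.

```cpp
#include <cstddef>

// Dot product written in explicit 4-wide chunks plus a scalar tail,
// mirroring the shape the auto-vectorizer tends to produce.
inline float dotChunked(const float* a, const float* b, std::size_t n)
{
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)           // main body: 4 independent accumulators
    {
        acc0 += a[i + 0] * b[i + 0];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }
    float sum = (acc0 + acc1) + (acc2 + acc3);
    for (; i < n; ++i)                   // wrap-up: last 0-3 elements
        sum += a[i] * b[i];
    return sum;
}
```

Note the summation order differs from a simple left-to-right loop, so results can differ in the last few bits.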

BrainDamage is just single-threaded at the moment (or 2-threaded, UI and audio, with some critical sections to synchronize parameters) so could maybe exploit more threads in the NN feedforward evaluation, other polyphony voices etc. Although in a DAW with other plugins I feel like it's better to just let the other plugins use other cores and be speedy with single-threaded perf. So vectorization would be good there.

Itching to work more on this but got a looming house move at the end of the month so probably need to hold off for a few weeks...

Tj Shredder · Post by **Tj Shredder** » Tue Jun 14, 2022 5:31 am

This concept seems ideal to throw at the neural processors of an M1 in the new Macs…

mystran · Post by **mystran** » Tue Jun 14, 2022 7:24 am

Tj Shredder wrote: Tue Jun 14, 2022 5:31 am This concept seems ideal to throw at the neural processors of a M1 in the new Macs…

Doesn't seem like there is a public API for this, plus what I can dig up (eg. some reverse engineered details) suggests that it might be one of those "reduced precision" affairs that run 8-bit integers or 16-bit float, which is pretty common for machine learning (since it's generally fine for many classification and image processing tasks). Unfortunately such low precision is basically useless for audio purposes and you'd generally want at least 32-bit floats (which is basically pure waste of silicon for most other ML tasks).

BrainDamage, a neural network synthesizer - feedback welcome