Richard_Synapse wrote: What kind of speedup does this advanced algorithm give? The Fuzz Face I built with Halite needs about 100% CPU at 44.1 kHz, so I'm curious just how low it could be pushed without simplifications.
There are basically three main things that can speed it up a lot:
1. we rebuild and solve the matrix from scratch every iteration, even though most of its entries are constants that could be pre-calculated just once; this is done for simplicity, but it's really inefficient
2. related to the previous point, we solve the matrix in the order it was specified (ignoring the partial pivoting), even though we could re-order it to further maximise the amount of work that can be pre-calculated and minimise the number of nodes we have to solve when we only want a partial solution (i.e. outputs, new capacitor voltages, etc.)
3. looping through an actual matrix with generic code is just very slow in general; once you have a pivoting order that works (which in practice is almost always good enough to keep forever), you can make it a lot faster by simply compiling the actual math into straight-line machine code
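To make points 1 and 2 a bit more concrete: if the system is partitioned so that the constant (resistor-only) entries sit in one block and the nonlinear entries in another, the constant block can be factored once up front and only a small Schur complement needs solving per Newton iteration. Here's a minimal numpy sketch with a made-up 4-node system where only a single diagonal entry `d` changes per iteration; the matrices and the `solve_step` helper are purely illustrative, not taken from the actual project:

```python
import numpy as np

# Hypothetical system [[K, B], [C, d]]: the 3x3 block K is constant
# (linear elements only), and only the scalar d changes per Newton step.
K = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
B = np.array([[1.0], [0.0], [2.0]])
C = np.array([[1.0, 0.0, 2.0]])

# Done once, outside the Newton loop: everything that doesn't depend on d.
K_inv_B = np.linalg.solve(K, B)        # K^{-1} B
schur_const = (C @ K_inv_B)[0, 0]      # C K^{-1} B

def solve_step(d, b_top, b_bot):
    """Per-iteration solve of [[K, B], [C, d]] x = [b_top; b_bot].

    Only the Schur-complement scalar (d - C K^{-1} B) and the two small
    substitutions below depend on the changing entry.  A real
    implementation would also cache an LU factorisation of K instead of
    calling np.linalg.solve (which re-factors) every time."""
    K_inv_b = np.linalg.solve(K, b_top)
    x_bot = float((b_bot - C @ K_inv_b) / (d - schur_const))
    x_top = K_inv_b - K_inv_B[:, 0] * x_bot
    return x_top, x_bot
```

The same idea generalises: the more of the matrix you can push into the constant block (which is what the re-ordering in point 2 is for), the less work is left per iteration.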
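And to illustrate point 3: once the pivot order has been recorded, all the loops and the pivot search can be flattened into straight-line arithmetic. A rough Python sketch of the two paths, using a hypothetical 3x3 diagonally dominant system whose recorded order happens to need no row swaps (in practice a code generator would emit the unrolled version, ideally as C or machine code, rather than anyone writing it by hand):

```python
def solve_generic(A, b):
    """Textbook Gaussian elimination with partial pivoting.

    This is the slow reference path: nested loops, a pivot search in
    every column, and index arithmetic on every access."""
    A = [row[:] for row in A]
    b = b[:]
    n = len(b)
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / A[i][i]
    return x

def solve_unrolled(a, b):
    """Straight-line solve for one specific 3x3 system whose recorded
    pivot order needs no row swaps: LU factorisation, forward and back
    substitution fully unrolled, no loops and no pivot search left."""
    l10 = a[1][0] / a[0][0]
    l20 = a[2][0] / a[0][0]
    u11 = a[1][1] - l10 * a[0][1]
    u12 = a[1][2] - l10 * a[0][2]
    l21 = (a[2][1] - l20 * a[0][1]) / u11
    u22 = a[2][2] - l20 * a[0][2] - l21 * u12
    y0 = b[0]
    y1 = b[1] - l10 * y0
    y2 = b[2] - l20 * y0 - l21 * y1
    x2 = y2 / u22
    x1 = (y1 - u12 * x2) / u11
    x0 = (y0 - a[0][1] * x1 - a[0][2] * x2) / a[0][0]
    return [x0, x1, x2]
```

Both give the same answer; the unrolled version is simply what the generic one computes for this matrix with every loop and branch evaluated away, which is exactly what makes it such easy fodder for a C compiler.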
I might try to get an actual figure at some point, but with the above optimisations (and a bit of help from a C compiler) the bulk of the CPU time is going to go into the Newton evaluations, and you might be looking at a speedup factor of 100 or so (which is really just an educated guess at this point, but you probably get the idea).
The "parent project", which applies optimisations 1 and 2 but otherwise uses a similar generic solver, currently solves the original BJT test circuit (all nodes) at about 250k steps per second on my MBP (i5-5257U CPU @ 2.70GHz, currently running on battery), with the original function generator as the input; the cost of the Fuzz Face circuit would likely be similar (unless it has to iterate a lot more). My educated guess would be a further speedup of up to 10 or so once you compile to native (which I can try once I fix a few minor things with the C-code backend).