What's up with vastly different performance on the Apple Silicon builds?

DSP, Plugin and Host development discussion.

Post

So I've been toying with the M1 and running benchmarks of various plugins...
And I don't understand how and why some plugins perform so vastly differently.
Examples:
Reveal Sound:
i7 mini: 28 tracks
M1 rosetta: 28 tracks (in line with expectations and benchmarks)
M1 native pre fix: 20 tracks (wtf) (30% WORSE performance)
M1 native post fix (non-public build): 40 tracks (~40% better performance)

FabFilter:
M1 rosetta: 92 Saturn2 instances (LP, Superb oversampling)
M1 native: 116 Saturn2 instances (25% better performance)

MeldaProduction:
Sameish on Native and Rosetta, a little worse native. (fraction tho, effectively same) (internal benchmarks correspond to real world performance)

Fuse Audio Labs VPRE-2C
M1 rosetta: 225 tracks
M1 native: 165 tracks
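
For anyone wanting to sanity-check the percentages quoted above, here is a minimal sketch of the arithmetic. The function name `relativeChangePercent` is made up for illustration, and it assumes the Rosetta figure is the baseline in each comparison.

```cpp
#include <cassert>

// Relative change of `measured` against `baseline`, in percent.
// Positive means more tracks/instances than the baseline, negative means fewer.
// (Hypothetical helper, not taken from any plugin or host SDK.)
inline double relativeChangePercent(double baseline, double measured) {
    assert(baseline > 0.0);
    return (measured - baseline) / baseline * 100.0;
}
```

Plugging in the numbers from the list: `relativeChangePercent(28, 20)` is roughly -28.6 (Spire native pre fix), `relativeChangePercent(28, 40)` roughly +42.9 (post fix), `relativeChangePercent(92, 116)` roughly +26.1 (Saturn 2), and `relativeChangePercent(225, 165)` roughly -26.7 (VPRE-2C).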


Looks like some compilers must be absolute garbage. It makes no sense for a plugin to work better through a REAL TIME emulation layer than natively; at the very least they should perform the same.
So even if most plugins will eventually be native, if they're not compiled well you're still better off using Rosetta. That's just... blah.

Logic itself is a rocket via rosetta, it absolutely mops the floor with the Intel i7 mini.

(fwiw Urs (u-he) also said he saw significant improvement on ARM builds)

Post

Many developers have spent a huge effort over the years optimizing their plugins for x86/x64. Now we have to learn the same tricks for ARM, so it could take a few months for developers to catch up.

Post

Here's what I get with my native M1 beta of Superchord:

Native: 59 instances/core
Rosetta2: 32 instances/core
I'm seeing similar ~2x performance on my other plugins.

I find it very hard to believe that clang would be so bad at generating optimized ARM code that Rosetta translation would perform better.
Perhaps they're not using the proper optimization flags and/or Neon intrinsics.
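
One quick way to check the flags theory is to have the binary report what it was actually compiled for. This is only a sketch: the enum and function names are invented for illustration, and it relies on the predefined macros that GCC and Clang set (`__ARM_NEON`, `__AVX__`, `__SSE2__`, and `__OPTIMIZE__` at -O1 and above). Other compilers may need different checks.

```cpp
// Hypothetical build-info helpers; names are illustrative only.
enum class SimdBackend { Scalar, SSE2, AVX, NEON };

// Which SIMD instruction set the compiler was allowed to target for this build.
constexpr SimdBackend compiledSimdBackend() {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return SimdBackend::NEON;
#elif defined(__AVX__)
    return SimdBackend::AVX;
#elif defined(__SSE2__)
    return SimdBackend::SSE2;
#else
    return SimdBackend::Scalar;
#endif
}

// Human-readable name, e.g. for a log line or an about box.
constexpr const char* backendName(SimdBackend b) {
    switch (b) {
        case SimdBackend::NEON: return "NEON";
        case SimdBackend::AVX:  return "AVX";
        case SimdBackend::SSE2: return "SSE2";
        default:                return "scalar";
    }
}

// True when GCC or Clang was invoked with -O1 or higher.
constexpr bool builtWithOptimization() {
#if defined(__OPTIMIZE__)
    return true;
#else
    return false;
#endif
}
```

If a native M1 build reports NEON as available but the DSP code never includes `<arm_neon.h>`, the hot loops are probably running whatever the auto-vectorizer managed on its own.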

Post

Well, ray said he's gonna be back with a new build soon, and first build of Spire was also performing atrociously.

Which native plugs do you have?

Post

I suspect that between hand-optimized bits of x86 code, SSE and other Intel-specific optimizations, there's going to be a few bumps in the road for people just wanting to recompile their codebase for ARM.

I can see Rosetta code generally out-performing native Intel code, but not native ARM-code. That tells me there's something wrong in the codebase that needs fixing.

Clang is going to be quite good at optimization for ARM since it's been supporting it for many years now, as long as there have been ARM-based iPhones. If I remember correctly, C++, Swift, Objective-C, etc. are all compiled to the same intermediate representation (LLVM IR), which is then run through the optimization process, so choice of language should not be an issue.
I started on Logic 5 with a PowerBook G4 550Mhz. I now have a MacBook Air M1 and it's ~165x faster! So, why is my music not proportionally better? :(

Post

Native M1 code performs much better than emulated code. From our current experience, native M1 code gains 20-30% relative to Rosetta. Both Intel and M1 code are highly optimized here (using SSE/AVX on Intel, and Neon on M1).

Richard
Synapse Audio Software - www.synapse-audio.com

Post

Richard_Synapse wrote: Thu Jan 28, 2021 10:24 am Native M1 code performs much better than emulated code. From our current experience, native M1 code gains 20-30% relative to Rosetta. Both Intel and M1 code are highly optimized here (using SSE/AVX on Intel, and Neon on M1).

I'd imagine that those plugins that run better emulated probably have aggressive SSE/AVX code paths without ARM-native equivalents (i.e. the ARM builds might be running more scalar code). This is probably a temporary problem in most cases (i.e. make it work now, make it fast later).
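
To make that concrete, here is a sketch of the pattern I mean. The function `sumOfSquares` is a made-up example (think RMS meter), not code from any real plugin. It has a hand-written SSE path and a plain loop as the fallback, so a straight recompile for ARM quietly ends up on the plain loop.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical helper: sum of squares of a buffer.
// Only x86 gets a vector path; every other target takes the scalar loop.
inline float sumOfSquares(const float* in, std::size_t n) {
    assert(in != nullptr || n == 0);
    float total = 0.0f;
    std::size_t i = 0;
#if defined(__SSE__)
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        const __m128 v = _mm_loadu_ps(in + i);      // 4 samples per iteration
        acc = _mm_add_ps(acc, _mm_mul_ps(v, v));
    }
    float lanes[4];
    _mm_storeu_ps(lanes, acc);                      // horizontal sum of the 4 lanes
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    // Leftover samples on x86, the entire buffer on everything else.
    for (; i < n; ++i) total += in[i] * in[i];
    return total;
}

// Reports whether this build got the hand-written vector path.
constexpr bool hasVectorPath() {
#if defined(__SSE__)
    return true;
#else
    return false;
#endif
}
```

Under Rosetta the x86 binary still executes its SSE path (translated), while the native ARM build of the same source would return `false` from `hasVectorPath()`. That is one plausible way for a native build to lose to the translated one.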

Post

Around 20% boost with native M1 code here.
Rosetta2 seems to work fine with all our existing products. So far we did not get many requests for native ARM builds. The customers do not seem to care much about it, as long as 'it works'.

Post

Markus Krause wrote: Mon Feb 01, 2021 9:12 am Around 20% boost with native M1 code here.
Rosetta2 seems to work fine with all our existing products. So far we did not get many requests for native ARM builds. The customers do not seem to care much about it, as long as 'it works'.

Do you not get crashes with the Rosetta emulation layer (regardless of the plugin used)? I had a few; they are not easy to reproduce though, and can take hours. Also, there are quite a few bug reports related to hosts running under Rosetta, so I would expect DAW makers to ship M1 versions as soon as they can. So it is good you jumped on the M1 train imho :)

Richard
Synapse Audio Software - www.synapse-audio.com

Post

The M1 performance hit can be due to a missing or incomplete port of x86 SSE code to NEON. Not a single complaint from our customers about the Universal 2 Binary builds here either.
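
As a rough illustration of what such a port can look like, here is a sketch of a thin wrapper that maps a handful of SSE intrinsics onto their NEON counterparts, with a scalar version for anything else. The names `Vec4`, `load4`, `store4`, `splat4`, `mul4` and `applyGain` are invented for this example; the intrinsics themselves (`_mm_loadu_ps` / `vld1q_f32`, `_mm_mul_ps` / `vmulq_f32`, and so on) are the standard ones from `<xmmintrin.h>` and `<arm_neon.h>`.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  #include <arm_neon.h>
  using Vec4 = float32x4_t;
  inline Vec4 load4(const float* p)    { return vld1q_f32(p); }
  inline void store4(float* p, Vec4 v) { vst1q_f32(p, v); }
  inline Vec4 splat4(float x)          { return vdupq_n_f32(x); }
  inline Vec4 mul4(Vec4 a, Vec4 b)     { return vmulq_f32(a, b); }
#elif defined(__SSE__)
  #include <xmmintrin.h>
  using Vec4 = __m128;
  inline Vec4 load4(const float* p)    { return _mm_loadu_ps(p); }
  inline void store4(float* p, Vec4 v) { _mm_storeu_ps(p, v); }
  inline Vec4 splat4(float x)          { return _mm_set1_ps(x); }
  inline Vec4 mul4(Vec4 a, Vec4 b)     { return _mm_mul_ps(a, b); }
#else
  // Portable stand-in so the same DSP code builds on any target.
  struct Vec4 { float lane[4]; };
  inline Vec4 load4(const float* p)    { return {{p[0], p[1], p[2], p[3]}}; }
  inline void store4(float* p, Vec4 v) { for (int k = 0; k < 4; ++k) p[k] = v.lane[k]; }
  inline Vec4 splat4(float x)          { return {{x, x, x, x}}; }
  inline Vec4 mul4(Vec4 a, Vec4 b) {
      Vec4 r;
      for (int k = 0; k < 4; ++k) r.lane[k] = a.lane[k] * b.lane[k];
      return r;
  }
#endif

// DSP code is written once against the wrapper: scale a buffer in place.
inline void applyGain(float* buf, std::size_t n, float gain) {
    assert(buf != nullptr || n == 0);
    const Vec4 g = splat4(gain);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) store4(buf + i, mul4(load4(buf + i), g));
    for (; i < n; ++i) buf[i] *= gain;  // remaining 0-3 samples
}
```

Simple arithmetic like this maps almost one to one. Shuffles, comparisons and denormal handling are where the two instruction sets differ more, so a real port is likely to need more care than this.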

Perhaps the adoption rate is slower than expected, so the architecture switch may take 3-4 years instead, unless the M1X and M2 chips arriving in 2021 bring a performance boost big enough to make the platform more appealing.

Post

I guess that since M1s via Rosetta generally perform in the vicinity of a 6-core i9 mobile or a 6-core i7 desktop, while being vastly cheaper and also more portable, you gain a lot by going M1 even if you run stuff via Rosetta.
However when i go native, ZOooooOM.

Intel quads are ~50% worse while being louder and hotter, with worse battery life.

Post

I'm not seeing this kind of performance using Rosetta, under Logic (with native M1 it is good of course). How did you measure this exactly?

And yeah, the absence of noise is amazing. I'm looking forward to the next-gen M1 chips. :D

Richard
Synapse Audio Software - www.synapse-audio.com

Post

Richard_Synapse wrote: Mon Feb 01, 2021 6:51 pm I'm not seeing this kind of performance using Rosetta, under Logic (with native M1 it is good of course). How did you measure this exactly?

And yeah, the absence of noise is amazing. I'm looking forward to the next-gen M1 chips. :D

Richard

I loaded up a Logic project with Diva, Spire and FabFilter and ran it on both my Mini i7 and my M1 13" Pro at different buffer sizes, using both Logic native and Logic under Rosetta. I kept adding tracks till it started crackling.
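
For what it's worth, the "add tracks until it crackles" test can be modelled roughly like this: crackling starts once rendering one buffer takes longer than the buffer lasts. The sketch assumes a single audio thread and a fixed per-instance cost, which real hosts and plugins won't match exactly, and the function names and the `secondsPerInstance` parameter are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Time available to render one audio buffer before the output underruns.
inline double bufferDeadlineSeconds(int bufferSizeFrames, double sampleRateHz) {
    assert(bufferSizeFrames > 0 && sampleRateHz > 0.0);
    return bufferSizeFrames / sampleRateHz;
}

// Largest instance count whose summed render time still fits the deadline.
// secondsPerInstance: assumed (measured) cost of one plugin instance per buffer.
inline int maxInstancesBeforeDropout(int bufferSizeFrames,
                                     double sampleRateHz,
                                     double secondsPerInstance) {
    assert(secondsPerInstance > 0.0);
    const double deadline = bufferDeadlineSeconds(bufferSizeFrames, sampleRateHz);
    return static_cast<int>(std::floor(deadline / secondsPerInstance));
}
```

With a 128-sample buffer at 48 kHz the deadline is about 2.67 ms, so an instance costing 0.1 ms per buffer would top out at 26 instances on one thread. Under this model a build that is 25% faster per instance shows up directly as a proportionally higher track count, which is why the counts make a usable benchmark.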