u-he on Apple silicon (Updated)

Official support for: u-he.com

Post

david.beholder wrote: Mon Nov 16, 2020 8:21 pm Depends on the definition of "moderate". If you want to run Ableton and several instances of Diva, don't even look at low-performance laptops. YMMV, of course.
I've got a beefy desktop for serious work, but I'd like to be able to run a few VSTi's on my laptop without glitching and stuttering (which is my current state). My current laptop was really made for web browsing and productivity, and it's several years old. I'd just like something that can handle basic audio work without major glitches.

Post

There was a benchmark done with Logic and Diva:

Mac Mini M1 vs. iMac i9 10-core:

The Mac mini could handle 24 instances of Diva; the iMac could handle 68 instances.

So in terms of Diva, the M1 provides about 35% of the performance of an i9 10-core (24/68), but it costs only about 26% of the price ($699 vs. $2,699). Yes, the iMac has a display, so YMMV.

But I'm eager to see whether u-he can optimize the performance further. Right now it looks like it's holding up well, though I assume this is all running Diva through Rosetta 2.

LINK:
https://www.youtube.com/watch?v=SBRSjA5zB8Q

Post

Not sure it's a 100% valid comparison.

Still waiting for our M1 Macs to arrive... probably next week.

Post

What would be a valid comparison? i7 8-Core?

Post

tslays wrote: Thu Nov 19, 2020 1:34 pm What would be a valid comparison? i7 8-Core?
One with software that was compiled natively for the system it's tested on, i.e. not running through an emulation layer.

Still, even under those conditions, the performance is truly amazing.

Post

In other benchmarks, after optimization for Apple Silicon, there was a performance improvement of about 30-40%.
How hard will it be for you to optimize for Apple Silicon? Will that take a considerable amount of development resources?

Post

tslays wrote: Thu Nov 19, 2020 1:45 pm In other benchmarks, after optimization for Apple Silicon, there was a performance improvement of about 30-40%.
How hard will it be for you to optimize for Apple Silicon? Will that take a considerable amount of development resources?
It's already happening: we're testing our internal builds for Apple Silicon on the Apple Developer Transition Kit, which we've had for a few months, and - from next week - on M1 Macs.

Post

😬
adapt or die
To me it's a miracle how quick Apple, despite its "size", is to adapt compared to Intel, AMD or Nvidia.
It's a bold move, and the right one, too.

I remember my younger self teaching himself ARM RISC assembly on an Acorn Archimedes, within a week or so. It's so efficient and easy, and it was light years ahead of Intel Pentium performance. I wonder why it took so long for someone to bring it to the desktop masses.

Post

tslays wrote: Thu Nov 19, 2020 1:45 pm In other benchmarks, after optimization for Apple Silicon, there was a performance improvement of about 30-40%.
How hard will it be for you to optimize for Apple Silicon? Will that take a considerable amount of development resources?
It's not even about optimisation; it's far more basic than that. It's about compiling a native binary for Apple's new chip. The benchmark you've posted ran a Diva build made for Intel on a RISC platform (via Apple's Rosetta translation layer), and even so the numbers are pretty impressive. I think the comparison above is good news, but I suspect it's still far from what native performance will be.

Post

Rosetta 2 is actually not emulating, AFAIK; it converts the Intel instruction set to matching ARM instructions at install time and creates a new binary during that process. It seems they also do some clever optimizations while they're at it (probably courtesy of Clang or LLVM?).
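
A toy sketch of the distinction (purely illustrative, nothing like Rosetta's actual implementation): an emulator pays the decode-and-dispatch cost every time an instruction executes, while an ahead-of-time translator pays it once, up front, and then runs the result:

    #include <cstdint>
    #include <cstdio>
    #include <functional>
    #include <vector>

    // Toy "guest" bytecode: (opcode, operand) pairs acting on an accumulator.
    enum Op : uint8_t { ADD, MUL, END };

    // Emulation: decode every instruction each time the program runs.
    int emulate(const std::vector<uint8_t>& code) {
        int acc = 0;
        for (size_t pc = 0; code[pc] != END; pc += 2) {
            switch (code[pc]) {
                case ADD: acc += code[pc + 1]; break;
                case MUL: acc *= code[pc + 1]; break;
            }
        }
        return acc;
    }

    // Ahead-of-time translation: decode once, emit "native" operations
    // (closures here, real ARM machine code in Rosetta's case), then run
    // the translated program with no per-instruction decode overhead.
    std::function<int()> translate(const std::vector<uint8_t>& code) {
        std::vector<std::function<void(int&)>> ops;
        for (size_t pc = 0; code[pc] != END; pc += 2) {
            uint8_t imm = code[pc + 1];
            if (code[pc] == ADD) ops.push_back([imm](int& a) { a += imm; });
            else                 ops.push_back([imm](int& a) { a *= imm; });
        }
        return [ops] { int acc = 0; for (auto& op : ops) op(acc); return acc; };
    }

    int main() {
        std::vector<uint8_t> prog = {ADD, 3, MUL, 4, END};  // (0 + 3) * 4
        printf("emulated:   %d\n", emulate(prog));          // 12
        auto native = translate(prog);                      // decode once
        printf("translated: %d\n", native());               // 12, cheaper per run
    }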

Post

EvilDragon wrote: Thu Nov 19, 2020 6:16 pm Rosetta 2 is actually not emulating, AFAIK; it converts the Intel instruction set to matching ARM instructions at install time and creates a new binary during that process. It seems they also do some clever optimizations while they're at it (probably courtesy of Clang or LLVM?).
LLVM optimizations are definitely doing a bunch of heavy lifting here, but there are plenty of additional optimizations that can only be deemed safe when working from the original source code (and that's where Clang would play its role). Intel architectures have different semantics in a number of areas, which means matching their behavior can be inherently inefficient on a processor built around a different set of assumptions.

Given that you can't mix Rosetta-translated code and natively compiled code in the same process, I assume they're also sticking to the Intel calling conventions in order to make strict translation possible, which likely means they're missing out on the additional register-allocation opportunities that native compilation will bring. It may also mean a slim adaptation layer is required between system APIs and translated code.

I'm very much looking forward to seeing my u-he products updated to Universal versions!

Post

The translated machine code will still use the limited number of registers that Intel machines provide, whereas RISC machines usually have many more. And many Intel instructions will have to be broken down into multiple instructions on ARM, which only a compiler working from source can really optimize for.
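
For example, x86 arithmetic instructions can take a memory operand, while AArch64 is a load/store architecture, so a single Intel instruction can turn into two or more ARM ones. A hand-written sketch of plausible codegen (not actual compiler or Rosetta output):

    // One C++ statement, two plausible codegens:
    float load_add(float acc, const float* p) {
        return acc + *p;
    }
    // x86-64:   addss  xmm0, [rdi]    ; one instruction, memory operand
    // AArch64:  ldr    s1, [x0]       ; load/store arch: explicit load first...
    //           fadd   s0, s0, s1     ; ...then a register-only add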

OTOH, maybe they're decompiling with something like IDA and then recompiling. But that would take very long, and I don't see that happening.

Post

Ummm, there is a buuuuuuunch of instructions if you count all four versions of SSE and all the AVX extensions... I don't think NEON even covers them all.

EDIT: Oh right, registers vs. instructions! I mixed those up.

Post

Any update on the M1 benchmarks yet? Did you receive your ARM Macs, Urs? :)

Post

We got the Macs, but they're still nicely wrapped and/or on their way to people working remotely (Covid 'n stuff).

I guess next week we'll have an idea.

On a related note, while we can compile natively for ARM, we are currently relying on the compiler to do some crucial optimisations ("auto vectorisation"). Apparently these are not as good as we had hoped. Hence we'll most certainly have to spend 2-3 weeks adding NEON support to our vector library (one large text file that unifies SIMD dialects like SSE and AltiVec for us).
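
For a rough idea of what such a unifying layer does, here's a minimal sketch with one vector type and a back end selected at compile time (names invented for illustration; this is not u-he's actual code):

    #if defined(__SSE__)
      #include <xmmintrin.h>
      typedef __m128 vec4f;
      inline vec4f vadd(vec4f a, vec4f b) { return _mm_add_ps(a, b); }
      inline vec4f vmul(vec4f a, vec4f b) { return _mm_mul_ps(a, b); }
    #elif defined(__ARM_NEON)
      #include <arm_neon.h>
      typedef float32x4_t vec4f;
      inline vec4f vadd(vec4f a, vec4f b) { return vaddq_f32(a, b); }
      inline vec4f vmul(vec4f a, vec4f b) { return vmulq_f32(a, b); }
    #endif

    // DSP code is then written once against vec4f and compiles to native
    // SIMD on either architecture, without waiting for auto-vectorisation:
    inline vec4f mul_add(vec4f a, vec4f b, vec4f c) {
        return vadd(vmul(a, b), c);
    }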
