Hardcoding a specific SIMD register size in your algorithms is usually not a good idea, as it requires writing multiple versions of nearly identical code. A much simpler approach is to create a generic algorithm that uses some tricks like padding the end of aligned buffers to get rid of loop peeling. The register size here is just a parameter determined by the target instruction set.
coroknight wrote: Wed Aug 27, 2025 3:48 pm
AVX2 being able to process 2x as much data in a single instruction is not small, and it makes a difference when you tell your engineering team they can rely on that when considering features, algorithms, etc.
Of course, this requires some level of abstraction, and libraries like XSimd can help. As I said earlier, I doubt the Bitwig devs use raw SIMD intrinsics directly. That would require deep knowledge of various microarchitectures and a lot of time. They create separate binaries for different instruction sets, probably because it's simpler/faster than having dynamic dispatch in the source code.
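To make the padding idea concrete, here is a rough sketch in plain C++ (no intrinsics, no XSimd). It is only an illustration under my own assumptions, not how Bitwig or any particular library does it: `kLanes`, `Block`, `PaddedBuffer` and `mul_add` are hypothetical names, and the lane count is picked from the usual compiler-defined macros. The buffer is stored as aligned blocks of `Lanes` floats, so its length is always a whole number of registers and the kernel needs no scalar tail loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lane count: floats per SIMD register for the target the
// compiler is building for. A real project would likely get this from its
// SIMD abstraction layer instead.
#if defined(__AVX512F__)
constexpr std::size_t kLanes = 16;
#elif defined(__AVX__)
constexpr std::size_t kLanes = 8;
#else
constexpr std::size_t kLanes = 4;
#endif

// One register-sized, register-aligned chunk of floats.
// Lanes must be a power of two for alignas to be valid.
template <std::size_t Lanes>
struct alignas(Lanes * sizeof(float)) Block {
    float v[Lanes] = {};  // zero-initialized, so padding lanes hold 0.0f
};

// Buffer whose storage is rounded up to a whole number of blocks.
// Note: std::vector only honours the over-alignment from C++17 onwards.
template <std::size_t Lanes>
class PaddedBuffer {
public:
    explicit PaddedBuffer(std::size_t n)
        : size_(n), blocks_((n + Lanes - 1) / Lanes) {}

    std::size_t size() const { return size_; }                 // logical length
    std::size_t block_count() const { return blocks_.size(); } // padded length / Lanes

    float& operator[](std::size_t i) { return blocks_[i / Lanes].v[i % Lanes]; }
    float operator[](std::size_t i) const { return blocks_[i / Lanes].v[i % Lanes]; }

    Block<Lanes>& block(std::size_t b) { return blocks_[b]; }
    const Block<Lanes>& block(std::size_t b) const { return blocks_[b]; }

private:
    std::size_t size_;
    std::vector<Block<Lanes>> blocks_;
};

// out = a * gain + b, processed one full block at a time.
// The padding lanes are computed too, which is harmless and removes the
// need for a separate remainder loop.
template <std::size_t Lanes>
void mul_add(const PaddedBuffer<Lanes>& a, const PaddedBuffer<Lanes>& b,
             float gain, PaddedBuffer<Lanes>& out) {
    assert(a.size() == out.size() && b.size() == out.size());
    for (std::size_t blk = 0; blk < out.block_count(); ++blk) {
        const Block<Lanes>& x = a.block(blk);
        const Block<Lanes>& y = b.block(blk);
        Block<Lanes>& o = out.block(blk);
        // Fixed trip count over aligned data: easy for the compiler to turn
        // into SIMD instructions at whatever width the target offers.
        for (std::size_t l = 0; l < Lanes; ++l) {
            o.v[l] = x.v[l] * gain + y.v[l];
        }
    }
}
```

Calling code would create `PaddedBuffer<kLanes>` objects and call `mul_add` on them; building the same source with different target flags (e.g. `-mavx2`) changes only `kLanes`, which fits the "separate binaries per instruction set" approach. Whether the inner loop actually vectorizes depends on the compiler and optimization level.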
