Nice thread ...
I'm a dev with 20+ years of experience, quite a bit of it in parallel programming... mostly Java.
I think I would rethink whether "threads" are what you want. You want to get things done, and each thread comes with a cost, so what matters is the ratio: units of work scheduled per thread / cost of running those threads.
And that cost is the sum of keeping a thread pool in a consistent state, thread context switches (i.e. kicking out all the registers of all the fancy SSE units, for instance), and stalling the caches. My gut feeling is that music software is rather a data-driven business, i.e. you have to optimize the throughput of data pipelines (a data pipeline being the path of a voice through all plugins). Therefore I would avoid at all cost stalling the caches, i.e. working first on voice 1, filter 1 (cpu 1), then kicking that off the cpu for voice 2, filter 3 (cpu 2), then again voice 1, amp 1 (cpu 5), and then voice 3, eq 2 (cpu 6)... I would rather organize the whole pipeline for voice 1 as one unit of work, thus keeping that voice's data hot in one core's cache...
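To sketch the idea: the whole effect chain of one voice runs as one task on one thread, so the voice's buffer never ping-pongs between cores mid-chain. This is a minimal illustration, not a real audio engine; the class name, the `processVoice` method, and the stage names ("filter", "amp", "eq") are made up for the example.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.DoubleUnaryOperator;

public class VoicePipeline {
    // One voice's whole chain (filter -> amp -> eq) applied to its own
    // sample buffer, on whatever single thread runs this task. The buffer
    // stays in that core's cache for the entire chain.
    static double[] processVoice(double[] samples, List<DoubleUnaryOperator> stages) {
        double[] out = samples.clone();
        for (DoubleUnaryOperator stage : stages) {
            for (int i = 0; i < out.length; i++) {
                out[i] = stage.applyAsDouble(out[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        List<DoubleUnaryOperator> chain = List.of(
                x -> x * 0.5,   // stand-in "filter"
                x -> x + 1.0,   // stand-in "amp"
                x -> x * 2.0);  // stand-in "eq"
        // Each voice = one unit of work; voices run in parallel to each
        // other, but no single voice hops between cores mid-chain.
        Future<double[]> voice1 = pool.submit(
                () -> processVoice(new double[]{1.0, 2.0}, chain));
        System.out.println(java.util.Arrays.toString(voice1.get())); // [3.0, 4.0]
        pool.shutdown();
    }
}
```

The parallelism here is across voices, not across stages, which is exactly the "one voice = one unit of work" organization argued for above.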
Having more threads than the cpu has cores - my gut feeling - only makes sense if you optimize for the first-response reaction time of a single request in I/O-bound systems (like web apps that park threads waiting on db queries), at the cost that your overall throughput degrades...
Anyway ... that said, I would never implement such stuff on my own, but rely on pros who build those frameworks. And guess what, there are guys, for instance in the finance industry, who invented the disruptor pattern. Freaky that they implemented stuff in Java to be lightning fast for profit. But it is lock free and caters for cache consistency... I'm not really into all the magic they do, but I'd use solutions like that rather than get my own thread pool scheduling and optimization going.
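For a flavor of what's under the hood: the core of the disruptor idea (this is the LMAX Disruptor) is a pre-allocated ring buffer plus sequence counters instead of locks. Below is a deliberately simplified single-producer/single-consumer toy in plain `java.util.concurrent.atomic`, just to illustrate the shape; the real Disruptor adds batching, cache-line padding, and multi-consumer coordination, so treat this as a sketch of the idea, not the library.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy single-producer/single-consumer ring buffer: one fixed, pre-allocated
// array (no GC churn, predictable memory layout) and two sequence counters.
// No locks anywhere: the producer only advances tail, the consumer only
// advances head, and the volatile writes of AtomicLong.set publish the data.
public class SpscRing {
    private final long[] slots;
    private final int mask; // capacity must be a power of two
    private final AtomicLong head = new AtomicLong(); // next slot to read
    private final AtomicLong tail = new AtomicLong(); // next slot to write

    public SpscRing(int capacityPowerOfTwo) {
        slots = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    // Producer side: returns false instead of blocking when the ring is full.
    public boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == slots.length) return false; // full
        slots[(int) (t & mask)] = value;
        tail.set(t + 1); // publish: consumer sees the slot only after this
        return true;
    }

    // Consumer side: returns null when the ring is empty.
    public Long poll() {
        long h = head.get();
        if (h == tail.get()) return null; // empty
        long v = slots[(int) (h & mask)];
        head.set(h + 1); // free the slot for the producer
        return v;
    }
}
```

Slots are reused in place and accessed in sequence, which is what makes this pattern so cache-friendly compared to a lock-guarded linked queue.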
Edit: Thread pools that pick "units of work" off a queue that clients put them onto have another problem: they are mostly not fair. They don't have priorities like OS processes, where the OS scheduler guarantees with some fairness policy that every process will get cpu time. You can easily spam a thread pool with thousands of "units of work" in one go and thus block other clients from getting cpu time.
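One hypothetical way around that is to keep one queue per client and hand out work round-robin, instead of one shared FIFO that a single client can flood. A quick sketch (class and method names are made up for illustration, this is not how any particular framework does it):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy "fair" work queue: each client gets its own FIFO, and next() cycles
// through the clients round-robin. A client that dumps thousands of tasks
// in one go still only gets one turn per round, so it cannot starve others.
public class FairWorkQueue<T> {
    private final Map<String, Queue<T>> perClient = new HashMap<>();
    private final List<String> clients = new ArrayList<>();
    private int cursor = 0; // which client is served next

    public synchronized void submit(String clientId, T task) {
        perClient.computeIfAbsent(clientId, k -> {
            clients.add(clientId);
            return new ArrayDeque<>();
        }).add(task);
    }

    // Returns the next task round-robin across clients, or null if all empty.
    public synchronized T next() {
        for (int i = 0; i < clients.size(); i++) {
            Queue<T> q = perClient.get(clients.get(cursor));
            cursor = (cursor + 1) % clients.size();
            if (!q.isEmpty()) return q.poll();
        }
        return null;
    }
}
```

With this shape, worker threads pull from `next()` instead of a shared `BlockingQueue`, and a spamming client degrades only its own latency.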