CLAP: The New Audio Plug-in Standard (by U-he, Bitwig and others)


Post

----
Last edited by S0lo on Tue Jul 12, 2022 8:32 pm, edited 3 times in total.
www.solostuff.net
The 3rd law of thermo-dynamics states that: the 2nd law has two meanings, one of them is strictly wrong, the other is massively misunderstood.

Post

Urs wrote: Tue Jul 12, 2022 3:57 pm Psst... PCK also requires the Note_End message, but only 1 for all overlapping instances of the same port/channel/note.
Ahh! Sorry. You are of course correct!

Post

EvilDragon wrote: Tue Jul 12, 2022 7:15 pm MPE reuses channels when 15 is reached - one channel is global and not supposed to take any voices.
:tu:
mystran wrote: Tue Jul 12, 2022 6:55 pm std::unordered_map is not ideal, because it's an open table (i.e. collisions are chained into a linked list) and therefore has to allocate the nodes for said linked list.

For this type of thing a closed table (i.e. a collision probes again for another slot) would be better. Unfortunately there isn't one in the STL, but they aren't that hard to write yourself. These never need to allocate anything as long as the load factor stays below some chosen maximum, and given that a reasonable maximum load factor is typically around 70% (e.g. I think my own table resizes at 66%), you can just make the table 2-3 times the maximum number of voices you support and you'll be fine. There are some subtleties with the probing to avoid clustering if you suspect the host is going to send really nasty sequences of note IDs, but mostly you should be able to get lookup performance that's more or less O(1) even if you had 1k voices.
The root of the issue is the use of arbitrary note_id values. I think a better solution is to require all hosts to use a contiguous range from 0 to n for note_id, where n is the maximum number of notes that can sound concurrently at any single time. It's easy and foolproof. And I think most devs will immediately realize the benefit of using straight arrays.

Port/Channel should benefit from such enforcement too, where the maximum is again chosen by the plugin.

We could even skip that explicit maximum statement and allow the plugin to simply ignore inbound mods when an internal maximum is reached. Edit: on second thought, this might cause problems.
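
To make the "straight arrays" point concrete, here is a minimal sketch of what the lookup would look like if hosts were required to send note_id values in a contiguous range. This is not how CLAP works today; `kMaxNotes`, `Voice` and `DirectVoiceTable` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical limit the plug-in would advertise under this proposal.
constexpr int32_t kMaxNotes = 32;

// Hypothetical voice state; only what the example needs.
struct Voice {
    bool active = false;
    double mod_amount = 0.0;
};

struct DirectVoiceTable {
    std::array<Voice, kMaxNotes> voices{};

    // With contiguous IDs the note_id is the array index, so the lookup
    // is a bounds check plus one indexed load. IDs outside the agreed
    // range return nullptr instead of touching memory.
    Voice* lookup(int32_t note_id) {
        if (note_id < 0 || note_id >= kMaxNotes) return nullptr;
        return &voices[static_cast<std::size_t>(note_id)];
    }
};
```

The bounds check is the part that stays necessary even under the proposal, since a misbehaving host could still send an out-of-range ID.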
Last edited by S0lo on Tue Jul 12, 2022 9:14 pm, edited 1 time in total.

Post

For small arrays (most synths will not have more than, say, 32 voices), a linear search while obviously still slower than a constant time lookup is still plenty fast. Since you only need to do the lookup once per event, all your actual DSP will dwarf the time it takes to do those lookups. And as a minor optimization you could keep track of the index of the last voice that was created and then search backwards from there, but you probably won't notice a difference either way.
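
A minimal sketch of that linear search, including the "search backwards from the last created voice" tweak. The `Voice` and `VoicePool` types and the 32-voice limit are assumptions made for the example, not anything defined by CLAP.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxVoices = 32;

// Hypothetical voice record; only the fields needed for lookup are shown.
struct Voice {
    bool active = false;
    int32_t note_id = -1;
};

struct VoicePool {
    std::array<Voice, kMaxVoices> voices{};
    std::size_t last_started = 0;  // index of the most recently started voice

    // Start a voice in the first free slot.
    // Returns its index, or -1 if every slot is in use.
    int start(int32_t note_id) {
        for (std::size_t i = 0; i < kMaxVoices; ++i) {
            if (!voices[i].active) {
                voices[i].active = true;
                voices[i].note_id = note_id;
                last_started = i;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Linear search that walks backwards from the most recently started
    // voice and wraps around, so recent notes are found in a few steps.
    int find(int32_t note_id) const {
        for (std::size_t step = 0; step < kMaxVoices; ++step) {
            std::size_t i = (last_started + kMaxVoices - step) % kMaxVoices;
            if (voices[i].active && voices[i].note_id == note_id)
                return static_cast<int>(i);
        }
        return -1;
    }

    void stop(int index) {
        assert(index >= 0 && static_cast<std::size_t>(index) < kMaxVoices);
        voices[static_cast<std::size_t>(index)].active = false;
    }
};
```

The worst case is 32 comparisons per event, which is the cost being weighed against the DSP work per block.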

Post

robbert-vdh wrote: Tue Jul 12, 2022 8:47 pm For small arrays (most synths will not have more than, say, 32 voices), a linear search while obviously still slower than a constant time lookup is still plenty fast. Since you only need to do the lookup once per event, all your actual DSP will dwarf the time it takes to do those lookups. And as a minor optimization you could keep track of the index of the last voice that was created and then search backwards from there, but you probably won't notice a difference either way.
I've addressed that earlier viewtopic.php?p=8471766#p8471766.

In short, it boils down to this: do we want to have audio rate modulation or not?

Post

Events come at some kind of control rate, e.g. "once every 64 samples". It's sample accurate by timestamp, but it's not an audio rate stream.
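
As an illustration of "sample accurate by timestamp", a common way to honour event offsets is to split the block at each event time and render the segments in between. The sketch below assumes a hypothetical `GainEvent` type with a sample offset and a new value; it does not use the real CLAP event structs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal event: a sample offset within the block and a
// new gain value that takes effect at that offset.
struct GainEvent {
    uint32_t time;
    float value;
};

// Render one block, applying each event exactly at its sample offset.
// Events must be sorted by time; events at or past `frames` are ignored.
void process_block(const float* in, float* out, uint32_t frames,
                   const std::vector<GainEvent>& events, float& gain) {
    for (std::size_t i = 1; i < events.size(); ++i)
        assert(events[i - 1].time <= events[i].time);

    uint32_t pos = 0;
    std::size_t next_event = 0;
    while (pos < frames) {
        // Apply every event that is due at (or before) the current position.
        while (next_event < events.size() && events[next_event].time <= pos) {
            gain = events[next_event].value;
            ++next_event;
        }
        // Render up to the next event, or to the end of the block.
        uint32_t end = frames;
        if (next_event < events.size() && events[next_event].time < frames)
            end = events[next_event].time;
        for (uint32_t i = pos; i < end; ++i)
            out[i] = in[i] * gain;
        pos = end;
    }
}
```

With one event per 64 samples, the inner render loop still runs over long contiguous segments, which is why this scheme has little overhead at control rate.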

Afaik Bitwig Studio will only ever issue up to 32 Note_IDs at once or so. The currently drafted Voice Info extension will allow hosts to set a maximum of concurrent Note_IDs with a bit of headroom for safety, depending on how the plug-in iterates.

All in all this is uncharted territory. Even in its first incarnation, it works very well, with very little overhead. Over time I guess there'll be rules of thumb and stuff as to how to best implement things. E.g. atm I'm using a technique similar to quad trees in 3D rendering where I set up a quickly searchable list of vectors for available modulation values. I'm sure at some point there'll be reusable example code and data structures that people can just copy and paste in their plug-ins and hosts.

Post

baconpaul wrote: Tue Jul 12, 2022 9:53 am The pck does not require the notification (a note off is the end of the voice) but doesn’t allow overlap or modulation in the release phase.
I think there is some confusion here.
If one is conditioned to think of a MIDI key-number as permanently welded to a specific pitch, then you can't support playing two voices at the same pitch at the same time (and being able to address them independently for the purpose of modulation).
I think this is where the idea of adding another layer of abstraction (the note-id) stems from.

On the other hand, if you consider a MIDI key-number (and channel) as merely a switch that can be tuned instantly to any pitch, then it's clear that you can easily play the same pitch on two voices (while being able to address them independently). If this wasn't true, then MPE wouldn't work.

So the question is: If MPE and MIDI 2.0 manage to address voices unambiguously, even when they're playing the same pitch (overlapped), even allowing modulation during the release phase, then what is the point of adding a redundant layer of complexity? (NOTE_IDs)
Last edited by Jeff McClintock on Tue Jul 12, 2022 9:27 pm, edited 1 time in total.

Post

If you think about it, a contiguous range would probably be easier to adapt to from existing VST code, which already uses contiguous parameter ranges.

Post

S0lo wrote: Tue Jul 12, 2022 9:00 pm I've addressed that earlier viewtopic.php?p=8471766#p8471766.

In short, It boils down to, do we want to have audio rate modulation or not.
True audio rate modulation was one of those things we looked at a little while back, but was put on the backburner in order to prepare for the CLAP 1.0 release. CLAP's event-based approach to automation and modulation is incredibly convenient, but it doesn't scale well to full audio rate. For most things that's not a problem. You're probably using smoothing internally anyways, and Bitwig gives you one event per 64 samples which is more than granular enough for those use cases. However, if you were to use the same mechanism to send parameter events for every sample, for multiple parameters, and for multiple note IDs at the same time, then the overhead starts to add up real quick. The main problems here are the one you already mentioned, i.e. having to handle every single one of those events individually, and the amount of memory overhead involved in using events for this. The clap_event_param_mod_t struct is 56 bytes large, while the double value you care about for the audio rate modulation is only 8 bytes. That means that assuming you already know what the event is modulating in advance, almost 86% of the struct is unnecessary overhead.
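
The 56-byte and 86% figures can be checked with a local mirror of the struct. The layout below is written from my understanding of the CLAP 1.0 headers and is an assumption; verify it against the real `clap/events.h` before relying on it. `EventHeader`, `ParamModEvent` and `overhead_fraction` are names made up for this sketch, and the size check only applies to targets with 8-byte pointers.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed mirror of the CLAP event header (16 bytes).
struct EventHeader {
    uint32_t size;
    uint32_t time;
    uint16_t space_id;
    uint16_t type;
    uint32_t flags;
};

// Assumed mirror of clap_event_param_mod_t.
struct ParamModEvent {
    EventHeader header;
    uint32_t param_id;
    void* cookie;        // 8-byte aligned on 64-bit targets, so 4 bytes of padding precede it
    int32_t note_id;
    int16_t port_index;
    int16_t channel;
    int16_t key;
    double amount;       // the only field a dense buffer would need to carry
};

constexpr std::size_t kEventSize = sizeof(ParamModEvent);
constexpr std::size_t kPayloadSize = sizeof(double);

// Only meaningful where pointers are 8 bytes wide.
static_assert(sizeof(void*) != 8 || kEventSize == 56,
              "expected 56 bytes on a 64-bit target");

// Fraction of each event that is not the modulation amount itself.
constexpr double overhead_fraction() {
    return static_cast<double>(kEventSize - kPayloadSize) /
           static_cast<double>(kEventSize);
}
```

On a 64-bit target this gives (56 - 8) / 56, roughly 0.857, which matches the "almost 86%" in the post.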

So the idea was to have an extension that lets you tell the host for which parameters you support dense automation/modulation, and the host could then use an event from that extension to provide you with a dense buffer containing the rendered automation or modulation data for that parameter (and note ID combination) for every sample in the buffer. Then you can simply read the offsets from that as part of your processing loop. But this extension does not yet exist.
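
Since the extension does not exist, the following is purely a hypothetical sketch of what "read the offsets as part of your processing loop" could look like on the plug-in side. `DenseModulation` and `process_gain` are invented names, and nothing here corresponds to a real CLAP API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: one rendered modulation offset per sample of the block,
// for a single parameter (and note ID) that the host resolved in advance.
struct DenseModulation {
    const double* offsets;
    std::size_t frame_count;
};

// Apply a gain parameter with per-sample modulation to a mono block.
// The per-sample cost is one extra load and add, with no event handling.
void process_gain(const float* in, float* out, std::size_t frames,
                  double base_gain, const DenseModulation& mod) {
    assert(mod.frame_count >= frames);
    for (std::size_t i = 0; i < frames; ++i) {
        double gain = base_gain + mod.offsets[i];
        out[i] = static_cast<float>(in[i] * gain);
    }
}
```

Compared with one 56-byte event per sample, the buffer carries only the 8-byte value that the loop actually needs.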

Post

Jeff McClintock wrote: Tue Jul 12, 2022 9:22 pm So the question is: If MPE and MIDI 2.0 manages to address voices unambiguously, even when they're playing the same pitch (overlapped), even allowing modulation during the release phase, then what is the point of adding a redundant layer of complexity? (NOTE_IDs)
You can only do that in MPE if you rotate the channels (don't know enough about MIDI 2.0 to say anything about that). If I tell my DAW to play a note on channel n (and for instance have a multi timbral synth set up that plays different sounds depending on the channel), then I don't want my DAW setting some other arbitrary channel for that note. That's a limitation of the MPE approach. With note IDs (which should have really been named voice IDs) you can have multiple overlapping voices with the same note number and channel.
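
A small sketch of why the extra ID helps: two voices can share port, channel and key and still be addressed separately. The types below are hypothetical, and treating -1 as a wildcard follows my understanding of the CLAP convention, so check the headers before relying on it.

```cpp
#include <cstdint>

// Target of a modulation or note expression event.
// Assumption: -1 in any field means "match anything".
struct NoteAddress {
    int32_t note_id;
    int16_t port;
    int16_t channel;
    int16_t key;
};

// Hypothetical voice record holding the address it was started with.
struct Voice {
    bool active;
    int32_t note_id;
    int16_t port;
    int16_t channel;
    int16_t key;
};

// True if the event address selects this voice.
bool matches(const Voice& v, const NoteAddress& a) {
    return v.active
        && (a.note_id == -1 || a.note_id == v.note_id)
        && (a.port == -1 || a.port == v.port)
        && (a.channel == -1 || a.channel == v.channel)
        && (a.key == -1 || a.key == v.key);
}
```

Addressing by port/channel/key alone hits both overlapping voices, while adding the note ID narrows it to one of them without touching the channel.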

Post

robbert-vdh wrote: Tue Jul 12, 2022 9:32 pm With note IDs (which should have really been named voice IDs) you can have multiple overlapping voices with the same note number and channel.
I could argue facetiously "the problem with note-ID is that I can't have two overlapping voices with the same note-ID".
The argument is just as illogical and circular as dismissing the use of MIDI key-numbers for addressing voices.

No one is explaining why "overlapping note numbers" is so important.

Two overlapping voices playing the same pitch? Essential, yes. (But we can do that already.)
Independently addressable? Yes, essential. (But we can do that already.)

Two voices with the same key-number? Why do you need that? What essential user-facing feature does it support? Let's bring this back to making music. What can note-ID do audibly that MIDI 2.0 or MPE can't do already (without it)?

The question really is: If I choose to implement key-number (and channel) as 100% equivalent to 'note-id', then why do I need note-id too?

Post

robbert-vdh wrote: Tue Jul 12, 2022 8:47 pm For small arrays (most synths will not have more than, say, 32 voices), a linear search while obviously still slower than a constant time lookup is still plenty fast. Since you only need to do the lookup once per event, all your actual DSP will dwarf the time it takes to do those lookups. And as a minor optimization you could keep track of the index of the last voice that was created and then search backwards from there, but you probably won't notice a difference either way.
Or you could use a closed hash-table. Lookup is O(1) on average, and the worst-case overhead over a linear search is whatever cycles you spend hashing (assuming you don't just set the hash function to the identity function, which is a reasonable choice if the host sends mostly sequential note_ids).
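
A minimal sketch of such a closed table, assuming linear probing, an identity hash and a fixed capacity of twice the voice count so that nothing is allocated after construction. `NoteIdTable` and its constants are illustrative names, the code assumes valid note IDs are non-negative, and deletion uses backward shifting instead of tombstones. It does not address the clustering subtleties mentioned earlier in the thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxVoices = 32;
constexpr std::size_t kCapacity = 64;  // power of two; load factor stays <= 50%
constexpr int32_t kEmpty = -1;         // assumes valid note IDs are >= 0

class NoteIdTable {
public:
    NoteIdTable() { for (auto& s : slots_) s.note_id = kEmpty; }

    // Maps note_id to a voice index. Returns false when kMaxVoices are stored.
    bool insert(int32_t note_id, int voice) {
        assert(note_id >= 0);
        std::size_t i = home(note_id);
        while (slots_[i].note_id != kEmpty) {
            if (slots_[i].note_id == note_id) { slots_[i].voice = voice; return true; }
            i = next(i);
        }
        if (count_ >= kMaxVoices) return false;
        slots_[i] = {note_id, voice};
        ++count_;
        return true;
    }

    // Returns the voice index, or -1 if the note ID is not stored.
    int find(int32_t note_id) const {
        std::size_t i = home(note_id);
        while (slots_[i].note_id != kEmpty) {
            if (slots_[i].note_id == note_id) return slots_[i].voice;
            i = next(i);
        }
        return -1;
    }

    // Backward-shift deletion: later entries of the same probe run are
    // moved into the hole so lookups never stop early at a gap.
    void erase(int32_t note_id) {
        if (note_id < 0) return;
        std::size_t i = home(note_id);
        while (slots_[i].note_id != note_id) {
            if (slots_[i].note_id == kEmpty) return;  // not present
            i = next(i);
        }
        --count_;
        std::size_t j = i;
        for (;;) {
            j = next(j);
            if (slots_[j].note_id == kEmpty) break;
            std::size_t k = home(slots_[j].note_id);
            // Keep the entry at j in place if its home slot k lies
            // cyclically in (i, j]; otherwise it may fill the hole at i.
            bool stays = (i <= j) ? (i < k && k <= j) : (i < k || k <= j);
            if (stays) continue;
            slots_[i] = slots_[j];
            i = j;
        }
        slots_[i].note_id = kEmpty;
    }

private:
    struct Slot { int32_t note_id; int voice; };
    static std::size_t home(int32_t id) {
        return static_cast<uint32_t>(id) & (kCapacity - 1);  // identity hash
    }
    static std::size_t next(std::size_t i) { return (i + 1) & (kCapacity - 1); }

    std::array<Slot, kCapacity> slots_;
    std::size_t count_ = 0;
};
```

Because the table can never hold more than 32 entries in 64 slots, every probe sequence is guaranteed to reach an empty slot, which is what lets `find` and `erase` terminate without a separate loop bound.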

Post

Jeff McClintock wrote: Tue Jul 12, 2022 9:51 pm Two voices with the same key-number? Why do you need that? What essential user-facing feature does it support? Let's bring this back to making music. What can note-ID do audibly that MIDI 2.0 or MPE can't do?

I mean could I argue facetiously "the problem with note-ID is that I can't have two overlapping voices with the same note-ID".
The argument is just as illogical and circular as dismissing the use of MIDI key-numbers for addressing voices.

The question really is: If I choose to implement key-number (and channel) as 100% equivalent to 'note-id'. Then why do I need note-id too?
Now how would you do that? Would you send pitch bend messages and use pitch bend range MIDI RPN to add specific pitch offsets to keys? Some proprietary SysEx based solution? Yes, MIDI can in theory do lots of things, especially if you add in (N)RPN and SysEx messages. But what MIDI can do in theory is irrelevant if the software you're using doesn't support that. The fact of the matter is, most plugins' MIDI implementation doesn't extend beyond the basic channel events and custom MIDI CC mapping.

You could base CLAP around MIDI, force everyone to implement several bespoke protocols on top of that, and use that to emulate note IDs. Or you can just add note IDs.

Post

robbert-vdh wrote: Tue Jul 12, 2022 10:15 pm The question really is: If I choose to implement key-number (and channel) as 100% equivalent to 'note-id'. Then why do I need note-id too?
Now how would you do that? Would you send pitch bend messages and use pitch bend range MIDI RPN to add specific pitch offsets to keys? Some proprietary SysEx based solution?

Much simpler.
You can already bend a note, right? You send some event with a note-id and a 'bend amount', I guess.

Keep everything exactly the same, except swap out 'note-id' for 'key-number'. That's all I do. Never needed anything more.

I'm not questioning the entire API, just one little thing - what does 'note-id' achieve, that 'key-number' can't?

Post

Jeff McClintock wrote: Tue Jul 12, 2022 10:45 pm much simpler.
You already can bend a note right? You send some event with a note-id and a 'bend amount' I guess.

Keep everything exactly the same, except swap out 'note-id' for 'key-number'. That's all I do. Never needed anything more.
And that's exactly the problem I was trying to explain. What do you use for the 'bend amount'? If I send the same MIDI to three synths to layer them, do I as a user need to manually configure all of them to use some specific DAW-specified pitch bend range or all notes will play at the wrong pitch? What if the synth does not support that range? And what if the MIDI is also used to key track a filter plugin that doesn't respond to pitch bend messages? And also, how will you pitch bend individual keys with MIDI 1.0 without MPE in a way that any random plugin will understand it? If the solution is to ask all of these plugins to support plugin-specification defined MIDI protocols for individual key pitch bends and pitch bend range information using RPN and NRPN messages, then isn't it much simpler to just add a note ID (or, voice ID) field?
