manager std:string on audio thread: safe?

DSP, Plugin and Host development discussion.
Post Reply New Topic
RELATED
PRODUCTS

Post

Hi there,

I'm going to make a sequencer which will convert some specific patterns (defined as std: string) in notes (midi/cv/trigger signal, basically).

Question is: working with std:string (such as split, add/sub, convert to int) is safe for audio thread? Or does it could involve heap/memory allocation at runtime (= glitch)?

Note: it will processed/generated only when pattern start/restart (so once every x samples), but not sure this has some meaning (heap allocation for a single sample is bad in any case).

Thanks for any hints.

Post

Out of the box, std::string will likely allocate/deallocate unless you're very careful (creating them outside the audio thread, only doing safe stuff like reading on the audio thread).

But note that std::string is a specialization of the template std::basic_string, which supports custom allocators. So you could probably use a custom allocator or a std::polymorphic_allocator with enough pre-allocated space for strings on the audio thread.

However, is it really strictly necessary to process those strings on the audio thread? If they're created from some GUI interaction you could process them in the message thread into some symbolic or object representation that tells the DSP code what to do through a message queue or something (e.g. notes, parameter changes). Similarly, if the strings need to be created from audio analysis, you could go the other way around. What you're writing doesn't sound like the strings need to actually live on the audio thread.

Post

Derozer wrote: Fri Mar 29, 2024 5:26 pm I'm going to make a sequencer which will convert some specific patterns (defined as std: string) in notes (midi/cv/trigger signal, basically).
You probably should not. Even if the patterns are input as strings, you should probably parse (and store) them as some easier to handle binary data right where they are input. You should typically do this even if there was no real-time considerations to worry about, because this allows you to isolate all the string handling in one place and rest of your code does not need to worry about whether the strings actually contain valid data.
Question is: working with std:string (such as split, add/sub, convert to int) is safe for audio thread? Or does it could involve heap/memory allocation at runtime (= glitch)?
Whether or not you actually should do it (you should not), you can safely access the contents of the string, but pretty much anything that actually changes the string will cause an allocation and therefore potentially a glitch. If you're working with recent enough C++ standard then I'd imagine you could perhaps split strings by creating the substrings as stringviews, which is basically a range over character data... but honestly you should probably do all this stuff (strings or not) in GUI thread and just pass the final results to the audio thread if possible.
Note: it will processed/generated only when pattern start/restart (so once every x samples), but not sure this has some meaning (heap allocation for a single sample is bad in any case).
Any amount of memory allocation in audio thread is a potential problem... and in fact when you allocate, you typically end up also deallocating previous chunks and that's also potentially risky.

The issue here is that the way memory allocators are designed, they try to be as efficient as possible on average and it's the on average part that's a problem for real-time. Most memory allocations (and deallocations) will be satisfied quickly (unless the allocator really sucks) and will not cause a glitch... but then once in a while it'll happen that the memory allocator doesn't just have the correct size block waiting for you and needs to go and start doing more complicated things... or just randomly decides to do a bit of cleanup or bookkeeping maintenance or whatever... and it's the "once in a while some allocation can be really slow" that is the real problem for a real-time thread, because real-time threads don't care about averages, they care about hitting their deadlines.

In other words, when you allocate, most of the time the allocator grabs a block from a freelist and that's it. Most of the time when you deallocate it puts that block back to the freelist and that's it... but if you build a memory allocator that only ever does this, you'll find that it performs poorly on average, because the heap will get all fragmented.. and that's why memory allocators have to do a bit more and if they decide to do that "bit more" when it's your realtime thread calling, that's a glitch.

Post

Everyone is correct.

I want to add that on some 64 bit platforms you might get away with strings under 23 bytes not allocating at all because of short string optimization.

https://stackoverflow.com/questions/216 ... on-in-libc

Totally platform and stdlib implementation dependent. Probably not testable at compile time. Rely on it at your own risk.

Post

Just to maintain the discussion, with C++20, you can get some insight into the short string optimization by declaring a constexpr std::string global variable and looking at which point the compiler stops allowing it. Visual Studio is particularly funny about it; right now it allows me up to 15 characters, but only in non-debug mode (not /MDd).

Post

Additional question: and if I use std char Array? This shouldnt be a problem right? (Such as search pattern, split on pre-allocated Array, and such...).

Post

Derozer wrote: Mon May 13, 2024 10:46 am Additional question: and if I use std char Array? This shouldnt be a problem right? (Such as search pattern, split on pre-allocated Array, and such...).
std::array is just a wrapper around a regular "stack" array, so those are fine...

However.. if what you want to do with your string data is search/tokenize/etc, then there is absolutely no need to actually copy any data around into tokens. All you really need to do is represent your tokens by integer tupples of the form [start,count] (or [start,end]; either works).

So for example, if we had "aa:bbb:cc" and we wanted to split this on every ':' then rather than creating three strings, we can keep the original data as-is and simply represent the substrings as tupples [0,2],[3,3],[7,2] (where I've chosen the [start,count] representation). In some applications this representation also has the advantage that it also keeps the information of where the token came from.

Another similar approach is to replace the "start" index with a direct pointer to the first character, but still keep the count. That's basically how std::stringview works. The key in either case is that there is no allocation for a new string, we're just referring to the original string and store which part we want. The potential downsides in some cases is that we need to keep the original data around and our substrings won't be null-terminated in memory (well, in the CVS tokenizing case we could fix that by in-place replacing the separating tokens by null-characters, but that only works if we have separators that we can throw away).

Post

[0,2],[3,3],[7,2] is not a correct scenario for my purpose.

Its more like 3|2(1),-4|1(3) etc pattern, where the first its a 1/3 note in 2 beats (ch 1), the second one a muted 1/4 note in 1 beat (ch 3), and so on (I've lots of option, which are easy to represent in string with deterministic conventions).

So would be perfect to pre-allocated (let say) 1000 char, and than do a sort of split (token ","), processing each token (- = muted), for every option I'll introduce.

Not sure this can be done "easily" in char[] without using std:string? Another solution maybe its binary and process 4 bits for each note? (First bit sign, than I don't really know how to represent for example fraction easily hehe).

Maybe a sort of bit masking (10 bit)?
- First bit sign note on off
- 3 bit division
- 4 bit length
- 2 bit channel?!?!

The complex operations now would be work with a concatenation of those 10 bits (20 for two notes, 30 for 3, and so on). What for example if I want to change Channel of the second note from 1 to 4? It become weird to manage...

Post

Derozer wrote: Sat May 18, 2024 7:11 pm [0,2],[3,3],[7,2] is not a correct scenario for my purpose.

Its more like 3|2(1),-4|1(3) etc pattern, where the first its a 1/3 note in 2 beats (ch 1), the second one a muted 1/4 note in 1 beat (ch 3), and so on (I've lots of option, which are easy to represent in string with deterministic conventions).
I think you missed the main point: it's not necessary to create any new strings in order to parse an existing string (except perhaps for identifiers to be interned into symbol table if we're talking about a full programming language). If you need tokenization, represent those tokens as ranges. If you can avoid explicit tokenization, then it's even more efficient to just parse in-place directly (even if we don't care about real-time constraints).

You can literally parse almost every full programming language without ever taking substrings except perhaps to intern new identifiers to a symbol table... and if you don't have symbols to intern, then there's no reason you would ever need to allocate.

Post

mystran wrote: Sat May 18, 2024 7:40 pm I think you missed the main point: it's not necessary to create any new strings in order to parse an existing string
Nice, so I can get the x occurrence of a std:string such as "3|2(1),-4|1(3)" (giving a fixed delimiter, such as ",") without memory allocation? How? std:find is sure?

That would be 90% of work I believe (for eventual add/sub on place I could just shift its char[] to be sure)...

Post

What you would really want here is a "lexer" and a "parser" or some hybrid of the two.

A "lexer" is essentially a state machine that looks at each character at a time and it's job is to figure out "the next token." For example, if we look at your example string at the beginning, we see the character '3' which is a digit... so we transition into the state "lex a numberical value" and store the '3' into an integer, then peek at the next character. If it was also a digit, we'd multiply the accumulated integer by 10 and add the new digit, but in this case we see '|' so we store that we've consumed 1 character (the '3') and return an integer token (with value 3). Then when we want to fetch the next token, the lexer looks at the '|' again, "consumes it" (by incrementing the "consumed this far" counter) and returns a "vertical bar" token. For the next token we have a digit again, so we parse another integer.. and so on. For the minus sign in '-4' we treat the minus as also starting integer parse, but store a flag that we should negate the value before we return it.

This whole thing can run simply based on a counter of how far we have consumed. Normally the "tokens" (ie. values representing delimiters or actual numeric values or whatever) are then consumed by a "parser" which is typically a stack machine (for context free grammars) so that it can keep track of nesting structures like parenthesis. Here the "parser" probably only needs to validate that the order of different tokens is valid.. but for something like parsing mathematical expression I'd probably start from "shunting yard algorithm" which makes it relatively easy to understand the concept of stack machines.

The key point though is that if you have a lexer that converts everything into tokens, the parser part doesn't have to work with the strings at all, rather it works with tokens. Even in a full programming language what you usually do (with some exceptions 'cos some languages have nasty syntax) is let the lexer also figure out identifiers, look them up in a symbol table (and add new ones when not found) and then perhaps return an index to the symbol to the parser, so even if you have say "std::find" what the parser sees is symbol:123 (representing the string "std"), namespace delimiter, symbol:456 (representing the string "find"). There are other ways to do this.. but the point is.. the normal way to parse stuff is to move get rid of the string representation as early as possible (in lexer) so that your main parsing can just work with tokens, numeric values and pointers/indexes to symbol tables etc.

Parsing a full programming language is complex, sure... but the concepts of how to build a lexer and parser are not... and they pretty high on top of the list of theoretical computing science topics that are actually practically useful in day-to-day programming. So.. rather than trying to hack together something by calling methods like std::find that are not really great for parsing, I'd just spend a little bit of time learning about how to write lexers and (if necessary) simple parsers. They are the kinds of things that you learn once, then you never have to think about "how do I parse a string" anymore.

I'm sorry for not being able to recommend a specific tutorial ('cos .. I think I learned from dead trees a long time ago) but like.. there are plenty, if one seems overly complicated, just look for another one.

Post Reply

Return to “DSP and Plugin Development”