Let's talk DAW/Sequencer design and architecture

DSP, Plugin and Host development discussion.

Post

It can be months or years later that one finally understands "the way it should have been done from the beginning". :)
This becomes a disaster when the project is a contract instead of being an in-house private project, yet it very naturally happens because unless one designs the same project again, some experience is always missing. Probably there is no other business with so many opportunities to shoot oneself in the foot.
~stratum~

Post

JCJR wrote: ...
Is there a reason no-one uses much higher PPQN values than for example 480? Why not use something close to 32000 or so?

Post

Kraku wrote: Is there a reason no-one uses much higher PPQN values than for example 480? Why not use something close to 32000 or so?
Hi Kraku

Some have used bigger PPQN values. In the early days some of the first good sequencers, such as those from Opcode and MOTU, picked 480. So far as I recall there were other "common" PPQN values, but more and more settled on 480 as time went on.

Maybe because it is "big enough" for fairly decent timing, and still small enough that most musicians nerdy enough to use sequencers can easily do timestamp math in the head when they sometimes end up "toughing it out" on a long painful track edit, typing endless numbers into a list editor window.

Maybe 480 has become so common and so many musicians are accustomed to it, that it doesn't make much sense for a sequencer to default to some other value? So far as I know the user can specify the PPQN in many sequencers, so if he doesn't like 480 then he'll set some other value?

I suppose 32000 would work just as well. Some non-mathematical users might take a while getting used to the numbers in list editor windows, and the bigger numbers would require more typing to get the same work done.

Haven't thought about it lately. I recall thinking a few years ago if I was gonna write a new sequencer would probably just use float numbers rather than integers for the tempo-based timestamps. It might not matter, but I recall thinking in that case maybe the timestamps ought to be doubles to properly account for very long sequences-- An hour or more playtime. For instance in addition to movies and tv/radio shows, sometimes folks will make very long repetitive sequences to practice against or for "24/7 background music".

I was wondering whether float32 timestamps might get big enough to become a bit "numerically fuzzy" after a few hours at 480 PPQN. An int32 timestamp remains numerically exact all the way until it overflows. But I was figuring a float64 timestamp ought to be good for long playtimes anyway. Maybe float32 timestamps wouldn't really be a practical problem.
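The float32 question above can be estimated directly. A minimal sketch, assuming IEEE-754 floats and a constant tempo (the function names and the 120 BPM figure used below are illustrative, not from the thread): a floating-point type represents every integer exactly only up to 2^digits, so that bound divided by PPQN and tempo gives the playtime before tick timestamps start to skip values.

```cpp
#include <cstdint>
#include <limits>

// Largest N such that every integer tick in [0, N] is exactly representable
// in floating-point type T: 2^24 for float, 2^53 for double (IEEE-754).
template <typename T>
constexpr std::int64_t max_exact_ticks() {
    return std::int64_t{1} << std::numeric_limits<T>::digits;
}

// Hours of playback before integer tick timestamps stored in T stop being
// exact, for a given PPQN and an assumed constant tempo in BPM.
template <typename T>
double exact_hours(double ppqn, double bpm) {
    const double quarter_notes = static_cast<double>(max_exact_ticks<T>()) / ppqn;
    return quarter_notes / bpm / 60.0;
}
```

Under these assumptions, `exact_hours<float>(480.0, 120.0)` comes out a little under 5 hours, while `exact_hours<double>(32000.0, 120.0)` is in the tens of millions of hours, so float64 looks safe even at a very high PPQN.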

Was just thinking, with float timestamps why not just use a PPQN of 1? 32nd notes at 1.0, 1.125, 1.25, 1.375, 1.5, etc. If you want to push a note at 3.125 the tiniest little bit, maybe type in 3.124999 or whatever. The float64 timestamp ought to have lots better resolution than the actual timing and slop of the typical computer.

Long ago when floating ops were expensive, integer tempo and playback calculations were needed for efficiency. But nowadays especially reconciling tempo against audio during playback, it was to me convenient to do most of the calcs in doubles, so promoting ints to float, doing the calcs, then rounding back to int could be a slight source of slop. Probably runs about the same speed whether you load an int to float, or just load the float. Was just thinking maybe more elegant to quit using the ints entirely. Maybe just use float64 Seconds for the audio timestamp and float64 one tick per quarter note for tempo-based timestamp?
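As a rough illustration of the "float64 seconds plus float64 quarter notes" idea, here is a sketch that assumes a single constant tempo (a real sequencer would walk a tempo map). The function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Tempo-based position in quarter notes (1 tick per quarter note, float64)
// converted to seconds. Assumes one constant tempo in BPM.
inline double beats_to_seconds(double beats, double bpm) {
    return beats * 60.0 / bpm;
}

// Seconds converted to the nearest audio sample frame at the given rate.
// This is the only place where rounding to an integer happens.
inline std::int64_t seconds_to_sample(double seconds, double sample_rate) {
    return std::llround(seconds * sample_rate);
}
```

For example, beat 4.0 at 120 BPM lands at 2.0 seconds, which is sample 96000 at 48 kHz.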

Post

Hmm, the ease of writing the values in a list editor sounds like a logical reason for the smaller PPQN values to become standard during the years.

Reading your description, the float64 sounds like a logical step forward.

I've been thinking of writing a simple sequencer. Maybe I should try how it works out if I don't use the PPQN but maybe map the floating point range of 0..1 to a whole note or even the full measure (4/4, 3/4, etc.).

Post

Linked lists are fine, unless you need to jump arbitrarily into the middle. Then an array that you can shorten/lengthen as needed may be the better choice. You can live with holes in arrays, but you will need to periodically condense the content, especially during heavy editing. In linked lists, editing is far simpler and more efficient, especially cutting and pasting huge chunks of data. Doing the same in a fixed array can be very time consuming.
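The cut-and-paste point can be sketched with `std::list::splice`, which relinks nodes instead of copying elements. The `Event` struct and `cut_and_paste` helper are hypothetical. Finding the positions still means walking the list, which is the "jumping into the middle" cost mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>

struct Event {            // hypothetical minimal event record
    std::int64_t tick;    // timestamp in ticks
    int note;             // MIDI note number
};

// Cut `count` events starting at index `first` out of `src` and paste them
// before index `at` in `dst`. The nodes are relinked, not copied.
inline void cut_and_paste(std::list<Event>& src, std::size_t first, std::size_t count,
                          std::list<Event>& dst, std::size_t at) {
    auto begin = std::next(src.begin(), static_cast<std::ptrdiff_t>(first));
    auto end = std::next(begin, static_cast<std::ptrdiff_t>(count));
    auto where = std::next(dst.begin(), static_cast<std::ptrdiff_t>(at));
    dst.splice(where, src, begin, end);
}
```

Doing the same with a `std::vector` would shift every element after the cut and after the paste point, which is where the time goes for huge chunks.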

Pad your structures out to 64-bit boundaries, or similar. The padding will give you room to grow, plus every little cycle gained in speed will add up overall.
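One way to read the padding advice, as a sketch only: the field layout below is hypothetical, and whether it actually gains speed depends on the access pattern. The reserved bytes are the "room to grow", and the `static_assert`s catch accidental size changes.

```cpp
#include <cstdint>

// Hypothetical event record padded to 32 bytes and aligned to 8 bytes.
struct alignas(8) MidiEvent {
    std::int64_t tick;          // timestamp in ticks
    std::int64_t duration;      // note length in ticks
    std::uint8_t status;        // MIDI status byte
    std::uint8_t data1;         // first data byte
    std::uint8_t data2;         // second data byte
    std::uint8_t reserved[13];  // explicit padding, free for future fields
};

static_assert(sizeof(MidiEvent) == 32, "MidiEvent should stay 32 bytes");
static_assert(alignof(MidiEvent) == 8, "MidiEvent should be 8-byte aligned");
```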

Learn SIMD so that you can see patterns that can benefit from it and let it influence your design early on so major rewrites don't happen.

There are a lot of design decisions that need to be made before you commit to your first line of code. Best to get as many of them out of the way as possible so any snags are easily overcome.

And remember: Everything is an Object. :lol:
I started on Logic 5 with a PowerBook G4 550Mhz. I now have a MacBook Air M1 and it's ~165x faster! So, why is my music not proportionally better? :(

Post

Kraku wrote:Hmm, the ease of writing the values in a list editor sounds like a logical reason for the smaller PPQN values to become standard during the years.

Reading your description, the float64 sounds like a logical step forward.

I've been thinking of writing a simple sequencer. Maybe I should try how it works out if I don't use the PPQN but maybe map the floating point range of 0..1 to a whole note or even the full measure (4/4, 3/4, etc.).
Hi Kraku
Sequencing began when CPUs were slow and memory was small. I worked with a company long ago where internal timestamps were uint32 and duration was uint16 because it seemed too "memory wasteful" to use int32 for the duration field. And it was a serious worry about the size of memory-resident MIDI tracks at the time. Not just silly "saving bytes where it's not worth worrying about".

But nowadays they are trivially small compared to the size of the audio data even if you might throw everything including the kitchen sink into yer private MIDI event object.

Using only a uint16 for duration meant that it was low-probability but not uncommon that a user might want to play a longer-duration note than the uint16 could handle. So there had to be annoying kludges to handle the occasional exceptions of very long notes, all in the name of saving 2 bytes per MIDI event. :)

Just saying, at 32000 PPQN, maybe there could be long songs that overflow a uint32 "too soon" for some uses, and so if using int timestamps, perhaps advisable to use an int64 for timestamp and duration. Computers are commonly 64 bit nowadays, but int64's were a bit annoying to me on 16 bit or 32 bit computers. Not to mention 8 bitters such as Commodore 64. :)

I haven't looked at the float resolution issue for a long time, but even with float64 it may be that very long songs might get somewhat "numerically fuzzy" toward the end at 32000 PPQN. Or maybe not, can't recall.

So far as I recall, things such as MIDI file tempo mapping were oriented to "make the best sense" based on quarter notes. So assigning float One Tick Per Whole Note might (or might not) be more brain damage depending on the time signature. In 3/4 or 6/8, with bar 1 at 1.0, bar 2 would have a timestamp of 1.75 rather than a nice 2.0. Bar 3 would start at 2.5, etc.

Even with 1 tick per quarter note, it might be slightly awkward with timesigs such as 5/8. I mean, the computer would play just fine, but timestamps might not make a lot of sense in a list editor unless the timestamp and duration fields are parsed to some more musically-understandable format.

Maybe could use "1 tick per bar" regardless of the timesig, but that would probably hurt my brain. Which is not very difficult to do.

I was thinking if float 1 tick per quarter note works good enough for internal resolution, it would be simple to display list editor data or other locations to whatever resolution the user wants. If he wants 96 PPQN, just multiply all the float 1 tick per quarter note values by 96 for display. The innards would always work exactly the same regardless of the user's preferred PPQN.
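A sketch of that display conversion, assuming the internal position is a float64 count of quarter notes (the function names are made up for illustration). The value is rounded to the nearest display tick rather than truncated, so that tiny representation errors do not show up as off-by-one ticks.

```cpp
#include <cmath>
#include <cstdint>

// Internal position (quarter notes, float64) converted to ticks at the
// user's preferred display PPQN, rounded to nearest.
inline std::int64_t to_display_ticks(double quarter_notes, int display_ppqn) {
    return std::llround(quarter_notes * display_ppqn);
}

// Inverse direction, for a value typed into a list editor.
inline double from_display_ticks(std::int64_t ticks, int display_ppqn) {
    return static_cast<double>(ticks) / display_ppqn;
}
```

With this, position 3.125 shows as 300 at 96 PPQN and as 1500 at 480 PPQN, while the stored value never changes.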

As best I recall, you typically do PPQN conversions when loading or saving MIDI files anyway. On file load, you would scale the MIDI file PPQN to whatever PPQN the sequencer is running at. So if using a float 1 tick per quarter note or whatever seems to make sense, it would be the same conversions when loading or saving MIDI files.
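The load-time scaling might look like the following sketch. It shows an integer rescale between two PPQN values and the float variant that maps file ticks to quarter notes. The names are hypothetical, and the integer version assumes non-negative ticks that are small enough for the product not to overflow int64.

```cpp
#include <cstdint>

// Rescale a non-negative tick from the file's PPQN to the sequencer's PPQN
// using integer math with round-to-nearest.
inline std::int64_t rescale_ticks(std::int64_t tick, std::int64_t from_ppqn,
                                  std::int64_t to_ppqn) {
    return (tick * to_ppqn + from_ppqn / 2) / from_ppqn;
}

// Float variant: file ticks to quarter notes (1 tick per quarter note).
inline double file_ticks_to_quarter_notes(std::int64_t tick, int file_ppqn) {
    return static_cast<double>(tick) / file_ppqn;
}
```

Scaling up (96 to 480) is lossless, while scaling down (480 to 96) has to round, which is one argument for keeping the internal resolution at least as fine as anything a file is likely to contain.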

Maybe somewhat similar to the advantage of always using float audio internal to a program. Regardless whether you load an 8, 16, 24 or 32 bit file, you convert to float. So all the internal operations handle any audio file the same way with no cumbersome exceptions, just To-Float conversion on the way in and From-Float conversion on the way out.
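For comparison, the audio-side conversion being described could be sketched like this for 16-bit samples. Scaling by 32768 is one common convention among several, and is an assumption here, as are the function names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// To-float conversion on the way in: int16 sample to the range [-1.0, 1.0).
inline float int16_to_float(std::int16_t s) {
    return static_cast<float>(s) / 32768.0f;
}

// From-float conversion on the way out: round, then clamp to the int16 range.
inline std::int16_t float_to_int16(float x) {
    const float scaled = std::round(x * 32768.0f);
    const float clamped = std::max(-32768.0f, std::min(32767.0f, scaled));
    return static_cast<std::int16_t>(clamped);
}
```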

Maybe float Tempo Ticks would offer the same advantage. Or maybe not.

Post

Ticks were integers because integers were faster and more accurate. 480 = 2^5 * 3 * 5, giving you 32nd notes, triplets and quintuplets, but not a lot of room for swing.

I find floats to be awful to work with, working in a VST with all values from 0.0 to 1.0. I have to use fudge factors in some cases when rounding to get the rounded integers to come up right! I can imagine the horrors of time synching based on floats!

Just pick a good, big number with lots of divisors (100,800?) and go with 32-bit or 64-bit values. You no longer have to fit into a 512k Atari!
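The "lots of divisors" argument is easy to check. This small sketch (the helper name is hypothetical) returns the tick length of one step when a quarter note is split into `parts` equal pieces, or -1 when the PPQN cannot represent that split exactly.

```cpp
// Ticks per step when a quarter note is divided into `parts` equal pieces,
// or -1 if `ppqn` is not evenly divisible by `parts`.
inline int ticks_per_subdivision(int ppqn, int parts) {
    return (ppqn % parts == 0) ? ppqn / parts : -1;
}
```

At 480 PPQN, triplets (160 ticks) and quintuplets (96 ticks) are exact but septuplets are not. At 100800 (= 2^6 * 3^2 * 5^2 * 7) a septuplet step is exactly 14400 ticks.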

Post

JCJR wrote:Just saying, at 32000 PPQN, maybe there could be long songs that overflow a uint32 "too soon" for some uses, and so if using int timestamps, perhaps advisable to use an int64 for timestamp and duration. Computers are commonly 64 bit nowadays, but int64's were a bit annoying to me on 16 bit or 32 bit computers. Not to mention 8 bitters such as Commodore 64. :)
int32 probably still works fairly well. If I was using 15 bits for PPQN, the rest from the int32 (32-15 = 17 bits) could still map out a really long song at 125 BPM. I'll have to do some math here to figure out myself how long the song could be:

Quarter-note length at 125 BPM = 60/125 seconds = 0.48 seconds.
The timestamp/length can have 2^17 quarter note locations, which equals 60/125 * 2^17 = ~62 914 seconds, which is ~1 048 minutes or ~17.5 hours.

Hopefully I got the math right :?

So int32 wouldn't be great for broadcast environment, but for regular music production it's still probably fine. But as has been mentioned, there's plenty of memory in today's computers so it makes sense to use int64 and be done with it :)

JCJR wrote: I haven't looked at the float resolution issue for a long time, but even with float64 it may be that very long songs might get somewhat "numerically fuzzy" toward the end at 32000 PPQN. Or maybe not, can't recall.
That has started to be my concern now that I think of it. I might want to stick to integers for timestamps.

JCJR wrote: I was thinking if float 1 tick per quarter note works good enough for internal resolution, it would be simple to display list editor data or other locations to whatever resolution the user wants. If he wants 96 PPQN, just multiply all the float 1 tick per quarter note values by 96 for display. The innards would always work exactly the same regardless of the user's preferred PPQN.
This would be a really nice advantage.

Post

syntonica wrote:Just pick a good, big number with lots of divisors (100,800?) and go with 32-bit or 64-bit values. You no longer have to fit into a 512k Atari!
Actually I might have to. I might try to make a hardware sequencer instead of a software one. The microcontrollers have fairly limited memory :) Not sure yet though if I'll go with software or hardware seq.

Post

JCJR wrote: I was thinking if float 1 tick per quarter note works good enough for internal resolution, it would be simple to display list editor data or other locations to whatever resolution the user wants. If he wants 96 PPQN, just multiply all the float 1 tick per quarter note values by 96 for display. The innards would always work exactly the same regardless of the user's preferred PPQN.
I wouldn't do this because of how floats work.
You need to take care not to run into rounding errors, and to handle non-fractional increments, etc.

E.g. try this:

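#include <cstdio>

int main() {
	const int PPQN = 96;
	const float PQN_step = 1.0f / float(PPQN);

	// Accumulating a float step drifts; truncating to int then exposes the error.
	for (float f = 0.0f; f < 1.0f; f += PQN_step) {
		printf("%f => %d\n", f * PPQN, int(f * PPQN));
	}
}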
...
7,000000 => 7
8,000000 => 7 :o :o :o
8,999999 => 8
9,999999 => 9
10,999999 => 10
..
==> This needs special care, and I might forget to remember it, so I always use integer types for this kind of data.
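One way to apply that conclusion, sketched under the assumption that the integer tick is the source of truth: count in integers and derive each float position from the counter with a single division, instead of adding a float step repeatedly. The function name is hypothetical.

```cpp
#include <vector>

// Positions (in quarter notes) of every tick inside one quarter note.
// The loop counter is an exact integer; each position is derived from it
// with one division, so rounding errors do not accumulate across steps.
inline std::vector<double> positions_in_quarter_note(int ppqn) {
    std::vector<double> positions;
    positions.reserve(static_cast<std::vector<double>::size_type>(ppqn));
    for (int tick = 0; tick < ppqn; ++tick) {
        positions.push_back(static_cast<double>(tick) / ppqn);
    }
    return positions;
}
```

Each derived position is off by at most one rounding step, so converting it back with round-to-nearest recovers the original tick.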

Post

This ^^^^

Thank you for the quick demo. That's what I was talking about with fudge factors. For my conversions from floats->ints, I have to add .001 to each before conversion. I figure with only 128 values needed, that's about right.

Post

Kraku wrote:int32 probably still works fairly well. If I was using 15 bits for PPQN, the rest from the int32 (32-15 = 17 bits) could still map out a really long song at 125 BPM. I'll have to do some math here to figure out myself how long the song could be:

Quarter-note length at 125 BPM = 60/125 seconds = 0.48 seconds.
The timestamp/length can have 2^17 quarter note locations, which equals 60/125 * 2^17 = ~62 914 seconds, which is ~1 048 minutes or ~17.5 hours.

Hopefully I got the math right :?

So int32 wouldn't be great for broadcast environment, but for regular music production it's still probably fine. But as has been mentioned, there's plenty of memory in today's computers so it makes sense to use int64 and be done with it :)
Hi Kraku
Yep, lots of ways to skin the cat. Regardless what strategy is picked, maybe if one develops on a program for enough years one could get "painted into a corner" by early decisions. Maybe about any approach will get cumbersome after you have added-on "too many features". :)

Maybe worst-case playtime would be based on the fastest tempo you intend to allow? For instance if the max allowed tempo is 500 BPM, 2^32 / 2^15 = 131072 quarter notes. 131072 quarter notes / 500 BPM = 262.14 minutes or about 4.37 hours.
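The playtime estimates in this thread all follow one formula, sketched here with a hypothetical helper. `bits` is the number of usable bits in the tick counter. Note that a signed int32 has only 31 usable bits, which halves the figures quoted for 32.

```cpp
#include <cstdint>

// Hours of playback until a tick counter with `bits` usable bits wraps,
// at the given PPQN and a constant tempo in BPM. Requires bits < 64.
inline double max_playtime_hours(int bits, double ppqn, double bpm) {
    const double ticks = static_cast<double>(std::uint64_t{1} << bits);
    return ticks / ppqn / bpm / 60.0;
}
```

With 32 bits and a 15-bit PPQN (32768), this gives roughly 17.5 hours at 125 BPM and roughly 4.37 hours at 500 BPM, matching the numbers worked out by hand.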

Hardly anybody will actually use 500 BPM, but if you don't allow at least that fast then people will likely complain about it. OTOH, even if you max it out at 500 BPM then somebody will eventually complain that yer program sux because 501 BPM is impossible. :)
Kraku wrote:
I haven't looked at the float resolution issue for a long time, but even with float64 it may be that very long songs might get somewhat "numerically fuzzy" toward the end at 32000 PPQN. Or maybe not, can't recall.
That has started to be my concern now that I think of it. I might want to stick to integers for timestamps.
Well, if I was gonna base it on floats, it would be float64 without exception.

I could be way wrong-- Am not so sharp in my dotage and haven't thought much about it lately-- But if it is deemed mandatory to preserve the "numeric purity" of tick timestamps by storing integers-- Then you would have to do all tempo-based play/record/sync calculations with high-bit-depth fixed point math, and even that seems ultimately futile because even with fixed point math you inevitably have to trunc or round the results.

If you use any float64 math on timestamps during play/record/sync calculations, then floating-point oddities have entered the system as soon as you load the ints to float registers and do float calculations on them. In that case, might as well save the timestamps as float64 because saving them as integers I don't think would make the playback "more accurate" or whatever.

I did fixed-point playback math for many years before computers could crunch fast floats. It is not much fun. And ultimately, it isn't any more accurate unless you use ridiculously big fixed-point calc registers. Multiplications and divisions make lots more bits output than the number of bits input, and you usually have to discard the extra bits. Which results in trunc or rounding, as with floats.
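The "more bits out than in" point can be seen in a small fixed-point multiply. This is a sketch using a hypothetical Q16.16 format (16 integer bits, 16 fractional bits). It assumes the usual arithmetic right shift for negative values, which is only guaranteed by the standard from C++20 on.

```cpp
#include <cstdint>

using Q16_16 = std::int32_t;  // hypothetical format: 16 integer, 16 fractional bits

// Convert a double to Q16.16, rounding to nearest.
inline Q16_16 to_q16(double x) {
    return static_cast<Q16_16>(x * 65536.0 + (x >= 0 ? 0.5 : -0.5));
}

// Multiply two Q16.16 values. The 64-bit product carries 32 fractional bits;
// shifting back to 16 discards the low 16 bits, with round-to-nearest.
inline Q16_16 q16_mul(Q16_16 a, Q16_16 b) {
    const std::int64_t wide = static_cast<std::int64_t>(a) * b;
    return static_cast<Q16_16>((wide + 0x8000) >> 16);
}
```

Multiplying the two smallest positive values (raw 1 times raw 1) yields 0, because the true product needs 32 fractional bits. That is the same kind of rounding loss floats have, just at a fixed position.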

Post

Cool thread, my reply is not on a technical level as much as on 'let's build a daw for commercial release' level and the path it took me on.

I made (outsourced the coding) a little 16-track sequencer/DAW for beginners with sounds - and it exploded on the market (dubturbo). It is all done in Flash, and rather limiting compared to any C++/actual DAW out there. So 5 versions later (and 4-5 years) I tried to find the talent to overhaul my little Flash tool into a full-blown C++ DAW.

In short, the last 2.5 years, I went through 2 dev teams, a bunch of money/asset dev/UI design/Audio design etc - and I had to drop the project totally or seek out a 3rd dev team.

It's the most ambitious thing I've tried to do in music and in business - and it has brought me both my biggest success and my biggest loss (biggest fish in the pond to tadpole in an ocean). We got halfway or so the first time, with JUCE, then it was removed because of incompatibility with redrawing UI or something, then re-added, then Qt was tied in, and we got stuck for months at a time debugging low-level stuff. I commend you for taking on this challenge and wish I had found you guys a few years ago, as it sounds like you're capable.

LMMS is another neat collective project that has come quite far for what it is.

Best of luck with it, if you get it to a point where it's useable I'd love to see what you come up with :)
N.

Post

syntonica wrote:That's what I was talking about with fudge factors. For my conversions from floats->ints, I have to add .001 to each before conversion.
In some cases it's better to use round() instead of trunc().
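A sketch of the difference, using hypothetical helper names: both functions take a value that has already been scaled to ticks, such as the 8.999999 from the demo output earlier in the thread.

```cpp
#include <cmath>

// Truncation: drops the fractional part, so 8.999999 becomes 8.
inline int ticks_truncated(float scaled_ticks) {
    return static_cast<int>(scaled_ticks);
}

// Rounding to nearest: 8.999999 becomes 9, with no fudge factor needed.
inline int ticks_rounded(float scaled_ticks) {
    return static_cast<int>(std::lround(scaled_ticks));
}
```

Rounding only helps when the intended value is an integer plus a small error. A position that is legitimately near x.5 still needs a deliberate choice of rounding direction.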

Post

I would definitely use an int to track time, and not a float. Floats are not equally precise over their whole range, so it is not possible to track time accurately with them. Hence int.
For numerical data like sound waves, or other types of signal, this requirement for accuracy is not as stringent, so floats are better than int/fixed-point arithmetic.
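The "not equally precise over their whole range" point can be measured with `std::nextafter`, which returns the next representable value. This is a sketch assuming IEEE-754 types, and the helper name is made up.

```cpp
#include <cmath>
#include <limits>

// Gap between x and the next representable value above it, i.e. the finest
// time step that type T can resolve at magnitude x.
template <typename T>
T spacing_above(T x) {
    return std::nextafter(x, std::numeric_limits<T>::infinity()) - x;
}
```

For a timestamp in seconds at the 24-hour mark (86400), the gap is about 7.8 ms for float32 but only about 1.5e-11 s for float64. How much that matters depends on which float type is meant and how long the timeline runs.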
