In summary, I think there will be no integer or even floating-point rhyme or reason behind these offsets. I am convinced they are hand-picked integers in a two-dimensional matrix. The integers you search for do exist, but you won't find any specific, pretty, mathematical rhyme or reason behind them.
antto wrote:
1.1077003479
1.0633006096
1.0204008818
1
0.9810999632
0.9382010102
0.8908000588
i was (and still am) curious to find out if the real detuning coefficients somehow turn out to be some simple whole numbers
it seems (iirc) that the detuning of the upper three saws vs the lower three saws is not symmetrical in any way, i think i tried converting the frequencies into pitch (log scale) and it isn't symmetrical there either
This isn't a problem from the instrument designer's or musician's point of view, because if there were a simple mathematical relationship between the frequencies, the sound would be less interesting. For instance: if our center is 440Hz and the sidebands were +-2Hz (438 and 442), then we'd hear both sidebands beat with the center frequency exactly twice a second, and they'd beat against each other 4 times a second. Just not that interesting. If it were +2Hz, -2.5Hz, the beats against the center would run at 2 and 2.5 per second, and the whole beat rhythm would repeat every 2 seconds. Still not great. Much better to pick ratios where the numbers are irrational, or at least fractions so long that they'd take minutes to repeat, if not days or years.
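To put numbers on the beat argument, here is a rough Python sketch. It assumes the frequencies are exact rationals and treats the "repeat time" as one over the greatest common divisor of all the pairwise beat rates. The function names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import gcd


def fraction_gcd(a, b):
    """Largest Fraction g such that a/g and b/g are both whole numbers."""
    return Fraction(
        gcd(a.numerator * b.denominator, b.numerator * a.denominator),
        a.denominator * b.denominator,
    )


def beat_repeat_seconds(freqs_hz):
    """Seconds until the combined beat pattern of all the tones repeats."""
    beats = [
        abs(Fraction(x) - Fraction(y))
        for x, y in combinations(freqs_hz, 2)
        if x != y
    ]
    g = beats[0]
    for b in beats[1:]:
        g = fraction_gcd(g, b)
    return 1 / g


print(beat_repeat_seconds([438, 440, 442]))                # 1/2
print(beat_repeat_seconds([Fraction("437.5"), 440, 442]))  # 2
```

The longer and uglier the fractions, the smaller that common divisor gets and the longer the pattern takes to come back around.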
--------------------------
So first let's review how a single sawtooth is probably generated.
The easiest way to make a noisy, non-band-limited sawtooth (which is what these apparently are) is to have a, say, 32-bit integer (sometimes called an "accumulator") that represents phase. If you use that phase directly as your audio output, you get a sawtooth. From your frequency, you determine an "increment", an amount that you add to the phase every sample. (The accumulator accumulates these increments, hence the name.) When the accumulator goes off the high end, the CPU automatically wraps it around to the low end, which means the sawtooth output has a vertical jump once per cycle of the waveform. This is just like a car odometer rolling over from 999,999.9 to 0. I'm POSITIVE this is how the supersaw sawtooths are generated. Nothing is simpler, and we see the exact kind of noise this generates.
Example: say the synth runs at 44.1kHz and we're playing A4 at 440Hz. The increment would be 2^32 * 440 / 44100 = 42,852,281. We just have a phase--it starts out at a random value--and we add 42,852,281 to the accumulator 44,100 times a second. Every 100 or 101 additions the accumulator wraps back from its highest values to its lowest, at which point our phase--and the sawtooth we see if we use the raw phase as audio--goes from gently climbing to taking a big plummet. (And the fact it's sometimes 100, sometimes 101, is why it is so noisy.)
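Here's a minimal Python sketch of that accumulator, just to make the arithmetic concrete. It is a guess at the general technique, not the actual supersaw code. The `& MASK` stands in for the 32-bit wraparound a CPU register would do for free, and all names are hypothetical.

```python
SAMPLE_RATE = 44100   # assumed sample rate from the example above
MASK = 0xFFFFFFFF     # emulates a 32-bit register wrapping around


def phase_increment(freq_hz, sample_rate=SAMPLE_RATE):
    """Integer increment per sample: 2^32 * freq / sample_rate, multiply first."""
    return (freq_hz << 32) // sample_rate


def naive_saw(freq_hz, n_samples, start_phase=0):
    """Raw phase values; used directly as audio they form a naive sawtooth."""
    inc = phase_increment(freq_hz)
    phase = start_phase & MASK
    out = []
    for _ in range(n_samples):
        out.append(phase)
        phase = (phase + inc) & MASK
    return out


def wrap_intervals(phases):
    """Number of samples between successive wraparounds."""
    wraps = [i for i in range(1, len(phases)) if phases[i] < phases[i - 1]]
    return [b - a for a, b in zip(wraps, wraps[1:])]


print(phase_increment(440))                                   # 42852281
print(sorted(set(wrap_intervals(naive_saw(440, SAMPLE_RATE)))))  # [100, 101]
```

The mix of 100-sample and 101-sample cycles in the second line is the jitter described above.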
(They can't, I think, pre-calculate 2^32/44100 and then do a single multiply by 440, because in integer math the order of operations is critical because of rounding and overflow. I haven't checked but I think you'd have unacceptable pitch inaccuracy if you pre-calculated 2^32/44100=97391, and just calculated your increment as 97391 * freq. You get a more accurate answer if you FIRST do all your multiplies together, THEN do all your dividing. But even then you still have a worry: make sure you have big-enough integer variables that they don't overflow. To do 2^32*20,000/44,100, at the top of human hearing, after the multiply you need at least 47 bits...)
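Since I said I hadn't checked, here is a quick Python check of the two orderings at 440Hz, plus the bit width of the intermediate product at 20kHz. The function names are hypothetical, and Python integers never overflow, so this only shows the sizes involved, not what a fixed-width CPU would do.

```python
import math

SAMPLE_RATE = 44100


def increment_multiply_first(freq_hz):
    """(2^32 * freq) / rate: one truncation, at the very end."""
    return (freq_hz << 32) // SAMPLE_RATE


def increment_precomputed(freq_hz):
    """(2^32 / rate) * freq: truncates first, then scales the error by freq."""
    per_hz = (1 << 32) // SAMPLE_RATE  # 97391
    return per_hz * freq_hz


def cents_between(a, b):
    """Pitch difference between two increments, in cents."""
    return 1200 * math.log2(a / b)


exact = increment_multiply_first(440)
quick = increment_precomputed(440)
print(exact, quick, exact - quick)       # 42852281 42852040 241
print(cents_between(exact, quick))       # pitch error of the shortcut, in cents
print((20000 << 32).bit_length())        # 47 bits needed before the divide
```

At this frequency the shortcut comes out 241 counts low, which works out to well under a cent, so the rounding worry may be smaller than I feared. The 47-bit intermediate is the part that really needs care on a 32-bit machine.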
Now, pitch isn't steady. It changes smoothly, due to pitch envelope, portamento, and LFO. To get that smooth change, the increment must change with every sample. And pitch rising or falling is basically always on an exponential scale: falling by an octave a second means frequency changes by a factor of 2 a second, and our increment would also need to halve in a second. But I don't think there's a way to do exponential calculations like this without floating point. What you can do fast in integers is add an increment to your increment. That gives a straight-line ramp in frequency, not an exponential curve. But if you recalculate that increment-of-the-increment every 64 samples or something, then you end up getting a super-close approximation of any curve you want out of a bunch of short straight segments. And the math per sample isn't just simple, but simplest-POSSIBLE. A single integer addition/subtraction to change the increment, then another single integer addition to change the accumulator:
increment = increment + incrementOfIncrement
accumulator = accumulator + increment
Do that 64 times, then calculate a new increment-of-the-increment once. That calculation is simply:
figure out desired pitch "pitchThen" we'll want in 64 samples (could be LOTS of math)
incrementThen = 2^32 * pitchThen / 44100
incrementOfIncrement = (incrementThen - incrementNow + 32) / 64
The +32 is there so the result rounds to nearest instead of always rounding down. And the divide by 64 is actually just shifting right six bits, not even an integer divide (much slower than a shift).
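The whole scheme can be sketched in Python like this. Everything here is an illustration under assumptions: the one-octave exponential glide is just an example target curve, the block-rate pitch math uses floats for brevity (the "could be LOTS of math" step), and the names are mine.

```python
SAMPLE_RATE = 44100
BLOCK = 64            # samples between recalculations
MASK = 0xFFFFFFFF     # emulates 32-bit accumulator wraparound


def freq_to_increment(freq_hz):
    """Block-rate conversion; float math is fine here since it runs rarely."""
    return int(freq_hz * 2**32 / SAMPLE_RATE)


def render_glide(f_start, f_end, n_blocks):
    """Exponential glide approximated by short linear ramps of the increment."""
    acc = 0
    inc = freq_to_increment(f_start)
    out = []
    for b in range(1, n_blocks + 1):
        # Where the pitch should be at the end of this block.
        pitch_then = f_start * (f_end / f_start) ** (b / n_blocks)
        inc_then = freq_to_increment(pitch_then)
        # Rounded divide by 64 done as add-32-then-shift-right-6.
        inc_of_inc = (inc_then - inc + 32) >> 6
        for _ in range(BLOCK):
            inc += inc_of_inc            # one add to bend the pitch
            acc = (acc + inc) & MASK     # one add to advance the phase
            out.append(acc)
    return out, inc


samples, final_inc = render_glide(440.0, 220.0, 689)  # roughly one second
print(len(samples), final_inc, freq_to_increment(220.0))
```

Because each block starts from the increment actually reached, the rounding error never accumulates: after any block the increment is within 32 counts of its target.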
-------------
So now let's look at the sidebands. I've looked at trying to reuse the center frequency's increment-of-increment for the sidebands too. While you can get close for a short segment, once the segment is over the sideband increments would need such a big adjustment that I think it'd make too much noise. However, simply calculating the sidebands the same way the main signal is calculated should be fast enough.
Calculating an increment for say the top sideband, at maximum detune, naively, would be 2^32 * 440 / 44100 * 1.1077003479. The 1.1077003479 could be calculated on the fly with floating point math, but it'd be much faster to simply pre-calculate it for each detune amount for each sideband and put those pre-calculated values into a two-dimensional array.
But even if pre-calculated, we still have a floating point operation. Especially when supersaw started, I don't think there was any floating point math--too expensive! My guess is that instead of getting 1.1077003479 out of a table, they're further pre-calculating 44100/1.1077003479 and putting that into the table. So, calculation for the sideband increment, given a frequency of 440, is instead: 2^32 * 440 / 39812. The 39812 would again come from a two-dimensional array, indexed by sideband# and detune amount.
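As a sketch of what such a table might look like, here is some Python that turns the seven measured ratios quoted above into integer divisors. This is purely my guess at the scheme. Only the maximum-detune row is known from the measurements, so the table has a single row, and the table layout and names are hypothetical.

```python
SAMPLE_RATE = 44100  # assumed; the real hardware rate is not known

# Measured ratios at maximum detune, top saw first (from the quoted post).
MAX_DETUNE_RATIOS = [
    1.1077003479, 1.0633006096, 1.0204008818, 1.0,
    0.9810999632, 0.9382010102, 0.8908000588,
]


def divisor_row(ratios, sample_rate=SAMPLE_RATE):
    """Pre-calculate sample_rate / ratio, rounded to an integer, per saw."""
    return [round(sample_rate / r) for r in ratios]


# Hypothetical two-dimensional table: detune setting -> one divisor per saw.
DIVISOR_TABLE = {"max": divisor_row(MAX_DETUNE_RATIOS)}


def saw_increment(freq_hz, detune_key, saw_index):
    """Integer-only increment: 2^32 * freq / divisor, no floating point."""
    return (freq_hz << 32) // DIVISOR_TABLE[detune_key][saw_index]


print(DIVISOR_TABLE["max"][0])        # 39812
print(saw_increment(440, "max", 3))   # 42852281, the undetuned center saw
print(saw_increment(440, "max", 0))   # top saw, about 10.8% higher
```

The float division happens once, when the table is built. At run time each saw needs only an integer multiply and an integer divide.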
-----------------------------
OK, so that's the bad news for you. Those numbers 1.1077003479 and so on are probably in fact integers like 39812 in a table, and those integers are so large that you're not going to boil their relationship down into a simple "small" fraction like 18/17 or something. Furthermore, being in a table frees them from having to be calculated, which means they don't have to be easy-to-calculate numbers. They could be (and for reasons I gave at the top, probably are) irrationals like the 10th root of 2 or something, selected to be nearish to, but not exactly on, -10%/-6%/-2%/+2%/+6%/+10%. And finally, these irrational numbers aren't even static; instead they're multiplied by a detune amount and then converted to integers.
For every detune, you'll get different rounding errors... which means even if you could do hyper-accurate pitch detection and calculate these integer values for one detune level, the integers for the next detune level would have to be found from scratch.
OTOH, if you could find the output sample rate of the keyboard in question (may be 44.1kHz or 32kHz or something) and you got HYPER-accurate measurements of the detune amounts, I'm guessing that for each value of detune you WOULD ultimately find very specific integers.
In summary, the numbers you're curious about probably exist, but my guess is that they're big integers, not small fun ratios, and that their ratios also vary somewhat with detune amount.