Roland Supersaw - any idea how the original was done?

DSP, Plugin and Host development discussion.

Post

Xoxos, don't forget panning too.

Moselle has a module called Adder, which specializes in additive synthesis. It ONLY has pitch and amplitude for each osc, but is high performance.

There's also a module called FMAlgo, which is like Adder, but additionally has a modulation input for each osc. Each waveform has specific outputs for pre-gain (for DX-7-style feedback), non-amplitude-scaled (for modulation) and audio purposes, and phase inputs.

But there's no fixed number of oscillators, so if you just want to vary everything simultaneously, just make a big handful of normal oscillators. I've got a demo patch that uses 10 regular oscillators... then uses an FMAlgo not as an oscillator but as a CPU-efficient battery of 30 LFOs, setting panning, volume, and detune for the regular oscillators.

Back to the swarm: you can specify an overall "shape" for the waveforms, controlling frequency and amplitude. For instance you might want the detuned ones to get quieter as they get farther from the center, and spaced farther apart. You can set those EXACTLY how you want, osc by osc if you want. Now: once the Swarm is playing, you can't vary that shape, but you can adjust the overall shape to be wider and narrower, taller and shorter.

Post

cheers, wanted to build it myself for ensemble emulations, also with variable gate time per osc. that ought to be feasible with envelopes on the modulators then :)

Post

It's supeso day!

Post

Concerning panning supersaws, there are several options.

Let's take an 8-saw supersaw: we could pan 4 saws hard left and 4 hard right, or we could distribute them equally over the stereo field.

What sounds the best?

Also, should the detuning be so that on one side (eg. L) the detuning is down and on the other side (R) the detuning is up?
(NI Massive is doing this).

Post

valhallasound wrote:
Borogove wrote:
Swiss Frank wrote:A question about the killer article at: http://www.ghostfact.com/jp-8000-supersaw/

"This makes sense when you think about it - efficiency was king, and it's hard to beat the efficiency of two extra adds per sample - things are a lot easier when you embrace aliasing rather than reject it, huh?"

I don't get it: two extra adds to do what? Or do you mean the 7 sawtooths really are just 7 naive, non-BWL sawtooths (which I guess may only take a few additions each)?! That's the secret?! And if so, why only for non-BWL sawtooths?
That must be two adds per wave per sample. One add to increment the phase of the naive saw, one add to sum the waves together.

The spectra shown in that article strongly suggest naive aliased saws with a highpass, but I'd like to see the plots for higher pitched notes where the aliasing would be more dominant.
My guess is that there was a multiply or two involved in there, as odds are this was in fixed point, where you would need to scale before summing in order to avoid clipping. If multiplications were dear, a simple bit shift might have been used to scale the volume down.
Just revisiting this discussion. I think I'm wrong about the multiply or two in there. My guess is that the structure of the supersaw is something like this:

out = highpass((mixA*masterSaw) + (mixB*superSaw));

The superSaw, in this case, would consist of six saws. Each saw is simply an unsigned int, with its own phase increment and stored phase state. The outputs of all 6 saws are added together, as unsigned ints. If the sum exceeds the maximum value of the unsigned int, it will just wrap around. This is similar to the comparator trick described in hardware (i.e. a sawtooth + an offset voltage, run through a comparator, results in a sawtooth of a different phase). It also means that the output level of the summed saws will never exceed the level of a single saw.
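A minimal sketch of the wrap-around summing described above, in Python with a 32-bit mask standing in for a hardware unsigned int. The function name, the start phases and the increments are made up for illustration; nothing here is confirmed to match the JP-8000 firmware.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned register

def wrapped_saw_sum(phases, increments):
    """Advance each naive saw by one sample; return (new_phases, summed_output).

    Each phase is a 32-bit unsigned value that doubles as the saw's output.
    The running sum is also kept to 32 bits, so overflow wraps instead of clipping.
    """
    new_phases = [(p + inc) & MASK32 for p, inc in zip(phases, increments)]
    total = 0
    for p in new_phases:
        total = (total + p) & MASK32  # wraps like unsigned int addition
    return new_phases, total

# Six hypothetical detuned saws: arbitrary start phases and increments.
start_phases = [0x10000000 * k for k in range(6)]
increments = [42_000_000 + 150_000 * k for k in range(6)]

phases = list(start_phases)
outputs = []
for _ in range(1000):
    phases, out = wrapped_saw_sum(phases, increments)
    outputs.append(out)
```

One thing this makes visible: because everything is taken modulo 2^32, the wrapped sum of full-width phases behaves like a single accumulator whose increment is the sum of the six increments. If the saws are meant to stay audible as separate voices, they would presumably need to be scaled down (for example by a bit shift) before summing. That is an observation about this sketch, not a claim about what the hardware does.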

As far as aliasing: it would be interesting to see the harmonic spectra of just the aliasing of the supersaw, for higher oscillator frequencies. I wonder if, when things get detuned enough, the aliasing noise is spread out enough that it sounds like plain ol' noise. If you don't hear noise below the fundamental, the aliasing noise of the 7 saws might be closer to what you hear in bowed strings.

Post

Chris-S wrote:Also, should the detuning be so that on one side (eg. L) the detuning is down and on the other side (R) the detuning is up?
I'd mix up the detuning. If you have six detuned saws, and the detuning is Up/DownA, Up/DownB, Up/DownC (where A, B, and C are the detune factors discussed earlier in the thread), then do something like the following:

outLeft = masterSaw + upA + downB + upC;
outRight = masterSaw + downA + upB + downC;

You might want to vary the levels of the masterSaw versus the other saws, so that the masterSaw doesn't dominate (scaling masterSaw in each channel by 0.5 or 0.7071 might be useful).
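The interleaved left/right mix above can be sketched as a small Python function. The function name and the default master gain of 0.7071 are just one reading of the suggestion; the inputs are assumed to be the current sample of each saw.

```python
import math

def stereo_supersaw_mix(master, up, down, master_gain=math.sqrt(0.5)):
    """Mix one sample: master in both channels, detuned pairs alternated L/R.

    up and down are 3-element sequences holding the current sample of the
    up-detuned and down-detuned saws for detune factors A, B and C.
    """
    up_a, up_b, up_c = up
    down_a, down_b, down_c = down
    left = master_gain * master + up_a + down_b + up_c
    right = master_gain * master + down_a + up_b + down_c
    return left, right
```

Each channel gets a mix of up and down detunes, so neither side sounds sharp or flat on its own.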

Post

My own software synthesizer has a new module called "Swarm" that can do hundreds or thousands of sawtooths (or triangles or squares). Here's a video letting you hear it at work. https://www.youtube.com/watch?v=EbCqdmUbAo4

I'm thankful you asked the question of panning, because I've wondered how to do that myself, with my own software (!) and just thinking about your question I see that it's not only possible but pretty easy.

OK: the Swarm is mono. It lets you specify a formula that gives the amplitude of each component wave. That lets you make the components get quieter as they get farther from the center, or anything else you want.

To pan something digitally, the simplest way is to state your pan in terms of 0 (full left) to pi/2 (full right), and have the left channel be the signal times cos(pan), while the right channel is the signal times sin(pan).

If you wanted, you could specify TWO swarm modules, and pan them hard left and hard right. Then, have the amplitudes additionally vary by cos( f(n) ) and sin( f(n) ) respectively. The resulting effect is that of having them panned around. f(n) would just give a pan position for each of n components. You could do low frequency left/hi frequency right, or anything else you wanted.
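A minimal Python sketch of the cos/sin pan law and a per-component pan function f(n). The even left-to-right spread in `spread_pans` is only one hypothetical choice of f(n), and the function names are invented for illustration.

```python
import math

def pan_gains(pan):
    """Return (left_gain, right_gain) for pan in [0, pi/2]; 0 is full left."""
    return math.cos(pan), math.sin(pan)

def spread_pans(n):
    """Hypothetical f(n): spread n components evenly from full left to full right."""
    if n == 1:
        return [math.pi / 4]  # a single component sits in the center
    return [k * (math.pi / 2) / (n - 1) for k in range(n)]

def pan_components(samples):
    """Pan one sample from each component and sum into a stereo pair."""
    left = right = 0.0
    for s, pan in zip(samples, spread_pans(len(samples))):
        gain_l, gain_r = pan_gains(pan)
        left += s * gain_l
        right += s * gain_r
    return left, right
```

Since cos^2 + sin^2 = 1, the total power of each component stays constant wherever it is panned.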

My development workstation is currently in international shipping or I'd drop everything and make a demo of this right now.

Post

JediMind wrote:IIRC, there are just 7 detuned saws :shrug:
Not in 1996. Seven real emulated saws would bring the JP to its knees.

Post

Swiss Frank wrote:My own software synthesizer has a new module called "Swarm" that can do hundreds or thousands of sawtooths (or triangles or squares). Here's a video letting you hear it at work. https://www.youtube.com/watch?v=EbCqdmUbAo4

I hate that sound of hundreds of saws, it sounds like a f**king mess where the pleasant character of a single saw is totally destroyed and replaced by an annoying fuzzy wall of sound, reminds me of fingernails on a blackboard.

Post

> I hate that sound of hundreds of saws, it sounds like a f**king mess

But I bet you loved the sound of thousands! :-D

Hehe, just kidding, you make a good point. This demo had zero filtering to let you hear the full output, but I agree it's one of the roughest sounds out there.

I'll be making a stereo demo of swarm soon, and will do some filtered versions as well. Maybe I should add like a 20 and 200 voice demo too. While the thing CAN go to high numbers, it certainly doesn't MAKE you go there...

Post

Matt-vank wrote:
JediMind wrote:IIRC, there are just 7 detuned saws :shrug:
Not in 1996. Seven real emulated saws would bring the JP to its knees.
I don't think people here are arguing for seven "real emulated" saws. The theory is that there are 7 sawtooth oscillators, with no antialiasing going on. Each saw would require, per sample:

- calling up the saw state from memory (hopefully a register)
- adding an increment value to this saw state
- adding the resulting output to a running sum for the output of the saws
- saving the current state of the saw to memory

Once all 7 saws are generated and summed (or, more likely, the 6 detuned saws are generated separately from the master), the sum is run through a simple highpass filter, with the cutoff of the highpass located at the fundamental frequency of the master saw.

This wouldn't be super prohibitive, even by 1996 standards. 2 adds, a memory recall, and a memory save per saw, plus a single shared 1st order (?) highpass filter for all saws.
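The per-sample loop above can be sketched in Python as follows. The 44.1 kHz rate, the equal-weight mix, the conversion to floats and the particular one-pole highpass are all assumptions made for readability; real fixed-point hardware would do the scaling and filtering differently.

```python
import math

SAMPLE_RATE = 44100   # assumed; the real hardware rate is not stated here
MASK32 = 0xFFFFFFFF

def make_increment(freq_hz):
    return int((1 << 32) * freq_hz / SAMPLE_RATE) & MASK32

def render(freqs_hz, cutoff_hz, num_samples):
    """Sum naive saws (no anti-aliasing), then apply one shared 1st-order highpass."""
    phases = [0] * len(freqs_hz)                  # saw state, one word per saw
    incs = [make_increment(f) for f in freqs_hz]
    # One-pole highpass coefficient (standard RC discretisation).
    alpha = 1.0 / (1.0 + 2.0 * math.pi * cutoff_hz / SAMPLE_RATE)
    prev_in = prev_out = 0.0
    out = []
    for _ in range(num_samples):
        acc = 0.0
        for i in range(len(phases)):
            phases[i] = (phases[i] + incs[i]) & MASK32   # add 1: advance the phase
            acc += phases[i] / 2**31 - 1.0               # add 2: running sum (scaled to -1..1)
        acc /= len(phases)                               # keep the sum in range
        prev_out = alpha * (prev_out + acc - prev_in)    # shared highpass
        prev_in = acc
        out.append(prev_out)
    return out

# Seven saws around 440 Hz with illustrative detune ratios, highpass at the fundamental.
ratios = (1.1077, 1.0633, 1.0204, 1.0, 0.9811, 0.9382, 0.8908)
out = render([440.0 * r for r in ratios], cutoff_hz=440.0, num_samples=2000)
```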

Post

Roland used such a system on the Juno synth, where all waveforms were derived from a master wave. At least that is what I read somewhere...

Post

not sure if this would help, but imma put it here
last year i reopened Sandy's (i think) recordings from the jp80x0 in wav format, specifically one of them which has the supersaw playing long notes at different pitches, with detune going from zero to max
using oscillators in SynthEdit, i added some maths to get tuning coefficients from the center sawtooth for the other 6 saws
then i panned the wav files left, my oscillators right, and used my experimental bandpass-based spectrogram to compare and match my oscs to the wav
with enough zoom in, i got supposedly more accurate tuning coefficients than i had before
here are my notes from then:

Code:

jp8000 supersaw detune coeffs 2017-08-24
----------------------------------------

freq = 0.4395004511 kHz
multipliers:

1.1077003479
1.0633006096
1.0204008818
1
0.9810999632
0.9382010102
0.8908000588
-------------
these are multipliers in linear frequency scale
for each sawtooth, the frequency (in Hz) is:
F = F_center * (1 + detune_slider * (dt[i] - 1))
  F_center = main oscillator center freq (in Hz)
  detune_slider = 0 to 1 (0 = all saws in unison, 1 = max detune)
  dt[i] = the multiplier coeff from above, for the given sawtooth

-------------
rounded:
1.1077
1.0633
1.0204
1
0.9811
0.9382
0.8908
i was (and still am) curious to find out if the real detuning coefficients somehow turn out to be some simple whole numbers
it seems (iirc) that the detuning of the upper three saws vs the lower three saws is not symmetrical in any way, i think i tried converting the frequencies into pitch (log scale) and it isn't symmetrical there either
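A small Python sketch for checking the symmetry question with the measured multipliers. It assumes the detune slider interpolates linearly between unison (slider 0) and the measured max-detune values (slider 1); that interpolation is a guess, and the hardware's detune curve may well be nonlinear.

```python
import math

# Multipliers measured at maximum detune, copied from the notes above.
DT = [1.1077003479, 1.0633006096, 1.0204008818, 1.0,
      0.9810999632, 0.9382010102, 0.8908000588]

def saw_frequencies(f_center, detune_slider):
    """Frequencies of the 7 saws for a slider value in 0..1 (linear interpolation assumed)."""
    return [f_center * (1.0 + detune_slider * (m - 1.0)) for m in DT]

def cents(ratio):
    """Convert a frequency ratio to cents (log pitch scale)."""
    return 1200.0 * math.log2(ratio)

offsets_linear = [m - 1.0 for m in DT]   # offsets on the linear frequency scale
offsets_cents = [cents(m) for m in DT]   # offsets on the log pitch scale
for lin, c in zip(offsets_linear, offsets_cents):
    print(f"{lin:+.4f}  {c:+8.2f} cents")
```

Comparing the first and last rows, the second and sixth, and so on shows how far each up/down pair is from mirroring the other on either scale.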

Post

I didn't read the whole thread, but in case you're trying to replicate the JP-80xx, I guess that Roland Cloud will release a software version soon, so the only reason to implement this algo is if you have a better plan for a unique synth yourself.

The SuperSaw has been reverse-engineered many times, and most attempts come very close (7-saw spread), but this idea was not new and not Roland's property.

Post

antto wrote:
1.1077003479
1.0633006096
1.0204008818
1
0.9810999632
0.9382010102
0.8908000588

i was (and still am) curious to find out if the real detuning coefficients somehow turn out to be some simple whole numbers
it seems (iirc) that the detuning of the upper three saws vs the lower three saws is not symmetrical in any way, i think i tried converting the frequencies into pitch (log scale) and it isn't symmetrical there either
In summary, I think there will be no integer or even floating-point rhyme or reason behind these offsets. I am convinced they are hand-picked integers in a two-dimensional matrix. The integers you search for do exist, but you won't find any specific, pretty, mathematical rhyme or reason behind them.

This isn't a problem from the instrument designer's or musician's point of view, because if there were a simple mathematical relationship between the frequencies, the sound would be less interesting. For instance: if our center is 440Hz, and the sidebands were +- 2Hz, 438 and 442, then we'd hear both sidebands beat with the center frequency exactly twice a second and they'd beat against each other 4 times a second. Just not that interesting. If it were +2Hz, -2.5Hz, then the beats would run at 2, 2.5 and 4.5 times a second and we'd hear the beat rhythm repeat every 2 seconds, and so on. Much better to pick ratios where the numbers are irrational, or at least fractions so long that they'd take minutes to repeat, if not days or years.
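One way to check how long a beat pattern takes to repeat is to find the largest common base rate of all the pairwise beat frequencies. The Python sketch below does this with exact fractions; the function name is invented for illustration.

```python
from fractions import Fraction
from math import gcd

def beat_pattern_period(offsets_hz):
    """Seconds until the combined beat pattern repeats.

    offsets_hz are each sideband's offset from the center (ints or Fractions).
    Beats occur at the absolute difference of every pair of frequencies,
    with the center included at offset 0.
    """
    freqs = [Fraction(0)] + [Fraction(o) for o in offsets_hz]
    rates = {abs(a - b) for i, a in enumerate(freqs) for b in freqs[i + 1:] if a != b}
    if not rates:
        raise ValueError("need at least one non-zero offset")
    # gcd of fractions = gcd of numerators / lcm of denominators
    num, den = 0, 1
    for r in rates:
        num = gcd(num, r.numerator)
        den = den * r.denominator // gcd(den, r.denominator)
    return Fraction(den, num)  # period = 1 / (common base rate)

print(beat_pattern_period([2, -2]))                 # sidebands at +2 Hz and -2 Hz
print(beat_pattern_period([2, Fraction(-5, 2)]))    # sidebands at +2 Hz and -2.5 Hz
```

By this calculation the symmetric +-2 Hz case repeats every half second and the +2 Hz / -2.5 Hz case every 2 seconds. Offsets whose ratios are long fractions push the period out much further.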

--------------------------

So first let's review how a single sawtooth is probably generated.

The easiest way to make a noisy, non-bandwidth-limited sawtooth (which is what these apparently are) is to have a, say, 32-bit integer (sometimes called an "accumulator") that represents phase; if you use that phase directly as your audio output, you get a sawtooth. From your frequency, you determine an "increment", an amount that you add to that phase every sample. (The accumulator accumulates these increments, hence the name.) When you go off one end (too high) the CPU automatically wraps it around to "low", which means the sawtooth output has a vertical jump once a cycle. This is just like a car odometer rolling over from 999,999.9 to 0. I'm POSITIVE this is how the supersaw sawtooths are generated. Nothing is simpler, and we see the exact kind of noise this generates.

Example: say the synth runs at 44.1kHz and we're playing A4 at 440Hz. The increment would be 2^32 * 440 / 44100 = 42,852,281. We just have a phase--it starts out at a random value--and we add 42,852,281 to the accumulator 44,100 times a second. Every 100 or 101 additions the accumulator wraps back from most-high value to most-low, at which point our phase--and the sawtooth we see if we use the raw phase as audio--goes from gently climbing to taking a big plummet. (And the fact it's sometimes 100, sometimes 101, is why it is so noisy.)
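The worked example above can be reproduced with a few lines of Python, using a 32-bit mask to imitate the accumulator. The start phase is arbitrary and the helper name is made up for illustration.

```python
SAMPLE_RATE = 44100
FREQ = 440

increment = (2**32 * FREQ) // SAMPLE_RATE   # multiply first, then divide

def wrap_intervals(increment, start_phase, num_samples):
    """Count how many additions happen between successive wrap-arounds."""
    phase = start_phase
    intervals = []
    since_wrap = 0
    for _ in range(num_samples):
        new_phase = (phase + increment) & 0xFFFFFFFF
        since_wrap += 1
        if new_phase < phase:            # accumulator rolled over
            intervals.append(since_wrap)
            since_wrap = 0
        phase = new_phase
    return intervals[1:]                 # drop the first, partial cycle

intervals = wrap_intervals(increment, start_phase=123456789, num_samples=SAMPLE_RATE)
print(increment, sorted(set(intervals)))
```

The set of interval lengths shows the mix of 100-sample and 101-sample cycles that comes from 44100 / 440 not being a whole number.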

(They can't, I think, pre-calculate 2^32/44100 and simply do one operation to take that *440, because in integer math the order of operations is critical because of rounding and overflow. I haven't checked but I think you'd have unacceptable pitch inaccuracy if you pre-calculated 2^32/44100=97391, and just calculated your increment as 97391 * freq. You get a more accurate answer if you FIRST do all your multiplies together, THEN do all your dividing. But even then you still have a worry: make sure you have big-enough integer variables that they don't overflow. To do 2^32*20,000/44,100, at the top of human hearing, after the multiply you need at least 47 bits...)
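Since the paragraph above says the numbers were not checked, here is a quick Python check of both points: the width of the intermediate product, and the pitch error from pre-dividing. It assumes the pre-divided constant is simply truncated to an integer; a fixed-point constant with fractional bits would behave differently.

```python
import math

SAMPLE_RATE = 44100

def increment_exact(freq_hz):
    """Multiply first, divide last: needs a wide intermediate product."""
    return (2**32 * freq_hz) // SAMPLE_RATE

def increment_precomputed(freq_hz):
    """Pre-divide 2^32 / 44100 (truncated to an integer), then multiply."""
    per_hz = 2**32 // SAMPLE_RATE        # 97391
    return per_hz * freq_hz

def cents_error(freq_hz):
    """Pitch error of the pre-divided form relative to the exact form, in cents."""
    return 1200.0 * math.log2(increment_precomputed(freq_hz) / increment_exact(freq_hz))

bits_needed = (2**32 * 20000).bit_length()   # width of the intermediate product
print(bits_needed, cents_error(440))
```

By this check the intermediate product does need 47 bits, while the truncation error at 440 Hz comes out at roughly a hundredth of a cent, so under these assumptions the pre-divided form may be less harmful than feared. Whole-Hz frequencies are a separate limitation of this toy version.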

Now, pitch isn't steady. It changes smoothly, due to pitch envelope, portamento, and LFO. To get that smooth change, the increment must change with every sample. And pitch rising or falling is basically always on an exponential scale: falling by an octave a second means frequency halves every second, and our increment would also need to halve in a second. But I don't think there's a way to do exponential calculations like this without floating point. What you can do fast in integers is add an increment to your increment. That gives a linear ramp in frequency, not an exponential curve. But if you recalculate that increment-of-the-increment every 64 samples or something, then you end up getting a super-close approximation of any curve you want out of a bunch of short linear segments. And the math per sample isn't just simple, but simplest-POSSIBLE. A single integer addition/subtraction to change the increment, then another single integer addition to change the accumulator:

increment = increment + incrementOfIncrement
accumulator = accumulator + increment

Do that 64 times, then finally one time calculate a new increment-to-the-increment. That calculation is simply:

figure out desired pitch "pitchThen" we'll want in 64 samples (could be LOTS of math)
incrementThen = 2^32 * pitchThen / 44100
incrementOfIncrement = (incrementThen - incrementNow + 32) / 64     (the +32 rounds to nearest)

And the divide by 64 is actually just shifting right six bits, not even an integer divide (much slower than a shift).
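The block-wise scheme above can be sketched in Python like this. The one-octave glide, the helper names and the use of floating point for the once-per-block target are illustrative assumptions; only the two adds inside the inner loop are meant to mirror the per-sample work.

```python
SAMPLE_RATE = 44100
BLOCK = 64   # samples between control-rate updates, as in the text

def target_increment(pitch_hz):
    return int(2**32 * pitch_hz / SAMPLE_RATE)

def run_block(accumulator, increment, increment_then):
    """Advance one 64-sample block, ramping the increment toward increment_then."""
    # +32 rounds to nearest; >> 6 is the divide by 64. Python's shift floors,
    # so falling pitches (negative difference) work too.
    inc_of_inc = (increment_then - increment + 32) >> 6
    samples = []
    for _ in range(BLOCK):
        increment += inc_of_inc                                # one add
        accumulator = (accumulator + increment) & 0xFFFFFFFF   # one more add
        samples.append(accumulator)
    return accumulator, increment, samples

# Hypothetical glide: fall one octave (440 Hz -> 220 Hz) over about one second.
acc, inc = 0, target_increment(440.0)
blocks = SAMPLE_RATE // BLOCK
for b in range(1, blocks + 1):
    pitch_then = 440.0 * 2.0 ** (-(b * BLOCK) / SAMPLE_RATE)   # exponential, once per block
    acc, inc, _ = run_block(acc, inc, target_increment(pitch_then))
```

Because each block aims at the target from wherever the increment actually ended up, rounding errors do not pile up from block to block.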

-------------

So now let's look at the sidebands. I've looked at trying to use the center frequency's increment-of-increments for the sidebands too and while you can get close for a short segment, after the segment is over the sideband increments will need such a big adjustment I think it'd make too much noise. However, simply calculating the sidebands the same way the main signal is calculated should be fast enough.

Calculating an increment for say the top sideband, at maximum detune, naively, would be 2^32 * 440 / 44100 * 1.1077003479. The 1.1077003479 could be calculated on the fly with floating point math, but it'd be much faster to simply pre-calculate it for each detune amount for each sideband and put those pre-calculated values into a two-dimensional array.

But even if pre-calculated, we still have a floating point operation. Especially when supersaw started, I don't think there was any floating point math--too expensive! My guess is that instead of getting 1.1077003479 out of a table, they're further pre-calculating 44100/1.1077003479 and putting that into the table. So, calculation for the sideband increment, given a frequency of 440, is instead: 2^32 * 440 / 39812. The 39812 would again come from a two-dimensional array, indexed by sideband# and detune amount.
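A sketch of the divisor-table idea in Python, using the max-detune multipliers measured earlier in the thread. The 44.1 kHz rate is assumed, only the max-detune row is shown, and the names are invented for illustration; this is a guess at the scheme, not recovered firmware.

```python
SAMPLE_RATE = 44100   # assumed; the real rate might be 32 kHz or something else

# Max-detune multipliers measured earlier in the thread.
MULTIPLIERS = [1.1077003479, 1.0633006096, 1.0204008818, 1.0,
               0.9810999632, 0.9382010102, 0.8908000588]

# Precompute one integer divisor per saw: round(sample_rate / multiplier).
# A full table would have one such row per detune setting.
DIVISORS = [round(SAMPLE_RATE / m) for m in MULTIPLIERS]

def saw_increment(freq_hz, saw_index):
    """Integer-only increment for one saw: 2^32 * freq / divisor."""
    return (2**32 * freq_hz) // DIVISORS[saw_index]

def effective_multiplier(saw_index):
    """The detune ratio that the rounded integer divisor actually produces."""
    return SAMPLE_RATE / DIVISORS[saw_index]

for m, d in zip(MULTIPLIERS, DIVISORS):
    print(f"{m:.10f} -> divisor {d} -> effective {SAMPLE_RATE / d:.10f}")
```

Comparing each measured multiplier with its effective value shows how much rounding to an integer divisor shifts the ratio.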

-----------------------------

OK, so that's the bad news for you. Those numbers 1.1077003479 and so on are probably in fact integers like 39812 in a table, and those integers are so high and have so many factors that you're not going to boil their relationship down into a simple "small" fraction like 18/17 or something. Furthermore, being in a table frees them from having to be calculated, which means they don't have to be an easy-to-calculate number. They could be (and for reasons I gave at the top, probably are) irrationals like 10th root of 2 or something, selected to be nearish but not exactly on -10%/-6%/-2%/+2%/+6%/+10%. And finally, these irrational numbers aren't even static; instead they're multiplied by a detune amount then converted to integers.
For every detune, you'll get different rounding errors... which means even if you could do hyper-accurate pitch detection and calculate these integer values for one detune level, the integers for the next detune level would have to be found from scratch.

OTOH, if you could find the output sample rate of the keyboard in question (may be 44.1kHz or 32kHz or something) and you got HYPER-accurate measurements of the detune amounts, I'm guessing that for each value of detune you WOULD ultimately find very specific integers.

In summary, the numbers you're curious about probably exist, but my guess is that they're both big integers, not small fun ratios, and that their ratios also vary somewhat with detune amount.
