KVR Audio

kvotchin · Post by **kvotchin** » Thu Apr 16, 2020 7:29 pm

So on a whim, I’ve decided it’s vitally important to understand how formants work, with regard to synthesis.

More urgently, how to create/manipulate them, as it were, from scratch.

I’m willing and able to read/watch any resources you might point to. Or indeed, if you happen to know all about this yourself, please do post said information!

Thank you.

deastman · Post by **deastman** » Thu Apr 16, 2020 7:52 pm

Our mouth shapes amount to a filter with specific resonances. You can find these listed in many charts online. The easiest way to synthesize this is by feeding an oscillator through three or more resonant bandpass filters, tuned to the desired formant frequencies.
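As a rough illustration of that signal path, here is a minimal Python sketch. It assumes a naive sawtooth as the oscillator, a standard RBJ-cookbook band-pass biquad as the "resonant bandpass filter", and a sample rate of 44.1 kHz. The function names, the Q value, and the three center frequencies are placeholders chosen for illustration, not values taken from any particular chart.

```python
import math

SAMPLE_RATE = 44100  # assumed sample rate in Hz


def sawtooth(freq, n_samples, sample_rate=SAMPLE_RATE):
    """Naive (non-band-limited) sawtooth in the range [-1, 1)."""
    return [2.0 * ((freq * n / sample_rate) % 1.0) - 1.0 for n in range(n_samples)]


def bandpass(signal, center_hz, q, sample_rate=SAMPLE_RATE):
    """Resonant band-pass biquad (RBJ cookbook form, 0 dB gain at the center)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # b1 is zero for this filter type
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def formant_bank(signal, centers_hz, q=10.0):
    """Feed the same input to every filter (parallel) and average the outputs."""
    bands = [bandpass(signal, f, q) for f in centers_hz]
    return [sum(samples) / len(bands) for samples in zip(*bands)]


# Placeholder formant frequencies; substitute values from a vowel chart.
voice = formant_bank(sawtooth(110.0, 4410), [800.0, 1150.0, 2900.0])
```

A higher Q gives narrower, more ringing resonances; a lower Q gives a broader, more muffled vowel color.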

kvotchin · Post by **kvotchin** » Thu Apr 16, 2020 8:02 pm

Nice. That is a very good start, thanks.

CHOOS · Post by **CHOOS** » Thu Apr 16, 2020 8:08 pm

Look up BiFilter2.

It doesn't show how to do it from scratch, but it has the vowel settings in it.

vurt · Post by **vurt** » Thu Apr 16, 2020 8:23 pm

Doesn't Thor in Reason have a vowel filter?

Erisian · Post by **Erisian** » Thu Apr 16, 2020 8:32 pm

I have played with modulating the formant filter in XSRDO Analogy 64bit with interesting results. I could probably set up a patch with a few instances to experiment with. Thank you for this thread @kvotchin.

CrystalWizard · Post by **CrystalWizard** » Sat Apr 18, 2020 6:28 am

Just to be clear, that’s three resonant bandpass filters in parallel.

GRUMP · Post by **GRUMP** » Sat Apr 18, 2020 3:35 pm

VOX here

Long Story. "Formants" are just Theory. VOX Sounds are way more complicated.

Unison is very important and Wavetables work best = Dune 3 best Synth. Believe me.

Vowel Filter: the Orb. But: Vowel creates just one Kind of Sound.

More Essentials: Overtones vs Noise/Air, Chords, Frequencies + Balance,...

There are many Types of VOX Sounds and they are all created differently. So ...

kvotchin · Post by **kvotchin** » Sat Apr 18, 2020 5:09 pm

Formants are not “just theory”. No more than a sine wave is that, for example.

I checked out your YouTube, and yeah, none of those sound remotely like what I was asking how to create (i.e., patches that are voice-like at all).

cybilopsin · Post by **cybilopsin** » Sat Apr 18, 2020 5:11 pm

A term I have heard in place of formant is "spectral envelope". Just as the contours of a sound's amplitude movement (or any other parameter) are described as its "amplitude envelope", every sound has a "spectral envelope" referring to the contours of its spectrum. In other words, the concept of spectral envelope refers to a distinctive and mathematically rigorous aspect of a sound's timbre. Thus, vocoders are sometimes described as applying one sound's spectral envelope to another sound's frequency spectrum, for example when the human voice's spectral envelope is applied to a synth's frequency spectrum, resulting in a robot-voice synth.

Formant (singular) is often used to mean "a single peak in the spectral envelope". A single formant has both a center frequency and a bandwidth, so a single formant affects the amplitude of multiple frequencies within a sound. Look at a recording of someone talking in a spectrogram. The formants appear as streaks in the harmonics where certain regions of harmonics are louder than their neighbors. These streaks have their own movement, independent of the movement of pitch. It's like someone laid a stencil over the frequencies of a sound. The formant does not determine the exact frequencies present, but it does determine which broad regions of the frequency spectrum get amplified.

Formants are a huge component of timbre for many sounds - especially harmonic sounds, because of their fixed frequency content. Why do a trumpet and a saxophone sound different when both play A-440? The frequencies that make up the sound are the same: 440, 880, 1320, 1760 Hz and all other multiples of 440. In large part, they sound different because these frequencies are filtered very differently (through differently shaped chambers), resulting in unique patterns of loud and soft. The exact frequencies are the same but the spectral envelope - the set of formants - is different.
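To make the "same frequencies, different envelope" idea concrete, here is a small Python sketch. The bell-shaped peak formula and the two single-formant "instruments" are invented purely for illustration; they are not measurements of a real trumpet or saxophone.

```python
def harmonic_frequencies(fundamental_hz, count):
    """Frequencies of the first `count` harmonics of a pitched tone."""
    return [fundamental_hz * k for k in range(1, count + 1)]


def envelope_gain(freq_hz, formants):
    """Gain of a spectral envelope at freq_hz.

    `formants` is a list of (center_hz, bandwidth_hz, peak_gain) tuples.
    Each formant is modeled as a simple bell-shaped peak (an assumption).
    """
    total = 0.0
    for center, bandwidth, peak in formants:
        detune = (freq_hz - center) / (bandwidth / 2.0)
        total += peak / (1.0 + detune * detune)
    return total


# Two made-up spectral envelopes, one formant each.
INSTRUMENT_A = [(900.0, 400.0, 1.0)]
INSTRUMENT_B = [(1800.0, 400.0, 1.0)]

for freq in harmonic_frequencies(440.0, 4):
    gain_a = envelope_gain(freq, INSTRUMENT_A)
    gain_b = envelope_gain(freq, INSTRUMENT_B)
    print(f"{freq:7.1f} Hz  A: {gain_a:.3f}  B: {gain_b:.3f}")
```

Both envelopes are evaluated at exactly the same harmonic frequencies; only the pattern of loud and soft differs, which is the timbral difference described above.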

Now, the above definition comes from the academic world. In the practical synthesis world we have a more restrictive meaning for the term formant - specifically, it usually means a spectral envelope reminiscent of that of a human voice.

Each vowel has a characteristic spectral envelope which can be approximately described as a set of 5 formants, each with its own center frequency, bandwidth, and relative volume. See this chart:

http://www.csounds.com/manual/html/MiscFormants.html

Parsing the information in that chart should lead to the conclusion that formants of human vowels can be mimicked with the use of 5 bandpass filters in parallel.

However, bandpass filters are not the only way to directly generate and control formants. I will pause here for questions, then later I will return and cover a granular synthesis technique which can directly create formants without any filtering.

vurt · Post by **vurt** » Sat Apr 18, 2020 5:13 pm

Bruce Haack might be worth reading up on.
He built a hardware speech synth back in the 70s (possibly earlier).

Then there's the one I had for my Spectrum, the Currah speech unit. A bit roboty, but fun at the time.

kvotchin · Post by **kvotchin** » Sat Apr 18, 2020 7:35 pm

Wow, thank you cybilopsin!

Was not expecting to encounter some C (albeit indirectly - I had a look through some of the rest).

But yes, that seems to be very much what I have been seeking. Any further information is still welcome, of course.

GRUMP · Post by **GRUMP** » Sat Apr 18, 2020 9:39 pm

kvotchin wrote: Sat Apr 18, 2020 5:09 pm Formants are not “just theory”. No more than a sine wave is that, for example.

I checked out your YouTube, and yeah, none of those sound remotely like what I was asking how to create (i.e., patches that are voice-like at all).

That's what I meant: give us a concrete Example. You're asking about a whole Universe here.

And be sure - you can dial in all those Formant Values with whatever Tool you like (we have different Options) - the Result will be more or less far away from what you are heading for. That's what I meant with "Formants are just Theory".

To be honest: I skipped the Physics Lessons. But my Experience tells me that there are many more relevant Parameters for a Formant/VOX Sound than theoretically dominant Formants.

By the Way: Sample Magic is everywhere since Yamaha, Korg and Roland worked on this Topic. If you are looking for Sounds with a very pleasant high Frequency Spectrum, it will quite surely not have been generated by a Synth.

cybilopsin · Post by **cybilopsin** » Sun Apr 19, 2020 12:04 am

GRUMP wrote: Sat Apr 18, 2020 9:39 pm And be sure - you can dial in all those Formant Values with whatever Tool you like (we have different Options) - the Result will be more or less far away from what you are heading for. That's what I meant with "Formants are just Theory".

To be honest: I skipped the Physics Lessons. But my Experience tells me that there are many more relevant Parameters for a Formant/VOX Sound than theoretically dominant Formants

I'm not sure if you're saying that the information in the formant chart isn't sufficient to create vowel-like tones using bandpass filters, but if so, I assure you from personal experience that it is. There are a few key parts of the process beyond just setting the bandpass filter frequencies:

1. The waveform you start with should have very strong upper harmonics. A saw or even a supersaw is not ideal - it's true they have upper harmonics, but their lower harmonics are much louder. A pulse wave with a pulse width approaching 0% (or 100%), i.e. an impulse train, is much better because all of the harmonics are equal in intensity. Waveforms that are even further weighted towards the upper harmonics can be created through additive synthesis, but an impulse train works very well.

2. The chart contains 3 pieces of information for each formant: frequency, bandwidth, and amplitude. All 3 need to be used to create a good vowel tone. The formants have to be at the correct volume relative to each other to sound realistic - that's what the dB readings are for. In the chart, the lowest-frequency formant is always the loudest, so it is listed as 0 dB. If you're using parallel bandpass filters, that means they have to be mixed at different volumes when summed.

I also want to point out that the formant chart I linked to is most definitely not theoretical, since the values are empirically derived in the first place.
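The two points above can be sketched numerically in Python. This is only an illustration under stated assumptions: the harmonic amplitudes are the textbook idealizations (1/k for a sawtooth, flat for an impulse train), the helper names are made up, and the dB levels are assumed to mirror the soprano "a" row of the linked chart, so verify them against the chart before relying on them.

```python
import math


def db_to_gain(db):
    """Convert a level in dB to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def saw_harmonic_amplitude(k):
    """Ideal sawtooth: harmonic k has relative amplitude 1/k."""
    return 1.0 / k


def impulse_train_harmonic_amplitude(k):
    """Ideal impulse train: every harmonic has the same amplitude."""
    return 1.0


def impulse_train(freq_hz, n_samples, sample_rate=44100):
    """One unit impulse per period, zeros elsewhere (not band-limited)."""
    period = sample_rate / freq_hz
    out = [0.0] * n_samples
    position = 0.0
    while position < n_samples:
        out[int(position)] = 1.0
        position += period
    return out


# Point 1: how far down the 20th harmonic sits in each source, in dB.
saw_db = 20.0 * math.log10(saw_harmonic_amplitude(20))
pulse_db = 20.0 * math.log10(impulse_train_harmonic_amplitude(20))

# Point 2: per-filter mix gains from the chart's dB column.
# ASSUMPTION: levels intended to match the soprano "a" row; check the chart.
formant_levels_db = [0.0, -6.0, -32.0, -20.0, -50.0]
mix_gains = [db_to_gain(db) for db in formant_levels_db]
```

With parallel filters, each filter's output would be multiplied by the matching entry in `mix_gains` before summing, so the first formant stays at full level and the others sit below it.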

GRUMP · Post by **GRUMP** » Sun Apr 19, 2020 1:03 pm

The Table shows the empirical dominant Formants for certain Vowel (!) Sounds - but the Results are far away from a practical Approach in Sound Design. They represent the physical Theory of Voice-like Sounds and will typically result in a Vowel-Sawtooth.

I strongly believe that this kind of Sound doesn't match contemporary Demands anymore, and we could simply prove this empirically.

On the other Hand: "vocalistic" Sounds (...) may consist of completely different Characteristics. Just analyze different Presets and Sounds. Most of them are not vowelish, some are Wavetables (e.g. DM Enjoy the Silence), some are Stacks, Vocoder Sounds, Samples ... VOX is really a Universe on its own.

And vowelish Single Cycle Waveforms are IMO not of practical Value. But OK - maybe with a lot of unison, Filters and the right Chords (...).
