Is rendering limited by sound card bit depth?

Discussion about: tracktion.com

Post

The Creamware stuff has onboard DSP mixing, doesn't it? I would guess the 32-bit option would be the best bet, so any further processing / mixing you do with the hardware has the full resolution available to it.

Don't own any CW stuff though (unfortunately) so I might be wrong..

Post

I work now in 24 / 44.1 and everything renders at 24 bit inside Tracktion when I record. This makes sense as I've selected it as the bit depth I prefer to use. Unlike some people who have stated they cannot hear a difference, I can. After years of working in 16 bits and having to deal with inferior sound cards and slowly moving up the ladder of quality, 24 bits sounds more expansive to me. I can hear more detail in quieter parts. Even when dithering back down to 16 bits, it still sounds better. But it probably depends on what kind of music you produce. I prefer more dynamics in tracks, less compression and I consistently mix quieter sections with louder ones.

Tracktion works in 32 bit floating point like plenty of apps do. They do all their work in this manner to try and preserve all the quality so there is no bit truncation. I don't see how it's strange that Tracktion should render at the bit depth you choose and not default to 32? You can still choose to render at 32 bit if you want...

Post

Ixox wrote: Now add two 16-bit audio tracks which contain, at one moment:
1001 0110 0000 1111
+ 1100 0000 1001 0000
=1 0101 0110 1001 1111 <= 17 bits....
If you render to 16 bits you'll lose the last zero...
Is this really the way audio summing works? I thought that 1111 1111 1111 1111 would be your maximum wave amplitude in 16-bit, so summing two signals to produce a result that requires 17-bits, as you describe, would cause clipping rather than a loss of precision.

Post

I'm afraid that's the way it works... Adding is adding. If you sum two (or more) loud signals, then it goes over the maximum (or minimum) and you have a very nice "clip".

Same as with a cash register that only has three digits, because the shop owner thought nothing in the shop is priced over $9.99. You can actually sell a lot of bubble gum to one customer before you get an overflow, but each gum is accounted for. So a lot of faint signals can be mixed before you get clipping.

This explains why you have to set channel faders way below zero, or have a limiter in the master effects chain.
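
A rough Python sketch of the same idea, assuming signed 16-bit samples and a mixer that saturates at the limits rather than wrapping around (the function name is made up for illustration):

```python
INT16_MIN, INT16_MAX = -32768, 32767

def mix_clipped(a, b):
    """Sum two 16-bit samples and clamp the result to the 16-bit range."""
    total = a + b  # Python ints do not overflow, so this is the true sum
    return max(INT16_MIN, min(INT16_MAX, total))

loud = mix_clipped(30000, 20000)  # true sum is 50000, beyond the 16-bit maximum
quiet = mix_clipped(1200, -350)   # true sum is 850, well inside the range
print(loud, quiet)
```

The loud pair gets flattened to the maximum value, while the quiet pair comes through exactly, which is the bubble gum case.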

Post

Thanks for confirming my understanding. So rendering in a higher bit-depth wouldn't prevent you from losing precision in this case; it would just give you more precise clipping.

Post

:?

Higher bit depth = greater resolution = more precision

Clipping is not really an option, so the more signals we mix together, the more those signals need attenuating to provide sufficient headroom, and the more need for increased resolution to avoid losing the low level details.

Post

I think we may be saying the same thing from different directions. Higher bit-depth gives you more precision for sampling points over a range of values. The range of values (amplitude) is the same for all bit-depths. So in Ixox's example of summing two signals which takes you past the maximum value that can be represented, a higher bit-depth doesn't help as you've gone past the maximum value and into clipping. That is the part I was questioning.

Post

32-bit float does provide extra headroom, which is why it's used internally for the mixing engine. 16- or 24-bit integer will both clip above 0 dBFS, however.
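
To illustrate, here is a small Python sketch. It assumes float samples with +/-1.0 as full scale and a simple clamp-and-scale conversion to 16-bit; `to_int16` is a hypothetical helper, not anything from Tracktion itself:

```python
def to_int16(x):
    """Convert a float sample (full scale = +/-1.0) to 16-bit, clipping at full scale."""
    clamped = max(-1.0, min(1.0, x))
    return int(round(clamped * 32767))

a, b = 0.75, 0.75
bus = a + b                        # 1.5: over 0 dBFS, but float holds it fine

clipped = to_int16(bus)            # written straight to integer: clips at full scale
recovered = to_int16(bus * 0.5)    # pull the master down about 6 dB first: no clipping
print(clipped, recovered)
```

The overshoot only becomes a problem at the point where the signal is converted to an integer format, so attenuating on the master before that point keeps the peak intact.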

Post

msl wrote: So in Ixox's example of summing two signals which takes you past the maximum value that can be represented, a higher bit-depth doesn't help as you've gone past the maximum value and into clipping.
What you need to bear in mind is that when you add two 16-bit values together, you first widen each one to the word length you are targeting by adding a leading bit (sign-extending them, assuming 2's complement numbers are being used). Effectively, each 16-bit value is aligned to the right of (in this case) a 17-bit word, so the most significant bit is available to hold the larger result.

i.e. if you take two 16 bit words to add together, say

1111 0000 1111 0000 (-3856 decimal) and
1000 1010 1010 1111 (-30033 decimal)

they would become

1 1111 0000 1111 0000 and
1 1000 1010 1010 1111

with the same decimal values, and summing them gives -33889 (1 0111 1011 1001 1111), which can be represented in 17 bits but would obviously clip in 16 bits. In a properly designed audio application, you will convert the source numbers from their original bit depth to the target bit depth *before* adding them together. In the case of most applications, this would be a conversion to 32 bit floating point, with +/- 1 representing full scale.

The upshot is that even if you record in 16 bit, there is a definite advantage to mixing in 24 or 32 bits.
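
For anyone who wants to check the bit patterns, here is a minimal Python sketch of sign extension before summing. The helper names and the two sample words are illustrative choices, not taken from any particular audio engine:

```python
def from_twos_complement(word, bits):
    """Interpret an unsigned bit pattern as a signed two's-complement value."""
    sign_bit = 1 << (bits - 1)
    return (word ^ sign_bit) - sign_bit

def sign_extend(word, from_bits, to_bits):
    """Widen a two's-complement pattern by copying its sign bit leftwards."""
    value = from_twos_complement(word, from_bits)
    return value & ((1 << to_bits) - 1)

a16, b16 = 0xF0F0, 0x8AAF            # -3856 and -30033 as 16-bit words
a17 = sign_extend(a16, 16, 17)
b17 = sign_extend(b16, 16, 17)
sum17 = (a17 + b17) & 0x1FFFF        # keep 17 bits

print(format(a17, "017b"), format(b17, "017b"))
print(from_twos_complement(sum17, 17))                  # correct sum in 17 bits
print(from_twos_complement((a16 + b16) & 0xFFFF, 16))   # same sum forced into 16 bits
```

Forcing the result back into 16 bits here wraps around to a large positive number; a real mixer would saturate instead, but either way the true sum is lost.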

Post

But the mixing in Tracktion is always done in 32-bits internally, so how is the bit-depth you are rendering to significant in this process?

Post

It determines how much of your 32-bit mix resolution is retained.. obviously! :?

Post

Is that an advantage if you're eventually going to end up with 16-bit CD audio anyway?

What I am trying to understand is, given internal mixing is done as 32-bit float regardless, what is the advantage to rendering to a higher bit-depth than the one you actually want? I appreciate there are real advantages in recording tracks at a higher bit-depth, but what about rendering mixed output? Is it so that you can render a high quality mix to use in a separate mastering process?

Post

msl wrote: Is it so that you can render a high quality mix to use in a separate mastering process?
Yes. General mathematical principle: if you need to perform several calculations to achieve an answer, you will get a more accurate result if you preserve all the decimal places for intermediate values than if you round to two decimal places each time, even if the final answer only needs two decimal places. Each rounding introduces some inaccuracies (i.e. distortion and noise) which are cumulative, so it is best to keep this to a minimum by sticking to higher resolutions right up to the last stage.
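
A quick way to see this is a toy Python example with made-up numbers (ten small gain stages of 1.004 each, and two decimal places standing in for the low resolution):

```python
def apply_gains(level, gains, round_each_stage):
    """Run a value through a chain of gain stages, rounding to 2 decimals
    either after every stage or only once at the end."""
    for gain in gains:
        level *= gain
        if round_each_stage:
            level = round(level, 2)
    return round(level, 2)

gains = [1.004] * 10
stepwise = apply_gains(1.00, gains, round_each_stage=True)
at_end = apply_gains(1.00, gains, round_each_stage=False)
print(stepwise, at_end)
```

Rounding after every stage throws each small change away before it can add up, so the value never moves from 1.0, while rounding once at the end gives 1.04.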

Post

So if you're rendering a final mix and not an intermediate mix to be further processed later, there is no advantage to rendering in higher resolutions, right?

One final thing I'm not quite getting. Hopefully somebody can clarify for me. It's this notion that mixing tracks together introduces additional quality and therefore requires a higher bit-depth result to accurately represent, which it seems to me that people are suggesting above -- could be I'm not reading things right though.

Here's a simplistic example:

Say we have a range of (amplitude) values to work with in a digital scale of +/-10. Our low bit-depth gives us whole number precision. Our high bit-depth gives us precision to 0.01.

Recording two tracks at low bit-depth we have the values 7 and 8 at a specific sample point. Summing and attenuating to mix them together without clipping we have a result rendered in high bit-depth of 7.50. At low bit-depth the rounded result would have been 8, so it seems like rendering at the higher bit-depth has given us better accuracy.

However, if we had recorded at high bit-depth we would have had the more accurate values 7.49 and 8.49 to work with and the high bit-depth result of the mix should really have been 7.99. So rendering at a higher bit-depth than the source tracks doesn't gain us any accuracy because we are fundamentally limited by the quality of the source material.

Post

msl wrote: So if you're rendering a final mix and not an intermediate mix to be further processed later, there is no advantage to rendering in higher resolutions, right?
There is always an advantage to higher resolutions (i.e. better sound) but if the final mix is destined for CD it has to go to 16-bit at some stage. The important thing is to only drop to 16-bit once, because each time you will lose more quality. Also, if you maintain a high resolution up till the last stage, some of this extra information can be statistically squeezed into the lowest bit through proper use of dither. Right, you've asked for it, here's my dither quote:
Mithat Konar wrote: Here's a simple thought experiment that explains why dither is necessary and how it works. Let's create a basic A/D converter. We'll make it sensitive to DC and bipolar, so it responds to both positive and negative analogue inputs, and we'll give it a very big LSB threshold of 1 volt to make the numbers easy. We'll construct our ADC so that an analogue source over the range between +0.5 volts and +1.5 volts produces an output of 1, and so on. If, without applying any dither, we present a 0.25 volt DC (continuous) signal to the input of the ADC, the output of the ADC will be a string of zeros. In fact any signal between -0.5 and 0.5 volts will result in an ADC output of zero. Any information below the LSB threshold is completely lost.

Remove the 0.25 volt signal and apply dither to the input of the ADC in the form of a completely random signal (i.e., noise) centred around 0 volts. Its peak amplitude randomly toggles the LSB of the ADC. The output of the ADC will be a stream of very small random values. However the average of all these values will be zero.

Now let's apply our 0.25 volt signal again (with the dither on). The two analogue voltages sum together, the dither and our signal. At each sample point (in time), the 0.25 value of our analogue source is added to the random dither value. The output stream will again look like a stream of very small random numbers, but guess what? The AVERAGE of all those numbers will now be...you guessed it, 0.25. We have thus retained the information that was previously lost (even though it's buried in "noise"). In other words, our resolution has improved. The conversion is still essentially random, but the presence of the 0.25 volt signal biases the randomness. Put another way, the characterization of the system with dither on is transformed from completely deterministic to one of statistical probability. The periodic alternation of the LSB between the states of 0 and 1 results in encoding a source value that is smaller than the LSB. In other words, on the average, the LSB puts out a few more ones than zeros because of our +0.25 volt signal. We say that dither exercises or toggles or modulates the LSB.

With the dither on, we can now change the input signal over a continuous range and the average of the ADC will track it perfectly. An input signal of 0.371476 volts will have an average ADC output of (the binary equivalent of) 0.371476. The same will hold true of inputs going over the LSB threshold: an input of 3.22278 will have an average ADC output of 3.22278. So not only has the dither enhanced the resolution of the system to many decimal places, but it has also eliminated "stepping" quantisation effects!
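
If you want to try the thought experiment yourself, here is a small Python simulation. It is only a sketch under my own assumptions: the dither is uniform noise exactly one LSB wide peak to peak (real converters typically use other noise shapes), and the function names, sample count and seed are arbitrary:

```python
import math
import random

def adc(volts):
    """Ideal quantiser with a 1 V LSB: inputs from 0.5 V up to 1.5 V read as 1, and so on."""
    return math.floor(volts + 0.5)

def average_reading(volts, samples=100_000, dither=True, seed=0):
    """Average many ADC readings of a DC input, with or without dither."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(samples):
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += adc(volts + noise)
    return total / samples

print(average_reading(0.25, dither=False))  # every reading is 0
print(average_reading(0.25))                # close to 0.25
print(average_reading(3.22278))             # close to 3.22278
```

Without dither the 0.25 volt input vanishes completely. With it, the average of the readings lands close to the input, and it gets closer the more samples you average.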
