Mid/Side encoding attenuation
-
binaryoblivion binaryoblivion https://www.kvraudio.com/forum/memberlist.php?mode=viewprofile&u=374814
- KVRist
- Topic Starter
- 81 posts since 18 Feb, 2016
Is there a preferred approach to attenuating encoded and decoded mid/side signals, to compensate for the 6dB gain caused by decoding M/S to L/R?
I ask because I see different approaches recommended, or in practice.
For example, an article on SoS recommends applying -3dB gain during the encoding stage, and again during the decoding stage. The algorithm for this is:
// Encode
M = (L+R)-3dB
S = (L-R)-3dB
// Decode
L = (M+S)-3dB
R = (M-S)-3dB
On the other hand, I have noticed that Voxengo's MSED and the mid/side encoder bundled with Reaper apply -6dB gain during encoding, and don't compensate during decoding. The algorithm for this is:
// Encode
M = (L+R)-6dB
S = (L-R)-6dB
// Decode
L = (M+S)
R = (M-S)
In both cases the output signal is identical to the input signal. Where the two approaches differ is the volume of the M/S signal inside the system.
Does one approach have an advantage over the other, or is it a matter of personal preference?
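For anyone who wants to try the two schemes side by side, here is a rough Python sketch. It uses linear gain factors rather than dB, taking -3dB as sqrt(0.5) and -6dB as 0.5. The function names and the sample values are made up for illustration and are not taken from MSED, Reaper or the SoS article.

```python
import math

G3 = math.sqrt(0.5)  # about -3.01 dB as a linear gain factor
G6 = 0.5             # about -6.02 dB as a linear gain factor

def encode(l, r, gain):
    """L/R -> M/S with the given linear gain applied to both outputs."""
    return (l + r) * gain, (l - r) * gain

def decode(m, s, gain):
    """M/S -> L/R with the given linear gain applied to both outputs."""
    return (m + s) * gain, (m - s) * gain

l, r = 0.75, -0.25  # arbitrary example samples

# Scheme 1: -3 dB on encode and again on decode
m1, s1 = encode(l, r, G3)
out1 = decode(m1, s1, G3)

# Scheme 2: -6 dB on encode, unity gain on decode
m2, s2 = encode(l, r, G6)
out2 = decode(m2, s2, 1.0)

print(out1, out2)  # both approximately (0.75, -0.25)
print(m1 / m2)     # the M/S signal in scheme 1 is sqrt(2) times (about 3 dB) hotter
```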
-
- KVRAF
- 1668 posts since 11 Nov, 2009 from Northern CA
The -3dB version is symmetrical. That is, you can feed it LR and it will produce MS, and vice versa, using the same calculations.
But the divide-by-2 version requires half as many multiplications if going both ways, so is more efficient. It could also be argued that the divide-by-2 version is less prone to clipping, given that we almost always go from LR to MS first and back again later.
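A small Python sketch of the symmetry point, assuming -3dB means a linear factor of sqrt(0.5). The function name is hypothetical. The same routine is used for both directions:

```python
import math

G = math.sqrt(0.5)  # -3 dB as a linear gain factor

def ms_swap(a, b):
    """Sum/difference with -3 dB gain: maps L/R to M/S, or M/S back to L/R."""
    return (a + b) * G, (a - b) * G

l, r = 0.75, -0.25          # arbitrary example samples
m, s = ms_swap(l, r)        # encode
l2, r2 = ms_swap(m, s)      # decode using the very same function
print(l2, r2)               # close to 0.75 -0.25, up to rounding
```

Counting multiplies for a full round trip: this version needs two on the way in and two on the way out, while the divide-by-2 version needs two on the way in and none on the way out.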
-
binaryoblivion binaryoblivion https://www.kvraudio.com/forum/memberlist.php?mode=viewprofile&u=374814
- KVRist
- Topic Starter
- 81 posts since 18 Feb, 2016
Ah, that makes sense about the symmetry of the system.
It did occur to me that the -6dB version would protect against clipping. For example, if the input was a dual mono signal where each channel peaked at 0dB, the encoded Mid signal would also peak at 0dB. I suppose that clipping doesn't really matter in a floating point digital system, but I can see how this approach might make the M/S signal more predictable.
I am thinking that -6dB is probably the way to go. Thanks!
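The dual mono case can be checked with a few lines of Python. This is only a sketch: the helper names are made up, and 1.0 is assumed to represent full scale (0dB).

```python
import math

def mid_peak(l_peak, r_peak, gain):
    """Mid level for two in-phase peaks, as a linear value."""
    return (l_peak + r_peak) * gain

def to_db(x):
    """Linear amplitude -> dB relative to 1.0 (full scale)."""
    return 20.0 * math.log10(x)

# Dual mono, both channels peaking at full scale (1.0 = 0 dB)
peak_6db = to_db(mid_peak(1.0, 1.0, 0.5))
peak_3db = to_db(mid_peak(1.0, 1.0, math.sqrt(0.5)))

print(peak_6db)  # 0.0
print(peak_3db)  # about +3.01
```

The same numbers apply to the Side signal when the two channels peak with opposite polarity.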
-
- KVRist
- 53 posts since 4 Sep, 2014
Personal preference. I never considered splitting the attenuation between the two calculations. I've always done it during encoding, just in case there was a +1 on one side and a -1 on the other (or values to that effect).
- KVRAF
- 7890 posts since 12 Feb, 2006 from Helsinki, Finland
binaryoblivion wrote: "Is there a preferred approach to attenuating encoded and decoded mid/side signals, to compensate for the 6dB gain caused by decoding M/S to L/R? I ask because I see different approaches recommended, or in practice. For example, an article on SoS recommends applying -3dB gain during the encoding stage, and again during the decoding stage."
In terms of mathematics the [M,S] sample vector is essentially the [L,R] sample vector rotated by 45 degrees (=pi/4 radians). If we treat encoding and decoding as such rotations, you get the -3dB gain factor out of the usual rotation formula as cos(pi/4)=sin(pi/4)=sqrt(0.5) (which strictly speaking is about -3.01dB). The main advantage of this approach is that it preserves the squared magnitude, which in practical terms means that the total RMS power will be the same in either representation.
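The power-preservation property is easy to check numerically. The Python sketch below uses arbitrary sample values and a hypothetical `encode` helper, and compares L^2 + R^2 with M^2 + S^2 for the two gain choices:

```python
import math

G = math.sqrt(0.5)  # -3 dB as a linear gain factor

def encode(l, r, gain):
    """L/R -> M/S with the given linear gain applied to both outputs."""
    return (l + r) * gain, (l - r) * gain

l, r = 0.6, -0.3             # arbitrary example samples
power_lr = l * l + r * r

m, s = encode(l, r, G)
power_ms_3db = m * m + s * s  # matches power_lr up to rounding

m, s = encode(l, r, 0.5)
power_ms_6db = m * m + s * s  # half of power_lr, i.e. about 3 dB lower

print(power_lr, power_ms_3db, power_ms_6db)
```

This works because M^2 + S^2 = gain^2 * 2 * (L^2 + R^2), so the factor is 1 for gain = sqrt(0.5) and 0.5 for gain = 0.5.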
However, if you don't care about preserving RMS, then we can combine the gain factors from the encoding and decoding together, which saves one pair of multiplies. The actual CPU advantage these days is basically irrelevant, but there's another "gotcha" which is potentially more meaningful in some cases: sqrt(0.5) is an irrational number and doesn't have an exact representation in finite precision arithmetic, whereas 0.5 = 2^-1, which is a simple (and exact) decrement of the exponent in floating point (and a simple bit-shift in fixed-point).
In practice, when doing typical audio processing in floating point I would almost always prefer the RMS preserving "-3dB twice" approach, but if you want transparent pass-thru without rounding errors then the "lumped -6dB" approach can give you that.
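A quick way to see the rounding point in double precision, as a Python sketch (the variable names are only for illustration):

```python
import math

g = math.sqrt(0.5)  # nearest double to sqrt(0.5), not the exact value

# Two -3 dB stages do not multiply out to exactly 0.5
two_stage_is_exact = (g * g == 0.5)
print(two_stage_is_exact)  # False

# Halving only changes the exponent, so it is exact and reversible
x = 0.1
halving_is_exact = ((x * 0.5) * 2.0 == x)
print(halving_is_exact)    # True
```

Note that this only covers the gain factor. The sums and differences in the encoder and decoder can still round in floating point, depending on the sample values.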
-
binaryoblivion binaryoblivion https://www.kvraudio.com/forum/memberlist.php?mode=viewprofile&u=374814
- KVRist
- Topic Starter
- 81 posts since 18 Feb, 2016
Thanks for the detailed reply mystran! It was my impression that the -3dB version preserved the average volume, and your reply explains why (though I confess that the math taxes my simple brain!)
You’ve certainly given me something else to think about.