Pre-Conditioning for Low-Bitrate WMA 9 Codec


Post

Good morning,

I'm looking for someone to create a plug-in (VST2, 32-bit) that will "pre-condition" the source audio to reduce, as much as possible, audible artifacts from WMA 9 (not 9.1 or 9.2) at very low bitrates (specifically 32kbps, 32kHz, Stereo) when encoded.

Filters for this very purpose seem to exist in the broadcast world as hardware, not software. For example, the Neural Audio NeuStar 4, DaySequerra (now Orban) NeuAir 2 and the Orban Optimod 6300 / 1101 (with their "PreCode" filtering technology).

In my searches, I came across a plug-in called UNCHIRP which, although it seems to have the right idea, focuses on MP3 and HE-AAC, not WMA 9.

Can someone make such a plug-in? I am more than happy to pay whatever is needed to create this.

Thank you.

Post

Hmm.. to be honest, I find that concept rather strange.
How does the pre-code filter know about my encoder, codec settings, profiles, etc.?
I mean, encoding a file with hacked-over-the-weekend-mp3-enc vs MP3pro will deliver different results, even with exactly the same input config, because my hacked-over-the-weekend-mp3-enc is not optimized for low bitrates (this will come next weekend), but MP3pro is.
So what are these devices pre-filtering, if they don't know how I'm going to process it?

UNCHIRP makes more sense, because it operates on encoded audio. It tries to remove artifacts introduced by the compression.
The pre-filtering approach is trying to remove something that has not been added yet, w/o knowing exactly what will be added. Sounds like some rough guessing to me (maybe that's the reason why you cannot find such a pre-filter plugin? Because it's hocus-pocus? ;) )

And about WMA 9: what's your use case for this?
Are you going to encode it? If yes, and you only care about 9.0: just use a current version of Windows Media Encoder. :D
9.1 has been around since .. uhhm.. 2008? I wouldn't even know where to get Windows Media Encoder 9.0 from to create WMA9.0 ;)

(Important on WMA: don't mix up codec versions with encoder versions. WMA9, 9.1, 9.2 are not codec versions, but encoder (=tool) versions. WMA and WMA2 are codec (=bitstream) versions).

Post

PurpleSunray wrote:Hmm.. to be honest, I find that concept rather strange.
The PreCode is specifically for bitrates below 64kbps with WMA 9. Since WMA is not open-source, a WMA encoder must use WMADMOE.DLL from Microsoft.

Once the "problem areas" of the codec are identified, the remaining variables you mentioned should theoretically disappear.

The use case will be for streaming and archiving, using WMA 9 at 32 kbps, 32 kHz, Stereo. This will be for personal use and the cost for having this created is the least of my concerns as this would be immensely helpful for my needs.

Is anyone able/willing to undertake this project?

Thanks.

Post

genie40204 wrote: Once the "problem areas" of the codec are identified, the remaining variables you mentioned should theoretically disappear.
The "problem areas" have been identified by Microsoft already.
And for them it is waaaay easier to solve, because they can make the pre-filter adaptive (look at what the encoder outputs and adapt the input), while a pre-filter outside of the encoder can only guess.

That's why you do not find this kind of plugin.
It is either a huge project, because you basically need to model the encoder so that your filter knows what to filter .. or it is a rip-off (I can build that for you, it will be a 3-band EQ with MP3, AAC and WMA presets for $999.90).

Post

PurpleSunray wrote:
It is either a huge project, because you basically need to model the encoder so that your filter knows what to filter .. or it is a rip-off (I can build that for you, it will be a 3-band EQ with MP3, AAC and WMA presets for $999.90).
Ok thank you. I was hoping for something more complex than a basic EQ. Perhaps something that would model the peaks/transients and/or perhaps noise reduction for problematic sounds.

Would you (or anyone else) be willing to explore this and create the plugin?

Do you think that this is too large of a project to be requesting?

Thanks.

Post

As said, you could try to understand how the encoder works and put some config onto a chain of processing plugins.
But the problem is, such an encoder runs a lot of non-linear processing, especially at low bitrates.
Throwing a preset onto a bunch of effect plugins might sound good on one track and sound like crap on the next.
I won't build that.. put an EQ, a transient shaper and some saturation onto your master lane and optimize until you're happy.
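
For anyone who wants to experiment with the chain suggested above (EQ, transient shaper, saturation), here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the 12 kHz cutoff, the envelope time constants and the drive value are guesses, not settings known to suit the WMA 9 encoder, and a fixed preset like this may work on one track and fail on the next.

```python
import math

def one_pole_lowpass(x, cutoff_hz, sr):
    """Gentle one-pole low-pass standing in for the 'EQ' stage."""
    a = math.exp(-2 * math.pi * cutoff_hz / sr)
    y, out = 0.0, []
    for s in x:
        y = (1 - a) * s + a * y
        out.append(y)
    return out

def soften_transients(x, sr, fast_ms=1.0, slow_ms=50.0, amount=0.5):
    """Crude transient shaper: reduce gain while a fast envelope
    runs ahead of a slow one, which happens during attacks."""
    fast_c = math.exp(-1.0 / (sr * fast_ms / 1000.0))
    slow_c = math.exp(-1.0 / (sr * slow_ms / 1000.0))
    fast = slow = 0.0
    out = []
    for s in x:
        level = abs(s)
        fast = fast_c * fast + (1 - fast_c) * level
        slow = slow_c * slow + (1 - slow_c) * level
        excess = max(0.0, fast - slow) / (fast + 1e-12)  # 0 (steady) .. ~1 (attack)
        out.append(s * (1.0 - amount * excess))
    return out

def saturate(x, drive=1.5):
    """Soft clipper, normalized so that an input of 1.0 maps to 1.0."""
    norm = math.tanh(drive)
    return [math.tanh(drive * s) / norm for s in x]

def precondition(x, sr=32000, cutoff_hz=12000.0, drive=1.5):
    """Hypothetical chain: EQ -> transient shaper -> saturation."""
    x = one_pole_lowpass(x, cutoff_hz, sr)
    x = soften_transients(x, sr)
    return saturate(x, drive)

# Silence followed by a sudden 440 Hz burst, at the 32 kHz rate from the thread.
sr = 32000
burst = [0.0] * 64 + [0.9 * math.sin(2 * math.pi * 440 * n / sr) for n in range(2048)]
processed = precondition(burst, sr)
print("input peak: ", max(abs(s) for s in burst))
print("output peak:", max(abs(s) for s in processed))
```

The only way to judge such a chain is to encode the result and listen, then adjust the parameters by ear per track.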

The modelling / adaptive approach is not worth it. The outcome will be WMA10 LBR :lol:

So I won't do this project, since there is an easier solution: upgrade your Windows Media Encoder :P

remember:
WMA9, 9.1, 9.2, 10 are all the same bitstream (=WMA2). A decoder that can play WMA9 can also play WMA10 (but not WMA10Pro..). So why stick with 9.0 if there are already better encoders out, introducing better low-bitrate processing?

thx MS for that version hell on WMA btw :x

Post

PurpleSunray wrote: As said, you could try to understand how the encoder works and put some config onto a chain of processing plugins.
Thanks for your thorough responses and also for the humor, it made me laugh.

If anybody else wants to jump in, don't be shy.

I understand this is not as simple as I perhaps originally wrote in the OP. I also understand that perfection with such a task is impossible. What I ask is some research and a "best attempt". I have plenty of test/sample audio that can be used when modelling the plugin.

Hope someone will take a stab at this.

Thanks.

Post

genie40204 wrote:The use case will be for streaming and archiving, using WMA 9 at 32 kbps, 32 khz, Stereo. This will be for personal use and the cost for having this created is the least of my concerns as this would be immensely helpful for my needs.
Imho it is far more economical to:
1. Invest in storage and bandwidth. Both are insanely cheap nowadays. Do the math: how much storage and bandwidth can you get for, say, $1000? And how many TB do you really need?
2. Switch to a better encoder than WMA9. I'd recommend AAC, which offers better quality than MP3 at lower bitrates. And it is supported by ffmpeg.

Could you elaborate why you are limiting your solution to this ancient MS encoder, instead of switching to a modern industry standard such as AAC? Really, you should take the tool proven to be best fitted for the task.
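
The "do the math" part above can be made concrete with a few lines of Python. This is only back-of-the-envelope arithmetic for a constant-bitrate stream; container overhead is ignored, the helper names are made up for the example, and the 64 and 128 kbps rows are added purely for comparison with the 32 kbps figure from the thread.

```python
def bytes_per_hour(bitrate_kbps: float) -> float:
    """Size of one hour of constant-bitrate audio in bytes (no container overhead)."""
    bits_per_second = bitrate_kbps * 1000
    return bits_per_second / 8 * 3600

def hours_per_terabyte(bitrate_kbps: float) -> float:
    """Hours of audio that fit into 1 TB, taken as 10**12 bytes."""
    return 1e12 / bytes_per_hour(bitrate_kbps)

for kbps in (32, 64, 128):
    mb = bytes_per_hour(kbps) / 1e6
    print(f"{kbps:>3} kbps: {mb:6.1f} MB/hour, {hours_per_terabyte(kbps):8.0f} hours per TB")
```

At 32 kbps an hour of audio is 14.4 MB, so quadrupling the bitrate to 128 kbps still only costs 57.6 MB per hour.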

Post

BertKoor wrote:
Could you elaborate why you are limiting your solution to this ancient MS encoder, instead of switching to a modern industry standard such as AAC? Really, you should take the tool proven to be best fitted for the task.
The main reasons are convenience, comfort, and preference. To whom I will be streaming and for what I will be storing/archiving are also critical factors in the decision to use WMA 9, but I'd prefer not to detail those specifics if possible. The plug-in will be for private use only, nothing commercial, that is for certain.

The decision to go the WMA 9 route is not able to be changed at this point in the project. I do appreciate your suggestions and I do fully understand your concerns and observations, but it must be WMA 9.

Thanks.

Post

genie40204 wrote: The decision to go the WMA 9 route is not able to be changed at this point in the project. I do appreciate your suggestions and I do fully understand your concerns and observations, but it must be WMA 9.
Again (I have been trying to tell you this since the beginning): your listeners are not affected by your decision to upgrade the encoder version.
If you start streaming WMA10Pro LBR tomorrow, none of your listeners has to change anything.
Old WMA9 decoders play the baseband audio with the same crappy quality as now.
A WMA10-capable decoder (any WMP later than v11) can also decode the LBR extension, so that low bitrate sounds good.

Post

What happens if you run it through some sort of dither? Would that improve things, or, as it is so low-level, would it be stripped out anyway?

Post

mcbpete wrote:What happens if you run it through some sort of dither - Would that improve things, or as is so low level would be stripped out anyway ?
Dithering is something I hadn't tried yet.

EDIT: Did a few quick tests before leaving for work. Dithering does seem to help. I dithered to 8-bit prior to encoding to WMA, is that what you meant?

Also, WMA 9 is required because of compatibility with Windows 95 and 98, which do not support WMP 11.
Last edited by genie40204 on Tue Nov 14, 2017 3:43 pm, edited 2 times in total.
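
As a rough illustration of what dithering to 8-bit before encoding does to the signal, here is a minimal Python sketch. It makes no claim about how WMA 9 reacts to the result; the function name `quantize`, the TPDF dither shape, the fixed seed and the quiet test tone are all assumptions chosen for the example.

```python
import math
import random

def quantize(samples, bits, dither=True, seed=0):
    """Requantize float samples in [-1, 1) to `bits` bits, optionally with TPDF dither."""
    rng = random.Random(seed)
    levels = 2 ** (bits - 1)              # 128 steps per polarity for 8-bit
    out = []
    for x in samples:
        scaled = x * levels
        if dither:
            # TPDF dither: difference of two uniform values, spanning +/-1 LSB
            scaled += rng.random() - rng.random()
        q = max(-levels, min(levels - 1, round(scaled)))
        out.append(q / levels)
    return out

# A very quiet 1 kHz tone at 32 kHz, with an amplitude below half an 8-bit step.
sr = 32000
tone = [0.003 * math.sin(2 * math.pi * 1000 * n / sr) for n in range(sr)]

plain = quantize(tone, 8, dither=False)
dithered = quantize(tone, 8, dither=True)

print("non-zero samples without dither:", sum(1 for v in plain if v != 0))
print("non-zero samples with dither:   ", sum(1 for v in dithered if v != 0))
```

Without dither the quiet tone rounds to silence; with dither it survives, but buried in broadband noise that the encoder then has to deal with as well.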

Post

How does dithering help there?
Quantization errors will not be a major problem with 32kbps WMA.
But you have problems like pre-echoes / post-echoes.
They come from the heavy band filtering via FFT.
How do you want to solve that via pre-filtering??

Example:
There is a loud frequency and 2 not-so-loud frequencies in the same critical band.
According to the psychoacoustic model of the encoder, a human ear will focus on the loud frequency and focus less on the 2 not-so-loud frequencies if they are in the same critical band.
Because of the low bitrate setting, the encoder might now decide to further reduce the amplitude of the 2 not-so-loud frequencies to save bits. It might even decide to drop a frequency completely.
In order to do that, it splits the critical band into sub-bands and removes the unwanted frequencies.
These sharp cutoffs are what cause the ringing.

To solve it, you first need to understand the psychoacoustic model of the encoder so that you know what the encoder will actually do. Hard enough, since the WMA encoder is closed source.
And then you need to come up with an algorithm for pre-processing a signal, so that an FFT with a heavy band filter afterwards does not cause ringing.

And this is only one of the many processing steps that can cause artifacts.

What you are asking for here is a huge research task in the first place.
But the motivation for doing this is minimal if you're a codec guy.

Solutions to these problems exist already. You get rid of ringing by designing the band filtering differently.
But you need to do that in the encoder, not in a plugin before the encoder.

It is like putting an apple into your smoothie mixer.
Your request is like: please pre-process this apple, so that it still looks like an apple when it comes out of the mixer.
Well.. I can do this, by modifying the mixer and removing the blades.
But I have no idea how to pre-process that apple so that the mixer does not shred it into apple smoothie.
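
The ringing from sharp spectral cutoffs described above can be reproduced in a few lines of Python. This is a toy model under stated assumptions: a plain DFT of one 256-sample block stands in for whatever transform the closed-source WMA encoder really uses, and the cutoff of 32 bins is arbitrary. It only shows that zeroing bins with a hard edge smears a click in time, including ahead of the click (pre-echo).

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (fine for a short demo block)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT, returning the real part."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

N = 256
click = [0.0] * N
click[128] = 1.0                  # a single transient in the middle of the block

spectrum = dft(click)
keep = 32                         # hypothetical cutoff: keep only the lowest 32 bins
for k in range(N):
    # zero everything above the cutoff, including the mirrored negative frequencies
    if keep < k < N - keep:
        spectrum[k] = 0

smeared = idft(spectrum)

# Energy in the samples well before the click: zero originally, non-zero afterwards.
pre_energy_before = sum(v * v for v in click[:120])
pre_energy_after = sum(v * v for v in smeared[:120])
print("energy ahead of the click, original:            ", pre_energy_before)
print("energy ahead of the click, after brick-wall cut:", pre_energy_after)
```

The original block is silent before the click; after the hard cut there is audible-in-principle energy ahead of it, which is the pre-echo.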

Post

PurpleSunray wrote:
It is like putting an apple into your smoothie mixer.
Your request is like: please pre-process this apple, so that it still looks like an apple when it comes out of the mixer.
Well.. I can do this, by modifying the mixer and removing the blades.
But I have no idea how to pre-process that apple so that the mixer does not shred it into apple smoothie.
I see, so because we cannot see the inner workings of the mixer, we cannot mold the apple properly to protect it from the mixer's shortcomings.

Ok, that makes sense. So, do you believe that the Orban PreCode and NeuStar I mentioned in the OP are marketing gimmicks, or well-researched algorithms based on their understanding of the "mixer"?

Post

I don't know what PreCode is doing.
There is zero information available about it, other than the marketing blabla (<wall of buzzwords>)

There is stuff you can do - like on the example above.
If you know that certain frequencies will be cut, and the cut will cause ringing, you could remove them beforehand so there is nothing left for the encoder to remove.
But then we are back to the problem that you don't know what the encoder will do.
This is what makes me very sceptical about PreCode.
- Orban's PreCode technology ... There are several factory presets tuned specifically for low bitrate codecs.
So are there like "Apple AAC", "FAAC", "libav", "Nero AAC", "Nero AAC low latency", "Nero AAC multi-pass on LBR mode" .. presets on this device? I spent quite some time on LBR support for Nero AAC, for instance, and I know it was sooooooooo much better than FAAC at the time I implemented it. So Mr. Orban ... I now select "low-bitrate AAC".
If you keep that special frequency, it will cause artifacts with FAAC - so you should remove it.
Nero AAC can handle it in multi-pass mode, so if your PreCode removes it, I slap you in the face because all my multi-pass LBR optimization work was useless then.

All I can say about PreCode is speculation.. they do not describe what PreCode does, so without having & analyzing it, I cannot say much about it.
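
The "remove it beforehand" idea mentioned above can be sketched as a plain low-pass filter applied before the encoder. This is a guess at the general shape of such a pre-filter, not a description of PreCode: the 10 kHz cutoff, the 101 taps and the function names are assumptions, and nothing here is derived from how the WMA 9 encoder actually behaves.

```python
import math

def windowed_sinc_lowpass(cutoff_hz, sr, taps=101):
    """Linear-phase FIR low-pass (Hann-windowed sinc), normalized to unity DC gain."""
    fc = cutoff_hz / sr
    mid = (taps - 1) / 2
    h = []
    for n in range(taps):
        m = n - mid
        ideal = 2 * fc if m == 0 else math.sin(2 * math.pi * fc * m) / (math.pi * m)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (taps - 1))
        h.append(ideal * window)
    total = sum(h)
    return [c / total for c in h]

def fir_filter(x, h):
    """Direct-form convolution, treating samples before the start as zero."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

sr = 32000
h = windowed_sinc_lowpass(cutoff_hz=10000, sr=sr)   # 10 kHz is a guessed cutoff

low = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(2000)]
high = [math.sin(2 * math.pi * 14000 * n / sr) for n in range(2000)]

# Skip the first 200 samples so the filter's start-up transient is not measured.
low_out = fir_filter(low, h)[200:]
high_out = fir_filter(high, h)[200:]
print("1 kHz RMS after filter: ", rms(low_out))
print("14 kHz RMS after filter:", rms(high_out))
```

Note that a steep filter like this has ringing of its own, so it can move the problem in front of the encoder rather than remove it, and the right cutoff still depends on what the encoder would have discarded.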
