The way a vocal leveler SHOULD work.

Official support for: meldaproduction.com

Post

The following is the result of THOUSANDS of hours of manual vocal leveling, countless discussions with experienced engineers, and extensive use of every available leveling plugin.

PROBLEM

Every current leveler changes levels during vocal phrases. This is, by definition, distortion. MAutoVolume's additional controls help in other ways, but the distortion is inherent in the way all such plugins work. Meanwhile, experienced engineers still insist on performing this tedious task manually despite a variety of levelers being available for years. This will continue to be the case until an entirely different approach is adopted that not only eliminates this distortion entirely, but emulates a proper manual workflow.

SOLUTION

The ONLY way to avoid the distortion all the auto-levelers are causing is, by definition, to avoid changing levels during vocal phrases... and therefore to restrict level changes to moments of silence. In basic terms, the ONLY time a level change causes no waveform distortion is when it happens at a level of zero. What a user wants from such a plug is to normalize the levels (at least to a degree). What they don't want is to HEAR that leveling... as is the case with every currently available leveling plugin.

Every engineer I've met who does a particularly good job of vocal leveling already intuitively incorporates this concept into their workflow in one way or another... yet the plugins based on the current approach do not... and can not.

IMPLEMENTATION

Properly adjusting each section between silences is just one of those processes that REQUIRES a separate read pass (like Melodyne, Vocalign, and others) to do properly. It's not that vocal riding CAN'T be fully automated. It's that it can't be done within the currently over-saturated and under-effective single-stage Vocal Rider paradigm. As a two stage process, it not only achieves zero distortion, but does so with greater leveling accuracy, more precise control, and a more intuitive interface.

STAGE 1: READ PASS

The plug reads the entire track (think Vocalign or similar), and places internal markers at each silence of a minimum length. It then calculates the peak, RMS, and LUFS values of each resulting segment.
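
A minimal sketch of what such a read pass could look like, assuming mono floating-point samples held in a plain Python list. The function names, the silence threshold, and the minimum silence length are all hypothetical choices for illustration, not part of any existing plugin. LUFS is left out because it needs the K-weighting filter and gating defined in ITU-R BS.1770, which does not fit in a short example.

```python
import math

def find_silences(samples, threshold, min_len):
    """Return (start, end) index pairs for runs of at least min_len quiet samples."""
    silences = []
    run_start = None
    for i, s in enumerate(samples):
        if abs(s) <= threshold:
            if run_start is None:
                run_start = i          # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                silences.append((run_start, i))
            run_start = None           # shorter gaps stay inside the phrase
    if run_start is not None and len(samples) - run_start >= min_len:
        silences.append((run_start, len(samples)))
    return silences

def segments_between(silences, total_len):
    """Return (start, end) pairs for the non-silent regions between silences."""
    segments = []
    cursor = 0
    for start, end in silences:
        if start > cursor:
            segments.append((cursor, start))
        cursor = end
    if cursor < total_len:
        segments.append((cursor, total_len))
    return segments

def measure(samples, start, end):
    """Return (peak, rms) of samples[start:end] as linear amplitudes."""
    chunk = samples[start:end]
    peak = max(abs(s) for s in chunk)
    rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
    return peak, rms
```

Gaps shorter than `min_len` never produce a marker, so a brief dip inside a word stays part of its segment.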

STAGE 2: USER CONTROL

There are two main dials for the user control. The first simplifies by reducing the number of markers. As the dial is turned, markers get grayed in order of increasing silence length. (Similar to many other simplification controls with markers like in Apple Loops utility.) It then recalculates the values for each resulting segment.

In more general terms, it controls the number of segments. At one end of the range, there are many segments, and at the other end there are fewer (and generally larger) segments. This is similar in manual editing to how many cuts an engineer might make to adjust individual regions either per word, per phrase, etc.
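
The first dial could be sketched roughly as follows, assuming each marker is stored as a hypothetical `(position, silence_len)` pair produced by the read pass and the dial is a value from 0 to 1. Both function names are made up for illustration.

```python
def active_markers(markers, simplify):
    """Disable the shortest-silence markers first.

    markers: list of (position, silence_len) pairs.
    simplify: dial value in [0, 1]; 0 keeps every marker, 1 disables all.
    Returns the positions of the markers still active, in track order.
    """
    n_disable = int(simplify * len(markers))
    by_silence = sorted(markers, key=lambda m: m[1])
    kept = by_silence[n_disable:]
    return sorted(pos for pos, _ in kept)

def segments_from_markers(positions, total_len):
    """Cut the range [0, total_len) at each active marker position."""
    bounds = [0] + list(positions) + [total_len]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]
```

With four markers and the dial at its midpoint, the two markers sitting on the shortest silences are greyed out and their neighbouring segments merge. A manual override would just be a per-marker flag applied after this step.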

The second dial determines how tight the level matching is. At the one extreme, it changes nothing, and the performance is just as loose as originally recorded. At the other extreme, all segments are perfectly level matched. Effectively, this is a simple control for level matching "tightness" that maintains all existing dynamic relationships in exact ratios. The user can further choose whether the matching should be done by peaks, RMS, or LUFS.
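
One possible reading of the tightness dial, as a sketch: measure each segment's level in dB (by whichever of peak, RMS, or LUFS is selected), then move it a fraction of the way toward a common target. Using the average segment level as the target is an assumption made here for illustration, and `segment_gains_db` is a hypothetical name.

```python
import math

def to_db(amplitude):
    """Convert a linear amplitude (> 0) to decibels."""
    return 20.0 * math.log10(amplitude)

def segment_gains_db(levels_db, tightness, target_db=None):
    """Return one static gain (in dB) per segment.

    levels_db: measured level of each segment in dB.
    tightness: dial value in [0, 1]; 0 changes nothing, 1 matches all levels.
    target_db: level to pull toward; defaults to the mean segment level
               (an assumption, not something the proposal specifies).
    """
    if target_db is None:
        target_db = sum(levels_db) / len(levels_db)
    return [tightness * (target_db - level) for level in levels_db]
```

Because every segment's distance from the target is scaled by the same factor `(1 - tightness)`, a segment that was twice as far from the target as another (in dB) stays twice as far at any dial position. Each gain is constant across its segment and only changes at a marker, i.e. in silence.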




That's it.





For those who want more control (and isn't that what drew most of us to Melda?), I would suggest that the user should also be able to manually grey out or re-enable the markers individually... just like they can with flex markers, etc. This allows the complete elimination of any residual issues where a breath was included as its own segment when the user didn't want it to be, but everything else was good, etc.

This isn't a casual suggestion. It's YEARS in the making. On top of being the ONLY leveler design that causes ZERO distortion, it also completely eliminates the need for all sorts of complex time constant controls, and replaces them all with a simple and supremely intuitive dial. Gone are the days of chasing one leveler after another with different settings just to minimize the damage each is causing. Let's just stop causing ANY damage in the first place.

If this doesn't immediately strike anyone reading it as the right solution to a daily problem engineers have been facing since the dawn of recording, please read it again or ask questions. If I could drop everything and build one plug, this would be it. I can't right now, so I'm just going to sing it from the mountain tops until someone builds it.

I'd prefer Melda as I'd prefer to have more control over it whereas other devs might dumb it down too much... Plus I'm already paying for the subscription. :D
Last edited by Annabanna on Sun Aug 12, 2018 1:42 am, edited 20 times in total.

Post

I second that immediately! Hopefully Vocalign will not, talking about distortion, be used too much as an example.
Looking forward in excitement.
Cheers!

Post

...question, is your reference to " Drum Leveler" the Melda version or one of the others?.../s~
mba m2 15" | 16gig.ram | 1tb ssd | Sonoma 14.2.1 (23C71)
mbp i9 16" | 16gig.ram | 1tb ssd | Sonoma 14.2.1 (23C71)
logic10.8.1  | reaper7.07 | focusrite.2i2

Post

SGWork wrote:I second that immediately! Hopefully Vocalign will not, talking about distortion, be used too much as an example.
Looking forward in excitement.
Cheers!
Haha. Just thinking off the top of my head of 2-stage plugin examples. A better visual analogy is the Apple Loops utility as it not only shows the waveform in its own window after a scan, but also has a slider to reduce the number of automatic markers. Not sure how many people have used that, though.

For years, I've thought this solution was so obvious that any day someone would make it so I could buy it, so I didn't take the time and effort to track down devs to get it built. In the meantime, years have passed, and all we have to show for it is another half dozen plugs based on the Vocal Rider model. Meanwhile, other plugs have AI, machine learning, cross-channel communication, and all sorts of other advances, and top engineers everywhere are STILL doing their vocal leveling manually.

I've shared the idea with a few successful engineers, and been told multiple times now that it's exactly what they've been doing manually for years.


If I had known we'd still be stuck here in 2018, I'd have taken the time to figure out how to build the damn thing years ago, and offered to sell it to Waves.
Last edited by Annabanna on Sat Aug 11, 2018 7:34 pm, edited 3 times in total.

Post

steve2KVR wrote:...question, is your reference to " Drum Leveler" the Melda version or one of the others?.../s~
Sorry. I originally typed "Sound Radix Drum Leveler." I think I may have revised the post to keep it short, and accidentally erased that bit.

Edited OP to remove the reference. That dial does the same thing as what I'm describing, but its name is misleading, and will probably only cause confusion.

Post

I am having difficulty understanding the problem.
Singers will sing a vocal phrase with certain parts being much louder than others.
Our job is to even this out so that the intelligibility remains.
Why would you want to leave the whole phrase at a certain changing level and only adjust the next phrase?
This makes no sense to me.
Spencer

Post

spencerlee wrote:I am having difficulty understanding the problem.
Singers will sing a vocal phrase with certain parts being much louder than others.
Our job is to even this out so that the intelligibility remains.
Why would you want to leave the whole phrase at a certain changing level and only adjust the next phrase?
This makes no sense to me.
Spencer

In the case you're describing, all you'd have to do is... not turn the dial. It sets more markers than needed as a starting point, but gives you a simple way to have fewer if it suits you. There's no reason at its most sensitive setting that there shouldn't be a marker between each word at a minimum. After all, it's just a flick of the dial to dial them back for the entire track if you want. You COULD have no active markers in an entire phrase, but that would never be the default case.

Loop/flex utilities are no different. You can just do nothing, and it will set the markers and do all the work for you, or you can choose to dial it back if the auto-sensing was too sensitive, or correct just a few individual markers if you really want perfection. There are actually quite a few tools in graphic design that work this way too, and they're HUGE time savers due to the ridiculous simplicity if you don't want to manually edit anything.

The only thing it would tend to have an issue with is sustained notes that vary wildly in level. In such a case, though, I suspect 5 different engineers would have 5 different takes on exactly what they would WANT to happen... and if they're paying attention to that level of detail, it's unlikely any of them would be satisfied with how any of the single stage Vocal Rider style plugs was able to handle it.

As for seeing the problem: The current plugs tend to cause more issues than they (attempt to) solve, and engineers everywhere are still doing it manually as a result.

As someone who has spent thousands of hours manually editing vocals, it's hard for me NOT to see the problem.

What I am describing just automates what I and many others are already doing manually. The difference is, if you decide later you want everything a bit tighter, or a bit looser, you just turn the second dial. Working manually, you'd have to start all over. The ability to make it as tight or loose as you want with a single dial, and the choice of peak, RMS, or LUFS matching would make it far more powerful than anything on the market, and far easier to use.

Post

I have spent more than 50 years dealing with dialog and vocals, and I find that a couple of great compressors and possibly a limiter with a de-esser or two deal with the voice quite nicely. For any audio that the processors don't catch, I just gain the waveform.
For any esses that the de-essers don't catch, I just gain the waveform.
It seems you are making things too difficult.
Most of my work has been records, but I used to get booked specifically by advertising agencies to mix commercials because my dialog was louder and clearer than anybody else's.
At 70 years old, I am at this moment mixing several Asian records because they like my vocal sound.
By the way here is a picture of the Voice Processor I am just finishing in MXXX1
Spencer
Voice Processor.jpg

Post

Reducing 95% of the work down to two dials is almost the polar opposite of making things complicated.

I don't compress when I want leveling or vice versa. Nor do I eat an apple when I want an orange. Nothing wrong with doing it that way if it's the sound you prefer, but leveling is still a cornerstone of the workflow for countless engineers. It's also one of the most common complaints due to the sheer tediousness of doing it well.

The fact that I never use distortion plugs doesn't change the fact that plenty of other folks use them, and may care about building a better one. It just means we have different workflows.

Post

The sooner levelling goes away and dies the better.
There is nothing worse than watching a TV series or movie where the dialog has just been levelled.
The average household has so much residual noise that if every part of the dialog is not at the same level, you won't understand the words.
The CSI series is a good example of dialog where the space between phrases and words has been stripped and what's left over is compressed to the max.
You just don't miss any dialog.
For music, if you compress the shit out of the vocal, it allows the music to come up to meet the vocal.
If you just level the vocal the music cannot be loud enough.
Spencer

Post

https://www.youtube.com/watch?v=MrpZlN5vcFI
Here is an example. I recorded this 2 weeks ago in one take with live vocal.
I used my Voice Processor on the Vocal.
The mike was my own build of a 251/C12 tube mike with inbuilt mike preamp.
The guitar was a DI, EQ'd with MTurboEq.
On the output was the Sonnox limiter.
Spencer

Post

I get it. You don't like leveling.

You seem to be under the impression I am unfamiliar with both compression and limiting on vocals.

So be it. Thanks for sharing.

Post

Well, I don't think it's that simple really. From my experience there just are fluctuations in level caused by the singer. We used compressors for that, but it's not exactly ideal. MAutoVolume does that pretty well, but yes, it causes distortion, as any dynamics processor does. I think 2 MAutoVolumes in series may actually work well: first a slow one, then a quick one to fix the details. Not sure.

Anyways, the solution à la VocAlign is rather complex, from both the dev's and the user's points of view. I wanted to try to make something like VocAlign some day, but didn't really get time for it, so if I get to that, I'll make sure the leveling is included too. But there are several obstacles. For instance, what makes you think that any combination of RMS/Peak/LU will get you the correct level for segments of different length? I of course have an answer for that: it won't :D. These things seem easy from the user's point of view, but when you actually start making it, you run into severe troubles.
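
One way the segment-length problem can show up, sketched with made-up synthetic signals (this is an assumed illustration of the objection, not an example from the post): the same loud word measures very differently by RMS depending on how much quieter material ends up in the same segment.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples (linear amplitude)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

# Hypothetical test signals: a loud word, and the same word followed by a soft tail.
loud_word = [0.5, -0.5] * 500        # 1000 samples at amplitude 0.5
soft_tail = [0.05, -0.05] * 1500     # 3000 samples at amplitude 0.05

short_segment = loud_word
long_segment = loud_word + soft_tail

# How much louder the short segment reads than the long one, in dB.
rms_gap_db = 20 * math.log10(rms(short_segment) / rms(long_segment))
peak_gap_db = 20 * math.log10(peak(short_segment) / peak(long_segment))
```

Here the RMS readings differ by nearly 6 dB while the peaks are identical, so matching by RMS would push the loud word in the long segment well above the same word in the short one. Matching by peak avoids that in this toy case but fails in others, for example when a single transient dominates a segment.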
Also, this is all sort of an offline process, which seems much more suitable for a DAW... Similarly to drum replacing, which imho doesn't make sense in a plugin, this one should be done in the DAW with ease, though I'm not sure if any DAW actually does that.
Vojtech
MeldaProduction MSoundFactory MDrummer MCompleteBundle The best plugins in the world :D

Post

I get why everyone tries to force a single stage solution. It's just an unfortunate fact that this particular problem can't be solved that way. I don't pretend to know the details, but assume it's probably quite a bit more difficult to develop any 2 stage plug. If that weren't the case, surely someone would have already built this years ago.

I spent years chaining every combination of every available leveling plug. It's really just spreading multiple problems in smaller doses. The fast time constant causes problems in one place, the slow constant in others... now you have both... just in smaller doses.

Between chaining all the leveling plugs, and every permutation involving the various compressors and limiters, I'm sure I've tried tens of thousands of chains at this point. If anything, it's pushed me toward the pure zero-crossing manual approach as I've become intimately familiar with the problems each of the other approaches causes. I rarely speak to any top tier engineer that hasn't come to similar conclusions... though they may choose to also use heavy compression/limiting in conjunction depending on their style / personal sonic preferences.

I get that single stage plugs will always be the low hanging fruit from a development standpoint, but vocal leveling really is a case where there is a right approach, and it just doesn't fit neatly into that paradigm. SHOULD it be part of the DAW? Maybe... in the sense that it's as much a utility (like Apple Loops Utility) as a plugin. The same could be said of Melodyne or any other 2 stage plug.

That doesn't change the fact that it ISN'T part of the DAW, though... so until someone builds it, it just doesn't exist... and folks around the world who would be more than willing to pay for a real fix to their single most time consuming and frustrating task just keep doing it manually.

I've paid good money on many occasions to just have an engineer buddy of mine do the manual leveling for me, and I've already got every leveling plug there is. It's not just a time issue. It's a money issue.

Post

The main problem here is that the plugin does not have access to the actual signal. So in a way the plugin system is not designed for 2-stage usage. Whenever you change the track, move anything, affect anything, the whole thing will stop working properly, because it was counting on the signal used when analysing. Plus there's potentially a bigger issue: the plugin needs to store the analysis somewhere. It technically should be inside the project itself, but depending on the length of the audio and the amount of data for each region, that could potentially mean huge project files. I think VocAlign even saves the data elsewhere, and I wonder what happens when you move the project to a different computer then...

Anyways, when it comes to levelling vocals, I think it should be possible, but the question is how well this would actually work, because I personally don't believe that changing level only between phrases is enough. And changing level at a zero crossing will NOT avoid distortion. Then there's the problem with detecting level...
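
A small sketch of why a gain step at a zero crossing is not the same as a gain step in true silence, using an assumed test sine and a hypothetical helper `slope_jump`. At a zero crossing the sample value stays continuous, but the slope of the waveform changes abruptly, and that kink adds content that was not in the original signal. Where the signal is exactly zero over a stretch, the step changes nothing at all.

```python
import math

def slope_jump(signal, index):
    """Slope just after signal[index] minus slope just before it."""
    before = signal[index] - signal[index - 1]
    after = signal[index + 1] - signal[index]
    return after - before

n = 64                                   # samples per cycle of the test sine
sine = [math.sin(2 * math.pi * i / n) for i in range(2 * n)]
switch = n                               # zero crossing at the start of cycle 2

# Apply unity gain before the zero crossing and a gain of 2 (about +6 dB) after it.
stepped = [s * (1.0 if i < switch else 2.0) for i, s in enumerate(sine)]

# Same gain step applied in the middle of digital silence.
silence = [0.0] * 8
stepped_silence = [s * (1.0 if i < 4 else 2.0) for i, s in enumerate(silence)]
```

In this sketch `slope_jump(sine, switch)` is essentially zero, while `slope_jump(stepped, switch)` is about 0.098: the slope roughly doubles at the switch point. `stepped_silence` is identical to `silence`.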
Vojtech
MeldaProduction MSoundFactory MDrummer MCompleteBundle The best plugins in the world :D
