KVR Audio

Chrisboy2000 · Post by **Chrisboy2000** » Sun Nov 17, 2013 10:41 pm

This seems like a very dumb question, but how is it possible that a better audio interface can yield lower latencies without glitching? I started C++ coding with JUCE and wonder about this because the processBlock method that fills the audio buffer runs on the CPU, so heavy stuff like convolution stresses only the CPU. So what's the role of the audio interface here? I know that graphics cards have a separate GPU that can be accessed with e.g. OpenGL to do graphics work like texture mapping, but do audio cards have something similar? Most likely I haven't understood how the audio interface and CPU work together, but this question has been bugging me for a few days.

AdmiralQuality · Post by **AdmiralQuality** » Mon Nov 18, 2013 6:14 am

It's basically because we're attempting to do real-time (or at least almost real-time) music on a non-real-time operating system.

It also has to do with the way the bus technology that connects the audio interface to the host CPU and RAM work, as well as how the device driver (which is typically proprietary to the manufacturer of the audio-interface) behaves.

Basically, the audio interface is in charge and asks the driver to send and/or receive more samples when it needs them. The audio interface is running on a fixed clock, so samples have to come out exactly at the right time, or we'd hear all kinds of horrible artifacts. So every 44100th of a second, the audio interface needs a new sample value that better be there, or we'll hear a click.

It could ask for them one sample at a time, but that's rather inefficient compared to asking for a bunch at a time. So there are buffers that hold many samples that are passed between the audio interface and the host machine, and the driver and the DAW software connected to the driver. If those buffers can't be filled in time, then we get the clicking/buzzing sound and everything goes to shit.

Here's an analogy I've used before: you can think of the buffer as a hopper full of cookie dough that squeezes out one cookie (sample) worth of dough at a time onto a conveyor belt (that represents time). We don't want any missing (or double) cookies on our belt, we want them all nicely evenly spaced. So having enough dough for many cookies in our hopper gives us some slack time that we can use in retrieving more dough to put in the hopper.

As our non-real-time operating system doesn't allow us to predict exactly how long any task will take, we make a whole bunch of samples all at once, then load the buffer of the audio interface with them, where they can trickle out at exactly one sample per 44,100th (or whatever the sample rate is) of a second.

The audio driver, which is the only software on the host that actually understands how to talk to the audio interface out there on some bus, actually "pumps" the DAW software. You might imagine that when you hit a key on a softsynth, the DAW calls the driver to say it's got some audio for it. But it's the other way around. The driver is constantly calling the DAW, saying, you got my samples ready? And it better have them ready right then, or again, we get dropouts/clicking/missing cookies! Once the host knows the driver has received a new buffer's worth, it starts writing into another buffer. Typically 2 or 3 of these buffers swap in rotation. (You may have heard terms like double and triple buffering. That's what that is. Comes up in display graphics too.) So while the host is rendering to one buffer, the driver/interface are playing out the previous already-rendered buffer.
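To make the "driver pumps the host" idea concrete, here is a minimal Python sketch of a double-buffered pull model. It is a toy simulation, not any real driver API: `run_driver`, `CountingHost` and `render` are made-up names, and a real driver does this from its own thread with hardware timing.

```python
from typing import Callable, List


def run_driver(render: Callable[[int], List[float]], frames: int, periods: int) -> List[float]:
    """Toy driver loop (hypothetical): every period the driver asks the host
    for a fresh buffer while the previously rendered buffer is played out."""
    played: List[float] = []
    front = [0.0] * frames            # buffer being played; starts out as silence
    for _ in range(periods):
        back = render(frames)         # the driver calls the host, not the other way around
        played.extend(front)          # the hardware plays the buffer rendered last period
        front = back                  # swap: the new buffer is played next period
    return played


class CountingHost:
    """Stand-in for a DAW: renders a ramp so we can see which samples come out when."""

    def __init__(self) -> None:
        self.next_value = 0

    def render(self, frames: int) -> List[float]:
        out = [float(self.next_value + i) for i in range(frames)]
        self.next_value += frames
        return out


host = CountingHost()
print(run_driver(host.render, frames=4, periods=3))
```

The first period plays silence because nothing has been rendered yet, so whatever the host renders is always heard one buffer later. That one-buffer delay is the latency discussed in this thread.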

Normally, when everything's working okay, the CPU can produce these samples MUCH faster than they play out. (And if the CPU can't keep up with the demand, obviously we get the overtaxed DAW and everything turns to nothing but horrible dropouts as we're hearing only partially completed buffers.) So rather than the driver bothering the host for each and every sample, one at a time, it makes more sense to ask for a bunch all at once. But the bigger that bunch is, the longer it will be before we hear it come out of the audio interface. This is latency, and for performing live music it's BAD because we humans need to hear the note when we played it, not some time next week.
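As a rough illustration of how buffer size turns into latency, the arithmetic is just frames divided by sample rate. The sketch below assumes that simple model; `buffer_latency_ms` is a made-up helper, and real systems add converter and bus delays on top.

```python
def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int, num_buffers: int = 1) -> float:
    """Time represented by `num_buffers` buffers of `buffer_frames` frames, in milliseconds."""
    return 1000.0 * buffer_frames * num_buffers / sample_rate_hz


# Buffer sizes here are just typical examples, not values from the thread.
for frames in (64, 256, 1024):
    print(frames, "frames ->", round(buffer_latency_ms(frames, 44100), 2), "ms")
```

At 44.1 kHz that works out to roughly 1.5 ms for 64 frames, 5.8 ms for 256 and 23 ms for 1024, per buffer in flight.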

But, if in our attempt to minimize latency, we make our buffer too small, the CPU might be too busy right at that moment to get around to it. Even though we're not asking for it to produce more samples over a given amount of time than with a long latency, we reach a point where the operating system simply doesn't have enough time. The CPU is multitasking a whole bunch of things and the operating system controls which process gets the current time slice, and again, our desktop operating systems aren't optimized for real-time performance.
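A small sketch of why the same CPU load can be fine at one buffer size and glitch at another. It assumes a very simplified model: the OS delays our callback by some fixed amount, and rendering takes a fixed fraction of the buffer period. The function name and all the numbers are hypothetical.

```python
def meets_deadline(buffer_frames: int, sample_rate_hz: int,
                   render_fraction: float, os_delay_ms: float) -> bool:
    """True if (scheduling delay + render time) fits inside one buffer period.

    render_fraction: share of the buffer period the DSP needs (0.3 = 30% load).
    os_delay_ms: how long the OS kept us waiting before we could start (assumed).
    """
    period_ms = 1000.0 * buffer_frames / sample_rate_hz
    render_ms = render_fraction * period_ms
    return os_delay_ms + render_ms <= period_ms


# Same 30% load and the same 2 ms scheduling hiccup at every buffer size.
for frames in (64, 256, 1024):
    print(frames, meets_deadline(frames, 44100, render_fraction=0.3, os_delay_ms=2.0))
```

With these made-up numbers the 64-frame buffer misses its deadline even though the load is only 30%, because a 2 ms delay is longer than the whole 1.45 ms period. The bigger buffers absorb the same delay without trouble.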

This is why relaxing (increasing) your latency will often cause a clicky/buzzy DAW to play clean. But this only goes so far: if we add enough processes (plug-in synths and effects as well as the host's built-in processing functions) we'll hit a point where we're simply asking the CPU to calculate more than it can in that amount of time, and no amount of additional latency/buffer-size will help.

As for why some DAWs might perform better than others, this is a bottomless pit of esoterica about the physical architecture of the machine (things like what kind of chipset is running the bus the audio interface is on) as well as peculiarities of the software driver for that hardware product. This is why it can be so hard to know if the machine you're buying is good for audio. (Gawd bless the gamers for bringing high performance systems into the realm of affordability!)

And no, except for certain hardware (UAD cards?) the audio interface is pretty dumb. Your hosts and plug-ins (unless they REQUIRE some particular piece of acceleration hardware) run on the host CPU.

Hope that was more illustrative than confusing. Let us know.

Chrisboy2000 · Post by **Chrisboy2000** » Mon Nov 18, 2013 1:13 pm

Thank you for your in depth answer. But it still seems a bit unclear to me:

Normally, when everything's working okay, the CPU can produce these samples MUCH faster than they play out

If I got this right, the CPU usage is calculated from the time the CPU spends in the audio interrupt, right? So in pseudocode:

cpuUsage = processingTime / bufferTime * 100

but this means that the performance only depends on the speed of the CPU, so it is still unclear to me how an interface can improve the performance of the audio interrupt. Also this means processing is not "MUCH" faster (I would expect the processing time to be less than 1 percent in this case).
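For reference, the pseudocode above can be written out as a runnable Python sketch. The function and parameter names are made up for illustration, and `bufferTime` is taken to mean the playback time of one buffer (frames divided by sample rate), which is an assumption about what was intended.

```python
def cpu_usage_percent(processing_time_s: float, buffer_frames: int, sample_rate_hz: int) -> float:
    """Share of one buffer period spent rendering that buffer, in percent."""
    buffer_time_s = buffer_frames / sample_rate_hz   # how long the buffer lasts when played
    return 100.0 * processing_time_s / buffer_time_s


# Hypothetical numbers: 5 ms of processing for a 441-frame (10 ms) buffer at 44.1 kHz.
print(cpu_usage_percent(0.005, 441, 44100))
```

Note this is an average-load figure. It says nothing about when the OS actually lets the code run, which is the part the replies in this thread focus on.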

Even though we're not asking for it to produce more samples over a given amount of time than with a long latency, we reach a point where the operating system simply doesn't have enough time.

Although the same amount of samples are processed with a longer buffer size, I think the CPU will benefit from float vector operations which improve the speed of fewer calls with bigger buffers plus the reduced overhead of decreased function calls. But again, I don't know how an interface would make any difference here.

we'll hit a point where we're simply asking the CPU to calculate more than it can in that amount of time, and no amount of additional latency/buffer-size will help.

This is exactly what I mean. How is it possible that a better interface (like RME Fireface) pushes this limit while low budget interfaces exceed the processor's capability more easily? Is it only the quality of the audio driver (I don't know anything about driver development)?

Keith99 · Post by **Keith99** » Mon Nov 18, 2013 1:56 pm

If you are not taking any audio input in via your interface you are right to say that the CPU is the limiting factor not the interface, although the interface driver will be involved in holding the buffer.

AdmiralQuality · Post by **AdmiralQuality** » Mon Nov 18, 2013 1:59 pm

Chrisboy2000 wrote:Thank you for your in depth answer. But it still seems a bit unclear to me:

Normally, when everything's working okay, the CPU can produce these samples MUCH faster than they play out
If I got this right, the CPU usage is calculated from the time the CPU spends in the audio interrupt, right? So in pseudocode:
cpuUsage = processingTime / bufferTime * 100

Remember, the machine is running more processes than just one DAW. It's not on an interrupt. (The driver may use them but the host doesn't.) I believe in most cases the driver calls a callback function in the host. (Correct me if I'm wrong here folks, I've not written a host yet so I don't have any experience connecting to the various types of drivers directly.)

But yes, in THEORY, if the CPU can process a buffer's worth of samples in the time that buffer represents, you're okay. But again, our DAW isn't the only process on the system and the OS determines when the host gets a time slice. (And the OS doesn't give a shit if you drop samples.)

but this means that the performance only depends on the speed of the CPU, so it is still unclear to me how an interface can improve the performance of the audio interrupt. Also this means processing is not "MUCH" faster (I would expect the processing time to be less than 1 percent in this case).

You say "the audio interrupt" like there is such a thing. That's all entirely dependent on the particular audio hardware and its proprietary driver.

I'm not sure what you're asking in the rest of that. Much faster than what? How can we be expecting a particular processing time when we don't even know what the process is and what kind of CPU it's running on?

Even though we're not asking for it to produce more samples over a given amount of time than with a long latency, we reach a point where the operating system simply doesn't have enough time.
Although the same amount of samples are processed with a longer buffer size, I think the CPU will benefit from float vector operations which improve the speed of fewer calls with bigger buffers plus the reduced overhead of decreased function calls. But again, I don't know how an interface would make any difference here.

Again, you're talking about implementation details that may or may not be there. Remember, there's not just one process happening per buffer, there's a whole series of them. Each effect needs to be processed in the order they're chained. (Parallel effects can be run on separate cores simultaneously though.)
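A minimal sketch of what "processed in the order they're chained" means for a single buffer. The effects here (`gain`, `hard_clip`) are toy stand-ins, not anyone's actual plug-in API.

```python
from typing import Callable, List, Sequence

Effect = Callable[[List[float]], List[float]]


def process_chain(buffer: List[float], effects: Sequence[Effect]) -> List[float]:
    """Run one buffer through each effect in turn; each effect sees the previous one's output."""
    for effect in effects:
        buffer = effect(buffer)
    return buffer


def gain(amount: float) -> Effect:
    return lambda buf: [amount * x for x in buf]


def hard_clip(limit: float) -> Effect:
    return lambda buf: [max(-limit, min(limit, x)) for x in buf]


block = [0.2, 0.6, -0.9]
print(process_chain(block, [gain(2.0), hard_clip(1.0)]))   # gain first, then clip
print(process_chain(block, [hard_clip(1.0), gain(2.0)]))   # clip first, then gain
```

The two orders give different results, which is why a serial chain cannot simply be reordered or run in parallel, and why all of it has to finish within the same buffer period.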

Yes, there's extra overhead with increased function calls. CPU cache behavior may be impacted too, I would assume. As well as branch prediction. But again, during each buffer we typically call a whole bunch of processes from a whole bunch of different vendors (us plug-in developers) as well as the host's own processes.

we'll hit a point where we're simply asking the CPU to calculate more than it can in that amount of time, and no amount of additional latency/buffer-size will help.
This is exactly what I mean. How is it possible that a better interface (like RME Fireface) pushes this limit while low budget interfaces exceed the processor's capability more easily? Is it only the quality of the audio driver (I don't know anything about driver development)?

I don't know a whole bunch about it either but obviously that's part of it. Of course not just the driver, but also how the hardware works, what kind of bus it's on, and again, specifics like what kind of chipset your Firewire and USB are on can make a huge difference as well. It's not JUST the audio interface. I've had machines that sucked with their built in USB or Firewire but rocked when a card was put in that had a better chipset.

And don't forget the operating system. It sounds like you're from the hardware world. We're at the mercy of the OS for a lot of this stuff.

You should find that with a relaxed latency you can get the same performance from just about any audio interface. It's at the low end that the differences show. And you can get the breakup when there's still plenty of CPU power left according to the host or OS's performance meters. Power doesn't do us any good if the instruction pointer isn't in our code.

AdmiralQuality · Post by **AdmiralQuality** » Mon Nov 18, 2013 2:03 pm

Keith99 wrote:If you are not taking any audio input in via your interface you are right to say that the CPU is the limiting factor not the interface, although the interface driver will be involved in holding the buffer.

What?

There's always inputs, whether the host is "listening" to them or not.

toddhisattva · Post by **toddhisattva** » Mon Nov 18, 2013 4:20 pm

Chrisboy2000 wrote: how is it possible that a better audio interface can yield lower latencies without glitching?

As has been mentioned, it could be the bus. Think on this - you might have a cheap built-in audio system that's hanging off an I2C serial bus, and the I2C driver might have to do every bit of signaling by itself - the driver and hardware combo require the CPU to do more work.

Compare to something like PCIe and a driver that leverages DMA - the hardware itself will move the data leaving the CPU free (-er).

CableChannel · Post by **CableChannel** » Mon Nov 18, 2013 4:38 pm

Geez guys, I am interested in the answer to this thread's question, but I don't wanna read all the above. Can you do a short summary when you are done?

BTW my short answer to date was that it depends on the drivers. Is that correct? Then my question would be, couldn't someone come up with some super driver, like ASIO4ALL but only more efficient, and at the same time independent of any interface device?

AdmiralQuality · Post by **AdmiralQuality** » Mon Nov 18, 2013 4:42 pm

CableChannel wrote: Geez guys, I am interested in the answer to this thread's question, but I don't wanna read all the above. Can you do a short summary when you are done?

BTW my short answer to date was that it depends on the drivers. Is that correct? Then my question would be, couldn't someone come up with some super driver, like ASIO4ALL but only more efficient, and at the same time independent of any interface device?

That's what ASIO4ALL is (or tries to be).

CableChannel · Post by **CableChannel** » Mon Nov 18, 2013 5:44 pm

AdmiralQuality wrote: That's what ASIO4ALL is (or tries to be).

Yeah, emphasis on try. It's not renowned for delivering killer performance, right?

AdmiralQuality · Post by **AdmiralQuality** » Mon Nov 18, 2013 5:51 pm

CableChannel wrote:
AdmiralQuality wrote: That's what ASIO4ALL is (or tries to be).
Yeah, emphasis on try. It's not renowned for delivering killer performance, right?

In my experience it's buggy and doesn't handle pro (or semi-pro) interfaces with more than 2 channels very well. Though it's been a few years since I've used it, maybe it's been fixed. If it's working for someone, great!

But yes, a driver is meant to be proprietary. This is to keep us software developers from having to understand the interface trivialities of thousands of devices and code to support them all. (Anyone remember back in the DOS days when games would come with a list of the hundreds of graphics cards they supported?)

So the driver abstracts all that esoteric stuff into one common interface that all software can use.

JCJR · Post by **JCJR** » Mon Nov 18, 2013 6:23 pm

I don't know if the issues I mention are still significant, but in the past some combinations of hardware and driver seemed to call my program's render callbacks "as soon as possible" when the audio interface would need more data, and it would manage to successfully send the data out the pipe even if my program's render callback would return "on the edge of being too late".

In other words, it seemed that some driver and hardware combinations just allowed more time in the render callback before the output audio would start sputtering.

PERHAPS some audio interface and driver combinations have less jitter in calling a program's render callback. For instance, interface A might always give you a "real big window of time" for rendering, while interface B has more jitter: sometimes it gives you a big window of rendering time, and other times it calls you too late and gives you a short window, causing occasional clicks that happen once in a while, apparently at random.
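The interface A versus interface B idea can be sketched with a few lines of Python. Everything here is hypothetical: the delays are invented numbers meant to show the shape of the problem, not measurements of any real device.

```python
from typing import List


def render_windows_ms(period_ms: float, callback_delays_ms: List[float]) -> List[float]:
    """Time left for rendering in each period, given how late each callback arrived."""
    return [period_ms - delay for delay in callback_delays_ms]


def count_clicks(period_ms: float, render_ms: float, callback_delays_ms: List[float]) -> int:
    """Number of periods where the render time does not fit in the remaining window."""
    windows = render_windows_ms(period_ms, callback_delays_ms)
    return sum(1 for window in windows if render_ms > window)


PERIOD_MS = 5.8    # roughly 256 frames at 44.1 kHz
RENDER_MS = 3.0    # assumed cost of rendering one buffer

steady_a = [0.2, 0.3, 0.2, 0.25]   # "interface A": callbacks arrive promptly every time
jittery_b = [0.2, 3.5, 0.3, 4.0]   # "interface B": same work, but some callbacks arrive late

print("A clicks:", count_clicks(PERIOD_MS, RENDER_MS, steady_a))
print("B clicks:", count_clicks(PERIOD_MS, RENDER_MS, jittery_b))
```

In this toy model the program does identical work in both cases. Only the callback timing differs, and B clicks on the two periods where its callback showed up late.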

Also in the past, some combinations of audio hardware and driver were very brittle, and could even bluescreen the computer if my program was having trouble and missing deadlines in responding to buffer requests from the driver. A click or dropout is no fun, but clicks are lots more fun than a crash and reboot.

That isn't really a latency issue as much as an issue that will drive you crazy trying to debug software if the computer is crashing all the time and having to be rebooted into the development system. Also as a user, it is no fun to "watch over your shoulder" and make sure not to load the audio processing beyond a certain amount, or you crash out and have to reboot.

Another issue of the past, which I don't know still affects modern setups-- Some entire runs of USB and Firewire chipsets were subtly incompatible with some models of USB or Firewire audio interfaces. So sometimes it could be a guessing game whether the intermittent bug was the fault of the audio interface, or the fault of the company's driver, or the fault of the chipset in your computer. This could be difficult to debug because often the different manufacturers involved in the screw-up would blame each other for the problems.

So if you got unlucky and things were too brittle, you might have to buy an add-on card with the "officially blessed" USB or Firewire chip, sometimes only to find out that the combination of driver and audio interface continued to be "too brittle" to avoid going crazy using the dang thing.

Keith99 · Post by **Keith99** » Mon Nov 18, 2013 6:42 pm

AdmiralQuality wrote:
Keith99 wrote:If you are not taking any audio input in via your interface you are right to say that the CPU is the limiting factor not the interface, although the interface driver will be involved in holding the buffer.
What?

There's always inputs, whether the host is "listening" to them or not.

Yes, but if the host is not asking the driver for input from the interface, no time is spent processing it. The hardware itself will always be processing away, yes, of course.

AdmiralQuality · Post by **AdmiralQuality** » Mon Nov 18, 2013 7:02 pm

Keith99 wrote:
AdmiralQuality wrote:
Keith99 wrote:If you are not taking any audio input in via your interface you are right to say that the CPU is the limiting factor not the interface, although the interface driver will be involved in holding the buffer.
What?

There's always inputs, whether the host is "listening" to them or not.
Yes but if the host is not asking the driver for input from the interface no time is spent processing it. The hardware itself will always be processing away yes of course

The host doesn't ask the driver. The driver tells the host "I've got new buffers ready for you, time to swap".

BertKoor · Post by **BertKoor** » Tue Nov 19, 2013 8:26 am

I have always envisioned the driver being a little guy sitting in between two conveyor belts: the host delivers a stream of sampled audio and the hardware needs it delivered. He does the job by lifting a full pallet of samples (a full buffer) from one belt onto the other.

If the pallets are big, then he has not much work to do. Lots of waiting, and occasionally shifting the load (he's a strong guy). But with very small pallets it's a lot of running around to get back & forth in time. For us it's a black box, we don't know what's happening inside. So we don't know how efficiently that guy can do his job. It depends on the guy who wrote that part of the driver.
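To put rough numbers on the "running around", here is a small sketch of how the fixed cost per pallet adds up. The 50 microsecond per-call overhead is purely an assumed figure for illustration, and the function names are made up.

```python
def callbacks_per_second(sample_rate_hz: int, buffer_frames: int) -> float:
    """How many buffers per second the driver has to move."""
    return sample_rate_hz / buffer_frames


def overhead_percent(sample_rate_hz: int, buffer_frames: int, per_call_overhead_us: float) -> float:
    """Share of each second spent on fixed per-buffer work (not on the samples themselves)."""
    calls = callbacks_per_second(sample_rate_hz, buffer_frames)
    return 100.0 * calls * per_call_overhead_us / 1_000_000.0


# Assumed: every buffer hand-over costs 50 microseconds regardless of its size.
for frames in (32, 256, 1024):
    print(frames, "frames:",
          callbacks_per_second(48000, frames), "calls/s,",
          round(overhead_percent(48000, frames, 50.0), 3), "% overhead")
```

Under that assumption a 32-frame buffer at 48 kHz means 1500 hand-overs per second and 7.5% of the time spent on overhead alone, while 1024 frames drops that to well under 1%. A less efficient driver has a higher per-call cost and hits the wall sooner.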

Some factors that may influence it:
* data conversion: it may accept 16 bit integers, 24 bit integers, 32 bit floats, and has to convert that all to one common format the DAC needs.
* some DACs are fixed to 48 kHz but the driver supports any sampling rate you throw at it. So then it has to do a sampling rate conversion as well.
* merging in streams of different applications & driver APIs (WDM, MME, ASIO, etc.) which all might be delivered in different formats.
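As an example of the first factor, here is a minimal sketch of converting between 16-bit integer samples and floats in the -1.0 to 1.0 range. The scaling by 32768 is one common convention, assumed here. Real drivers may scale, dither or round differently, and they do this in optimized native code rather than Python.

```python
from typing import List


def int16_to_float(samples: List[int]) -> List[float]:
    """Map 16-bit integers (-32768..32767) to floats in roughly -1.0..1.0."""
    return [s / 32768.0 for s in samples]


def float_to_int16(samples: List[float]) -> List[int]:
    """Map floats back to 16-bit integers, clamping anything out of range."""
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))              # clamp to the legal float range first
        value = round(x * 32768.0)
        out.append(max(-32768, min(32767, value)))   # +1.0 would overflow, so clamp again
    return out


print(int16_to_float([0, 16384, -32768]))
print(float_to_int16([0.0, 0.5, -1.0, 1.0, 2.0]))
```

Every sample of every buffer goes through a step like this whenever the application's format and the converter's format differ, so it is part of the per-buffer work the driver has to fit into the same deadline.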

Why does the audio interface affect the lowest possible latency