KVR Audio

KBonzo · Post by **KBonzo** » Thu Feb 21, 2019 5:18 pm

Is float data classed as PCM? e.g. a wav file with 32 bit floats? The definition of PCM seems to imply that it is: https://en.wikipedia.org/wiki/Pulse-code_modulation but some sources seem to claim that it's not. http://www-mmsp.ece.mcgill.ca/Documents ... /WAVE.html

If I use the program sndfile-info on a float wav file it complains: All non-PCM format files should have a 'fact' chunk.

The spec for a wav file says that "For all formats other than PCM, the Format chunk must have an extended portion. The extension can be of zero length, but the size field (with value 0) must be present."

A wav file that uses floats can be created with or without an extended section. Both seem to be valid.

Xenakios · Post by **Xenakios** » Thu Feb 21, 2019 6:18 pm

Why do you care? Is there some actual problem you are having because of what the sndfile-info program reports?

earlevel · Post by **earlevel** » Thu Feb 21, 2019 6:55 pm

KBonzo wrote: Thu Feb 21, 2019 5:18 pm Is float data classed as PCM? e.g. a wav file with 32 bit floats?

Samples, in anything you're likely using, are PCM. That is, if the audio is sampled at a constant rate and turned into a number, it's PCM by definition. Sampling is the same as multiplying the audio ("Modulation") by a pulse train of constant amplitude ("Pulse") and encoding the result as a digital value ("Code"). Hence PCM: Pulse Code Modulation.
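To make that concrete, here is a minimal Python sketch (not from the thread; the function names are made up for illustration) that samples a sine wave at a constant rate and then encodes the same sample values two ways, as 16-bit integers and as 32-bit floats. Under the broad definition above, both byte strings are PCM; only the number encoding differs.

```python
import math
import struct

def sample_sine(freq_hz, sample_rate, count):
    """Sample a unit-amplitude sine at a constant rate (the 'pulse' part)."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

def encode_int16(samples):
    """Quantize to 16-bit two's-complement integers, little-endian (the 'code' part)."""
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)

def encode_float32(samples):
    """Store the same samples as little-endian 32-bit IEEE floats."""
    return struct.pack("<%df" % len(samples), *samples)

samples = sample_sine(1000.0, 8000.0, 8)
as_int16 = encode_int16(samples)      # 2 bytes per sample
as_float32 = encode_float32(samples)  # 4 bytes per sample
```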

KBonzo · Post by **KBonzo** » Fri Feb 22, 2019 12:24 am

earlevel wrote: Thu Feb 21, 2019 6:55 pm
KBonzo wrote: Thu Feb 21, 2019 5:18 pm Is float data classed as PCM? e.g. a wav file with 32 bit floats?
Samples, in anything you're likely using, are PCM. That is, if the audio is sampled at a constant rate and turned into a number, it's PCM by definition. Sampling is the same as multiplying the audio ("Modulation") by a pulse train of constant amplitude ("Pulse") and encoding the result as a digital value ("Code"). Hence PCM: Pulse Code Modulation.

This is how I understand it from the definition on Wikipedia. So what is non-PCM? Microsoft seems to think it's related to compressed audio: https://docs.microsoft.com/en-us/window ... ve-formats
I haven't been able to find anything that explains the difference between PCM and non-PCM. Is it just compressed versus uncompressed?

There are a few reasons why I want to know this. One is for reading and writing audio files. Ambiguity is not good if you're creating software.

earlevel · Post by **earlevel** » Fri Feb 22, 2019 12:45 am

KBonzo wrote: Fri Feb 22, 2019 12:24 am This is how I understand it from the definition on Wikipedia. So what is non-PCM? Microsoft seems to think it's related to compressed audio: https://docs.microsoft.com/en-us/window ... ve-formats
I haven't been able to find anything that explains the difference between PCM and non-PCM. Is it just compressed versus uncompressed?

There are a few reasons why I want to know this. One is for reading and writing audio files. Ambiguity is not good if you're creating software.

Yeah, I figured that might be why you're asking, I do recall that confusion in the spec.

Taking a quick look at the McGill site, "PCM data is two's-complement except for resolutions of 1-8 bits, which are represented as offset binary." That implies that even floating point data (which is never two's complement) is not considered PCM, which makes no sense. Elsewhere it's implied that "non-PCM" is synonymous with "compressed" ("All (compressed) non-PCM formats must have...", and "It is aimed at carrying PCM or MPEG audio data").

I think the problem with the spec wording is due to it originally supporting only fixed/integer samples, with floating point added later. And compressed PCM is still PCM when uncompressed, so the distinction seems to be only compressed or not compressed, not whether it's PCM; no one's going to be doing much with compressed digital audio till it's uncompressed. In my book, it's all PCM.

I think you can sort through it by ignoring the PCM distinction and just considering how it's stored, and whether it's compressed. But feel free to bring up specific questions if I'm forgetting about tricky parts. I've written software that reads and writes various integer/fixed and floating point formats, but have had no need for compressed formats.
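The "just consider how it's stored" view can be sketched in a few lines of Python. This is an illustrative sketch, not code from the thread: the function names are made up, and it assumes uncompressed little-endian data whose container size equals the declared bit depth. It follows the McGill wording quoted above for integers (8-bit is offset binary, wider sizes are two's complement) and treats floats as plain IEEE values.

```python
import struct

def decode_int_samples(raw: bytes, bits: int):
    """Decode little-endian integer samples into floats in [-1.0, 1.0)."""
    if bits == 8:
        # 8-bit WAV data is offset binary: the value 128 represents silence.
        return [(b - 128) / 128.0 for b in raw]
    width = bits // 8
    scale = float(1 << (bits - 1))
    out = []
    for i in range(0, len(raw) - width + 1, width):
        # 16/24/32-bit data is two's complement.
        value = int.from_bytes(raw[i:i + width], "little", signed=True)
        out.append(value / scale)
    return out

def decode_float_samples(raw: bytes, bits: int):
    """Decode little-endian 32-bit or 64-bit IEEE float samples unchanged."""
    code = "f" if bits == 32 else "d"
    width = bits // 8
    count = len(raw) // width
    return list(struct.unpack("<%d%s" % (count, code), raw[:count * width]))
```

Either way the result is a list of sample values; the PCM/non-PCM label plays no part in the decoding.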

KBonzo · Post by **KBonzo** » Fri Feb 22, 2019 3:45 pm

It's difficult to ignore the term PCM when you're reading Microsoft docs on their audio system API and they keep harping on about PCM and non-PCM. I want to know what they mean by it. On the Library of Congress site they claim that PCM can be compressed audio. https://www.loc.gov/preservation/digita ... 0016.shtml

I think it might be that the term PCM refers to the raw output of the ADC. From the Library of Congress: "Variants are based on different mathematical techniques for quantization, including linear, logarithmic, and adaptive."

The quantization process can output compressed audio. Unless the ADC can output float data directly then it's not PCM. Maybe?

Sorry if this seems a bit anal or not relevant but to me it is relevant because it's digital audio.

mystran · Post by **mystran** » Fri Feb 22, 2019 5:33 pm

KBonzo wrote: Fri Feb 22, 2019 3:45 pm It's difficult to ignore the term PCM when you're reading Microsoft docs on their audio system API and they keep harping on about PCM and non-PCM. I want to know what they mean by it.

You pretty much have to infer the correct interpretation from the context, because the usage of the term is not all that consistent.

As for the WAV format... FORMAT_PCM and FORMAT_IEEE_FLOAT data can also be stored in FORMAT_EXTENSIBLE, and some programs actually always do this, so if you want to parse files, you want to support _EXTENSIBLE headers even if you don't support anything beyond _PCM and _IEEE_FLOAT as actual formats. Given that apparently you need a "fact" chunk for _IEEE_FLOAT, and probably nobody knows whether or not you need one for _EXTENSIBLE, if you are writing WAV files I'd probably just put a "fact" chunk in there. It's not like it's going to hurt (and likely almost all programs will just ignore it). Most WAV files contain a bunch of chunks (some of them in common use and others specific to the program that wrote them) beyond what is required by the format anyway.
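As a rough illustration of the "just put a fact chunk in there" advice, here is a minimal Python sketch that builds a 32-bit float WAV in memory. It is not from the thread: `build_float_wav` and `_chunk` are hypothetical names, and the layout assumes the non-extensible header with format tag 3, a zero-length extension (the `cbSize` field the spec quote asks for), and a "fact" chunk holding the number of sample frames per channel.

```python
import struct

WAVE_FORMAT_IEEE_FLOAT = 0x0003  # format tag for IEEE float samples

def _chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Build one RIFF chunk: 4-byte id, little-endian size, payload, pad to even length."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

def build_float_wav(samples, sample_rate=44100, channels=1) -> bytes:
    """Return the bytes of a float WAV; `samples` is a flat, interleaved list of floats."""
    block_align = channels * 4  # 4 bytes per 32-bit float sample
    fmt = struct.pack(
        "<HHIIHHH",
        WAVE_FORMAT_IEEE_FLOAT,
        channels,
        sample_rate,
        sample_rate * block_align,  # average bytes per second
        block_align,
        32,                         # bits per sample
        0,                          # cbSize: extension present but empty
    )
    frames = len(samples) // channels
    fact = struct.pack("<I", frames)  # sample frames per channel
    data = struct.pack("<%df" % len(samples), *samples)
    body = b"WAVE" + _chunk(b"fmt ", fmt) + _chunk(b"fact", fact) + _chunk(b"data", data)
    return b"RIFF" + struct.pack("<I", len(body)) + body

wav_bytes = build_float_wav([0.0, 0.5, -0.5])
```

Writing `wav_bytes` to a file opened in binary mode gives a file to try against sndfile-info; whether a given reader insists on the "fact" chunk is exactly the ambiguity discussed above.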

mystran · Post by **mystran** » Fri Feb 22, 2019 5:44 pm

The format in general is a mess though. For serious WAV format support, you also need to deal with gems like this:

This redundancy has been appropriated to define new formats. For instance, Cool Edit uses a format which declares a sample size of 24 bits together with a container size of 4 bytes (32 bits) determined from the block size and number of channels. With this combination, the data is actually stored as 32-bit IEEE floats. The normalization (full scale 2^23) is however different from the standard float format.

The real fun starts when you want to support things like regions, pitch/tempo info, loop points... because you'll probably have to support at least a few possible ways to store each of these.
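The Cool Edit case quoted above could be handled along these lines. This is a hedged Python sketch built only from what the quoted passage says (24 declared bits, a 4-byte container derived from block size and channel count, floats with full scale 2^23); the function names are hypothetical, and a real reader would probably want further sanity checks before trusting the heuristic.

```python
import struct

def container_bytes(block_align: int, channels: int) -> int:
    """Bytes used to store one sample of one channel, derived from the block size."""
    return block_align // channels

def looks_like_cool_edit_float(bits_per_sample: int, block_align: int, channels: int) -> bool:
    """Heuristic from the quoted text: 24 declared bits in a 4-byte container."""
    return bits_per_sample == 24 and container_bytes(block_align, channels) == 4

def decode_cool_edit_float(raw: bytes):
    """Read little-endian 32-bit floats and rescale from full scale 2^23 to 1.0."""
    count = len(raw) // 4
    values = struct.unpack("<%df" % count, raw[:count * 4])
    return [v / float(1 << 23) for v in values]
```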

KBonzo · Post by **KBonzo** » Fri Feb 22, 2019 7:15 pm

Writing wav files is straightforward. Reading them can get messy. My own code only has to read what are hopefully simple enough cases, such as an impulse response. The question of PCM audio has wider implications than just creating VSTs. It can be interesting to look at how the system works "under the hood".
https://docs.microsoft.com/en-us/window ... cm-support

mystran · Post by **mystran** » Fri Feb 22, 2019 8:40 pm

KBonzo wrote: Fri Feb 22, 2019 7:15 pm Writing wav files is straightforward. Reading them can get messy. My own code only has to read what are hopefully simple enough cases, such as an impulse response.

Well, if you parse 8/16/24/32-bit linear (integer) PCM and 32/64-bit floats with either regular or _EXTENSIBLE headers and a varying number of channels, then that will cover most files you'll see in practice. In my opinion, support for the non-linear PCM formats is mostly an academic exercise, and dealing with compressed formats will get ugly for portable code (and on Windows you could just let MS parse the files for you). The main thing to watch out for is that you should generally not make too many assumptions about the order of chunks (although I guess "fmt " is supposed to come somewhere before "data"; my code doesn't really care, so no idea if this is ever violated in practice).
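A reader along those lines might look like the following Python sketch. It is an illustration rather than anyone's actual code from the thread: the function names are made up, it works on a file already read into memory, and it assumes the usual _EXTENSIBLE layout in which the SubFormat GUID starts at byte 24 of the "fmt " payload and its first two bytes carry the underlying format tag. Chunks are collected in whatever order they appear, so nothing depends on "fmt " preceding "data".

```python
import struct

WAVE_FORMAT_PCM = 0x0001
WAVE_FORMAT_IEEE_FLOAT = 0x0003
WAVE_FORMAT_EXTENSIBLE = 0xFFFE

def iter_chunks(blob: bytes):
    """Yield (chunk_id, payload) for each chunk of a RIFF/WAVE byte string, in file order."""
    if blob[:4] != b"RIFF" or blob[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(blob):
        chunk_id = blob[pos:pos + 4]
        (size,) = struct.unpack_from("<I", blob, pos + 4)
        yield chunk_id, blob[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length

def parse_fmt(payload: bytes) -> dict:
    """Parse a 'fmt ' payload, resolving _EXTENSIBLE to the underlying format tag."""
    tag, channels, rate, _avg, block_align, bits = struct.unpack_from("<HHIIHH", payload, 0)
    if tag == WAVE_FORMAT_EXTENSIBLE and len(payload) >= 40:
        # First two bytes of the SubFormat GUID hold the real format tag.
        (tag,) = struct.unpack_from("<H", payload, 24)
    return {"format_tag": tag, "channels": channels, "sample_rate": rate,
            "block_align": block_align, "bits_per_sample": bits}

def read_wav_layout(blob: bytes):
    """Return (fmt_dict, data_bytes) without assuming any particular chunk order."""
    fmt, data = None, None
    for chunk_id, payload in iter_chunks(blob):
        if chunk_id == b"fmt ":
            fmt = parse_fmt(payload)
        elif chunk_id == b"data":
            data = payload
        # Any other chunk (LIST, fact, program-specific ones) is skipped.
    if fmt is None or data is None:
        raise ValueError("missing 'fmt ' or 'data' chunk")
    if fmt["format_tag"] not in (WAVE_FORMAT_PCM, WAVE_FORMAT_IEEE_FLOAT):
        raise ValueError("unsupported format tag 0x%04X" % fmt["format_tag"])
    return fmt, data
```

Unknown or compressed format tags are rejected outright, which matches the scope suggested above.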

mystran · Post by **mystran** » Fri Feb 22, 2019 8:55 pm

After looking at the driver docs... I'd guess MS uses PCM basically as a shorthand for WAVE_FORMAT_PCM (i.e. linear PCM, integer only).

KBonzo · Post by **KBonzo** » Sat Feb 23, 2019 7:23 pm

I asked Microsoft how they define it. They seemed reluctant to answer at first but eventually said:

"We use different definitions in different places, depending on whether the distinction between integer and floating-point or between compressed and uncompressed is more important."

At least their answer makes me feel a bit less dumb for asking

https://social.msdn.microsoft.com/Forum ... evelopment
