KVR Audio

matt42 · Post by **matt42** » Mon Aug 31, 2015 7:13 am

I'm looking at losslessly compressing audio data for my current project. I'm wondering if anyone can recommend a format to use?

I'm trying to get to grips with FLAC, but I can't get the example project to compile. I'm getting the error:

libFLAC_static.lib(stream_encoder.obj) : error LNK2001: unresolved external symbol _fopen_utf8

Thanks,

Matt

aciddose · Post by **aciddose** » Mon Aug 31, 2015 7:19 am

https://en.wikipedia.org/wiki/Audio_file_format

This is another option:
http://www.wavpack.com/

matt42 · Post by **matt42** » Mon Aug 31, 2015 9:18 am

Thanks,

I'm taking a look at wavpack now.

4damind · Post by **4damind** » Mon Aug 31, 2015 9:53 am

AFAIK there is a preprocessor define, "FLAC_USE_FOPEN_UTF8", which makes the code use this version, and I expect the library will also only include/build this version if FLAC_USE_FOPEN_UTF8 is set.

matt42 · Post by **matt42** » Mon Aug 31, 2015 12:49 pm

Thanks 4damind,

I actually got up and running with wavpack, but I might try and make some comparisons with FLAC at some point.

matt42 · Post by **matt42** » Tue Sep 01, 2015 6:51 am

OK, so a couple more things:

Regarding FLAC, it seems I need to build and link to the win_utf8_io static library. I haven't done it yet as I'm still playing with wavpack. But from what I saw (just an initial look), the FLAC lib seems set up to deal with files. Is it a mission to get it to compress/decompress raw data?

With Wavpack, I was able to compress/decompress raw 32-bit sample data, but the compression ratio seems low: between roughly 0.7 and 0.9 on the files I tested. I tried with 24-bit, both raw data and a file, but there is some kind of error. All the function calls return success, but the compression ratio is over 1.0, i.e. negative compression, and the decompressed data is zeroed.

EDIT: Failed to notice this:

Samples must be stored in 32-bit longs in the native endian format of the executing processor

Now fixed.
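
For anyone hitting the same 24-bit problem, the fix amounts to widening each packed 3-byte sample into a sign-extended 32-bit integer before handing the buffer to the encoder. The sketch below only illustrates that conversion; it assumes the input is packed little-endian 24-bit PCM (as in a WAV file), and the function names are made up for this example, not part of the WavPack API.

```python
import struct

def pcm24_to_int32(raw: bytes) -> list[int]:
    """Unpack packed little-endian 24-bit PCM into sign-extended integers.

    Assumption: 3 bytes per sample, least significant byte first.
    """
    if len(raw) % 3 != 0:
        raise ValueError("24-bit PCM data must be a multiple of 3 bytes")
    samples = []
    for i in range(0, len(raw), 3):
        # signed=True performs the sign extension from bit 23.
        samples.append(int.from_bytes(raw[i:i + 3], "little", signed=True))
    return samples

def to_native_int32_buffer(samples: list[int]) -> bytes:
    """Pack the samples as 32-bit signed integers in native byte order."""
    return struct.pack("=%di" % len(samples), *samples)

raw = bytes([0xFF, 0xFF, 0xFF,   # -1
             0x01, 0x00, 0x00,   # +1
             0x00, 0x00, 0x80])  # most negative 24-bit value
samples = pcm24_to_int32(raw)
buffer32 = to_native_int32_buffer(samples)
```

Each sample now occupies four bytes, with its value still in the 24-bit range.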

Thanks,

Matt

Big Tick · Post by **Big Tick** » Mon Sep 14, 2015 1:27 pm

If you want to roll your own, Rice encoding with a linear predictor works quite well.
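
A minimal sketch of Rice coding, for reference. Each non-negative value is split by a parameter `k` into a quotient (written in unary) and a `k`-bit remainder. Two choices here are assumptions for illustration: signed values are first mapped to non-negative ones with a zigzag mapping, and the bitstream is a Python string of "0"/"1" characters for readability rather than packed bits.

```python
def zigzag(v: int) -> int:
    """Map signed to non-negative: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * v if v >= 0 else -2 * v - 1

def unzigzag(u: int) -> int:
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)

def rice_encode(values, k: int) -> str:
    """Encode signed integers as a string of '0'/'1' characters."""
    parts = []
    for v in values:
        u = zigzag(v)
        q, r = u >> k, u & ((1 << k) - 1)
        parts.append("1" * q + "0")          # unary quotient, 0-terminated
        if k:
            parts.append(format(r, "0%db" % k))  # k-bit remainder
    return "".join(parts)

def rice_decode(bits: str, k: int, count: int):
    """Decode `count` signed integers from a bit string."""
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                             # skip the terminating '0'
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((q << k) | r))
    return out

values = [0, -1, 1, 5, -7, 100]
encoded = rice_encode(values, 2)
decoded = rice_decode(encoded, 2, len(values))
```

Small values cost only a few bits, while a large value with a small `k` produces a long unary run, so `k` should roughly track the typical residual magnitude.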

matt42 · Post by **matt42** » Wed Sep 16, 2015 12:57 pm

Thanks Big Tick,

I've only had time to have a brief look into it. Rice encoding seems straightforward enough. How about linear prediction? What kind of filters would typically produce reasonably good results?

Big Tick · Post by **Big Tick** » Fri Sep 18, 2015 10:53 am

No filter needed. Just use linear extrapolation from the 2 previous samples to predict where the next sample might be. Then store (Rice-encoded) the difference between that predicted sample value and the actual value.
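
As a rough sketch of that scheme: drawing a straight line through the two previous samples gives the prediction `2*x[n-1] - x[n-2]`, and only the prediction error is stored. Treating the samples before the start of the signal as zero is an assumption made here to keep the example short; the function names are illustrative.

```python
def residuals(samples):
    """Prediction errors for a 2-point linear extrapolation predictor."""
    prev1 = prev2 = 0          # assumed history before the first sample
    out = []
    for x in samples:
        predicted = 2 * prev1 - prev2
        out.append(x - predicted)
        prev2, prev1 = prev1, x
    return out

def reconstruct(errors):
    """Invert residuals(): rebuild the samples exactly from the errors."""
    prev1 = prev2 = 0
    out = []
    for e in errors:
        x = e + 2 * prev1 - prev2
        out.append(x)
        prev2, prev1 = prev1, x
    return out

ramp = [10, 20, 30, 40, 50]
errors = residuals(ramp)       # [10, 0, 0, 0, 0]
restored = reconstruct(errors)
```

Because the decoder runs the same predictor on already-decoded samples, the round trip is exact in integer arithmetic, which is what keeps the scheme lossless.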

You can experiment with better predictors, using more of the previous points, but in my experience it doesn't give significantly better results than linear with 2 points.

fmr · Post by **fmr** » Fri Sep 18, 2015 11:10 am

What was the problem with using FLAC's own encoder? https://xiph.org/flac/download.html

matt42 · Post by **matt42** » Tue Sep 29, 2015 7:03 am

Hi Big Tick,

Thanks for the helpful advice on this. I would have posted back earlier, but I'm still working on this. Hopefully I'll have something decent done in a couple more days and will be able to say more about it then.

fmr,

I need a solution I can compile with a project I'm working on for a third party. Wavpack has very good compression and is very well documented. I'd probably just go with that as I'm not against adding third party licences to the software, but a custom solution would be cleaner in that regard.

mystran · Post by **mystran** » Tue Sep 29, 2015 2:54 pm

Some thoughts:

- I'd probably try to split the signal using [1,1] and [1,-1] as simple filters, then fit a linear prediction to each (flipping the sign of every other sample in the "high frequencies" band), then combine these to get an estimate and encode the error from there; that should improve the behavior for signals with significant high-frequency content, with very little overhead.

- Rice coding looks like the type of scheme that does well in the best case and extremely poorly in the worst case, so you'd definitely want some fallback (like RLE) with it. You might do better with bit-wise arithmetic coding and a fairly simple predictor: assuming the errors are generally small, there is a high probability of a run of leading 0s or 1s (for positive and negative numbers respectively) in the most significant bits. Once the general magnitude has been established, though, I'd expect the probability of each bit value to rapidly decay to about 50%, so you probably want to take advantage of this somehow to avoid spending too many bits trying to compress the high-entropy least-significant bits.
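
The worst-case concern in the second point can be made concrete by counting bits. A Rice code with parameter `k` spends `(u >> k) + 1 + k` bits on a non-negative value `u`, so one large value with a small `k` is very expensive. The sketch below picks the cheapest `k` for a block and falls back to raw fixed-width storage when even that loses. Note that the fallback used here is plain raw storage, not the RLE or arithmetic coding suggested above, and `raw_width=16` is an assumed sample width.

```python
def rice_bits(u: int, k: int) -> int:
    """Bits used by a Rice code for non-negative u with parameter k."""
    return (u >> k) + 1 + k

def best_k(block, max_k: int = 24):
    """Return (k, total_bits) for the cheapest Rice parameter."""
    costs = {k: sum(rice_bits(u, k) for u in block) for k in range(max_k + 1)}
    k = min(costs, key=costs.get)
    return k, costs[k]

def choose_mode(block, raw_width: int = 16):
    """Pick Rice coding or a raw fallback, whichever is smaller."""
    k, cost = best_k(block)
    raw_cost = raw_width * len(block)
    if cost < raw_cost:
        return ("rice", k, cost)
    return ("raw", None, raw_cost)

quiet = [1, 2, 0, 3] * 4    # small residuals: Rice wins easily
loud = [65535] * 4          # full-scale residuals: Rice loses to raw
quiet_mode = choose_mode(quiet)
loud_mode = choose_mode(loud)
```

A real format would also need to spend a few bits per block to record which mode and which `k` were chosen.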

Music Engineer · Post by **Music Engineer** » Tue Sep 29, 2015 4:37 pm

mystran wrote: Some thoughts:

- I'd probably try to split the signal using [1,1] and [1,-1] as simple filters, then fit a linear prediction to each (flipping the sign of every other sample in the "high frequencies" band), then combine these to get an estimate and encode the error from there; that should improve the behavior for signals with significant high-frequency content, with very little overhead.

very interesting idea. how about this variant: use a leaky integrator (lowpass) filter on the input, use linear prediction on the lowpass signal, and during reconstruction use the inverse of the leaky integrator ("leaky differentiator"?) as last step. one could even experiment with cascades of (stably invertible) lowpasses (using a cascade of inverse filters in the reconstruction). every lowpass should reduce the variance of the prediction error - possibly up to some point of diminishing returns - and you would also have to take into account finite precision arithmetic considerations.
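
One way to sidestep the finite-precision worry is to do the leaky integrator in integer arithmetic, so that the inverse undoes it exactly. This is only a sketch of the invertibility argument, not a claim that it improves compression. The coefficient `c / 2**shift` (7/8 by default) and the function names are assumptions chosen for illustration.

```python
def leaky_integrate(x, c: int = 7, shift: int = 3):
    """Integer one-pole lowpass: y[n] = x[n] + floor(c * y[n-1] / 2**shift)."""
    y, prev = [], 0
    for sample in x:
        prev = sample + ((c * prev) >> shift)
        y.append(prev)
    return y

def leaky_differentiate(y, c: int = 7, shift: int = 3):
    """Exact inverse: x[n] = y[n] - floor(c * y[n-1] / 2**shift)."""
    x, prev = [], 0
    for value in y:
        x.append(value - ((c * prev) >> shift))
        prev = value
    return x

signal = [3, -5, 1000, -32768, 32767, 0]
smoothed = leaky_integrate(signal)
restored = leaky_differentiate(smoothed)
```

The inverse is exact because both directions compute the same rounded feedback term from `y[n-1]`. The price is headroom: with a coefficient of 7/8 the DC gain is 8, so the filtered signal needs about three more bits than the input.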

i think .flac works using linear prediction as well, but computes an optimal higher order predictor for each block and stores the prediction coefficients. the 2-sample linear extrapolation scheme seems to be an interesting alternative that doesn't need any segmentation and storage of prediction coefficients.

Big Tick · Post by **Big Tick** » Wed Sep 30, 2015 5:19 pm

Also, what I did for the Rhino samples packing, is that the compressor splits the audio into smaller segments (like 512 samples), then for each segment, tries a few different strategies for the predictor (linear, cubic, ...). The strategy that gives the best result is stored just before the Rice-encoded data. This has no impact on the unpacking speed, obviously.
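
A sketch of that per-segment selection might look like the following. The details are assumptions, not the actual Rhino packer: the candidate strategies are the fixed polynomial predictors of order 0 to 3, each block is predicted independently with a zero history, and the sum of absolute residuals stands in for the real Rice-coded size.

```python
# Fixed polynomial predictor coefficients, most recent sample first.
COEFFS = {0: (), 1: (1,), 2: (2, -1), 3: (3, -3, 1)}

def predict(history, order: int) -> int:
    """Predict the next sample from the end of `history`."""
    return sum(c * history[-1 - i] for i, c in enumerate(COEFFS[order]))

def block_residuals(block, order: int):
    history = [0, 0, 0]            # assumed zero history per block
    out = []
    for x in block:
        out.append(x - predict(history, order))
        history.append(x)
    return out

def block_restore(errors, order: int):
    history = [0, 0, 0]
    for e in errors:
        history.append(e + predict(history, order))
    return history[3:]

def pack(samples, block_size: int = 512):
    """Return a list of (order, residuals), one entry per block."""
    packed = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        # Cost proxy: sum of absolute residuals for each candidate order.
        best = min(COEFFS, key=lambda o: sum(abs(e) for e in block_residuals(block, o)))
        packed.append((best, block_residuals(block, best)))
    return packed

def unpack(packed):
    out = []
    for order, errors in packed:
        out.extend(block_restore(errors, order))
    return out

samples = [n * n for n in range(20)] + [7] * 20
packed = pack(samples, block_size=20)
restored = unpack(packed)
```

The decoder only reads the stored order and runs one predictor, which matches the point that trying several strategies costs time in the packer but not in the unpacker.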

Lossless Compression