A new language

DSP, Plugin and Host development discussion.

Post

This is my vision of a new computer language:
So let's assume you sliced a one-minute audio file into as many slices as you can. Then you'd assign a meaning to every different kind of audio slice. For example, slice 1 would mean "A" in binary, another slice "B", etc. Anyway, if you packed 10,000 slices into a one-minute 128 kb/s MP3, you could then extract that MP3 into any kind of digital data. The magic is that the one-minute MP3 is smaller in size than the extracted data. Possible?

Post

If by "meaning" you mean you'd replace the original bits of the first chunk with the binary code for "A", the second chunk with binary for "B", etc., how do you convert that compact representation back into the original? In other words, where do the original data go after you've encoded them?

You can't just output "ABCDEFGHI" and so on. There's no way to turn that back into the original.

You could include a lookup table ("A" stands for these bits, "B" stands for these bits). But then the output will be larger than the input — it will contain the lookup table, which contains all the input data in slices, plus the compact representation.

Or you could use a character set that includes (at least) as many characters as there are possible chunk data values. Such a character set would necessarily be gigantic, and encoding the data would simply scramble bits without providing any reduction in size (and almost certainly leading to an increase in size).
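The size bookkeeping in the lookup-table case can be sketched in code. This is a hypothetical toy (the chunk size, function name, and random test data are all illustrative, not from the thread): it builds the table, counts the bytes the table and the compact indices need, and shows the total can't beat the original when the chunks don't repeat.

```python
import os

def encode_with_table(data: bytes, chunk_size: int = 4) -> int:
    """Replace each distinct chunk with a short index ("A", "B", ...
    generalized to integers) and return the total encoded size in bytes."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    table = {}      # chunk bytes -> index
    indices = []
    for c in chunks:
        if c not in table:
            table[c] = len(table)
        indices.append(table[c])
    # Bytes needed per index so the encoding stays reversible
    index_bytes = max(1, (max(indices).bit_length() + 7) // 8)
    table_size = sum(len(c) for c in table)   # the table stores the raw slices
    return table_size + len(indices) * index_bytes

data = os.urandom(10_000)   # worst case: essentially no repeated chunks
print(encode_with_table(data) >= len(data))   # True: table + indices >= original
```

With non-repeating data the table alone is as large as the input, and the stream of indices is pure overhead on top of it.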

Post

Meffy wrote: Mon Jan 20, 2020 7:30 pm If by "meaning" you mean you'd replace the original bits of the first chunk with the binary code for "A", the second chunk with binary for "B", etc., how do you convert that compact representation back into the original? In other words, where do the original data go after you've encoded them?

You can't just output "ABCDEFGHI" and so on. There's no way to turn that back into the original.

You could include a lookup table ("A" stands for these bits, "B" stands for these bits). But then the output will be larger than the input — it will contain the lookup table, which contains all the input data in slices, plus the compact representation.

Or you could use a character set that includes (at least) as many characters as there are possible chunk data values. Such a character set would necessarily be gigantic, and encoding the data would simply scramble bits without providing any reduction in size (and almost certainly leading to an increase in size).
I loaded audio into Reason and sliced it into the shortest slices possible. A one-minute MP3 could contain 100,000 slices or even more. I meant that you'd load a 128 kb/s MP3 into a new kind of software that would then read what it has inside. The MP3's frame rate has nothing to do with it. It could be WAV too, but MP3 is smaller in size.

Post

I tested and did some approximate math: one minute of Reason audio can contain 18,240,000 slices, and a one-minute 128 kb/s MP3 is 938 kB in size. So basically it could contain 18,240,000 kilobytes while the file itself is only 938 kilobytes.

Post

Anyway, if every slice is different, each with a different meaning, we'd need to read the audio stem and extract it.

Post

Please read about Shannon's source coding theorem:
https://en.m.wikipedia.org/wiki/Shannon ... ng_theorem
And the pigeonhole principle
https://en.m.wikipedia.org/wiki/Pigeonhole_principle

It's impossible to "compress" arbitrary data losslessly. If it were possible, you could apply the compression recursively and reduce any amount of data to one bit, which violates Shannon's theorem and the pigeonhole principle.

If you use MP3, the data is stored lossily, so you're not even guaranteed to get the original data back. Often that's unacceptable.

Lossless "compression" schemes like ZIP only work because they remove redundancy in the data. If the data does not contain redundancy, you end up with an output file larger than the input file. Try to zip white noise to see how badly it compresses.
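That white-noise experiment is easy to reproduce, using Python's zlib (the same DEFLATE algorithm ZIP uses) as a stand-in:

```python
import os
import zlib

noise = os.urandom(100_000)     # "white noise": no redundancy at all
redundant = b"abcd" * 25_000    # same length, but highly redundant

packed_noise = zlib.compress(noise, 9)
packed_redundant = zlib.compress(redundant, 9)

print(len(packed_noise) >= len(noise))                 # True: noise doesn't shrink
print(len(packed_redundant) < len(redundant) // 100)   # True: redundancy shrinks a lot

# Nor can compression be applied recursively for extra gains:
# the output of a good compressor looks like noise itself.
print(len(zlib.compress(packed_noise, 9)) >= len(packed_noise))  # True
```

The random buffer actually grows slightly (format overhead), while the repetitive one collapses to a tiny fraction of its size.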

I'm not completely sure what you're trying to achieve, but I think you're trying to come up with some encoding scheme to store large data blocks in small sizes. It will only work well in very specific cases and very badly in others. You can't trick math :shrug:

Post

You’re approaching the entire discipline of data compression with a very naive perspective. MP3 is already a highly compressed format. You aren’t likely to do better without learning a substantial amount about this field of study.
Incomplete list of my gear: 1/8" audio input jack.

Post

Here: for anyone who knows my first message, I zipped a Reason file to make the first binary/audio example.
Mystery_Message.zip

Post

All you'd need to do is run a cancellation test on the MP3 slices to know they match 100%.
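A cancellation (null) test just inverts one signal and mixes it with the other; the residual is silence only if they match exactly. A minimal sketch with plain Python lists standing in for audio slices (the function name and sample values are illustrative):

```python
def null_test(a, b, tolerance=0.0) -> bool:
    """Phase-invert b, mix it with a, and check the loudest residual sample."""
    residual = max(abs(x - y) for x, y in zip(a, b))
    return residual <= tolerance

slice_a = [0.0, 0.5, -0.25, 0.125]

print(null_test(slice_a, list(slice_a)))             # True: identical slices cancel
print(null_test(slice_a, [0.0, 0.5, -0.25, 0.13]))   # False: they don't match
```

Note that after a lossy MP3 encode/decode round trip the slices would almost never null exactly, which is the catch with this plan.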

Post

The first message in audio is this: http://www.reflexion-x.com/downloads/Th ... essage.mp3

Post

Even if the audio were 16-bit 44,100 Hz WAV, one minute could contain as many slices as the MP3. One minute of audio is about 10 MB.

Post

For those who have Reason but didn't get it. The labels! http://www.reflexion-x.com/downloads/The_Answer.reason

Post

EDIT: I reread your post and realized I'd misread what you meant. MP3 changes the wave that you give it, so it can't be used for any lossless compression (it can't replace ZIP, for instance).

It might be possible to abuse MP3 as a lossy compression scheme for continuous data. MP3 is heavily tuned towards data where each value is similar to the previous one and wiggles up and down. For discontinuous data, it won't work because discontinuities are going to be compressed in a way that adds a bunch of wiggles before and after the jump. If you tried to encode a picture, it would have to be stored in some kind of zig-zag pattern.

There's also FLAC, which is lossless and much closer to what you're trying to do. However, FLAC is designed around encoding continuous data (i.e., every value is similar to the previous one, going up or down). For other types of data, it will work worse than repetition-based encodings (ZIP, 7z, GIF, etc.).
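The "every value is similar to the previous one" property is exactly what prediction-based codecs like FLAC exploit. A toy illustration, using a trivial previous-sample predictor plus zlib as the entropy stage (not FLAC's actual linear prediction and Rice coding): delta-encoding makes smooth data highly compressible while doing nothing for noise.

```python
import math
import random
import zlib

def delta(data: bytes) -> bytes:
    """Predict each byte as the previous byte and store only the difference."""
    return bytes([data[0]] +
                 [(data[i] - data[i - 1]) % 256 for i in range(1, len(data))])

# Continuous data: an 8-bit sine wave (each sample close to the previous one)
smooth = bytes(int(127 + 120 * math.sin(i / 50)) for i in range(50_000))
# Discontinuous data: pure noise
noise = bytes(random.randrange(256) for _ in range(50_000))

c_smooth = len(zlib.compress(delta(smooth), 9))
c_noise = len(zlib.compress(delta(noise), 9))

print(c_smooth < len(smooth) // 4)      # True: prediction makes smooth data tiny
print(c_noise > len(noise) * 9 // 10)   # True: prediction doesn't help noise
```

After delta coding, the sine wave's residuals are tiny numbers near zero and compress extremely well; the noise's residuals are just as random as the noise itself.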

Original post:
This is called "Vector Quantization" and is a classical compression algorithm: https://en.wikipedia.org/wiki/Vector_quantization

This guy uses Vector Quantization as a 100kbps compression scheme on C64:
http://brokenbytes.blogspot.com/2018/03 ... r-for.html
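For reference, a minimal vector-quantization round trip (the 2-sample vectors and the tiny hard-coded codebook here are illustrative; real VQ encoders train the codebook, e.g. with k-means): each input vector is replaced by the index of its nearest codebook entry, and decoding reads the entries back, lossily.

```python
# Minimal vector quantization: map each input vector to the index of the
# nearest codebook entry, then reconstruct (approximately) from the indices.
CODEBOOK = [(0, 0), (0, 100), (100, 0), (100, 100)]  # 4 entries -> 2-bit indices

def nearest(vec) -> int:
    """Index of the codebook entry with the smallest squared distance to vec."""
    return min(range(len(CODEBOOK)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(CODEBOOK[i], vec)))

def vq_encode(samples):
    """Split samples into 2-sample vectors and store one index per vector."""
    vecs = [samples[i:i + 2] for i in range(0, len(samples), 2)]
    return [nearest(v) for v in vecs]

def vq_decode(indices):
    """Rebuild the signal by looking the indices up in the codebook."""
    return [x for i in indices for x in CODEBOOK[i]]

signal = [3, 7, 95, 102, 98, 4, 1, 99]
idx = vq_encode(signal)
print(idx)              # [0, 3, 2, 1]: one small index per pair of samples
print(vq_decode(idx))   # [0, 0, 100, 100, 100, 0, 0, 100]: approximate, not bit-exact
```

The compression comes from each index being far smaller than the vector it stands for; the price is that decoding returns the codebook entry, not the original samples, which is the lossy part the thread keeps running into.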
Last edited by MadBrain on Mon Jan 20, 2020 10:40 pm, edited 2 times in total.

Post

MadBrain wrote: Mon Jan 20, 2020 10:22 pm This is called "Vector Quantization" and is a classical compression algorithm: https://en.wikipedia.org/wiki/Vector_quantization

This guy uses Vector Quantization as a 100kbps compression scheme on C64:
http://brokenbytes.blogspot.com/2018/03 ... r-for.html
Interesting, that seems close to my idea. Anyway, you could send 18,240,000 bytes as 938,000 bytes to someone else to be extracted over the internet.

Post

I'm still not sure what you're trying to achieve. Using MP3 as a general-purpose "compressed" data transfer format? That won't work. An e-mail sent as MP3 slices would probably be nothing but gibberish.
