image to sound conversion

DSP, Plugin and Host development discussion.

Post

AdmiralQuality wrote: While this stuff is kind of cute, there's really no "correct" way to interpret images into sound.
Well, for sure there's not a "correct" way, but there can be "useful" ones.

For example, after watching the video and reading the explanation in the first post, I realise that there are certain sound effects I could put together very quickly in MS Paint that would take me a very long time, or be nearly impossible, in a traditional subtractive synth. Say, just as a totally random example, something like Poly-Ana :D

All that's going on is the image is being interpreted as a horizontal spectrogram waterfall. This is kind of cool as images can be viewed as a spectral piano roll. It's very intuitive and produces some interesting results.
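To make the "spectrogram reader" idea concrete, here is a minimal sketch of one way it could work, not the actual code of the synth discussed in this thread. It assumes a greyscale image given as a list of rows with brightness in 0..1, maps each row to a fixed sine frequency (top row highest) and each column to a short time slice. The function name, the frequency range and the slice length are all made up for illustration.

```python
import math

def image_to_sound(image, sample_rate=8000, column_seconds=0.05,
                   f_min=200.0, f_max=2000.0):
    """Read a greyscale image as a spectrogram (hypothetical helper).

    image: list of rows (top row first), each a list of brightness
    values in 0..1. Rows map linearly to frequencies, columns to time.
    """
    height = len(image)
    width = len(image[0])
    # Top row gets f_max, bottom row gets f_min (assumed mapping).
    if height > 1:
        freqs = [f_max - (f_max - f_min) * r / (height - 1)
                 for r in range(height)]
    else:
        freqs = [f_min]
    samples_per_column = round(sample_rate * column_seconds)
    out = []
    for n in range(width * samples_per_column):
        col = n // samples_per_column   # which time slice we are in
        t = n / sample_rate             # global time keeps each sine's phase continuous
        s = sum(image[r][col] * math.sin(2.0 * math.pi * freqs[r] * t)
                for r in range(height))
        out.append(s / height)          # crude normalisation to stay within -1..1
    return out
```

The amplitudes switch abruptly at column boundaries here, so a real implementation would probably want to crossfade between columns to avoid clicks.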

Post

No. This specific idea is a useless toy idea that only works through specially selected images.
Sorry.
People can use it as an art form or something but it doesn't really expand from the screeching whining effect.

Post

Obviously you have no use for, or interest in, such a thing.

Toy? Perhaps, though I'd argue that to be a subjective opinion and perhaps some people would use it as such. I could see it being a useful sound generator for various types of special effects.

Selective with the images? Of course that's the case unless you want to generate random noise. I'm also quite selective of which frets I hold my fingers down on my guitar, which string(s) to pluck - whatever. I could just bash away moving my hands at random, but that wouldn't be grounds that the guitar is just a useless toy.

My point is that if I wanted a sound effect, for example one with lots of rising, falling, or swirling tones, I could very easily draw an image that would produce that result. Perhaps not the most versatile thing, but some people may have uses for it.

Post

thinking of an image as representing frequency is silly. it doesn't represent frequency, it represents a set of scalars in an array with some number of dimensions. (you can also take the fourier transform of an image to get frequency data.)

since an image is two dimensions, the "correct" way would be actually to sample along some path in the image plane.

for example, you could sample each line one after the other. you could also sample on an angle, any angle.

this would work best with repeating textures because in such a case you could continue moving along the vector and just wrap when the edges are reached, always producing perfectly continuous results.
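a rough sketch of the straight-line scan with wrapping, under some assumptions: the image is a list of rows of brightness values in 0..1, pixels are read nearest-neighbour (no interpolation), and the names `scan_line`, `angle_degrees` and `step` are invented for this example.

```python
import math

def scan_line(image, n_samples, angle_degrees=0.0, step=1.0, start=(0.0, 0.0)):
    """Walk through the image along a straight line, wrapping at the edges.

    Returns n_samples values in -1..1. With a tileable image the wrap
    is seamless; with any other image the edge will be audible.
    """
    height = len(image)
    width = len(image[0])
    dx = math.cos(math.radians(angle_degrees)) * step
    dy = math.sin(math.radians(angle_degrees)) * step
    x, y = start
    out = []
    for _ in range(n_samples):
        col = math.floor(x) % width    # modulo wraps past the right/left edge
        row = math.floor(y) % height   # and past the bottom/top edge
        out.append(image[row][col] * 2.0 - 1.0)  # 0..1 brightness to -1..1
        x += dx
        y += dy
    return out
```

sampling "each line one after the other" is the special case where you just read the rows in order. nearest-neighbour lookup is the naive part; interpolating between pixels would be the obvious next step.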

that isn't the only way you could sample though. you could sample in a spiral, or any path through the plane really.

*watches as everyone goes to implement this obvious idea*

there are many different ways to translate from the value of a sampled pixel to a scalar. chroma, luma, hue. red, green, blue. many more.
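a small sketch of a few of those pixel-to-scalar mappings, assuming 8-bit RGB tuples. the luma weights are the Rec. 601 ones; `pixel_to_scalar` and its `mode` argument are made-up names, and chroma is left out.

```python
import colorsys

def pixel_to_scalar(rgb, mode="luma"):
    """Map an 8-bit (r, g, b) pixel to a sample value in -1..1."""
    r, g, b = (c / 255.0 for c in rgb)
    if mode == "luma":
        # Rec. 601 luma weights applied to the (gamma-encoded) channels.
        value = 0.299 * r + 0.587 * g + 0.114 * b
    elif mode == "hue":
        # colorsys returns hue in 0..1; note that it wraps around at red.
        value = colorsys.rgb_to_hsv(r, g, b)[0]
    elif mode in ("red", "green", "blue"):
        value = {"red": r, "green": g, "blue": b}[mode]
    else:
        raise ValueError("unknown mode: " + mode)
    return value * 2.0 - 1.0   # rescale 0..1 to -1..1
```

each mode gives a different waveform from the same picture, so the choice is as arbitrary as the scan path.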
Free plug-ins for Windows, MacOS and Linux. Xhip Synthesizer v8.0 and Xhip Effects Bundle v6.7.
The coder's credo: We believe our work is neither clever nor difficult; it is done because we thought it would be easy.
Work less; get more done.

Post

I'd say anything which engages our senses and produces any kind of quasi-predictable outcome to our inputs is useful in some way. Regarding something like this as a mere curio/toy is one way of looking at it, but it does lend itself to interactive art installations and the generation of off-the-wall sounds for incorporation into one's music, and it can even have practical utility, for example in the visual impairment field: take a two-dimensional spatial brightness map of a visual image, then scan and transform it into a two-dimensional map of oscillation amplitude as a function of frequency and time. The sound patterns corresponding to simple shapes can then be easily imagined, since the output is not 'purely random'. A straight bright line on a dark background running diagonally will sound as a single tone steadily increasing in pitch; a rectangle standing on its side will sound like bandwidth-limited noise, having a duration corresponding to its width and a frequency band corresponding to its height and elevation. Interpretation of simple shapes can then become the 'building blocks' for understanding more 'realistic images' that yield more complicated sound patterns. The key is predictability/repeatability for making the output accessible to conscious analysis.

Post

doctornash wrote: I'd say anything which engages our senses and produces any kind of quasi-predictable outcome to our inputs is useful in some way. Regarding something like this as a mere curio/toy is one way of looking at it, but it does lend itself to interactive art installations and the generation of off-the-wall sounds for incorporation into one's music, and it can even have practical utility, for example in the visual impairment field: take a two-dimensional spatial brightness map of a visual image, then scan and transform it into a two-dimensional map of oscillation amplitude as a function of frequency and time. The sound patterns corresponding to simple shapes can then be easily imagined, since the output is not 'purely random'. A straight bright line on a dark background running diagonally will sound as a single tone steadily increasing in pitch; a rectangle standing on its side will sound like bandwidth-limited noise, having a duration corresponding to its width and a frequency band corresponding to its height and elevation. Interpretation of simple shapes can then become the 'building blocks' for understanding more 'realistic images' that yield more complicated sound patterns. The key is predictability/repeatability for making the output accessible to conscious analysis.
It's just a spectrogram "reader".

Post

I've just released a free picture to sound synth. Download for Windows and a video of it in action here:

http://flexibeatz.weebly.com/paint2sound.html

Thanks all for your views and input - much appreciated! :D

Post

aciddose wrote: thinking of an image as representing frequency is silly. it doesn't represent frequency, it represents a set of scalars in an array with some number of dimensions. (you can also take the fourier transform of an image to get frequency data.)

since an image is two dimensions, the "correct" way would be actually to sample along some path in the image plane.

for example, you could sample each line one after the other. you could also sample on an angle, any angle.

this would work best with repeating textures because in such a case you could continue moving along the vector and just wrap when the edges are reached, always producing perfectly continuous results.

that isn't the only way you could sample though. you could sample in a spiral, or any path through the plane really.

*watches as everyone goes to implement this obvious idea*

there are many different ways to translate from the value of a sampled pixel to a scalar. chroma, luma, hue. red, green, blue. many more.
Is a similar idea to wave terrain synthesis except using an image as your wave terrain. Can give interesting results provided you can wrap smoothly around the edges.
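For anyone who hasn't met wave terrain synthesis, a minimal sketch of the idea with an image as the terrain: a point travels round a closed orbit (an ellipse here) and the brightness under it becomes the output sample. This assumes a greyscale image as a list of rows with values in 0..1; `bilinear_wrapped`, `orbit_scan` and the default orbit size are invented for the example.

```python
import math

def bilinear_wrapped(image, x, y):
    """Bilinearly interpolated brightness at (x, y), wrapping at the edges."""
    height, width = len(image), len(image[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    c0, c1 = x0 % width, (x0 + 1) % width     # neighbouring columns, wrapped
    r0, r1 = y0 % height, (y0 + 1) % height   # neighbouring rows, wrapped
    top = image[r0][c0] * (1.0 - fx) + image[r0][c1] * fx
    bottom = image[r1][c0] * (1.0 - fx) + image[r1][c1] * fx
    return top * (1.0 - fy) + bottom * fy

def orbit_scan(image, n_samples, frequency=110.0, sample_rate=44100,
               centre=None, radii=None):
    """Sample the image along an elliptical orbit, one lap per cycle."""
    height, width = len(image), len(image[0])
    cx, cy = centre if centre is not None else (width / 2.0, height / 2.0)
    rx, ry = radii if radii is not None else (width / 4.0, height / 4.0)
    out = []
    for n in range(n_samples):
        phase = 2.0 * math.pi * frequency * n / sample_rate
        x = cx + rx * math.cos(phase)
        y = cy + ry * math.sin(phase)
        out.append(bilinear_wrapped(image, x, y) * 2.0 - 1.0)  # 0..1 to -1..1
    return out
```

Because the orbit is closed, the pitch is set by how fast you go round it rather than by the image size, and modulating the centre or radii over time changes the timbre.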

Post

Cool concepts, so I will throw something into the stew.

You should consider first converting the RGB image into CIELAB space. Why? Because Lab is designed to be a perceptual colour space. A colour vector in Lab represents roughly the same amount of perceived change no matter where you start from. Or, in other words, adding some amount to lightness (for example) produces a roughly equal change in perceived brightness no matter what colour you start with.

Compare this with RGB, where at the dark end of the color range, everything mostly looks black.
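A minimal sketch of that conversion, assuming the input is 8-bit sRGB and using the D65 white point; the function name `srgb_to_lab` is just for illustration. It goes sRGB to linear RGB, then to XYZ, then to L*a*b*.

```python
def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB (r, g, b) tuple to CIELAB (L*, a*, b*), D65 white."""
    def to_linear(c):
        # Undo the sRGB transfer curve.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)

    # Linear sRGB to CIE XYZ (D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        # Cube root with a linear segment near zero.
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3.0 * delta ** 2) + 4.0 / 29.0

    xn, yn, zn = 0.95047, 1.0, 1.08883   # D65 reference white
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))
```

The L* value (0 to 100) could then be scaled to an amplitude or sample value in place of a raw RGB channel, so that steps in the dark end of the image are not squashed together.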

Post

since an image is two dimensions, the "correct" way would be actually to sample along some path in the image plane.

for example, you could sample each line one after the other. you could also sample on an angle, any angle.

this would work best with repeating textures because in such a case you could continue moving along the vector and just wrap when the edges are reached, always producing perfectly continuous results.

that isn't the only way you could sample though. you could sample in a spiral, or any path through the plane really.

*watches as everyone goes to implement this obvious idea*
well, I don't know how many have been toiling away at this, but I'm prepared to give it a shot. do you mind elaborating on what you mean by 'sampling each line' and edge wrapping...just trying to get to grips with the concept of this proposal :?

Post

doctornash wrote:
since an image is two dimensions, the "correct" way would be actually to sample along some path in the image plane.

for example, you could sample each line one after the other. you could also sample on an angle, any angle.

this would work best with repeating textures because in such a case you could continue moving along the vector and just wrap when the edges are reached, always producing perfectly continuous results.

that isn't the only way you could sample though. you could sample in a spiral, or any path through the plane really.

*watches as everyone goes to implement this obvious idea*
well, I don't know how many have been toiling away at this, but I'm prepared to give it a shot. do you mind elaborating on what you mean by 'sampling each line' and edge wrapping...just trying to get to grips with the concept of this proposal :?
He's referring to the question of what order in time you assign the pixels. Pixels are spatial, not time based, so the mapping into the time domain is of course totally arbitrary, as there isn't a pixel that's "first". The other problem is what you do when you run into the edge of an image where, even if you wrap around or fold back, you're no doubt going to hear the discontinuity of the edge.

Post

not if the image is tiled.

take for example a flat-shaded cube in the center of the image. you can trace a path through the image plane in any way you like and it'll generate some interesting effects. the "difficult" part would be that you'd need to maintain the same frequency regardless of the angle.

no i haven't done it. it's extremely simple though. 99% gui work. a naive dsp implementation will suck - you need the right components.

Post

aciddose wrote: it's extremely simple though. 99% gui work
To that I can attest :(

Post

well, fortunately i already have the components required. a 3d renderer supporting all the modes i want.

sampling the z-buffer would be interesting too. not only could you use cubes, but you could sample venus' z-plane. now that would be interesting.

Image

Post

since the geometry is not aliased until being sampled/rendered, why wouldn't we take the fractional information available to us to correctly synthesize anti-aliased ramps at polygon edges?

likewise for font glyphs and other geometric primitives.
