Synesthesia


So I was sitting around tonight, and a thought popped into my head: what happens if you take an image and pretend it's an audio file?  What happens if you compress it as an MP3 or Ogg file, decompress it, and turn it back into an image?

No, I was not smoking pot.  This time.

I've actually wondered about this for a long time, ever since I started reading about how different compression algorithms worked.  When I discovered JPEG in 1994, I was blown away by how much compression you could get with very little loss in quality, and I've been fascinated by stuff like that ever since.  (Down, ladies!  I'm taken!)

In theory, there's no reason it shouldn't work.  Raw images and sound files are both just streams of bytes.  With a little handwaving, you should be able to trick software into thinking one is the other.

So I found a nice pretty test picture:

Original Photo

and did my magic.

First, I decided to give the audio compression a fighting chance.  As I understand it, lossy codecs like MP3 do best when the data has some continuity from one sample to the next, so that there are patterns to exploit.  The problem is, a 24-bit image like the one above is stored as a series of 3-byte values (one byte each for the R, G, and B values of each pixel).  Neighboring bytes therefore belong to different color channels, so the data can vary dramatically from byte to byte, which makes life very hard for compression routines.  So I took pity, and broke the image into separate R, G, and B channels.  Instead of one 24-bit color image, I now had three 8-bit grayscale images.  Since the test image has a lot of smooth sections (the sky, the shading of the ground), I figured this would give the compression routines at least a little to work with.  I would run the compression on each channel separately, then combine the three resulting files back into one color image at the end.
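
For anyone who wants to try the channel split without an image editor, here's a minimal Python sketch.  It assumes the pixels are already in memory as tightly packed R, G, B bytes with no header and no row padding, which may not match how your image tool exports raw data.  The names `split_channels` and `merge_channels` are made up for this example.

```python
def split_channels(rgb: bytes) -> tuple[bytes, bytes, bytes]:
    """Split packed RGBRGB... bytes into three 8-bit planes.

    Assumption: 3 bytes per pixel, no header, no row padding.
    """
    if len(rgb) % 3 != 0:
        raise ValueError("expected a whole number of 3-byte pixels")
    # Every third byte, starting at offsets 0, 1, and 2.
    return rgb[0::3], rgb[1::3], rgb[2::3]


def merge_channels(r: bytes, g: bytes, b: bytes) -> bytes:
    """Interleave three equal-length planes back into packed RGB bytes."""
    if not (len(r) == len(g) == len(b)):
        raise ValueError("all three planes must have the same length")
    out = bytearray(3 * len(r))
    out[0::3] = r
    out[1::3] = g
    out[2::3] = b
    return bytes(out)


# Three hypothetical pixels, just to show the round trip.
pixels = bytes([10, 20, 30, 11, 21, 31, 12, 22, 32])
r, g, b = split_channels(pixels)
assert merge_channels(r, g, b) == pixels
```

Each plane now changes only as fast as one color changes across the picture, which is the "continuity" the codec was being given a chance at.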

So I downloaded Audacity, imported each channel's raw bytes as 8-bit unsigned mono audio at 11025 Hz, exported them as compressed audio, re-imported the compressed files, exported those as raw audio, and then recombined the three results into a single color image.
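
The "pretend it's audio" step can also be approximated in code.  The sketch below is not the Audacity workflow itself: it uses Python's standard `wave` module to wrap one channel's bytes in a WAV header (8-bit WAV samples are unsigned, so the bytes pass through unchanged) and to unwrap them again.  The actual MP3 or Ogg encoding and decoding is left to whatever encoder you have on hand, and the function names are hypothetical.

```python
import io
import wave

SAMPLE_RATE = 11025  # Hz, the rate used in this post


def plane_to_wav(plane: bytes, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Wrap one 8-bit grayscale plane as 8-bit unsigned mono WAV data."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)       # mono
        w.setsampwidth(1)       # 1 byte per sample = 8-bit unsigned PCM
        w.setframerate(sample_rate)
        w.writeframes(plane)    # each pixel byte becomes one audio sample
    return buf.getvalue()


def wav_to_plane(wav_bytes: bytes) -> bytes:
    """Recover the raw sample bytes from 8-bit mono WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 1:
            raise ValueError("expected 8-bit mono audio")
        return w.readframes(w.getnframes())


# Round trip with no lossy step in between: the bytes come back unchanged.
plane = bytes(range(256))
assert wav_to_plane(plane_to_wav(plane)) == plane
```

One caveat if you put a lossy encoder in the middle: the decoded audio may come back with a slightly different number of samples than you fed in, so you would probably need to trim or pad each channel to the original pixel count before rebuilding the image.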

And...it worked!  It worked pretty darn well, actually.  Even though it was meant for a totally different kind of data, the audio compression didn't do half bad with images, both in terms of quality and the amount of compression.

Here's the same image, compressed as an MP3 with a 128k bitrate:

128k bitrate photo

Not bad at all.  And a 1.36:1 compression ratio (PNG managed 2:1 on the same image).  Not great, but what do you want?  You're saving an image as an MP3!  Why don't you just back off, okay??

But what I was really hoping for was to get an idea of what kind of compression artifacts you'd see with MP3.  So, naturally, I turned the bitrate way down (to 16k) and tried again:

16k bitrate photo

And there you go: the answer to the long-pondered question of "what does an MP3 look like?"  Little splotchy lines, that's what.  And the compression ratio: 5.4:1, baby!  That's on par with an average-quality JPEG (which, to be fair, does look a lot better).
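
As a sanity check on those ratios, here's a bit of back-of-the-envelope arithmetic in Python.  It assumes a constant-bitrate stream and ignores headers and other file overhead, so treat the results as rough estimates; the helper names are made up for this sketch.

```python
SAMPLE_RATE_HZ = 11025   # import settings used in this post
BITS_PER_SAMPLE = 8      # 8-bit unsigned mono


def raw_bitrate() -> int:
    """Bits per second of the uncompressed 'audio' (one byte per pixel)."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE


def expected_ratio(encoded_bps: float) -> float:
    """Compression ratio if the encoder really emits encoded_bps bits/s."""
    return raw_bitrate() / encoded_bps


def implied_bitrate(ratio: float) -> float:
    """Effective bits/s implied by a measured compression ratio."""
    return raw_bitrate() / ratio


print(raw_bitrate())                  # raw data rate in bits per second
print(round(expected_ratio(16_000), 2))   # estimate for the 16k setting
print(round(implied_bitrate(1.36)))       # what the 128k result implies
```

The raw data only runs at 88,200 bits per second, so a 16k stream works out to roughly 5.5:1, which lines up nicely with the 5.4:1 measured above.  A true 128,000 bits-per-second stream, on the other hand, would be bigger than the raw data, so the measured 1.36:1 implies an effective rate of about 65k.  My guess is the encoder quietly used a lower bitrate at this sample rate, but that is only a guess.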
