This is the core technical challenge I'm facing as I create my image-to-sound machine. Although sounds often make quite pretty pictures, like this spectrogram…
…the reverse isn’t necessarily true. There’s a great track by Aphex Twin called ‘Equation’ that exemplifies this – he’s encoded images of his face into the track using a kind of reverse spectrogram. The parts of the song where his face appears are quite unpleasant sounding – only he could get away with including them!
My plan is to associate specific colour values with specific sounds, scanning through an image at a particular rate. For example, if a bright red is encountered in the image, a bright, angry sound is played – or, more precisely, becomes more likely to be played. This will involve splitting the image into easy-to-manage chunks rather than dealing with it pixel by pixel – analysing each chunk for colour content before passing that information to a sound-generating process.
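The chunking step could be sketched roughly like this – a minimal Python example that splits an image (here just a nested list of RGB tuples, standing in for real pixel data) into fixed-size blocks and reports each block's average colour. The chunk size and the toy image are illustrative, not the project's actual values:

```python
def average_colour(pixels):
    """Mean RGB of a flat list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) // n for i in range(3))

def chunk_colours(image, chunk_size):
    """Yield ((row, col), average_rgb) for each chunk_size x chunk_size block."""
    height, width = len(image), len(image[0])
    for top in range(0, height, chunk_size):
        for left in range(0, width, chunk_size):
            block = [image[y][x]
                     for y in range(top, min(top + chunk_size, height))
                     for x in range(left, min(left + chunk_size, width))]
            yield (top // chunk_size, left // chunk_size), average_colour(block)

# A toy 4x4 image: left half bright red, right half dark blue.
img = [[(255, 0, 0)] * 2 + [(0, 0, 128)] * 2 for _ in range(4)]
for pos, colour in chunk_colours(img, 2):
    print(pos, colour)
```

Each chunk's average colour is then the single value that gets handed on to the sound-generating stage, rather than thousands of individual pixels.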
This throws up a couple of questions:
- How do I decide what colours equate to what sounds?
- How will I generate the sounds / music?
I’m going to address the second point first. I’ve already started to put together a library of samples, each with a particular tonal and rhythmic quality. I want to get a good idea of the colour associated with each sound before I reverse the process. To do this, my plan is to set up a SoundCloud page and invite users to comment on each sound sample only in the form of colours. Hopefully I can build up a kind of tag list for each sound and identify the most common colours associated with each one.
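Once the colour comments come in, tallying them is straightforward. A minimal sketch, assuming the comments have already been collected into per-sample lists (the sample names and colour words here are made up):

```python
from collections import Counter

# Hypothetical colour comments per sample; in practice these would be
# scraped from the comments left on each SoundCloud upload.
comments = {
    "sample_01": ["red", "orange", "red", "crimson", "red"],
    "sample_02": ["blue", "teal", "blue", "blue", "navy"],
}

# Pick the most common colour tag for each sound sample.
tags = {name: Counter(colours).most_common(1)[0][0]
        for name, colours in comments.items()}
print(tags)  # {'sample_01': 'red', 'sample_02': 'blue'}
```

Keeping the full `Counter` around (rather than just the winner) would also let a sound map to a weighted mix of colours, which fits the probabilistic playback idea later on.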
I also want to investigate the basic components of musical sounds and how these relate to colour. To do this, my plan is to create an online survey asking participants to associate a colour with a series of simple tones, varying only in pitch, envelope shape (i.e. sustained or percussive), tone (i.e. bright or dark) and chord shape (i.e. major, minor, suspended, diminished). These sounds will be free from any other qualities that might affect their perceived ‘colour’, such as varying timbre (for example, orchestral vs. synthesised sounds) or varying amounts of reverb.
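As a rough sketch of how such stimuli could be kept minimal, here is a simple additive-synthesis tone in Python: pitch sets the fundamental, the envelope flag switches between sustained and percussive, and the brightness flag adds upper harmonics. The specific decay rate and harmonic levels are arbitrary placeholders, not the survey's actual stimuli:

```python
import math

def tone(freq, seconds=1.0, rate=44100, percussive=False, bright=False):
    """Generate raw samples for a single test tone."""
    samples = []
    for n in range(int(seconds * rate)):
        t = n / rate
        # Dark tone: fundamental only; bright tone: add 2nd and 3rd harmonics.
        s = math.sin(2 * math.pi * freq * t)
        if bright:
            s += 0.5 * math.sin(2 * math.pi * 2 * freq * t)
            s += 0.25 * math.sin(2 * math.pi * 3 * freq * t)
        # Percussive: fast exponential decay; sustained: constant level.
        env = math.exp(-6 * t) if percussive else 1.0
        samples.append(s * env)
    return samples

# e.g. a bright, plucked A440 versus a dark, sustained one:
plucked = tone(440, seconds=0.5, percussive=True, bright=True)
pad = tone(440, seconds=0.5, percussive=False, bright=False)
```

Because each parameter changes exactly one thing, survey responses can be attributed to pitch, envelope, brightness or chord shape individually rather than to some mixture of qualities.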
Once I’ve got a good idea of how people generally perceive sounds as colours, I’ll work on a way of playing sounds based on the colour information in an image. This will involve increasing the likelihood of a particular sound playing given a specific input colour, drawing on a fairly broad database of sampled and generated sounds. I’ll use the Max object ‘groove~’ to play back samples, and some freely available physically-modelled instrument objects together with some cunning sequencing to generate sounds from scratch.
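The "increased likelihood" idea amounts to weighted random selection. A minimal sketch, with an invented three-sound palette standing in for the real database: each sound is tagged with an RGB colour, and the closer its colour is to the incoming chunk's colour, the more likely it is to be triggered.

```python
import random

# Illustrative sound-to-colour associations, not the real sample library.
SOUNDS = {
    "angry_stab": (255, 0, 0),
    "calm_pad": (0, 60, 180),
    "bell": (250, 240, 120),
}

def weight(colour, target):
    """Inverse squared RGB distance, so nearby colours get heavy weights."""
    d2 = sum((a - b) ** 2 for a, b in zip(colour, target))
    return 1.0 / (1.0 + d2)

def choose_sound(target, rng=random):
    """Randomly pick a sound, biased towards colours near the target."""
    names = list(SOUNDS)
    weights = [weight(SOUNDS[n], target) for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

print(choose_sound((250, 10, 10)))  # most often 'angry_stab'
```

Because the choice is probabilistic rather than a fixed lookup, a mostly-red image still triggers the occasional contrasting sound, which should keep the output from becoming too mechanical.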