Audio Terrain Sonification
Converting user-uploaded images into synthesized sound in real time.
This project explores image-based terrain data as input to a real-time audio sonification system. I prototyped it in Max/MSP as a researcher in the Responsive Environments Group at the MIT Media Lab.
Users can upload any image, which is treated as terrain, and then “listen” to it as a dynamic sound map—transforming 2D visual information into a continuous auditory experience. The system allows manual and automated orbit paths to sweep across the image, mapping pixel intensity to frequency and amplitude across stereo channels.
Technical Approach
I built the system in Max/MSP, a visual programming language for audio signal processing. A user-uploaded image is interpreted as terrain data, with pixel values sampled using jit.peek and mapped to stereo sine wave oscillators. The scanning path is controlled by phase-offset sine functions to create a circular orbit, which can be manually positioned or automated over time.
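The scanning-and-mapping logic can be sketched outside Max/MSP for clarity. The snippet below is a minimal Python illustration, not the actual patch: in the real system this is done with Max objects such as jit.peek and sine oscillators, so the function names, frequency range, and the stereo sampling offset used here are illustrative assumptions.

```python
import numpy as np

def orbit_position(phase, center_x, center_y, radius):
    """Circular orbit built from phase-offset sine functions:
    a sine plus a 90-degree-shifted sine trace a circle."""
    x = center_x + radius * np.sin(phase)
    y = center_y + radius * np.sin(phase + np.pi / 2)  # equivalent to cos(phase)
    return x, y

def sample_pixel(image, x, y):
    """Nearest-neighbour lookup of pixel intensity (0..1), analogous to jit.peek."""
    h, w = image.shape
    xi = min(max(int(round(x)), 0), w - 1)
    yi = min(max(int(round(y)), 0), h - 1)
    return float(image[yi, xi])

def terrain_to_voice(intensity, f_min=110.0, f_max=880.0):
    """Map terrain height to oscillator frequency and amplitude.
    The frequency range is an assumed value, not taken from the patch."""
    freq = f_min + intensity * (f_max - f_min)
    amp = intensity
    return freq, amp

# One control-rate step: sample two points a quarter-orbit apart and drive the
# left/right oscillators from them (this stereo assignment is an assumption).
image = np.random.rand(256, 256)          # stand-in for an uploaded grayscale image
phase = 0.3
cx, cy, radius = 128, 128, 60
left = terrain_to_voice(sample_pixel(image, *orbit_position(phase, cx, cy, radius)))
right = terrain_to_voice(sample_pixel(image, *orbit_position(phase + np.pi / 2, cx, cy, radius)))
print("L (freq, amp):", left, " R (freq, amp):", right)
```

Advancing the phase over time corresponds to the automated orbit; holding or setting it directly corresponds to manual positioning.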
Terrain values directly modulate the frequency and amplitude of the sound, allowing dynamic auditory representation of visual features. The patch includes adjustable orbit radius, real-time visual rendering, and smoothing filters to ensure clean audio transitions, all optimized for live interaction or exploratory use.
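The patch's exact smoothing method isn't detailed above, but a one-pole low-pass on the control values is a common way to keep frequency and amplitude transitions click-free as the orbit crosses sharp terrain edges. The sketch below assumes that approach; the coefficient is an illustrative value.

```python
class OnePoleSmoother:
    """One-pole low-pass smoother for control signals such as frequency or amplitude.
    The coefficient here is an assumed value, not taken from the patch."""
    def __init__(self, coeff=0.05, initial=0.0):
        self.coeff = coeff      # 0 < coeff <= 1; smaller = smoother, slower response
        self.value = initial

    def step(self, target):
        # Move a fraction of the remaining distance toward the target each tick.
        self.value += self.coeff * (target - self.value)
        return self.value

# Smooth a sudden jump in target frequency so it glides instead of clicking.
freq_smoother = OnePoleSmoother(coeff=0.05, initial=220.0)
targets = [220.0] * 5 + [660.0] * 10      # terrain value jumps at step 5
print([round(freq_smoother.step(t), 1) for t in targets])
```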
Potential Applications
Accessibility: Enables non-visual exploration of maps and environments
Data Representation: Offers an intuitive way to perceive image features like contrast, gradients, and structure through sound
Creative Tools: Enables data-driven live performance
Cognitive Research: Supports investigations into cross-modal sensory interpretation or spatial audio perception
Technical Skills
Signal Mapping: Pixel data to stereo sine wave output
User Interaction Design: Live control over scanning paths and sound parameters
Max/MSP Programming: Real-time audio synthesis and matrix manipulation
Visual-Audio Integration: Coordinated image rendering with spatial audio output