THE ART OF DESIGNING ART – Jesse Harlin (March 2008)


BECAUSE OF AUDIO’S INTANGIBLE NATURE, we tend to speak about sound with the borrowed language of painting. We talk about our canvas. We talk about sketches, palettes, and colors. But in a practical sense, game audio much more closely resembles photography insofar as we frequently strive to authentically impart real-life detail into our titles. If it revs, jumps, or shoots, we ensure that each instance is faithfully, predictably, and realistically scored. The technological ability to create these richly nuanced worlds is relatively new to our industry, roughly a decade old or less, but we’re there now. We’ve proven that we can build the believable. As audio designers, we’re ready to move on to the next step: infusing art back into our technical accomplishments.


Compared to audio design in both television and film, game audio is extremely literal. Everything in the game world is constantly emitting sound to the perpetual accompaniment of dialogue and music. The more music there is in a scene, the more a scene strays from the edges of realism toward hyperrealism as the game’s score helps to telegraph emotion, pacing, and setting to the player.

But listen to the audio for a movie like I Am Legend or The Graduate. Watch an episode of anything from ER to Buffy The Vampire Slayer to The Real World. With exceptional frequency, film and television both play with massively exaggerated hyperreality in ways that games rarely approach. The end of Terry Gilliam’s 12 Monkeys is shot entirely as a slow-motion action sequence. Most of the sound for the world has drained away. What remains resonates in a massive wash of reverb. At a pivotal point in the scene, all remaining sound finally fades away, leaving only music to carry the emotion of the finale. By selectively isolating specific elements of the film’s soundtrack, the movie moves away from a simple documentary of events and becomes impressionistic art.

If we are doing this in games, we’re almost exclusively relegating this kind of audio treatment to the realm of cinematics, with a few notable experiments such as XIII and MAX PAYNE 2. There is really no reason why this kind of audio mixing can’t be done in real time during interactive gameplay. Most audio engines contain separate busses for sound effects, music, and voice. Most of the time, a general mix of these busses is set, and then the volumes of individual files and specific sound banks are tweaked as needed to refine and finalize the mix. However, nothing in the compliance guidelines for Sony, Microsoft, or Nintendo prohibits audio designers from dynamically interacting with these busses for dramatic purposes during gameplay. Combining dynamic control over these master mix busses with real-time digital signal processing (DSP) effects such as filtering and reverb gives game audio engines the exact same ability to achieve impressionistic hyperreality within the art of audio design.
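As a rough illustration of dynamic bus control, the Python sketch below models three master busses whose gains glide toward a target over time, so that a dramatic moment can drain the sound effects and voice while the music carries on. The names Bus, Mixer, and apply_snapshot are hypothetical and do not come from any particular audio engine; a real engine would apply these gains to audio buffers and pair them with DSP such as filtering and reverb.

```python
from dataclasses import dataclass


@dataclass
class Bus:
    """One master mix bus whose gain glides toward a target (hypothetical model)."""
    gain: float = 1.0
    target: float = 1.0
    rate: float = 0.0  # gain change per second

    def fade_to(self, target: float, seconds: float) -> None:
        """Start a linear fade from the current gain to target."""
        self.target = target
        if seconds <= 0:
            # A zero-length fade jumps straight to the target.
            self.gain = target
            self.rate = 0.0
        else:
            self.rate = abs(target - self.gain) / seconds

    def update(self, dt: float) -> None:
        """Advance the fade by dt seconds without overshooting the target."""
        step = self.rate * dt
        if abs(self.target - self.gain) <= step:
            self.gain = self.target
        elif self.target > self.gain:
            self.gain += step
        else:
            self.gain -= step


class Mixer:
    """Three master busses, as described in the text: effects, music, voice."""

    def __init__(self) -> None:
        self.busses = {"sfx": Bus(), "music": Bus(), "voice": Bus()}

    def apply_snapshot(self, snapshot: dict, seconds: float) -> None:
        """Fade each named bus to the gain given in the snapshot."""
        for name, gain in snapshot.items():
            self.busses[name].fade_to(gain, seconds)

    def update(self, dt: float) -> None:
        for bus in self.busses.values():
            bus.update(dt)

    def mix(self, samples: dict) -> float:
        """Sum one sample per bus, each scaled by that bus's current gain."""
        return sum(self.busses[name].gain * value for name, value in samples.items())
```

Under these assumptions, a call such as mixer.apply_snapshot({"sfx": 0.0, "voice": 0.0}, seconds=2.0), followed by mixer.update(dt) once per frame, would drain the world over two seconds and leave only the music bus audible.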


For all our bluster as an industry about the promises of interactive storytelling over the traditional medium of film, games almost always adhere to a rigidly linear narrative structure. The visual language of film and television, however, is frequently more sophisticated, using flashbacks, montages, and other departures from strict linearity to help tell stories. As clichéd as an arpeggiated whole-tone harp scale may be to mark the beginning of a dream sequence, this example underscores the vast importance audio has in helping to sell shifts in time and setting. Watch the film Atonement or any episode of Lost and you’ll find that hyperreal sound and music effects always precede jumps in time and place as a means of preparing the audience for a shift in setting.

Again, if we use flashbacks or montages in games, they rarely come in any form other than cinematics. Rarities like SHADOW OF DESTINY, FINAL FANTASY VIII, and, again, XIII have experimented with nonlinear time during gameplay. While it’s essentially disjointed nonsense and devoid of actual story, the WARIOWARE series is basically a string of micro-game montages frequently tied together only by their audio content.

The potential for nonlinear gameplay as part of a game’s narrative structure is endless, with just as many creative opportunities as those available to film and television. Again, it’s going to be up to audio to help sell the distinction between real and hyperreal. If audio is decoupled from the game’s cutscene movie player and has the streams it needs for crossfades, cinematics can occur as bookends to gameplay while audio continues seamlessly from gameplay to movie and back to gameplay. If audio engines use banks of sounds that can be dynamically loaded, separate sets of assets can be created and DSP effects established to differentiate between past and present, realistic snapshot and impressionistic hyperreality.
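One way to picture the two ideas above, per-setting banks with their own DSP settings and a crossfade between streams, is the Python sketch below. Everything in it is an assumption made for illustration: the bank names, the filter and reverb values, and the choice of an equal-power curve (a common crossfade shape, though nothing here says it is the right one for a given title).

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SettingPreset:
    """Assets and DSP settings for one narrative setting (all values hypothetical)."""
    bank: str          # name of the dynamically loaded sound bank
    lowpass_hz: float  # cutoff of a low-pass filter on the effects bus
    reverb_wet: float  # 0.0 = dry, 1.0 = fully wet


# Hypothetical presets: a realistic present and a filtered, reverberant memory.
PRESETS = {
    "present": SettingPreset(bank="city_present", lowpass_hz=20000.0, reverb_wet=0.1),
    "flashback": SettingPreset(bank="city_memory", lowpass_hz=1200.0, reverb_wet=0.7),
}


def equal_power_gains(progress: float) -> Tuple[float, float]:
    """Return (outgoing, incoming) gains for a crossfade at progress 0.0..1.0.

    The squares of the two gains always sum to 1, which keeps the perceived
    loudness roughly steady through the middle of the fade.
    """
    p = min(max(progress, 0.0), 1.0)  # clamp out-of-range progress
    angle = p * math.pi / 2
    return math.cos(angle), math.sin(angle)


def transition_gains(from_setting: str, to_setting: str, progress: float) -> Dict[str, float]:
    """Map each setting's bank name to its gain at this point in the transition."""
    out_gain, in_gain = equal_power_gains(progress)
    return {
        PRESETS[from_setting].bank: out_gain,
        PRESETS[to_setting].bank: in_gain,
    }
```

In this sketch both banks stay loaded for the length of the transition, which is why the text’s point about having enough available streams matters: the outgoing and incoming audio must play at the same time for the crossfade to work.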

A century of cinema has made audiences more sophisticated than games frequently assume them to be. By combining well-planned asset management with real-time DSP effects and broadening the scope of creative direction, the language of games and game audio can move beyond photographic snapshots of virtual worlds to artistic expressions of hyperreal art that captivate, challenge, and astonish players and paint imaginative interactive experiences not possible in any other medium.