What’s Next for the Next Next-Gen? – Jesse Harlin (Jan 2012)

As both a creative and technical discipline, interactive audio has made tremendous strides over the last console cycle. When I took over writing this column six years ago, most companies were still laboring to build proprietary audio engines. Handheld games were primarily in the control of first-party console manufacturers. Rich, interactive music was a cutting-edge feature rarely found in-game. iTunes’ video game marketplace, Kongregate.com, and Zynga didn’t exist yet. Six years on, smartphones now dominate the handheld market, and games have a thriving independent developer community again. Proprietary audio engines have lost ground to Wwise and FMOD. Interactive music is now the rule rather than the exception, and the internet is now the frontier of game platform innovation. As we gear up for the next generation of consoles and the next burst of technical and creative advancement, it’s worth looking at what the next areas of focus should be for game audio.

At the start of the next generation, sound designers are finally in a place where the battle in AAA development for detail and variation is a waning concern. Complex instance culling and stream management systems are already a must, and will only continue to become more of a fundamental need. Sound designers can expect the usual fidelity increases to both asset bit and sample rates and, subsequently, the usual increased impact on both memory and disc footprints.
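To put rough numbers on that footprint impact, here is a back-of-the-envelope sketch. It assumes uncompressed PCM audio, and the specific formats compared (16-bit/48 kHz versus 24-bit/96 kHz stereo) are illustrative choices rather than anything a particular platform requires. Shipped games typically compress their assets, so real figures will be smaller, but the ratio still shows the trend.

```python
def pcm_bytes(sample_rate_hz: int, bit_depth: int, channels: int, seconds: int) -> int:
    """Size in bytes of uncompressed PCM audio (assumes a whole-byte bit depth)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


# One minute of stereo audio at two illustrative fidelity levels.
current_gen = pcm_bytes(48_000, 16, channels=2, seconds=60)
next_gen = pcm_bytes(96_000, 24, channels=2, seconds=60)

print(f"16-bit/48 kHz: {current_gen / 1_000_000:.2f} MB per minute")
print(f"24-bit/96 kHz: {next_gen / 1_000_000:.2f} MB per minute")
print(f"Growth factor: {next_gen / current_gen:.1f}x")
```

Under these assumptions, doubling the sample rate and moving from 16 to 24 bits triples the size of every asset before compression.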

But that’s the small stuff. The next generation of game sound is going to be about maturity. We’ve shown how big we can make our worlds. Now we need to show how good we can make them sound.

As such, the new frontier of sound design is mixing and mature implementation. Gone are the days when it was acceptable to simply have static master levels for sound and music that are occasionally ducked by voice. Nuanced mixes and intelligent, systemic mixing are the next big focus. As we gain the ability to add more real-time convolution reverbs and more detailed surround ambiences, we’re going to need the ability to deftly sculpt frequency space and create situational mixes that change depending on player feedback and myriad shifting game states.
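One way to picture a situational mix is as a set of per-state snapshots that the bus levels glide toward, rather than a single static set of master levels. The sketch below is a minimal illustration of that idea, not how Wwise, FMOD, or any shipping engine does it; the state names, decibel values, and function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MixSnapshot:
    """Target bus gains, in decibels, for one game state."""
    music_db: float
    sfx_db: float
    voice_db: float


# Hypothetical game states and levels, chosen only for illustration.
SNAPSHOTS = {
    "exploration": MixSnapshot(music_db=-6.0, sfx_db=-3.0, voice_db=0.0),
    "combat": MixSnapshot(music_db=-3.0, sfx_db=0.0, voice_db=0.0),
    "dialogue": MixSnapshot(music_db=-14.0, sfx_db=-9.0, voice_db=0.0),
}


def step_toward(current: float, target: float, max_step_db: float) -> float:
    """Move one gain toward its target by at most max_step_db."""
    delta = target - current
    if abs(delta) <= max_step_db:
        return target
    return current + max_step_db if delta > 0 else current - max_step_db


def update_mix(current: MixSnapshot, state: str, max_step_db: float = 1.0) -> MixSnapshot:
    """Advance every bus one tick toward the snapshot for the given game state."""
    target = SNAPSHOTS[state]
    return MixSnapshot(
        music_db=step_toward(current.music_db, target.music_db, max_step_db),
        sfx_db=step_toward(current.sfx_db, target.sfx_db, max_step_db),
        voice_db=step_toward(current.voice_db, target.voice_db, max_step_db),
    )


# A conversation starts while the player is exploring: music and effects
# ease down over several ticks instead of snapping to a ducked level.
mix = SNAPSHOTS["exploration"]
for _ in range(10):
    mix = update_mix(mix, "dialogue")
```

Calling `update_mix` once per game tick with the current state gives a gradual transition between mixes; a real system would layer in per-frequency shaping, priorities, and player-driven overrides on top of this.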

Game music only continues to get more sophisticated and more complex, both compositionally and technically. Wwise and FMOD’s considerable acceptance across the industry has given game composers an advantage that was sorely missing for years: standardized tools. When the gig can be about composing interactive music—as opposed to building technology that facilitates composing interactive music—composers as a group can begin to focus on innovation rather than reinvention.

The ubiquitous music loop was once the undisputed king of game scoring. King Loop, however, has become tiresome, and audio teams across the industry are now working with game music systems that focus on more variety. Interactive music scores are starting to be more about stitching non-looping material together, rather than wall-to-wall loops of repetitive music. This increase in variety is bringing with it an increase in the amount of music needed to cover a game.

By the end of the next generation of games, conversations regarding “disc footprint” will be a thing of the past. Cloud storage and cloud streaming will be the norm not only for digital distribution of game software, but also for game content. In-game radio stations, streamed level music for Facebook games, and even faction-specific multiplayer music no longer need to live on a physical disc or even on the end user’s machine. If game developers and publishers don’t offer the technology themselves, expect to see a rise in streaming services akin to YouTube and SoundCloud that can be used to propagate in-game musical content.

In-game text is an endangered species. We’ve reached a point as an industry where even a gigantic MMO like Star Wars: The Old Republic is fully voiced. As such, expectations from players are shifting. Increases in storage space mean that everything that can be voiced should be voiced. Additionally, players are coming to expect a wider range of localized languages within games. Including Chinese subtitles simply won’t be enough anymore.

As graphics and animation technology improve, game voice is going to be massively impacted by the further proliferation of facial motion capture. Our industry was once closely related to the animation world, in which VO actors ruled the day. We’re now shifting into a camera- and physical performance-oriented field. More and more, game industry casting directors are looking for their actors to have previous motion capture experience as well as a physical likeness that can be used in-game. The traditional role of voice-over-only talent isn’t going to go away. However, this influx of new film- and TV-vetted actors, mixed with an increasing importance placed on cast records and actor involvement in script and character development, is already bringing a new level of nuanced drama to our games.

After six years of writing for “Aural Fixation,” this column will be my last. Thanks to everyone who’s taken the time to read and comment over the years. See you all in the trenches.