Game Audio Basics: File Formats and Mono vs. Stereo

by Jack Menhorn

Before diving head first into the wonderful world of Unity audio implementation, it is wise to prepare the assets properly to ensure smooth sailing and easy troubleshooting later in a project’s development. Fixing or preventing issues with a file before it touches Unity’s file hierarchy will save you and your team time and headaches. Let’s take a look at some of the basic aspects of sound editing and troubleshooting to ensure a quality project with quality audio.

File Format

Much like images, audio comes in a few file formats: some to use and some to stay away from.

WAV (.wav) is the most common and most widely supported file format in sound design. Unlike MP3 and OGG, WAV is typically uncompressed. When choosing sound effects from around the internet or requesting them from a sound designer, WAV typically makes the most sense.

WAV files are saved at specific sample rates. The rate is the number of samples per second, so the higher the number, the greater the fidelity and processing flexibility. 44.1 kHz is CD quality, but sound designers will often record and edit files at 96 kHz or higher so that they can get away with more drastic changes to pitch or speed without introducing unwanted artifacts into the file.
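As a quick way to check the sample rate of a file before it goes anywhere near a project, here is a minimal sketch using Python's standard `wave` module, which reads uncompressed PCM WAV files. The function names `describe_wav` and `make_silent_wav` are hypothetical, and the 44.1 kHz threshold simply mirrors the CD-quality figure above; treat this as an illustration rather than a definitive tool.

```python
import io
import wave

CD_SAMPLE_RATE_HZ = 44_100  # the "CD quality" rate mentioned above


def describe_wav(source):
    """Return basic header facts for a WAV file.

    `source` may be a file path or a binary file object.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "duration_s": frames / rate if rate else 0.0,
            "below_cd_rate": rate < CD_SAMPLE_RATE_HZ,
        }


def make_silent_wav(rate_hz, seconds=1, channels=1):
    """Hypothetical demo helper: build a silent 16-bit WAV in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * channels * rate_hz * seconds)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Inspect an in-memory 96 kHz file; a real path string works the same way.
    print(describe_wav(make_silent_wav(96_000)))
```

In practice one would pass the path of a downloaded sound effect to `describe_wav` and set aside anything flagged as `below_cd_rate`.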

OGG and MP3 files have their uses and their place at the end of the sound design chain, but as compressed formats they are not ideal for editing or processing. It is strongly recommended to avoid purchasing sound effects in MP3 format, or even .WAV files at sample rates below 44.1 kHz: options will be limited and quality will suffer.

When implementing in Unity, one typically imports the .WAV file directly into the application and then uses Unity’s built-in OGG, AAC, or MP3 compression, depending on the platform and the type of sound effect.

Mono/Stereo

Audio files come in two flavors: mono and stereo. Each one has its uses in game audio. Typically, stereo files are used for what Unity calls a “2D sound”: a sound that does not need to be positioned in 3D space, such as music, UI, level ambience (that doesn’t come from a specific point), voice-overs, or the player’s own weapon fire.

Mono files, in Unity terms, are used for “3D sounds”. A “3D sound” is positioned in 3D space: enemy weapon fire, a character speaking to you off in the distance, a machine making noise, cars driving by, or practically any sound that has an object in 3D space as its origin. Stereo files are not very useful for 3D sounds: because a stereo file already carries specific data for the left and right speakers, a game engine cannot simply raise or lower the volume of each speaker to simulate the position of an object.
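To make the idea of raising or lowering the volume of each speaker concrete, here is a small sketch of constant-power panning applied to a mono signal. This is a generic textbook pan law chosen for illustration, not a description of how Unity positions sounds internally; the function names and the `pan` parameter (-1.0 for hard left, 1.0 for hard right) are assumptions made for this example.

```python
import math


def constant_power_gains(pan):
    """Return (left_gain, right_gain) for a pan position.

    pan: -1.0 (hard left) .. 0.0 (center) .. 1.0 (hard right).
    The gains satisfy left**2 + right**2 == 1, so perceived loudness
    stays roughly constant as the sound moves across the speakers.
    """
    pan = max(-1.0, min(1.0, pan))       # clamp out-of-range input
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def pan_mono(samples, pan):
    """Turn a list of mono samples into (left, right) pairs."""
    left_gain, right_gain = constant_power_gains(pan)
    return [(s * left_gain, s * right_gain) for s in samples]


if __name__ == "__main__":
    # The same mono samples placed slightly to the right of center.
    print(pan_mono([0.5, -0.25, 1.0], pan=0.3))
```

Because a mono file has a single channel, the two gains fully determine what each speaker plays. A stereo file already has its own left/right balance baked in, which is why it does not lend itself to this kind of positioning.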

When choosing sounds from websites, there will usually be information stating whether a file is stereo or mono. Checking the file’s properties in Windows or OS X will also tell you, and when all else fails, open the file in a wave editor: one waveform means mono and two waveforms mean stereo. Mono and stereo files can be converted into each other, but converting from mono to stereo only duplicates the mono signal and doesn’t give quite the same fullness as a true stereo file. Converting from stereo to mono also loses some of the fullness of a file.
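The channel check and the stereo-to-mono conversion can also be sketched in a few lines of Python with the standard `wave` and `array` modules. The function names below are hypothetical, the sketch assumes 16-bit PCM files, and averaging the left and right channels is only one simple way to downmix; a wave editor may offer other options.

```python
import array
import io
import wave


def channel_layout(source):
    """Report whether a WAV file (path or binary file object) is mono or stereo."""
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
    return {1: "mono", 2: "stereo"}.get(channels, f"{channels} channels")


def stereo_to_mono(src, dst):
    """Downmix a 16-bit stereo WAV to mono by averaging left and right.

    `src` and `dst` may be file paths or binary file objects.
    """
    with wave.open(src, "rb") as wav_in:
        if wav_in.getnchannels() != 2 or wav_in.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo WAV")
        rate = wav_in.getframerate()
        # Stereo frames are interleaved: L0, R0, L1, R1, ...
        samples = array.array("h", wav_in.readframes(wav_in.getnframes()))

    left, right = samples[0::2], samples[1::2]
    mono = array.array("h", ((l + r) // 2 for l, r in zip(left, right)))

    with wave.open(dst, "wb") as wav_out:
        wav_out.setnchannels(1)
        wav_out.setsampwidth(2)  # keep 16-bit samples
        wav_out.setframerate(rate)
        wav_out.writeframes(mono.tobytes())
```

With a hypothetical file on disk, usage would look like `channel_layout("ambience.wav")` followed by `stereo_to_mono("ambience.wav", "ambience_mono.wav")`. Averaging the channels discards whatever differed between left and right, which is where the loss of width comes from.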

Here is an outdoor recording made in stereo:

https://dl.dropboxusercontent.com/u/16341299/GANG_UNITY_TUTORIAL/GANG_UNITY_TUTORIAL_01_stereo%3Amono_01.png

https://dl.dropboxusercontent.com/u/16341299/GANG_UNITY_TUTORIAL/GANG_UNITY_TUTORIAL_01_stereo%3Amono_01.wav

Here is the same recording converted to mono:

https://dl.dropboxusercontent.com/u/16341299/GANG_UNITY_TUTORIAL/GANG_UNITY_TUTORIAL_01_stereo%3Amono_02.png

https://dl.dropboxusercontent.com/u/16341299/GANG_UNITY_TUTORIAL/GANG_UNITY_TUTORIAL_01_stereo%3Amono_02.wav

While they may sound rather similar (in that they certainly aren’t the best outdoor recordings), the stereo one sounds a bit fuller and wider.

– end of Part 1