‘Mixing’ spreads itself across the whole life-cycle of any audio project, and involves an amalgam of all our perceptual, creative and technical skills. We’re all quite aware of how challenging mixing is (or, at least, can be) for fixed-media situations, but how are these challenges affected when we’re presented with a (literally) moving target?

This session we’ll look at these challenges: what’s similar and distinct about mixing for dynamic situations, and what things to bear in mind. We’ll look a bit at a couple of approaches taken in the games industry, and finally have a look at some of FMOD’s facilities.

First Bit: An Exercise

You all have experiences mixing. As such it seems a bit redundant, not to mention inefficient, for me to stand at the front and tell you about it. Rather, we’re going to pool our experience.

In groups, first consider mixing in the general sense. What are four priorities for making a ‘good’ mix? What are four priorities for avoiding a ‘bad’ mix? These priorities shouldn’t be qualities that define a good/bad mix; rather, I want you to think in practical terms about the sorts of thing you need to do (or not do) as you work.
Still considering the general case: what sorts of knowledge do we bring to our practice as mixers? Look over your eight priorities from above, and jot down the different kinds of skill / knowledge required to enact them. Is it all just technical, or do specific forms of cultural, social, personal knowledge play a part? Why am I asking you this?
Now, agree within your group on five qualities of a good mix, in a general sense (any medium, any genre?)

(Class discussion to follow)

Back into groups:

What’s different for interactive / dynamic mixing? Agree on a list of (4-5) distinct challenges presented by this medium. (For those doing Sound and Fixed Media, it may be valuable to consider the kind of work you’ve been doing on that course and consider how it might need to change in an interactive context)
What are the particular priorities for an effective game mix?

(Some further discussion)

Dynamic Mixing Perspectives

A quick survey of perspectives and approaches to game mixing.

This picture gives some indication of the mixing task that needs to be dealt with at a given instant. One helpful way of thinking about the job at hand comes from Bridgett’s perspectives on Horizontal and Vertical mix elements. Bridgett denotes ‘horizontal’ elements of a mix as concerning the consistency between assets (e.g. loudness normalisation of dialogue), and their ability to fit. Vertical elements concern the moment-by-moment combination of assets in order to present the game narrative appropriately. (This perspective seems to me also applicable more generally to asset-based mixes where we’re building things from a collection of materials, e.g. much sample-based music).

Considering, in particular, ‘vertical’ elements we have to bear the following in mind:

SFX – how do these fit into the wider sound design schema? what resources and sonic bandwidth are available? Often SFX are given a lower priority than the other uber-categories… choosing priority: what conveys essential information and what is purely for aesthetic/immersive purposes…
Interface sounds – are these clear enough above any other elements in order that their informational content is effectively delivered? Do they spoil immersion by being too disruptive?
Dialogue – what frequency range, priority? spatialize? send to centre speaker?
Music – e.g. level of intensity – what is happening when this cue is playing back? Can different layers be called or excluded based on other elements? Will the player be allowed to substitute their own music into the game?
Switching components of a sound based on various criteria, e.g. distance from the listener object, switching tails, indoor / outdoor
On a global level, usually attenuation and occasionally boosting of levels/frequency ranges based on priority, state (e.g. night, day, player health etc), resources, player location, object location etc – ducking, dynamic filtering and multi-band compression, side-chaining
Depending on the platform some sound assets might be discarded, so it’s important for a mix to work even at its bare minimum

General Techniques

‘Normal’ asset preparation (this is where mixing starts!)
Conventional ‘linear’ techniques
Location-based mixing
Dynamic range based manipulations

Dynamics

Straightforward ‘side-chaining’ (actually, external keying for those of us who are pedantic)
Meta-data tagging
Snapshots, priority based systems
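To make the ‘side-chaining’ idea concrete, here is a minimal sketch of external keying: the gain of one bus (say, music) is driven by the level of a different signal (say, dialogue). Everything here is an illustrative assumption rather than any particular engine’s behaviour: the function names, the threshold, duck depth and knee values, and the idea that levels arrive as one dB reading per processing block.

```python
def duck_gain_db(key_level_db, threshold_db=-30.0, depth_db=-9.0, knee_db=10.0):
    """Target gain (dB) for the ducked bus, given the key signal's level in dB.

    Below the threshold nothing happens; above it the duck fades in
    linearly over knee_db until the full depth is reached.
    """
    if key_level_db <= threshold_db:
        return 0.0
    amount = min((key_level_db - threshold_db) / knee_db, 1.0)
    return depth_db * amount


def smooth(gains_db, attack=0.5, release=0.1):
    """Smooth a sequence of target gains so the duck doesn't click or pump.

    Each block moves a fraction of the way towards the target: a larger
    fraction when ducking down (attack), a smaller one when recovering
    (release).
    """
    out, current = [], 0.0
    for target in gains_db:
        coeff = attack if target < current else release
        current += coeff * (target - current)
        out.append(current)
    return out
```

For example, smooth([duck_gain_db(level) for level in dialogue_levels]) gives a per-block gain curve for the music bus, where dialogue_levels is a hypothetical list of measured dialogue levels.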

The ‘High Dynamic Range’ approach from DICE uses a tagging system to mark assets with a nominal loudness. The louder assets are given priority at runtime, and other elements are simply muted to reduce clutter:

How High Dynamic Range Audio Makes Battlefield: Bad Company Go BOOM from Anders Clerwall
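A very loose sketch of the idea, not DICE’s implementation: assume each playing voice carries a nominal loudness tag in dB, the loudest voice sets the top of a ‘window’, and anything that falls below the bottom of that window is muted. The Voice class, the function name and the 40 dB window width are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Voice:
    name: str
    loudness_db: float  # nominal loudness tag, assumed to be authored per asset


def audible_voices(voices, window_db=40.0):
    """Return the voices that survive the loudness window.

    The loudest playing voice defines the top of the window; voices more
    than window_db below it are dropped (muted) to reduce clutter.
    """
    if not voices:
        return []
    top = max(v.loudness_db for v in voices)
    floor = top - window_db
    return [v for v in voices if v.loudness_db >= floor]
```

The interesting consequence is that the same footstep can be audible in a quiet moment and culled while an explosion is sounding, without anyone having automated a fader.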

See here for a really interesting podcast discussion about the challenges / techniques of dynamic mixing.

Dynamic Mixing and FMOD

Most of you have encountered some of the tools FMOD offers, especially the use of distance for location based mixing, and custom parameters to achieve more complex states (e.g. different sonic scenes at different elevations).
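As a reminder of what distance-based mixing boils down to, here is a minimal sketch of a generic inverse-distance roll-off. This is the textbook law (roughly 6 dB quieter per doubling of distance), not necessarily the curve FMOD applies; the function name and the default minimum and maximum distances are assumptions for illustration.

```python
import math


def distance_gain_db(distance, min_distance=1.0, max_distance=50.0):
    """Gain in dB for a source at the given distance from the listener.

    Inside min_distance the source plays at full level; beyond
    max_distance the attenuation stops increasing.
    """
    d = min(max(distance, min_distance), max_distance)
    return 20.0 * math.log10(min_distance / d)
```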

FMOD has some additional features though:

Live Mixing

A cycle of run, tweak, build, refresh, run… is a pretty slow way to work. Fortunately we can connect directly to the game…

Also, we can connect a control surface (that speaks Mackie Control) and actually get some faders under our fingers. (In the podcast above, Rob Bridgett discusses how important this was when they were working with Randy Thom on Scarface)

Loudness Metering

FMOD has an ITU-R BS.1770 / EBU R128 meter for ‘loudness’ metering of your material. For an explanation of this technology, see this.
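To give a feel for what such a meter does with its measurements, here is a sketch of the gating step used for the long-term (‘integrated’) figure in BS.1770. It assumes the hard part has already been done: block_powers is taken to be a list of K-weighted, channel-summed mean-square values, one per 400 ms block. The function names are my own, and this is a sketch rather than a compliant meter.

```python
import math


def block_loudness(mean_square):
    """Loudness of one block from its K-weighted mean-square power."""
    return -0.691 + 10.0 * math.log10(mean_square)


def integrated_loudness(block_powers, abs_gate=-70.0, rel_gate_lu=-10.0):
    """Gated long-term loudness from per-block mean-square powers.

    Step 1: discard blocks below the absolute gate (silence).
    Step 2: discard blocks more than 10 LU below the loudness of what
            remains, then average the survivors.
    """
    kept = [p for p in block_powers if p > 0 and block_loudness(p) > abs_gate]
    if not kept:
        return float("-inf")
    rel_threshold = block_loudness(sum(kept) / len(kept)) + rel_gate_lu
    kept = [p for p in kept if block_loudness(p) > rel_threshold]
    if not kept:
        return float("-inf")
    return block_loudness(sum(kept) / len(kept))
```

The gating is why long stretches of silence or very quiet ambience don’t drag the long-term number down as much as you might expect.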

Bridgett’s article above has the following quite informative selection of loudness levels that they worked to:

The long term metering expressed in the ITU spec was developed for broadcast program content, and suits ‘predictable’ program content usually around one hour or half an hour in duration, and in this respect a single long-term number can’t be applied as easily to video games that have indeterminate lengths and unpredictable content.

The strategy we adopted wasn’t a conscious one, but more an observation based on what sounded good. We noticed different sections of the game naturally pooling into different loudness ranges by simply mixing to what sounded right. We noted a ‘range’ of long term loudness measurements which can be anywhere between -13 and -23 LU, based on the nature of the action.

story cutscenes -23

in-game cut scenes -19

background ambience -23.

‘Talky’ missions, or ‘stealth’ missions between -23 and -19 depending on context.

Action missions -19

Insane action missions -13

These numbers are what we are calling, for now, our “Long Term Dynamic-Range”, a grouping of loudness measurements that apply to certain types of game play or presentation elements, in our case -13 to -23. In essence I think what is required here is a method of measuring all these different aspects of the game within a short time window, around 30 mins to make a single overall loudness measurement useful. This is not always an easy approach when the game isn’t structured in that way.

(gamasutra.com/view/feature/172660/notes_from_the_mix_prototype_2.php?page=4)
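One way a team might put numbers like these to work is as a simple lookup that a long-term measurement can be checked against. The sketch below uses the figures quoted above; the section names, the check_section function and the 1 LU tolerance are hypothetical, and nothing here is claimed to be what Bridgett’s team actually did.

```python
# Target long-term loudness ranges per section type, from the figures quoted above.
TARGETS = {
    "story_cutscene": (-23.0, -23.0),
    "in_game_cutscene": (-19.0, -19.0),
    "background_ambience": (-23.0, -23.0),
    "talky_or_stealth_mission": (-23.0, -19.0),
    "action_mission": (-19.0, -19.0),
    "insane_action_mission": (-13.0, -13.0),
}


def check_section(kind, measured, tolerance=1.0):
    """Compare a measured long-term loudness with the target for its section type."""
    low, high = TARGETS[kind]
    if measured < low - tolerance:
        return "too quiet"
    if measured > high + tolerance:
        return "too loud"
    return "ok"
```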

Elsewhere, Bridgett discusses how he used different loudness profiles for a kids’ game in order to try and mitigate the danger of headphone use: gamasutra.com/blogs/RobBridgett/20140405/208210/ReImagining_the_Sound_of_PreSchool_Games.php?print=1

Submixes, Groups and VCAs

Once our projects get beyond a certain size (dynamic or not) we need to keep the number of things to manage down to something we can cope with.

We can make sub-mixes within events by routing audio away from the event master
FMOD has a very flexible grouping / bussing mechanism
VCAs give you an additional layer of control on top of groups, one that sits outside the rest of the routing system
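A conceptual sketch of how these layers combine, not FMOD’s API: assume gains are in dB, each group bus routes into a parent, and a VCA simply offsets the level of whatever buses it is assigned to without any audio passing through it. The Bus and VCA classes and the function name are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VCA:
    name: str
    gain_db: float = 0.0


@dataclass
class Bus:
    name: str
    gain_db: float = 0.0
    parent: Optional["Bus"] = None            # where this bus is routed
    vcas: list = field(default_factory=list)  # VCAs controlling this bus


def effective_gain_db(bus):
    """Total gain applied to a signal entering this bus, through to the master.

    Fader gains add up along the routing chain; VCA offsets are added
    wherever a VCA is assigned, independently of the routing.
    """
    total = 0.0
    while bus is not None:
        total += bus.gain_db + sum(v.gain_db for v in bus.vcas)
        bus = bus.parent
    return total
```

Because the same VCA can be assigned to buses in quite different parts of the routing tree, one fader can trim, say, everything combat-related regardless of where it is routed.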

Some of you may already have watched Jules’ video on this:

Snapshots

These are more powerful than they sound! Snapshots allow us to store and recall particular mixer settings, including arbitrary parameters.

Anything that is to be included in a snapshot has to be manually ‘scoped-in’. That is, nothing is included by default. This gives us a high degree of flexibility.
Snapshots are attached to events. This might seem counter-intuitive, but is very powerful indeed.
As a consequence, multiple snapshots can be in effect at any given moment.
FMOD then does some behind-the-scenes magic to run a prioritising system, based on which snapshots have greater priority (denoted by their place in the snapshot list).
Most commonly this would be used for ducking, but all kinds of creative possibilities are available.
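The scoping and priority ideas can be sketched in a few lines. This is a toy model under stated assumptions, not FMOD’s actual resolution logic (which also involves things like intensity and blending): snapshots are assumed to be plain overrides, and those earlier in the list are assumed to win over those later in the list. The function and parameter names are hypothetical.

```python
def resolve_mix(base, snapshot_order, active):
    """Work out the mixer state with several snapshots in effect at once.

    base:           dict of parameter name -> value for the default mix
    snapshot_order: list of (name, scoped) pairs, highest priority first,
                    where scoped holds only the parameters scoped in
    active:         set of names of the snapshots currently in effect
    """
    mix = dict(base)
    # Apply lowest priority first so higher-priority snapshots overwrite it.
    for name, scoped in reversed(snapshot_order):
        if name in active:
            mix.update(scoped)
    return mix
```

Anything a snapshot has not scoped in is left exactly as the base mix (or a lower-priority snapshot) set it, which is what makes several simultaneous snapshots manageable.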

These tutorial videos are useful. Even though they are done in the context of UE4, most of the action is in FMOD:

Interactive and Dynamic Mixing