Tales from the Front Lines of VR Audio — AudioTechnology
It’s early days in VR. Not many people on the planet have finished and shipped a ‘Virtual Reality’ experience, but those pioneers have valuable insights from the VR audio production coalface. AT brings together four leading audio developers who’ve actually been there and designed the t-shirt. Each has experienced their own unique sonic journey in VR sound, dialogue and music.
Story: John Broomhall
Music in VR is multi-faceted. In a passive linear experience like a movie, you’ll hear diegetic music — i.e. music emanating from somewhere in the movie world which the characters can hear. It could be anything from a radio playing in or out of shot to an orchestra playing at a concert attended by the characters. Then there’s non-diegetic music which helps the audience decode and emotionally interpret what’s onscreen by clueing them into a character’s feelings or intent. Even in regular ‘first person’ perspective videogames this can raise questions around who the underscore music is for, when you — as the player — are the character actively participating and creating the action. VR raises the question of ‘Who’s the music for?’ even more acutely, and audio developers are finding music can very easily bump you out of the experience.
Each case is different but certainly music needs to be highly appropriate, strongly connected with the environment and characters — possibly even melded with the soundscape/sound design. It can even be diegetic, but designed to deliver emotional messages from within the VR environment itself.
Todd Baker: “There’s an established language in film and games, where you’re used to hearing non-diegetic music all the time. This highlights the true difference of VR: you’re actually not looking at VR on a screen. There’s this removal of the ‘fourth wall’. So if you’re standing on the edge of a cliff with the 3D ambience giving you all the realistic cues for that environment, and then a music cue kicks in, it’s like somebody just put headphones in your ears! It’s very easy for music to take you out of the experience and make you question why you’re hearing it. That said, it’s not like music doesn’t have a place. It’s such a powerful tool, so it’s a case of coming up with approaches that aren’t distracting.”
Barney Pratt: “With Until Dawn: Rush of Blood, we decided that all music would be diegetic with a 3D position in the world. We placed visible loudspeakers wherever we wanted and as the concept grew, evolved a back story for their presence, further enhancing their diegetic appeal whereby they embody the mood of the ever present character who guides you. He scares you, laughs at you, and plays spooky music, all to help drive the overall experience. A strong design edict of only having diegetic music resulted in a very immersive game element. We have a lot of licence with Rush of Blood — we could put visually fantastic gramophone speakers throughout, which look like part of the world — whereas another game might be set in nature, for example, and therefore not have that opportunity.”
Matt Simmonds: “Conversely, in narrative-driven The Assembly, incidental score works because the title’s not so obviously ‘first person’, mitigating that feeling of ‘Why is there music following me around?’ By two or three chapters in, the player is settled with the interface/viewpoint and we bring in incidental music. It’s not everywhere though, and we’ve also made ‘spot’ radios part of our game world. Actually, I’ve found incidental music score can greatly enhance the experience and it really doesn’t feel out of place within a narrative experience.”
Todd Baker: “With Land’s End I took a very holistic approach with music and sound. For instance, there were lots of musical tonal elements in the interactional audio that were tuned to a key related to very subtle underscore elements. Music grows gently out of the ambience at key points. It’s a light touch with very blurred lines between sound and music.”
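To illustrate the idea of tuning interaction sounds to the key of the underscore, here is a minimal Python sketch. It is not how Land’s End was built; the function name `snap_to_key`, the choice of a major scale and the default tonic are all assumptions made for illustration. It takes the natural pitch of a tonal sound effect and returns the frequency of the nearest note in the chosen key, which could then drive a pitch shift.

```python
import math

A4_HZ = 440.0                        # concert pitch reference, MIDI note 69
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets from the tonic


def snap_to_key(freq_hz, tonic_midi=62, scale=MAJOR_SCALE):
    """Return the frequency of the scale note nearest to freq_hz.

    tonic_midi=62 (D) is an arbitrary example key, not taken from any game.
    """
    # Convert the input frequency to a fractional MIDI note number.
    midi = 69 + 12 * math.log2(freq_hz / A4_HZ)
    # Build every note of the scale across a wide range of octaves.
    candidates = [
        tonic_midi + 12 * octave + step
        for octave in range(-6, 7)
        for step in scale
    ]
    # Pick the scale note closest in pitch, then convert back to hertz.
    nearest = min(candidates, key=lambda note: abs(note - midi))
    return A4_HZ * 2 ** ((nearest - 69) / 12)
```

For example, a chime whose natural pitch is 500Hz would be retuned to roughly 493.9Hz (the note B) if the underscore were in C major (`tonic_midi=60`).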
Land’s End is a VR adventure from the creators of Monument Valley. Set against spectacular landscapes, the player is tasked with awakening an ancient civilisation using the powers of their mind. Land’s End combines Ustwo Games’ award-winning approach to interactive storytelling with Samsung Gear VR, creating a virtual reality experience you can take anywhere.
HEADS UP WITH THE DISPLAY
Todd Baker (Audio/Music Artist): “It’s really important to have access to VR headset tech early on in the game’s development. You need to understand the nature of the experience you’re trying to bring to life with audio. The sooner you get the headset on, the better. VR isn’t a world apart if you’ve worked on a 3D game (a first-person game, in particular), but there are key areas of difference.
“A good thing about the mobile VR platform is you can take the headset anywhere — you just need a phone and the headset — so I was able to take ‘game builds’ home very easily and play in different environments using different headphone options.”
Though VR clearly creates dramatic opportunities for sound to provoke strong visceral responses, arguably sound designers have to throw out at least some of the rule book and find some new tropes and techniques.
Barney Pratt: “From the outset we knew we would have to adjust our approach, opening the door to experimentation. Certain things we’d taken for granted simply didn’t work in the VR realm. Rush of Blood exists in the same world as Until Dawn, but the step to VR meant continually reshaping the experience from a cinematic to a more immersive one.”
Matt Simmonds: “The first immediate VR sound difference I noticed was how players respond to new environments. They spend much longer experiencing them, even if they’re not actively engaging with them. In that regard, we’ve had to rework our ideas on ambience over time by having things evolve gradually, and understanding players will be in the game space far longer than in flatscreen games. You need more attention to detail, populating even seemingly mundane objects with emitters to make them feel more solid.”
Barney Pratt: “One of the biggest mind shifts was meticulously giving all sounds, without exception, a true 3D position, otherwise they detracted. The in-world spatialisation of all sounds is vital.”
Simon Pressey described his team’s attention to detail with Crytek’s high-gloss Robinson title as nothing short of ‘fanatical’: “We’ve created a totally complete and coherent new world, every creature (and variation of it) from brontosaur to cockroach makes sound. The world is alive with sound, all playing in dynamically binaural 3D. VR takes visual immersion to a new level, and for that immersion to be believable and engaging, the audio reality has to complement and reinforce it.”
Matt Simmonds: “You’re operating in an audio setting with no framing. How much that affects sound design depends on the project, but it certainly changes your approach to many things. It’s about removing some of the traditional ‘go-tos’ we take as given. For example, I’ve found ducking and ‘focused audio’ can take a step back. I did use that approach for the main VO on one project but in future I’d rather re-work the way assets are recorded and placed.”
UNTIL DAWN: RUSH OF BLOOD
Developer: Supermassive Games
Strap yourself in for the most disturbing rollercoaster ride you’ll ever take. From the warped minds of the team behind PS4 horror classic Until Dawn comes Until Dawn: Rush of Blood, a virtual reality experience to strike fear into the hearts of every trigger-happy arcade shooter fan.
Barney Pratt (Audio Director): “The way sounds are attenuated over distance has to sound more realistic than in a filmic experience, which creates challenges at long distance with sounds you want to prioritise. Early on, we realised that when characters or objects are very close to the player, we can really invade their personal space, creating audio events people feel they can literally reach out and touch. It can add a visceral layer of creepiness when a character leans in to talk to the player, and it’s fantastic for VR horror scares. We call it ‘pulling focus’. VR soundscapes have a lot more space but it’s a mistake to try and fill it with more sounds as any ‘clutter’ can be a tiring distraction. Choosing your moments is key to the emotional curve. Rush of Blood is an intense experience so having emotional lulls is just as important as pushing the highs.”
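As a rough illustration of the distance attenuation trade-off described above, the Python sketch below applies the textbook inverse-distance law, under which level drops by about 6dB for each doubling of distance. The function names, the `ref_distance_m` parameter and the `min_gain` floor are assumptions for this example, not details of Rush of Blood’s audio engine. The floor is one hypothetical way to stop a prioritised sound vanishing entirely at long range.

```python
import math


def inverse_distance_gain(distance_m, ref_distance_m=1.0, min_gain=0.0):
    """Linear gain for a point source under the inverse-distance law.

    Inside ref_distance_m the sound plays at full level. min_gain is a
    hypothetical floor for sounds that must stay audible far away.
    """
    if distance_m <= ref_distance_m:
        return 1.0
    return max(ref_distance_m / distance_m, min_gain)


def gain_to_db(gain):
    """Convert a linear gain to decibels (silence maps to -infinity)."""
    if gain <= 0.0:
        return float("-inf")
    return 20.0 * math.log10(gain)
```

With the defaults, a source 2m away plays at half amplitude (about -6dB), while `inverse_distance_gain(100.0, min_gain=0.05)` holds a distant priority sound at 0.05 rather than letting it fall to 0.01.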
Set in the future, Robinson features Robin, a 12-year-old survivor of a crash landing whose space colony ship, Esmerelda, has suffered a catastrophic disaster. Robin starts exploring his new planet and finds a remarkable environment populated with incredible flora and fauna… and dinosaurs. He must stay alive, learn why the disaster struck and possibly contact other survivors.
STRONGLY-CONNECTED MUSIC MATTERS
Robinson features a unique music score by ace composer Jesper Kyd, but figuring out the right overall music design approach was not without its challenges.
Simon Pressey (Audio Director): “I was very concerned about music potentially bumping us out of the experience. Our early experiments in VR proved how easily this could happen — as soon as music started to play it was like, ‘What is this stuff doing in my world?’
“We found the music had to be totally coherent with the entire world and narrative. The key design pillars were Robin’s story and perspective, and the world and the mission of Esmerelda. The music continually makes reference to both. We kept the use of music minimal and intentionally simple; less truly was more, with the Erik Satie-esque compositional minimalism adding to the experience rather than taking it over. The overall simplicity and sense of naivety connect strongly to young Robin. The story is a ‘future fiction’ rather than pure science fiction because it’s Robin’s story and in fact about mankind’s desire to explore. So the music has a sense of wonder of the unknown.
“There’s also a cinematic approach with the music being used to help direct the player’s experience, expressing emotions related to the story. For example, when you’re seeing a vista of the world you’ve landed on for the first time, with the ruins of the crashed ship, one music cue expresses all the emotion connected to that — wonder, potential, loss, resolution — in a way words can’t. Subsequently, elements of that music cue echo throughout the rest of the story and score.”
Whether the user’s own notional sounds, such as footsteps or breathing, should be included is a case-by-case judgement, according to Todd Baker: “It comes down to the project and the particular world you’re creating. In Land’s End I included very gentle foley when you move between ‘look points’. It was a very subtle feeling of wind movement, a kind of ambiguous ‘flappiness’; we talked about it like a spirit or a memory.”
Barney Pratt: “We decided against incorporating any breaths. We discussed possible heartbeat for stress situations but decided against that too. However, for contact with the environment, where you actually see your feet walking, we have appropriate sound. That’s a correct response to the environment and quite natural, rather than forcing something on the player.”
The Assembly is an intriguing first-person interactive drama in which players investigate a shadowy organisation that’s been conducting secret experiments, their astonishing breakthroughs only made possible by operating outside government scrutiny and society’s morals. But what is it hiding and how far will it go to keep its existence buried?
MY FIRST VR PROJECT
Matt Simmonds (Audio Director): “If you’re approaching your first VR project, my advice would be to think carefully about dialogue recording. It’s a difficult thing to manage but in a narrative game I think a larger sense of the scene’s environment comes into play. The player’s viewpoint eschews old concepts of a ‘game camera’, and omitting ducking and sound focus removes the notion of the player’s audio being ‘controlled’. So, what are you left with?
“Record dialogue the way the actor would respond to the environment. If it’s quiet, downplay it; you can even whisper in VR and have it work. The same goes for competing loud sounds: have the actors shout against the environment. For the latter, I think having crystal clear dialogue in the mix isn’t something you’ll strive for compared to a greater sense of realism. You rework dialogue so the implication is obvious even if the words are not.
“Also, get your third party technology choices settled early, start thinking about mixing sooner rather than later and allow generous scheduling for it. However, remember that mixing and placement are almost interchangeable going forward. Think ahead how you’ll deal with the obvious difficulty of having to wear the headset whilst mixing.”
Barney Pratt: “On paper, binaural encoding — or head-related transfer function (HRTF) — is a ‘ticked box’, ‘job done’, ‘best 3D audio’. However, that’s not necessarily the case. We’d often exclude sounds from the HRTF in Rush of Blood’s busy soundscape to improve the sense of directionality for the player. We’d only select sounds for HRTF filtering that wouldn’t suffer from the resultant low-end loss; the mulch.”
Todd Baker: “It does affect the character of the sound. I ended up not using HRTF at all on Land’s End. It’s interesting how a lot of the talks I’ve heard over the last couple of years have been very technically focused. I just had to ask, ‘do I need this? Is it actually going to make this project better?’ HRTF filtering affected the character of the sounds in a way I felt was undesirable. It’s definitely not a case of absolutely needing it. Even if you do use it, it’s going to most likely be on a select few sounds within a scene.”
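Both developers describe applying HRTF filtering selectively rather than across the board. The Python sketch below shows one hypothetical way to pre-screen assets for that decision: it estimates how much of a sound’s energy sits in the low end and only flags the sound for HRTF processing if that share is small. The 200Hz cutoff, the 0.5 threshold, the one-pole low-pass filter and all function names are assumptions for illustration; neither team describes its selection process in this detail, and in practice the choice was made by ear.

```python
import math


def low_end_ratio(samples, sample_rate_hz, cutoff_hz=200.0):
    """Estimate the fraction of signal energy below cutoff_hz.

    Uses a simple one-pole low-pass filter, so the result is approximate.
    """
    total = sum(x * x for x in samples)
    if total == 0.0:
        return 0.0
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    state = 0.0
    low = 0.0
    for x in samples:
        state += alpha * (x - state)  # low-passed version of the signal
        low += state * state
    return low / total


def should_use_hrtf(samples, sample_rate_hz, max_low_end_ratio=0.5):
    """Flag a sound for HRTF only if it has little low end to lose."""
    return low_end_ratio(samples, sample_rate_hz) <= max_low_end_ratio


def sine(freq_hz, sample_rate_hz=16000, seconds=0.25):
    """Generate a test tone as a list of float samples."""
    count = int(sample_rate_hz * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]
```

Under these assumptions a bright 4kHz tone is flagged for HRTF, while a 60Hz rumble is left to the regular panner so it keeps its weight.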