BEAUTY IS IN THE EARS OF A DUMMY - AudioTechnology
Are high sampling rates a cure or a curse? Chesky Records stands by them, and pays top dollar to ensure both tracks of its ‘Binaural +’ recordings make the most of every 1 and 0.
Story: Paul Tingen
Artist: Macy Gray
True to its name, Macy Gray’s ninth studio album, Stripped, is a minimalist affair. The mainstream music press reported it as a revival of the simpler days — recorded in a church, over two days, with a single microphone and no overdubs. It matched the picture of a singer going back to basics to record a selection of older songs, some new ones and the occasional cover with only a small jazz combo as accompaniment. Unsurprisingly, that revivalist image only told half the story.
For starters, Stripped was not recorded by only one mic, but by two. Specifically, it was recorded with a B&K 4100 D Binaural ‘head’, which has a mic in each ear. Stripped also has no compression or EQ (as we know it), was recorded to 24-bit/192k, and released by New York audiophile label Chesky Records, as part of its ‘Binaural +’ series. Normally, binaural recordings sound wonderful on headphones, but don’t translate well to speakers. However, a collaboration between label co-founder and co-owner David Chesky, and Professor Edgar Choueiri of Princeton University — known for his research into advanced spacecraft propulsion — has resulted in technology that renders binaural recordings compatible with speaker playback. Hence the ‘+’.
Chesky’s Binaural + series is the most recent of many moves the company has made to remain at the forefront of recording and playback technology. The label was started in 1978 by David Chesky and his brother Norman, and has since been a leader in the audiophile market. In 2008, long before Tidal and Pono, the Chesky Brothers founded HDTracks.com, where all Chesky releases, as well as a wide variety of other albums, by artists from Jimi Hendrix to Alanis Morissette to Bela Bartok, can be downloaded as uncompressed files with resolutions up to 24-bit/192k. The blurb on the Chesky Records website notes that “we’re living in the Golden Age of headphone design,” and the Binaural + series aims to cater for connoisseurs of the headphone market, and attract a younger audience more inclined to listen via headphones than by any other means. Adding an established pop/R&B name like Macy Gray to the label’s roster helps serve both purposes at the same time.
PRINCETON’S HIGH-FIDELITY BLESSING
Over the course of several phone calls to New York — where David Chesky also works as a composer, producer, arranger, and musician — he elaborated on some of the company’s philosophies and technical approaches, and the innovations resulting from his work on ‘future audio technology’ with Professor Choueiri at Princeton. Together with engineer Nicholas Prout, Chesky explained how these philosophies and technologies were applied on the Macy Gray project.
Chesky is clearly a driven man, who talks fast and with great intensity. He began by addressing the entire raison d’etre of his company: “We pioneer HD audio, but unfortunately we live in a time of cheap consumerism, fast food and disposable everything. However, the pendulum can swing, and we should always try to do the best work we can. If we make a film, we want to shoot it in 70mm rather than 16 or 8mm. If we have tools like that, in our case high-definition audio, why not make use of them to the best of our ability? Why make things cheap and bad when we can make them great? There’s a world that goes for the lowest common denominator, and there’s a world that’s utopian. Eating in McDonald’s is not the same as eating in a great French restaurant. We try to offer the latter.”
Chesky’s utopian vision of quality over quantity is not only embodied in the label’s technical approach, but also in the music Chesky Records releases. Despite the great variety of genres, including jazz, classical, pop, R&B, folk and world music, performed by artists who rarely enjoy mainstream recognition (Macy Gray is one of very few exceptions), Chesky releases have an identity that oozes high quality, in a manner similar to the sonic and aesthetic identity of releases by the German label, ECM Records. In the past Chesky releases were recorded with as few microphones as possible, using custom-made valve desks and the latest in digital technology. The label’s Binaural + series defines the label’s identity and sound with even greater clarity.
BACCH IN BLACK & WHITE
For more information on BACCH 3D Sound, you can head direct to the source at Princeton University to find a collection of info: princeton.edu/3D3A/Projects.html. The concept is now commercially available in a hardware box developed by Theoretica (theoretica.us) called the BACCH Stereo Purifier (BACCH-SP). It’s also available to be integrated into software via BitCauldron’s Unity plug-in SDK (bitcauldron.com).
“What we’re about is capturing great musicians in a great space,” explained Chesky. “We take beautiful aural photographs of musicians playing together in a real space. As good as high definition audio is, if the musicians aren’t great, or they’re not playing in a great space, it won’t work. If Madonna works in a studio and creates a song using 64 tracks and tons of overdubs, that’s okay, because that’s the world she lives in and that’s what she wants to do. Similarly, if someone wants to express himself through Hip Hop, that’s the way they do it. But our recipe is capturing real musicians in a real space, playing music together. We’re not trying to be everything to everybody; instead, we’re being very selective.
“With regards to our binaural recordings, we do a different type of binaural. Our recordings are a hybrid version of binaural that work perfectly on headphones and on speakers. If you listen back on speakers, it sounds like it was recorded by two spaced omni mics. The technology was developed by Dr Edgar Choueiri of Princeton, with whom I have been working for many years now. Their BACCH 3D Sound filter cancels crosstalk between speakers, and in so doing retrieves a 360-degree sonic hologram of the binaural recording. The recordings have height and depth, as well as width, and you can hear things above you and behind you, and so on. We use a Diffuse Field EQ which also makes sure that binaural recordings translate better to speakers.
“Normally when you listen to speakers there’s corruption from crosstalk, because your ears are hearing both speakers at the same time, but the filters developed by Dr Choueiri at Princeton’s 3D Audio & Applied Acoustics Lab correct for this. It means we can put the listener in a virtual space. Think of it as virtual reality for audio. This definitely is the way of the future. It’s where we’re heading. The other aspect that’s important for this is that we need high sampling frequencies to be able to auto-locate sounds, as we do in nature. The ear needs at least 10ms to auto-locate, and 96k and 192k sampling rates give you better location and imaging. 44.1k is not enough; at that sampling rate, the lens is still a little blurred.”
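To put numbers on the sampling rates Chesky mentions, the short sketch below works out the time between successive samples and the highest frequency each rate can represent (the Nyquist frequency, half the sampling rate). The function names are illustrative only, and the figures are plain arithmetic; on their own they do not settle whether the difference is audible.

```python
def sample_period_us(rate_hz: float) -> float:
    """Time between successive samples, in microseconds."""
    return 1_000_000 / rate_hz


def nyquist_khz(rate_hz: float) -> float:
    """Highest representable frequency (half the sampling rate), in kHz."""
    return rate_hz / 2 / 1000


for rate in (44_100, 96_000, 192_000):
    print(
        f"{rate / 1000:g}k: {sample_period_us(rate):.1f} microseconds per sample, "
        f"Nyquist {nyquist_khz(rate):.2f} kHz"
    )
```

At 44.1k the samples are roughly 22.7 microseconds apart; at 192k the gap shrinks to roughly 5.2 microseconds.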
IRATE OVER SAMPLES
The mind — and ears — may boggle a little at all this utopian talk, especially when introducing relatively new and obscure technologies like Choueiri’s BACCH 3D Sound and Diffuse Field EQ. Moreover, in championing 192k, Chesky also takes a position that is in some quarters regarded as controversial, as there are plenty of fairly credible papers and blogs doing the rounds that state that 24-bit (as a playback format) and sampling rates above 60k are overkill. In fact, some experts claim that 192k sampling rates can actually damage practical fidelity because of intermodulation distortion caused by non-linearity at high frequencies (for more, read Justin Colletti’s blog on sonicscoop.com called The Science of Sample Rates, and the credited paper by Lavry at lavryengineering.com/pdfs/lavry-sampling-theory.pdf).
The high sample rate debate has gone on for decades, at times degrading into flame wars with both sides taking up mutually exclusive positions. Instead of trying to single-handedly resolve the conflict, let’s bring things down to earth and see how these techniques panned out in practice with the recording of Macy Gray’s Stripped.
As with all other Chesky releases, David Chesky is credited as producer (in this case alongside his brother Norman). David explained how the project came into being, and what his role was: “I have been involved in the making of every record Chesky has released, because everything we do is my concept and my vision. Our entire binaural series is as well. However, I also like to give our artists the freedom to make the records they want to. In this case, it’s not a David Chesky record, it’s a Macy Gray record. All I’m doing is helping capture an artist in a space. I’m not trying to make a pop record or a hit record or trying to push an artist into doing something they don’t want to do. In fact, mainstream artists often come to us because they want to do something different and new and a little more creative. Macy picked the material, with some input from me, and then she rehearsed it with the band.”
From then on, freelance engineer Nicholas Prout was heavily involved. Prout has worked for Chesky Records since 1997, and has recorded, edited and mastered all Chesky Records releases for the last decade. He began life as a jazz drummer, studied at Boston’s Berklee College of Music, held staff engineering positions in Boston and New York, worked for a while as a mastering engineer at Foothill Digital, and currently also works at Chesky’s sister companies Manhattan Production Music (a music library) and HDTracks. In addition, Prout works on sessions unrelated to Chesky; notably, he has been the recording engineer of the Vermont Symphony Orchestra for the last 20 years.
“I think Macy and the band had two rehearsals in Manhattan,” recalled Prout, “during which they worked out the keys, roughed out the arrangements and got a general feel. David and I, and the entire Chesky recording team, were working on a different project earlier that week at The Hirsch Center, in Brooklyn, where we do most of our recordings [The other location is St. Paul’s Church, in Manhattan]. When Macy and the band came in to record, she knew what the songs were going to be, and they further refined the arrangements over the two days we recorded. They even created a song from scratch. On the second day, to everyone’s surprise, Macy announced we were going to make up a song. David suggested a groove, the band made up the music and she made up the lyrics, all on the spot. They did three takes, and in postproduction I put together the best performance I could, creating the final arrangement after the fact. This became the final song of the album, Lucy.”
According to Prout, the Chesky team normally spends a day setting up. The gear they use includes the B&K dummy head affectionately dubbed Lars, Crystal cables, and an MSB Technologies Platinum Studio A/D converter going into a Sonic Studio Pro Model 303 eight-channel AES/EBU interface made by Metric Halo. They then record a stereo 192k file into a MacBook Pro using the Sonic Studio recording software included with the interface. A Mytek 8X192 AD/DA converter going into Logic is used as a backup system, which also provides monitoring feeds to the headphones. So far, so simple, and totally in keeping with Chesky Records’ perfectionist credentials. However, the gear list also includes a PA system, a Beyerdynamic M160A mic, a Mackie desk and some effects units, like a delay. Are Chesky and Prout secretly cheating on their purist binaural approach?
“That mic has two functions,” explained Prout. “First of all it gives the vocalist a focal point, and we encourage him or her to work the microphone. Singers totally relate to that, because it’s what they do all the time. It also gives us a consistent position from the singer in relation to the B&K head. But the M160 is not in the recording chain. Instead it gets sent to the PA system that we set out in the church, with the speakers directed outwards, away from the singer — this is how we add more room reverb to the vocal. The Beyerdynamic is a hypercardioid mic, so you don’t get a lot of leakage from the rest of the band, and because we usually want a very present vocal sound, we tend to ask the singer to be very close to the binaural mic, which means the singer will sound very dry. For the same reasons I also sent a touch of Wallace Roney’s trumpet to the PA.
“Cranking the PA system adds more room reverb, if we want it, and I sometimes delay that signal by 120ms or so using a digital delay, to separate the reverb a bit from the dry signal. We want the PA to sound good, but we don’t go crazy with that gear. No super-expensive cables, for example. The church has beautiful acoustics, which we like to use, but not always. For example, there’s a hideous green rug on the floor, which warms and dampens the sound. But sometimes it causes the sound of an instrument to die, in which case we put three-quarter inch plywood under the player to get a livelier sound. On Macy’s record both the bass and the drums are on plywood.”
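As a rough illustration of what a 120ms delay on the PA feed means in digital terms, the sketch below converts the delay to a sample count and pads the front of a signal with silence. The sampling rate of the delay unit is not stated in the interview; 192k is used here purely as an assumption to match the recording rate, and the function names are hypothetical.

```python
def ms_to_samples(delay_ms: float, rate_hz: int) -> int:
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * rate_hz / 1000)


def delay_signal(samples: list[float], delay_ms: float, rate_hz: int) -> list[float]:
    """Delay a signal by prepending the equivalent amount of silence."""
    pad = ms_to_samples(delay_ms, rate_hz)
    return [0.0] * pad + list(samples)


# Assumed rate: 192k, matching the recording format, not a documented setting.
print(ms_to_samples(120, 192_000))
```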
REALITY CAN BE SPOOKY
“Because we have often worked at The Hirsch Center before, we know what it sounds like, and where to put the equipment to get the results we want on a project,” continued Prout. “What changes is where the musicians are positioned. In the past we had two eight-channel custom mono mixers with tube electronics and we used a Soundfield microphone to pick up as much as we could, and then added two or three spot microphones. The fewer microphones you have up, the purer your sound is going to be. We now only do binaural recordings and if you add any additional microphones the binaural effect collapses, so that was an adjustment for me. It really becomes a matter of: if you want some more guitar, move the guitarist closer to the microphone. With Macy’s record there were just three players, guitar, drums, double bass, and a singer, plus occasionally Wallace on trumpet, so that was relatively easy.
“Before the musicians arrive, David and I discuss how he wants the final product to sound, as far as where in the stereo field the instruments should be and how present, so when the musicians come in we get going pretty quickly. In this case, the band came in before Macy so they had some time to run through songs and we had time to get the balance we wanted. We monitor in a separate room, with headphones on, and the moment the musicians play, it is apparent right away what adjustments need to happen. The dummy head has rubber ears with B&K mics in the ear canals, and with headphones on we’re in the same position as the dummy head in the room. The realism that we experience while we’re listening back to the recording in that side room, with headphones on, is uncanny. If someone in the church stands behind the microphone and makes an unexpected sound, we all turn our heads. It’s so realistic, it’s spooky.
“We mix the sound while recording by placing the musicians in the stereo field. This has to do with balance, panning, and presence, and I will go into the church and talk to the players about these. It may be a matter of asking someone to play a little louder, or softer, or I’ll move a chair a little bit, because the imaging is not quite where I want it to be. If we want an instrument or singer to be really present, we place the musician or singer right next to the mic. With the Macy Gray album, we really wanted the bass player, Daryl Johns, to have punch and presence and immediacy, so he’s placed right next to the mic, in the right channel. Russell Malone, the guitarist, also was fairly close to the microphone on the left, but his amplifier was about 20 feet back, because I wanted some space on his sound. Although, because he was sitting close to the microphone and playing a semi-hollow body you hear a little bit of the natural acoustic sound of the guitar as well. Ari Hoenig played drums, and we wanted those to sound very present as well, so he’s pretty tight on the microphone too. Macy was closest to the dummy head, though, and the general rule for these sessions was: ‘If you can’t hear Macy, you’re playing too loud!’
“In general, the more experienced the musicians are, the more they like this way of recording, and the better they adapt to the situation and play with dynamics. They get it right away. All the players on Macy’s record had recorded for Chesky before so they knew what to expect, and they all rose to the challenge! To me that’s what music is: people playing together, listening to each other and reacting to what they hear. The musicians don’t use headphones while recording, but they do come in and listen through any of the many headphones we have lying around, and they can immediately hear how they are coming across. That also is a very important part of the process because usually if they hear the playback they know how to adjust their performance to what the microphones pick up.”
PURITY AT ALL COSTS
One item conspicuously missing from the above equipment list and descriptions is monitor speakers. Prout explained that it’s part of Chesky Records’ wholesale switch to the binaural recording approach, and keeping that associated mindset in focus: “We used to bring speakers, but not anymore. Because we are recording for people with headphones, that’s what we use while recording. Many headphone manufacturers are aware of what we’re doing, so they supply us with an array of different headphones that we and the musicians can listen to, from headphones costing $500 to one pair costing $5000. I personally listen to my custom-moulded Ultimate Ears in-ears. They are not the most comfortable in the world, but I feel they accurately represent what I’m recording. Some of the more expensive headphones sound amazing, but have very little to do with the truth of what we’re actually laying down.”
While Prout leaves the uber expensive headphones for the end listener, the rest of the recording chain is as high-end and expensive as it gets. The B&K Head And Torso Simulator (HATS) costs so much money that dealers ask customers to call in to get a quote — which will probably come in at around the AU$20,000 mark. From there the recorded signal goes from one of two mic pres (supplied with the HATS) via Crystal Cables (which start at AU$700 a pair and can cost up to a staggering AU$7000 per cable) to the MSB AD Converter, and so on. While a two-day, two-track recording process free of constant credit card payments to update software may sound comparatively cheap, the initial gear outlay is significant.
“Crystal cables are very expensive,” agreed Prout, “but they sound amazing. We made a recording in Sweden a couple of years ago of an orchestra playing three concertos by David Chesky. We brought our whole setup, including the Soundfield mic we used at the time, but one cable was not sent, a six-foot link in the chain, so I used a regular cable instead. When our Crystal cable arrived the next day, I switched the cables while standing behind the speakers (which we were using for this session), and even from that position the improvement in sound was dramatic. Every link in the chain is important. I really like B&K microphones, and the MSB converter is the best converter we have heard. Every step in the chain makes a big difference, and we always try to refine that.”
So far, so admirably purist. But there comes a moment when even Chesky Records needs to knock these pristine recordings into a shape that can be released. The post production is mostly performed by Prout, explained David Chesky, who likened his own role to that of a movie director. “Nick goes in and finds the best takes, and edits out the wrong notes. It’s like the director in a movie and an editor. The director gets all the takes in, and gives them to the editor, who then puts them together. Most of the time we try to use complete takes, but it is live, and if there are flaws Nick will go in and find something better from another take and insert that. His job is to make sure the product goes out sounding great based on the material we captured.”
“We usually record two or three versions of each song,” said Prout. “With Macy there were sometimes even fewer takes, because we didn’t want to burn her out and spend the whole day on a song. I go into the editing room at Chesky, load everything into Sonic Studio’s SoundBlade software, and create the best version of each song. With the amazing players we had on Macy’s record it was not so much a matter of fixing mistakes as simply including all the good stuff. There might have been a particularly exquisite bass fill in one take, or a better intro, or ending, and you say to yourself: ‘That has to be in there.’ When I get it to a stage I’m happy with, I send a sequence to David and the artist so they can listen and give feedback. I then make whatever changes are needed and that’s it. A couple of songs with Macy were shortened because we felt the soloing could be cut a bit.
“The next step is for me to apply the Diffuse Field EQ, and then master the project. Traditionally this involves balancing the tracks to each other. You don’t want the listener to be running to the volume controls for different tracks. The second aspect is getting it as hot as I can, without using compression. Overall we try to make it sound exactly the way it sounded the day we recorded it. I don’t normally apply EQ, unless for emergencies. I apply a high-pass filter at 20Hz to the recordings at The Hirsch Center, because it’s a very noisy environment. You can hear helicopters and aeroplanes, and the busy street and subway that are nearby, all competing with the rumble of the heating system.
“I also sometimes use EQ if a fix is absolutely needed. There was a record we did recently where there was something odd coming from the position of one of the singers. I dipped the mid-range just when he was singing, and only on one song. Equalisation degrades the sound. You’re altering the signal, and we try very hard not to do that. We want the listener to be as close to the original recordings as possible, which is why we have the ultra clean signal path, with very few stages between the microphone and the hard drive. I think you can hear that.”
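For readers curious what a 20Hz high-pass filter does to a 192k recording, here is a minimal sketch using a first-order (6dB per octave) filter. This is a simple stand-in chosen for illustration; the interview does not say which filter design or slope Prout uses, and the function name and defaults are assumptions.

```python
import math


def high_pass(samples, cutoff_hz=20.0, rate_hz=192_000):
    """First-order high-pass filter: attenuates content below cutoff_hz.

    Illustrative only; not the filter used in the actual mastering chain.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # filter time constant
    dt = 1.0 / rate_hz                      # time between samples
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


# A constant (0Hz) offset is removed, while a 1kHz tone passes almost untouched.
offset = high_pass([1.0] * 20_000)
tone = high_pass([math.sin(2 * math.pi * 1000 * n / 192_000) for n in range(19_200)])
print(abs(offset[-1]) < 1e-4, max(tone[-1920:]) > 0.99)
```

With a cutoff this low, only subsonic rumble of the kind Prout describes is affected; the audible band is left essentially as recorded.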
Stripped undoubtedly sounds wonderful, and it’s great to hear a modern recording with its entire dynamic range intact. The remaining question is whether end listeners can hear that the entire project was recorded in 24-bit/192k. David Chesky stated that his target audience “is a sophisticated person, with a good stereo who wants to appreciate good acoustic music,” and argued that this person would be able to hear the difference between the 192k download version of the album (which is a whopping 3.3GB), and the CD version. “Yes, if you are playing back on a good system, the 192k version will sound a lot better.”
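The size of that download follows from simple arithmetic. The sketch below assumes uncompressed stereo PCM with no file-format overhead and takes 1GB as one billion bytes; a losslessly compressed download would hold more music in the same space. The function names are illustrative.

```python
def pcm_bytes_per_second(rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Raw data rate of uncompressed PCM audio."""
    return rate_hz * (bit_depth // 8) * channels


def minutes_in(size_gb: float, rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Minutes of uncompressed audio that fit in size_gb (1GB = 10**9 bytes)."""
    seconds = size_gb * 1_000_000_000 / pcm_bytes_per_second(rate_hz, bit_depth, channels)
    return seconds / 60


hd = pcm_bytes_per_second(192_000, 24)  # 24-bit/192k stereo
cd = pcm_bytes_per_second(44_100, 16)   # CD-quality stereo
print(f"24-bit/192k: {hd} bytes per second, {hd / cd:.2f} times the CD rate")
print(f"3.3GB holds about {minutes_in(3.3, 192_000, 24):.1f} minutes at 24-bit/192k")
```

Under those assumptions, 24-bit/192k stereo runs at about 1.15MB per second, roughly six and a half times the CD data rate.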
I put this to the test but sadly, perhaps after several decades of rock ’n’ roll abuse, my ears could not spot the difference between the two versions. However, my 14-year-old son, with young and completely unspoilt ears, picked out the HD version three times in a row during blind testing, each time within seconds. He used words like ‘smoother high’, and ‘more detail in the bass’, terms I’d never discussed with him before, so words had hardly been put into his mouth.
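How much weight can three correct picks carry? The sketch below computes the odds of getting at least that many right by pure guessing. It assumes each trial was a straight choice between two versions with a 50 per cent chance of a lucky guess; the exact protocol of this informal test is not described here, so treat the numbers as a guide only.

```python
from math import comb


def chance_of_at_least(hits: int, trials: int, p_guess: float = 0.5) -> float:
    """Probability of scoring at least `hits` out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(hits, trials + 1)
    )


print(chance_of_at_least(3, 3))    # three out of three
print(chance_of_at_least(10, 10))  # ten out of ten, for comparison
```

Three out of three happens by luck one time in eight, so a longer run of trials would be needed before drawing firm conclusions either way.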
This obviously opens several cans of worms. Detractors claim it’s impossible to hear what my son heard, so how could he? Moreover, Chesky’s target audience is more likely in the 30+ range, and certainly not young teenagers. If anyone can hear the benefits of HD audio, it will be the younger generation: the same generation that has grown up largely ignorant of and indifferent to high-fidelity audio, and may be ruining its ears with relentless high-volume headphone abuse.
The HD controversy undoubtedly will rage on. Meanwhile, perhaps it’s best to simply listen to that Macy Gray album, and enjoy the music as it envelops you — whether you’re listening to it on headphones or not.