Why are Tubes Better at Soundstaging & Imaging (or Are They...)?

A few nights ago I wrote a…let’s call it a ‘treatise’…comparing a little ChiFi tube buffer with the Darkvoice 336SE. In that post, I remarked that one of the oft-cited advantages of tubes is a spatial presentation (soundstaging & imaging) that often makes tube amps/preamps sound more “holographic” than their solid state counterparts. My own testing has confirmed such a phenomenon with relatively inexpensive gear. Don’t worry, I’ll do my best to keep this post shorter.

It’s uncontroversial to say that tubes distort the incoming electrical signal more than their solid state counterparts. Why then, if tubes mess with the signal so much more, are they able to create a better spatial presentation? I do NOT have a definitive answer to this question. So this post is both a place to throw out some hypotheses and an attempt to generate a discussion. I would really like to hear your ideas on this question. I came up with a hypothesis, tested it as best I could, and got an inconclusive result. I’ll share that here and hopefully you all will run with it.

Background:

First, to understand my forthcoming hypothesis, I need to define “distortion.” I’m going to take a fairly basic definition here. In the context of an electrical waveform it simply means to change the shape of that waveform. Electronic components can do this in a variety of ways, but that’s going to be the definition I run with here. Feel free to pick it apart - but please be clear why. Harmonic distortion (the HD in THD) is then the addition of new content at whole-number multiples of the fundamental frequency - harmonic waves riding along with the fundamental oscillation that change the shape of the composite waveform.
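
To make that definition concrete, here’s a minimal Python sketch (the transfer-function coefficients are made-up illustration values, not a model of any real amp) that runs a clean 1 kHz sine through a gentle nonlinearity and estimates how much energy ends up on the newly created harmonics - a rough THD figure:

```python
import numpy as np

fs = 48_000                       # sample rate (Hz)
f0 = 1_000                        # fundamental frequency (Hz)
t = np.arange(fs) / fs            # one second of time samples
x = np.sin(2 * np.pi * f0 * t)    # the clean, undistorted waveform

# A made-up, gently nonlinear transfer function standing in for an amp stage.
# Anything other than y = k * x changes the waveform's shape, i.e. distorts it.
y = x + 0.05 * x**2 + 0.01 * x**3

# The nonlinearity puts new energy at 2*f0, 3*f0, ... : harmonic distortion.
window = np.hanning(len(y))
spectrum = np.abs(np.fft.rfft(y * window))
freqs = np.fft.rfftfreq(len(y), d=1 / fs)

def amp_at(f):
    """Amplitude of the spectral bin nearest frequency f."""
    return spectrum[np.argmin(np.abs(freqs - f))]

fundamental = amp_at(f0)
harmonics = [amp_at(k * f0) for k in range(2, 6)]
thd = np.sqrt(sum(h**2 for h in harmonics)) / fundamental
print(f"THD = {100 * thd:.2f} %")   # a few percent with these made-up coefficients
```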

Second, let’s talk about human hearing and how our auditory systems localize sounds. One of the primary ways (there are others) humans localize sound is through differences in pitch and intensity and the time delay between when each ear perceives the sound. If a sound occurs to the left of our head, our left ear hears it first. Then, some small amount of time later, our right ear hears the same sound at a slightly lower pitch and intensity. The lower pitch is caused by the sound wave 1) diffracting around our head and 2) some of the energy from the compression wave passing through the matter of our heads, producing a refracted waveform. Our brains take over and interpret those pitch and intensity changes and time delays to give us an idea of where the sound is occurring (this is a type of pattern recognition, which I’ve also written some about here at HiFi Guides - sorry, I’m a hopeless academic). In this example, what our right ear hears is technically a distortion of what the left ear heard - the waveform has changed between the left ear and the right ear.
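
For a sense of scale on the time-delay part of that, here’s a small Python sketch of Woodworth’s classic spherical-head approximation for interaural time difference (the head radius and speed of sound are assumed round numbers, and this model ignores the pitch and intensity cues entirely):

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s, roughly at room temperature
HEAD_RADIUS = 0.0875     # m, a commonly assumed average head radius

def itd_woodworth(azimuth_deg):
    """Interaural time difference (s) from Woodworth's spherical-head model.

    ITD = (a / c) * (theta + sin(theta)), with theta the source azimuth
    measured from straight ahead; a reasonable approximation for distant
    sources and azimuths between 0 and 90 degrees.
    """
    theta = np.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + np.sin(theta))

for az in (0, 30, 60, 90):
    print(f"{az:2d} deg off-center -> ITD = {itd_woodworth(az) * 1e6:4.0f} us")
# Even the maximum delay is well under a millisecond, yet the brain resolves
# it; the intensity and spectral changes from head shadowing are the other cues.
```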

An Initial Hypothesis:

Since we can make the case that our localization ability relies on some degree of sound-wave distortion, maybe it’s reasonable to think that the pursuit of vanishing levels of distortion in our audio gear is having an adverse effect on the spatial presentation of that gear. For example, our own @M0N was the first on this forum to point out that THX amps do not have the spatial presentation chops that the Rupert Neve RNHP has. Many of our fellow HiFiGuides (we need a demonym here) members have echoed his sentiments. According to Amir at Audio Science Review, the RNHP has much more distortion than the Drop 789 or the Monolith 887. The Darkvoice 336SE measures far higher still. These distortion measurements and the nature of human-hearing localization got me wondering if some amount of some type of distortion is helpful in creating sonic space. Is there a chance that some of the distortion introduced by tubes (or the type of distortion in the RNHP and other solid state devices that image well) mimics the distortion of the sound wave that happens because our heads are physically in the way of it? My working hypothesis was, more or less: yes, that.

There is a lot to unpack and measure here. I admit to limited knowledge of the ins and outs of total harmonic distortion and noise, SINAD, etc. I also don’t know enough to say what kind of distortion would mimic the diffraction caused by our heads being in the way of a wave. There are also lots of different kinds and sources of distortion that get collapsed into the singular THD and SINAD measurements. So again, I appeal to the community to fill in some of the inevitable holes in my knowledge and thinking. There is also a limited body of research showing that tubes distort oh-so-good, but that research does not mention spatial presentation at all, and that’s what I’m interested in here.

The best test I can come up with, given the equipment I have access to, is to use binaural recordings and compare their spatial presentations on tube and solid state gear. There are different methods of making binaural recordings, but the one I’ll use is the type that uses a dummy head with microphones placed right at (or in) the openings of the dummy’s ear canals. When recording a live performance, this microphone setup will, to some degree, capture the subtle pitch and intensity changes that happen to the sound waves as they bend around the head. Traditional recording techniques either use 2 microphones spread several feet apart or use a mixing console to pan certain signals to one channel or the other. These traditional techniques capture or approximate 2 of the 3 ways our auditory systems process localization: time delay and intensity difference at the two microphone positions (in-room speakers would presumably provide the third). Binaural recordings hard-record the diffraction-caused pitch shifts. A super-clean, i.e. low-distortion, amplifier should therefore equal or surpass a higher-distortion amplifier, like a tube amp, in spatial quality on binaural recordings. The low-distortion amp will more accurately reproduce those recorded distortions instead of introducing its own. A corollary, in alignment with the theory that zero distortion is best, is that a high-distortion amplifier would NOT be able to produce as good a spatial presentation as a low-distortion amplifier on a binaural recording - the ‘excess’ distortion would further damage the signal and possibly collapse the sonic image. Let’s test it…
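
Before getting to the listening test, here’s a toy Python sketch of the cue difference I’m describing: a pan-pot placement (level difference only) versus a crude binaural-style placement (level difference plus an interaural delay). The gain and delay values are arbitrary, and the sketch deliberately leaves out the diffraction-caused spectral changes a real dummy head would capture:

```python
import numpy as np

fs = 48_000
t = np.arange(int(0.5 * fs)) / fs
mono = np.sin(2 * np.pi * 440 * t)      # a mono source we want to "place" left of center

# Traditional pan-pot placement: level difference only, identical timing in both channels.
left_pan, right_pan = 1.0 * mono, 0.5 * mono

# Toy binaural-style placement: level difference plus an interaural delay.
# A real dummy-head recording would also bake in the diffraction-caused
# spectral changes, which this sketch deliberately ignores.
itd = int(0.0004 * fs)                  # ~400 us delay, plausible for a source well off-center
left_bin = 1.0 * mono
right_bin = 0.5 * np.concatenate([np.zeros(itd), mono[:-itd]])

stereo_panned = np.stack([left_pan, right_pan], axis=1)
stereo_binaural = np.stack([left_bin, right_bin], axis=1)
# Export both (e.g. with soundfile.write) and listen on headphones to hear
# how much of the sense of placement each cue contributes.
```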

Method:

Some time ago I downloaded a binaural music sampler from HD Tracks [Hey! It’s FREE right now!] and almost forgot about it until now. For one iteration of the test I split the SE output on my SMSL SU-8 DAC, sent one signal through my Douk Audio Dz tube buffer with GE JAN5654W tubes and then into a JDS Labs Atom amp, while the other signal went directly to the Atom. In the second round of the test I sent the SU-8’s balanced signal to the SMSL SP200 amp, and the unbalanced output through the tube buffer and into the RCA input of the SP200. I then played those binaural tracks and switched back and forth, paying particular attention to the soundstage and imaging. In the comparison post linked above, I discussed how the spatial presentation of the tube buffer bettered both of these pure solid state solutions. I used Beyerdynamic DT880-600 ohm headphones as they are renowned for their spatial prowess at their price point.

Results:

Definitively, the imaging and soundstage from both the Atom and the SP200 equalled what the buffer did on these binaural recordings. The buffer still provided a warmer, more relaxed sound, but there was no longer any audible difference in the quality of imaging or soundstage. This result also means that the buffer’s spatial presentation, with its much higher distortion, was also not WORSE than that of the pure solid state presentations.

Discussion:

While the disappearance of the difference in spatial presentation between tube and solid state was definitive, the results here are not. This simple test does not show that the distortion a tube injects mimics the real-life, between-the-ears distortion of sound waves. This result does, however, provide some partial support to the hypothesis that the tube-caused distortion is helpful for spatial presentation. And really, that’s all I can conclude here.

I remind you all that due to my limited knowledge of the engineering side here, all of this might be flawed, and possibly flawed enough to be a pile of bovine feces. For example, a tube preamp/amp is going to distort the sound in both channels of a stereo setup in probably the same way and amount, whereas in real life only one ear hears the distorted version of the sound. It’s possible this issue is mitigated by the fact that stereo effects are created with time delays and differences in intensity, so the same amount of distortion in both channels creates a proportionally different impact on what each ear perceives, partially mimicking the IRL pitch change. But, I can say with confidence that on my gear, binaural recordings produced no difference in imaging or soundstage quality between tube and solid state.

So what do you all think? What’s going on here?

(was that shorter? :crazy_face:)

I would say that comes down more to the topology of the thx, with its feed-forward design causing this, than to the distortion. Of course the distortion also plays a role, but I would say the thx feed-forward might be to blame here imo (of course that’s a guess, not good with electronics whatsoever lol) Edit: also I am not for one specific amp over thx here, I just think you can get the same or better performance than thx for a lower price lol

Personally I have solid state gear that can perform just as well as tubes can for spatial recreation, it’s just different. For me at least, I don’t think I would call one better

Also something to explore in the future might be r2r or multibit dacs, as imo they typically recreate space better than their sigma delta counterparts (just my opinion though of course) at the lower price points, but in the higher end both can do an equally good job

Something else to consider with this whole test is that it’s done in the budget realm of tube electronics; high end tube designs sound very different than the cheaper stuff. Also, there are a ton of both tube and solid state designs, so it would be pretty hard to narrow down what measurement might affect spatial recreation with all the potential variables

Interesting thoughts though, personally I don’t involve myself in the science of how stuff is done (as that ruins the fun for me in this hobby at least lol)

Agreed that the gear here is on the cheaper end and that is a limitation. That was also one reason I did my best not to come to hard conclusions and leave things in the hypothesis stage.

My understanding of r2r dacs is that they are also a bit higher in distortion? Or am I not understanding that one correctly?

That is typically true, they don’t measure as well as their sigma delta counterparts

Just to throw more confusion into the mix. Tubes are considered to provide more linear amplification than SS devices.

Tube amplifier topologies are often simpler, and commonly rely on fewer, higher quality, often more expensive components. Power supplies generally have to be of higher quality for tube components. Often the highest level designs do everything they can to remove extraneous capacitors and resistors from the audio chain.

Typical tube amplifier distortion is 2nd harmonic, which is usually considered more pleasing than odd-harmonic distortion. There is also considered to be a difference between out-of-phase and in-phase distortion.

The transformers in transformer-coupled designs often have as much or more effect on the sound than the tubes do.

Output impedances on tube amps are generally higher, resulting in a lesser ability to control a given transducer than SS devices.

And finally, higher end components will generally tend to exhibit better spatial recreation and staging, whether tube or SS.

Many of these points don’t apply to your cheap tube buffer.

Amir at ASR specifically calls out the 2nd order harmonic distortion of the RNHP as the reason why its distortion measurements are “high.” That statement and measurement, combined with the RNHP thread on this forum, is one thing that got me thinking down this road.

Thanks for your input.

FWIW, if I were guessing, I think what people perceive as a wider stage from lower end tube devices has mostly to do with the fact that they are mostly simple low- or zero-feedback circuits, with correspondingly higher output impedances, and that probably results in some sort of ringing that the brain correlates to space.

I.e. I don’t think it has anything to do with noise on a frequency response graph.

On higher end designs, I suspect it’s actually different.

Not sure I follow what you’re saying here…

You’re right to bring up what the brain may or may not be perceiving, though. I’ve been looking high and low for what creates effective spatial presentation in audio and there is precious little out there. This is the best I’ve come up with so far and I’m sure it’s not a very good explanation.

Perhaps because it’s not really feasible to link it to only a few features or measurements? So it’s really up to the listener to judge, and not many objective facts exist around it?

Yes, that’s a big part of it. But that means there’s something missing between what’s in the signal and what our brains are perceiving. We can measure electrical signals pretty well. We can measure frequencies and match different shaped frequency response curves to different sound perceptions like “warm” or “bright.” But we haven’t figured out the spatial part yet. I want to know why, and it has me flummoxed.

But also, if some electronics company figures it out, they could make a killing.

High impedance output devices struggle to “control” downstream devices; the ratio of load to output impedance is called the damping factor.
You’d see the effect on an impulse response graph or a waterfall plot.
I’m suggesting that it imparts a quality that the brain interprets like reverb, which makes the space sound bigger.
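
A minimal sketch of that arithmetic, with purely illustrative impedance numbers (not measurements of any specific amp or headphone):

```python
import math

def damping_factor(z_load, z_out):
    """Damping factor = load impedance / amplifier output impedance."""
    return z_load / z_out

def level_swing_db(z_nominal, z_peak, z_out):
    """Level change (dB) at the headphone when its impedance swings from
    z_nominal to z_peak, caused by the output-impedance voltage divider."""
    v_nom = z_nominal / (z_nominal + z_out)
    v_peak = z_peak / (z_peak + z_out)
    return 20 * math.log10(v_peak / v_nom)

# Illustrative guesses: a low-impedance SS output vs a high-impedance OTL tube
# output, driving a 300-ohm headphone whose impedance peaks around 600 ohms.
for label, z_out in (("solid state", 0.5), ("OTL tube", 30.0)):
    print(f"{label:11s}: DF = {damping_factor(300, z_out):5.0f}, "
          f"response swing = {level_swing_db(300, 600, z_out):.2f} dB")
```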

OK, that makes sense. Thanks. That could partially explain a perception of bigger soundstage, but what about imaging? To my ear, the buffer and the Darkvoice also do a better job of positioning individual sounds than the pure solid state devices.

Are you referring to separation?

Hard to guess. I suspect that is directly related to the 2nd harmonic distortion; that’s relatively easy to construct an experiment for by artificially adding it into a signal - the various tube simulators basically do this.
But again this could be related to the impulse response.
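
If anyone wants to try that cheaply, here’s a hedged Python sketch of the “artificially add 2nd harmonic” idea - the amount is an arbitrary knob and the filenames are hypothetical; this is not a model of any particular tube stage:

```python
import numpy as np
import soundfile as sf   # assumed available for reading/writing WAV files

def add_second_harmonic(x, amount=0.05):
    """Add even-order (mostly 2nd harmonic) distortion via a squared term.

    `amount` is an arbitrary knob; a few percent is in the ballpark of what
    budget tube stages measure, but this is NOT a model of any real amp.
    """
    y = x + amount * x**2
    y = y - np.mean(y, axis=0)        # the squared term adds a DC offset; remove it
    return y / np.max(np.abs(y))      # normalize to avoid clipping on export

audio, fs = sf.read("original.wav")   # hypothetical input file
sf.write("second_harmonic.wav", add_second_harmonic(audio), fs)
# A/B the two files blind and see whether the doctored one images differently.
```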

That’s my assumption.

I had to look up “separation” to answer this. I understand the terms this way:

soundstage - the perceived width, depth, and height (if you’re really lucky) of the space in which sound is happening. That space is called the “soundfield”, AFAIK.

imaging - the ability to create a perceived location of a specific sound within the soundstage (eg the violin sounds like it’s 2 feet left of center and 6 feet back from the plane of the speakers)

separation - there seem to be a couple of definitions: 1) almost the ‘negative’ of the imaging definition just given, the ability to create space between perceived sound source locations, thus separating them, and 2) the difference between the right and left channels - the farther apart (or the larger the angle between) the left and right speakers, the more stereo separation there is.

My question refers to what I call imaging here; the perceived location of a sound source within the soundstage.

Hi all,

sorry to necrothread this - I’ve just been going through the exact same thought process that WaveTheory did and my googling led me here

My journey down this path started when I went from a FiiO K3 --> RME ADI-2 DAC

It was cleaner for sure, but shortly after switching I realized I lost “something” engaging from the music. It’s taken me months to figure out what that was, and I believe what I’m missing is the exact subject that WaveTheory is writing about. To me it feels like what people refer to as “holographic” sound.

I found that sound again (much stronger) by hooking up the RME to a Garage1217 Project Sunrise, but I further discovered that not all headphones react to this. Some don’t get that “holographic” sound no matter what dac/amp I use. My current favorite in this regard is the Fostex TR-X00 on the Project Sunrise.

I’m under the impression that what we are hearing (or “feeling” ?) and loving is either harmonic distortion, or as Polygonhell pointed out, possibly “Ringing”

How can we put together a low-cost effort to test these theories out?

Can we build cheap amps that exhibit high harmonic distortion but low ringing, and other cheap amps that have high ringing but low harmonic distortion, then compare?

Or is there a list of low-cost / used amps that are known for their “holographic” sound properties?
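
On the “high harmonic distortion but low ringing” versus “high ringing but low harmonic distortion” comparison: rather than building amps, both effects could be faked in software first and A/B’d blind. Here’s a sketch of the ringing half (the resonance frequency, pole radius, mix level, and filenames are all arbitrary assumptions, not modeled on any real amplifier); pair it with the second-harmonic sketch earlier in the thread for the other half:

```python
import numpy as np
from scipy.signal import lfilter
import soundfile as sf                   # assumed available for WAV I/O

def add_ringing(x, fs, f_res=8_000, r=0.995, mix=0.01):
    """Mix in an under-damped two-pole resonator as a crude stand-in for 'ringing'.

    f_res (Hz), r (pole radius; closer to 1 rings longer) and mix are arbitrary
    listening-test knobs, not derived from any real amplifier's behaviour.
    """
    w = 2 * np.pi * f_res / fs
    ringer = lfilter([1.0], [1.0, -2 * r * np.cos(w), r * r], x, axis=0)
    y = x + mix * ringer / np.max(np.abs(ringer))
    return y / np.max(np.abs(y))         # normalize to avoid clipping on export

audio, fs = sf.read("test_track.wav")    # hypothetical input file
sf.write("ringing_only.wav", add_ringing(audio, fs), fs)
# Pair this with the second-harmonic sketch earlier in the thread to get the
# "distortion but no ringing" file, then compare the two blind.
```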

WT…any way you could provide a TLDR?

posts like yours overwhelm me. :frowning:

FWIW I’ve changed my mind on this, I think it has far more to do with an amplifier’s ability to reproduce small signals while also reproducing large ones.
I suspect high feedback amplifiers are less conducive to this, which is why it ends up being associated with Tube amps in moderately priced gear.
It could be the even harmonic distortion, or some other factor, but I don’t think it is.
It’s also not unique to tube amps, some higher end SS amps can produce similarly expansive staging.

My thinking is also evolving. Even the Asgard 3 does a really solid job with spatial recreation; it rivals the Darkvoice in that regard. Getting to the bottom of soundstage and imaging is definitely tough. There aren’t measurements to give insight on it either. It does seem like measurement-focused gear struggles with it though. But I don’t know why.