How should you approach reviewing and judging a studio/reference headphone? How can you tell if something really is "reference-grade"?

I finally got on with my review. I am taking a very particular approach: I am not judging the headphone by listening to music or anything of that sort, since that would make it subjective. I decided to test it technically.

I think what you wanted to say is that there is no universal “flat” for our ears, but technically there is “flat”. It would be a frequency response that doesn’t have boosted frequencies and is as flat as possible - we all know perfectly flat is not achievable. How do you test if something is technically flat? Measurements.
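
For what it's worth, here is a minimal sketch of how "flat" could be turned into a number, assuming you already have measured frequencies and their levels in dB. The function name `band_deviation_db` and the data points are made up for illustration; they are not from any real rig or from any real headphone.

```python
def band_deviation_db(freqs_hz, levels_db, lo_hz, hi_hz):
    """Return the spread (max - min, in dB) of the points inside [lo_hz, hi_hz]."""
    in_band = [lvl for f, lvl in zip(freqs_hz, levels_db) if lo_hz <= f <= hi_hz]
    if not in_band:
        raise ValueError("no measurement points inside the requested band")
    return max(in_band) - min(in_band)


# Hypothetical example data, NOT a real measurement of any headphone.
freqs_hz = [20, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 20000]
levels_db = [-6.0, -3.0, -1.0, 0.0, 0.5, 0.0, -0.5, 1.0, -2.0, -8.0]

midrange = band_deviation_db(freqs_hz, levels_db, 200, 5000)
full_range = band_deviation_db(freqs_hz, levels_db, 20, 20000)
print(f"mid-range spread: {midrange} dB, full-range spread: {full_range} dB")
```

A small spread inside a band is one way to say "flat in that band". With data like this you could have a flat mid-range and still see a roll-off at both ends of the full range.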

Now, there is controversy in measurements as well. If you use a G.R.A.S. measurement system that has one of those heads with ears… you cannot really use that as a standard because everybody’s ears are different. I concluded that the most objective way to test whether a headphone has a flat frequency response is to put measurement microphones in front of the headphones with as few obstacles as possible. I had a chance to visit a friend of mine and use his measurement system. He didn’t tell me exactly which mics he used, only that they were pricey, but the setup was pretty sensible. Ideally I would like to know which mics they were. He used two different software packages (I also didn’t ask which ones they were).

Instead of an ear, there was a circular mounting system with no obstacles that would cause reflections (other than the headphones themselves, like their ear-pads for example), so it was as technically accurate as it gets (unless you have a more expensive rig I guess…).

From my findings, the Ollo S4X had what I could safely call a “perfectly” flat response, at least in the mid-range. Listening to it, I did find that it lacks the deepest frequencies, and I also didn’t find it to reach the highest notes (to my ears).

I think this is the best way to make a flat headphone. You are not making it for a certain ear, you are making sure that technically it is what it claims to be, and sure as hell it proved itself to have a flat frequency response. This means that the subjective part will be the fact that different people will interpret this flat frequency response differently, but that flat frequency response stays objective (I believe). Maybe our sight could be compared in a similar way: we don’t really know how our perception of vision differs, but we all call “red” red, and a “table” a table. It’s not really easy to find comparisons, maybe somebody can contribute some so it is easier to understand this.

My conclusion is to make a headphone technically correct, if a user wants to trust the technical side of things, that would be the standard. However, if he wants a “flat” or “reference-grade” response for himself, he would need to have a headphone made ONLY for himself, this is the only sensible conclusion I made. A manufacturer cannot make a truly reference-grade headphone for everybody because he doesn’t know whose ears will be listening to it. But what the manufacturer can do is to make sure that the product is technically correct and “doesn’t” have coloration, this would be a flat frequency response - however, it WILL be interpreted differently by everyone, but this is the only solution that makes sense.

What does everybody think?


I spent nearly 20 years deeply involved in digital photography. In that world, as in television, tech exists to calibrate/profile an imaging device to be accurate in several parameters, including brightness, hue and saturation. To me, the world of audio remains a bizarre Alice-in-Wonderland world of subjectivity run rampant. Clearly, it’s a fun sandbox for hobbyists to play in, since anything goes and there are never-ending justifications to buy the next new toy.

But in the world of digital visual, having reference accuracy equipment doesn’t in the least impede content creators from cranking up the saturation (colour vividness) to crayon box levels, nor from using a vast tool kit of visual effects. Consumers remain free to tune the colour on their TVs to any psychedelic coloration they like. In fact, someone could probably make a tidy little fortune by inventing a device that randomly faded one video coloration into another at random intervals.

Personally, I’m all in favour of anyone contributing to the pursuit of reproducible accuracy in audio. It’s not as if having the option for accuracy would force anyone to use it, after all. So have at it, Voja. Go for it. I just happen to be doubtful how much pick-up you’ll get for this approach on this forum.

(BTW, it’s not as if extensive work along this line isn’t underway. A key element of the headphone audio calibration chain is called a head-related transfer function, or HRTF. This characterizes how one individual’s head and outer ears shape the sound reaching the eardrum, which differs from any one individual to any other. If there were such a thing as audio calibration and profiling, each user would hear the digital signal modified to match his HRTF as part of the signal chain.)


Heh, yup. I’m writing a small review about a DAC for fun and I wrote in it “my ears are not Harman neutral”.

I am not trying to get any type of fame or praise from the community. I am doing this for myself. As a perfectionist, I care about the quality of my content. That being said, if I am reviewing a “flat frequency” headphone, I want to make sure I approach it correctly, otherwise I am not meeting my own standards. A handful of times people have been against my approach to reviewing, even though they consume and support people who don’t put any effort into making their reviews more objective (you will find this everywhere: people describe the sound performance of a product with only words, without any reference songs - this makes it impossible for the reader to know why this person thinks that a product (e.g.) sounds “bright”, “piercing”, “muddy”, and all the other audiophile terms used). In short, I just want to satisfy myself by making sure I am not following the sheep and that, at least in a sensible way, I can test whether a product is what it claims to be.

By the way, I am still trying to understand what HRTF is and whether it matters for headphones. I haven’t had much success so far.

But my conclusion is, if a headphone has a flat frequency response on one of these microphone measurement setups, then the rest is subjective to the user - to their perception and interpretation of sound - but at the end of the day, it could be used as a standard.

Of course, as we discussed before, if a person adapts and knows a product well, this standard is meaningless to them, they can achieve a great result with this product because they know how it behaves.
If a person has professionally learned to use a product, they know this product very well, and they can often achieve more professional results than a non-professional would with this 400 euro headphone or anything similar. Billie Eilish’s brother was mentioned earlier - a person who has the money and buys 30,000 USD reference-grade studio monitors and idk what other expensive professional studio equipment can still end up with a worse result than what Billie’s brother got with his setup.

But my point is that if this headphone costs 400 euros and claims to be reference-grade, at least I want to know if that is true - and it proved itself to be true. Somebody who wants an uncolored headphone with a flat frequency response can invest their money, and they will get what the manufacturer claimed the product to have. Whether it will suit them, that’s the subjective part. Some people make their ears bleed with some Beyerdynamic headphones or Yamaha studio monitors only because that is the “industry standard”, while others find a product that suits them and that they enjoy more, and in the end achieve greater results. It is all too subjective to make it simple, but I decided that it would be useless if I listened to music on these Ollo S4X headphones. They are meant to be used as a tool for making music, not necessarily listening to music. This is why I created this thread, to come to a conclusion that makes sense, and I think we concluded that there are too many factors that play a role in the way we hear sound. This is why I turned to the technical side, the “robotic” side that doesn’t face as many challenges as our ears do. Measurement microphones can vary too, but this is as close as you get to objectively judging the sound of a product (I think).

There IS a potentially interesting conversation around this, but it’s complicated.

To have reference devices you need to be able to define and measure reference. Defining a reference FR might be one part of that, but it’s not all of it.

I have yet to see a set of headphone measurements that captures the “technicalities”; FR certainly doesn’t do it. Taking 2 headphones I own, the D8000 Pro resolves detail vastly better than an HD650, it’s night and day. Point to the measurement that shows that.

I think there is a closer analogy to audio in video (not film) reproduction than in cameras.
When I worked in games for CRTs back in the early 2000s we would have 2 monitors on a desk: a multi-thousand-dollar 13 inch Sony reference monitor (used for broadcast TV), and the shittiest TV we could find, to see what an end user would see. The output on the reference monitor really didn’t matter, it was there for checks and balances; in the end you had to be able to read the text and play the game on the crappy TV.

I could look at either display and know what to expect on the other. I believe this is what most sound engineers do: they use headphones (or more likely studio monitors) they are familiar with that don’t have egregious issues, and they understand how that sound will change with various real world listening options.

Even if tomorrow you could define and produce a perfect set of measurements for a reference headphone, there would still be a century’s worth of recordings that weren’t mastered using them, so I’m not sure they would be interesting for consuming, rather than mastering.

And I’m not getting into what the rest of the mastering chain sounds like.


This would again be subjective. It is you to whom it sounds like that. And of course, there are multiple sound characteristics that you cannot see from a frequency response chart. Imagine you give these two exact headphones (and the sources that you use) to other people, and some of them say the exact opposite; then you run into the whole problem we are discussing here. I think that to make a reference headphone, you must not use human ears to make it, you must use microphones and digital software - they don’t have the obstacles and challenges that come from the subjective nature of our ears. This is one of the reasons why I think G.R.A.S. measuring systems with ears are kind of inaccurate - even if those ears are the average, you are testing the headphones on ears that not everybody has. If you strip it down to just the microphones and a plate that you can put the headphones against (with ear-pads), you have yourself the least biased setup. The setup with ears kind of throws away the advantage of microphones and measuring software: you are building in the same subjective imperfection that is present in our ears… which is the exact thing we are trying to run away from to be objective.

It is a limitation in my opinion. It is like limiting a robot to rotating its arm only to a certain extent when you could very well let it rotate its arm 360˚. This is the simplest example I could come up with that would make sense to the average reader: if you limit the microphone measuring system with a pair of artificial ears, you are going against the grain, you are introducing an imperfection (it doesn’t matter if they took 10, 10 000, or a million ears and based that fake ear on the average one; from the video @MaynardGK attached, you can see why this is a big problem).

I think you would get pretty universal agreement from trained listeners.
So it’s not subjective, just not captured in measurements.

This is the fundamental mistake measurement focused sites make: they assume the measurements define the sound rather than trying to understand what people are “hearing” with the measurements.
The former is belligerence, the latter is science.

And yes, absolutely there are placebo effects and biases from human experience, but that does not mean you just dismiss them.


I am personally not fond of measurement focused sites or reviewers. You will see that in no review of mine do I approach it with a measurement graph or anything of that type. I listen to music and do my best to explain what I am hearing, where, and in what quantity and quality.

This is an exception, as you can see… it took creating a whole thread for me to come to a conclusion as to how to approach the review in the most objective way.

The argument can go on forever - for example, how biased are these trained listeners? How healthy are their ears? Sometimes, what I would call “virgin ears” are of much higher value than these trained ears, because they are completely unbiased - but they are not trained to pick up the nuances that people otherwise wouldn’t hear… and then again, a normal listener won’t pick up these nuances, so of how much value are the “trained ears”? There are many things anybody can bring up for this argument. I personally am not one to use science and graphs in my reviews; I listen to how the headphone sounds to me, so these reviews are completely subjective, but I do my best to make the review useful to the reader (thanks to me using reference tracks, they can play them and listen for the thing I was talking about, which lets them make a comparison).

For example, I first listened to the S4X, then when I saw the measurement graph, it matched what I thought. I personally couldn’t find a way to test how neutral a headphone is, I am not trained for that, so I went to the graphs - but I did notice that it does lack in the bottom end and doesn’t extend to the highest highs. A similar concept applies to studio monitors - if you have a monitor with lots of bass response, (usually) the final result/track will have less bottom end, while if you had a monitor with less bass response, the final track would be bassier. This goes for soundstage, mids, highs, it is a relatively simple concept.
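
The monitor point can be put as simple arithmetic. This is only a toy model, assuming the engineer keeps adjusting the bass until what they hear matches the level they have in mind; the function name and the 4 dB figures are made up for illustration.

```python
def resulting_track_level_db(target_heard_db, monitor_offset_db):
    """Toy model: heard level = track level + monitor offset.

    The engineer adjusts the track until the heard level equals the target,
    so the level that ends up in the track is target minus the monitor offset.
    """
    return target_heard_db - monitor_offset_db


# Hypothetical monitors: one boosts the bass by 4 dB, one cuts it by 4 dB.
bass_heavy_monitor = resulting_track_level_db(0.0, +4.0)
bass_light_monitor = resulting_track_level_db(0.0, -4.0)
print(bass_heavy_monitor, bass_light_monitor)
```

So in this simplified picture the coloration of the monitor ends up inverted in the track, which is why a tool without coloration is attractive for mixing.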

Readers cannot know how biased or unbiased these trained listeners are, this is why I looked for the most objective approach. As you may know, audiophiles are some of the most stubborn and biased people, and they often follow the sheep without putting much effort in themselves. You have the cable debates, the debates about those small things that elevate the cables, there are many things that can be debated in the audiophile space… people will believe anything and pay for it if it does a good job of convincing them that it impacts sound quality in a positive way. If you make it believable enough, there is a high chance somebody will “trust” you with it.

Anyways… I got a little off-topic there

God I love all you studio reference/mix dudes…you’ve mostly done an amazing job over the last 40 years of my music listening pleasure…carry on mastering music in different ways because that’s exactly how we hear it…different day, different mood, different amp, different headphone etc :man_shrugging:


Well, that’s objective alright, it’s also mostly irrelevant to the question of how those headphones will be heard by human listeners, who present a very different “landscape” to the sound path vs. that geometrically oversimplified setup. You’ve achieved objectivity while doing away with the usefulness of the measurement, since no human ever will hear the frequency envelope you just measured. :slight_smile:

My statement still stands: you can’t define a universal “flat” tuning that is what all humans will perceive as neutral or natural FR. “Flatness” is personal with headphones.

Hm. Correct to some point, but as we already discussed here, good luck testing ears of every single human being.

I did not ask how those headphones will be heard by humans, but rather how to test how reference they are. Even though you see things in the graph that we cannot hear, the graph is objective, our ears are not. Our ears are the ones that cannot pick those frequencies up, which makes them the thing that alters the sound in the first place.

I don’t think your statement stands, for a different reason - you are referring to flatness as a subjective thing, not an objective technical thing, which is what a flat frequency response is. I am not referring to how “flat” something “sounds”, that is a subjective thing, but how flat the frequency chart of the said headphone is, which is an objective thing.

Also, how can a human ear determine what neutral and flat truly is? If you set a flat frequency response as a standard, it doesn’t matter how our ears perceive that, it is the flattest and most uncolored form of that sound that you are listening to - at least according to the microphone setup that tries to be as close to perfection as possible. If every human was to design a flat frequency response, you would have millions of different ones, whereas that is not the case if you use a measurement system.

You cannot objectively test how flat a headphone’s frequency response is with a human ear, this is why I believe what you are saying is incorrect. You can test it by ear, but then the result would be subjective to that person’s perception of a “flat frequency response”.

Our hearing is subjective, so as I previously said, our ears are the ones that are altering the true, technically correct “flat frequency response”. This is not a personal thing, it is an objective thing.

If I asked you to explain how a human ear would objectively (truly) and accurately determine what a flat frequency response sounds like, the answer would be based on a personal interpretation of it, which goes against the fact that the frequency chart is not interpreting anything, it’s actually presenting data of the sound waves that the microphone picked up and recorded - objective vs subjective. Our ears are essentially filters, because they cannot pick up all of that data, while a mic can. And do note - you are using the measurements to achieve a flat and uncolored frequency response, this does not go for audiophile headphones.

The listener compares the sound of the headphones to the sound of anechoically-flat-responding speakers when used in a good listening room, which is how Olive & Welti determined the Harman target and also how Dr. Griesinger’s equalization method works. “Flat” means it should sound like listening to natural sources that are away from your body and not attached to your head, so that’s what you compare to. This is not an abstract mathematical question, it’s a question of two types of human experiences and how you make one as similar as possible to the other.

Ideally yes. Ideally you want headphones to replicate speakers and not sound like two drivers on either side of your head. But the driver size itself represents a limitation, to get the height (soundstage) of a loudspeaker, you need a larger speaker driver.

What you said is also subjective. What exact loudspeakers are you trying to make the headphones sound like? What listening room?

I think @MaynardGK mentioned earlier that a flat frequency response is not like it is with speakers/monitors. With a monitor you ideally want a perfectly flat frequency response, but with headphones it is different because this perfectly flat frequency response gets altered on the way from the headphone drivers to your ears. This is why I am not considering our ears at all, they are subjective.

The only way would be to individually create a flat headphone that is flat to your ears, but you know that manufacturers cannot do that.

By the way, you just gave a very strong and valid point, I want to thank you for that!

Perfectly valid questions, even more reasons to say that “flat” is personal and cannot be universalized. You don’t know what type of setting each listener prefers for music listening, maybe one person only ever listens to live concerts in open fields, maybe someone else only ever listens to an omnidirectional speaker in their shower. Whatever “objective” method you use to determine flatness without referring to personal listener characteristics will be useless to these people and you will give them bad advice by recommending them your “flat” headphones that you have chosen based on this “inhuman” method. :slight_smile:

That’s fine, but human listeners might similarly choose to not give any consideration to your reviews as well. :smiley:

I think you got the completely wrong idea. If you read any of my reviews you would know that I never use measuring equipment to judge sound performance. I use my ears, which makes my reviews subjective, but I actually take the time to explain what I am hearing :slight_smile: Unlike the many poets in this hobby who describe sound with words alone; I would like to hear how anybody can translate someone’s words like “gorgeous”, “bright” etc. into actual sound qualities.

Again, why are you mentioning music listening when clearly a reference-grade product is not aimed at music listening? Seems pointless… This is why audiophile-grade products exist, they have a coloration that you either like or don’t like. Reference products are not the same, they are completely different.

You’re saying that manufacturing isn’t the right place to do that. Yep. The information that defines a person’s HRTF is better stored in a known location and used by the local play-back software to tailor the signal going to the headphone accordingly.
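
As a rough sketch of that idea: the playback software could load a stored per-listener profile and work out the gain to apply in each band. Everything here is an assumption for illustration. A real HRTF depends on direction and on frequency in fine detail, so reducing it to three band values, and the names `correction_gains_db` and `listener_heard_db`, are hypothetical simplifications.

```python
def correction_gains_db(target_db, heard_db):
    """Per-band gain (dB) to add so that what the listener hears matches the target."""
    return {band: target_db[band] - heard_db[band] for band in target_db}


# Hypothetical stored profile: how one listener hears a technically flat headphone.
listener_heard_db = {"bass": -2.0, "mids": 0.0, "treble": 3.0}
# Target: flat as heard by this listener.
target_db = {"bass": 0.0, "mids": 0.0, "treble": 0.0}

gains = correction_gains_db(target_db, listener_heard_db)
# What the listener would hear once the playback software applies the gains.
corrected = {band: listener_heard_db[band] + gains[band] for band in gains}
print(gains, corrected)
```

The headphone itself stays the same for everybody; only the stored profile and the gains derived from it change per listener.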

Ah what a wicked web we weave when first we abandon dear faithful analogue, which never done us no harm, and buy into digital’s deceitful domain.

No, I meant that a manufacturer cannot make a commercial product if it needs to test and measure each individual to make sure that the headphone is flat for them personally. Because from the video we concluded that even if the headphone has a perfectly flat frequency response, it will get distorted by the time this person hears it due to the ear canal and whatnot.

This is something that you said that I agree with. You cannot make a flat product that will actually sound flat to everybody. What you can do is make the product flat on its own, this would be according to a measuring system - but because our ear structure and perception of sound differs so much, you wouldn’t hear this technically correct frequency response, you would hear your own individual altered version of this frequency response - I think that @MaynardGK mentioned this at the beginning of the thread, he said that this altered/personal frequency response is what you call HRTF.

But that brings me to my next point. Absolutely no review would be of any value if it was based only on this person’s experience, since it would be completely subjective. And this is especially the case for a headphone that has a flat frequency response on its own (before it is altered by our ears). This is different for audiophile headphones, they are trying to satisfy a certain group of people, their headphones are not meant to be used as tools for “unbiased” judging of sound.

I personally believe that the majority of reviews for audiophile products are of 0 value. This is because people describe sound with descriptive words. As you could see from the video that @MaynardGK attached, pink noise was the same sound that was played to multiple people, but sounded vastly different to each person. This is why if you write that a headphone is “bright” without actually explaining where you hear this “bright”, it’s nothing more than a word =)

This is a big problem with audio media and reviewers, but nobody is speaking up. I am personally going against the grain, which isn’t something I will benefit from, but I know for a fact that I learned nothing about a product by reading words, I cannot translate words into actual sound qualities - maybe some people have a 6th sense for translating words into sounds :man_shrugging:t2: Zeos is guilty of this, but I think most people watch him for entertainment, in the same way these reviews that only use words to describe sound are also entertainment. Words are not just inaccurate at describing sound, they cannot do it at all. Sound is a technical thing, the reader has to hear something in order to put it in perspective, in order to perceive it as sound. If you give a reference song with time stamps that point to an exact sound and you specify “oh, that’s where this sounds bright for me”, the reader can play it and decide if it is bright or not for them… of course, this isn’t ideal for people who review as a job =) for them less effort = quicker money, which = better for them.

This is a whole other argument though.
Regarding headphones and reference-grade sound, aka flat frequency response, the only objective approach to making a headphone with a flat frequency response is with the help of measurements and frequency response graphs. There is no other approach, unless you start a service business that offers calibration of headphones to individuals, and this would not be a cheap service. You would need to take several measurements of the individual’s ears, molds of their ears, and other data; then you would be able to create headphones for their ears only, and this way you would actually be able to see what the altered frequency response looks like - this would allow you to make a flat frequency response for the individual’s ears.

I think that @Polygonhell brought up a very valid point. Even if you make the frequency response flat (of course, I am talking about with measurement systems rather than the altered freq. resp. based on individual ears), there are so many other aspects that don’t really have a “reference”. Sure, flat frequency responses mean that there is “no” coloration, but what about detail retrieval, soundstage, imaging, separation, those are the things that you cannot tell from a frequency response, but you also cannot make them really reference, I don’t think there has been a solution or a standard that would apply to these elements the same way frequency responses can be “perfected”.

However, as far as achievement goes, I think that the “unfiltered” sound signature of the S4X is a big achievement because it is indeed flat (before filtered by individual’s ears). Until there is a company that offers individually calibrated headphones to your ears, we will not be able to create a pair of headphones that sounds flat to us (individually you). Like @MaynardGK said:

Until we see this, we will have no reference for our ears. But this perception of sound, perception of “reference” (flat freq response) would be subjective because it only sounds like that to your ears. But there is no other way around it, you cannot create a universally reference headphone and make sure it sounds reference to everybody. Ollo Audio created a reference headphone, but by the time we perceive their headphones’ sound, this flat frequency response got altered and distorted…

This essentially means that the majority of people here were correct, but one thing that people did not agree on is that a technically reference headphone can be made. How useful is it? I don’t know. What most people on here were talking about is the personal perception of reference (flat freq response), not the technical and undistorted flat freq response - but that is for a reason, our perception distorts this frequency response, so it is not the same as it is on the graph.

Controversy controversy controversy… probably the best word to describe me, huh? But also the perfect word to describe the audiophile industry on its own.
I was certainly persistent in my findings…

I’d also like to clarify that my intention was not to argue with anybody, if somebody took it that way, I didn’t mean it.

There are many factors that play a role in our perception of sound, so it is not just as simple as fixing one thing. You fix one thing, then you realize there is more to it than just that…

I want to thank everybody for sharing their opinion, it opened up my mind and let me look at it from many different angles. This greatly contributed to me making a sound conclusion.

This thread has been a major influence and made me realize that the majority of the arguments presented were true. I have realized that many of the things I said in this thread proved to be wrong, but the most important thing is that I realized this.

Needless to say, I posted the whole review/article here: 🔶 Ollo S4X Reference Headphone

Thank you to everybody who contributed to this thread!