Because it is the most objective way to do it. You might listen to a Hifiman HE1000v2 and a Susvara separately and think they both sound impeccable, but only by AB testing will you discover the imperfections of the HE1000v2, which otherwise sounds amazing. Also, your tastes change. One day you might find a certain headphone too bright, but a few months later you might enjoy it. By AB testing on the same day you can compare them directly, because your hearing and preferences are the same. You also have a fresher memory of the sound.
Double-blind testing such as ABX exposes differences, or the lack of them, between products. To my mind it is the gold standard of audio evaluation: if you can’t discern differences between pieces of gear (noting that this does not mean they are the same, just that any differences are not discernible to a particular listener), then there is no point obsessing over measurements, tuning, frequency response graphs etc. For transducers, differences in tuning are usually very easy to discern (and double-blind testing is difficult at best), but in the case of amplifiers and DACs it can be quite eye-opening. Even when differences are discernible, if the effort risks brain overload and induces a cold sweat, then that in itself indicates that such differences are pretty meaningless. The problem for many reviewers is the classic contradiction that has been highlighted many times: on the one hand they talk about night & day differences, hearing the difference from another room etc., but if challenged they suddenly decide these night & day differences only surface when the gear is used with a carefully matched system costing $$$$$$$$$$s, cables made from unobtainium etc. Well, which is it? It can’t be both.
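For anyone wondering how to judge an ABX result, here is a rough sketch of the usual arithmetic: the chance of getting at least that many trials right by pure guessing. The function name `abx_p_value`, the 16-trial session and the scores shown are my own illustrative assumptions, not part of any particular ABX tool, and the common 0.05 cutoff is only a convention.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by guessing.

    Assumes each trial is an independent coin flip (p = 0.5), which is
    the usual null hypothesis for an ABX test. Hypothetical helper, not
    taken from any ABX software.
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # One-sided binomial tail: sum of C(n, k) for k = correct..n, over 2^n.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Illustrative scores for an assumed 16-trial session.
for score in (9, 12, 14):
    print(f"{score}/16 correct: p = {abx_p_value(score, 16):.4f}")
```

A low value only says that guessing is an unlikely explanation for that listener in that session. A high value does not prove the gear is identical, just that no difference was demonstrated.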
If you only consider sound quality (SQ), it can be surprising just how little you need to spend to get great sound. However, what ABX (and measurement, for that matter) can’t indicate is build quality, durability, after-sales support, style and all the other things that form part of the purchase decision.