- Gene Steinberg's Tech Night Owl - https://www.technightowl.live/blog -

Consumer Reports’ Deck Stacking — or Incompetence — Exposed

Macs tend to fare second best in Consumer Reports testing, partly because the magazine lives in ignorance of the differences between Apple’s computers and Windows boxes. But they’ve always been recommended, until recently. I can quibble about the way the tests appear to emphasize features over performance, usability and reliability. In fact, I have.

But it took a poor rating by CR to trigger a dialogue that revealed a serious flaw in their testing. The tests also triggered an obscure bug in Safari for macOS Sierra that might otherwise have remained undiscovered and unfixed.

It all started when CR reported wildly divergent battery life results, ranging from 3.75 hours up to 19 hours over three tests for each product. The latter is way more than Apple’s estimates, which range up to 10 hours.

Now all three MacBook Pro models exhibited similar behavior. A clue that something might be amiss was the fact that CR uses the default browser, in this case Safari. When the tests were rerun in Google Chrome, battery life was within acceptable limits.

Now Apple usually ignores test results from the media, but not CR, which has a circulation of millions of consumers and is highly influential when readers make buying decisions. A bad rating can kill or seriously hurt sales of some products. It can also accomplish good things, such as when an auto manufacturer has to go back and modify a faulty suspension system that might cause a rollover during a rapid maneuver to avoid an accident.

This time, Apple was in the hot seat. Even though a number of owners of the new MacBook Pros have reported an assortment of battery issues, CR’s results were unique. The inconsistency didn’t make sense, and thus marketing VP Philip Schiller posted a tweet — the new normal for getting the word out nowadays — saying that the results didn’t jibe with Apple’s own field tests. Apple was working with CR to figure out just what was going on.

Now CR’s tests are intended to be consistent from notebook to notebook. The battery test involves loading 10 sites from the company’s in-house server, over and over, until the battery is spent. So just what was going on here, and was the test deliberately designed to leave Safari and Macs second best?

Well, that’s debatable, but to achieve consistent results, CR turns off caching on a browser. With caching on, the theory goes that the sites would be retrieved from the local cache, which presents an anomalous situation since different computers — and operating systems — might do it differently. On the other hand, it would also be using the computer normally, not in an artificial way. CR’s excuse, by the way, is that the test sequence puts greater stress on the battery: “This allows us to collect consistent results across the testing of many laptops, and it also puts batteries through a tougher workout.”
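To put the caching question in concrete terms, here is a toy sketch of how much extra network traffic a no-cache test loop generates. It is not CR's actual test harness, and it says nothing about battery drain itself. The page names, the loop count of 50, and the function name are all made up for illustration.

```python
def count_network_fetches(pages, loops, cache_enabled):
    """Count how many page loads go out to the server across repeated passes."""
    cache = set()   # pages already stored locally (only used when caching is on)
    fetches = 0
    for _ in range(loops):
        for page in pages:
            if cache_enabled and page in cache:
                continue  # served from the local cache, no network transfer
            fetches += 1
            if cache_enabled:
                cache.add(page)
    return fetches


# Hypothetical setup: 10 pages, loaded in 50 consecutive passes.
pages = [f"page-{i}" for i in range(10)]
print(count_network_fetches(pages, loops=50, cache_enabled=True))   # 10
print(count_network_fetches(pages, loops=50, cache_enabled=False))  # 500
```

With caching on, only the first pass touches the server. With caching off, every pass does, which is the "tougher workout" CR describes, and also a pattern few people reproduce in everyday browsing.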

But how can such a test possibly produce results that in any way reflect what a typical user would encounter? After all, normal users might check a site several times a day, rather than constantly bring up new uncached sites. While all notebooks are being evaluated the same way, it’s a curious choice. Unfortunately, CR would have to go back and retest hundreds of computers to switch the testing scheme.

On Safari, caching is switched off via the seldom-used Develop menu, which has to be enabled in the app’s preferences under the Advanced category. Clearly this is not a feature most users will ever use, or even know about. I use it to access the “Show Page Source” command from the context menu when I’m examining a site’s coding.

Now I suppose using a non-standard test scheme of this sort shouldn’t have had a disastrous effect, but it did. It seemed that the action triggered an obscure and inconsistent bug in Safari. With caching turned off, logos would reload, thus unnecessarily taxing the battery. It’s a bug that Apple discovered and fixed in the latest beta for macOS Sierra 10.12.3. You can download it if you’re a public beta tester or developer, and it will be made available for general distribution in a few weeks.

In the meantime, CR has accepted Apple’s findings: “According to Apple, this last part of our testing is what triggered a bug in the company’s Safari browser. Indeed, when we turned the caching function back on as part of the research we did after publishing our initial findings, the three MacBooks we’d originally tested had consistently high battery life results.”

It would have been nice if they had said that before the review appeared, because that clearly indicated there was some sort of software issue that might be unnecessarily impacting the tests in a way that customers wouldn’t encounter. In other words, it’s an admission the test was unfair, and that the results didn’t in any way reflect a normal use case. After all, CR is testing a notebook’s battery life, not the capabilities of the default browser to render pages without caching.

In any case, CR is retesting the MacBook Pros with the revised macOS, and it shouldn’t take more than a few days to deliver the results. Assuming battery life is normal, the rating will be changed accordingly, and the new notebooks will be added to the recommended list.

Of course, CR should have realized something was amiss as soon as the battery life normalized with caching on. They could have reached out to Apple for clarification before the results were published. As it was, CR got a boatload of publicity for its decision not to recommend the MacBook Pros. That result will soon be changed if all goes well.

Will CR learn a lesson from this debacle? Probably not. After all, few companies would dare protest a bad rating. Indeed, most companies that build products that don’t pass muster probably deserve the rating they get.