Betanews Relative Performance Index for browsers 3.0: How it works and why

Our physical test platform, and why it doesn't matter

The physical test platform we've chosen for browser testing is a triple-boot system, which lets us boot different Windows platforms from the same large hard drive. For the Comprehensive Relative Performance Index (CRPI), our platforms are Windows XP Professional SP3, Windows Vista Ultimate SP2, and Windows 7 RTM. For the standard RPI test, we use just Windows 7.

All platforms are brought up to date with the latest Windows updates from Microsoft prior to testing. We realize, as some readers have told us, that this could alter the speed results we obtain. However, we expect real-world users to be making the same changes, rather than continuing to run unpatched and outdated software. The whole point of testing Web browsers on a continual basis is that folks want to know how Web browsers are evolving, and to what degree, on as close to a real-time scale as possible. When we update Vista, we re-test IE7 on that platform to ensure that all index scores are relative to the most recent available performance, even for that aging browser on that old platform.

Our physical test system is an Intel Core 2 Quad Q6600-based computer with a Gigabyte GA-965P-DS3 motherboard, an Nvidia GeForce 8600 GTS-series video card, 3 GB of DDR2 DRAM, and a 640 GB Seagate Barracuda 7200.11 hard drive, among other components. The Windows XP SP3, Vista SP2, and Windows 7 partitions all reside on that drive. Since May 2009, we've been using this physical platform for browser testing, replacing the virtual test platforms we had used up to that time. Although managing tests on a physical platform requires a few more steps, you've told us you believe the results of physical tests are more reliable and accurate.

But the fact that we perform all of our tests on one machine, and render their results as relative speeds, means that the physical platform is actually immaterial here. We could have chosen a faster or slower computer (or, frankly, a virtual machine) and you could run this entire battery of tests on whatever computer you happen to own. You'd get the same numbers because our indexes are all about how much faster x is than y, not how much actual time elapsed.
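For readers who want to see the arithmetic, here's a minimal sketch in TypeScript. The timings and browser labels are invented, and the code is an illustration of the ratio math rather than our actual scoring tool -- the point is simply that a ratio against a baseline doesn't change when the whole machine gets faster or slower.

```typescript
// Minimal sketch of the relative-index arithmetic. Timings and browser
// labels are invented placeholders, not real Betanews data.

interface TestResult {
  browser: string;
  seconds: number; // wall-clock time to finish one test battery
}

// Hypothetical timings, all taken on the same machine.
const results: TestResult[] = [
  { browser: "IE7 (baseline)", seconds: 40.0 },
  { browser: "Browser A", seconds: 10.0 },
  { browser: "Browser B", seconds: 5.0 },
];

// Index = baseline time / browser time: the baseline scores 1.00, and a
// browser that finishes in half the time scores 2.00. Run the same battery
// on a machine twice as fast and every timing halves, so the ratios -- and
// therefore the index -- stay the same.
function relativeIndex(results: TestResult[], baselineBrowser: string): Map<string, number> {
  const baseline = results.find((r) => r.browser === baselineBrowser);
  if (!baseline) throw new Error(`No result for baseline ${baselineBrowser}`);
  return new Map(results.map((r) => [r.browser, baseline.seconds / r.seconds]));
}

for (const [browser, index] of relativeIndex(results, "IE7 (baseline)")) {
  console.log(`${browser}: ${index.toFixed(2)}`);
}
```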

The speed of our underlying network is also not a factor here, since all of our tests contain code that is executed locally, even if it's delivered by way of a server. The download process is not timed, only the execution.
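To make that distinction concrete, here's a hedged sketch of how a page-side harness might separate the two phases. The function name and URL are hypothetical, not our actual test code: the download happens before the clock starts, and the timer brackets only the locally executed work.

```typescript
// Illustrative only: the test payload downloads off the clock, and the timer
// measures just the local execution.

async function timeLocalExecution(url: string): Promise<number> {
  // Download phase -- not timed.
  const response = await fetch(url);
  const source = await response.text();

  // Execution phase -- timed. Only locally running code counts.
  const start = performance.now();
  new Function(source)(); // run the downloaded benchmark body
  return performance.now() - start;
}

// Usage (hypothetical path):
// timeLocalExecution("/batteries/example-test.js")
//   .then((ms) => console.log(`Execution: ${ms.toFixed(1)} ms`));
```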

Why don't we care about download speeds, especially how long it takes to load certain pages? We do, but we're still in search of a scientifically reliable method to test download efficiency. Web pages change by the second, so any test that measures how long a handful of browsers takes to download content from a given set of URLs is almost pointless today. And because network speed can vary greatly, a reliable final score would have to factor out the network's speed at the time of each iteration. That's a cumbersome approach, and it's why we haven't embarked on it yet.
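For the curious, here's a rough sketch of what that factoring-out might look like -- again, this is hypothetical, not something we've built. A harness would need to measure effective bandwidth immediately before each iteration, against a known-size reference file, and scale every raw load time by it:

```typescript
// Hypothetical sketch of the "factor out network speed" approach. The URL
// and sizes are placeholders; this is not an implemented Betanews test.

const REFERENCE_URL = "/reference-1mb.bin"; // known-size file on the test server
const REFERENCE_BYTES = 1_000_000;

async function measureBandwidth(): Promise<number> {
  const start = performance.now();
  const res = await fetch(REFERENCE_URL, { cache: "no-store" });
  await res.arrayBuffer(); // force the full body to download
  const seconds = (performance.now() - start) / 1000;
  return REFERENCE_BYTES / seconds; // bytes per second, measured right now
}

async function normalizedLoadTime(pageUrl: string): Promise<number> {
  const bandwidth = await measureBandwidth();
  const start = performance.now();
  const res = await fetch(pageUrl, { cache: "no-store" });
  await res.text();
  const seconds = (performance.now() - start) / 1000;
  // Scale the raw time by the measured bandwidth relative to a nominal
  // 1 MB/s link, so an iteration on a fast network and one on a slow
  // network become roughly comparable.
  return seconds * (bandwidth / 1_000_000);
}
```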

There are three major benchmark suites that we have evaluated and re-evaluated and, with respect to their authors, have chosen not to use. Dromaeo comes from John Resig, a Firefox contributor whom we respect greatly. We appreciate Resig's hard work, but we don't yet feel its results correspond to the differences in performance that we see with our own eyes, or that we can time with a stopwatch: the browsers just aren't that close together. Meanwhile, we've rejected Google's V8 suite -- built to test its V8 JavaScript engine -- for the opposite reason: Yes, we know Chrome is more capable than IE. But 230 times more capable? No. That's overkill. There's a huge exponential curve there that's not being accounted for, and once it is, we'll reconsider it.
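As a purely illustrative example of what accounting for that curve could mean -- one possibility, not a method we've adopted -- a logarithmic compression pulls a claimed 230x advantage back onto the same scale as the 2x-or-so differences a stopwatch can actually confirm:

```typescript
// Not a method we've adopted -- just an illustration of compressing an
// extreme suite ratio so it lands on a scale closer to observable reality.

function compressedScore(rawRatio: number): number {
  return Math.log2(rawRatio);
}

console.log(compressedScore(230).toFixed(2)); // "7.85" -- 230x compresses to roughly 8
console.log(compressedScore(2).toFixed(2));   // "1.00" -- 2x stays at 1
```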

We've also been asked to evaluate Futuremark's Peacekeeper. I'm very familiar with Futuremark's tests from my days at Tom's Hardware. Though it's obvious to me that there's a lot going on in each of the batteries of the Peacekeeper suite, it doesn't help much that the final result is rendered only as a single tick-mark. And while that may sound hypocritical coming from a guy who's pushing a single performance index, the point is that for us to make sense of it, we need to be able to see into it -- how did that number get that high or that low? If Futuremark would just break down the results into components, we could compare each of those components against IE7 and the modern browsers, and determine where each browser's strengths and weaknesses lie. Then we could tally an index based on those strengths and weaknesses, rather than an artificial sum of all values that blurs those distinctions.
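Here's a hypothetical sketch of that tally, with invented component names and scores. The point is that per-component ratios against an IE7 baseline preserve the strengths and weaknesses that a single summed score blurs:

```typescript
// Hypothetical: component names and scores are invented. Each component is
// compared against the IE7 baseline, and the index is tallied from those
// per-component ratios rather than from a single opaque sum.

interface ComponentScores {
  [component: string]: number; // higher is better, per component
}

const ie7Baseline: ComponentScores = { rendering: 100, dom: 80, text: 120 };
const candidate: ComponentScores = { rendering: 450, dom: 320, text: 180 };

// Average of per-component ratios; a weighted average works the same way if
// some components matter more than others.
function componentIndex(candidate: ComponentScores, baseline: ComponentScores): number {
  const ratios = Object.keys(baseline).map((c) => candidate[c] / baseline[c]);
  return ratios.reduce((sum, r) => sum + r, 0) / ratios.length;
}

// rendering 4.50x, dom 4.00x, text 1.50x -- tallied index ≈ 3.33
console.log(componentIndex(candidate, ie7Baseline).toFixed(2));
```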

This is as complete and as thorough an explanation of our testing procedure as you're ever going to get.
