Betanews Comprehensive Relative Performance Index 2.2: How it works and why
We did not have the Comprehensive Relative Performance Index (CRPI -- the "Creepy Index") out for very long before we found it needed to be changed again. The main reason came from one of the architects of the benchmark suites we use, Web developer Sean Patrick Kane. This week, Kane declared his own benchmark obsolete, and unveiled a completely new system to take its place.
When the author of a benchmark suite declares his own methodology outdated, we really have no choice but to agree and work around it. As you'll see, Kane replaced his original, simple suite that covered all the bases with a comprehensive, in-depth battery of classic tests called JSBenchmark that covers just one of those bases. For our CRPI index to remain fair, we needed to compensate for the areas of the old CK index that were no longer covered, filling them with tests that cover those bases just as thoroughly.
The result is what we call CRPI 2.2 (you didn't see 2.1, although we tried it and we weren't altogether pleased with the results). The new index number covers a lot more data points than the old one, and the result...is a set of indices that are stretched back out over the 20.0 mark, like the original CRPI 1.0, but whose proportions with respect to one another remain true. In other words, the bars on the final chart look the same shape and length, but there are now more tick marks.
General explanation of the CRPI
Since we started this, we've maintained one very important methodology: We take a slow Web browser that you might not be using much anymore, and we pick on its sorry self as our test subject. We base our index on the assessed speed of Microsoft Internet Explorer 7 on Windows Vista SP2 -- the slowest browser still in common use. For every test in the suite, we give IE7 a 1.0 score. Then we combine the test scores to derive a CRPI index number that, in our estimate, best represents the relative performance of each browser compared to IE7. So for example, if a browser gets a score of 6.5, we believe that once you take every important factor into account, that browser provides 650% the performance of IE7.
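The baseline idea can be sketched in a few lines of JavaScript. This is our own illustration, not Betanews' actual scoring code: the function names and the combining rule (a plain average here) are assumptions.

```javascript
// Each test is scored relative to IE7: IE7's own time always indexes at 1.0.
// Lower times are better, so the ratio is baseline / test.
function relativeScore(ie7TimeMs, browserTimeMs) {
  return ie7TimeMs / browserTimeMs;
}

// Combine the per-test relative scores into one CRPI-style index.
// (A simple average here; the real index weights its components.)
function crpiIndex(relativeScores) {
  const sum = relativeScores.reduce((a, b) => a + b, 0);
  return sum / relativeScores.length;
}

// A browser that runs every test 6.5x as fast as IE7 indexes at about 6.5,
// i.e. roughly 650% of IE7's performance.
const scores = [1000, 800, 1200].map(t => relativeScore(t, t / 6.5));
const index = crpiIndex(scores); // ≈ 6.5
```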
We believe that "performance" means doing the complete job of providing rendering and functionality the way you expect, and the way Web developers expect. So we combine speed, computational efficiency, and standards compliance tests. This way, a browser with a 6.5 score can be thought of as doing the job more than five times faster and better.
Here now are the ten batteries we use for our CRPI 2.2 suite, and how we've modified them where necessary to suit our purposes:
Here's how we developed our new score for this test battery: There are three loading events: one for Document Object Model (DOM) availability, one for first element access, and the third being the conventional onLoad event. We counted DOM load as one sixth, first access as two sixths, and onLoad as three sixths of the rendering score. Then we adjusted the re-rendering part of the test so that it iterates 50 times instead of just five. This is because some browsers do not count milliseconds properly on some platforms -- this is the reason why Opera mysteriously mis-reported its own speed in Windows XP as slower than it was. (Opera users everywhere...you were right, and we thank you for your persistence.) By running the test for ten iterations per loop across five loops -- 50 iterations in all -- we can get a more accurate estimate of the average time for each iteration, because the millisecond timer will have updated correctly. The element loading and re-rendering scores are averaged together for a new and revised cumulative score -- one which readers will discover is much fairer to both Opera and Safari than our previous version.
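The weighting and averaging described above can be sketched as follows. The weights (1/6, 2/6, 3/6) and the 5-loop, 10-iteration structure come from the article; the function and variable names are our own, not the suite's actual code.

```javascript
// Weighted rendering score: DOM availability counts 1/6, first element
// access 2/6, and the conventional onLoad event 3/6. Each input is the
// relative score for that event versus the IE7 baseline.
function renderingScore(domRel, firstAccessRel, onLoadRel) {
  return (1 / 6) * domRel + (2 / 6) * firstAccessRel + (3 / 6) * onLoadRel;
}

// Averaging over many iterations smooths out a coarse millisecond timer:
// 10 iterations per loop over 5 loops gives 50 samples, so a timer that
// only ticks every several milliseconds still averages out accurately.
function averageIterationMs(totalElapsedMs, loops = 5, itersPerLoop = 10) {
  return totalElapsedMs / (loops * itersPerLoop);
}
```

An IE7-equivalent browser (relative score 1.0 on all three events) gets a rendering score of exactly 1.0, preserving the baseline.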
- Celtic Kane JSBenchmark. The very first benchmark tests I ever ran for a published project were taken from Byte Magazine, and the year was 1978. They were classic mathematical and algorithmic challenges, like finding the first handful of prime numbers or finding a route through a random maze, and I was excited at how a TRS-80 trounced an Apple II in the math department. The new JSBenchmark from Sean P. Kane is a modern version of the classic math tests first made popular, if you can believe it, by folks like myself. For instance, the QuickSort algorithm segments an array of random numbers and sorts the results in a minimum number of steps; while a simplified form of genetic algorithms, called the "Genetic Salesman," finds the shortest route through a geometrically complex maze. It's good to see a modern take on my old favorites. Like the old CK benchmark, rather than run a fixed number of iterations and time the result, JSBenchmark runs an undetermined number of iterations within a fixed period of time, and produces indexes that represent the relative efficiency of each algorithm during that set period -- higher numbers are better.
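The fixed-time approach works roughly like this sketch -- our own code, not Kane's, with a quicksort-style workload standing in for his actual test battery:

```javascript
// Run a workload repeatedly inside a fixed time window and report how
// many iterations completed. Higher counts mean a more efficient engine.
function fixedTimeBench(workload, windowMs = 100) {
  let iterations = 0;
  const end = Date.now() + windowMs;
  while (Date.now() < end) {
    workload();
    iterations++;
  }
  return iterations;
}

// Example workload in the spirit of the classic math tests: sort a small
// array of random numbers with a simple recursive quicksort.
function quickSort(a) {
  if (a.length <= 1) return a;
  const [pivot, ...rest] = a;
  return [...quickSort(rest.filter(x => x < pivot)), pivot,
          ...quickSort(rest.filter(x => x >= pivot))];
}

const score = fixedTimeBench(
  () => quickSort(Array.from({ length: 64 }, Math.random))
);
```

Because the window is fixed, the score is directly comparable across browsers without any per-machine iteration tuning.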
There are two simple heats whose purpose is to draw an ordinary wireframe cube and rotate it in space, accounting for forward-facing surfaces. Each heat produces a set of five results: total elapsed time, the amount of that time spent actually rendering the cube, the average time each loop takes during rendering, and the elapsed time in milliseconds of the fastest and slowest loop. We average those last two into a single figure, which is compared along with the other three times against scores in IE7 to yield a comparative index score.
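That combination step might look like the following sketch. The field names and the final averaging are our own assumptions about how the four resulting times get folded into one index.

```javascript
// Reduce a heat's five raw results to four comparable times: the fastest
// and slowest loop times are averaged into one figure.
function heatTimes(heat) {
  const extremesAvg = (heat.fastestMs + heat.slowestMs) / 2;
  return [heat.totalMs, heat.renderMs, heat.avgLoopMs, extremesAvg];
}

// Index a browser's heat against the IE7 baseline heat. Lower times are
// better, so each per-time index is baseline / test; average the four.
function heatIndex(ie7Heat, browserHeat) {
  const base = heatTimes(ie7Heat);
  const test = heatTimes(browserHeat);
  const ratios = base.map((b, i) => b / test[i]);
  return ratios.reduce((a, b) => a + b, 0) / ratios.length;
}
```

By construction, running IE7 against itself yields an index of exactly 1.0, matching the suite's baseline convention.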
Next: The other five elements of CRPI 2.2...