I saw an article earlier this week about Japan’s K Computer, the latest computer to be designated the “fastest supercomputer” in the world. Twice a year (June and November), the Top500 list comes out. The list’s publishers consider the highest-scoring computer on the list to be the fastest computer in the world. The first article I read about the recent rankings did not cite the results, just the rankings. So, I went to another article, which referred to the K Computer as capable of 8.2 quadrillion calculations per second but did not give the results of the other leading supercomputers. On to the next article, which said the K Computer was capable of 1.2 petaflops per second. (The phrase “petaflops per second” is in the same category as “ATM machine” or “PIN number”…) The same article said that the third fastest was able to get 1.75 petaflops per second. OK, now I was definitely confused. (I really miss the old days of good copy editing and fact checking, but that is a blog for another day.)
So, I went to the source, the Top500 Web site (www.top500.org). It confirmed that the K Computer obtained 8.16 petaflops (or quadrillion calculations per second) on the LINPACK test. The Chinese Tianhe-1A got 2.56 petaflops and the American Jaguar, 1.76 petaflops.
Once I got over the sloppy reporting and stopped playing with the graphs of trends and scores over time, I started thinking about the problem of metrics and the importance of making them easy to understand. Some metrics are very easy to report and understand. For example, a battery life benchmark reports its results in hours and minutes. We all know what that means, and we know that more hours and minutes are a good thing. Understanding what petaflops are is decidedly harder.
Another issue is the desire for bigger numbers to mean better results. The time to finish a task is fairly easy to understand, but in that case, less time is better. One technique for dealing with this issue is to normalize the numbers. Basically, that means dividing each system’s result by a baseline system’s result (or, for a metric like time where less is better, dividing the baseline’s result by each system’s result, so that bigger is still better). The baseline system’s score is typically set to 1.0 (or some other number like 10 or 100), and other scores are meaningful only in relation to the baseline system or to each other. A system scoring 2.0 runs twice as fast as the baseline system’s 1.0. While that is clear, it does take more explanation than just seconds.
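To make that concrete, here is a minimal sketch in Python of normalizing a time-based metric. The system names and task times are made up for illustration; they are not HDXPRT or Top500 results.

    # Hypothetical completion times in seconds for the same task; less is better.
    times = {
        "Baseline system": 120.0,
        "System A": 60.0,   # finishes in half the time
        "System B": 240.0,  # takes twice as long
    }

    baseline_time = times["Baseline system"]

    # Dividing the baseline's time by each system's time flips the scale so
    # that a bigger normalized score means a better (faster) result.
    for name, seconds in times.items():
        score = baseline_time / seconds
        print(f"{name}: {score:.2f}")

    # Prints:
    #   Baseline system: 1.00
    #   System A: 2.00  (twice as fast as the baseline)
    #   System B: 0.50

The same ratio approach works for any lower-is-better metric; for a higher-is-better metric, you would simply divide each result by the baseline’s result instead.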
Finding the right metrics was a challenge we faced with HDXPRT 2011. Do you think we got it right? Please let us know what you think.
Bill