Category: HDXPRT

Always wanting to know more

I’m an engineer (computer science) by training, and as a consequence I’m always after more data.  More data means better understanding, which leads to better decision making.  We acquired a lot of data in the course of finishing our white paper on the characteristics of HDXPRT 2011.  Now, of course, I want even more.

The biggest area that I want to understand better is the graphics subsystem.  Our testing showed processor-integrated graphics outperforming discrete graphics cards.  That was not what I expected.  There seem to be two likely explanations.  The first is that since the workload of HDXPRT 2011 does not include 3D, discrete graphics cards are not that helpful to the benchmark’s applications.  Certainly, 3D performance plays more to the traditional strengths of discrete graphics cards.  The second likely explanation is that the integrated graphics on the second-generation Intel Core processors we used perform well.  A number of performance Web sites have noted the same thing since the debut of those processors.

The answer is probably a combination of the two.

To satisfy my data desires, we’re going to look further. We’ll start by testing on some older processors as well as some different graphics cards.  We’ll share our findings with you.

Please let us know any other characteristics of HDXPRT 2011 that you’d like us to explore in more depth.  I can’t guarantee we’ll be able to look at everything, but I know I always want to know more!

Bill

Sneak peek at the HDXPRT 2011 results white paper

After spending weeks testing different configurations with HDXPRT 2011, we are putting the final touches on a white paper detailing the results. I thought I’d give you a sneak peek at some of the things the tests revealed about the characteristics of HDXPRT 2011.

As I explained last week, trying to understand the characteristics of a benchmark requires careful testing while changing one component at a time. To do that, we ran the tests on a single system using an Intel DH67BL motherboard. We changed processors (both type and speed), the amount of RAM, the type of storage (hard disk and SSD), and the graphics subsystem, as well as a few other variables.

Here are a few of the things we found:

  • Processor speed – On an Intel Core i3, increasing the processor speed (GHz) by 6.5% resulted in a 4.4% increase in the HDXPRT overall score. On an Intel Core i5, increasing the processor speed by 17.9% resulted in an 8.1% increase in the HDXPRT overall score. Generally, that means increased processor speed is important, but performance scales somewhat less than linearly with the raw gigahertz.
  • Memory – Increasing from 2 GB to 4 GB of RAM increased the overall score by 10.7% on an Intel Core i5 and by 15.8% on an Intel Core i7. However, increasing from 4 GB to 8 GB increased the score by less than 2% on both processors. These results match my personal experience pretty well: going to 4 GB is important for media-rich applications, but going to 8 GB is less so.
  • Disk drive – Switching from a hard disk to an SSD increased the overall score by about 1%. While I would certainly prefer an SSD to a hard disk, this shows that, for HDXPRT 2011, disk performance has only a small influence on the results.
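
To make the processor-speed scaling above concrete, the short Python sketch below expresses each score increase as a fraction of the corresponding clock-speed increase. It is purely illustrative and not part of HDXPRT 2011 itself.

    # Illustrative only: how much of a clock-speed increase shows up in the
    # HDXPRT 2011 overall score, using the figures reported in the list above.
    def scaling_efficiency(clock_increase_pct, score_increase_pct):
        """Return the fraction of the clock-speed gain reflected in the score."""
        return score_increase_pct / clock_increase_pct

    print("Core i3:", round(scaling_efficiency(6.5, 4.4), 2))    # ~0.68
    print("Core i5:", round(scaling_efficiency(17.9, 8.1), 2))   # ~0.45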

Many more details will be in the white paper we will publish in the next few days. Please be on the lookout for it and let us know what you think of the results and what they say about the characteristics of HDXPRT 2011.

We plan to conduct a Webinar in the near future to discuss the HDXPRT 2011 results white paper and to answer general questions. I hope to see you there!

Bill

Benchmarking a benchmark

One of the challenges of any benchmark is understanding its characteristics. The goal of a benchmark is to measure performance under a defined set of circumstances. For system-level, application-oriented benchmarks, it isn’t always obvious how individual components in the system influence the overall score. For instance, how does doubling the amount of memory affect the benchmark score? The best way to understand the characteristics of a benchmark is to run a series of carefully controlled experiments that change one variable at a time. To test the benchmark’s behavior with increased memory, you would take a system and run the benchmark with different amounts of RAM. Changing the processor, graphics subsystem, or hard disk lets you see the influence of those components. Some components, like memory, can change in both their amount and speed.

The full matrix of system components to test can quickly grow very large. While the goal is to change only one component at a time, this is not always possible. For example, you can’t change the processor from an Intel to an AMD without also changing the motherboard.
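
As an aside, here is a minimal Python sketch of how a one-variable-at-a-time test plan can be generated from a baseline configuration. The component names and options are hypothetical placeholders, not our actual HDXPRT 2011 test matrix.

    # Minimal sketch: build a test plan that changes one component at a time.
    # The baseline configuration and alternatives are hypothetical examples.
    baseline = {
        "processor": "Core i5",
        "ram_gb": 4,
        "storage": "hard disk",
        "graphics": "integrated",
    }

    alternatives = {
        "processor": ["Core i3", "Core i7"],
        "ram_gb": [2, 8],
        "storage": ["SSD"],
        "graphics": ["discrete"],
    }

    test_plan = [dict(baseline)]  # run the baseline configuration first
    for component, options in alternatives.items():
        for option in options:
            config = dict(baseline)
            config[component] = option  # change exactly one component
            test_plan.append(config)

    for config in test_plan:
        print(config)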

We are in the process of putting HDXPRT 2011 through a series of such tests. HDXPRT 2011 is a system-level, application-oriented benchmark for measuring the performance of PCs on consumer-oriented HD media scenarios. We want to understand, and share with you, how different components influence HDXPRT scores. We expect to release a report on our findings next week. It will include results detailing the effect of processor speed, amount of RAM, hard disk type, and graphics subsystem.

There is a tradeoff between the size of the matrix and how long it takes to produce the results. We’ve tried to choose the areas we felt were most important, but we’d like to hear what you consider important. So, what characteristics of HDXPRT 2011 would you like to see us test?

Bill

Device or computer?

As you may have noticed, I am fascinated by performance.  I’m also an avid cyclist and techno geek.  The recent start of the Tour de France has turned my thoughts to the technology of bikes and their accessories.  As with most technology, the latest models promise to be faster, lighter, and better.

One accessory of particular interest to me is the bike “computer.”  When I first started serious riding six years ago, bike computers were pretty minimal devices.  They were generally a small LCD display that connected via wires to two sensors.  One sensor counted how quickly a magnet on the pedal passed by to determine the cyclist’s cadence (pedal strokes per minute).  The other sensor counted how quickly a magnet on one of the wheels passed by; knowing the circumference of the wheel, the unit calculated the cyclist’s speed and distance traveled.  Sure, there had to be a processor of some sort in those devices, but I always refused to call them bike computers.  Speedometer/odometer seemed more accurate to me.

Now, however, I have on my bike a Garmin Edge 500 (https://buy.garmin.com/shop/shop.do?cID=160&pID=36728).  It is a small device—less than 2 inches by 3 inches—that attaches to my handlebars and determines my speed and distance via a built-in GPS.  It determines altitude by detecting changes in barometric pressure and measures temperature with a built-in thermometer.  It communicates wirelessly with my heart rate monitor.  It can also talk wirelessly to other devices, like a cadence sensor or a power meter that measures the power applied to the pedals.  The LCD screen is customizable and allows me to display the information I most care about while riding.  The Edge 500 collects all of the data and can upload it via a computer to the Garmin Connect Web site.

By any definition of computer, the Edge 500 seems to qualify.  I still don’t call it a computer, however. Calling it a speedometer/odometer would be silly.  I tend to refer to it as my Garmin.  The line between computer and device is definitely getting blurrier.

We are all surrounded by more and more computing devices, whether they are desktops, notebooks, tablets, smartphones, or bike computers.  On some of those, performance is critical, while on others, fast enough is all we care about.  On which devices do you think performance is important?  Even as we start the work on HDXPRT 2012, we are constantly examining other areas and types of devices that need benchmarks.  Let us know your thoughts!

Bill

Long-lasting benchmarks

While researching the Top500 list for last week’s blog, I ran across an interesting article (http://bits.blogs.nytimes.com/2011/05/09/the-ipad-in-your-hand-as-fast-as-a-supercomputer-of-yore/?ref=technology).  Its basic premise is that the iPad 2 has about the same computing power as the Cray 2 supercomputer, the world’s fastest computer in 1985.  I’m old enough to remember the Cray 1 and Cray 2 supercomputers with their unique circular shapes.  In their day, they were very expensive and, consequently, rare.  Only government agencies could afford to buy them.  Just getting to see one was a big deal.  In stark contrast, I seem to see iPads everywhere.

What was the benchmark for determining this?  It was LINPACK, the same benchmark that determined the winner of the Top500 earlier in June.  Based on the LINPACK results, I am holding in my hand a device that could rival the most powerful computer in the world about 25 years ago.  Another perspective is that I have a phone faster than the most powerful computer in the world the year I graduated with my CS degree.  And, I use it to play Angry Birds…   (Picture trying to convince someone in the 80s that one day millions of hand-held Cray 2 supercomputers would be used to catapult exploding birds at annoying oinking pigs.)

One interesting thought from all of this is the power of benchmarks that last over time.  While it will be a rare (and rather limited) benchmark that can last as long as LINPACK, it is important for benchmarks to not change too frequently.  On the other side of the scale is the desire for a benchmark to keep up with current technology.  With HDXPRT, we are aiming for about a year between versions.  I’d love to know whether you think that is too long, too short, or about right.

Bill

Petaflops?

I saw an article earlier this week about Japan’s K Computer, the latest computer to be designated the “fastest supercomputer” in the world.  Twice a year (June and November), the Top500 list comes out.  The list’s publishers consider the highest-scoring computer on the list to be the fastest computer in the world.  The first article I read about the recent rankings did not cite the results, just the rankings.  So, I went to another article, which referred to the K Computer as capable of 8.2 quadrillion calculations per second but did not give the results of the other leading supercomputers.  On to the next article, which said the K Computer was capable of 1.2 petaflops per second.  (The phrase “petaflops per second” is in the same category as “ATM machine” or “PIN number”…)  The same article said that the third fastest was able to get 1.75 petaflops per second.  OK, now I was definitely confused.  (I really miss the old days of good copy editing and fact checking, but that is a blog for another day.)

So, I went to the source, the Top500 Web site (www.top500.org).  It confirmed that the K Computer obtained 8.16 petaflops (or quadrillion calculations per second) on the LINPACK test.  The Chinese Tianhe-1A got 2.56 petaflops and the American Jaguar, 1.76 petaflops.

Once I got over the sloppy reporting and stopped playing with the graphs of the trends and scores over time, I started thinking about the problem of metrics and the importance of making them easy to understand.  Some metrics are very easy to report and understand.  For example, a battery life benchmark reports its results in hours and minutes.  We all know what this means and we know that more hours and minutes is a good thing.  Understanding what petaflops are is decidedly harder.

Another issue is the desire for bigger numbers to mean better results.  The time to finish a task is fairly easy to understand, but in that case, less time is better.  One technique for dealing with this issue is to normalize the numbers.  Basically, that means relating each result to the result of a baseline system; for a time-based result, you divide the baseline system’s time by the system under test’s time, so that a faster system gets a bigger number.  The baseline system’s score is typically set to 1.0 (or some other number, like 10 or 100), and other results are meaningful only in relation to the baseline system or to each other.  A system scoring 2.0 runs twice as fast as the baseline system at 1.0.  While that is clear, it does take more explanation than just seconds.
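
As a minimal sketch (with made-up timings), normalizing a time-based result so that a bigger number means a better result looks something like this:

    # Made-up timings for illustration; the baseline system scores 1.0.
    BASELINE_SECONDS = 120.0

    def normalized_score(system_seconds, baseline_seconds=BASELINE_SECONDS):
        """Higher is better: divide the baseline time by the system's time."""
        return baseline_seconds / system_seconds

    print(normalized_score(120.0))  # 1.0 -> the baseline system itself
    print(normalized_score(60.0))   # 2.0 -> half the time, twice as fast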

Finding the right metrics was a challenge we faced with HDXPRT 2011. Do you think we got it right? Please let us know what you think.

Bill
