
Author Archives: Bill Catchings

Here a core, there a core…

Earlier this week, Apple announced its latest iPad. While the improvements seem to be largely incremental, I can’t wait to get my hands on one. (As an aside, I wonder how much work and how many arguments it took to come up with the name “new iPad.” I thought Apple had finally gotten over its longstanding fear of the number 3 with the iPhone 3G, but I guess not.)

One of the incremental improvements that caught my eye, especially in light of our efforts to test the performance of touch devices, is the new iPad’s processor, the A5X. It’s hard to get a straight story: most reports refer to the chip as a quad-core processor, while Apple referred only to quad-core graphics. As best I can ferret out amidst the hype, the A5X has four cores for graphics but only two cores for general-purpose execution.

Regardless of the specifics of the chip, it does have multiple cores for general execution and for graphics. The move to multiple processing units has been an important trend over the last decade for processors in devices from PCs to tablets to phones. The interesting question to me is how best to benchmark devices in light of that trend. The problem is that some operations get no benefit from extra cores, while for others, two cores may be nearly twice as fast as one. Similarly, additional dedicated processing units (such as those for graphics) help only with particular operations.
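
To make that concrete, here is a minimal sketch in Python (purely illustrative, and not code from HDXPRT or TouchXPRT) contrasting a workload that extra cores can speed up with one they cannot:

import time
from multiprocessing import Pool

def parallel_chunk(n):
    # Embarrassingly parallel: each chunk is independent of the
    # others, so spreading chunks across cores cuts wall-clock time.
    return sum(i * i for i in range(n))

def serial_task(n):
    # Inherently serial: every step depends on the previous one,
    # so a second core cannot help no matter how many sit idle.
    total = 7
    for _ in range(n):
        total = (total * 31 + 7) % 1000003
    return total

if __name__ == "__main__":
    chunks = [2000000] * 8

    start = time.perf_counter()
    for c in chunks:
        parallel_chunk(c)              # one core, one chunk at a time
    one_core = time.perf_counter() - start

    start = time.perf_counter()
    with Pool(processes=4) as pool:    # the same work on four cores
        pool.map(parallel_chunk, chunks)
    four_cores = time.perf_counter() - start

    print("parallelizable work: %.2fs on one core, %.2fs on four" %
          (one_core, four_cores))
    # Time serial_task the same two ways and the results barely differ.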

The right answer to me is to do as we are trying to do with both HDXPRT and TouchXPRT—start with what people really do. That means that some usage scenarios and applications will benefit from additional processing units, while others will not. That should correspond with what people really experience. To make the results more useful, it would also help to understand which operations are most affected by additional general-purpose or special-purpose processing units.

How do you think we should look at devices with multiple and varied processing units? I’d love to get your feedback and incorporate it into both HDXPRT and TouchXPRT over the coming months.

Bill

Comment on this post in the forums

Bye, bye 32 bits?

In developing HDXPRT 2012, we have encountered a dilemma. The problem is the amount of effort necessary to support 32-bit Windows as well as 64-bit Windows. While the world is moving to 64-bit Windows, some older platforms, as well as possibly some lower-end devices, still use 32-bit Windows. Our feeling is that the effort necessary to support 32-bit Windows would be better spent elsewhere, such as on TouchXPRT. Further, supporting 32-bit Windows might noticeably delay when we can complete HDXPRT 2012.
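
As an aside, if you want to check which flavor a given test system runs, here is a quick sketch in Python (an illustration only, not part of HDXPRT) that distinguishes the bitness of the running process from the bitness of Windows itself:

import os
import struct
import platform

# Pointer size of the running process: 32 or 64.
process_bits = struct.calcsize("P") * 8

# PROCESSOR_ARCHITEW6432 is set only when a 32-bit process runs on
# 64-bit Windows (WOW64); otherwise use PROCESSOR_ARCHITECTURE.
arch = os.environ.get("PROCESSOR_ARCHITEW6432",
                      os.environ.get("PROCESSOR_ARCHITECTURE", ""))
os_bits = 64 if arch.endswith("64") else 32

print("Process: %d-bit" % process_bits)
print("Windows: %d-bit (%s %s)" % (os_bits, platform.system(),
                                   platform.release()))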

The downside of supporting only 64-bit Windows is that we had hoped to increase the range of devices HDXPRT 2012 supports. The advent of TouchXPRT, however, means that it might be the more appropriate benchmark for those lower-end devices that consume content rather than create it. What do you think? This is one decision where we would really like your input. So, should we support 32-bit Windows or limit HDXPRT to 64-bit? Thanks!

Bill

Comment on this post in the forums

Our new baby has a name!

At the beginning of the year, at CES, we announced that we would start working on a touch-based benchmark that would initially run on Windows 8 Metro. We have been hard at work learning about Metro and creating the benchmark itself.

In parallel, we’ve been working on a name for the benchmark. What we settled on was Touch eXperience & Performance Ratings Tool, or TouchXPRT for short. We’re updating the Web pages with the new name and getting the domain properly set up. In the meantime, check out the logo:

Let us know what you think about the name and the logo. We are happy with both!

I’ve been reading that the Windows 8 beta should be available soon and we hope to have an alpha TouchXPRT available within a few weeks of the beta. We will need your help to critique, debug, and expand TouchXPRT from there. Hang onto your hats, these are exciting times in Benchmark Land!

Bill

Comment on this post in the forums

Quick HDXPRT 2012 status update

Between talking about CES, the new touch benchmark, and sausage making, it seems like it has been a while since I’ve said anything about HDXPRT. Here is a quick status update. The short form is that folks are heads-down coding, debugging, and testing. We still have some significant hurdles to overcome, such as trying to script Picasa. We are also going to have to make some difficult decisions in the near future about possibly swapping out one or two of the applications due to either licensing or scripting issues. (Sausage making at its best!) We’ll keep you posted in the forums when we have to make those decisions.
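
For a sense of why scripting a GUI application is such a hurdle, here is a minimal sketch using pywinauto, a Python UI-automation library (an illustration of the general technique, not the tooling HDXPRT actually uses), driving Notepad. Window titles and timeouts like these are exactly the fragile details that break when an application changes:

from pywinauto.application import Application

# Launch Notepad and wait for its window. Automation like this is
# fragile because it depends on window titles, control layout, and
# timing, all of which can change from version to version.
app = Application(backend="uia").start("notepad.exe")
dlg = app.window(title_re=".*Notepad")
dlg.wait("ready", timeout=10)

# Type into the edit control, then close the window.
dlg.type_keys("Scripting GUIs is fragile work.", with_spaces=True)
dlg.close()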

There is still a lot to get done, but things still appear to be on schedule. That schedule means that we are still hoping to have a beta version available for the Development Community to test in late March. At that point, the beta version will be available to our members, and we will really need your help to try to shake things out. (Join at http://hdxprt.com/forum/register.php if you are not yet a member of the Development Community and want to help in our effort.) The more different systems and configurations we can all test together, the better the benchmark will be. There will also be at least some time for feedback on whether HDXPRT 2012 matches the design specification and whether there are any last-minute tweaks you think would help make for a better benchmark.

So, stay tuned! We look forward to continuing to work with you on making HDXPRT 2012 even better than the current version.

Bill

Comment on this post in the forums

Art or sausage?

In my previous blog entry, I discussed how weighing the tradeoffs between real science and real world in benchmarking is a real art. One person felt it was more akin to sausage making than art! In truth, I have made that comparison myself.

That, of course, got me thinking. Is the process of creating a benchmark like that of creating sausage? The saying goes that if you knew what went into sausage, you probably wouldn’t eat it. That may well be true, but I would still like to know that someone was inspecting the sausage factory. Sausage that contains strange animal parts is one thing, but sausage containing E. coli is another.

With the Development Community, we are trying to use transparency to create better benchmarks. My feeling is that the more inspectors (members) there are, the better the benchmark will be. At least to me, unlike making sausage, creating benchmarks is actually cool. (There are probably sausage artisans who feel the same way about sausage.)

What do you think? Would you prefer to know what goes into making a benchmark? We hope so and hope that is why you are a part of this community. If you are not part of the Development Community, we encourage you to join at http://hdxprt.com/forum/register.php. Come join us in the sausage-making art house!

Bill

Comment on this post in the forums

The real art of benchmarking

In my last blog entry, I noted the challenge of balancing real-world and real-science considerations when benchmarking Web page loads. That issue, however, is inherent in all benchmarking. Real world argues for benchmarks that emphasize what users and computers actually do. For servers, that might mean something like executing real database transactions against a real database from real client computers. For tablets, that might mean real fingers selecting and displaying real photos. There are obvious issues with both—setting up such a real database environment is difficult and who wants to be the owner of the real fingers driving the tablet? It is also difficult to understand what causes performance differences—is it the network, the processors, or the disks in the server? There are also more subtle challenges, such as how to make the tests work on servers or tablets other than the original ones. Worse, such real-world environments are subject to all sorts of repeatability and reproducibility issues.

Real science, on the other hand, argues for benchmarks that emphasize repeatable and reproducible results. Further, real science wants benchmarks that isolate the causes of performance differences. For servers, that might mean a suite of tests targeting processor speed, network bandwidth, and disk transfer rate. For tablets, that might mean tests targeting processor speed, touch responsiveness, and graphics-rendering rate. The problem is that it is not always obvious what combination of such factors actually delivers better database performance or a better tablet experience. Worse, it is possible that testing different databases and transactions would reveal very different characteristics that these isolated tests don’t measure at all.
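
To illustrate the real-science style of testing, here is a minimal sketch in Python (an illustration I am supplying, not code from any shipping benchmark) that isolates two factors, raw processor work and disk-write speed, one at a time:

import os
import time
import tempfile

def time_cpu(iterations=5000000):
    # Isolate the processor: a pure compute loop with no I/O.
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i * i
    return time.perf_counter() - start

def time_disk(size_mb=64):
    # Isolate the disk: write a fixed amount of data and force it
    # out of the OS cache with fsync before stopping the clock.
    block = os.urandom(1024 * 1024)
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        start = time.perf_counter()
        for _ in range(size_mb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    os.unlink(path)
    return elapsed

if __name__ == "__main__":
    print("CPU loop:   %.2fs" % time_cpu())
    print("Disk write: %.2fs for 64 MB" % time_disk())

Each number is clean and repeatable, but neither, by itself, tells you how a real database or a real photo album will feel.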

The good news is that real world and real science are not always in opposition. The bad news is that a third factor exacerbates the situation—benchmarks take real time (and of course real money) to develop. That means benchmark developers need to make compromises if they want to bring tests to market before the real world they are attempting to measure has changed. And, they need to avoid some of the most difficult technical hurdles. Like most things, that means trying to find the right balance between real world and real science.

Unfortunately, there is no formula for determining that balance. Instead, it really is somewhat of an art. I’d love to hear some examples of benchmarks (current or from the past) that you think do a good job of striking this balance and showing the real art of benchmarking.

Bill

Comment on this post in the forums
