

Knowing when to wait

Mark mentioned in his blog entry a few weeks ago that waiting sucks.  I think we can all agree with that sentiment.  However, an experience I had while in Taipei for Computex made me reevaluate that thinking a bit.  

I went jogging one morning in a park near my hotel. It was a relatively small park, just a quarter mile around the pond that took up most of it. I was one of only a couple of people jogging, but the park was full of people. Some were walking around the pond. There were also groups of people doing some form of Tai Chi in various clearings around the pond. The path I was on was narrow. At times, there was no way to get around the people walking without running into the ones doing Tai Chi. That in turn meant running in place at times. Or, put another way, waiting.

Everyone was polite at the encounters, but the contrast between me jogging and the folks doing Tai Chi was stark.  I wanted to run my miles as quickly as possible.  Those doing Tai Chi were decidedly not in a rush.  They were doing their exercises together with others.  The goal was to do them at the proper pace in the proper way.  

That got me to thinking about waiting on my computer.  (Hey, time to think is one of the main reasons I exercise!)  There are times when waiting for a computer infuriates me.  Other times, however, the computer is fast enough.  Or even too fast, like when I’m trying to scroll down to the right cell in Excel and it jumps down to a whole screen full of empty cells.  This phenomenon, of course, relates to benchmarks.  Benchmarks should measure those operations that are slow enough to hurt productivity or are downright annoying.  There is less value in measuring operations that users don’t have to wait on. 

Have you had any thoughts about what makes a good benchmark? Even if you weren't exercising when you had them, please share them with the community.

Bill


Home sweet home

After a long set of flights back from Computex in Taipei, I’m finally home in North Carolina. Unfortunately, I’m still not quite sure what time zone I’m in!

While awake in the middle of the night, I've been thinking about some of the things I saw at Computex. While I was there, it seemed like a jumble of notebooks, power supplies, gaming rigs, motherboards, cases, Hello Kitty accessories, and some things I still can't identify. Many of the things I saw were not brand new, but it was my first chance to see them up close. Some showcased technologies still on the horizon, like Intel's Ultrabook concept and Microsoft's Windows 8. I also saw all sorts of combinations of phones, 4G, and other devices.

One thing that stood out to me was the number and variety of tablets. They came in a range of sizes (and screen resolutions). There were quite a few vendors, including some I would not have expected but was pleasantly surprised to encounter, like Viewsonic and Shuttle. The OS choices included Android, WebOS, and MeeGo. ASUS had a couple of interesting hybrid approaches, such as the Eee Pad Transformer and the Padfone. The former is a 10.1-inch tablet that plugs into a keyboard. The Padfone is a smartphone that plugs into the back of a larger (10.1-inch) touch screen to act as a tablet.

All of these tablet choices, as well as the iPad they all must compete against, left me wondering how to choose among them. Some part of the choice comes down to size and features. As always, however, performance plays a key role. My tolerance for waiting on a tablet is even lower than it is for waiting on my PC. The problem is how to make valid comparisons across such a wide range of platforms. I'd love to hear what you think about performance testing on tablets. Is it useful? What are the best ways to accomplish it?

Finally, thanks to all the folks who came by and visited our suite at Computex.  I enjoyed getting the chance to meet some of the members of the HDXPRT Development Community.  And, hopefully, I convinced more folks to join.

Bill


Computex – Taipei

It’s hot and muggy here in Taipei. Just like home in North Carolina!

Weather aside, Taipei is definitely not Raleigh. Taipei is a big city with tall buildings. Right next to the hotel is Taipei 101, which was the world's tallest building for a few years. The streets are full of cars and motor scooters. People here walk quickly and purposefully. All of Computex seems to be filled with similar purpose and drive. It reminds me quite a bit of COMDEX in Vegas in its prime. Technology has taken over a city only too glad to embrace it. In next week's blog, I'll let you know about some of the cool things on display here.

I've had some interesting HDXPRT meetings so far. One of them reminded me of some of the non-technical challenges of building a successful benchmark. We've mentioned benchmark challenges like reliability (it needs to run when you need it to run) and repeatability (it needs to give similar results—within a few percent—each time you run it). With folks from one PC performance Web site, I discussed the importance of a benchmark having some permanence. If the benchmark changes too frequently, you can't compare the current product with the one you reviewed a couple of months ago. With HDXPRT, our goal is an annual cycle. That should allow for comparing to older results while still keeping the benchmark current.
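
To make the repeatability point a bit more concrete, here is a minimal Python sketch (the scores and the three-percent tolerance are invented for illustration and are not HDXPRT's actual criteria) that checks whether a set of runs stays within a few percent of its mean:

```python
# Hypothetical sketch: check run-to-run repeatability of a benchmark score.
# The scores and the tolerance are made-up examples, not HDXPRT's criteria.
import statistics

def is_repeatable(scores, tolerance_pct=3.0):
    """Return (ok, worst_pct): whether every score is within tolerance_pct of the mean."""
    mean = statistics.mean(scores)
    worst_pct = max(abs(s - mean) / mean * 100 for s in scores)
    return worst_pct <= tolerance_pct, worst_pct

# Scores from five consecutive runs of the same benchmark on the same system
scores = [102.4, 101.8, 103.1, 102.0, 102.7]
ok, worst = is_repeatable(scores)
print(f"Worst deviation from the mean: {worst:.2f}% -> {'repeatable' if ok else 'not repeatable'}")
```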

Any folks who may be here in Taipei for Computex, please come on by the Hyatt. We can talk about HDXPRT, benchmarks in general, or what you would most like to see in the future of performance evaluation. If nothing else, come by and escape the humidity! Drop us an email at hdxprt_computex@principledtechnologies.com and set up a time to come on over.

Bill


What to do, what to do

When you set out to build an application-based benchmark like HDXPRT, you face many choices, but two are particularly important:  what applications do you run, and what functions do you perform in each application?

With HDXPRT, the answers were straightforward, as they should be.

The applications we chose reflect a blend of market leaders, programs providing emerging but important features, and the input of our community members.

The functions we perform in each application are representative of common uses of those programs—and they reflect the input of the community.

What’s so important here is the last clause of each of those paragraphs:  your input defines this benchmark.

As we finish off HDXPRT 2011 and then move to the 2012 version, we'll begin the development cycle anew. When we do, if you want to make sure we choose the applications and functions that matter most to you, then participate: tell us what you want, and let us hear your voice. We will respond to all input. We can't guarantee to accept every suggestion—after all, goals and desires sometimes conflict—but we can guarantee that you will hear back from us and that we will explain the rationale for our decisions.

Mark Van Name


Putting HDXPRT in some benchmark context

Benchmarks come in many shapes and sizes.  Some are extremely small, simple, and focused, while others are large, complex, and cover many aspects of a system.  To help position HDXPRT in the world of benchmarks, let me share with you a little taxonomy that Bill and I have long used.  No taxonomy is perfect, of course, but we’ve found this one to be very helpful as a general categorization tool.

From the perspective of how benchmarks measure performance, you can divide most of them into three groups.

Inspection tools use highly specialized tests to target very particular parts of a system. Back in the day, lo these many decades ago—okay, it was only two decades, but in dog years two tech decades is like five generations—some groups used a simple no-op loop to measure processor performance. I know, it sounds dumb today, but for a short time many felt it was a legitimate measure of processor clock speed, which is one aspect of performance. Similarly, if you want to know how fast a graphics subsystem can draw a particular kind of line, you can write code that draws lines of that type over and over.

These tools have very limited utility, because they don’t do what real users do, but for people working close to hardware, they can be useful.
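
As a toy illustration of the inspection-tool idea (something I made up for this post, not anything HDXPRT does), here is a tiny Python script that times one deliberately trivial operation over and over and reports a rate:

```python
# Toy illustration of an inspection-style test: time one deliberately trivial
# operation (an empty loop iteration) over and over and report a rate.
# By design it says nothing about real application performance.
import time

ITERATIONS = 10_000_000

start = time.perf_counter()
for _ in range(ITERATIONS):
    pass  # the "work" is intentionally a no-op
elapsed = time.perf_counter() - start

print(f"{ITERATIONS / elapsed / 1e6:.1f} million empty iterations per second")
```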

Moving closer to the real world, synthetic benchmarks are specially written programs that simulate the kinds of work their developers believe real users are doing. So, if you think your target users are spending all day in email, you could write your own mini email client and time functions in it.  These tools definitely move closer to real user work than inspection tools, but they still have the drawback of not actually running the programs real people are using.
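
Here is an equally hypothetical sketch of the synthetic approach: instead of driving a real email client, it builds a fake mailbox, performs a simulated search, and times that work. The message count, vocabulary, and search term are arbitrary:

```python
# Hypothetical synthetic-benchmark sketch: simulate a mailbox search instead of
# driving a real email client, and time the simulated work.
import random
import time

def make_messages(n=20_000):
    words = ["meeting", "invoice", "benchmark", "schedule", "report", "photo"]
    return [" ".join(random.choices(words, k=30)) for _ in range(n)]

def search(messages, term):
    return [m for m in messages if term in m]

messages = make_messages()

start = time.perf_counter()
hits = search(messages, "benchmark")
elapsed = time.perf_counter() - start

print(f"Searched {len(messages)} simulated messages in {elapsed * 1000:.1f} ms ({len(hits)} hits)")
```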

Application-based benchmarks take that last step by using real applications, the same programs that users employ in the real world. These benchmarks cause those applications to perform the kinds of actions that real users take, and they time those actions.  You can always argue about how representative they are—more on that in a future blog entry, assuming I don’t forget to write it—but they are definitely closer to the real world because they’re using real applications.
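
And, finally, a purely illustrative sketch of the application-based approach: launch a real program, have it perform a real task, and time that task end to end. The photo_editor command and its arguments are placeholders, not a real HDXPRT workload:

```python
# Purely illustrative sketch of the application-based approach: launch a real
# program, have it perform a real task, and time the task end to end.
# "photo_editor" and its arguments are placeholders, not a real HDXPRT workload.
import subprocess
import time

def time_application_task(command):
    start = time.perf_counter()
    subprocess.run(command, check=True)  # blocks until the application finishes
    return time.perf_counter() - start

if __name__ == "__main__":
    elapsed = time_application_task(["photo_editor", "--batch-resize", "vacation_photos/"])
    print(f"Batch resize completed in {elapsed:.1f} s")
```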

With all of that background, HDXPRT becomes easy to classify:  it’s an application-based benchmark.

Mark Van Name


An example of the community in action

Last week, I hosted a Webinar on HDXPRT. We’ll make a recording of it available on the site fairly soon. Multiple members attended. As I was going through the slides and discussing various aspects of the benchmark, a member asked about installing the benchmark from a USB key or a server. My response was the simple truth: we hadn’t considered that approach. As I then elaborated, we clearly should have thought about it, because those capabilities would be useful in just about every production lab out there, including ours here at PT. I concluded by saying that we’d look into it.

I'm not naming the member simply because, with big companies, I'm never sure whether doing so will be welcome or will cause someone trouble, and I don't want to create a hassle for anyone. He should, though, feel free to step forward and claim the well-deserved credit for the suggestion.

Less than a week after the Webinar, I’m happy to be able to report that the team has done more than look into these capabilities; it’s implemented them! So, the next Beta release, Beta 2, which we’ll be releasing any time now (maybe even before we post this blog entry), lets you install the benchmark from a network share or a USB key.

I know this is a relatively small thing, but I think it bears reporting because it is exactly the way the community should work. A member brought the benefits of his experience to bear in a great bit of feedback, and now the benchmark is better for it—and so are all of us who use it.

Keep the good ideas coming!

Mark Van Name


