BenchmarkXPRT Blog banner

Category: Benchmarking

Apples and pears vs. oranges and bananas

When people talk about comparing disparate things, they often say that you’re comparing apples and oranges. However, sometimes that expression doesn’t begin to describe the situation.

Recently, Justin wrote about using CrXPRT on systems running Neverware CloudReady OS. In that post, he noted that we couldn’t guarantee that using CrXPRT on CloudReady and Chrome OS systems would be a fair comparison. Not surprisingly, that prompted the question “Why not?”

Here’s the thing: It’s a fair comparison of those software stacks running on those hardware configurations. If everyone accepted that and stopped there, all would be good. However, almost inevitably, people will read more into the scores than is appropriate.

In such a comparison, we’re changing multiple variables at once. We’ve written before about the effect of the software stack on performance. CloudReady and Chrome OS are two different implementations of the Chromium OS, and it’s possible that one is more efficient than the other. If so, that would affect CrXPRT scores. At the same time, the raw performance of the two hardware configurations under test could also differ to a certain degree, which would also affect CrXPRT scores.

Here’s a metaphor: If you measure the effective force at the end of two levers and find a difference, to what do you attribute that difference? If you know the levers are the same length, you can attribute the difference to the amount of applied force. If you know the applied force is identical, you can attribute the difference to the length of the levers. If you lack both of those data points, you can’t know whether the difference is due to the length, the force, or a combination of the two.

With a benchmark, you can run multiple experiments designed to isolate variables and use the results from those experiments to look for trends. For example, we could install both CloudReady OS and Chrome OS on the same Intel-based Chromebook and compare the CrXPRT results. Because that removes hardware differences as a variable, such an experiment would offer some insight into how the two implementations compare. However, because differences in hardware can affect the performance of a given piece of software, this single data point would be of limited value. We could repeat the experiment on a variety of other Intel-based Chromebooks, and other patterns might emerge. If one of the implementations consistently scored higher, that would suggest that it was more efficient than the other, but would still not be definitively conclusive.

I hope this gives you some idea about why we are cautious about drawing conclusions when comparing results from different sets of hardware running different software stacks.

Eric

Thoughts from MWC Shanghai

I’ve spent the last couple days walking the exhibition halls of MWC Shanghai. The Shanghai New International Expo Centre (SNIEC) is large, but smaller than the MWC exhibit space in Barcelona or the set of exhibit halls in Las Vegas for CES. (SNIEC is not even the biggest exhibition space in Shanghai!) Further, MWC here still only took up half the exhibition space, but there was plenty to see. And, I’m less exhausted than after CES or MWC in Barcelona!

Photo Jun 28, 9 56 45 AM

If I had to pick one theme from the exhibition halls, it would be 5G. It seemed like half the booths had 5G displayed somewhere in their signage. The cloud was the other concept that seemed to be everywhere. While neither was surprising, it was interesting to see halfway around the world. In truth, it feels like 5G is much farther along here than it is back in the States.

I was also surprised to see how many phone vendors are here that I’d never heard of before such as Lephone and Gionee. I stopped by their booths with XPRT Spotlight information and hope they will send in some of their devices for inclusion in the future.

One thing I found of note was how much technology in general and IoT in particular is going to be everywhere. There was an interesting exhibit showing how stores of the future might operate. I was able to “buy” items without traditionally checking out. (I got a free water and some cookies out of the experience.) I just placed the items in a location on the checkout counter, which read their NFC labels and displayed them on the checkout screen. It seemed sort of like my understanding of the experiments that Amazon has been doing with brick-and-mortar grocery stores (prior to their purchase of Whole Foods). The whole experience felt a bit odd and still unpolished, but I’m sure it will improve and I’ll get used to it.

Photo Jun 29, 12 04 30 PM

The next generation will find it not odd, but normal. There were exhibits with groups of children playing with creative technologies from handheld 3D printers to simplified programming languages. They will be the generation after digital natives, maybe the digital creatives? What impact will they have? The future is both exciting and daunting!

I came away from the conference thinking about how the XPRTs can help folks choose amongst the myriad devices and technologies that are just around the corner. What would you most like to see the XPRTs tackle in the next six months to a year?

Bill Catchings

HDXPRT: see how your Windows PC handles media tasks

Over the last several weeks, we reminded readers of the capabilities and benefits of TouchXPRT, CrXPRT, and BatteryXPRT. This week, we’d like to highlight HDXPRT. HDXPRT, which stands for High Definition Experience & Performance Ratings Test, was the first benchmark published by the HDXPRT Development Community, which later became the BenchmarkXPRT Development Community. HDXPRT evaluates the performance of Windows devices while handling real-world media tasks such as photo editing, video conversion, and music editing, all while using real commercial applications, including Photoshop and iTunes. HDXPRT presents results that are relevant and easy to understand.

We originally distributed HDXPRT on installation DVDs, but HDXPRT 2014, the latest version, is available for download from HDXPRT.com. HDXPRT 2014 is for systems running Windows 8.1 and later. The benchmark takes about 10 minutes to install, and a run takes less than two hours.

HDXPRT is a useful tool for anyone who wants to evaluate the real-world, content-creation capabilities of a Windows PC. To see test results from a variety of systems, go to HDXPRT.com and click View Results, where you’ll find scores from many different Windows devices.

If you’d like to run HDXPRT:

Simply download HDXPRT from HDXPRT.com. The HDXPRT user manual provides information on minimum system requirements, as well as step-by-step instructions for how to configure your system and kick off a test. Testers running HDXPRT on Windows 10 Creators Update builds should consult the tech support note posted on HDXPRT.com.

If you’d like to dig into the details:

Check out the Exploring HDXPRT 2014 white paper. In it, we discuss the benchmark’s three test scenarios in detail and show how we calculate the results.

If you’d like to dig even deeper, the HDXPRT source code is available to members of the BenchmarkXPRT Development Community, so consider joining today. Membership is free for members of any company or organization with an interest in benchmarks, and there are no obligations after joining.

If you haven’t used HDXPRT before, give it a shot and let us know what you think!

On another note, Bill will be attending Mobile World Congress in Shanghai next week. Let us know if you’d like to meet up and discuss the XPRTs or how to get your device in the XPRT Spotlight.

Justin

Notes from the lab

This week’s XPRT Weekly Tech Spotlight featured the Alcatel A30 Android phone. We chose the A30, an Amazon exclusive, because it’s a budget phone running Android 7.0 (Nougat) right out of the box. That may be an appealing combination for consumers, but running a newer OS on inexpensive hardware such as what’s found in the A30 can cause issues for app developers, and the XPRTs are no exception.

Spotlight fans may have noticed that we didn’t post a MobileXPRT 2015 or BatteryXPRT 2014 score for the A30. In both cases, the benchmark did not produce an overall score because of a problem that occurs during the Create Slideshow workload. The issue deals with text relocation and significant changes in the Android development environment.

As of Android 5.0, on 64-bit devices, the OS doesn’t allow native code executables to perform text relocation. Instead, it is necessary to compile the executables using position-independent code (PIC) flags. This is how we compiled the current version of MobileXPRT, and it’s why we updated BatteryXPRT earlier this year to maintain compatibility with the most recent versions of Android.

However, the same approach doesn’t work for SoCs built with older 32-bit ARMv7-A architectures, such as the A30’s Qualcomm Snapdragon 210, so testers may encounter this issue on other devices with low-end hardware.

Testers who run into this problem can still use MobileXPRT 2015 to generate individual workload scores for the Apply Photo Effects, Create Photo Collages, Encrypt Personal Content, and Detect Faces workloads. Also, BatteryXPRT will produce an estimated battery life for the device, but since it won’t produce a performance score, we ask that testers use those numbers for informational purposes and not publication.

If you have any questions or have encountered additional issues, please let us know!

Justin

Evaluating machine learning performance

A  few weeks ago, I discussed the rising importance of machine learning and our efforts to develop a tool to help in evaluating its performance. Here is an update on our thinking.

One thing we are sure of is that we can’t cover everything in machine learning. The field is evolving rapidly, so we think the best approach is to pick a good place to start and then build from there.

One of the key areas we need to hone in on is the algorithms that we will employ in MLXPRT. (We haven’t formally decided on a name, but are currently using MLXPRT internally when we talk about what we’ve been doing.)

Computer vision, or image detection, seems to be a good place to start. We see three specific sets of algorithms to possibly cover. Worth noting, there is plenty of muddying of lines amongst these sets.

The first set of computer vision algorithms performs image classification. These algorithms identify things like a cat or a dog in an image. Some of the most popular algorithms are Alexnet and GoogLeNet, as well as ones from VGG . The initial training and use for these was on the ImageNet database, containing over 10 million images.

The next set of algorithms in computer vision performs object detection and localization. The algorithms identify the contents and their spatial location in an image, and typically draw bounding boxes around them. A couple of the most popular algorithms are Faster R-CNN and Single Shot MultiBox Detector (SSD).

The final set of computer vision algorithms perform image segmentation. Rather than just drawing a box around an object, image segmentation attempts to classify each pixel in an image by the object it is a part of. The result looks like a contour/color map that shows the different objects in the image. These techniques can be especially useful in autonomous vehicles and medical diagnostic imaging. Currently, the leading algorithms in image segmentation are fully convolution networks (FCN), but the area is developing rapidly.

Even limiting the initial version of MLXPRT to computer vision may be too broad. For example, we may end up only doing image classification and object detection.

As always, we crave input from folks, like yourself, who are working in these areas. What would you most like to see in a machine learning performance tool?

Bill

Learning something new every day

We’re constantly learning and thinking about how the XPRTs can help people evaluate the tech that will soon be a part of daily life. It’s why we started work on a tool to evaluate machine learning capabilities, and it’s why we developed CrXPRT in response to Chromebooks’ growing share of the education sector.

The learning process often involves a lot of tinkering in the lab, and we recently began experimenting with Neverware’s CloudReady OS. CloudReady is an operating system based on the open-source Chromium OS. Unlike Chrome OS, which can run on only Chromebooks, CloudReady can run on many types of systems, including older Windows and OS X machines. The idea is that individuals and organizations can breathe new life into aging hardware by incorporating it into a larger pool of devices managed through a Google Admin Console.

We were curious to see if it worked as advertised, and if it would run CrXPRT 2015. Installing CloudReady on an old Dell Latitude E6430 was easy enough, and we then installed CrXPRT from the Chrome Web Store. Performance tests ran without a hitch. Battery life tests would kick off but not complete, which was not a big surprise because the battery life calls involved were developed specifically for Chrome OS.

So, what role can CrXPRT play with CloudReady, and what are the limitations? CloudReady has a lot in common with Chrome OS, but there are some key differences. One way we see the CrXPRT performance test being useful is for comparing CloudReady devices. Say that an organization was considering adopting CloudReady on certain legacy systems but not on others; CrXPRT performance scores would provide insight into which devices performed better with CloudReady. While you could use CrXPRT to compare those devices to Chromebooks, the differences between the operating systems are significant enough that we cannot guarantee the comparison would be a fair one.

Have you spent any time working with CloudReady, or are there other interesting new technologies you’d like us to investigate? Let us know!

Justin

Check out the other XPRTs:

Forgot your password?