Category: Benchmarking

Best practices

on August 3, 2017

Recently, a tester wrote in and asked for help determining why they were seeing different WebXPRT scores on two tablets with the same hardware configuration. The scores differed by approximately 7.5 percent. This can happen for many reasons, including different software stacks, but score variability can also result from different testing behavior and environments. While some degree of variability is natural, the question provides us with a great opportunity to talk about the basic benchmarking practices we follow in the XPRT lab, practices that contribute to the most consistent and reliable scores.

Below, we list a few basic best practices you might find useful in your testing. While they’re largely in the context of the WebXPRT focus on evaluating browser performance, several of these practices apply to other benchmarks as well.

Test with clean images: We use an out-of-box (OOB) method for testing XPRT Spotlight devices. OOB testing means that other than initial OS and browser version updates that users are likely to run after first turning on the device, we change as little as possible before testing. We want to assess the performance that buyers are likely to see when they first purchase the device, before installing additional apps and utilities. This is the best way to provide an accurate assessment of the performance retail buyers will experience. While OOB is not appropriate for certain types of testing, the key is to not test a device that’s bogged down with programs that influence results unnecessarily.
Turn off updates: We do our best to eliminate or minimize app and system updates after initial setup. Some vendors are making it more difficult to turn off updates completely, but you should always account for update settings.
Get a feel for system processes: Depending on the system and the OS, a lot of system-level activity can be going on in the background after you turn the device on. As much as possible, we like to wait for a stable baseline (idle) of system activity before kicking off a test. If we start testing immediately after booting the system, we often see higher variability in the first run before the scores start to tighten up.
Disclosure is not just about hardware: Most people know that different browsers will produce different performance scores on the same system. However, testers aren’t always aware of shifts in performance between different versions of the same browser. While most updates don’t have a large impact on performance, a few updates have increased (or even decreased) browser performance by a significant amount. For this reason, it’s always worthwhile to record and disclose the extended browser version number for each test run. The same principle applies to any other relevant software.
Use more than one data point: Because of natural variability, our standard practice in the XPRT lab is to publish a score that represents the median of three to five runs. If you run a benchmark only once, and the score differs significantly from other published scores, your result could be an outlier that you would not see again under stable testing conditions.
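
As a rough illustration of the last tip, here's a minimal Python sketch that reports the median of several runs and the percent difference between two devices. The scores are made-up placeholders, not real WebXPRT results, and the function names are our own, chosen only for this example.

```python
from statistics import median


def reported_score(runs):
    """Return the median of a list of run scores (requires at least three runs)."""
    if len(runs) < 3:
        raise ValueError("use at least three runs to get a stable median")
    return median(runs)


def percent_difference(score_a, score_b):
    """Percent by which score_a differs from score_b, relative to score_b."""
    return (score_a - score_b) / score_b * 100.0


# Hypothetical scores from five runs on each of two tablets (not real data).
tablet_a_runs = [212, 215, 214, 198, 216]
tablet_b_runs = [200, 201, 199, 202, 200]

score_a = reported_score(tablet_a_runs)
score_b = reported_score(tablet_b_runs)

print(f"Tablet A median: {score_a}")
print(f"Tablet B median: {score_b}")
print(f"Difference: {percent_difference(score_a, score_b):.1f}%")
```

Note that the one low run in the first list (198) does not change the median, which is one reason a median of several runs is a steadier number to publish than any single run.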

We hope those tips will make testing a little easier for you. If you have any questions about the XPRTs, or about benchmarking in general, feel free to ask!

Justin

Posted in Benchmark metrics, Benchmarking, Browser-based benchmarks, Performance benchmarking, Performance testing on tablets, Uncategorized, WebXPRT, WebXPRT 2013, WebXPRT 2015, XPRT Weekly Tech Spotlight |

MobileXPRT: evaluate the performance of your Android device

By Justin Greene

on July 27, 2017

We recently discussed the capabilities and benefits of TouchXPRT, CrXPRT, BatteryXPRT, and HDXPRT. This week, we’re focusing on MobileXPRT, an app that evaluates how well an Android device handles everyday tasks. Like the other XPRT family benchmarks, MobileXPRT is easy to use. It takes less than 15 minutes to run on most devices, runs relatable workloads, and delivers reliable, objective, and easy-to-understand results.

MobileXPRT includes five performance scenarios (Apply Photo Effects, Create Photo Collages, Create Slideshow, Encrypt Personal Content, and Detect Faces to Organize Photos). By default, the benchmark runs all five tasks and reports individual workload scores and an overall performance score.

MobileXPRT 2015 is the latest version of the app, supporting both 32-bit and 64-bit hardware running Android 4.4 or higher. To test systems running older versions of Android, or to test 32-bit performance on a 64-bit system, you can use MobileXPRT 2013. The results of the two versions are comparable.

MobileXPRT is a useful tool for anyone who wants to compare the performance capabilities of Android phones or tablets. To see test results from a variety of systems, go to MobileXPRT.com and click View Results, where you’ll find scores from many different Android devices.

If you’d like to run MobileXPRT:

Simply download MobileXPRT from MobileXPRT.com or the Google Play Store. The full installer package on MobileXPRT.com, containing both app and test data, is 243 MB. You may also use this link to download the 18 MB MobileXPRT app file, which will download the test data during installation. The MobileXPRT user manual provides instructions for configuring your device and kicking off a test.

If you’d like to dig into the details:

Check out the Exploring MobileXPRT 2015 white paper. In it, we discuss the MobileXPRT development process and details of the individual performance scenarios. We also explain exactly how the benchmark calculates results.

If you’d like to dig even deeper, the MobileXPRT source code is available to members of the BenchmarkXPRT Development Community, so consider joining today. Membership is free for members of any company or organization with an interest in benchmarks, and there are no obligations after joining.

If you haven’t used MobileXPRT before, give it a shot and let us know what you think!

Justin

Posted in Android, BatteryXPRT 2014 for Android, CrXPRT, HDXPRT, Mobile devices, MobileXPRT, MobileXPRT 2015, Performance benchmarking, TouchXPRT, WebXPRT |

Planning the next version of HDXPRT

By Justin Greene

on July 20, 2017

A few weeks ago, we wrote about the capabilities and benefits of HDXPRT. This week, we want to share some initial ideas for the next version of HDXPRT, and invite you to send us any comments or suggestions you may have.

The first step towards a new HDXPRT will be updating the benchmark’s workloads to increase their value in the years to come. Primarily, this will involve updating application content, such as photos and videos, to more contemporary file resolutions and sizes. We think 4K-related workloads will increase the benchmark’s relevance, but aren’t sure whether 4K playback tests are necessary. What do you think?

The next step will be to update versions of the real-world trial applications included in the benchmark, including Adobe Photoshop Elements, Apple iTunes, Audacity, CyberLink MediaEspresso, and HandBrake. Are there any other applications you feel would be a good addition to HDXPRT’s editing photos, editing music, or converting videos test scenarios?

We’re also planning to update the UI to improve the look and feel of the benchmark and simplify navigation and functionality.

Last but not least, we’ll work to fix known problems, such as the hardware acceleration settings issue in MediaEspresso, and eliminate the need for workarounds when running HDXPRT on the Windows 10 Creators Update.

Do you have feedback on these ideas or suggestions for applications or test scenarios that we should consider for HDXPRT? Are there existing features we should remove? Are there elements of the UI that you find especially useful or would like to see improved? Please let us know. We want to hear from you and make sure that HDXPRT continues to meet your needs.

Justin

Posted in 4K, BenchmarkXPRT, Collaborative benchmark development, Future of performance evaluation, HDXPRT, HDXPRT capabilities, HDXPRT development process, HDXPRT release cycle, Let us know your thoughts, Performance benchmarking, What makes a good benchmark?, Windows 10 |

Apples and pears vs. oranges and bananas

By Eric Hale

on July 6, 2017

When people talk about comparing disparate things, they often say that you’re comparing apples and oranges. However, sometimes that expression doesn’t begin to describe the situation.

Recently, Justin wrote about using CrXPRT on systems running Neverware CloudReady OS. In that post, he noted that we couldn’t guarantee that using CrXPRT on CloudReady and Chrome OS systems would be a fair comparison. Not surprisingly, that prompted the question “Why not?”

Here’s the thing: It’s a fair comparison of those software stacks running on those hardware configurations. If everyone accepted that and stopped there, all would be good. However, almost inevitably, people will read more into the scores than is appropriate.

In such a comparison, we’re changing multiple variables at once. We’ve written before about the effect of the software stack on performance. CloudReady and Chrome OS are two different implementations of the Chromium OS, and it’s possible that one is more efficient than the other. If so, that would affect CrXPRT scores. At the same time, the raw performance of the two hardware configurations under test could also differ to a certain degree, which would also affect CrXPRT scores.

Here’s a metaphor: If you measure the effective force at the end of two levers and find a difference, to what do you attribute that difference? If you know the levers are the same length, you can attribute the difference to the amount of applied force. If you know the applied force is identical, you can attribute the difference to the length of the levers. If you lack both of those data points, you can’t know whether the difference is due to the length, the force, or a combination of the two.

With a benchmark, you can run multiple experiments designed to isolate variables and use the results from those experiments to look for trends. For example, we could install both CloudReady OS and Chrome OS on the same Intel-based Chromebook and compare the CrXPRT results. Because that removes hardware differences as a variable, such an experiment would offer some insight into how the two implementations compare. However, because differences in hardware can affect the performance of a given piece of software, this single data point would be of limited value. We could repeat the experiment on a variety of other Intel-based Chromebooks, and other patterns might emerge. If one of the implementations consistently scored higher, that would suggest that it was more efficient than the other, but the results would still not be conclusive.
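
To make the idea of isolating variables a little more concrete, here's a minimal Python sketch. It assumes you've run the same benchmark under two OS implementations on each of several devices, so hardware is held constant within each pair. The device names, OS labels, and scores are hypothetical placeholders, not measurements.

```python
# Hypothetical benchmark scores: each device tested under two OS implementations.
# Device names, OS labels, and numbers are placeholders, not real results.
results = {
    "device_1": {"os_a": 100, "os_b": 104},
    "device_2": {"os_a": 150, "os_b": 153},
    "device_3": {"os_a": 80, "os_b": 79},
}


def paired_ratios(results, baseline, candidate):
    """Per-device ratio of candidate to baseline; hardware is fixed within each pair."""
    return {
        device: scores[candidate] / scores[baseline]
        for device, scores in results.items()
    }


ratios = paired_ratios(results, baseline="os_a", candidate="os_b")
wins = sum(1 for ratio in ratios.values() if ratio > 1.0)

for device, ratio in ratios.items():
    print(f"{device}: os_b / os_a = {ratio:.3f}")
print(f"os_b scored higher on {wins} of {len(ratios)} devices")
```

Even if one implementation came out ahead on every device in a comparison like this, that would only suggest a trend across the hardware tested. It wouldn't tell you how the two would compare on hardware you didn't test.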

I hope this gives you some idea about why we are cautious about drawing conclusions when comparing results from different sets of hardware running different software stacks.

Eric

Posted in Benchmark metrics, Benchmarking, Benchmarks in general, Chrome OS, Chromebooks, CrXPRT, Google Chrome, Performance benchmarking, What makes a good benchmark? |

Thoughts from MWC Shanghai

By Bill Catchings

on June 29, 2017

I’ve spent the last couple days walking the exhibition halls of MWC Shanghai. The Shanghai New International Expo Centre (SNIEC) is large, but smaller than the MWC exhibit space in Barcelona or the set of exhibit halls in Las Vegas for CES. (SNIEC is not even the biggest exhibition space in Shanghai!) Further, MWC here still only took up half the exhibition space, but there was plenty to see. And, I’m less exhausted than after CES or MWC in Barcelona!

If I had to pick one theme from the exhibition halls, it would be 5G. It seemed like half the booths had 5G displayed somewhere in their signage. The cloud was the other concept that seemed to be everywhere. While neither was surprising, it was interesting to see halfway around the world. In truth, it feels like 5G is much farther along here than it is back in the States.

I was also surprised to see how many phone vendors are here that I’d never heard of before, such as Lephone and Gionee. I stopped by their booths with XPRT Spotlight information and hope they will send in some of their devices for inclusion in the future.

One thing I found of note was how much technology in general and IoT in particular is going to be everywhere. There was an interesting exhibit showing how stores of the future might operate. I was able to “buy” items without traditionally checking out. (I got a free water and some cookies out of the experience.) I just placed the items in a location on the checkout counter, which read their NFC labels and displayed them on the checkout screen. It seemed sort of like my understanding of the experiments that Amazon has been doing with brick-and-mortar grocery stores (prior to their purchase of Whole Foods). The whole experience felt a bit odd and still unpolished, but I’m sure it will improve and I’ll get used to it.

The next generation will find it not odd, but normal. There were exhibits with groups of children playing with creative technologies from handheld 3D printers to simplified programming languages. They will be the generation after digital natives, maybe the digital creatives? What impact will they have? The future is both exciting and daunting!

I came away from the conference thinking about how the XPRTs can help folks choose amongst the myriad devices and technologies that are just around the corner. What would you most like to see the XPRTs tackle in the next six months to a year?

Bill Catchings

Posted in 5G, AI, computer vision, Education, Future of performance evaluation, Internet of things, IoT, Mobile World Congress, Trade Shows |

HDXPRT: see how your Windows PC handles media tasks

By Justin Greene

on June 22, 2017

Over the last several weeks, we reminded readers of the capabilities and benefits of TouchXPRT, CrXPRT, and BatteryXPRT. This week, we’d like to highlight HDXPRT. HDXPRT, which stands for High Definition Experience & Performance Ratings Test, was the first benchmark published by the HDXPRT Development Community, which later became the BenchmarkXPRT Development Community. HDXPRT evaluates the performance of Windows devices while handling real-world media tasks such as photo editing, video conversion, and music editing, all while using real commercial applications, including Photoshop and iTunes. HDXPRT presents results that are relevant and easy to understand.

We originally distributed HDXPRT on installation DVDs, but HDXPRT 2014, the latest version, is available for download from HDXPRT.com. HDXPRT 2014 is for systems running Windows 8.1 and later. The benchmark takes about 10 minutes to install, and a run takes less than two hours.

HDXPRT is a useful tool for anyone who wants to evaluate the real-world, content-creation capabilities of a Windows PC. To see test results from a variety of systems, go to HDXPRT.com and click View Results, where you’ll find scores from many different Windows devices.

If you’d like to run HDXPRT:

Simply download HDXPRT from HDXPRT.com. The HDXPRT user manual provides information on minimum system requirements, as well as step-by-step instructions for how to configure your system and kick off a test. Testers running HDXPRT on Windows 10 Creators Update builds should consult the tech support note posted on HDXPRT.com.

If you’d like to dig into the details:

Check out the Exploring HDXPRT 2014 white paper. In it, we discuss the benchmark’s three test scenarios in detail and show how we calculate the results.

If you’d like to dig even deeper, the HDXPRT source code is available to members of the BenchmarkXPRT Development Community, so consider joining today. Membership is free for members of any company or organization with an interest in benchmarks, and there are no obligations after joining.

If you haven’t used HDXPRT before, give it a shot and let us know what you think!

On another note, Bill will be attending Mobile World Congress in Shanghai next week. Let us know if you’d like to meet up and discuss the XPRTs or how to get your device in the XPRT Spotlight.

Justin

Posted in BenchmarkXPRT development community, HDXPRT, HDXPRT 2012, HDXPRT 2014, HDXPRT capabilities, HDXPRT development process, HDXPRT source code, Let us know your thoughts, Performance benchmarking, Source code, White papers, Windows 10, Windows 8.1, XPRT Weekly Tech Spotlight |
