
Category: Performance benchmarking

The WebXPRT 3 results calculation white paper is now available

As we’ve discussed in prior blog posts, transparency is a core value of our open development community. A key part of being transparent is explaining how we design our benchmarks, why we make certain development decisions, and how the benchmarks actually work. This week, to help WebXPRT 3 testers understand how the benchmark calculates results, we published the WebXPRT 3 results calculation and confidence interval white paper.

The white paper explains what the WebXPRT 3 confidence interval is, how it differs from typical benchmark variability, and how the benchmark calculates the individual workload scenario and overall scores. The paper also provides an overview of the statistical techniques WebXPRT uses to translate raw times into scores.

To supplement the white paper’s overview of the results calculation process, we’ve also published a spreadsheet that shows the raw data from a sample test run and reproduces the calculations WebXPRT uses.
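To give a feel for the kind of arithmetic involved, here is a rough Python sketch. The calibration value, scaling constant, and normal-approximation confidence interval below are illustrative assumptions, not WebXPRT's actual formulas; see the white paper and spreadsheet for the real calculations.

```python
import math
import statistics

def workload_score(raw_time_ms, calibration_time_ms, scale=100):
    """Hypothetical calibration-relative score: runs faster than the
    calibration time score above `scale`, slower runs score below it."""
    return scale * calibration_time_ms / raw_time_ms

def overall_score(scores):
    """Combine per-workload scores with a geometric mean, a common way
    to keep one workload from dominating the overall result."""
    return math.exp(sum(math.log(s) for s in scores) / len(scores))

def confidence_interval_95(run_scores):
    """Mean of repeated runs plus the half-width of a 95% confidence
    interval, using the normal approximation (1.96 * standard error)."""
    mean = statistics.mean(run_scores)
    half_width = 1.96 * statistics.stdev(run_scores) / math.sqrt(len(run_scores))
    return mean, half_width

# Three workloads against a hypothetical 250 ms calibration time
scores = [workload_score(t, 250) for t in (200, 250, 500)]
print(round(overall_score(scores), 1))  # → 85.5

mean, half = confidence_interval_95([205, 198, 202, 200])
print(mean, round(half, 2))
```

Note that the confidence interval here describes the statistical precision of a set of repeated runs, which is a different idea from run-to-run variability between test environments; the white paper explains how WebXPRT treats the distinction.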

The paper and spreadsheet are both available on WebXPRT.com and on our XPRT white papers page. If you have any questions about the WebXPRT results calculation process, please let us know, and be sure to check out our other XPRT white papers.

Justin

The value of speed

I was reading an interesting article on how high-end smartphones like the iPhone X, Pixel 2 XL, and Galaxy S8 generate more in-game revenue than cheaper phones do.

One line stood out to me: “With smartphones becoming faster, larger and more capable of delivering an engaging gaming experience, these monetization key performance indicators (KPIs) have begun to increase significantly.”

It turns out the game companies totally agree with the rest of us that faster devices are better!

Regardless of who is seeking better performance (consumers or game companies), the obvious question is how to determine which models are fastest. Many folks rely on device vendors' claims about how much faster the new model is. Unfortunately, those claims don't always specify what they're based on. Even when they do, it's hard to know whether the numbers are accurate and apply to how you use your device.

The key part of any answer is performance tools that are representative, dependable, and open.

  • Representative – Performance tools need to have realistic workloads that do things that you care about.
  • Dependable – Good performance tools run reliably and produce repeatable results, both of which require that significant work go into their development and testing.
  • Open – Performance tools that allow people to access the source code, and even contribute to it, keep things above board and reassure you that you can rely on the results.

Our goal with the XPRTs is to provide performance tools that meet all these criteria. WebXPRT 3 and all our other XPRTs exist to help accurately reveal how devices perform. You can run them yourself or rely on the wealth of results that we and others have collected on a wide array of devices.

The best thing about good performance tools is that everyone, even vendors, can use them. I sincerely hope that you find the XPRTs helpful when you make your next technology purchase.

Bill

AIXPRT: We want your feedback!

Today, we’re publishing the AIXPRT Request for Comments (RFC) document. The RFC explains the need for a new artificial intelligence (AI)/machine learning benchmark, shows how the BenchmarkXPRT Development Community plans to address that need, and provides preliminary design specifications for the benchmark.

We’re seeking feedback and suggestions from anyone interested in shaping the future of machine learning benchmarking, including those not currently part of the Development Community. Usually, only members of the BenchmarkXPRT Development Community have access to our RFCs and the opportunity to provide feedback. However, because we’re seeking input from non-members who have expertise in this field, we will be posting this RFC in the New events & happenings section of the main BenchmarkXPRT.com page and making it available at AIXPRT.com.

We welcome input on all aspects of the benchmark, including scope, workloads, metrics and scores, UI design, and reporting requirements. We will accept feedback through May 13, 2018, after which BenchmarkXPRT Development Community administrators will collect and evaluate the feedback and publish the final design specification.

Please share the RFC with anyone interested in machine learning benchmarking, and send us your feedback before May 13.

Justin

The XPRTs in action

In the near future, we’ll update our “XPRTs around the world” infographic, which provides a snapshot of how people are using the XPRTs worldwide. Among other stats, we include the number of XPRT web mentions, articles, and reviews that have appeared during a given period. Recently, we learned how one of those statistics, a single website mention of WebXPRT, found its way to consumers in more places than we would have imagined.

Late last month, AnandTech published a performance comparison by Andrei Frumusanu examining the Samsung Galaxy S9’s Snapdragon 845 and Exynos 9810 variants and a number of other high-end phones. WebXPRT was one of the benchmarking tools used. The article stated that both versions of the brand-new S9 were slower than the iPhone X and, in some tests, were slower than even the iPhone 7.

A CNET video discussed the article and the role of WebXPRT in the performance comparison, and the article has been reposted to hundreds of tech media sites around the world. A quick survey shows reposts in Albania, Bulgaria, Denmark, Chile, the Czech Republic, France, Germany, Greece, Indonesia, Iran, Italy, Japan, Korea, Poland, Russia, Spain, Slovakia, Turkey, and many other countries.

The popularity of the article is not surprising, for it positions the newest flagship phones from the industry’s two largest phone makers in a head-to-head comparison with a somewhat unexpected outcome. AnandTech did nothing to stir controversy or sensationalize the test results, but simply provided readers with an objective, balanced assessment of how these devices compare so that they could draw their own conclusions. The XPRTs share this approach.

We’re grateful to Andrei and others at AnandTech who’ve used the XPRTs over the years to produce content that helps consumers make informed decisions. WebXPRT is just part of AnandTech’s toolkit, but it’s one that’s accessible to anybody free of charge. With the help of BenchmarkXPRT Development Community members, we’ll continue to publish XPRT tools that help users everywhere gain valuable insight into device performance.

Justin

WebXPRT 3 is here!

We’re excited to announce that WebXPRT 3 is now available to the public. BenchmarkXPRT Development Community members have been using a community preview for several weeks, but now anyone can run WebXPRT 3 and publish their results.

As we mentioned on the blog, WebXPRT 3 has a completely new UI, updated workloads, and new test content. We carried over several features from WebXPRT 2015, including automation, the option to run individual workloads, and language options for English, German, and Simplified Chinese.

We believe WebXPRT 3 will be as relevant and reliable as WebXPRT 2013 and 2015. After trying it out, please submit your scores and feel free to let us know what you think. We look forward to seeing new results submissions!

WebXPRT 3 arrives next week!

After much development work and testing, we’re happy to report that we’ll be releasing WebXPRT 3 early next week!

Here are the final workload names and descriptions.

1) Photo Enhancement: Applies three effects, each using Canvas, to two photos.
2) Organize Album Using AI: Detects faces and classifies images using the ConvNetJS neural network library.
3) Stock Option Pricing: Calculates and displays graphical views of a stock portfolio using Canvas, SVG, and dygraphs.js.
4) Encrypt Notes and OCR Scan: Encrypts notes in local storage and scans a receipt using optical character recognition.
5) Sales Graphs: Calculates and displays multiple views of sales data using InfoVis and d3.js.
6) Online Homework: Performs science and English homework using Web Workers and Typo.js spell check.

As we mentioned in an earlier blog post, the updated photo workloads contain new images and a deep learning task. We also added an optical character recognition task to the Local Notes workload and combined part of the DNA Sequence Analysis scenario with a writing sample/spell check scenario to simulate online homework in the new Online Homework workload.
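The workloads themselves run in JavaScript in the browser (Canvas, Web Workers, ConvNetJS, and so on). As a loose, hypothetical illustration of the kind of per-pixel work a photo-enhancement effect performs, here is a simple grayscale pass sketched in Python; the real benchmark does nothing with this exact code.

```python
def to_grayscale(pixels):
    """Apply a luminance (Rec. 601) grayscale effect, the kind of
    per-pixel pass a Canvas photo filter performs.
    `pixels` is a list of (r, g, b) tuples with 0-255 channels."""
    gray = []
    for r, g, b in pixels:
        # Weighted sum approximates perceived brightness
        y = int(0.299 * r + 0.587 * g + 0.114 * b)
        gray.append((y, y, y))
    return gray

print(to_grayscale([(255, 0, 0), (0, 255, 0)]))  # → [(76, 76, 76), (149, 149, 149)]
```

In the browser, the equivalent loop would read and write a Canvas `ImageData` buffer, which is what makes these effects a useful measure of JavaScript and rendering performance.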

Longtime WebXPRT users will immediately notice a completely new UI. We worked to improve the UI’s appearance on smaller devices such as phones, and we think testers will find it easier to navigate.

Testers can still choose to run individual workloads, and we’re once again offering English, German, and Simplified Chinese language options.

Below my sig, I’ve included pictures of WebXPRT 3’s start test and results pages, as well as an in-test screen capture.

We’re thankful for all the interest in WebXPRT 3 so far and believe the new version of WebXPRT will be as relevant and reliable as WebXPRT 2013 and 2015—and easier to use. We look forward to seeing new results submissions next week!

Justin

WebXPRT 3 start page

WebXPRT 3 in test

Results page
