BenchmarkXPRT Blog banner

Category: Collaborative benchmark development

AIXPRT: We want your feedback!

Today, we’re publishing the AIXPRT Request for Comments (RFC) document. The RFC explains the need for a new artificial intelligence (AI)/machine learning benchmark, shows how the BenchmarkXPRT Development Community plans to address that need, and provides preliminary design specifications for the benchmark.

We’re seeking feedback and suggestions from anyone interested in shaping the future of machine learning benchmarking, including those not currently part of the Development Community. Usually, only members of the BenchmarkXPRT Development Community have access to our RFCs and the opportunity to provide feedback. However, because we’re seeking input from non-members who have expertise in this field, we will be posting this RFC in the New events & happenings section of the main BenchmarkXPRT.com page and making it available at AIXPRT.com.

We welcome input on all aspects of the benchmark, including scope, workloads, metrics and scores, UI design, and reporting requirements. We will accept feedback through May 13, 2018, after which BenchmarkXPRT Development Community administrators will collect and evaluate the feedback and publish the final design specification.

Please share the RFC with anyone interested in machine learning benchmarking and please send us your feedback before May 13.

Justin

Comparing open source and open development

Why do we use open development when designing and building the XPRTs, and what’s the difference between our open development approach and traditional open-source methods? The terminology around these two models can be confusing, so we wanted to review some similarities and differences.

Why open development?

An open development approach helps encourage collaboration, innovation, and transparency. XPRT community members get involved in the development of each benchmark from the beginning:

  • They submit suggestions, questions, and concerns that inform the future design of the tools.
  • They view early proposals for new versions and contribute comments for the final design.
  • They suggest new workloads.
  • They have access to community previews (beta builds) of the tools.
  • They submit source code for inclusion in the benchmarks.
  • They examine existing source code.

A commitment to transparency

Because we’re committed to publishing reliable, unbiased benchmarks, we also want make the XPRT development process as transparent as possible. It’s not unusual for people to claim that any given benchmark contains hidden biases. To address this problem, we make our source code available to anyone who joins the community. This approach reduces the risk of unforeseen bias in our benchmarks.

Quality control

Unlike open-source models, open development allows us to control derivative works, which can be important in benchmarking. While open source encourages a constantly evolving product that may fork into substantially different versions, benchmarking requires a product that remains static to enable valid comparisons over time. By controlling derivative works, we can avoid the problem of unauthorized versions of the benchmarks being published as “XPRTs.”

In the future, we may use a traditional open-source model for specific XPRTs or other projects. If we do, we’ll share our reasoning with the community and ask for their thoughts about the best way to proceed. If you’re not a community member, but are interested in benchmark development, we encourage you to join today!

Justin

The XPRTs in action

In the near future, we’ll update our “XPRTs around the world” infographic, which provides a snapshot of how people are using the XPRTs worldwide. Among other stats, we include the number of XPRT web mentions, articles, and reviews that have appeared during a given period. Recently, we learned how one of those statistics—a single web site mention of WebXPRT—found its way to consumers in more places than we would have imagined.

Late last month, AnandTech published a performance comparison by Andrei Frumusanu examining the Samsung Galaxy S9’s Snapdragon 845 and Exynos 9810 variants and a number of other high-end phones. WebXPRT was one of the benchmarking tools used. The article stated that both versions of the brand-new S9 were slower than the iPhone X and, in some tests, were slower than even the iPhone 7.

A CNET video discussed the article and the role of WebXPRT in the performance comparison, and the article has been reposted to hundreds of tech media sites around the world. A quick survey shows reposts in Albania, Bulgaria, Denmark, Chile, the Czech Republic, France, Germany, Greece, Indonesia, Iran, Italy Japan, Korea, Poland, Russia, Spain, Slovakia, Turkey, and many other countries.

The popularity of the article is not surprising, for it positions the newest flagship phones from the industry’s two largest phone makers in a head-to-head comparison with a somewhat unexpected outcome. AnandTech did nothing to stir controversy or sensationalize the test results, but simply provided readers with an objective, balanced assessment of how these devices compare so that they could draw their own conclusions. The XPRTs share this approach.

We’re grateful to Andrei and others at AnandTech who’ve used the XPRTs over the years to produce content that helps consumers make informed decisions. WebXPRT is just part of AnandTech’s toolkit, but it’s one that’s accessible to anybody free of charge. With the help of BenchmarkXPRT Development Community members, we’ll continue to publish XPRT tools that help users everywhere gain valuable insight into device performance.

Justin

Just before showtime

In case you missed the announcement, WebXPRT 3 is now live! Please try it out, submit your test results, and feel free to send us your questions or comments.

During the final push toward launch day, it occurred to us that not all of our readers are aware of the steps involved in preparing a benchmark for general availability (GA). Here’s a quick overview of what we did over the last several weeks to prepare for the WebXPRT 3 release, a process that follows the same approach we use for all new XPRTs.

After releasing the community preview (CP), we started on the final build. During this time, we incorporated features that we were not able to include in the CP and fixed a few outstanding issues. Because we always try to make sure that CP results are comparable to eventual GA results, these issues rarely involve the workloads themselves or anything that affects scoring. In the case of WebXPRT 3, the end-of-test results submission form was not fully functional in the CP, so we finished making it ready for prime time.

The period between CP and GA releases is also a time to incorporate any feedback we get from the community during initial testing. One of the benefits of membership in the BenchmarkXPRT Development Community is access to pre-release versions of new benchmarks, along with an opportunity to make your voice heard during the development process.

When the GA candidate build is ready, we begin two types of extensive testing. First, our quality assurance (QA) team performs a thorough review, running the build on numerous devices. In the case of WebXPRT, it also involves testing with multiple browsers. The QA team also keeps a sharp eye out for formatting problems and bugs.

The second type of testing involves comparing the current version of the benchmark with prior versions. We tested WebXPRT 3 on almost 100 devices. While WebXPRT 2015 and WebXPRT 3 scores are not directly comparable, we normalize scores for both sets of results and check that device performance is scaling in the same way. If it isn’t, we need to determine why not.

Finally, after testing is complete and the new build is ready, we finalize all related documentation and tie  the various pieces together on the web site. This involves updating the main benchmark page and graphics, the FAQ page, the results tables, and the members’ area.

That’s just a brief summary of what we’ve been up to with WebXPRT in the last few weeks. If you have any questions about the XPRTs or the development community, feel free to ask!

Justin

Principled Technologies and the BenchmarkXPRT Development Community release WebXPRT 3, a free online performance evaluation tool for web-enabled devices

Durham, NC — Principled Technologies and the BenchmarkXPRT Development Community have released WebXPRT 3, a free online tool that gives objective information about how well a laptop, tablet, smartphone, or any other web-enabled device handles common web tasks. Anyone can go to WebXPRT.com and compare existing performance evaluation results on a variety of devices or run a simple evaluation test on their own.

WebXPRT 3 contains six HTML5- and JavaScript-based scenarios created to mirror common web browser tasks: Photo Enhancement, Organize Album Using AI, Stock Option Pricing, Encrypt Notes and OCR Scan, Sales Graphs, and Online Homework.

“WebXPRT is a popular, easy-to-use benchmark run by manufacturers, tech journalists, and consumers all around the world,” said Bill Catchings, co-founder of Principled Technologies, which administers the BenchmarkXPRT Development Community. “We believe that WebXPRT 3 is a great addition to WebXPRT’s legacy of providing relevant and reliable performance data for a wide range of devices.”

WebXPRT is one of the BenchmarkXPRT suite of performance evaluation tools. Other tools include MobileXPRT, TouchXPRT, CrXPRT, BatteryXPRT, and HDXPRT. The XPRTs help users get the facts before they buy, use, or evaluate tech products such as computers, tablets, and phones.

To learn more about and join the BenchmarkXPRT Development Community, go to www.BenchmarkXPRT.com.

About Principled Technologies, Inc.
Principled Technologies, Inc. is a leading provider of technology marketing and learning & development services. It administers the BenchmarkXPRT Development Community.

Principled Technologies, Inc. is located in Durham, North Carolina, USA. For more information, please visit www.PrincipledTechnologies.com.

Company Contact
Justin Greene
BenchmarkXPRT Development Community
Principled Technologies, Inc.
1007 Slater Road, Ste. 300
Durham, NC 27703
BenchmarkXPRTsupport@PrincipledTechnologies.com

WebXPRT 3 arrives next week!

After much development work and testing, we’re happy to report that we’ll be releasing WebXPRT 3 early next week!

Here are the final workload names and descriptions.

1) Photo Enhancement: Applies three effects to two photos each using Canvas.
2) Organize Album Using AI: Detects faces and classifies images using the ConvNetJS neural network library.
3) Stock Option Pricing: Calculates and displays graphics views of a stock portfolio using Canvas, SVG, and dygraphs.js.
4) Encrypt Notes and OCR Scan: Encrypts notes in local storage and scans a receipt using optical character recognition.
5) Sales Graphs: Calculates and displays multiple views of sales data using InfoVis and d3.js.
6) Online Homework: Performs science and English homework using Web Workers and Typo.js spell check.

As we mentioned in an earlier blog post, the updated photo workloads contain new images and a deep learning task. We also added an optical character recognition task to the Local Notes workload and combined part of the DNA Sequence Analysis scenario with a writing sample/spell check scenario to simulate online homework in the new Online Homework workload.

Longtime WebXPRT users will immediately notice a completely new UI. We worked to improve the UI’s appearance on smaller devices such as phones and we think testers will find it easier to navigate.

Testers can still choose to run individual workloads and we’re once again offering English, German, and Simplified Chinese language options.

Below my sig, I’ve included pictures of WebXPRT 3’s start test and results pages, as well as an in-test screen capture.

We’re thankful for all the interest in WebXPRT 3 so far and believe the new version of WebXPRT will be as relevant and reliable as WebXPRT 2013 and 2015—and easier to use. We look forward to seeing new results submissions next week!

Justin

WebXPRT 3 start page

WebXPRT 3 in test

Results page

Check out the other XPRTs:

Forgot your password?