Category: Collaborative benchmark development

Worldwide influence

On a weekly basis, we track several XPRT-related metrics such as completed test runs and downloads, content views, and social media engagement. We also search for instances where one or more of the XPRTs has been mentioned in an advertisement, article, or review. Compiling all this information gives us some insight into how, where, and how much people are using the XPRT tools.

From time to time, we share some of that data with readers and community members so they can see the impact that the XPRTs are having around the world. One way we do this is through our “XPRTs around the world” infographic, which provides a great picture of the XPRTs’ reach.

Here are some key numbers from the latest update:

  • The XPRTs have been mentioned more than 11,400 times on over 3,600 unique sites.
  • Those mentions include more than 9,200 articles and reviews.
  • Those mentions originated in over 590 cities located in 65 countries on six continents. New cities of note include Phnom Penh, Cambodia; Osijek, Croatia; Kano, Nigeria; and Lahore, Pakistan.
  • The BenchmarkXPRT Development Community now includes 210 members from 74 companies and organizations around the world.


In addition to the growth in web mentions and community members, the XPRTs have now delivered more than 380,000 real-world results!

We’re grateful to everyone who’s helped us get this far! Every contribution to the community, no matter how small, brings us closer to our goal of providing reliable, relevant, and easy-to-use benchmark tools.

Justin

Which browser is the fastest? It’s complicated.

PCWorld recently published the results of a head-to-head browser performance comparison between Google Chrome, Microsoft Edge, Mozilla Firefox, and Opera. As we’ve noted about similar comparisons, no single browser was the fastest in every test. Browser speed sounds like a straightforward metric, but the reality is complex.

For the comparison, PCWorld used three JavaScript-centric test suites (JetStream, SunSpider, and Octane), one benchmark that simulates user actions (Speedometer), a few tests of their own design, and one benchmark that simulates real-world web applications (WebXPRT). Edge came out on top in JetStream and SunSpider, Opera won in Octane and WebXPRT, and Chrome had the best results in Speedometer and PCWorld’s custom workloads.

The reason that the benchmarks rank the browsers so differently is that each one has a unique emphasis and tests a specific set of workloads and technologies. Some focus on very low-level JavaScript tasks, some test additional technologies such as HTML5, and some are designed to identify strengths or weaknesses by stressing devices in unusual ways. These approaches are all valid, and it’s important to understand exactly what a given score represents. Some scores reflect a very broad set of metrics, while others assess a very narrow set of tasks. Some scores help you to understand the performance you can expect from a device in your everyday life, and others measure performance in scenarios that you’re unlikely to encounter. For example, when Eric discussed a similar topic in the past, he said the tests in JetStream 1.1 provided information that “can be very useful for engineers and developers, but may not be as meaningful to the typical user.”
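To make that distinction concrete, here’s a minimal sketch of the two styles of test. Neither loop comes from JetStream, Speedometer, WebXPRT, or any other real suite; the workloads (and the function names microBenchmark and domBenchmark) are hypothetical stand-ins that show what a low-level micro-benchmark stresses versus what a simulated user action stresses.

```ts
// Illustrative only; these are not workloads from any actual benchmark suite.

// Style 1: a low-level micro-benchmark. A tight numeric loop mostly exercises
// the JavaScript engine's JIT compiler and math operations.
function microBenchmark(): number {
  const start = performance.now();
  let sum = 0;
  for (let i = 0; i < 10_000_000; i++) {
    sum += Math.sqrt(i);
  }
  const elapsed = performance.now() - start;
  console.log(sum); // keep the result live so the engine can't discard the loop
  return elapsed;
}

// Style 2: a simulated user action, closer in spirit to a to-do-list workload.
// Building and tearing down DOM nodes stresses layout, garbage collection,
// and the bindings between JavaScript and the rendering engine.
function domBenchmark(): number {
  const start = performance.now();
  const list = document.createElement("ul");
  for (let i = 0; i < 5_000; i++) {
    const item = document.createElement("li");
    item.textContent = `To-do item ${i}`;
    list.appendChild(item);
  }
  document.body.appendChild(list);
  document.body.removeChild(list);
  return performance.now() - start;
}

console.log(`micro: ${microBenchmark().toFixed(1)} ms, DOM: ${domBenchmark().toFixed(1)} ms`);
```

A browser with an aggressive JIT compiler might post a terrific time on the first loop yet feel no faster on the second, which is much closer to what users actually do, and that gap is a big part of why the rankings above disagree.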

As we do with all the XPRTs, we designed WebXPRT to test how devices handle the types of real-world tasks consumers perform every day. While lab techs, manufacturers, and tech journalists can all glean detailed data from WebXPRT, the test’s real-world focus means that the overall score is relevant to the average consumer. Simply put, a device with a higher WebXPRT score is probably going to feel faster to you during daily use than one with a lower score. In today’s crowded tech marketplace, that piece of information provides a great deal of value to many people.

What are your thoughts on browser testing? We’d love to hear from you.

Justin

An update on HDXPRT development

It’s been a while since we updated the community on HDXPRT development, and we’ve made a lot of progress since then. Here’s a quick summary of where we are and what to expect in the coming months.

The benchmark’s official name will be HDXPRT 4, and we’re sticking with the basic plan we outlined in the blog, which includes updating the benchmark’s real-world trial applications and workload content, as well as improving the UI.

We’ve updated Adobe Photoshop Elements, Audacity, CyberLink MediaEspresso, and HandBrake to more contemporary versions, but decided that the benchmark will no longer use Apple iTunes. We sometimes encountered problems with iTunes during testing, and because we can complete the audio-related workloads using Audacity, we decided it was OK to remove iTunes from the test. Please contact us if you have any concerns about this decision.

In addition to the photo editing, music editing, and video conversion workloads from prior versions of the benchmark, HDXPRT 4 includes two new Photoshop Elements scenarios. The first uses an AI tool that corrects closed eyes in photos, and the second stitches seven separate photos into a single panorama. For the photo and video workloads, we produced new high-resolution photo content and 4K GoPro video footage, respectively.

For the UI, our goal is to implement a clean and functional design and align it more closely with the themes, colors, and font styles we’ll be using in the XPRTs moving forward. The WebXPRT 3 UI will give you a feel for the direction we’re taking with HDXPRT.

Some of these details may change as we test preliminary builds, but we wanted to give you a better sense of where HDXPRT is headed. We’re not ready to share a date for the community preview, but we’ll provide more details as that day approaches.

If you have any questions or comments about HDXPRT, please let us know. It’s not too late for us to consider your input for HDXPRT 4.

Justin

AIXPRT: We want your feedback!

Today, we’re publishing the AIXPRT Request for Comments (RFC) document. The RFC explains the need for a new artificial intelligence (AI)/machine learning benchmark, shows how the BenchmarkXPRT Development Community plans to address that need, and provides preliminary design specifications for the benchmark.

We’re seeking feedback and suggestions from anyone interested in shaping the future of machine learning benchmarking, including those not currently part of the Development Community. Usually, only members of the BenchmarkXPRT Development Community have access to our RFCs and the opportunity to provide feedback. However, because we’re seeking input from non-members who have expertise in this field, we will be posting this RFC in the New events & happenings section of the main BenchmarkXPRT.com page and making it available at AIXPRT.com.

We welcome input on all aspects of the benchmark, including scope, workloads, metrics and scores, UI design, and reporting requirements. We will accept feedback through May 13, 2018, after which BenchmarkXPRT Development Community administrators will collect and evaluate the responses and publish the final design specification.

Please share the RFC with anyone interested in machine learning benchmarking, and send us your feedback before May 13.

Justin

Comparing open source and open development

Why do we use open development when designing and building the XPRTs, and what’s the difference between our open development approach and traditional open-source methods? The terminology around these two models can be confusing, so we wanted to review some similarities and differences.

Why open development?

An open development approach helps encourage collaboration, innovation, and transparency. XPRT community members get involved in the development of each benchmark from the beginning:

  • They submit suggestions, questions, and concerns that inform the future design of the tools.
  • They view early proposals for new versions and contribute comments for the final design.
  • They suggest new workloads.
  • They have access to community previews (beta builds) of the tools.
  • They submit source code for inclusion in the benchmarks.
  • They examine existing source code.

A commitment to transparency

Because we’re committed to publishing reliable, unbiased benchmarks, we also want to make the XPRT development process as transparent as possible. It’s not unusual for people to claim that any given benchmark contains hidden biases. To address such concerns, we make our source code available to anyone who joins the community. This approach reduces the risk of unforeseen bias in our benchmarks.

Quality control

Unlike open-source models, open development allows us to control derivative works, which can be important in benchmarking. While open source encourages a constantly evolving product that may fork into substantially different versions, benchmarking requires a product that remains static to enable valid comparisons over time. By controlling derivative works, we can avoid the problem of unauthorized versions of the benchmarks being published as “XPRTs.”

In the future, we may use a traditional open-source model for specific XPRTs or other projects. If we do, we’ll share our reasoning with the community and ask for their thoughts about the best way to proceed. If you’re not a community member, but are interested in benchmark development, we encourage you to join today!

Justin

The XPRTs in action

In the near future, we’ll update our “XPRTs around the world” infographic, which provides a snapshot of how people are using the XPRTs worldwide. Among other stats, we include the number of XPRT web mentions, articles, and reviews that have appeared during a given period. Recently, we learned how one of those mentions, a single article that used WebXPRT, found its way to consumers in more places than we would have imagined.

Late last month, AnandTech published a performance comparison by Andrei Frumusanu examining the Samsung Galaxy S9’s Snapdragon 845 and Exynos 9810 variants alongside a number of other high-end phones. WebXPRT was one of the benchmarking tools he used. The article stated that both versions of the brand-new S9 were slower than the iPhone X and, in some tests, slower than even the iPhone 7.

A CNET video discussed the article and the role of WebXPRT in the performance comparison, and the article has been reposted to hundreds of tech media sites around the world. A quick survey shows reposts in Albania, Bulgaria, Chile, the Czech Republic, Denmark, France, Germany, Greece, Indonesia, Iran, Italy, Japan, Korea, Poland, Russia, Slovakia, Spain, Turkey, and many other countries.

The popularity of the article is not surprising, for it positions the newest flagship phones from the industry’s two largest phone makers in a head-to-head comparison with a somewhat unexpected outcome. AnandTech did nothing to stir controversy or sensationalize the test results, but simply provided readers with an objective, balanced assessment of how these devices compare so that they could draw their own conclusions. The XPRTs share this approach.

We’re grateful to Andrei and others at AnandTech who’ve used the XPRTs over the years to produce content that helps consumers make informed decisions. WebXPRT is just part of AnandTech’s toolkit, but it’s one that’s accessible to anybody free of charge. With the help of BenchmarkXPRT Development Community members, we’ll continue to publish XPRT tools that help users everywhere gain valuable insight into device performance.

Justin
