
Category: Collaborative benchmark development

A smooth transition

We want to thank Andrei Frumusanu of AnandTech for mentioning WebXPRT 3 in the System Performance section of their Snapdragon 845 review. For testing labs and tech media, incorporating a new benchmark into a test suite can be daunting, and they don’t make the decision to do so lightly. Once a new benchmark is in play, the score database used for comparisons is suddenly empty, and a lot of testing needs to happen before anyone can compare devices on a large scale.

In the BenchmarkXPRT Development Community, we’ve designed our development and release process to minimize the stress of adopting new benchmark tools. A key part of that strategy is releasing community previews to members several weeks before the general release. Community previews carry no publication restrictions, and we work to make sure that preview results will be comparable to results from the general release. Between a community preview and a general release, we may still tweak the UI or fix issues with non-workload-related features, but you can be confident that preview results will remain valid after the general release.

The community preview system allows us to solicit feedback from an expanded base of pre-release testers, but it also allows labs to backfill results for legacy devices and get a head start on incorporating the new benchmark into their testing suites.

Speaking of previews, WebXPRT 3 community preview testing is going well and we’re excited about the upcoming release. If you’d like to learn more about our development community and how you can join, send us your questions and we’ll be happy to help.

Justin

WebXPRT 3, Mobile World Congress, and the next HDXPRT

We’re excited about everything that’s in store for the XPRTs, and we want to update community members on what to expect in the next few months.

The next major development is likely to be the WebXPRT 3 general release. We’re currently refining the UI and conducting extensive testing with the community preview build. We’re not ready to announce a firm release date, but we hope to do so in the next few weeks. If you haven’t already, please try the community preview and give us your feedback.

During the last week of February, Mark will be at Mobile World Congress (MWC) in Barcelona. Each year, MWC offers a great opportunity to examine the new trends and technologies that will shape mobile technology in the years to come. We look forward to sharing Mark’s thoughts on this year’s hot topics. Will you be attending MWC this year? If so, let us know!

In addition, we’re hoping to have a community preview of the next HDXPRT ready in the spring. As we mentioned a few months ago, we’re updating the workloads, applications, and UI. For the converting photos scenario, we’re considering incorporating new Adobe Photoshop tools such as the “Open Closed Eyes” feature and an automatic fix for pictures that are out of focus due to handheld camera shake. For the converting videos scenario, we’re including 4K GoPro footage that represents the quality of video captured by today’s “prosumer” demographic.

What features would you like to see in the next HDXPRT? Let us know!

Justin

Machine learning in 2018

We are almost to the end of 2017 and, as you have probably guessed, we will not have a more detailed proposal of our machine learning benchmark ready by the end of the year.

The key aspects of the benchmark proposal we wrote about a few months ago haven’t changed, but we are running behind schedule. We are still hoping to have the proposal ready in Q1 2018 and the tool based on that proposal later in the year. We will keep you posted.

In the meantime, we hope you enjoy the recent CGP Grey video explanation of machine learning as much as we did. There are actually two videos: the first gives a general overview, and the second takes a closer look at the current state of machine learning. It focuses mainly on the training aspects of machine learning rather than the inference aspects we are looking into with AIXPRT/MLXPRT.

From all of us in the BenchmarkXPRT Development Community, we hope you and yours have a wonderful holiday and a great start to 2018!

Bill

The WebXPRT 3 Community Preview is here!

Today we’re releasing the WebXPRT 3 Community Preview (CP). As we discussed in the blog last month, in the new version of WebXPRT, we updated the photo-related workloads with new images and a new deep learning task for the Organize Album workload. We also added an optical character recognition task to the Local Notes workload and combined a portion of the DNA Sequence Analysis scenario with a writing sample/spell check scenario to simulate an online homework hub in the new “Online Homework” workload.

Longtime WebXPRT users will also immediately notice a completely new UI that is clean and straightforward. We’re still tweaking aspects of the UI and implementing full functionality for certain features, such as social media sharing and German-language translation, but we don’t anticipate making any significant changes to the overall test or to individual workloads before the general release.

As with all community previews, the WebXPRT 3 CP is available only to BenchmarkXPRT Development Community members, who can access the link from the WebXPRT tab in the Members’ Area.

After you try the WebXPRT 3 CP, please send us your comments. Thanks and happy testing!

Justin

Nothing to hide

I recently saw a ZDNet article by my old friend Steven J. Vaughan-Nichols about NetMarketShare and StatCounter reporting a significant jump in operating system market share for Linux and Chrome OS. One frustration Vaughan-Nichols alludes to in the article is the firms’ lack of transparency about how they calculate market share: because neither NetMarketShare nor StatCounter discloses its methods, there is no way for interested observers to verify the numbers or gauge how reliable they are. Steven prefers the data from the federal government’s Digital Analytics Program (DAP), which makes its data freely available so you can run your own calculations. Transparency generates trust.

Transparency is a core value for the XPRTs. We’ve written before about how statistics can be misleading. That’s why we’ve always disclosed exactly how the XPRTs calculate performance results, and the way BatteryXPRT calculates battery life. It’s also why we make each XPRT’s source code available to community members. We want to be open and honest about how we do things, and our open development community model fosters the kind of constructive feedback that helps to continually improve the XPRTs.
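The actual formulas are documented for each benchmark, but as an illustration of the kind of calculation such a disclosure covers, here is a minimal sketch of a weighted geometric mean, a method commonly used to combine per-workload results into a single score. The weights, reference times, and scale factor below are hypothetical, not the values any XPRT actually uses.

```python
import math

def geomean_score(times, reference_times, weights, scale=100.0):
    """Combine per-workload times into one score via a weighted geometric mean.

    Each workload's time is normalized against a reference device's time,
    so beating the reference yields a ratio above 1.0. The weights and
    scale factor are illustrative placeholders only.
    """
    assert len(times) == len(reference_times) == len(weights)
    total_weight = sum(weights)
    log_sum = 0.0
    for t, ref, w in zip(times, reference_times, weights):
        ratio = ref / t  # >1.0 means faster than the reference device
        log_sum += w * math.log(ratio)
    return scale * math.exp(log_sum / total_weight)

# A device that exactly matches the reference on every workload scores 100.
print(geomean_score([2.0, 5.0], [2.0, 5.0], [1.0, 1.0]))  # 100.0
```

One reason benchmarks favor the geometric mean over a simple average is that it keeps any single workload from dominating the overall score, which is exactly the sort of design choice a transparent methodology lets readers evaluate for themselves.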

We’d love for you to be a part of that process, so if you have questions or suggestions for improvement, let us know. If you’d like to gain access to XPRT source code and previews of upcoming benchmarks, today is a great day to join the community!

Eric

Machine learning performance tool update

Earlier this year we started talking about our efforts to develop a tool to help in evaluating machine learning performance. We’ve given some updates since then, but we’ve also gotten some questions, so I thought I’d do my best to summarize our answers for everyone.

Some have asked what kinds of algorithms we’ve been looking into. As we said in an earlier blog post, we’re looking at algorithms involved in computer vision, natural language processing, and data analytics, with a particular focus on computer vision.

One seemingly trivial question we’ve received regards the proposed name, MLXPRT. We have been thinking of this tool as evaluating machine learning performance, but folks have raised a valid concern that it may well be broader than that. Does machine learning include deep learning? What about other artificial intelligence approaches? I’ve certainly seen other approaches lumped into machine learning, probably because machine learning is the hot topic of the moment. It feels like everything is boasting, “Now with machine learning!”

While there is some value in being part of such a hot movement, we’ve begun to wonder if a more inclusive name, such as AIXPRT, would be better. We’d love to hear your thoughts on that.

We’ve also had questions about the kinds of devices the tool will run on. The short answer is that we’re concentrating on edge devices. While there is a need for server-side AI/ML tools, we’ve been focusing on evaluating the devices closest to end users. As a result, we’re looking at the inference aspect of machine learning rather than the training aspect.
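To make the training/inference distinction concrete: an inference-focused tool measures how quickly an already-trained model produces a result on a device, rather than how long the model takes to learn. The sketch below shows the general shape of such a measurement; the stand-in model, warmup count, and run count are illustrative assumptions, not AIXPRT’s actual methodology.

```python
import time
import statistics

def bench_inference(model_fn, sample, warmup=10, runs=100):
    """Measure single-sample inference latency of model_fn.

    Runs a few warmup passes first (to settle caches and any JIT work),
    then times each inference call and reports the median and
    95th-percentile latency in milliseconds.
    """
    for _ in range(warmup):
        model_fn(sample)
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        model_fn(sample)
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    return {
        "median_ms": statistics.median(latencies),
        "p95_ms": latencies[int(0.95 * (len(latencies) - 1))],
    }

# Stand-in "model": a dot product over a small vector, in place of a
# real trained network.
weights = [0.5] * 256
result = bench_inference(lambda x: sum(w * v for w, v in zip(weights, x)),
                         [1.0] * 256)
print(result)
```

Reporting percentiles rather than a single average matters on edge devices, where thermal throttling and background activity can make occasional inferences much slower than the typical case.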

Probably the most frequent thing we’ve been asked about is the timetable. While we’d hoped to have something available this year, we were overly optimistic. We’re currently working on a more detailed proposal of what the tool will be, and we aim to make that available by the end of this year. If we achieve that goal, our next one will be to have a preliminary version of the tool itself ready in the first half of 2018.

As always, we seek input from folks, like yourself, who are working in these areas. What would you most like to see in an AI/machine learning performance tool? Do you have any questions?

Bill 
