Tag Archives: test results

More information about the CloudXPRT results submission process

Earlier this month, we discussed the possibility of using a periodic results submission process for CloudXPRT instead of the traditional rolling publication process that we’ve used for the other XPRTs. We’ve received some positive responses to the idea, and while we’re still working out some details, we’re ready to share the general framework of the process we’re planning to use.

  • We will establish a results review group, which only official BenchmarkXPRT Development Community members can join.
  • We will update the CloudXPRT database with new results once a month, on a pre-published schedule.
  • Two weeks before each publication date, we will stop accepting submissions for consideration for that review cycle.
  • One week before each publication date, we will send an email to the results review group that includes the details of that month’s submissions for review.
  • The results review group will serve as a sanity check and a forum for comments on the month’s submissions, but we reserve the right of final approval for publication.
  • We will not restrict testers from publishing results outside of the monthly review cycle, but we will not automatically add those results to the results database.
  • We may add externally published results to our database, but will do so only after vetting, and only on the designated day each month.
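As a rough illustration of the timeline above, the sketch below derives the submission cutoff and review-email dates from a publication date. The function names and the example dates are hypothetical, and it assumes that a submission received on the cutoff day itself still counts for that cycle, which the framework above does not specify.

```python
from datetime import date, timedelta
from typing import List, Optional


def review_cycle(publication_date: date) -> dict:
    """Return the key dates for one monthly review cycle."""
    return {
        # Submissions close two weeks before publication.
        "submission_cutoff": publication_date - timedelta(weeks=2),
        # The review group receives the month's submissions one week before.
        "review_email": publication_date - timedelta(weeks=1),
        "publication": publication_date,
    }


def cycle_for_submission(submitted: date, publication_dates: List[date]) -> Optional[date]:
    """Return the first publication date whose cutoff has not yet passed.

    Assumption: a submission made on the cutoff day is still accepted.
    """
    for publication_date in sorted(publication_dates):
        if submitted <= review_cycle(publication_date)["submission_cutoff"]:
            return publication_date
    return None


# Hypothetical schedule, for illustration only.
schedule = [date(2020, 7, 15), date(2020, 8, 15)]
print(review_cycle(schedule[0]))
print(cycle_for_submission(date(2020, 7, 2), schedule))
```

A submission that misses one cutoff simply rolls into the next cycle, which is the behavior the second function models.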

Our goal is to strike a balance between allowing the tech press, vendors, and other testers to publish CloudXPRT results on their own schedule and building a curated results database that OEMs or other parties can use to compete for the best results.

We’ll share more details about the review group, submission dates, and publication dates soon. Do you have questions or comments about the new process? Let us know what you think!

Justin

Our results database, your resource

Testers who have started using the XPRT benchmarks recently may not know about one of the free resources we offer. The XPRT results database currently holds more than 2,400 test results from over 90 sources, including major tech review publications around the world, OEMs, and independent testers. It offers a wealth of current and historical performance data across all the XPRT benchmarks and hundreds of devices.

We update the results database several times a week, adding selected results from our own internal lab testing, end-of-test user submissions, and reliable tech media sources. (After you run one of the XPRTs, you can choose to submit the results, but they don’t automatically appear in the database.)

Before adding a result, we evaluate whether the score makes sense and is consistent with general expectations, which we can do only when we have sufficient system information. For that reason, we encourage testers to disclose as much hardware and software information as possible when publishing or submitting a result.

We encourage visitors to our site to explore the XPRT results database. There are three primary ways to do so. The first is by visiting the main BenchmarkXPRT results browser, which displays results entries for all of the XPRT benchmarks in chronological order. Users can narrow the results by selecting a benchmark from the drop-down menu and can type values, such as a vendor name or the name of a tech publication, into the free-form filter field. For results we produced in our lab, clicking “PT” in the Source column takes you to a page with additional disclosure information for the test system. For sources outside our lab, clicking the source name takes you to the original article or review that contains the result.

The second way to access our published results is by visiting the results page for each individual XPRT benchmark. Go to the page of the benchmark you’re interested in, and look for the blue View Results button. Clicking it takes you to a page that displays results for only that benchmark. You can use the free-form filter on the page to filter those results, and you can use the Benchmarks drop-down menu to jump to the other individual XPRT results pages.

The third way to view information in our results database is with the WebXPRT Processor Comparison Chart. When we publish a new WebXPRT result, the score automatically appears in the processor comparison chart as well. For each processor, the chart shows a bar representing the average score. Mousing over the bar displays a popup indicating the number of WebXPRT results we currently have for that processor, and clicking the bar lets you view the results. You can change the number of results the chart displays on each page, and use the drop-down menu to toggle back and forth between the WebXPRT 3 and WebXPRT 2015 charts.
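To make the per-processor averaging concrete, here is a minimal sketch of how a bar's value and its popup count could be derived from individual results. It is not the site's actual implementation; the function name, processor labels, and scores are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_by_processor(results):
    """Group (processor, score) pairs and return {processor: (average, count)}."""
    grouped = defaultdict(list)
    for processor, score in results:
        grouped[processor].append(score)
    return {
        processor: (mean(scores), len(scores))
        for processor, scores in grouped.items()
    }


# Invented entries; these are not real WebXPRT scores.
sample_results = [
    ("Processor A", 100),
    ("Processor A", 200),
    ("Processor B", 50),
]

for processor, (average, count) in average_by_processor(sample_results).items():
    print(f"{processor}: average {average} from {count} result(s)")
```

Keeping the count next to the average matters because a bar backed by a single result deserves less weight than one backed by many.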

We hope you’ll take some time to browse the information in our results database. We welcome your feedback about what you’d like to see in the future and suggestions for improvement. Our database contains the XPRT scores that we’ve gathered, but we publish them as a resource for you. Let us know what you think!

Justin

CloudXPRT is up next, and we’re thinking about how to handle results submission and publication

Last month, we provided an update on the CloudXPRT development process and more information about the three workloads that we’re including in the first build. We’d initially hoped to release the build at the end of April, but several technical challenges have caused us to push the timeline out a bit. We believe we’re very close to ready, and look forward to posting a release announcement soon.

In the meantime, we’d like to hear your thoughts about the CloudXPRT results publication process. Traditionally, we’ve published XPRT results on our site on a rolling basis. When we complete our own tests, receive results submissions from other testers, or see results published in the tech media, we authenticate them and add them to our site. This lets testers make their results public on their timetable, as frequently as they want.

Some major benchmark organizations use a different approach, and create a schedule of periodic submission deadlines. After each deadline passes, they review the batch of submissions they’ve received and publish all of them together on a single later date. In some cases, they release results only two or three times per year. This process offers a high level of predictability. However, it can pose significant scheduling obstacles for other testers, such as tech journalists who want to publish their results in an upcoming device review and need official results to back up their claims.

We’d like to hear what you think about the different approaches to results submission and publication that you’ve encountered. Are there aspects of the XPRT approach that you like? Are there things we should change? Should we consider periodic results submission deadlines and publication dates for CloudXPRT? Let us know what you think!

Justin

Understanding AIXPRT results

Last week, we discussed the changes we made to the AIXPRT Community Preview 2 (CP2) download page as part of our ongoing effort to make AIXPRT easier to use. This week, we want to discuss the basics of understanding AIXPRT results by talking about the numbers that really matter and how to access and read the actual results files.

To understand AIXPRT results at a high level, it’s important to revisit the core purpose of the benchmark. AIXPRT’s bundled toolkits measure inference latency (the time it takes to process an image) and throughput (the number of images processed in a given time period) for image recognition (ResNet-50) and object detection (SSD-MobileNet v1) tasks. Testers have the option of adjusting variables such as batch size (the number of input samples to process simultaneously) to try to achieve higher levels of throughput, but higher throughput can come at the expense of increased latency per task. In real-time or near real-time use cases, such as performing image recognition on individual photos being captured by a camera, lower latency is important because it improves the user experience. In other cases, such as performing image recognition on a large library of photos, achieving higher throughput might be preferable; designating larger batch sizes or running concurrent instances might allow the overall workload to complete more quickly.

The dynamics of these performance tradeoffs ensure that there is no single good score for all machine learning scenarios. Some testers might prefer lower latency, while others would sacrifice latency to achieve the higher level of throughput that their use case demands.
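The tradeoff can be sketched with a simplified model in which a single instance processes batches back to back, so throughput is batch size divided by the time per batch. This is an assumption for illustration, not how AIXPRT computes its reported numbers, and the measurements below are made up.

```python
def throughput_images_per_sec(batch_size, batch_latency_ms):
    """Images per second for one instance that processes batches back to back."""
    return batch_size * 1000.0 / batch_latency_ms


def pick_config(configs, max_latency_ms=None):
    """Return the (batch_size, latency_ms) pair with the highest throughput
    among those that meet the optional latency budget, or None if none do."""
    eligible = [
        (batch, latency)
        for batch, latency in configs
        if max_latency_ms is None or latency <= max_latency_ms
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda config: throughput_images_per_sec(*config))


# Made-up measurements: (batch size, latency per batch in milliseconds).
configs = [(1, 10.0), (8, 50.0), (32, 180.0)]

print(pick_config(configs))                     # throughput matters most
print(pick_config(configs, max_latency_ms=20))  # near real-time budget
```

With no latency budget, the largest batch wins on throughput; with a 20 ms budget, only the batch size of 1 qualifies. The same data yields a different "best" result depending on the use case.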

Testers can find latency and throughput numbers for each completed run in a JSON results file in the AIXPRT/Results folder. The test also generates CSV results files in the same folder. The raw results files report values for each AI task configuration (e.g., ResNet-50, Batch1, on CPU). Parsing and consolidating the raw data can take some time, so we’re developing a results file parsing tool to make the job much easier.
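For testers who want to consolidate results by hand in the meantime, the sketch below shows the general idea. It is not the parsing tool mentioned above, and it does not reflect the real AIXPRT results schema: it assumes, hypothetically, that each JSON file holds a flat list of records with "task", "batch_size", "device", "throughput", and "latency_ms" fields. Adjust the field names to match the actual files.

```python
import json
from pathlib import Path


def load_records(results_dir):
    """Read every *.json file in results_dir into one list of records.

    Assumes each file contains a JSON list of flat records (hypothetical layout).
    """
    records = []
    for path in sorted(Path(results_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            records.extend(json.load(handle))
    return records


def summarize(records):
    """Return the records with the highest throughput and the lowest latency."""
    return {
        "max_throughput": max(records, key=lambda r: r["throughput"]),
        "min_latency": min(records, key=lambda r: r["latency_ms"]),
    }


# Made-up records using the hypothetical field names described above.
example = [
    {"task": "ResNet-50", "batch_size": 1, "device": "CPU",
     "throughput": 100.0, "latency_ms": 10.0},
    {"task": "ResNet-50", "batch_size": 8, "device": "CPU",
     "throughput": 160.0, "latency_ms": 50.0},
]

summary = summarize(example)
print(summary["max_throughput"]["batch_size"], summary["min_latency"]["batch_size"])
```

To run it against real files, something like `summarize(load_records("AIXPRT/Results"))` would work once the field names match.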

The results parsing tool is currently available in the AIXPRT CP2 OpenVINO – Windows package, and we hope to make it available for more packages soon. Using the tool is as simple as running a single command, and detailed instructions for how to do so are in the AIXPRT OpenVINO on Windows user guide. The tool produces a summary (example below) that makes it easier to quickly identify relevant comparison points such as maximum throughput and minimum latency.

AIXPRT results summary

In addition to the summary, the tool displays the throughput and latency results for each AI task configuration tested by the benchmark. AIXPRT runs each AI task multiple times and reports the average inference throughput and corresponding latency percentiles.

AIXPRT results details

We hope that this information makes AIXPRT results easier to understand. If you have any questions or comments, please feel free to contact us.

Justin
