

We want your thoughts about experimental WebXPRT 4 workloads

Two weeks ago, we discussed how users can automate WebXPRT 4 testing by appending several parameters and values to the benchmark’s URL. One of those parameters lets you enable any available experimental workloads during the test run. While we don’t currently offer any experimental workloads for WebXPRT 4, we are seeking suggestions for possible future workload scenarios, or for specific web technologies that you’d like to be able to test with an experimental workload.
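
For readers who haven’t seen that post, the sketch below shows what URL-based automation of this sort can look like. The base URL and the parameter names (tests, experimental, result) are hypothetical placeholders for illustration only; the earlier blog post and the WebXPRT 4 documentation describe the actual parameters and values.

```ts
// Hypothetical sketch of URL-based test automation. The parameter names
// below ("tests", "experimental", "result") are illustrative placeholders;
// consult the WebXPRT 4 documentation for the real names and values.
const baseUrl = "https://example.com/webxprt4/auto"; // placeholder host

const params = new URLSearchParams({
  tests: "111111",     // e.g., a flag string selecting the core workloads
  experimental: "1",   // e.g., a toggle enabling experimental workloads
  result: "xml",       // e.g., the desired results format
});

// Opening this URL in a browser would kick off an automated run.
console.log(`${baseUrl}?${params.toString()}`);
// -> https://example.com/webxprt4/auto?tests=111111&experimental=1&result=xml
```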

The main purpose of optional experimental workloads would be to test cutting-edge browser technologies or new use cases, even if an experimental workload doesn’t work on all browsers or devices. The individual scores for the experimental workloads would stand alone and would not factor into the WebXPRT 4 overall score. WebXPRT 4 testers would be able to run the experimental workloads in one of two ways: by adjusting a value in the WebXPRT 4 automation scripts, as mentioned above, or by manually selecting them on the benchmark’s home screen.

Testers would benefit from experimental workloads by learning how well certain browsers or systems handle new tasks (e.g., new web apps or AI capabilities). We would benefit from fielding workloads for large-scale testing and user feedback before we commit to including them as core WebXPRT workloads.

Do you have any general thoughts about experimental workloads for browser performance testing, or any specific workloads that you’d like us to consider? Please let us know.

Justin

The XPRTs: What would you like to see?

One of the core principles of the BenchmarkXPRT Development Community is a commitment to valuing the feedback of both community members and the larger group of testers who use the XPRTs on a regular basis. That feedback helps us ensure that as the XPRTs continue to grow and evolve, the resources we offer will continue to meet the needs of those who use them.

In the past, user feedback has influenced specific aspects of our benchmarks such as the length of test runs, user interface features, results presentation, and the removal or inclusion of specific workloads. More broadly, we have also received suggestions for entirely new XPRTs and ways we might target emerging technologies or industry use cases.

As we approach the second half of 2022 and begin planning for 2023, we want to hear your ideas about new XPRTs, or about new features for existing XPRTs. Are you aware of hardware form factors, software platforms, or prominent applications that are difficult or impossible to evaluate using existing performance benchmarks? Are there new technologies we should incorporate into existing XPRTs via new workloads? Can you recommend ways to improve any of the XPRTs, or XPRT-related tools such as the results viewers?

We are interested in your answers to these questions and any other ideas you have, so please feel free to contact us. We look forward to hearing your thoughts!

Justin

Chrome OS support for CrXPRT apps ends in June 2022

Last March, we discussed the Chrome OS team’s original announcement that they would be phasing out support for Chrome Apps altogether in June 2021, and would shift their focus to Chrome extensions and Progressive Web Apps. The Chrome OS team eventually extended support for existing Chrome Apps through June 2022, but as of this week, we see no indication that they will further extend support for Chrome Apps published with general developer accounts. If the end-of-life schedule for Chrome Apps does not change in the next few months, both CrXPRT 2 and CrXPRT 2015 will stop working on new versions of Chrome OS at some point in June.

To maintain CrXPRT functionality past June, we would need to rebuild the app completely—either as a Progressive Web App or in some other form. For this reason, we want to reassess our approach to Chrome OS testing, and investigate which features and technologies to include in a new Chrome OS benchmark. Our current goal is to gather feedback and conduct exploratory research over the next few months, and begin developing an all-new Chrome OS benchmark for publication by the end of the year.
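
For readers unfamiliar with Progressive Web Apps, the sketch below illustrates the kind of plumbing a PWA rebuild involves: a service worker that caches an app shell so the app can install and keep working offline. The file names, cache name, and shell contents are assumptions for illustration; this is not a preview of the new benchmark’s design.

```ts
// sw.ts: a minimal, hypothetical service worker for a PWA rebuild.
// It caches an assumed app shell at install time and serves cached
// responses first, falling back to the network.
declare const self: ServiceWorkerGlobalScope;

const CACHE_NAME = "crxprt-shell-v1"; // hypothetical cache name
const APP_SHELL = ["/", "/index.html", "/app.js", "/styles.css"]; // assumed files

self.addEventListener("install", (event) => {
  // Pre-cache the app shell so the app works offline after first load.
  event.waitUntil(
    caches.open(CACHE_NAME).then((cache) => cache.addAll(APP_SHELL))
  );
});

self.addEventListener("fetch", (event) => {
  // Serve from cache when possible; otherwise go to the network.
  event.respondWith(
    caches.match(event.request).then((hit) => hit ?? fetch(event.request))
  );
});

// In the page itself (e.g., main.ts), the app would register the worker:
//   if ("serviceWorker" in navigator) navigator.serviceWorker.register("/sw.js");
```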

While we will discuss ideas for this new Chrome OS benchmark in future blog posts, we welcome ideas from CrXPRT users now. What features or workloads would you like the new benchmark to retain? Would you like us to remove any components from the existing benchmark? Does the battery life test in its current form suit your needs? If you have any thoughts about these questions or any other aspects of Chrome OS benchmarking, please let us know!

Justin

Why we don’t control screen brightness during CrXPRT 2 battery life tests

Recently, we had a discussion with a community member about why we no longer recommend specific screen brightness settings during CrXPRT 2 battery life tests. In the CrXPRT 2015 user manual, we recommended setting the test system’s screen brightness to 200 nits. Because the amount of power that a system directs to screen brightness can have a significant impact on battery life, we believed that pegging screen brightness to a common standard for all test systems would yield apples-to-apples comparisons.

After extensive experience with CrXPRT 2015 testing, we decided not to recommend a standard screen brightness for CrXPRT 2, for the following reasons:

  • A significant number of Chromebooks cannot produce a screen brightness of 200 nits. A few higher-end models can do so, but they are not representative of most Chromebooks. Some Chromebooks, especially those that many school districts and corporations purchase in bulk, cannot produce a brightness of even 100 nits.
  • Because of the point above, adjusting screen brightness would not represent real-life conditions for most Chromebooks, and the battery life results could mislead consumers who want to know the battery life they can expect with default out-of-box settings.
  • Most testers, and even some labs, do not have light meters, and the simple brightness percentages that the operating system reports produce different degrees of brightness on different systems. For testers without light meters, a standardized screen brightness recommendation could discourage them from running the test.
  • The brightness controls for some low-end Chromebooks lack the fine-tuning capability that is necessary to standardize brightness between systems. In those cases, an increase or decrease of one notch can swing brightness by 20 to 30 nits in either direction. This could also discourage testing by leading people to believe that they lack the capability to correctly run the test.

In situations where testers want to compare battery life using standardized screen brightness, we recommend using light meters to set the brightness levels as closely as possible. If the brightness levels between systems vary by more than a few nits, and if the levels vary significantly from out-of-box settings, any published battery life results should include a full disclosure and explanation of the test conditions.

For the majority of testers without light meters, running the CrXPRT 2 battery life test with default screen brightness settings on each system provides a reliable and accurate estimate of the type of real-world, out-of-box battery life consumers can expect.

If you have any questions or comments about the CrXPRT 2 battery life test, please feel free to contact us!

Justin

Thinking about experimental WebXPRT workloads in 2022

As the WebXPRT 4 development process has progressed, we’ve started to discuss the possibility of offering experimental WebXPRT 4 workloads in 2022. These would be optional workloads that test cutting-edge browser technologies or new use cases. The individual scores for the experimental workloads would stand alone and would not factor into the WebXPRT 4 overall score.

WebXPRT testers would be able to run the experimental workloads in one of two ways: by manually selecting them on the benchmark’s home screen, or by adjusting a value in the WebXPRT 4 automation scripts.

Testers would benefit from experimental workloads by being able to compare how well certain browsers or systems handle new tasks (e.g., new web apps or AI capabilities). We would benefit from fielding workloads for large-scale testing and user feedback before we commit to including them as core WebXPRT workloads.

Do you have any general thoughts about experimental workloads for browser performance testing, or any specific workloads that you’d like us to consider? Please let us know.

Justin

Round 2 of the WebXPRT 4 survey is now open

In May, we surveyed longtime WebXPRT users about the types of changes they would like to see in a potential WebXPRT 4. We sent the survey to journalists at several tech press outlets, and invited our blog readers to participate as well. We received some very helpful feedback. As we explore new possibilities for WebXPRT 4, we’ve decided to open an updated version of the survey. We’ve adjusted the questions a bit based on previous feedback and added some new ones, so we invite you to respond even if you participated in the original survey.

To do so, please send your answers to the following questions to benchmarkxprtsupport@principledtechnologies.com before July 31.

  • Do you think WebXPRT 3’s selection of workload scenarios is representative of modern web tasks?
  • How do you think WebXPRT compares to other common browser-based benchmarks, such as JetStream, Speedometer, and Octane?
  • Would you like to see a workload based on WebAssembly (WASM) in WebXPRT 4? Why or why not? (For a brief illustration of WASM in the browser, see the sketch after this list.)
  • Would you like to see a workload based on Single Page Application (SPA) technology in WebXPRT 4? Why or why not?
  • Would you like to see a workload based on Motion UI in WebXPRT 4? Why or why not?
  • Would you like to see us include any other web technologies in additional workloads?
  • Are you happy with the WebXPRT 3 user interface? If not, what UI changes would you like to see?
  • Have you ever experienced significant connection issues when testing with WebXPRT?
  • Given its array of workloads, do you think the WebXPRT runtime is reasonable? Would you mind if the average runtime increased slightly?
  • Would you like to see us change any other aspects of WebXPRT 3?
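
As background for the WebAssembly question above, here is a minimal sketch of what running WASM in the browser looks like: a tiny hand-encoded module that exports an add function, instantiated and timed from TypeScript the way a benchmark workload might time its work. The module bytes are a standard minimal example, not WebXPRT code.

```ts
// A minimal, self-contained WebAssembly example: a hand-encoded module
// that exports add(a, b), instantiated and timed from TypeScript.
// This illustrates the technology only; it is not a WebXPRT workload.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of that type
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export it as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section header, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0/1, i32.add, end
]);

async function runWasmDemo(): Promise<void> {
  const { instance } = await WebAssembly.instantiate(wasmBytes);
  const add = instance.exports.add as (a: number, b: number) => number;

  // Time many calls, the way a benchmark workload might.
  const start = performance.now();
  let sum = 0;
  for (let i = 0; i < 1_000_000; i++) sum = add(sum, 1);
  const elapsed = performance.now() - start;
  console.log(`sum=${sum}, elapsed=${elapsed.toFixed(1)} ms`);
}

runWasmDemo();
```

Real WASM workloads typically compile much larger C/C++ or Rust codebases to WebAssembly, but the loading and timing pattern is the same.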


If you would like to share your thoughts on any topics that the questions above do not cover, please include those in your response. We look forward to hearing from you!

Justin
