Category: Benchmark metrics

WebXPRT 5: Starting to assemble the pieces

on November 6, 2025

In our last blog post, we shared the exciting news that we’re currently working on WebXPRT 5 and described some of the ways that WebXPRT may evolve with the new release. In today’s post, we’ll revisit some of the points of emphasis from that post and look at potential workload changes in a bit more detail.

With any benchmark development project, there are always technical challenges you need to iron out. That is especially true with a cross-platform, browser-based benchmark like WebXPRT. Because we’re in the middle of exploring the technical feasibility of a few of the options we’ll mention, we’re not yet ready to say for certain that all these features will be available in the initial WebXPRT 5 release. We can, however, now paint a clearer picture of the overall direction we’re headed.

In the section below, you’ll find updated info on where we stand with respect to some of the key development focal points we discussed in our last post. If there’s an item from that post or previous posts that we didn’t mention below—such as updating the test harness—it doesn’t mean that we’re dropping that goal. We’re just focusing on workloads today.

One of our key goals with WebXPRT 5 is providing more AI-related workloads. In past blog posts, we’ve discussed the growing importance of local, browser-side AI. With WebXPRT 5, we’re investigating two ways that we can expand WebXPRT’s AI portfolio: 1) updating existing WebXPRT 4 AI-oriented workloads, and 2) adding all-new AI workloads.

Here are some possible ways those AI-related changes may play out in both categories:

Updating existing WebXPRT 4 AI-oriented workloads

Splitting the existing Organize Album using AI workload’s timed tasks—face detection and image classification—into two independent workloads.
Updating the face detection and image classification tasks with the latest versions of the OpenCV.js computer vision and machine learning libraries.
Updating the Caffe deep learning framework for the face detection task.
Updating the ONNX-based SqueezeNet machine learning model for the image classification tasks.
Updating the version of the Tesseract.js OCR engine that WebXPRT uses in the Encrypt Notes and OCR Scan workload.

Potentially adding all-new AI workloads (either core or experimental workloads)

We’re exploring the idea of including a workload that uses an AI-powered segmentation model to blur the background of a video call.
We’re exploring the feasibility of including a local LLM chat workload.
We would eventually like to include a WebGPU-based web AI framework for a computer vision workload.

In addition to the goal of adding more AI, we previously discussed the possibility of adding non-AI WebGPU workloads. As a web API, WebGPU enables web-based applications—such as image-based GenAI and inference workloads—to directly access the graphics rendering and computational capabilities of a system’s GPU. In the future, WebXPRT 5 could use that technology to execute complex 3D rendering workloads.

We hope today’s post gives you a better sense of where WebXPRT 5 may be headed. We want to reemphasize that while we are actively investigating the possible changes mentioned above, nothing is set in stone. As the pieces start to fall into place, we’ll provide more information here in the blog.

If you have any questions or comments about WebXPRT 5, please feel free to contact us!

Justin

Posted in AI, benchmark, Benchmark metrics, Benchmarking, Benchmarks in general, BenchmarkXPRT, browser performance, Browser-based benchmarks, Caffe, computer vision, Cross-platform benchmarks, face detection, Future of performance evaluation, image classification, image processing, JavaScript, large language models, ONNX Runtime Web, Performance benchmarking, SqueezeNet, Web AI, WebGPU, WebXPRT, WebXPRT 4 | Tagged AI, AI workloads, benchmark, BenchmarkXPRT, browser benchmark, browser performance, caffe, cross-platform, face detection, image classification, machine learning, ML, OCR, ONNX, SqueezeNet, Tesseract, Web AI, WebGPU, WebXPRT, WebXPRT 4 |

You asked, and we heard you: WebXPRT 5 is on the way!

By Justin Greene

on October 9, 2025

We’re excited to announce that WebXPRT 5 is officially on the way! Since we launched WebXPRT 4 in February 2022, it’s proven to be an exceptionally successful and reliable go-to benchmark for OEM labs, the tech press, and individual users alike—to the tune of over 644,000 runs to date. In past blog posts, we’ve discussed new features and possible auxiliary workloads that we contemplated adding to WebXPRT 4. As we’ve considered user comments and suggestions, changes in web technology, and how we can best position WebXPRT as a relevant browser benchmark in the future, however, it became clear that it was time for an all-new WebXPRT.

Now that we’ve announced WebXPRT 5, the first question for many existing WebXPRT users may be, “When will WebXPRT 5 be available?” We’re not yet ready to share an anticipated WebXPRT 5 release date, but we can share that a lot of groundwork is already complete, and the remaining work is moving along rapidly. We’ll continue to issue updates here in the blog as we reach important milestones.

The second question for many existing WebXPRT users may be, “How will WebXPRT change?” We’re not yet ready to share extensive details about WebXPRT 5’s workloads—rest assured that we will as soon as we can firm up everything—but we can share a few key guidelines we tried to follow in our WebXPRT 5 design. Each of these points of emphasis is a result of feedback we’ve received from labs, as well as features that users have asked for.

Provide more AI-related workloads. In past blog posts, we’ve discussed the growing importance of local, browser-side AI. WebXPRT 4 already includes timed AI tasks in two of its workloads: the Organize Album using AI workload and the Encrypt Notes and OCR Scan workload. We’re working on ways to expand WebXPRT’s AI portfolio in the next version.
Add WebGPU workloads. As a web API, WebGPU enables web-based applications—such as image-based GenAI and inference workloads—to directly access the graphics rendering and computational capabilities of a system’s GPU. We hope to incorporate WebGPU measures in WebXPRT 5.
Improve WebXPRT’s utility as a tool for test labs, publications, and engineering analysis.
- Update the workloads with longer operations. Many of WebXPRT’s existing workloads no longer challenge cutting-edge consumer hardware as much as many of us would like. Testing labs have asked for longer and more demanding workloads. We’re working on incorporating workloads that are accessible enough to be run by a broad range of devices yet challenging enough to allow performance differentiation among high-end systems.
- Enable more precise performance measures. Labs and testers have also asked for more granular insight into the workloads to help with engineering-level performance analysis. Currently, some WebXPRT 4 workload scores include multiple timed tasks. If we separate those compound scores so that each workload reports results from only one timed task, users will be able to more precisely assess how well a device performs while handling specific operations. We’re looking into this approach.
Modernize the harness to make it more flexible and to speed future work. WebXPRT 4’s current harness works with server-side sessions on a LAMP (Linux, Apache, MySQL and PHP) stack. If we implement the harness via JavaScript on the client side, it will pave the way for faster development and testing cycles in the future.
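To illustrate the point above about compound scores, here is a minimal Python sketch. The device names and millisecond timings are invented for illustration, and the code is not WebXPRT's scoring logic. It only shows how two systems could tie on a combined time while differing on each timed task.

```python
# Hypothetical per-task timings in milliseconds (illustrative values only,
# not measured WebXPRT results).
timings_ms = {
    "device_a": {"face_detection": 300, "image_classification": 700},
    "device_b": {"face_detection": 600, "image_classification": 400},
}


def compound_time(tasks: dict[str, int]) -> int:
    """Combined time for a workload that reports all timed tasks together."""
    return sum(tasks.values())


def faster_device_per_task(timings: dict[str, dict[str, int]]) -> dict[str, str]:
    """For each task, return the name of the device with the lowest time."""
    task_names = next(iter(timings.values())).keys()
    return {
        task: min(timings, key=lambda device: timings[device][task])
        for task in task_names
    }


compound = {device: compound_time(tasks) for device, tasks in timings_ms.items()}
per_task_winner = faster_device_per_task(timings_ms)

print(compound)         # the combined times are identical
print(per_task_winner)  # a different device is faster on each task
```

With these made-up numbers, the combined view cannot separate the two devices, while the per-task view shows that each one is faster at a different operation.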

We expect WebXPRT 5 to carry on the WebXPRT legacy of reliability and real-world relevance, while providing users with compelling new workloads and features. As has been our habit with new benchmark releases, however, we won’t force anyone to change versions anytime soon. Instead, we will continue to make WebXPRT 4 available for quite some time after WebXPRT 5 goes live.

If you have any questions or comments about WebXPRT, please let us know!

Justin

Posted in AI, Application-based benchmarks, benchmark, Benchmark metrics, Benchmarking, BenchmarkXPRT, BenchmarkXPRT development community, browser performance, Browser-based benchmarks, Cross-platform benchmarks, Future of performance evaluation, GPU, image classification, image processing, inference, LAMP, on-device AI, Performance benchmarking, Web AI, web API, Web-based testing, WebGPU, WebXPRT, WebXPRT 3, WebXPRT 4 |

Multi-tab testing in a future version of WebXPRT?

By Justin Greene

on September 4, 2025

In previous posts about our recommended best practices for producing consistent and reliable WebXPRT scores, we’ve emphasized the importance of “clean” testing. Clean testing involves minimizing the amount of background activity on a system during test runs to ensure stable test conditions. With stable test conditions, we can avoid common scenarios in which startup tasks, automatic updates, and other unpredictable processes contribute to high score variances and potentially unfair comparisons.

Clean testing is a vital part of accurate performance benchmarking, but it doesn’t always show us what kind of performance we can expect in typical everyday conditions. For example, while a browser performance test like WebXPRT can provide clean testing scores that serve as a valuable proxy for overall system performance, an entire WebXPRT test run involves only two open browser tabs. Most of us will have many more tabs open at any given time during the day. Those tabs, along with any associated background services, extensions, plug-ins, or renderers, can consume CPU cycles and memory. Depending on the number of tabs you leave open, the performance impact on your system can be noticeable. Even with modern browser tab management and resource-saving features, a proliferation of tabs can still have a significant impact on your computing experience.

To address this type of computing, we’ve been considering the possibility of adding one or more multi-tab testing features to a future version of WebXPRT. There are several ways we could do this, including the following options:

We could open each full workload cycle in a new tab, resulting in seven total tabs.
We could open each individual workload iteration in a new tab, resulting in 42 total tabs.
We could allow users to run multiple full tests back-to-back while keeping the tabs from the previous test(s) open.
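The tab counts in these options follow from the shape of a standard WebXPRT 4 test, which runs seven iterations of a six-workload suite. Here is a small Python sketch of that arithmetic. The function name and the back-to-back example values are hypothetical, since the tab count for the third option depends on design choices that have not been made.

```python
ITERATIONS = 7  # iterations in a standard WebXPRT 4 test
WORKLOADS = 6   # workloads in the WebXPRT 4 suite

# Option 1: one tab per full workload cycle (one cycle per iteration).
tabs_per_cycle = ITERATIONS

# Option 2: one tab per individual workload iteration.
tabs_per_workload_iteration = ITERATIONS * WORKLOADS


def tabs_after_back_to_back_tests(tests: int, tabs_per_test: int) -> int:
    """Option 3 (hypothetical): tabs left open after several full tests,
    assuming every test leaves the same number of tabs behind."""
    return tests * tabs_per_test


print(tabs_per_cycle)
print(tabs_per_workload_iteration)
print(tabs_after_back_to_back_tests(tests=3, tabs_per_test=tabs_per_cycle))
```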

If we do decide to add multi-tab features to a future version of WebXPRT, we could integrate them into the main score or make them optional and thus not affect traditional WebXPRT testing. We’re looking at all these options.

Whenever we have multiple choices, we seek your input. We want to know if a feature like this is something you’d like to see. Below, you’ll find two quick survey questions that will help us gauge your interest in this topic. We would appreciate your input!

If you’d like to share additional thoughts or ideas related to possible multi-tab features, please let us know!

Justin

Posted in benchmark, Benchmark metrics, Benchmarking, Benchmarking computing devices, BenchmarkXPRT, browser performance, Browser-based benchmarks, Collaborative benchmark development, Cross-platform benchmarks, Future of performance evaluation, Performance benchmarking, WebXPRT, WebXPRT 4 | Tagged BenchmarkXPRT, BenchmarkXPRT Development Community, browser benchmark, browser performance, browser tabs, browsers, CPU, laptop performance, laptops, RAM, survey, test results, WebXPRT, WebXPRT 4 |

Browser-based AI tests in WebXPRT 4: optical character recognition

By Justin Greene

on July 3, 2025

In our previous blog post, we discussed the rapidly expanding influence of AI-enhanced technologies in areas like everyday browser activity, along with the growing need for objective performance data that can help us understand how well new consumer devices will handle AI tasks. We noted that WebXPRT 4 already includes timed AI tasks in two of its workloads, the “Organize Album using AI” workload and the “Encrypt Notes and OCR Scan” workload, and we provided some technical details for the Organize Album workload. In today’s post, we’ll focus on the Encrypt Notes workload.

The Encrypt Notes workload includes two separate scenarios that reflect common web-based productivity app tasks. The first scenario syncs a set of encrypted notes, and the second scenario uses AI-based optical character recognition (OCR) to scan a receipt, extract data, and then add that data to an expense report.

Here are the details for each scenario:

The encrypt notes scenario downloads a set of notes, encrypts that data, temporarily stores it in the browser’s localStorage object (the localStorageDB.js database layer), and then decrypts and renders it for display. This scenario measures HTML5 Local Storage, JavaScript, AES encryption, and WebAssembly (Wasm) performance.
The OCR scan scenario uses a Wasm-based version of Tesseract.js (tesseract-core.wasm.js v2.20) to scan an expense receipt. Tesseract.js is a JavaScript port of the Tesseract OCR engine—a popular open-source C/C++ library that extracts text from images and PDFs. The scenario then adds the receipt to an expense report. This scenario measures HTML5 Local Storage, JavaScript, and Wasm performance.

We mention this test under the AI umbrella in part because people sometimes use the term “OCR” to refer to a spectrum of AI and non-AI technologies. In this case, though, the specifics show that the workload has an AI component. The Wasm-based Tesseract library that we use in WebXPRT 4 is based on a version of the Tesseract engine (v4.x) that uses Long Short-Term Memory (LSTM). LSTM is a type of recurrent neural network (RNN) that is particularly well-suited for processing and predicting sequential data, which makes the OCR scan an AI component of the Encrypt Notes and OCR Scan workload.

To produce a score for each iteration of the workload, WebXPRT calculates the total time that it takes for a system to sync (encrypt, decrypt, and render) the notes, use OCR to scan the receipt, and add the scanned data to an expense report. In a standard test, WebXPRT runs seven iterations of the entire six-workload performance suite before calculating an overall test score. You can find out more about the WebXPRT results calculation process here.
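As a rough illustration of the per-iteration total described above, the following Python sketch adds up the three timed steps for each of seven iterations. The class name, field names, and timings are hypothetical, and the sketch stops at the per-iteration totals because the step that turns those totals into an overall score is not covered in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IterationTiming:
    """Hypothetical timings (ms) for one iteration of the workload."""
    sync_ms: int    # encrypt, decrypt, and render the notes
    ocr_ms: int     # scan the receipt with OCR
    report_ms: int  # add the scanned data to the expense report

    @property
    def total_ms(self) -> int:
        # The iteration total is the sum of the three timed steps.
        return self.sync_ms + self.ocr_ms + self.report_ms


# Seven iterations with made-up values, matching a standard test's length.
iterations = [
    IterationTiming(410, 820, 35),
    IterationTiming(402, 811, 33),
    IterationTiming(398, 805, 34),
    IterationTiming(405, 815, 32),
    IterationTiming(400, 808, 36),
    IterationTiming(399, 810, 31),
    IterationTiming(403, 812, 33),
]

totals = [iteration.total_ms for iteration in iterations]
print(totals)
```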

Along with our post on the Organize Album workload, we hope this information provides a deeper understanding of WebXPRT 4’s AI-equipped workloads. As we mentioned last time, if you want to explore the structure of these workloads in more detail, you can check out previous blog posts for information about how to access and use the WebXPRT 4 source code for free. You can also read more about WebXPRT’s overall structure and other workloads in the Exploring WebXPRT 4 white paper.

If you have any questions about WebXPRT 4, please let us know!

Justin

Posted in AI, benchmark, Benchmark metrics, Benchmarking, BenchmarkXPRT, browser performance, Browser-based benchmarks, computer vision, Cross-platform benchmarks, HTML5, inference, OCR, on-device AI, Performance benchmarking, Wasm, Web-based testing, WebAssembly, WebXPRT, WebXPRT 4, White papers | Tagged AI, AI workloads, browser benchmark, browser performance, encryption, JavaScript, LSTM, OCR, optical character recognition, recurrent neural network, RNN, Tesseract, Web AI, web apps, WebAssembly, WebXPRT, WebXPRT 4 |

Browser-based AI tests in WebXPRT 4: face detection and image classification

By Justin Greene

on June 5, 2025

I recently revisited an XPRT blog entry that we posted from CES Las Vegas back in 2020. In that post, I reflected on the show’s expanded AI emphasis, and I wondered if we were reaching a tipping point where AI-enhanced and AI-driven tools and applications would become a significant presence in people’s daily lives. It felt like we were approaching that point back then with the prevalence of AI-powered features such as image enhancement and text recommendation, among many others. Now, seamless AI integration with common online tasks has become so widespread that many people unknowingly benefit from AI interactions several times a day.

As AI’s role in areas like everyday browser activity continues to grow—along with our expectations for what our consumer devices should be able to handle—reliable AI-oriented benchmarking is more vital than ever. We need objective performance data that can help us understand how well a new desktop, laptop, tablet, or phone will handle AI tasks.

WebXPRT 4 already includes timed AI tasks in two of its workloads: the “Organize Album using AI” workload and the “Encrypt Notes and OCR Scan” workload. These two workloads reflect the types of light browser-side inference tasks that are now fairly common in consumer-oriented web apps and extensions. In today’s post, we’ll provide some technical information about the Organize Album workload. In a future post, we’ll do the same for the Encrypt Notes workload.

The Organize Album workload includes two different timed tasks that reflect a common scenario of organizing online photo albums. The workload utilizes the AI inference and JavaScript capabilities of the WebAssembly (Wasm) version of OpenCV.js—an open-source computer vision and machine learning library. In WebXPRT 4, we used OpenCV.js version 4.5.2.

Here are the details for each task:

The first task measures the time it takes to complete a face detection job with a set of five 720 x 480 photos that we sourced from commercial photo sites. The workload loads a Caffe deep learning framework model (res10_300x300_ssd_iter_140000_fp16.caffemodel) using the commands found here.
The second task measures the time it takes to complete an image classification job (labeling based on object detection) with a different set of five 718 x 480 photos that we sourced from the ImageNet computer vision dataset. The workload loads an ONNX-based SqueezeNet machine learning model (squeezenet.onnx v 1.0) using the commands found here.

To produce a score for each iteration of the workload, WebXPRT calculates the total time that it takes for a system to organize both albums. In a standard test, WebXPRT runs seven iterations of the entire six-workload performance suite before calculating an overall test score. You can find out more about the WebXPRT results calculation process here.

We hope this post will give you a better sense of how WebXPRT 4 measures one kind of AI performance. As a reminder, if you want to dig into the details at a more granular level, you can access the WebXPRT 4 source code for free. In previous blog posts, you can find information about how to access and use the code. You can also read more about WebXPRT’s overall structure and other workloads in the Exploring WebXPRT 4 white paper.

If you have any questions about this workload or any other aspect of WebXPRT 4, please let us know!

Justin

Posted in AI, benchmark, Benchmark metrics, Benchmarking, Benchmarking computing devices, BenchmarkXPRT, browser performance, Browser-based benchmarks, Caffe, CES, Collaborative benchmark development, computer vision, Consumer Electronics Show, Cross-platform benchmarks, face detection, image classification, ImageNet, inference, JavaScript, Las Vegas, object detection, OCR, On-premise, ONNX Runtime Web, Performance benchmarking, SqueezeNet, Wasm, WebAssembly, WebXPRT, WebXPRT 4 | Tagged AI, artificial intelligence, BenchmarkXPRT, BenchmarkXPRT Development Community, browser benchmark, browser performance, face detection, image classification, image processing, JavaScript, object detection, OpenCV, WASM, WebAssembly, WebXPRT, WebXPRT 4 |

Best practices for WebXPRT testing

By Justin Greene

on May 8, 2025

One of the strengths of WebXPRT is that it’s a remarkably easy benchmark to run. Its upfront simplicity attracts users with a wide range of technical skills—everyone from engineers in cutting-edge OEM labs to veteran tech journalists to everyday folks who simply want to test their gear’s browser performance. With so many different kinds of people running the test each day, it’s certain that at least some of them use very different approaches to testing. In today’s blog, we’re going to share some of the key benchmarking practices we follow in the XPRT lab—and encourage you to consider—in order to produce the most consistent and reliable WebXPRT scores.

We offer these best practices as tips you might find useful in your testing. Each step relates to evaluating browser performance with WebXPRT, but several of these practices will apply to other benchmarks as well.

Test with clean images: In the XPRT lab, we typically use an out-of-box (OOB) method for testing new devices. OOB testing means that other than running the initial OS and browser version updates that users are likely to run after first turning on the device, we change as little as possible before testing. We want to assess the performance that buyers are likely to see when they first purchase the device and before they install additional software. This approach is the best way to provide an accurate assessment of the performance retail buyers will experience from their new devices. That said, the OOB method is not appropriate for certain types of testing, such as when you want to compare largely identical systems or when you want to remove as much pre-loaded software as possible. The OOB method is less relevant to users who want to see how their device performs as it is.
Browser updates can have a significant impact: Most people know that different browsers often produce different performance scores on the same system. They may not know that there can be shifts in performance between different versions of the same browser. While most browser updates don’t have a large impact on performance, a few updates have increased (or even decreased) browser performance by a significant amount. For this reason, it’s always important to record and disclose the extended browser version number for each test run. The same principle applies to any other relevant software.
Turn off automatic updates: We do our best to eliminate or minimize app and system updates after initial setup. Some vendors are making it more difficult to turn off updates completely, but you should always double-check update settings before testing. On Windows systems, the same considerations apply to turning off User Account Control notifications.
Let the system settle: Depending on the system and the OS, a significant amount of system-level activity can be going on in the background after you turn it on. As much as possible, we like to wait for a stable baseline (idle time) of system activity before kicking off a test. If we start testing immediately after booting the system, we often see higher variance in the first run before the scores start to tighten up.
Run the test more than once: Because of natural variance, our standard practice in the XPRT lab is to publish a score that represents the median of three to five runs, if not more. If you run a benchmark only once and the score differs significantly from other published scores, your result could be an outlier that you would not see again under stable testing conditions or over the course of multiple runs.
Clear the cache: Browser caching can improve web page performance, including the loading of the types of JavaScript and HTML5 assets that WebXPRT uses in its workloads. Depending on the platform under test, browser caching may or may not significantly change WebXPRT scores, but clearing the cache before testing and between each run can help improve the accuracy and consistency of scores.
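The "run the test more than once" practice is easy to automate. Below is a minimal Python sketch that reports the median of a set of runs and a simple spread figure for spotting outliers. The function names and sample scores are hypothetical, and the spread calculation is an illustrative choice rather than part of the XPRT lab's method.

```python
from statistics import median


def reported_score(run_scores: list[float]) -> float:
    """Median of the runs, requiring at least three of them."""
    if len(run_scores) < 3:
        raise ValueError("run the benchmark at least three times")
    return median(run_scores)


def spread_percent(run_scores: list[float]) -> float:
    """Range of the scores (max minus min) as a percentage of the median."""
    return (max(run_scores) - min(run_scores)) * 100 / median(run_scores)


# Made-up scores from five runs on one system; 263 looks like an outlier.
scores = [251, 248, 263, 250, 249]

print(reported_score(scores))
print(spread_percent(scores))
```

Because the median ignores how far the extreme values sit from the middle, a single unusually high or low run does not shift the published number the way it would shift an average.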

We hope these tips will serve as a good baseline methodology for your WebXPRT testing. If you have any questions about WebXPRT, the other XPRTs, or benchmarking in general, please let us know!

Justin

Posted in benchmark, Benchmark metrics, Benchmarking, BenchmarkXPRT, browser performance, Browser-based benchmarks, Cross-platform benchmarks, Performance benchmarking, Performance testing on tablets, WebXPRT, WebXPRT 4, Windows | Tagged benchmark, BenchmarkXPRT, browser benchmark, browser performance, HTML5, JavaScript, OOB, Performance, WebXPRT, WebXPRT 4, Windows, XPRTs |
