Category: Performance benchmarking

Speaking of potential future WebXPRT workloads

By Justin Greene

on October 31, 2024

In recent blog posts, we’ve discussed several types of potential future WebXPRT workloads—from an auxiliary AI-focused workload to a WebXPRT battery life test—and many of the factors that we would need to consider when developing those workloads. In today’s post, we’re discussing other types of workloads that we may consider for future WebXPRT versions. We’re also inviting you to send us your WebXPRT workload ideas!

Currently, the most promising web technology for future WebXPRT workloads is WebAssembly (Wasm). Wasm is a binary instruction format that works across all modern browsers, provides a sandboxed environment that operates at near-native speeds, and takes advantage of common hardware specs across platforms. Wasm’s capabilities offer web developers significant flexibility in running complex client applications within the browser.

We first made use of Wasm in WebXPRT 4’s Organize Album and Encrypt Notes workloads, but Wasm has the potential to support many more types of test scenarios. Here are just a few of the use-case categories that Wasm supports:

Gaming
Image and video editing
Video augmentation
CAD applications
Interactive learning portals
Language translation

Those categories and the possibilities they open for additional workloads are exciting! When thinking through possible new workload scenarios, it’s important to remember that workload proposals need to fit within a set of basic guidelines that uphold WebXPRT’s strengths as a benchmark. You can read about those guidelines in more detail in this blog post, but in short, new workloads ideally should

be relevant to real-life scenarios
have cross-platform support
clearly differentiate in their performance between different types of devices
produce consistent and easily replicated results

After testing with WebXPRT or reviewing the list of use cases that Wasm supports, have you considered a new workload or test scenario that you would like to see? If so, please let us know! Your ideas could end up playing a role in shaping the next version of WebXPRT!

Justin

Posted in battery life, benchmark, Benchmarking, BenchmarkXPRT, browser performance, Browser-based benchmarks, Collaborative benchmark development, Cross-platform benchmarks, Future of performance evaluation, image processing, Performance benchmarking, WebAssembly, WebXPRT, WebXPRT 4, What makes a good benchmark? | Tagged benchmark, BenchmarkXPRT, browser benchmark, browser performance, cross-platform, gaming, image processing, WebAssembly, WebXPRT, WebXPRT 4 |

Thinking through a potential WebXPRT 4 battery life test

By Justin Greene

on October 17, 2024

In recent blog posts, we’ve discussed some of the technical considerations we’re working through on our path toward a future AI-focused WebXPRT 4 auxiliary workload. While we’re especially excited about adding to WebXPRT 4’s AI performance evaluation capabilities, AI is not the only area of potential WebXPRT 4 expansion that we’ve thought about. We’re always open to hearing suggestions for ways we can improve WebXPRT 4, including any workload proposals you may have. Several users have asked about the possibility of a WebXPRT 4 battery life test, so today we’ll discuss what one might look like and some of the challenges we’d have to overcome to make it a reality.

Battery life tests fall into two primary categories: simple rundown tests and performance-weighted tests. Simple rundown tests measure battery life during long idle periods, looped movie playback, and similar steady-state activities, but they do not reflect the wide-ranging mix of activities that characterize a typical day for most users. While they can be useful for performing very specific apples-to-apples comparisons, these tests don’t always give consumers an accurate estimate of the battery life they would experience in daily use.

In contrast, performance-weighted battery life tests, such as the one in CrXPRT 2, attempt to reflect real-world usage. The CrXPRT battery life test simulates common daily usage patterns for Chromebooks by including all the productivity workloads from the performance test, plus video playback, audio playback, and gaming scenarios. It also includes periods of wait/idle time. We believe this mixture of diverse activity and idle time better represents typical real-life behavior patterns. This makes the resulting estimated battery life much more helpful for consumers who are trying to match a device’s capabilities with their real-world needs.

From a technical standpoint, WebXPRT’s cross-platform nature presents us with several challenges that we did not face while developing the CrXPRT battery life test for ChromeOS. While the WebXPRT performance tests run in almost any browser, cross-browser differences and limitations in battery life reporting may restrict any future battery life test to a single browser or browser family. For instance, with the W3C Battery Status API, we can currently query battery status data from non-mobile Chromium-based browsers (e.g., Chrome, Edge, Opera, etc.), but not from Firefox or Safari. If a WebXPRT 4 battery life test supported only a single browser family, such as Chromium-based browsers, would you still be interested in using it? Please let us know.
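As a rough illustration of the browser differences described above, here is a minimal TypeScript sketch of how a page might check for the Battery Status API before trying to read the battery level. The interface and function names are hypothetical and are not part of WebXPRT. The only browser call the sketch relies on is navigator.getBattery(), which returns a promise for an object with charging and level properties.

```typescript
// Minimal shapes for the parts of the Battery Status API used here.
// They are declared locally (an assumption for this sketch) because not
// every TypeScript DOM library includes them.
interface BatteryLike {
  charging: boolean;
  level: number; // 0.0 (empty) to 1.0 (full)
}

interface NavigatorLike {
  getBattery?: () => Promise<BatteryLike>;
}

// Returns true when the given navigator object exposes getBattery().
function supportsBatteryStatus(nav: NavigatorLike): boolean {
  return typeof nav.getBattery === "function";
}

// Returns null when the API is unavailable; otherwise returns a promise
// for the battery level as a whole-number percentage.
function readBatteryPercent(nav: NavigatorLike): Promise<number> | null {
  if (!nav.getBattery) {
    return null;
  }
  return nav.getBattery().then((battery) => Math.round(battery.level * 100));
}

// Browser usage (hypothetical):
//   const pending = readBatteryPercent(navigator as NavigatorLike);
//   if (pending === null) { /* fall back to manual battery readings */ }
```

In a browser without the API, readBatteryPercent returns null, which is the case where a test would have to rely on the user to record the starting and ending battery capacities.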

A browser-based battery life workflow also presents other challenges that we do not face in native client applications, such as CrXPRT:

A browser-based battery life test may require the user to check the starting and ending battery capacities, with no way for the app to independently verify data accuracy.
The battery life test could require more babysitting in the event of network issues. We can catch network failures and try to handle them by reporting periods of network disconnection, but those interruptions could influence the battery life duration.
The factors above could make it difficult to achieve repeatability. One way to address that problem would be to run the test in a standardized lab environment with a steady internet connection, but a long list of standardized environmental requirements would make the battery life test less attractive and less accessible to many testers.

We’re not sharing these thoughts to make a WebXPRT 4 battery life test seem like an impossibility. Rather, we want to offer our perspective on what the test might look like and describe some of the challenges and considerations in play. If you have thoughts about battery life testing, or experience with battery life APIs in one or more of the major browsers, we’d love to hear from you!

Justin

Posted in AI, battery life, benchmark, Benchmark metrics, Benchmarking, browser performance, Browser-based benchmarks, Chrome, Chromium, Collaborative benchmark development, Cross-platform benchmarks, CrXPRT, Firefox, Future of performance evaluation, Performance benchmarking, Safari | Tagged AI, AI workloads, battery life, browser benchmark, browser performance, Chrome, Chromium, cross-platform, CrXPRT, CrXPRT 2, Edge, Opera, WebXPRT, WebXPRT 4 |

Gain a deeper understanding of WebXPRT 4 with our results calculation white paper

By Justin Greene

on October 3, 2024

More people around the world are using WebXPRT 4 now than ever before. It’s exciting to see that growth, which also means that many people are visiting our site and learning about the XPRTs for the first time. Because new visitors may not know how the XPRT family of benchmarks differs from other benchmarking efforts, we occasionally like to revisit the core values of our open development community here in the blog—and show how those values translate into more free resources for you.

One of our primary values is transparency in all our benchmark development and testing processes. We share information about our progress with XPRT users throughout the development process, and we invite people to contribute ideas and feedback along the way. We also publish both the source code of our benchmarks and detailed information about how they work, unlike benchmarks that use a “black box” model.

For WebXPRT 4 users who are interested in knowing more about the nuts and bolts of the benchmark, we offer several information-packed resources, including our focus for today, the WebXPRT 4 results calculation and confidence interval white paper. The white paper explains the WebXPRT 4 confidence interval, how it differs from typical benchmark variability, and the formulas the benchmark uses to calculate the individual workload scenario scores and overall score on the end-of-test results screen. The paper also provides an overview of the statistical methodology that WebXPRT uses to translate raw timings into scores.

In addition to the white paper’s discussion of the results calculation process, we’ve also provided a results calculation spreadsheet that shows the raw data from a sample test run and reproduces the calculations WebXPRT uses to generate both the workload scores and an overall score.

In potential future versions of WebXPRT, it’s likely that we’ll continue to use the same—or very similar—statistical methodologies and results calculation formulas that we’ve documented in the results calculation white paper and spreadsheet. That said, if you have suggestions for how we could improve those methods or formulas—either in part or in whole—please don’t hesitate to contact us. We’re interested in hearing your ideas!

The white paper is available on WebXPRT.com and on our XPRT white papers page. If you have any questions about the paper or spreadsheet, WebXPRT, or the XPRTs in general, please let us know.

Justin

Posted in benchmark, Benchmark metrics, Benchmarking, BenchmarkXPRT, BenchmarkXPRT development community, browser performance, Browser-based benchmarks, Collaborative benchmark development, Cross-platform benchmarks, Performance benchmarking, results, Source code, WebXPRT, WebXPRT 4, White papers | Tagged benchmark, BenchmarkXPRT, BenchmarkXPRT Development Community, browser benchmark, browser performance, cross-platform, open development, results, source code, WebXPRT, WebXPRT 4, white paper, XPRTs |

Web APIs: Possible paths for the AI-focused WebXPRT 4 auxiliary workload

By Justin Greene

on September 19, 2024

In our last blog post, we discussed one of the major decision points we’re facing as we work on what we hope will be the first new AI-focused WebXPRT 4 auxiliary workload: choosing a Web AI framework. In today’s blog, we’re discussing another significant decision that we need to make for the future workload’s development path: choosing a web API.

Many of you are familiar with the concept of an application programming interface (API). Simply put, an API is a set of software rules, tools, and protocols that acts as an intermediary, allowing different computer programs or components to communicate with each other. APIs simplify many development tasks for programmers and provide standardized ways for applications to share data, functions, and system resources.

Web APIs fulfill the intermediary role of an API—through HTTP-based communication—for web servers (on the server side) or web browsers (on the client side). Client-side web APIs make it possible for browser-based applications to expand browser functionality. They execute the kinds of JavaScript, HTML5, and WebAssembly (Wasm) workloads—among other examples—that support the wide variety of browser extensions many of us use every day. WebXPRT uses those types of browser-based workloads to evaluate system performance. To lay a solid foundation for the first future browser-based AI workload, we need to choose a web API that will be compatible with WebXPRT and the Web AI framework and AI inference workload(s) we ultimately choose.

Currently, there are three main web API paths for running AI inference in a web browser: Web Neural Network (WebNN), Wasm, and WebGPU. These three web technologies are in various stages of development and standardization. Each has different levels of support within the major browsers. Here are basic overviews of each of the three options, as well as a few of our thoughts on the benefits and limitations that each may bring to the table for a future WebXPRT AI workload:

WebNN is a JavaScript API that enables developers to directly execute machine learning (ML) tasks on neural networks within web-based applications. WebNN makes it easier to integrate ML models into web apps, and it allows web apps to leverage the power of neural processing units (NPUs). WebNN has a lot going for it. It’s hardware-agnostic and works with various ML frameworks. It’s likely to be a major player in future browser-based inference applications. However, as a web standard, WebNN is still in the development stage and is only available in developer previews for Chromium-based browsers. Full default WebNN support could take a year or more.
Wasm is a binary instruction format that works across all modern browsers. Wasm provides a sandboxed environment that operates at near-native speeds and takes advantage of common hardware specs across platforms. Wasm’s capabilities offer web developers a great deal of flexibility for running complex client applications in the browser. Simply put, Wasm can help developers adapt their existing code for additional platforms and browser-based applications without requiring extensive code rewrites. Wasm’s flexibility and cross-platform compatibility are among the reasons that we’ve already made use of Wasm in two existing WebXPRT 4 workloads that feature AI tasks: Organize Album using AI, and Encrypt Notes and OCR Scan. Wasm can also work together with other web APIs, such as WebGPU.
WebGPU enables web-based applications to directly access the graphics rendering and computational capabilities of a system’s GPU. The parallel computational abilities of GPUs make them especially well-suited to efficiently handle some of the demands of AI inference workloads, including image-based GenAI workloads or large language models. Google Chrome and Microsoft Edge currently support WebGPU, and it’s available in Safari through a tech preview.

Right now, we don’t think that WebNN will be fully out of the development phase in time to serve as our go-to web API for a new WebXPRT AI workload. Wasm and/or WebGPU appear to be our best options for now. When WebNN is fully baked and available in mainstream browsers, it’s possible that we could port any existing Wasm- or WebGPU-based WebXPRT AI workloads to WebNN, which may open the possibility of cross-platform browser-based NPU performance comparisons.
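To make the differing levels of browser support more concrete, here is a minimal TypeScript sketch of how a page could check which of the three paths a browser exposes. The type and function names are hypothetical and are not part of WebXPRT. The sketch assumes the usual entry points: navigator.ml for WebNN, navigator.gpu for WebGPU, and the global WebAssembly object for Wasm.

```typescript
// Minimal view of the browser globals this sketch inspects.
// Passing the environment in as an argument keeps the function testable.
interface BrowserEnv {
  navigator?: { gpu?: unknown; ml?: unknown };
  WebAssembly?: unknown;
}

type InferencePath = "WebNN" | "WebGPU" | "Wasm";

// Lists the paths whose entry points are present in the given environment.
function availablePaths(env: BrowserEnv): InferencePath[] {
  const paths: InferencePath[] = [];
  const nav = env.navigator;
  if (nav && nav.ml !== undefined) {
    paths.push("WebNN"); // navigator.ml is present
  }
  if (nav && nav.gpu !== undefined) {
    paths.push("WebGPU"); // navigator.gpu is present
  }
  if (typeof env.WebAssembly === "object" && env.WebAssembly !== null) {
    paths.push("Wasm"); // global WebAssembly object is present
  }
  return paths;
}

// Example with a mocked environment that exposes WebGPU and Wasm only.
console.log(availablePaths({ navigator: { gpu: {} }, WebAssembly: {} }));

// Browser usage (hypothetical):
//   availablePaths(window as unknown as BrowserEnv);
```

Finding an entry point only shows that the browser exposes the API. It does not guarantee that a usable device is behind it. For example, navigator.gpu.requestAdapter() can still resolve to null, so a real workload would need a further check before choosing a path.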

All that said, and as we mentioned in our previous post about Web AI frameworks, we have not made any final decisions about a web API or any aspect of the future workload. We’re still in the early stages of this project, and we want your input.

If this discussion has sparked web AI ideas that you think would benefit the process, or if you have feedback you’d like to share, please feel free to contact us!

Justin

Posted in AI, benchmark, Benchmarking, BenchmarkXPRT, browser performance, Browser-based benchmarks, Chrome, Chromium, Collaborative benchmark development, Cross-platform benchmarks, Future of performance evaluation, GenAI, Google, Google Chrome, GPU, HTML5, inference, JavaScript, large language models, Machine learning, Microsoft, Microsoft Edge, NPU, Performance benchmarking, Wasm, Web AI, web API, WebAssembly, WebXPRT, WebXPRT 4 | Tagged AI, AI workloads, browser benchmark, browser performance, Chromium, cross-platform, HTML5, inference, JavaScript, machine learning, ML, WASM, Web AI, web API, WebAssembly, WebGPU, WebNN, WebXPRT, WebXPRT 4 |

Web AI frameworks: Possible paths for the AI-focused WebXPRT 4 auxiliary workload

By Justin Greene

on September 5, 2024

A few months ago, we announced that we’re moving forward with the development of a new auxiliary WebXPRT 4 workload focused on local, browser-side AI technology. Local AI has many potential benefits, and it now seems safe to say that it will be a common fixture of everyday life for many people in the future. As the growth of browser-based inference technology picks up steam, our goal is to equip WebXPRT 4 users with the ability to quickly and reliably evaluate how well devices can handle substantial local inference tasks in the browser.

To reach our goal, we’ll need to make many well-researched and carefully considered decisions along the development path. Throughout the decision-making process, we’ll be balancing our commitment to core XPRT values, such as ease of use and widespread compatibility, with the practical realities of working with rapidly changing emergent technologies. In today’s blog, we’re discussing one of the first decision points that we face—choosing a Web AI framework.

AI frameworks are suites of tools and libraries that serve as building blocks for developers to create new AI-based models and apps or integrate existing AI functions in custom ways. AI frameworks can be commercial, such as OpenAI, or open source, such as Hugging Face, PyTorch, and TensorFlow. Because the XPRTs are available at no cost for users and we publish our source code, open-source frameworks are the right choice for WebXPRT.

Because the new workload will focus on locally powered, browser-based inference tasks, we also need to choose an AI framework that has browser integration capabilities and does not rely on server-side computing. These types of frameworks—called Web AI—use JavaScript (JS) APIs and other web technologies, such as WebAssembly and WebGPU, to run machine learning (ML) tasks on a device’s CPU, GPU, or NPU.

Several emerging Web AI frameworks may provide the compatibility and functionality we need for the future WebXPRT workload. Here are a few that we’re currently researching:

ONNX Runtime Web: Microsoft and other partners developed the Open Neural Network Exchange (ONNX) as an open standard for ML models. With available tools, users can convert models from several AI frameworks to ONNX, which can then be used by ONNX Runtime Web. ONNX Runtime Web allows developers to leverage the broad compatibility of ONNX-formatted ML models—including pre-trained vision, language, and GenAI models—in their web applications.
Transformers.js: Transformers.js, which uses ONNX Runtime Web, is a JS library that allows users to run AI models from the browser and offline. Transformers.js supports language, computer vision, and audio ML models, among others.
MediaPipe: Google developed MediaPipe as a way for developers to adapt TensorFlow-based models for use across many platforms in real-time on-device inference applications such as face detection and gesture recognition. MediaPipe is particularly useful for inference work in images, videos, and live streaming.
TensorFlow.js: TensorFlow has been around for a long time, and the TensorFlow ecosystem provides users with a broad variety of models and datasets. TensorFlow is an end-to-end ML solution—training to inference—but with available pre-trained models, developers can focus on inference. TensorFlow.js is an open-source JS library that helps developers integrate TensorFlow with web apps.

We have not made final decisions about a Web AI framework or any aspect of the future workload. We’re still in the research, discussion, and experimentation stages of development, but we want to be transparent with our readers about where we are in the process. In future blog posts, we’ll discuss some of the other major decision points in play.

Most of all, we invite you to join us in these discussions, make recommendations, and give us any other feedback or suggestions you may have, so please feel free to share your thoughts!

Justin

Posted in AI, benchmark, Benchmarking, BenchmarkXPRT, BenchmarkXPRT development community, browser performance, Browser-based benchmarks, Collaborative benchmark development, computer vision, Cross-platform benchmarks, face detection, Future of performance evaluation, Google, GPU, image classification, inference, JavaScript, Machine learning, MediaPipe, Microsoft, NPU, on-device AI, ONNX Runtime Web, Open Source, OpenAI, Performance benchmarking, PyTorch, TensorFlow, Transformers.js, Video, Web AI, Web-based testing, WebAssembly, WebGPU, WebXPRT, WebXPRT 4 | Tagged AI, artificial intelligence, browser benchmark, browser performance, Google, HuggingFace, machine learning, Microsoft, ML, ONNX, ONNX Runtime Web, OpenAI, PyTorch, TensorFlow, Web AI, WebAssembly, WebXPRT, WebXPRT 4 |

How to automate your WebXPRT 4 testing

By Justin Greene

on August 22, 2024

We’re excited about the ongoing upward trend in the number of completed WebXPRT 4 runs that we’re seeing each month. OEM and tech press labs are responsible for a significant amount of that growth, and many of them use WebXPRT’s automation features to complete large blocks of hands-off testing at one time. We realize that many new WebXPRT users may be unfamiliar with the benchmark’s automation tools, so we decided to provide a quick guide to WebXPRT automation in today’s blog. Whether you’re testing one or 1,000 devices, the instructions below can help save you some time.

WebXPRT 4 allows users to run scripts in an automated fashion and control test execution by appending parameters and values to the WebXPRT URL. Three parameters are available:

test type
test scenarios
results

Below, you’ll find a description of those parameters and instructions for how you can use them to automate your test runs.

Test type

The WebXPRT automation framework accounts for two test types: (1) the six core workloads, and (2) any experimental workloads we might add in future builds. There are currently no experimental tests in WebXPRT 4, so always set the test type variable to 1.

Core tests: 1

Test scenario

The test scenario parameter lets you specify which subtest workloads to run by using the following codes:

Photo enhancement: 1
Organize album using AI: 2
Stock option pricing: 4
Encrypt notes and OCR scan using WASM: 8
Sales graphs: 16
Online homework: 32

To run a single subtest workload, use its code. To run multiple workloads, use the sum of their codes. For example, to run Stock option pricing (4) and Encrypt notes and OCR scan (8), use the sum of 12. To run all core tests, use 63, the sum of all the individual test codes (1 + 2 + 4 + 8 + 16 + 32 = 63).
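If you script your test runs, you can compute the test scenario value rather than adding the codes by hand. Here is a minimal TypeScript sketch of that arithmetic. The constant, type, and function names are hypothetical and are not part of WebXPRT. Only the numeric codes come from the list above.

```typescript
// Workload codes from the list above. The key names are illustrative only.
const WORKLOAD_CODES = {
  photoEnhancement: 1,
  organizeAlbum: 2,
  stockOptionPricing: 4,
  encryptNotesOcr: 8,
  salesGraphs: 16,
  onlineHomework: 32,
} as const;

type Workload = keyof typeof WORKLOAD_CODES;

// Sums the codes of the selected workloads, counting each workload once
// even if it appears more than once in the input.
function scenarioValue(workloads: Workload[]): number {
  let total = 0;
  for (const name of Object.keys(WORKLOAD_CODES) as Workload[]) {
    if (workloads.indexOf(name) !== -1) {
      total += WORKLOAD_CODES[name];
    }
  }
  return total;
}

const allWorkloads = Object.keys(WORKLOAD_CODES) as Workload[];

console.log(scenarioValue(["stockOptionPricing", "encryptNotesOcr"])); // 12
console.log(scenarioValue(allWorkloads)); // 63
```

Because each code is a distinct power of two, every combination of workloads produces a unique sum, so a single number is enough to identify any selection.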

Results format

The results format parameter lets you select the format of the results:

Display the result as an HTML table: 1
Display the result as XML: 2
Display the result as CSV: 3
Download the result as CSV: 4

To use the automation feature, start with the URL https://www.principledtechnologies.com/benchmarkxprt/webxprt/2021/wx4_build_3_7_3/auto.php, append a question mark (?), and add the parameters and values separated by ampersands (&). For example, to run all the core tests and download the results, you would use the following URL: https://principledtechnologies.com/benchmarkxprt/webxprt/2021/wx4_build_3_7_3/auto.php?testtype=1&tests=63&result=4
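For labs that generate many test URLs, here is a minimal TypeScript sketch that builds an automation URL from the three parameters. The interface and function names are hypothetical and are not part of WebXPRT. The parameter names (testtype, tests, result) and the base URL are copied from the example URL above.

```typescript
// Values for the three automation parameters described in this post.
interface AutomationOptions {
  testType: number; // 1 = core tests
  tests: number;    // sum of workload codes, e.g. 63 for all core tests
  result: number;   // 1 = HTML table, 2 = XML, 3 = CSV, 4 = download CSV
}

// Appends the parameters to the base URL as a query string.
function buildAutomationUrl(baseUrl: string, options: AutomationOptions): string {
  const query = [
    "testtype=" + options.testType,
    "tests=" + options.tests,
    "result=" + options.result,
  ].join("&");
  return baseUrl + "?" + query;
}

// Base URL taken from the example above.
const BASE_URL =
  "https://principledtechnologies.com/benchmarkxprt/webxprt/2021/wx4_build_3_7_3/auto.php";

// Run all core tests and download the results as CSV.
console.log(buildAutomationUrl(BASE_URL, { testType: 1, tests: 63, result: 4 }));
```

With those values, the helper reproduces the example URL shown above. Changing tests to 12 would run only the Stock option pricing and Encrypt notes and OCR scan workloads.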

We hope WebXPRT 4’s automation features will make testing easier for you. If you have any questions about WebXPRT or the automation process, please feel free to ask!

Justin

Posted in Automation, benchmark, Benchmark metrics, Benchmarking, browser performance, Browser-based benchmarks, Performance benchmarking, results, tech press, WebXPRT, WebXPRT 4 | Tagged automation, benchmark, BenchmarkXPRT, browser benchmark, browser performance, Principled Technologies, PT, test, test results, WebXPRT, WebXPRT 4 |
