face detection | BenchmarkXPRT Development Community Blog

A few months ago, we announced that we’re moving forward with the development of a new auxiliary WebXPRT 4 workload focused on local, browser-side AI technology. Local AI has many potential benefits, and it now seems safe to say that it will be a common fixture of everyday life for many people in the future. As the growth of browser-based inference technology picks up steam, our goal is to equip WebXPRT 4 users with the ability to quickly and reliably evaluate how well devices can handle substantial local inference tasks in the browser.

To reach our goal, we’ll need to make many well-researched and carefully considered decisions along the development path. Throughout the decision-making process, we’ll be balancing our commitment to core XPRT values, such as ease of use and widespread compatibility, with the practical realities of working with rapidly changing emergent technologies. In today’s blog, we’re discussing one of the first decision points that we face—choosing a Web AI framework.

AI frameworks are suites of tools and libraries that serve as building blocks for developers to create new AI-based models and apps or integrate existing AI functions in custom ways. AI frameworks can be commercial, such as OpenAI, or open source, such as Hugging Face, PyTorch, and TensorFlow. Because the XPRTs are available at no cost for users and we publish our source code, open-source frameworks are the right choice for WebXPRT.

Because the new workload will focus on locally powered, browser-based inference tasks, we also need to choose an AI framework that has browser integration capabilities and does not rely on server-side computing. These types of frameworks—called Web AI—use JavaScript (JS) APIs and other web technologies, such as WebAssembly and WebGPU, to run machine learning (ML) tasks on a device’s CPU, GPU, or NPU.

Several emerging Web AI frameworks may provide the compatibility and functionality we need for the future WebXPRT workload. Here are a few that we’re currently researching:

ONNX Runtime Web: Microsoft and other partners developed the Open Neural Network Exchange (ONNX) as an open standard for ML models. With available tools, users can convert models from several AI frameworks to ONNX, which can then be used by ONNX Runtime Web. ONNX Runtime Web allows developers to leverage the broad compatibility of ONNX-formatted ML models—including pre-trained vision, language, and GenAI models—in their web applications.
Transformers.js: Transformers.js, which uses ONNX Runtime Web, is a JS library that allows users to run AI models from the browser and offline. Transformers.js supports language, computer vision, and audio ML models, among others.
MediaPipe: Google developed MediaPipe as a way for developers to adapt TensorFlow-based models for use across many platforms in real-time on-device inference applications such as face detection and gesture recognition. MediaPipe is particularly useful for inference work in images, videos, and live streaming.
TensorFlow.js: TensorFlow has been around for a long time, and the TensorFlow ecosystem provides users with a broad variety of models and datasets. TensorFlow is an end-to-end ML solution—training to inference—but with available pre-trained models, developers can focus on inference. TensorFlow.js is an open-source JS library that helps developers integrate TensorFlow with web apps.

We have not made final decisions about a Web AI framework or any aspect of the future workload. We’re still in the research, discussion, and experimentation stages of development, but we want to be transparent with our readers about where we are in the process. In future blog posts, we’ll discuss some of the other major decision points in play.

Most of all, we invite you to join us in these discussions, make recommendations, and give us any other feedback or suggestions you may have, so please feel free to share your thoughts!

Justin

A few months ago, we shared detailed information about the changes we expected to make in WebXPRT 4. We are currently doing internal testing of the WebXPRT 4 Preview build in preparation for releasing it to the public. We want to let our readers know what to expect.

We’ve made some changes since our last update and some of the details we present below could still change before the preview release. However, we are much closer to the final product. Once we release the WebXPRT 4 Preview, testers will be able to publish scores from Preview build testing. We will limit any changes that we make between the Preview and the final release to the UI or features that are not expected to affect test scores.

General changes

Some of the non-workload changes we’ve made in WebXPRT 4 relate to our typical benchmark update process.

We have updated the aesthetics of the WebXPRT UI to make WebXPRT 4 visually distinct from older versions. We did not significantly change the flow of the UI.
We have updated content in some of the workloads to reflect changes in everyday technology, such as upgrading most of the photos in the photo processing workloads to higher resolutions.
We have not yet added a looping function to the automation scripts, but are still considering it for the future.
We investigated the possibility of shortening the benchmark by reducing the default number of iterations from seven to five, but have decided to stick with seven iterations to ensure that score variability remains acceptable across all platforms.

Workload changes

Photo Enhancement. We increased the efficiency of the workload’s Canvas object creation function, and replaced the existing photos with new, higher-resolution photos.

Organize Album Using AI. We replaced ConvNetJS with WebAssembly (WASM) based OpenCV.js for both the face detection and image classification tasks. We changed the images for the image classification tasks to images from the ImageNet dataset.
Stock Option Pricing. We updated the dygraph.js library.
Sales Graphs. We made no changes to this workload.
Encrypt Notes and OCR Scan. We replaced ASM.js with WASM for the Notes task and updated the WASM-based Tesseract version for the OCR task.
Online Homework. In addition to the existing scenario which uses four Web Workers, we have added a scenario with two Web Workers. The workload now covers a wider range of Web Worker performance, and we calculate the score by using the combined run time of both scenarios. We also updated the typo.js library.

Experimental workloads

As part of the WebXPRT 4 development process, we researched the possibility of including two new workloads: a natural language processing (NLP) workload, and an Angular-based message scrolling workload. After much testing and discussion, we have decided to not include these two workloads in WebXPRT 4. They will be good candidates for us to add as experimental WebXPRT 4 workloads in 2022.

The release timeline

Our goal is to publish the WebXPRT 4 preview build by December 15^th, which will allow testers to publish scores in the weeks leading up to the Consumer Electronics Show in Las Vegas in January 2022. We will provide more detailed information about the GA timeline here in the blog as soon as possible.

If you have any questions about the details we’ve shared above, please feel free to ask!

Justin

Category: face detection

Web AI frameworks: Possible paths for the AI-focused WebXPRT 4 auxiliary workload

Here’s what to expect in the WebXPRT 4 Preview

General changes

Workload changes

Experimental workloads

The release timeline

Check out the other XPRTs: