

TouchXPRT CP1

This week we released the TouchXPRT 2014 Community Preview 1 (CP1). As with past community previews, the tests are stable and you may publish your results.

CP1 has a number of improvements over TouchXPRT 2013. We’ve updated the tests, which now use new, more demanding data. The Run All button is now prominent on the main screen, and the benchmark includes a results viewer.

However, as I said last week, the new UI design did not make it into CP1. Over the next few weeks, we’ll be working to give TouchXPRT an exciting new look. The results viewer will also change a lot. The current version captures the date, time, and test results, but the sandbox environment of Windows 8 applications makes gathering system information challenging. We’re working to solve that problem. We’ll also be streamlining the results submission process.

Rest assured that, while the appearance will change, the results will not. The test results you generate with CP1 will be good for the life of the benchmark.

Community previews are only available to community members. If you are not a member, this is a great time to join.

After you’ve downloaded CP1, let us know what you think by posting to the forum or e-mailing us at BenchmarkXPRTsupport@principledtechnologies.com.

Eric


It’s almost here

Sometime next week, we plan to release a sneak preview of TouchXPRT 2014, the TouchXPRT 2014 Community Preview 1 (CP1).

CP1, as its name makes clear, is not the final TouchXPRT 2014 release. There is still a lot of work to do on the user interface and the new results viewer.  However, it includes a number of improvements over the current TouchXPRT, making it an even more useful tool for measuring Windows 8 and Windows 8.1 device performance. It is also a great way for everyone in the community to see the current state of our thinking and to provide us with feedback. You can run this version of the tool and see what you think!

As we have done with previous community previews, we’re also taking two more steps:

  • We’re not putting any publication restrictions on this preview release. Test at will, and publish your findings.
  • We’re releasing the source code to all community members. If you’re curious about not just what we’re doing but how we’re doing it, you can find out.

We believe these steps make the tool easier to evaluate and more useful to all of us.

Releasing a preview version is a lot of work, because we have to do much of the work of a full software release on less-than-final code, but we believe the value to our community justifies the effort.

Next week, when we release CP1, I’ll go over more details, the known limitations, and how you can get us your feedback—feedback we very much want.

Between now and then, we’ll be readying CP1 for your use.

Eric


Anatomy of a benchmark, part II

As we discussed last week, benchmarks (including HDXPRT 2011) are made up of a set of common major components. Last week’s components included the Installer, User Interface (UI), and Results Viewer.  This week, we’ll look more at the guts of a benchmark—the parts that actually do the performance testing.

Once the UI gets the necessary commands and parameters from the user, the Test Harness takes over.  This part is the logic that runs the individual Tests or Workloads using the parameters you specified.  For application-based benchmarks, the harness is particularly critical, because it has to deal with running real applications.  (Simpler benchmarks may mix the harness and test code in a single program.)

The next component consists of the Tests or Workloads themselves.  Some folks use those terms interchangeably, but I try to avoid that practice.  I tend to think of tests as specially crafted code designed to gauge some aspect of a system’s performance, while workloads consist of a set of actions that an application must take as well as the necessary data for those actions.  In HDXPRT 2011, each workload is a set of data (such as photos) and actions (e.g., manipulations of those photos) that an application (e.g., Photoshop Elements) performs.  Application-based benchmarks, such as HDXPRT 2011, typically use some other program or technology to pass commands to the applications.  HDXPRT uses a combination of AutoIT and C code to drive the applications.
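To make the harness/workload split concrete, here’s a minimal sketch of a harness that treats each workload as a bundle of data plus actions and times each one. All of the names here (Workload, runHarness, photoEditWorkload) are hypothetical and the code is not HDXPRT source; HDXPRT drives real applications through AutoIT and C rather than calling functions like this.

```typescript
// A minimal harness sketch: each workload is data plus actions,
// and the harness times how long the actions take over that data.
// All names are hypothetical; this is not HDXPRT code.

interface Workload {
  name: string;
  data: string[];                            // e.g., the photos a workload manipulates
  actions: (item: string) => Promise<void>;  // the scripted steps performed on each item
}

async function runHarness(workloads: Workload[]): Promise<Map<string, number>> {
  const results = new Map<string, number>();
  for (const workload of workloads) {
    const start = Date.now();
    for (const item of workload.data) {
      await workload.actions(item);          // a real harness would launch and script an application here
    }
    results.set(workload.name, (Date.now() - start) / 1000); // elapsed seconds
  }
  return results;
}

// Example: a toy "photo editing" workload
const photoEditWorkload: Workload = {
  name: 'Photo editing',
  data: ['photo1.jpg', 'photo2.jpg'],
  actions: async (photo) => console.log(`Pretending to resize and sharpen ${photo}`),
};

runHarness([photoEditWorkload]).then((timings) => console.log(timings));
```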

When the Harness finishes running the tests or workloads, it collects the results.  It then either passes those results to the Results Viewer or writes them to a file for viewing in Excel or some other program.
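As a small illustration of that last step, a harness might dump its collected timings to a CSV file that Excel or a simple results viewer can open. Again, the function and format below are hypothetical, not HDXPRT code:

```typescript
// Illustrative only: write the harness's collected timings to a CSV file
// that Excel or a simple results viewer can open.
import { writeFileSync } from 'node:fs';

function writeResultsCsv(results: Map<string, number>, path: string): void {
  const header = 'workload,seconds';
  const rows = [...results.entries()].map(([name, seconds]) => `${name},${seconds.toFixed(2)}`);
  writeFileSync(path, [header, ...rows].join('\n') + '\n');
}
```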

As we look to improve HDXPRT for next year, what improvements would you like to see in each of those areas?

Bill


Anatomy of a benchmark, part I

Over many years of dealing with benchmarks, I’ve found that there are a few major components that HDXPRT 2011 and most others include.  Some of these components are not what you might think of as part of a benchmark, but they are essential to making one both easy to use and capable of producing reproducible results.  We’ll look at those parts this week and the rest next week.

The first piece that you encounter when you use a benchmark is its Installation program.  Simple benchmarks may forgo an installation component and just let you copy the files, including any executables, into a directory.  By contrast, HDXPRT 2011, like other application-based benchmarks, takes great pains to install the necessary applications. It even has to check to see which of them are already installed on the computer under test and cope with those it finds.
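To give a flavor of that check, here is a hedged sketch of the “is it already installed?” test an installer might perform. The install paths are placeholders, and a real installer would also consult the Windows registry and verify application versions:

```typescript
// Hypothetical sketch of the "already installed?" check.
// The install paths are placeholders; a real installer would also consult the
// Windows registry and verify application versions.
import { existsSync } from 'node:fs';

const requiredApps: Record<string, string> = {
  'Photoshop Elements': 'C:\\Program Files\\Adobe\\Photoshop Elements\\PhotoshopElements.exe',
  'Example video editor': 'C:\\Program Files\\ExampleVendor\\VideoEditor\\VideoEditor.exe',
};

for (const [app, exePath] of Object.entries(requiredApps)) {
  if (existsSync(exePath)) {
    console.log(`${app} is already installed; skipping it.`);
  } else {
    console.log(`${app} not found; installing it now.`);
  }
}
```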

Once the benchmark is on the system, you launch it and encounter the User Interface (UI).  For some benchmarks, the UI may be only a command-line interface with a set of switches or options. HDXPRT 2011, in keeping with its emphasis on an HD user experience, includes a graphical UI that lets you run its tests.
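For comparison, the simplest kind of UI, a command line with a couple of switches, might look something like the sketch below. The switch names are invented for illustration:

```typescript
// A sketch of the simplest benchmark UI: a command line with a few switches.
// The switch names are invented for illustration.
const args = process.argv.slice(2);
const options = { iterations: 3, workload: 'all' };

for (let i = 0; i < args.length; i++) {
  if (args[i] === '--iterations') options.iterations = Number(args[++i]);
  else if (args[i] === '--workload') options.workload = args[++i];
}

console.log(`Running workload "${options.workload}" for ${options.iterations} iteration(s).`);
```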

Many benchmarks, including HDXPRT 2011, provide a Results Viewer that makes it easy for you to look at your results and compare them to others.  Results viewers range from fairly simple to quite sophisticated.  The prevalence of spreadsheet applications and XML has led benchmark creators to minimize the development cost of this component.
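At the simple end of that range, a bare-bones viewer might do little more than load a CSV of workload timings and compare it to a baseline file, along these lines (the file names and format are illustrative, not from any XPRT benchmark):

```typescript
// A bare-bones "results viewer": load a simple CSV of workload timings and
// compare it to a baseline file. File names and format are illustrative.
import { readFileSync } from 'node:fs';

function loadResults(path: string): Map<string, number> {
  const results = new Map<string, number>();
  for (const line of readFileSync(path, 'utf8').trim().split('\n').slice(1)) {
    const [name, seconds] = line.split(',');
    results.set(name, Number(seconds));
  }
  return results;
}

const baseline = loadResults('baseline.csv');
const current = loadResults('results.csv');
for (const [name, seconds] of current) {
  const base = baseline.get(name);
  if (base !== undefined) {
    console.log(`${name}: ${seconds.toFixed(2)}s vs. baseline ${base.toFixed(2)}s (${(base / seconds).toFixed(2)}x)`);
  }
}
```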

Next week, I’ll look at the components that handle the actual tests that make up the benchmark.

Bill


Putting together a good WebXPRT workload proposal

Recently, we announced that we’re moving forward with the development of a new AI-focused WebXPRT 4 workload. It will be an auxiliary workload, which means that it will run as a separate, optional test, and it won’t affect existing WebXPRT 4 tests or scores. Although the inspiration for this new workload came from internal WebXPRT discussions (and, let’s face it, from the huge increase in the importance of AI), we wanted to remind you that we’re always open to hearing your WebXPRT workload ideas. If you’d like to submit proposals for new workloads, you don’t have to follow a formal process. Just contact us, and we’ll start the conversation.

If you do decide to send us a workload proposal, it will be helpful to know the types of parameters that we keep in mind. Below, we discuss some of the key questions we ask when we evaluate new WebXPRT workload ideas.

Will it be relevant and interesting to real users, lab testers, and tech reviewers?

When considering a WebXPRT workload proposal, the first two criteria are simple: is it relevant in real life, and are people interested in the workload? We created WebXPRT to evaluate device performance using web-based tasks that consumers are likely to experience daily, so real-life relevance has always been an essential requirement for us throughout development. There are many technologies, functions, and use cases that we could test in a web environment, but only some are relevant to common applications or usage patterns and are likely to draw the interest of real users, lab testers, and technical reviewers.

Will it have cross-platform support?

Currently, WebXPRT runs on almost any web browser and almost every device that supports a web browser. We would like to keep that level of cross-platform support when we introduce new workloads. However, technical differences in how various browsers execute tasks make it challenging to include certain scenarios without undermining our cross-platform ideal. When considering any workload proposal, one of the first questions we ask is, “Will it work on all the major browsers and operating systems?”

There are special exceptions to this guideline. For instance, we’re still in the early days of browser-based AI, and it’s unlikely that a new browser-based AI workload will run on every major browser. If it’s a particularly compelling idea, such as the AI scenario we’re currently working on, we may consider including it as an auxiliary test.

Will it differentiate performance between different types of devices?

XPRT benchmarks provide users with accurate measures for evaluating how well target systems or technologies perform specific tasks. With a broadly targeted benchmark like WebXPRT, if the workloads are so heavy that most devices can’t handle them or so light that most devices complete them without being taxed, the results will be of little use to buyers evaluating systems and making purchasing decisions, to OEM labs, or to the tech press.

That’s why, with any new WebXPRT workload, we look for a sweet spot with respect to how computationally demanding it will be. We want it to run on a wide range of devices—from low-end devices that are several years old to brand-new high-end devices, and everything in between. We also want users to see a wide range of workload scores and resulting overall scores that accurately reflect the experiences those systems deliver, so they can easily grasp the different performance capabilities of the devices under test.

Will results be consistent and easily replicated?

Finally, WebXPRT workloads should produce scores that consistently fall within an acceptable margin of error and are easily replicated with additional testing or comparable gear. Some web technologies are very sensitive to uncontrollable or unpredictable variables, such as internet speed. A workload that measures one of those technologies would be unlikely to produce results that are consistent and easily replicated.
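One common way to express that kind of run-to-run consistency is the coefficient of variation (CV) across repeated runs. The sketch below shows the idea; the scores and the 3% threshold are hypothetical, not WebXPRT criteria:

```typescript
// One way to quantify run-to-run consistency: the coefficient of variation (CV)
// across repeated runs. The scores and the 3% threshold are hypothetical.
function coefficientOfVariation(scores: number[]): number {
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const variance = scores.reduce((sum, s) => sum + (s - mean) ** 2, 0) / scores.length;
  return Math.sqrt(variance) / mean;
}

const repeatedRuns = [212, 215, 209, 214, 211]; // five hypothetical overall scores
const cv = coefficientOfVariation(repeatedRuns);
console.log(`CV = ${(cv * 100).toFixed(1)}% ${cv < 0.03 ? '(acceptably consistent)' : '(too noisy)'}`);
```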

We hope this post will be useful if you’re thinking about potential new workloads that you’d like to see in WebXPRT. If you have any general thoughts about browser performance testing or specific workload ideas that you’d like us to consider, please let us know.

Justin

Making progress with WebXPRT 4 in iOS 17

In recent blog posts, we discussed an issue that we encountered when attempting to run WebXPRT 4 on iOS 17 devices. If you missed those posts, you can find more details about the nature of the problem here. In short, the issue is that the Encrypt Notes and OCR scan subtest in WebXPRT 4 gets stuck when the Tesseract.js Optical Character Recognition (OCR) engine attempts to scan a shopping receipt. We’ve verified that the issue occurs on devices running iOS 17, iPadOS 17, and macOS Sonoma with Safari 17.
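For readers unfamiliar with Tesseract.js, the OCR step in that subtest boils down to a call along these lines. This is only an illustration, not WebXPRT source code, and the input file name is a placeholder:

```typescript
// Illustration only, not WebXPRT source code; 'receipt.png' is a placeholder.
import Tesseract from 'tesseract.js';

Tesseract.recognize('receipt.png', 'eng')   // scan an image of a shopping receipt
  .then(({ data }) => {
    console.log(data.text);                 // the recognized receipt text
  })
  .catch((err) => console.error('OCR failed:', err));
```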

After a good bit of troubleshooting and research to identify the cause of the problem, we decided to build an updated version of WebXPRT 4 that uses a newer version of Tesseract for the OCR task. Aside from updating Tesseract in the new build, we aimed to change as little as possible. To maximize continuity, we’re still using the original input image for the receipt-scanning task, and we decided to stick with the WASM library instead of a WASM-SIMD library. Other than the new version of tesseract.js, WebXPRT 4 version number updates, and updated documentation where necessary, all aspects of WebXPRT 4 will remain the same.

We’re currently testing a candidate build of this new version on a wide array of devices. The results so far seem promising, but we want to complete our due diligence and make sure this is the best approach to solving the problem. We know that OEM labs and tech reviewers put a lot of time and effort into compiling databases of results, so we hope to provide a solution that minimizes results disruption and inconvenience for WebXPRT 4 users. Ideally, folks would be able to integrate scores from the new build without any questions or confusion about comparability.

We don’t yet have an exact release date for a new WebXPRT 4 build, but we can say that we’re shooting for the end of October. We appreciate everyone’s patience as we work towards the best possible solution. If you have any questions or concerns about an updated version of WebXPRT 4, please let us know.

Justin
