BenchmarkXPRT Blog

Category: computer vision

Evaluating machine learning performance

A few weeks ago, I discussed the rising importance of machine learning and our efforts to develop a tool for evaluating its performance. Here is an update on our thinking.

One thing we are sure of is that we can’t cover everything in machine learning. The field is evolving rapidly, so we think the best approach is to pick a good place to start and then build from there.

One of the key areas we need to home in on is the set of algorithms we will employ in MLXPRT. (We haven’t formally decided on a name, but we currently use MLXPRT internally when we talk about what we’ve been doing.)

Computer vision seems to be a good place to start. We see three specific sets of algorithms we might cover, though it’s worth noting that the lines between these sets often blur.

The first set of computer vision algorithms performs image classification. These algorithms identify things like a cat or a dog in an image. Some of the most popular are AlexNet and GoogLeNet, as well as the VGG networks. These were initially trained and used on the ImageNet database, which contains over 10 million images.
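As a concrete illustration, classification networks typically end by converting raw per-class scores (logits) into probabilities with a softmax and reporting the top label. The labels and scores below are invented for illustration, not the output of any real network:

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical final-layer scores for three ImageNet-style labels
labels = ["tabby cat", "golden retriever", "goldfish"]
logits = [2.1, 4.0, -1.3]

probs = softmax(logits)
best = labels[probs.index(max(probs))]
print(best, max(probs))  # the highest-scoring class wins
```

A benchmark in this category would time how quickly a device can run many such classifications, since the heavy lifting is in the convolutional layers that produce the logits.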

The next set of algorithms performs object detection and localization. These algorithms identify the objects in an image and their spatial locations, typically drawing bounding boxes around them. Two of the most popular are Faster R-CNN and the Single Shot MultiBox Detector (SSD).
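A standard way to score these detectors is intersection over union (IoU), which measures how well a predicted bounding box overlaps the ground-truth box. A minimal sketch, with hypothetical box coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Hypothetical predicted box vs. ground truth for a detected dog
predicted = (50, 50, 150, 150)
truth = (60, 60, 160, 160)
score = iou(predicted, truth)
print(round(score, 3))
```

Detection benchmarks usually count a prediction as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5.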

The final set of computer vision algorithms performs image segmentation. Rather than just drawing a box around an object, image segmentation attempts to classify each pixel in an image by the object it is part of. The result looks like a contour/color map showing the different objects in the image. These techniques can be especially useful in autonomous vehicles and medical diagnostic imaging. Currently, the leading algorithms in image segmentation are fully convolutional networks (FCNs), but the area is developing rapidly.
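The final step of a segmentation network is essentially a per-pixel argmax: each pixel gets the ID of the class whose score map is highest there. A toy sketch, using a made-up 2x2 image and two hypothetical classes:

```python
def label_map(score_maps):
    """Per-pixel argmax: turn per-class score maps into a map of class IDs."""
    h = len(score_maps[0])
    w = len(score_maps[0][0])
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            scores = [m[y][x] for m in score_maps]
            labels[y][x] = scores.index(max(scores))
    return labels

# Hypothetical 2x2 score maps for two classes: 0 = road, 1 = car
road = [[0.9, 0.8],
        [0.2, 0.1]]
car = [[0.1, 0.2],
       [0.8, 0.9]]

segmentation = label_map([road, car])
print(segmentation)  # each pixel now carries a class ID
```

On a real image this runs over millions of pixels, which is why segmentation is one of the more demanding workloads a benchmark could include.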

Even limiting the initial version of MLXPRT to computer vision may be too broad. For example, we may end up only doing image classification and object detection.

As always, we crave input from folks, like yourself, who are working in these areas. What would you most like to see in a machine learning performance tool?

Bill

Learning about machine learning

Everywhere we look, machine learning is in the news. It’s driving cars and beating the world’s best Go players. Whether we are aware of it or not, it’s in our lives: understanding our voices and identifying our pictures.

Our goal of being able to measure the performance of hardware and software that does machine learning seems more relevant than ever. Our challenge is to scan the vast landscape that is machine learning, and identify which elements to measure first.

There is a natural temptation to see machine learning as being all about neural networks such as AlexNet and GoogLeNet. However, new innovations appear all the time, and a lot of important work with more classic machine learning techniques is also underway. (Classic machine learning being anything more than a few years old!) Recurrent neural networks used for language translation, reinforcement learning used in robotics, and support vector machines (SVMs) used in text recognition are just a few examples among the wide array of algorithms to consider.
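To give a sense of how lightweight some classic techniques are, here is a toy linear SVM trained by subgradient descent on the hinge loss. The data and hyperparameters are invented for illustration, and a real application would use an optimized library rather than this sketch:

```python
def train_linear_svm(points, labels, epochs=200, lr=0.1, lam=0.01):
    """Train a tiny 2D linear SVM by subgradient descent on the hinge loss."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:  # point inside the margin: push the boundary
                w[0] += lr * (y * x1 - lam * w[0])
                w[1] += lr * (y * x2 - lam * w[1])
                b += lr * y
            else:  # correctly classified with room to spare: only regularize
                w[0] -= lr * lam * w[0]
                w[1] -= lr * lam * w[1]
    return w, b

def predict(w, b, point):
    """Classify a point by which side of the learned boundary it falls on."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1

# Hypothetical linearly separable data: +1 upper right, -1 lower left
points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, p) for p in points]
print(predictions)
```

Compared with a deep network, a model like this has a handful of parameters instead of millions, which is part of why classic techniques remain attractive on constrained devices.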

Creating a benchmark or set of benchmarks to cover all those areas, however, is unlikely to be possible. Certainly, creating such an ambitious tool would take so long that it would be of limited usefulness.

Our current thinking is to begin with a small set of representative algorithms. The challenge, of course, is identifying them. That’s where you come in. What would you like to start with?

We anticipate that the benchmark will focus on the types of inference and light training that are likely to occur on edge devices. Extensive training with large datasets takes place in data centers or on systems with extraordinary computing capabilities. We’re interested in use cases that will stress the local processing power of everyday devices.

We are, of course, reaching out to folks in the machine learning field—including those in academia, those who create the underlying hardware and software, and those who make the products that rely on that hardware and software.

What do you think?

Bill

Airborne

I’m old enough that I’ve never really understood the whole selfie thing. However, it’s clearly never going away, and I’m fascinated, if a little creeped out, by the development of selfie drones. It’s amazing that we have so quickly reached the point where you can buy a drone that will literally fit in your pocket.

As an example of how sophisticated these devices can be, consider the Zero Zero Robotics Hover Camera Passport. It’s capable of capturing 4K UHD video and 13-megapixel images, it can track faces or bodies, and it uses sensors, including sonar, to measure its height above the ground. All in a package that’s about the size of an old VHS tape.

A while back, we talked about the new ways people are finding to use technology, and how the XPRTs need to adapt. While I don’t think we’ll be seeing a DroneXPRT any time soon, we’ve been talking about including in the XPRTs the technologies that make these devices possible: machine learning, computer vision, and 4K video.

What new devices fascinate you? Which technologies are going to be most useful in the near future? Let us know!

Eric

The things we do now

We mentioned a couple of weeks ago that the Microsoft Store added an option to indicate holographic support, which we selected for TouchXPRT. So, it was no surprise to see Microsoft announce that next year, they will release an update to Windows 10 that enables mainstream PCs to run the Windows Holographic shell. They also announced that they’re working with Intel to develop a reference architecture for mixed-reality-ready PCs. Mixed-reality applications, which blend the real world with virtual content, demand sophisticated computer vision and the ability to learn about the world around them.

As we’ve said before, we are constantly watching how people use their devices. One of the most basic principles of the XPRT benchmarks is to test devices using the same kinds of work that people do in the real world. As people find new ways to use their devices, the workloads in the benchmarks should evolve as well. Virtual reality, computer vision, and machine learning are among the technologies we are looking at.

What sorts of things are you doing today that you weren’t a year ago? (Other than Pokémon GO – we know about that one.) Would you like to see those sorts of workloads in the XPRTs? Let us know!

Eric

Seeing the future

Back in April we wrote about how Bill’s trip to IDF16 in Shenzhen got us thinking about future benchmarks. Technologies like virtual reality, the Internet of things, and computer vision are going to open up lots of new applications.

Yesterday I saw an amazing article that talked about an automatic computer vision system that is able to detect early-stage esophageal cancer from endoscopy images. These lesions can be difficult for physicians to detect, and the system did very well when compared to four experts who participated in the test. The article contains a link to the original study, for those of you who want more detail.

To me, this is the stuff of science fiction. It’s a very impressive accomplishment. Clearly, new technologies are going to lead to many new and exciting applications.

While this type of application is more specialized than a typical XPRT workload, things like this get us really excited about the possibilities for the future. Have you seen an application that impressed you recently? Let us know!

Eric
