A few weeks ago, I discussed the rising importance of machine learning and our efforts to develop a tool to help in evaluating its performance. Here is an update on our thinking.
One thing we are sure of is that we can’t cover everything in machine learning. The field is evolving rapidly, so we think the best approach is to pick a good place to start and then build from there.
One of the key areas we need to hone in on is the algorithms that we will employ in MLXPRT. (We haven’t formally decided on a name, but are currently using MLXPRT internally when we talk about what we’ve been doing.)
Computer vision, or image detection, seems to be a good place to start. We see three specific sets of algorithms to possibly cover. Worth noting, there is plenty of muddying of lines amongst these sets.
The first set of computer vision algorithms performs image classification. These algorithms identify things like a cat or a dog in an image. Some of the most popular algorithms are Alexnet and GoogLeNet, as well as ones from VGG . The initial training and use for these was on the ImageNet database, containing over 10 million images.
The next set of algorithms in computer vision performs object detection and localization. The algorithms identify the contents and their spatial location in an image, and typically draw bounding boxes around them. A couple of the most popular algorithms are Faster R-CNN and Single Shot MultiBox Detector (SSD).
The final set of computer vision algorithms perform image segmentation. Rather than just drawing a box around an object, image segmentation attempts to classify each pixel in an image by the object it is a part of. The result looks like a contour/color map that shows the different objects in the image. These techniques can be especially useful in autonomous vehicles and medical diagnostic imaging. Currently, the leading algorithms in image segmentation are fully convolution networks (FCN), but the area is developing rapidly.
Even limiting the initial version of MLXPRT to computer vision may be too broad. For example, we may end up only doing image classification and object detection.
As always, we crave input from folks, like yourself, who are working in these areas. What would you most like to see in a machine learning performance tool?
Bill