Robot Perception Improved with GNC Algorithm

When Vasileios Tzoumas, a research scientist at the Massachusetts Institute of Technology (MIT), visits a new city, he likes to explore by going for a run. And sometimes he gets lost. A few years ago, on a long run while in Osaka for a conference, the inevitable happened. But then Tzoumas spotted a 7-Eleven he remembered passing soon after leaving his hotel. This recognition allowed him to mentally “close the loop,” to connect the loose end of his trajectory to someplace he knew and was sure about, thus solidifying his mental map and allowing him to make his way back to the hotel.

The graduated nonconvexity (GNC) algorithm could help machines traverse land, water, sky, and space—and come back to tell the tale.

Closing the loop is actually a technical term for something robots frequently have to do when navigating new environments. It’s part of a process called simultaneous localization and mapping (SLAM). SLAM is not new. It is used for robotic vacuum cleaners, self-driving cars, search-and-rescue aerial drones, and robots in factories, warehouses, and mines. As autonomous devices and vehicles navigate new spaces, from a living room to the sky, they construct a map as they travel. They must also figure out where they are on the map using sensors such as cameras, GPS, and lidar.

As SLAM finds more applications, it is more important than ever to ensure that SLAM algorithms produce correct results in challenging real-world conditions. SLAM algorithms often work well with perfect sensors or in controlled lab conditions, but they get lost easily when implemented with imperfect sensors in the real world. Unsurprisingly, industrial customers frequently worry about whether they can trust those algorithms.

Researchers at MIT have developed several robust SLAM algorithms, as well as methods to mathematically prove how much we can trust them. The lab of Luca Carlone, the Leonardo Career Development assistant professor at MIT, published a paper about their graduated non-convexity (GNC) algorithm, which reduces the random errors and uncertainties in SLAM results. More importantly, the algorithm produces correct results where existing methods “get lost.” The paper, by Carlone, Tzoumas, and Carlone’s students Heng Yang and Pasquale Antonante, received the best paper award in robot vision at the International Conference on Robotics and Automation (ICRA). This GNC algorithm will help machines traverse land, water, sky, and space—and come back to tell the tale.

Everything’s Aligned

Robot perception relies on sensors that often provide noisy or misleading inputs. MIT’s GNC algorithm allows the robot to decide which data points to trust and which to discard. One application of the GNC algorithm is called shape alignment. A robot estimates the 3D location and orientation of a car using 2D camera images. The robot receives a camera image with many points labeled by a feature-detection algorithm: headlights, wheels, mirrors. It also has a 3D model of a car in its memory. The goal is to scale, rotate, and place the 3D model so its features align with the features in the image. “This is easy if the feature-detection algorithm has done its job perfectly, but that’s rarely the case,” Carlone says. In real applications, the robot faces many outliers—mislabeled features—which can make up more than 90% of all observations. That’s where the GNC algorithm comes in and outperforms all competitors.

Robots solve this problem using a mathematical function that takes into account the distance between each pair of features—for instance, the right headlight in the image and the right headlight in the model. They try to “optimize” this function—to orient the model so as to minimize all of those distances. The more features, the more difficult the problem.

One way to solve the problem would be to try all the possible solutions to the function and see which one works best, but there are too many to try. A more common method, Yang and Antonante explain, “is to try one solution and keep nudging—making, say, the headlights in the model more aligned with the headlights in the 2D image—until you can’t improve it anymore.” Given noisy data, it won’t be perfect—maybe the headlights align but the wheels don’t—so you can start over with another solution and refine that one as much as possible, repeating the process several times to find the best outcome. Still, the chances of finding the best possible solution are slim.

In real applications, the robot faces many outliers, which can make up more than 90% of all observations. That’s where the GNC algorithm comes in and outperforms all competitors.

Mesh and point cloud with correspondences (70% outliners)

Incorrect registration by common SLAM algorithms

The GNC algorithm finds the optimal alignment despite noisy measurements with up to 70–90% outliers. Image credit: MIT.

The idea behind GNC is to first simplify the problem. They reduce the function they’re trying to optimize—the one describing the differences between the 3D model and the 2D image—to one with a single best solution. Now when they pick a solution and nudge it, they’ll eventually find that best solution. Then they reintroduce a bit of the original function’s complexity and refine the solution they’ve just found. They keep doing this until they have the original function and its optimal solution. The headlights are well aligned, and so are the wheels and bumpers.

Going in Circles

The paper applies the GNC algorithm to shape alignment and to SLAM, among other problems. In the case of SLAM, the robot uses sensor data to figure out its past trajectory and build a map. For example, a robot roaming around a college campus collects odometry data suggesting how far and in which direction it’s gone between 8:00 a.m. and 8:15 a.m., between 8:15 a.m. and 8:30 a.m., and so on. It also has lidar and camera data at 8:00 a.m., 8:15 a.m., and so on. Occasionally, it will complete loops, seeing the same thing at two different times, like Tzoumas did when he ran past the 7-Eleven again.

The researchers found that the GNC algorithm was more accurate than the state-of-the-art techniques and could handle a higher percentage of outliers.

Just as in shape alignment, there’s an optimization problem to be solved. Yang, who is the first author on the paper, explains: “For SLAM, instead of lining up features to match a 3D model, the system bends the trajectory it thinks it traversed in order to align objects on the map.” First, the system works to minimize the differences between the perceived journeys by different sensors, since every sensor is likely to have errors in measurements. For example, if the robot’s odometer shows it traveled 100 meters between 8:00 a.m. and 8:15 a.m., the trajectory updated based on lidar and camera measurements should reflect that distance, or something close to it. The system also minimizes the distances between locations that appear to be the same place. If the robot saw the same 7-Eleven at 8:00 a.m. and 10:00 a.m., the algorithm will try to bend the recalled trajectory—adjusting each leg—so that its recalled positions at times 8:00 a.m. and 10:00 a.m. align, closing the loop.

Robot mapping the interior of a building. GNC gradually unwraps messy data. In relatively few steps, the algorithm arrives at an accurate map of the interior of a building. Image credit: MIT.

Meanwhile, the algorithm identifies and discards outliers—bad data points, where it thought it was retracing its steps, but it wasn’t—just as mislabeled features in shape alignment. You don’t want to falsely close a loop. Tzoumas recalls a time, running through the woods in Maine, when he ran past a collection of fallen tree trunks that looked familiar. He thought he’d closed the loop, and using this supposed landmark, he took a turn. Only after not seeing anything else familiar for 20 minutes did he suspect his mistake and turn back.

A recalled trajectory before optimization might look like a tangled ball of twine. After untangling, it resembles a set of right-angled lines mirroring the shape of the campus pathways and hallways that the robot traversed. The technical term for this SLAM process is pose graph optimization.

In the paper, the researchers compared their GNC algorithm with other algorithms on several applications, including shape alignment and pose graph optimization. They found their method was more accurate than the state-of-the-art techniques and could handle a higher percentage of outliers. For SLAM, it worked even if three in four loop closures were mistaken, which is many more outliers than it would encounter in real-world application. What’s more, their method is often more efficient than other algorithms, requiring fewer computational steps. Tzoumas says, “One of the difficulties was finding a general-purpose algorithm that works well across many applications.” Yang says they’ve tried it on more than 10. In the end, Tzoumas says, they found “the sweet spot.”

The GNC algorithm correctly reconstructs a map of the interior of the MIT Great Dome. — MATLAB generated maps created from data derived from a robot mowing a lawn. Left: Original map of lawn. Middle: Map optimized with common SLAM algorithms, which include mislabeled data from unknown outlier loop closures. Right: Map optimized with GNC algorithm.

Going from research to production is an important step for research outcomes to make a difference at scale, says Roberto G. Valenti, a robotics research scientist at MathWorks. MathWorks has been working with Carlone’s lab to integrate the GNC algorithms into MATLAB as part of Navigation Toolbox™, which companies use to implement SLAM on commercial and industrial autonomous systems.

Out of the Woods

Carlone’s lab is working on ways to extend the capabilities of their GNC algorithm. For example, Yang aims to design perception algorithms that can be certified to be correct. And Antonante is finding ways to manage inconsistency across different algorithms: If the SLAM module in an autonomous vehicle says the road goes straight, but the lane-detection module says it bends right, you have a problem.

The GNC algorithm is the new benchmark in allowing robots to catch their own mistakes.

Tzoumas is looking at how to scale up not just to interaction between multiple algorithms in one robot, but collaboration between multiple robots. In earlier work, he programmed flying drones to track targets, such as criminals trying to escape on foot or by car. Going forward, multiple machines could perhaps run the GNC algorithm collectively. Each would contribute partial information to its neighbors, and together they would build a global map—of locations on Earth or elsewhere. This year he’s moving to the aerospace engineering department at the University of Michigan to work on trustworthy autonomy for multi-robot planning and self-navigation—even in difficult environments, such as battlefields and other planets.

“Not knowing how AI and perception algorithms will behave is a huge deterrent for using them,” Antonante says. He notes that robotic museum guides won’t be trusted if there’s a chance they’ll crash into visitors or the Mona Lisa: “You want your system to have a deep understanding of both its environment and itself, so it can catch its own mistakes.” The GNC algorithm is the new benchmark in allowing robots to catch their own mistakes, and, most importantly, as Tzoumas says, “it helps you get out of the woods.”