MIT researchers have designed a new algorithm to help robots navigate unfamiliar buildings. The system is based on re-identifying a chosen landmark and using its location as a marker, helping the robot determine its orientation and, the researchers hope, simplifying the problem of scene understanding.
The researchers’ algorithm works on 3D data captured by laser rangefinders. First, the algorithm estimates the orientations of a large number of individual points in the scene. Those orientations are then represented as points on the surface of a sphere, with each point defining a unique direction relative to the sphere’s centre.
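To make that step concrete, here is a minimal sketch of how per-point orientations can be estimated and mapped to the sphere. The article doesn’t specify the researchers’ exact estimator; this version uses a standard stand-in, PCA over each point’s k nearest neighbours, and the function name and parameters are illustrative.

```python
import numpy as np

def estimate_normals(points, k=20):
    """Estimate a unit normal for each 3D point via PCA of its
    k nearest neighbours (a common stand-in; the paper's exact
    estimator isn't given in the article)."""
    normals = np.empty_like(points)
    for i, p in enumerate(points):
        # k nearest neighbours by Euclidean distance (brute force)
        d = np.linalg.norm(points - p, axis=1)
        nbrs = points[np.argsort(d)[:k]]
        # The eigenvector with the smallest eigenvalue of the
        # neighbourhood covariance approximates the surface normal.
        cov = np.cov(nbrs.T)
        w, v = np.linalg.eigh(cov)   # eigenvalues sorted ascending
        normals[i] = v[:, 0]
    return normals

# Each unit normal is literally a point on the unit sphere, so the
# "represented on a sphere" step falls out of the normalisation.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 3))
sphere_points = estimate_normals(cloud)
```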
The algorithm identifies the dominant orientations in a given scene and represents them as sets of axes - called Manhattan frames - embedded in the sphere. As the robot moves, it would observe the sphere rotating in the opposite direction, and could gauge its orientation relative to those axes. Whenever it wanted to reorient itself, it would know which of its landmarks’ faces should be toward it, making them much easier to identify.
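As an illustration (not the researchers’ code), the sketch below treats a Manhattan frame as a rotation matrix whose three columns are orthogonal axes, and shows that when the robot turns by a known rotation, an observed wall normal stays assigned to the same frame axis, which is what makes the frame usable for reorientation.

```python
import numpy as np

def nearest_axis(frame, n):
    """Return the index (0..5) of the Manhattan-frame axis
    (the +/- columns of `frame`) closest to unit vector n."""
    axes = np.hstack([frame, -frame])  # 3x6: the six signed directions
    return int(np.argmax(axes.T @ n))

# A Manhattan frame is just three mutually orthogonal axes,
# i.e. the columns of a rotation matrix.
frame = np.eye(3)

# If the robot turns by rotation R, every observed normal appears
# rotated by R.T: the sphere of normals rotates the opposite way.
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
wall_normal = np.array([1.0, 0.0, 0.0])  # a wall aligned with +x
observed = R.T @ wall_normal             # the same wall after turning
print(nearest_axis(frame, observed))     # still assigned to the x axis (0)
```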
The team state that, in principle, it would be possible to approximate the point data very accurately using hundreds of different Manhattan frames, but this would yield a model that’s much too complex to be useful. So another aspect of the algorithm is a cost function that weighs the accuracy of the approximation against the number of frames. The algorithm starts with a fixed number of frames - somewhere between three and 10, depending on the expected complexity of the scene - and then tries to pare that number down while keeping the overall cost low.
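The article doesn’t give the exact cost function, so the sketch below is a hedged illustration of the trade-off: a penalised cost equal to the angular fitting error plus a weight `lam` times the number of frames, with a greedy loop that drops frames while the total keeps falling. The names `residual` and `prune_frames` and the value of `lam` are hypothetical.

```python
import numpy as np

def residual(normals, frames):
    """Angular error of each normal against its best-fitting axis
    across all frames (smaller = better approximation)."""
    axes = np.hstack([np.hstack([F, -F]) for F in frames])  # 3 x (6 * len(frames))
    # 1 - max dot product is 0 when a normal lies exactly on some axis.
    return float(np.sum(1.0 - np.max(normals @ axes, axis=1)))

def prune_frames(normals, frames, lam=5.0):
    """Greedily drop frames while the penalised cost
    residual + lam * num_frames keeps decreasing.
    (An illustrative stand-in for the paper's actual cost function.)"""
    cost = residual(normals, frames) + lam * len(frames)
    improved = True
    while improved and len(frames) > 1:
        improved = False
        for i in range(len(frames)):
            trial = frames[:i] + frames[i + 1:]
            trial_cost = residual(normals, trial) + lam * len(trial)
            if trial_cost < cost:  # accuracy lost < complexity saved
                frames, cost, improved = trial, trial_cost, True
                break
    return frames
```

Under this kind of cost, a frame survives only if the accuracy it adds outweighs the fixed complexity penalty it carries, which matches the trade-off the researchers describe.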
The resulting set of Manhattan frames may not represent subtle distinctions between objects that are slightly misaligned with each other, but those distinctions aren’t necessarily useful to a navigation system.