A new 3D mapping technique that requires minimal computing power could improve the navigation of autonomous vehicles.
The plug-and-play technique uses artificial intelligence (AI) to build maps of three-dimensional spaces from two-dimensional images captured by multiple cameras.
Most autonomous vehicles use powerful AI programs called vision transformers to take 2D images from multiple cameras and create a representation of the 3D space around the vehicle.
The new technique, called Multi-View Attentive Contextualisation (MvACon), can be used in conjunction with these existing vision transformer AIs to improve their ability to map 3D spaces.
“The vision transformers aren’t getting any additional data from their cameras, they’re just able to make better use of the data,” said Tianfu Wu, an associate professor of electrical and computer engineering at North Carolina State University.
MvACon works by modifying an approach called Patch-to-Cluster attention (PaCa), which Wu and his collaborators released last year. PaCa allows transformer AIs to more efficiently and effectively identify objects in an image.
“The key advance here is applying what we demonstrated with PaCa to the challenge of mapping 3D space using multiple cameras,” said Wu.
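To make the patch-to-cluster idea concrete, here is a minimal sketch in plain Python. It is not the researchers' code: the function names, the random weights and the softmax-based clustering step are all assumptions made for illustration, and the published PaCa and MvACon models use learned modules inside full vision transformers. The sketch only shows the general shape of the idea, in which every image patch attends to a small set of cluster summaries rather than to every other patch.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Softmax over each row, shifted by the row maximum for stability."""
    out = []
    for row in a:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def patch_to_cluster_attention(patches, w_cluster, w_q, w_k, w_v):
    """Hypothetical single-head sketch.

    patches: N x D patch features; w_cluster: D x M; w_q, w_k, w_v: D x D.
    """
    # Assumption: a linear layer plus softmax stands in for the learned
    # clustering step. Each of the M clusters is a weighted average of patches.
    assign = softmax_rows(transpose(matmul(patches, w_cluster)))  # M x N
    clusters = matmul(assign, patches)                            # M x D

    # Queries come from the N patches; keys and values from the M clusters.
    q = matmul(patches, w_q)    # N x D
    k = matmul(clusters, w_k)   # M x D
    v = matmul(clusters, w_v)   # M x D

    # Scaled dot-product attention over N x M scores instead of N x N.
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    weights = softmax_rows(scores)      # N x M, each row sums to 1
    return matmul(weights, v), weights  # output is N x D


def random_matrix(rows, cols, rng):
    """Random stand-in for learned weights or image features."""
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
N, D, M = 64, 8, 4  # 64 patches, 8 features per patch, 4 clusters
patches = random_matrix(N, D, rng)
out, weights = patch_to_cluster_attention(
    patches,
    random_matrix(D, M, rng),
    random_matrix(D, D, rng),
    random_matrix(D, D, rng),
    random_matrix(D, D, rng),
)
print(len(out), len(out[0]))          # one D-dimensional output per patch
print(len(weights), len(weights[0]))  # one weight per (patch, cluster) pair
```

With 64 patches and 4 clusters, the attention table in this sketch has 64 x 4 entries rather than the 64 x 64 that patch-to-patch attention would need, which illustrates why this style of attention can be cheaper to compute.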
Speed and orientation of AI vision transformers improved
To test the performance of MvACon, the researchers used it in conjunction with three leading vision transformers, each of which collected 2D images from six different cameras. In all three cases, MvACon significantly improved the transformer’s performance.
“Performance was particularly improved when it came to locating objects, as well as the speed and orientation of those objects,” said Wu. “And the increase in computational demand of adding MvACon to the vision transformers was almost negligible.”
The group’s next steps include testing MvACon against additional benchmark datasets, as well as testing it against actual video input from autonomous vehicles. “If MvACon continues to outperform the existing vision transformers, we’re optimistic that it will be adopted for widespread use,” said Wu.
The research is being presented this week at the IEEE/CVF Conference on Computer Vision and Pattern Recognition in Seattle.