Researchers have developed a new technique for detecting and categorising multiple objects without capturing images, an approach that could help improve autonomous driving technology.
Known as image-free single-pixel object detection (SPOD), the technique works by scanning a scene using structured light patterns to acquire the spatial information of objects.
This information is then used to computationally reconstruct the objects or calculate their properties.
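The measurement step can be illustrated with a minimal Python sketch. It assumes random binary patterns (a stand-in for the researchers' optimised patterns, which are not reproduced here), a toy 20×20 scene and a 5% sampling rate; the function names are hypothetical, and the decoder network that turns measurements into detections is omitted.

```python
import random


def make_patterns(num_patterns, height, width, seed=0):
    """Generate random binary (0/1) light patterns.

    Assumption: random masks are used purely for illustration.
    """
    rng = random.Random(seed)
    return [
        [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
        for _ in range(num_patterns)
    ]


def single_pixel_measure(scene, patterns):
    """Return one number per pattern: the total light a single-pixel
    detector would collect, i.e. the sum of pattern * scene over all pixels."""
    return [
        sum(p * s
            for p_row, s_row in zip(pattern, scene)
            for p, s in zip(p_row, s_row))
        for pattern in patterns
    ]


# Toy scene: a dark background with one bright 5x6 rectangular "object".
height, width = 20, 20
scene = [[0.0] * width for _ in range(height)]
for r in range(5, 10):
    for c in range(8, 14):
        scene[r][c] = 1.0

# A 5% sampling rate means 5% as many measurements as there are pixels.
sampling_rate = 0.05
num_patterns = round(sampling_rate * height * width)

patterns = make_patterns(num_patterns, height, width)
measurements = single_pixel_measure(scene, patterns)

print(len(measurements), "measurements for", height * width, "pixels")
```

The point of the sketch is the data volume: the detector returns one value per pattern, so the downstream model receives a short vector of measurements rather than a full image.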
The researchers say that because the new approach decreases the computing power needed for object detection, it could help identify hazards while driving.
“Our technique is based on a single-pixel detector, which enables efficient and robust multi-object detection directly from a small number of 2D measurements,” said research team leader Liheng Bian from the Beijing Institute of Technology. “This type of image-free sensing technology is expected to solve the problems of heavy communication load, high computing overhead and low perception rate of existing visual perception systems.”
“For autonomous driving, SPOD could be used with lidar to help improve scene reconstruction speed and object detection accuracy,” Bian continued. “We believe that it has a high enough detection rate and accuracy for autonomous driving while also reducing the transmission bandwidth and computing resource requirements needed for object detection.”
While current image-free perception methods can only achieve classification, single object recognition or tracking, SPOD can do all three at once, the researchers say, with a detection accuracy of just over 80%.
The team's report, published in Optics Letters, claims that automating advanced visual tasks such as navigating a vehicle or tracking a moving plane can be difficult due to complex hardware, high computational costs, long running times, and heavy data transmission loads. But, as SPOD doesn’t require detailed images or scene reconstruction, it overcomes these limitations.
“Compared to the full-size pattern used by other single-pixel detection methods, the small, optimised pattern produces better image-free sensing performance,” said group member Lintao Peng. “Also, the multi-scale attention network in the SPOD decoder reinforces the network’s attention to the target area in the scene. This allows more efficient extraction of scene features, enabling state-of-the-art object detection performance.”
To demonstrate SPOD, the researchers randomly selected images from the Pascal VOC 2012 test dataset and printed them on film as target scenes. Using a sampling rate of 5%, SPOD completed spatial light modulation and image-free object detection for each scene in an average of 0.016 seconds. That is faster than the two-step alternative of first reconstructing the scene (0.05 seconds) and then running object detection on the result (0.018 seconds).
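The timing comparison can be checked with a few lines of Python. The three durations are the figures quoted above; the combined time and the speed-up ratio are derived here and are not reported by the researchers.

```python
# Average per-scene times quoted in the article, in seconds.
spod_time = 0.016            # image-free detection with SPOD
reconstruction_time = 0.05   # scene reconstruction alone
detection_time = 0.018       # object detection on the reconstructed scene

# Derived figures (not taken from the source).
two_step_time = reconstruction_time + detection_time
speedup = two_step_time / spod_time

print(f"two-step pipeline: {two_step_time:.3f} s")
print(f"SPOD speed-up: {speedup:.2f}x")
```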
“Currently, SPOD cannot detect every possible object category because the existing object detection dataset used to train the model only contains 80 categories,” said Peng. “However, when faced with a specific task, the pre-trained model can be fine-tuned to achieve image-free multi-object detection of new target classes for applications such as pedestrian, vehicle or boat detection.”
Image: Lintao Peng, Beijing Institute of Technology