Echolocation neural net gives phones 3D vision via sound

A machine learning algorithm that gives smartphones the ability to create 3D images through echolocation has been developed by scientists from the University of Glasgow.

The algorithm can deduce the shape, size and layout of a room by measuring the time it takes for sound from speakers to return to the phone's microphone.
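As a rough illustration of the timing principle behind this, the sketch below converts a round-trip echo delay into a distance. It is not the researchers' code: the function name `echo_distance_m` and the constant `SPEED_OF_SOUND_M_S` are hypothetical, and the speed of sound is assumed to be about 343 m/s, its approximate value in air at room temperature.

```python
# Illustrative sketch only; names and constants are assumptions, not from the paper.
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at about 20 °C


def echo_distance_m(round_trip_s: float, speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Return the distance to a reflector given the round-trip echo delay in seconds."""
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the reflector and back, so halve the total path length.
    return speed_m_s * round_trip_s / 2.0


if __name__ == "__main__":
    # An echo arriving 20 ms after the pulse left the speaker.
    print(f"{echo_distance_m(0.020):.2f} m")
```

A single delay only gives one distance. The system described here goes further, using machine learning to turn the full pattern of echoes into an image.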

The technique also works with radio waves pulsed from small antennas. The researchers say it could be used to generate images on potentially any device equipped with a microphone and speakers, or with a radio antenna.

The results are displayed as a video feed that renders the echo data as 3D images.

Dr Alex Turpin of the University of Glasgow’s School of Computing Science and School of Physics and Astronomy, and one of the lead authors of the paper published in Physical Review Letters, said: 'Echolocation in animals is a remarkable ability, and science has managed to recreate the ability to generate three-dimensional images from reflected echoes in a number of different ways, like radar and lidar.

'What sets this research apart from other systems is that, firstly, it requires data from just a single input – the microphone or the antenna – to create three-dimensional images. Secondly, we believe that the algorithm we’ve developed could turn any device with either of those pieces of kit into an echolocation device.

'That means that the cost of this kind of 3D imaging could be greatly reduced, opening up many new applications. A building could be kept secure without traditional cameras by picking up the signals reflected from an intruder, for example. The same could be done to keep track of the movements of vulnerable patients in nursing homes. We could even see the system being used to track the rise and fall of a patient’s chest in healthcare settings, alerting staff to changes in their breathing.'

The researchers used the speakers and microphone from a laptop to generate and receive acoustic waves in the kilohertz range. They also used an antenna to do the same with radio-frequency waves in the gigahertz range.

In each case, they collected data on the waves reflected around a room as a single person moved through it. At the same time, they recorded the room with a time-of-flight camera, which measured its dimensions and provided a low-resolution image.

By combining the echo data from the microphone and the image data from the time-of-flight camera, the team trained their machine learning algorithm over hundreds of repetitions to associate specific delays in the echoes with images. Eventually, the algorithm had learned enough to generate its own images of the room and its contents from the echo data alone.
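The training idea can be sketched in a heavily simplified form. The example below is an assumption-laden toy, not the team's method: it swaps the neural network for a closed-form ridge regression, replaces the room with a hypothetical one-dimensional strip of eight positions, and uses synthetic echo traces and synthetic one-hot "images" in place of real microphone and time-of-flight camera data. All names (`synth_echo`, `make_dataset`, `fit_ridge`) are invented for illustration.

```python
import numpy as np

# Toy setup (all values are assumptions for illustration).
N_CELLS = 8                      # hypothetical 1D "room" split into 8 positions
N_BINS = 64                      # time bins in each recorded echo trace
BINS_PER_CELL = N_BINS // N_CELLS


def synth_echo(cell, rng, noise=0.05):
    """Synthetic echo trace: one Gaussian pulse whose delay grows with distance."""
    t = np.arange(N_BINS)
    centre = cell * BINS_PER_CELL + BINS_PER_CELL / 2
    pulse = np.exp(-0.5 * ((t - centre) / 2.0) ** 2)
    return pulse + noise * rng.standard_normal(N_BINS)


def make_dataset(n, rng):
    """Pair each synthetic echo with a one-hot 'image' marking the occupied cell."""
    cells = rng.integers(0, N_CELLS, size=n)
    echoes = np.stack([synth_echo(c, rng) for c in cells])
    images = np.eye(N_CELLS)[cells]  # stand-in for the time-of-flight camera label
    return echoes, images, cells


def fit_ridge(X, Y, lam=1e-2):
    """Closed-form ridge regression: W = (X^T X + lam * I)^-1 X^T Y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)


rng = np.random.default_rng(0)
train_x, train_y, _ = make_dataset(200, rng)
test_x, _, test_cells = make_dataset(50, rng)

# Learn the echo-to-image mapping from paired data, then predict from echoes alone.
W = fit_ridge(train_x, train_y)
predicted_cells = (test_x @ W).argmax(axis=1)
accuracy = float((predicted_cells == test_cells).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the toy is the workflow rather than the model: paired echo and image data are needed only during training, after which images are predicted from echo data on their own.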

The research builds on previous work by the team, which trained a neural network to build 3D images by measuring the reflections from flashes of light using a single-pixel detector.

Dr Turpin added: 'We’ve now been able to demonstrate the effectiveness of this algorithmic machine learning technique using light and sound, which is very exciting. It’s clear that there is a lot of potential here for sensing the world in new ways, and we’re keen to continue exploring the possibilities of generating more high-resolution images in the future.'
