Deep learning was out in force at the Vision trade fair in Stuttgart at the beginning of November. Greg Blackman reports from the show
More than six million people visit Oktoberfest in Munich each year, drinking around seven million litres of beer during the 16-day festival. Oktoberfest could be considered one of the defining events for Austrian start-up MoonVision, which put its real-time object tracking technology through its paces at the 2017 festival. The firm was exhibiting at the Vision show in Stuttgart, and Georg Bartels, the company’s head of sales, told Imaging and Machine Vision Europe that Oktoberfest was a ‘stress test’ for the young company.
As with a lot of today’s start-ups using image analysis, MoonVision’s technology is based on neural networks. The company was founded in August 2017 and currently has 12 employees at its Vienna headquarters.
In the summer of 2017, German caterer Ammer Wiesn approached MoonVision to add an automated quality assurance layer to its food service during Oktoberfest, to ensure the food leaving the kitchen tallied with each waiter’s order. MoonVision installed a camera above the kitchen exit, looking directly down on the waiters and the plates. The system was trained to detect the 20 or so dishes and to identify each of the 40 waiters who worked over the 16 days by their haircut.
The dishes and staff were labelled in the images, and the neural network was trained on video recorded on the first day of the festival; the system had to be up and running quickly, according to Bartels.
The tracking solution worked at high frame rates to deal with rapid movement, and had to be able to identify partially hidden dishes, all with limited computing power – the final system required only seven per cent of the capacity of a modern GPU to deliver information in real time.
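As a rough illustration of this kind of workflow – not MoonVision’s actual code – fine-tuning a detector pre-trained on a public dataset against labelled video frames might look something like the following PyTorch sketch; the class count, the `frame_loader` dataset and the training settings are all assumptions.

```python
# Hypothetical sketch: fine-tuning a pre-trained object detector on labelled
# video frames. Not MoonVision's code; class count, data loader and training
# settings are assumptions for illustration only.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 1 + 20 + 40  # background + ~20 dishes + ~40 waiters (assumed)

# Start from a detector pre-trained on COCO and replace its classification head
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device).train()
optimiser = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

# `frame_loader` is assumed to yield (images, targets) pairs, where each target
# holds the boxes and labels annotated on the first day's footage
for images, targets in frame_loader:
    images = [img.to(device) for img in images]
    targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
    losses = model(images, targets)   # in training mode this returns a loss dict
    loss = sum(losses.values())
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```

Starting from a pre-trained model rather than training from scratch is what makes it plausible to get a usable system running on a single day’s footage.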
Bartels said during Vision that the firm was now getting a lot of interest from automotive companies, and had recently worked with a producer of bearings to automate surface inspection.
Accelerating CNNs
MoonVision was one of five start-ups on a joint stand at Vision for research institutes, universities and start-ups, organised by Messe Stuttgart and VDMA Machine Vision. The gathering momentum behind artificial intelligence and deep learning is now very much apparent in the machine vision sector, with start-ups such as MoonVision and Deevio – another firm on the joint stand, which uses deep learning to improve industrial quality control – developing new vision solutions that address wide-ranging imaging tasks.
Most machine vision software library suppliers now incorporate deep learning tools in their packages, and these were on display alongside dedicated deep learning software from exhibitors such as the South Korean firms Laon People and Sualab. Flir launched its Firefly camera at the show, which incorporates Intel’s Movidius Myriad 2 vision processing unit (VPU) for real-time deep learning inference. The VPU has hardware accelerators for image processing, and includes streaming hybrid architecture vector engine (SHAVE) processor cores that accelerate on-camera inference based on neural networks. Putting neural network acceleration directly on the camera means inference can be run at ‘the edge’, rather than sending the data elsewhere for processing. The initial version of the Firefly camera uses a 1.6-megapixel Sony Pregius global shutter sensor operating at 60fps.
Flir was able to build Intel’s Movidius chip into an industrial vision camera because of the product volumes Flir commands across its entire portfolio. Face detection was one of the Firefly demonstrations Flir was showing at the trade fair.
Learning phase
Deep learning has a lot of potential for image analysis, but it is still largely untested in an industrial environment. Software company MVTec organised a series of deep learning seminars throughout the second half of 2018, one of which was held at the Vision show in Stuttgart. The feedback, according to the firm’s managing director, Dr Olaf Munkelt, was: ‘We are all in a learning phase.’ Munkelt was speaking during a panel discussion organised by the VDMA at the show.
Munkelt commented: ‘We can achieve remarkable results using deep learning because we have the computational power available. This adds opportunities.’ But he went on to say that deep learning is not the Holy Grail, observing that, given a choice between showing a neural network 100,000 images of a screw and measuring the screw with five lines of code, writing the code will yield a result much faster and with less investment.
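To illustrate what Munkelt’s ‘five lines of code’ might look like in practice – a hypothetical OpenCV sketch, not MVTec’s Halcon code – a classical measurement could threshold a backlit image of the screw, find its outline and read the length from a rotated bounding box:

```python
# Hypothetical sketch of a classical, rule-based measurement (not Halcon code):
# segment the screw by thresholding and measure its length from a rotated box.
import cv2

img = cv2.imread("screw.png", cv2.IMREAD_GRAYSCALE)            # assumed backlit image
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
screw = max(contours, key=cv2.contourArea)                      # largest blob = the screw
(_, _), (w, h), _ = cv2.minAreaRect(screw)                      # rotated bounding box
print("screw length in pixels:", max(w, h))
```

The point is not the specific calls, but that a well-constrained measurement task can be solved directly, without collecting and labelling a large image set.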
‘We have to find metrics where we can apply this technology [deep learning],’ Munkelt said. ‘It has great advantages, but we have to explore more to understand more.’
These opinions were echoed by Carsten Strampe, managing director of Imago Technologies. Strampe told Imaging and Machine Vision Europe that what is missing is confirmation from end users that deep learning is reliable enough for industrial inspection. This, he said, is the next step in the evolution of the technology, at least for industrial vision.
Imago Technologies was showing its VisionBox industrial PC, which incorporates GPUs for tasks such as deep learning computation, while MVTec had demonstrations based on Halcon 18.11, which includes new deep learning functions.
Neural networks can be very compute intensive. FPGA and computer hardware provider Xilinx had a number of vision demonstrations at the show, and was exhibiting off the back of unveiling its Versal adaptive compute acceleration platform (ACAP) chip in October. Versal is a low-power chip that contains an AI processing block. The integrated circuit is built on a 7nm process node, an upgrade on the 16nm node used for most of Xilinx’s previous chips. Dale Hitt, director of strategic market development at Xilinx, told Imaging and Machine Vision Europe that Versal will be rolled out in 2019 for big data applications, while a low-power version for embedded processing will be available in 2020.
Emphasis on training
Deep learning is only as accurate as the images with which the neural network is trained. Software firm Adaptive Vision was demonstrating its deep learning tools on the trade show floor, inspecting screws for defects. For its trade fair demo, the company trained its feature detection tool on images of screws with hands in the scene, because, if the classifier is to work on live images at a trade fair where visitors move the screw under the camera, it has to cope with hands in the frame.
Speaking during the VDMA’s series of industrial vision presentations, Michał Czardybon, general manager of Adaptive Vision, noted that ‘really difficult defects on the surface of parts can be detected reliably’. The software picks out small scratches on the surface of the screw, highlighting them even while the screw is being moved around. ‘With traditional tools you would need many weeks of development, very complicated algorithms to detect those defects,’ he continued. ‘With deep learning, you just need a couple of minutes to prepare data and a couple of minutes for training.’
Adaptive Vision’s neural networks have been optimised for industrial inspection and use a pre-trained approach, with the customer needing only 20 to 50 images to fine-tune the network for a particular application. The company recommends a GPU to run its deep learning add-on, which has an execution time of around 100ms for a one-megapixel image. Adaptive Vision offers four tools based on deep learning: feature detection, which is very accurate but needs to know the defect beforehand; anomaly detection, which can be thought of as a golden-template-type tool; instance segmentation, which can identify complicated shapes; and object classification, which can recognise objects such as food.
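As a sketch of that style of fine-tuning – again an assumption-laden PyTorch example rather than Adaptive Vision’s implementation – a pre-trained backbone can be frozen and only a new classification head trained on a few dozen images:

```python
# Hypothetical sketch of fine-tuning a pre-trained classifier on a few dozen
# images, in the spirit of the approach described above (not Adaptive Vision's code).
import torch
import torchvision
from torchvision import transforms

# Assumed folder layout: good_vs_defect/{good,defect}/*.png, ~20-50 images in total
data = torchvision.datasets.ImageFolder(
    "good_vs_defect",
    transform=transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ]),
)
loader = torch.utils.data.DataLoader(data, batch_size=8, shuffle=True)

model = torchvision.models.resnet18(weights="DEFAULT")   # pre-trained backbone
for p in model.parameters():
    p.requires_grad = False                               # freeze the backbone
model.fc = torch.nn.Linear(model.fc.in_features, len(data.classes))  # new head

optimiser = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = torch.nn.CrossEntropyLoss()
model.train()
for epoch in range(10):                                   # a few minutes of training
    for images, labels in loader:
        loss = criterion(model(images), labels)
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()
```

Freezing the backbone keeps the number of trainable parameters small, which is what makes training on 20 to 50 images in a couple of minutes plausible.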
Elsewhere, IDS was releasing new NXT smart cameras that could accommodate pre-trained neural networks, while Israeli start-up Inspekto was showing how its deep learning technology could help ease vision system integration on the factory floor.
The deep learning systems from MoonVision and other start-ups inject new technology into the reasonably mature industrial vision sector. Munkelt commented during the panel discussion: ‘This technology [deep learning] opens up a lot of opportunities for new companies. This brings movement into the [machine vision] industry. And we need some movement; we need new ideas and we need a push forward in order to provide better technology to our customers.’