Early machine vision systems were in use in Japan in the 1960s. The industry has since grown into a global market variously estimated at $8 billion to $14 billion in annual revenue. Machine vision typically refers to the use of vision technology for visual inspections in a manufacturing or logistics facility to identify, sort, and track products, as well as to detect defects, measure features, and position products properly on the line.
Manual visual inspection is a tedious process. One study in the pharmaceutical industry found that 100% inspection by trained and experienced inspectors identified only 80-85% of product discrepancies. Inspectors get tired and distracted, and human eyes are limited in the features they can resolve. Multiple 100% inspections can reduce the defect rate, but they add significant cost and are unlikely to bring the defect rate to zero.
Automating inspections with machine vision offers several potential benefits: 100% inspection with consistent accuracy, the ability to analyze identified defects in real time to find and correct root causes of production errors, and inspection accuracy that improves over time through the application of deep learning. However, existing machine vision systems still have limitations that lead many processes to continue using manual inspection. Cognex estimates that there remain 35 million visual inspectors supporting 360 million manufacturing workers globally. Even at the low wages prevalent in countries like Vietnam, this represents an annual cost of $100 billion, and a very large available market for improved machine vision solutions that can both reduce inspection costs and help systematically improve product quality.
The major players in the industry today include Keyence, Cognex, Panasonic, Teledyne Dalsa, Omron, and Sick. The products of these companies are largely characterized by proprietary hardware/software systems. Machine vision has historically been a highly technical market, and integrating and optimizing cameras, networking, processing, and analysis software into an “appliance” greatly facilitated the development of the market. The flip side of the proprietary hardware/software model is that it is expensive, inflexible, and relatively slow to adopt new technologies that have been advancing rapidly. This whitepaper looks at the emerging opportunity for new applications and new business models in machine vision.
Accelerating Technology Improvements
A machine vision solution has several elements: a camera to acquire an image, hardware capable of transporting and processing data from the camera, software that can analyze and interpret images, and a control system that can act on the interpreted data.
Each of these technologies has seen rapid advances over the past decade, largely driven by smartphones and cloud computing. The smartphone market has grown from zero in 2008 to over 1.4 billion units in 2021. This in turn has driven continual improvement in the performance and cost of camera sensors, networking chips, and processing chips, including CPUs and GPUs.
A 100 megapixel camera can be purchased retail for a few hundred dollars. 5G service is designed to deliver up to 10 Gb/s of bandwidth and 1 ms latency. The new A16 SoC in the iPhone 14 Pro contains 16 billion transistors, and Apple claims that its neural engine can perform 17 trillion operations per second, better than a supercomputer of 25 years ago.
Nvidia continues to lead the market in building compute and GPU hardware that supports improving AI and computer vision performance. Its latest Jetson Orin Nano modules deliver 40 trillion operations per second. Up to 256 of Nvidia’s H100 Tensor Core GPUs for cloud computing can be combined to support a single workload, enabling increasingly complex training of deep learning models with processing requirements measured in petaflops.
As this area grows in importance, both AMD and Intel are releasing new offerings, which helps keep the technology moving forward.
The number of computer vision software tools and frameworks has been growing rapidly. Examples include TensorFlow, PyTorch, CUDA, SYCL, CAFFE and YOLO. This field of research is moving quickly—YOLO alone has had 4 releases in the past 2 years.
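To illustrate how accessible these frameworks have become, the sketch below uses PyTorch and a pretrained torchvision detection model to flag objects in a single inspection image. It is a minimal example, not a production inspection system: the image file name and score threshold are placeholders, and a real deployment would fine-tune a model on its own defect classes.

```python
# Minimal object-detection sketch using PyTorch/torchvision.
# Assumes torch, torchvision, and Pillow are installed; "inspection.jpg" is a placeholder.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load a general-purpose pretrained detector; a real inspection system
# would fine-tune a model on its own defect classes instead.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("inspection.jpg").convert("RGB")

with torch.no_grad():
    predictions = model([to_tensor(image)])[0]

# Keep only confident detections (threshold chosen arbitrarily here).
for box, label, score in zip(predictions["boxes"],
                             predictions["labels"],
                             predictions["scores"]):
    if score > 0.8:
        print(f"label={label.item()} score={score.item():.2f} box={box.tolist()}")
```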
The result of these advances is a tremendous opportunity to expand the machine vision market and improve the automated inspections being done today. Smartphones can become low-cost platforms for many machine vision tasks, with high resolution cameras, the capability to process many tasks locally, and the ability to offload more complex tasks to edge compute resources over low latency 5G connections.
Other use cases may require specialized camera packaging, but with an open architecture they can still benefit from the huge advances in GPUs for local or edge compute and from the wide variety of computer vision models that are becoming available.
Image Capture–Not Just Cameras
While current machine vision solutions typically use standard cameras with dedicated lighting hardware to capture an image for processing and analysis, the opportunity is broader than this. Machine learning and deep learning image analysis tools can utilize imagery beyond what is visible to the human eye. For example, FLIR has long sold cameras with infrared sensors that capture heat maps, which can be an invaluable addition to certain types of inspection processes.
Umajin uses proprietary tools to capture and map RF propagation inside buildings, which is then trained into a model analogous to those used in machine vision. Location sensors deployed in the building can then make use of this model: the sensors scan the RF signals and pass the readings back to be matched against the RF map, accurately determining location even in very complex spaces.
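Umajin’s RF tooling is proprietary, so the sketch below only illustrates the general fingerprint-matching idea: a live scan of signal strengths is compared against a stored map of surveyed locations, and the position is estimated from the closest matches. The survey values and access-point layout are invented for illustration.

```python
# Hypothetical RF-fingerprint matching sketch (not Umajin's proprietary method).
# Each surveyed location stores a vector of signal strengths (dBm) for known transmitters.
import numpy as np

# Survey map: location coordinates (metres) -> RSSI fingerprint for APs [ap1, ap2, ap3].
survey = {
    (0.0, 0.0): np.array([-40.0, -70.0, -80.0]),
    (5.0, 0.0): np.array([-55.0, -60.0, -75.0]),
    (5.0, 5.0): np.array([-70.0, -50.0, -65.0]),
    (0.0, 5.0): np.array([-60.0, -65.0, -55.0]),
}

def estimate_position(scan, k=2):
    """Estimate position as the average of the k closest fingerprints."""
    coords = np.array(list(survey.keys()))
    fingerprints = np.stack(list(survey.values()))
    distances = np.linalg.norm(fingerprints - scan, axis=1)
    nearest = np.argsort(distances)[:k]
    return coords[nearest].mean(axis=0)

# A live scan from a sensor somewhere between the first two survey points.
live_scan = np.array([-48.0, -64.0, -78.0])
print(estimate_position(live_scan))   # approximate (x, y) in metres
```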
Umajin has also developed computational imaging tools that measure how each pixel responds to directional light, allowing them to capture detail that is typically obscured by shadows or reflectivity when an image is illuminated with diffuse lighting. The processed images provide greatly enhanced data sets for the machine learning or deep learning models that support machine vision use cases.
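Umajin’s computational imaging tools are likewise proprietary; the sketch below illustrates the underlying principle using classic photometric stereo, in which images captured under known directional lights are combined to recover per-pixel surface normals and albedo. The light directions and frames here are synthetic placeholders.

```python
# Illustrative photometric-stereo sketch (the classic technique, not Umajin's tooling).
# Given images of the same scene lit from known directions, recover per-pixel normals.
import numpy as np

def surface_normals(images, light_dirs):
    """images: array (N, H, W) of grayscale frames, one per light direction.
       light_dirs: array (N, 3) of unit light-direction vectors."""
    n_imgs, h, w = images.shape
    intensities = images.reshape(n_imgs, -1)            # (N, H*W)
    # Least-squares solve L @ g = I for each pixel, where g = albedo * normal.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)   # (3, H*W)
    albedo = np.linalg.norm(g, axis=0) + 1e-8
    normals = (g / albedo).T.reshape(h, w, 3)
    return normals, albedo.reshape(h, w)

# Example with synthetic data: three 4x4 frames and three light directions.
lights = np.array([[0.0, 0.0, 1.0],
                   [0.7, 0.0, 0.7],
                   [0.0, 0.7, 0.7]])
frames = np.random.rand(3, 4, 4)
normals, albedo = surface_normals(frames, lights)
print(normals.shape, albedo.shape)   # (4, 4, 3) (4, 4)
```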
Example Use Cases
Open architecture machine vision solutions expand the TAM for machine vision across a range of use cases. In typical manufacturing inspection tasks, additional manual inspections can be automated by combining computational imaging to enhance image capture with deep learning models that improve the ability to detect and classify production flaws. The open architecture model also facilitates better integration with other enterprise systems (versus simply removing defective product at a given workstation). Root cause analysis of production errors is an immediate application of this. Prospectively, many firms are evaluating digital twins of their facilities that will be used to manage operational decisions. Real time machine vision data can be readily integrated into these digital twins if it is managed by a framework designed for this type of deployment.
Machine vision systems that provide metrology data, for example the fit of a door panel, can also be used to continually refine process management even when the product is within the minimum specification.
Relatively low-cost systems based on smartphones can be used for a range of health and safety applications. Examples include identifying when workers are not wearing appropriate safety equipment (hard hats, high visibility vests, etc.) or when workers or vehicles are operating in unsafe areas.
Systems can be used to detect employee posture or movements that might increase the risk of injury, such as improper lifting. Existing software allows faces to be obscured for privacy, and also supports workflow applications that enable a systematic response to reduce unsafe practices.
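As a minimal illustration of the privacy step, the hedged sketch below uses OpenCV’s bundled Haar cascade to detect and blur faces in a frame before it is stored or analyzed; the file names are placeholders, and a production system would typically use a more robust detector.

```python
# Minimal face-obscuring sketch using OpenCV's bundled Haar cascade
# (illustrative only; "frame.jpg" is a placeholder file name).
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("frame.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Detect faces, then blur each detected region before the frame is stored or analyzed.
for (x, y, w, h) in detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
    frame[y:y+h, x:x+w] = cv2.GaussianBlur(frame[y:y+h, x:x+w], (51, 51), 0)

cv2.imwrite("frame_blurred.jpg", frame)
```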
The solutions can incorporate other sensor data, such as acoustic microphones, and use models analogous to machine vision models to identify abnormal conditions.
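As a hedged illustration of the acoustic case, the sketch below compares the average spectrum of a new recording against a baseline of normal operation and flags large deviations; the threshold and synthetic signals are placeholders that would be replaced with tuned values and real recordings.

```python
# Hedged sketch of acoustic anomaly detection: compare a new recording's
# average spectrum against a baseline of normal operation.
import numpy as np
from scipy.signal import spectrogram

def average_spectrum(signal, sample_rate):
    freqs, _, sxx = spectrogram(signal, fs=sample_rate, nperseg=1024)
    return sxx.mean(axis=1)   # mean power per frequency bin

def is_abnormal(signal, baseline_spectrum, sample_rate, threshold=3.0):
    spectrum = average_spectrum(signal, sample_rate)
    # Compare log-power spectra; the threshold would be tuned on real recordings.
    deviation = np.abs(np.log10(spectrum + 1e-12) - np.log10(baseline_spectrum + 1e-12))
    return deviation.mean() > threshold

# Synthetic example: baseline is a 100 Hz hum, the test signal adds broadband noise.
fs = 16000
t = np.arange(fs) / fs
baseline = average_spectrum(np.sin(2 * np.pi * 100 * t), fs)
noisy = np.sin(2 * np.pi * 100 * t) + 2.0 * np.random.randn(fs)
print(is_abnormal(noisy, baseline, fs))
```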
Low-cost machine vision tools enable logistics solutions such as inspecting trays being loaded onto delivery trucks to ensure proper order fulfilment, using direct identification of items rather than tags.
Machine vision enables high accuracy indoor location, as described in the previous section. This can be used for tracking the location of parts and tools required for certain processes, providing real-time spatial data for digital twins, or even disabling a tool if a worker attempts to use it incorrectly.
Managing Open Architecture Machine Vision Solutions
The proprietary appliance model has the advantage of ease of deployment. The tradeoff is high cost, inflexibility, and technology that lags the state of the art. Smartphone-based machine vision solutions are starting to gain traction. For example, Scandit (https://www.scandit.com/) is a European firm selling a smartphone-based bar code scanning application that joined the unicorn ranks after its financing round earlier this year.
Amazon offers services like SageMaker, which facilitate the development of machine learning solutions for given applications.
But what is really needed is a framework that can rapidly stitch together the pieces required for a machine vision solution: managing the camera and lighting, image processing on smartphones or local processing resources such as the Nvidia Orin, orchestration of cloud compute resources when necessary, integration of the most appropriate machine vision model, and integration of the results into the manufacturing execution system.
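As a purely hypothetical sketch of what such a framework might look like to a developer, the example below composes capture, preprocessing, inference, and MES-integration stages into a single pipeline. The stage names and data are invented and do not represent Umajin’s actual API.

```python
# Hypothetical sketch of a pipeline that stitches machine vision stages together.
# Stage names and the MES step are invented; this is not Umajin's actual API.
from dataclasses import dataclass, field
from typing import Any, Callable, List

@dataclass
class VisionPipeline:
    stages: List[Callable[[Any], Any]] = field(default_factory=list)

    def add(self, stage: Callable[[Any], Any]) -> "VisionPipeline":
        self.stages.append(stage)
        return self

    def run(self, frame: Any) -> Any:
        # Pass the frame through each stage in order.
        for stage in self.stages:
            frame = stage(frame)
        return frame

# Placeholder stages; real implementations would wrap camera drivers,
# GPU inference (local or cloud), and the plant's manufacturing execution system.
def capture(_):           return {"image": "raw-frame"}
def preprocess(data):     data["image"] = "enhanced-frame"; return data
def detect_defects(data): data["defects"] = ["scratch"];    return data
def publish_to_mes(data): print("MES update:", data["defects"]); return data

pipeline = (VisionPipeline()
            .add(capture)
            .add(preprocess)
            .add(detect_defects)
            .add(publish_to_mes))
pipeline.run(None)
```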
Umajin is a platform for rapidly developing next generation mobile and XR applications, built around a visual rendering engine. Hence it excels in computational imaging tasks, and its modular architecture facilitates the integration of the best machine vision tool for a given use case. Umajin’s server components both orchestrate cloud resources (for example, cloud computing GPUs) and integrate with other enterprise manufacturing systems. Umajin can deploy machine vision solutions on smartphones using a hardened runtime, as well as build workflow applications that use data from machine vision systems to facilitate other manufacturing processes (such as identifying the root cause of defects).
Umajin provides tools to facilitate automated deployment of machine vision solutions, along with the flexibility to readily customize a solution for a new product or to incorporate improvements in machine vision software.
Umajin is going to market with large partners such as NTT: https://youtu.be/FWS4UwFHhVo