Top 10 pitfalls in developing machine vision systems with artificial intelligence
Machine vision is the ability of an electronic system to see surroundings and objects using one or more electronic cameras, advanced light sensors, analog-to-digital converters (ADC) and digital signal processors (DSP). Primitive forms of this technology have been in use for decades, with most applications being in the fields of medicine, defense/aerospace and industrial automation.
With recent improvements in 3D sensing technology, increased miniaturization of component modules, and substantial price drops in quality CMOS imaging sensors and powerful data processors, machine vision applications are now growing rapidly. Grand View Research estimates that the global machine vision market will reach $18.25 billion by 2025, growing at a CAGR of 7.7%.
New markets for machine vision systems include:
- Factory automation and robotics: visual inspection, diagnosis, assembly, mobile robots, digital fabrication, service robots, rescue robots
- Intelligent transport systems: traffic monitoring, autonomous vehicles, driver safety assistance
- Security and law enforcement: security monitoring, camera networks, remote sensing, underwater and adverse environments
- Life sciences: agriculture, forestry, fishery, civil/construction engineering, commerce, sports, fashion, home and more
- Multimedia: database archiving/retrieval, documents, culture/heritage, virtual reality (VR) / mixed reality (MR) / augmented reality (AR), entertainment
- Biomedical: tomography, endoscopy, computer-aided diagnostics, computer-assisted surgery, computational anatomy, bioinformatics, nursing care
- Human-computer interaction: face/gesture/behavior/gait/gaze analysis, biometrics, wearable computing, first-person vision systems
A recent Gartner study reported that by 2025:
- The penetration of machine-vision-integrated advanced driver assistance systems (ADAS) in automobiles will reach 35% from the current level of 10%.
- The application of machine vision technology in retail stores will result in 20% growth in customer traffic and 10% growth in store margins due to targeted campaigning.
- 20% of all smart home appliances shipped by the top five consumer electronics manufacturers will be enabled by machine vision technology.
- Nearly all premium smartphones and 30% of basic and utility smartphones will include machine vision capability to enable facial or gesture recognition as the standard authentication mechanism.
Machine vision can trace its origins to the 1950s, when the team of P. K. Weimer, S. V. Forgue and R. R. Goodrich working at RCA developed the Vidicon tube for use in early electronic cameras. Vidicon tubes used photoconductors as a target material to capture images. NASA employed RCA Vidicon cameras in the majority of unmanned deep space probes with remote sensing capabilities until the late 1970s.
The key to successful machine vision solution development and implementation is to work with a trusted technology partner to establish the necessary framework of hardware components and software that provides vision algorithms, camera interface standards, advanced analytics, artificial intelligence and machine learning. With all enterprises, there is a right way and wrong way of doing things. We’ve compiled this top 10 list of considerations when developing a modern machine vision solution.
1. Start with quality data, then develop the AI: In order to operate properly, machine vision systems need to acquire, process, analyze and understand images, and the understanding is handled by AI. This understanding is achieved by compiling information known as training data to help teach the AI. The better the quality of the training data, the better the quality of the AI, which leads to better performance of the machine vision system. Training data that is poor in quality or very limited in quantity will hamper the AI and the success of a machine vision application. Even the best-programmed AI will not deliver adequate results if it is not given proper training data.
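One simple way to catch training data that is "very limited in quantity" is to count the labeled examples per class before any training starts. The sketch below is illustrative only: the function name `audit_labels` and the thresholds `min_per_class` and `max_imbalance` are assumptions chosen for the example, not values taken from any standard or from this article.

```python
from collections import Counter


def audit_labels(labels, min_per_class=100, max_imbalance=10.0):
    """Return a list of warnings about a labeled training set.

    labels        -- iterable of class names, one per training image
    min_per_class -- hypothetical minimum sample count per class
    max_imbalance -- hypothetical limit on largest/smallest class ratio
    """
    counts = Counter(labels)
    if not counts:
        return ["dataset is empty"]

    warnings = []
    # Flag every class that has too few examples to learn from.
    for name, n in sorted(counts.items()):
        if n < min_per_class:
            warnings.append(
                f"class '{name}' has only {n} samples (< {min_per_class})"
            )

    # Flag a dataset dominated by one class.
    ratio = max(counts.values()) / min(counts.values())
    if ratio > max_imbalance:
        warnings.append(
            f"class imbalance ratio {ratio:.1f} exceeds {max_imbalance}"
        )
    return warnings


if __name__ == "__main__":
    example = ["ok"] * 500 + ["scratch"] * 20
    for line in audit_labels(example):
        print(line)
```

A count like this says nothing about image quality or labeling accuracy, so it can only complement a manual review of the data.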
2. Beware of scope creep: Begin every project with a set of realistic expectations and attainable goals. The human brain is capable of processing data from the five senses simultaneously and acting upon this data instantly; this is true multitasking. Machines are often programmed to do one task extremely well, but AI can struggle when required to learn and execute multiple tasks. During initial planning stages, focus on the primary capabilities that will deliver success. Having early builds of the AI application perform a wide range of diverse tasks is difficult to execute properly and can lead to unsatisfactory initial outcomes.
3. The language of vision: Successful machine vision applications require not only capable hardware but skilled programming. Programming can come in the form of AI frameworks and coding languages. An artificial intelligence framework allows for faster and easier development of AI applications, including machine learning, deep learning, neural networks and natural language processing (NLP). AI frameworks act as a template for the development of the artificial intelligence system. This makes development, deployment and governance much easier than developing an AI application from scratch. There are several programming languages and frameworks that work with AI, each with its own strengths. Languages include Python, C++, Lisp, Java, R and Prolog. Frameworks include Caffe, PyTorch and TensorFlow, and collections of pretrained models, such as the Caffe Model Zoo, can provide a starting point.
During the planning stages of the machine vision application, it’s important to establish whether to use in-house or contracted programming resources. What is the skill level of the programmers? What programming language will be used? What are the best development tools for the chosen programming language? Can you easily compile the AI program and subsequent updates? How will you distribute updates?
4. Choosing the right hardware brain: Many options exist when deciding upon the hardware that will be running your machine vision AI application. Field-programmable gate arrays (FPGAs), graphics processing units (GPUs), central processing units (CPUs) and even microcontrollers (MCUs) each have their own benefits.
FPGAs: FPGAs are very powerful processing units that can be configured to meet the requirements of almost any application. Tailor-made FPGA architectures can be created for handling specific applications. For those applications, this can achieve much higher performance, lower costs and better power efficiency compared to other options like GPUs and CPUs.
GPUs: GPUs are specialized processors that are mainly designed to process images and videos. Compared to CPUs, they are based on simpler processing units but host a much larger number of cores. This makes GPUs excellent for applications in which large amounts of data need to be processed in parallel, like image pixels or video codecs. Two limitations of GPUs are that they are energy intensive and that they are programmed in languages like CUDA and OpenCL, which offer less flexibility than general-purpose CPU programming.
CPUs: CPUs have a limited core count, which inhibits their ability to quickly process the large amounts of data needed for AI. This makes CPUs suitable mainly for small models with small effective batch sizes. The advantages of CPUs are their ease of programming, cost and broad support for programming frameworks.
Other factors to consider when choosing hardware include energy efficiency, device mobility, IO count, operating environments, and most importantly, cost. Being thorough and proactive in the initial planning stages can save headaches down the road. With all processors and supporting components, give yourself enough processing power for possible future capabilities and enough onboard memory to handle firmware upgrades and AI algorithm growth.
5. Image sensor and lighting: Great advancements in front-side illumination (FSI) and back-side illumination (BSI) in CMOS sensor technology allow for higher-resolution images in low light. Proper lighting is also an important consideration. The basis for all lighting performance comes down to three main image sensor characteristics: quantum efficiency (QE), dark current and saturation capacity. Quantum efficiency is the ratio of the charge (electrons) created by the device to the number of incoming photons. Because QE changes with wavelength, it is best plotted as a function of wavelength, which provides an accurate measure of device sensitivity. When the sensor is implemented within a camera, the maximum QE of the camera will be lower than that of the sensor alone, due to external optical and electronic effects.
Dark current and saturation capacity are also important design considerations in machine vision systems. Dark current is the rate at which electrons are thermally generated within the CMOS imager, even with no light present; variation in the number of these electrons adds noise to the image. Saturation capacity denotes the number of electrons that an individual pixel can store. While these parameters are generally not specified on camera manufacturers’ data sheets, they can be used with QE measurements to derive maximum signal-to-noise ratio (S/N), absolute sensitivity and the dynamic range of an application.
The right lighting will help increase accuracy and efficiency of a machine vision application. Other factors to consider with lighting include wavelength, such as infrared, fixed lighting and even lighting placement. Light sources and reflections that shine directly into the cameras of machine vision systems have been shown to decrease object detection accuracy.
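As a rough illustration of how the sensor figures in this section relate to each other, the sketch below uses the common shot-noise-limited approximations: maximum S/N is about the square root of the saturation capacity, and dynamic range is the saturation capacity divided by the noise floor in the dark. The function names and the example numbers (10,000 electrons, 5 electrons of dark noise) are assumptions for illustration, not values from a particular sensor data sheet.

```python
import math


def quantum_efficiency(electrons_generated, photons_incident):
    """QE: electrons created per incoming photon (between 0 and 1)."""
    return electrons_generated / photons_incident


def max_snr(saturation_capacity_e):
    """Shot-noise-limited maximum S/N.

    With N signal electrons the shot noise is sqrt(N), so S/N = sqrt(N).
    The best case is a pixel filled to its saturation capacity.
    """
    return math.sqrt(saturation_capacity_e)


def dynamic_range(saturation_capacity_e, dark_noise_e):
    """Ratio of the largest storable signal to the noise floor in the dark.

    dark_noise_e is an assumed per-pixel noise figure in electrons.
    """
    return saturation_capacity_e / dark_noise_e


def to_db(ratio):
    """Express an S/N or dynamic range ratio in decibels."""
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    saturation = 10_000  # electrons, assumed
    dark_noise = 5       # electrons, assumed
    print(f"QE:            {quantum_efficiency(650, 1000):.2f}")
    print(f"Max S/N:       {to_db(max_snr(saturation)):.1f} dB")
    print(f"Dynamic range: {to_db(dynamic_range(saturation, dark_noise)):.1f} dB")
```

Real sensors add further noise sources, so figures computed this way are upper bounds rather than predictions of measured performance.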
6. Background discernment: Backgrounds can create unique problems for machine vision systems. Imagine a security system unable to discern a black gun from the dark garment of a bad actor. Similar difficulties can exist when reflective metallic objects in a factory environment make it impossible for the vision detection algorithms to function properly. Secondary algorithms can mitigate this by emphasizing different wavelengths of the electromagnetic (EM) spectrum, such as infrared, and adaptive lighting can also help.
7. Object positioning and orientation: AI can help a machine vision solution recognize an object that is learned from training data. If you take the same object and change its orientation, some machine vision systems will stumble. This can be mitigated through accurate training sets for the AI but it is data intensive.
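One common way to reduce the data burden for orientation is to generate rotated copies of each training image rather than capturing every orientation separately. The sketch below is a minimal, dependency-free illustration that handles only 90-degree steps on an image stored as a list of rows; the function names are hypothetical, and a real pipeline would more likely use an image library with arbitrary rotation angles.

```python
def rotate90(image):
    """Rotate a 2D image (list of rows) 90 degrees clockwise."""
    # Reversing the row order and then transposing turns the
    # original top row into the new right-hand column.
    return [list(row) for row in zip(*image[::-1])]


def orientation_variants(image):
    """Return the image at 0, 90, 180 and 270 degrees clockwise."""
    variants = [image]
    for _ in range(3):
        variants.append(rotate90(variants[-1]))
    return variants


if __name__ == "__main__":
    tiny = [[1, 2],
            [3, 4]]
    for angle, variant in zip((0, 90, 180, 270), orientation_variants(tiny)):
        print(angle, variant)
```

Each variant keeps the label of the original image, so one labeled capture yields four training examples.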
8. Object scaling: When looking at a basketball from two feet away versus 10 feet, we still understand it to be the same object, just at a different distance. This is where a varied training set and accurate testing of the AI can help establish the range of distances at which the object can be properly identified. Lens and focal length selection also directly factor into application performance. Most machine vision systems read pixel values, but for the successful deployment of an application that involves movement it is also important to consider how object scale changes.
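The effect of distance and focal length on apparent size can be estimated with the simple pinhole camera model: the size of the object on the sensor is roughly focal length times object size divided by distance. The sketch below applies that to the basketball example. The ball diameter (240 mm), focal length (8 mm) and pixel pitch (3 µm) are assumed values chosen for illustration, and lens distortion is ignored.

```python
FEET_TO_MM = 304.8


def apparent_size_px(object_size_mm, distance_mm, focal_length_mm, pixel_pitch_mm):
    """Approximate size of an object in pixels under the pinhole model."""
    # Size of the projected image on the sensor, in millimetres.
    size_on_sensor_mm = focal_length_mm * object_size_mm / distance_mm
    # Convert to pixels using the physical width of one pixel.
    return size_on_sensor_mm / pixel_pitch_mm


if __name__ == "__main__":
    ball_mm = 240.0   # assumed basketball diameter
    focal_mm = 8.0    # assumed lens focal length
    pitch_mm = 0.003  # assumed 3 micrometre pixel pitch

    near = apparent_size_px(ball_mm, 2 * FEET_TO_MM, focal_mm, pitch_mm)
    far = apparent_size_px(ball_mm, 10 * FEET_TO_MM, focal_mm, pitch_mm)
    print(f"at 2 ft:  {near:.0f} px")
    print(f"at 10 ft: {far:.0f} px")
    print(f"scale ratio: {near / far:.1f}")
```

Moving the object five times farther away shrinks it by a factor of five in the image, which is the range of scales the training set and the lens choice would need to cover in this example.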
9. Object deformation: The ability of a machine vision system to recognize the same object with slight deviations can be critical in machine vision applications, especially in transportation and security. The need to identify pedestrians with articulating limbs is important for the accuracy of the application and operational safety. This again places the focus on having quality training sets for the AI to learn from, but it is data intensive.
10. Action and movement: Fast movements can create problems for machine vision systems, which can be detrimental in applications where safety is critical. The problems can be mitigated through the correct shutter selection of the imager, special programming algorithms and lighting. Inexpensive image sensors often utilize rolling shutters, which can distort images of fast-moving objects. Global shutters might add to the cost of an image sensor, but they are a necessary feature for correctly capturing fast movement. Anticipation and preparedness are two factors used to ascertain intelligence in human beings. The same applies for AI applications.
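To judge whether a given shutter and exposure setting can cope with a moving object, two back-of-the-envelope estimates are useful: motion blur (how far the object moves during the exposure) and rolling-shutter skew (how far it moves while the rows it covers are read out one after another). The sketch below is an approximation under assumed numbers; the function names, the object speed, the row readout time and the exposure are all hypothetical.

```python
def motion_blur_px(speed_px_per_s, exposure_s):
    """Distance the object travels across the image during one exposure."""
    return speed_px_per_s * exposure_s


def rolling_shutter_skew_px(speed_px_per_s, row_readout_s, rows_spanned):
    """Horizontal offset between the object's top and bottom rows.

    A rolling shutter reads rows one after another, so the bottom of the
    object is captured later than the top. A global shutter exposes all
    rows at once, which corresponds to a skew of zero.
    """
    return speed_px_per_s * row_readout_s * rows_spanned


if __name__ == "__main__":
    speed = 2000.0     # px/s across the image, assumed
    exposure = 1e-3    # 1 ms exposure, assumed
    row_time = 20e-6   # 20 microseconds per row, assumed
    rows = 500         # object height in rows, assumed

    print(f"motion blur: {motion_blur_px(speed, exposure):.1f} px")
    print(f"rolling-shutter skew: "
          f"{rolling_shutter_skew_px(speed, row_time, rows):.1f} px")
```

In this example a short exposure keeps blur to a couple of pixels, but the rolling-shutter skew is ten times larger, which illustrates why shutter type matters independently of exposure time.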
Strong innovation and the growing adoption of machine vision technology have increased its market value by nearly $10 billion in the last five years. With this strong growth come rapid advancements in AI algorithms, processing components, light sources, image sensors and other related technology. With the many different competing options on the market, it’s difficult to stay ahead of the curve and the competition.
Choosing the right technology partners for your next innovation optimizes efficiency, mitigates potential risks and maximizes profit potential. To achieve your goals, Avnet can connect you with our trusted global technology partners in machine vision systems. This enables you to better focus valuable resources on intellectual property innovation and other areas that deliver a strong competitive edge. Together, we provide the support needed to successfully differentiate your product offerings, accelerate your time to market and improve business outcomes.
With a century of innovation at our foundation, Avnet can guide you through the challenges of developing and delivering new technology at any — or every — stage of your journey. We have the expertise to support your innovation, turn your challenges into opportunities, and build the right solutions for your success. Make your vision a reality and reach further with Avnet as your trusted global technology partner.