Machine vision applications have only just scratched the surface of what might be possible. Nevertheless, the implementations that have been achieved to date are still quite impressive.
The history of science is full of revolutionary advances that required small insights that anyone might have had, but that, in fact, only one person did. – Isaac Asimov, “The Three Numbers”
The most common applications today are for industrial process and quality control. A production line robot checks an object for defects by comparing a newly captured image with a stored reference model.
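The compare-to-reference idea can be sketched in a few lines. The following is an illustrative toy, not any vendor's inspection algorithm: images are small 2D grids of grayscale intensities, and a part is rejected when its mean per-pixel deviation from a stored "golden" image exceeds a tolerance. The function names and threshold are assumptions for illustration.

```python
def mean_abs_diff(captured, reference):
    """Mean absolute per-pixel difference between two same-sized grayscale images."""
    total, count = 0, 0
    for row_c, row_r in zip(captured, reference):
        for pc, pr in zip(row_c, row_r):
            total += abs(pc - pr)
            count += 1
    return total / count

def is_defective(captured, reference, tolerance=10.0):
    """Reject the part when it deviates too far from the golden reference image."""
    return mean_abs_diff(captured, reference) > tolerance

reference = [[100, 100], [100, 100]]   # stored reference model
good_part = [[101, 99], [100, 102]]    # within normal sensor noise
bad_part  = [[100, 100], [100, 180]]   # bright blemish in one corner

print(is_defective(good_part, reference))  # False
print(is_defective(bad_part, reference))   # True
```

Production systems first align the captured image to the reference and compare locally rather than globally, but the accept/reject decision reduces to the same kind of thresholded comparison.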
Police departments extended this to license plate identification in the 1990s. This quite naturally grew into systems for facial recognition – either by sorting an image database or picking out a face in a crowd. Further enhancements under development include 3D recognition that can account for variations in facial expression, capture unique details in skin texture and features, or see through disguises. The technology extends naturally to fingerprint and eye recognition systems.
Automotive collision avoidance has been available for nearly a decade. These are intricate solutions that involve image capture, motion estimation, monitoring rates of change in relative positions and road gradients, and so forth. Such vehicles can also include lane change/drift alerts that have their own sets of calculations, estimates, and model libraries.
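One of the "rates of change in relative positions" mentioned above is the basis of a classic collision-avoidance quantity: time-to-collision (TTC). This sketch is purely illustrative (no production system is this simple) and the 2-second warning threshold is an assumed value, not a standard.

```python
def time_to_collision(range_prev, range_now, dt):
    """TTC in seconds from two successive range readings; None if the gap is steady or opening."""
    closing_speed = (range_prev - range_now) / dt  # m/s, positive means closing
    if closing_speed <= 0:
        return None
    return range_now / closing_speed

def should_warn(range_prev, range_now, dt, threshold_s=2.0):
    """Raise an alert when the estimated TTC falls below the (assumed) threshold."""
    ttc = time_to_collision(range_prev, range_now, dt)
    return ttc is not None and ttc < threshold_s

# Gap closed from 30 m to 28 m in 100 ms: closing at 20 m/s, 1.4 s to impact.
print(time_to_collision(30.0, 28.0, 0.1))
print(should_warn(30.0, 28.0, 0.1))
```

Real systems filter many noisy measurements over time and fold in road gradient, braking models, and driver reaction time, but the core arithmetic is this ratio of separation to closing speed.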
What we do every day on our commute – keeping a car in its lane and navigating safely – is an incredibly burdensome task for computer/machine vision systems. There are quite a few computations that need to be performed dynamically. Even the primitive libraries for such an application are quite elaborate and extensive.
Audi, Google, and others have very ambitious research initiatives underway to create cars that can navigate independently using multiple cameras and even radars. Their machine vision needs are labyrinthine in complexity, as environmental factors (fog, snow, ice, smoke, rain, wind, and dew), road/traffic conditions (traffic signs and signals, potholes, road work, lanes, barriers, etc.), and relative position of other objects over time (vehicles, pedestrians, animals and detritus) present fiercely dynamic multivariate computing challenges.
The medical field is actively exploring machine vision and, frankly, needs its capabilities quite urgently. Any of you who have had to look at a magnetic resonance imaging (MRI) file, computerized axial tomography (CAT) scan, or ultrasound image while a radiologist interpreted the bizarre smears of colors, shapes, and shades know exactly what I'm talking about. What machine vision will soon be able to offer is genuine clarity, even to the layman, through reconstructing a captured image as a simulation model. This will help enormously in studying injuries and pathologies, as well as in assisting diagnoses and courses of treatment. Other applications include detection of abnormalities – for instance, cancerous growths or damaged tissue.
The military has been a big fan of computer/machine vision ever since the US Air Force and Navy started deploying Tomahawk cruise missiles using Xilinx FPGAs for landmark-based navigation in the late 1980s. The FPGAs would periodically load maps stored in memory and compare them to the landscape the weapon was traversing in order to make autonomous in-flight course adjustments.
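The compare-stored-map-to-observed-terrain step can be illustrated with a toy correlation search. This is a hedged sketch of the general idea only: real systems match 2D scene or contour data, while this 1D version simply slides a stored terrain profile along the observed one and picks the offset with the smallest sum of squared differences. All names and values here are invented for illustration.

```python
def best_offset(stored, observed):
    """Offset into `observed` where the stored profile matches best (least SSD)."""
    best, best_err = 0, float("inf")
    for off in range(len(observed) - len(stored) + 1):
        err = sum((s - observed[off + i]) ** 2 for i, s in enumerate(stored))
        if err < best_err:
            best, best_err = off, err
    return best

stored   = [5, 9, 2]                  # terrain heights expected on course
observed = [1, 1, 5, 9, 2, 4, 4]      # heights actually sensed below

# A best offset of 0 would mean "on course"; anything else is the
# displacement used to compute a course correction.
print(best_offset(stored, observed))  # 2
```

The match offset feeds back into guidance as a position error, which is the essence of map-referenced navigation.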
Newer missiles include guidance systems that can dynamically interpret more detailed geographical features as well as select and track moving targets. Inevitably, this is leading military researchers to the next step in unmanned aerial and submersible vehicles that would, in effect, become combat robots able to pick out and engage targets independently. A real-life Terminator is not all that much of a conceptual step beyond that.
Some of the most exciting work in machine/computer vision stems from a subtle insight into the current deficiencies of computer-aided design (CAD). In order to interact with 3D models, designers today use clunky peripherals such as keyboards, mice, and joysticks.
Machine vision systems are being developed that completely bypass such inefficient mechanisms by employing a gesture recognition apparatus. A camera array tracks hand and finger positions dynamically. The system then alters a 3D screen image so that a user can virtually interact with the model, reaching into the design to toggle switches, press buttons, and so on.
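The final step of that pipeline – turning a tracked hand position into a change in the on-screen model – can be sketched simply. This toy assumes the camera tracking is already done and maps horizontal hand travel (in pixels) to a rotation of a 3D model point; the scaling factor and function names are illustrative, not from any real gesture system.

```python
import math

def rotate_z(point, angle):
    """Rotate an (x, y, z) point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c, z)

def gesture_to_angle(dx_pixels, radians_per_pixel=0.01):
    """Map horizontal hand travel (from the camera tracker) to a model rotation."""
    return dx_pixels * radians_per_pixel

point = (1.0, 0.0, 0.5)
angle = gesture_to_angle(157)   # hand moved 157 px right: ~90-degree swing
print(rotate_z(point, angle))
```

A real system runs this loop continuously, composing small rotations and translations each frame so the model appears to follow the hand.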
The implications are enormous. Think of firefighters deploying remotely piloted robots to enter burning buildings, or miners controlling equipment while remaining safely above ground. The potential of machine/computer vision to alter our modern age is breathtaking. However, even these astounding innovations merely scratch the surface of what is possible.
What if, instead of controlling an android remotely, you could simply tell it what to do? There are things an android would need to have in addition to machine vision. For instance, it would have to be able to hear and comprehend the spoken word. But that is a topic for the next editorial.
Join over 2,000 technical professionals and embedded systems hardware, software, and firmware developers at ESC Silicon Valley July 20-22, 2015 and learn about the latest techniques and tips for reducing time, cost, and complexity in the embedded development process.
Passes for the ESC Silicon Valley 2015 Technical Conference are available at the conference’s official site with discounted advance pricing until July 17, 2015. The Embedded Systems Conference and EE Times are owned by UBM Canon.