Robotics & AI: A Systems Approach

In my last installment on this topic, we began our exploration of Artificial Intelligence by developing a definition of what human intelligence really is, as well as its foundation and course of development. Many of the world's leading high technology companies are attempting to create Mind from a completely non-biological and non-evolutionary direction through massing silicon, server racks, and software. They are intrigued with the myriad possibilities suggested by the creation of sentient robots – machines that can think. We'll explore two of those system-level efforts in this article.


The apple cannot be stuck back on the Tree of Knowledge; once we begin to see, we are doomed and challenged to seek the strength to see more, not less. – Arthur Miller

In 2014, Google, the eponymous search engine company, acquired DeepMind Technologies for something over $400M. DeepMind was developing a variant of artificial neural networks known as Deep Learning, in which a machine uses a library of models in combination with linear and nonlinear computational algorithms to capture patterns in data. The approach has become increasingly popular in the development of Machine Vision and Voice Recognition/Activation technology. The Google X R&D group sought DeepMind's capabilities for incorporation into their own AI initiatives, known internally as Google Brain.

AI is not actually a particularly new endeavor for Google. One can conceptualize its search engine as a kind of machine learning software. As a consequence, Google Brain and its Deep Learning research touch upon just about everything the firm is doing.

Various Google products have already benefited from this research. For instance, Google Maps no longer requires teams of people manually sorting through street-level photos and gathering building numbers to verify unique addresses. This has instead become a Machine Vision task. Voice Recognition has been integrated into Android and image search into Google+, with these capabilities soon to be added to Google Translate.

In one program, Google combined image recognition and text translation to capture new images and have the machine figure out an appropriate text label for the new image. The system appears to be successful about two-thirds of the time, though the research scientists on the project haven't yet figured out how it's actually doing this.

These activities (including the various AI-like functions of Google Now found in Android mobile phones) are all directly supported by the basic AI capabilities residing on Google's servers – and herein lies the evidence of the fundamental flaws in the company's approach to AI. Google Brain researchers consider it a remarkable achievement that they induced a network of 16,000 processor cores to examine 10M unlabeled images using Machine Vision capabilities and learn on its own to recognize a cat. The defect in this method is that this is not at all how the human mind works. It doesn't take a human infant millions or billions of instances to recognize a cat or dog and distinguish them from each other. In fact, it only takes about a dozen attempts or fewer.

This suggests that the Google researchers are not anywhere close to building a true AI. Google Brain appears to have quite a long road to travel before it can be said that it has achieved even basic Awareness, let alone Perception/Cognition or Consciousness.


As the births of living creatures at first are ill-shapen, so are all Innovations, which are the births of time. – Francis Bacon

One of the first things that becomes obvious when scrutinizing Microsoft's work in AI and Robotics is that their mission is to beat Google. The Cortana voice-activated digital assistant, which competes with Google Now and Apple Siri, also drives much of the AI effort for Microsoft. It reads and understands email, supports Windows 8.1 search capability, and can even be presented with an image captured by the phone's camera and asked to identify it.

The many functions of Cortana are backed up by Adam, Microsoft's counterpart to Google Brain. For instance, the image identification utility described above is performed by Adam, with Cortana serving as the interface to the company “AI.”

Microsoft researchers claim Adam's machine vision requires 30x fewer images than Google Brain's to correctly identify an object. If true, then Microsoft AI developers have made a breakthrough.

Architecturally, Adam differs markedly from Google's AI implementation. Adam is a neural network optimized for Microsoft's Azure Cloud services. The servers operate independently and pool their results asynchronously. The architecture is scalable – the more servers added to a task, the greater the accuracy of results. Microsoft is also spending significant resources and effort on optimizing server GPU throughput, bandwidth and algorithm partitioning – even using high-end FPGAs as offload engines.
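To make "operate independently and pool their results asynchronously" a little more concrete, here is a minimal Python sketch of the general idea. It is not Adam's code or Microsoft's design: the names SharedWeight, worker and train are hypothetical, threads stand in for servers, and the "model" is a single weight fitted to the toy relationship y = 3x. Each worker computes a gradient on its own slice of the data and pushes an update whenever it is ready, with no step at which the workers wait for one another.

```python
import threading


class SharedWeight:
    """One shared model parameter (hypothetical stand-in for a parameter store)."""

    def __init__(self, value=0.0):
        self.value = value
        self._lock = threading.Lock()

    def read(self):
        # No lock on reads: a worker may act on a slightly stale value.
        return self.value

    def apply(self, delta):
        # The lock protects only the write itself; there is no global barrier.
        with self._lock:
            self.value += delta


def worker(shared, shard, steps, lr):
    """Repeatedly fit y = w * x on this worker's own shard of the data."""
    for _ in range(steps):
        w = shared.read()
        # Gradient of the mean squared error over this shard only.
        grad = sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)
        shared.apply(-lr * grad)


def train(num_workers=4, steps=300, lr=0.005):
    # Toy data (an assumption for illustration): y = 3x for x = 1..8.
    data = [(float(x), 3.0 * x) for x in range(1, 9)]
    shards = [data[i::num_workers] for i in range(num_workers)]
    shared = SharedWeight()
    threads = [
        threading.Thread(target=worker, args=(shared, shard, steps, lr))
        for shard in shards
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.value


if __name__ == "__main__":
    print(train())  # settles very close to 3.0
```

The trade-off in this style of training is that workers sometimes act on out-of-date values, which costs a little precision per step but removes the waiting that a lock-step design would impose.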

The Forest & the Trees

Progress has not followed a straight ascending line, but a spiral with rhythms of progress and retrogression, of evolution and dissolution. – Goethe

There is no doubt that both Google and Microsoft see a promising future for themselves in Robotics & AI and are energetically pursuing both intrinsic and applied R&D in all their assorted fields. Yet it is equally clear that their current efforts, focused as they are on various forms of neural networks and the complex, non-linear mathematical concepts which regulate them, are innately misguided, akin to the ironic “can't see the forest for the trees” conundrum.

The picture below illustrates the deficiencies in the current approaches of both companies. The diagram depicts the algorithmic architecture of Microsoft's image recognition/machine vision design.

The left three-fourths of the diagram are a variant on Deep Learning networks known as a Convolutional Network. The right quarter, though, is based on Gaussian probability distributions. Microsoft researchers are still trying to reduce a fundamentally non-linear process to a linear model – an error that has frustrated the likes of Benoit Mandelbrot and Nassim Taleb to the point that they've written entire books about it.

Though Microsoft's approach is 30 times more efficient than Google's (recognizing images after 'just' 300,000 or so instances instead of 10 million), it's still nowhere near the intellectual capacity of the human brain. In reality, Microsoft and Google are not actually working on AI. They are instead developing very sophisticated automatons that run complex software routines for low-level tasks. As a result, the 'AI' development efforts at both companies are still stuck at something less than an insect level of intelligence – that of Awareness.

There are fundamental, foundational capabilities missing from the Adam and Brain programs:

  1. A base set of values akin to the instinctive impulses residing in the cerebellum and brain stem. By this I don't mean Isaac Asimov's three laws of robotics, which aren't anywhere near sufficient to serve such a function. There needs to be the synthetic equivalent of things such as the fight-or-flight response, the need to belong to a group, the warm sense of satisfaction that one gets from a slice of prime rib or a well-made slice of cheesecake, etc.
  2. An interpretive mechanism that can weigh fresh data inputs against base proclivities and their relative strengths, adjusting them dynamically (within certain ranges).
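One possible reading of these two capabilities, sketched in Python purely for illustration: every name here (Drive, interpret, the three example drives, the rate parameter) is a hypothetical of mine, not something taken from either company's work. Each base proclivity carries a strength confined to a fixed range, and the interpretive step scores a fresh input against every drive before nudging the strengths, never beyond their bounds.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """A base proclivity whose strength may drift only within [lo, hi]."""
    name: str
    strength: float
    lo: float
    hi: float

    def adjust(self, delta):
        # Clamp so that experience can tune a drive but never erase it.
        self.strength = min(self.hi, max(self.lo, self.strength + delta))


def interpret(drives, stimulus, rate=0.1):
    """Weigh a stimulus against every drive, then adapt the drives.

    stimulus maps a drive name to a signed relevance in [-1, 1];
    drives the stimulus does not mention are treated as unaffected.
    Returns the name of the dominant drive and the per-drive scores.
    """
    scores = {}
    for drive in drives:
        relevance = stimulus.get(drive.name, 0.0)
        scores[drive.name] = drive.strength * relevance
        drive.adjust(rate * relevance)
    dominant = max(scores, key=lambda name: abs(scores[name]))
    return dominant, scores


# Illustrative values only.
drives = [
    Drive("safety", strength=0.8, lo=0.5, hi=1.0),
    Drive("belonging", strength=0.5, lo=0.2, hi=0.9),
    Drive("appetite", strength=0.3, lo=0.1, hi=0.6),
]
dominant, scores = interpret(drives, {"safety": -0.9, "appetite": 0.4})
print(dominant)  # safety
```

The bounds are the point of the sketch: repeated inputs can strengthen or weaken a drive, but only within the limits set for it at the outset.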

Without these two elemental functionalities underlying their AI work, the Adam and Brain teams will increasingly find themselves chasing their own tails. Their efforts will never generate a true, independent Consciousness in a machine that is capable of self-directed learning.

In the next installment, we'll look at two more R&D programs at leading systems houses.
