A common view in the press and in artificial intelligence research is that sentient and intelligent machines are just on the horizon. After all, machines diagnose and treat illnesses, drive cars, grow our food, manufacture and deliver new products, distinguish pictures, and play games better than we do. How much longer can it be before they surpass our intelligence and take our jobs?

Before we decide if machines can surpass our intelligence, let us first define two terms that will help us get a better handle on this topic: Weak AI and Strong AI.

Weak AI (or Narrow AI) refers to an AI system that is designed and trained for a specific task. Virtual personal assistants, such as Apple's Siri, are a form of weak AI. Weak AI underpins many commercial services such as trip planning, shopper recommendation systems, and ad targeting, and is finding important applications in medical diagnosis, education, and scientific research.

Strong AI (sometimes called Artificial General Intelligence, or AGI) refers to an AI system with generalized human cognitive abilities so that when presented with an unfamiliar task, it has enough intelligence to find a solution. The 2016 White House report on AI takes an appropriately skeptical view of achieving Strong AI anytime soon: “Expert opinion on the expected arrival date of AGI ranges from 2030 to centuries from now.”¹

So if Strong AI is still far off in the future, how can we make progress toward this lofty goal? A new book, The Book of Why: The New Science of Cause and Effect by Pearl and Mackenzie, provides some answers. Their answer: Strong AI systems must contain causal models. Why? Current machine learning systems (including those with deep learning networks) operate almost entirely in an associational mode; that is, they are driven by a stream of observations to which they attempt to fit a function. The resulting systems are brittle, special-purpose, and inscrutable, even to their own architects, and they lack flexibility, adaptability, and transparency.²

Consider the simple question posed by Pearl: “What happens if we double the price of floss?” Why can’t we simply answer the question by observation, that is, by collecting enough data? The reason, Pearl states, is that on previous occasions the price may have been higher for different reasons. For example, the product may have been in short supply, and every other store also had to raise its price. If you had data on the market conditions that existed on those previous occasions, perhaps you could figure it out, but what data do you need? And then, how would you figure it out? Pearl states that these are exactly the questions the science of causal inference allows us to answer.
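To make the floss example concrete, here is a minimal simulation sketch. It is not Pearl's model: the structural equations, the 30% shortage rate, and all prices and coefficients are invented for illustration. The point it tries to show is that when a shortage pushes every store's price up at once, the sales gap seen in historical data differs sharply from what happens when we alone double our price.

```python
import random
from statistics import mean

# Hypothetical structural model; every number below is an assumption.
def competitor_price(shortage):
    # During a shortage, every other store raises its price too.
    return 4.0 if shortage else 2.0

def sales(price, comp_price, noise):
    # Customers leave when we charge more than competitors do.
    return 200.0 - 40.0 * (price - comp_price) - 5.0 * price + noise

def simulate(rng, n, do_price=None):
    rows = []
    for _ in range(n):
        shortage = rng.random() < 0.3
        comp = competitor_price(shortage)
        # Observational regime: we follow the market price.
        # Interventional regime: we set the price ourselves, do(price).
        price = comp if do_price is None else do_price
        rows.append((price, sales(price, comp, rng.gauss(0.0, 5.0))))
    return rows

rng = random.Random(0)

# What the historical data says about high-price vs. low-price days.
observed = simulate(rng, 10_000)
obs_low = mean(s for p, s in observed if p == 2.0)
obs_high = mean(s for p, s in observed if p == 4.0)
observed_gap = obs_high - obs_low

# What actually happens if we force the price, whatever the market does.
do_low = mean(s for _, s in simulate(rng, 10_000, do_price=2.0))
do_high = mean(s for _, s in simulate(rng, 10_000, do_price=4.0))
interventional_gap = do_high - do_low

print(f"observed gap:       {observed_gap:.1f}")
print(f"interventional gap: {interventional_gap:.1f}")
```

Under these assumed numbers, the historical data suggests that doubling the price costs roughly 10 units of sales, while actually intervening costs roughly 90. The data is not wrong; it simply answers a different question, because past high prices always came bundled with a shortage.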

So, collecting data and finding associations in the data would not enable one to answer questions such as “What if we do X?” But what about “How” questions? For instance, a manager may tell us that we have too many paper towels in our warehouse. He asks, “How can we sell them, and what price should we set?” Pearl states that answering such questions requires doing or intervening, which we want to perform mentally before we decide whether and how to do it in real life, and that, he claims, requires a causal model. Consider, for example, taking Tylenol to reduce a fever. According to Pearl, we are intervening on one variable (the quantity of Tylenol in our body) to affect another one (our fever status). So, if we are correct in our causal belief about Tylenol, the “outcome” variable will respond by changing from “fever” to “no fever.”

While reasoning about doing or intervening is important, it is not sufficient to answer “Why” or “What would have happened if” questions. Consider the question: My fever is gone now, but why? Was it the Tylenol I took? The food that I ate? The good news I heard? Pearl states that to answer these types of questions, we must go back in time, change history, and ask, “What would have happened if I had not taken the Tylenol?” Data alone cannot tell us what will happen in a counterfactual or imaginary world, in which some observed facts are bluntly negated. Yet the human mind does make such explanation-seeking inferences, reliably and repeatably. Pearl states that this is the ability that most distinguishes human intelligence from animal intelligence, as well as from model-blind versions of AI and machine learning.
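The Tylenol question can be sketched as a tiny structural causal model. The recipe used here is the one usually described for such models as abduction, action, and prediction. Everything specific in the sketch is an assumption made for illustration: the two hidden background factors, their prior probabilities, and the structural equation for the fever.

```python
from itertools import product

# Hypothetical priors on two unobserved background factors (invented numbers).
P_RESOLVES_ON_OWN = 0.3  # the fever would have broken without any drug
P_DRUG_EFFECTIVE = 0.8   # Tylenol works for this person on this occasion

def fever_gone(took_tylenol, resolves_on_own, drug_effective):
    # Assumed structural equation for the outcome variable.
    return resolves_on_own or (took_tylenol and drug_effective)

def prior(resolves_on_own, drug_effective):
    p_own = P_RESOLVES_ON_OWN if resolves_on_own else 1 - P_RESOLVES_ON_OWN
    p_eff = P_DRUG_EFFECTIVE if drug_effective else 1 - P_DRUG_EFFECTIVE
    return p_own * p_eff

# Step 1 (abduction): keep only the background states consistent with what
# we observed, namely that Tylenol was taken and the fever went away.
posterior = {}
for own, eff in product([True, False], repeat=2):
    if fever_gone(True, own, eff):
        posterior[(own, eff)] = prior(own, eff)
total = sum(posterior.values())
posterior = {state: p / total for state, p in posterior.items()}

# Step 2 (action): rewrite history by setting took_tylenol to False.
# Step 3 (prediction): recompute the outcome in each surviving state.
p_gone_anyway = sum(
    p for (own, eff), p in posterior.items() if fever_gone(False, own, eff)
)
p_tylenol_was_needed = 1 - p_gone_anyway

print(f"P(fever gone even without Tylenol): {p_gone_anyway:.2f}")
print(f"P(Tylenol was needed):              {p_tylenol_was_needed:.2f}")
```

With these assumed priors, the model says there is about a 0.35 chance the fever would have gone away anyway, so about a 0.65 chance the Tylenol was what made the difference. Note that the answer depends on the structural equation and the hidden factors, not just on the recorded observations, which is why data without a model cannot supply it.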

Pearl argues that a key component in creating human-like intelligence is the mastery of causation, something a 3-year-old child does very well. In a later post I will talk about how machines can acquire causal knowledge.

There is a long history of excessive optimism about AI. For example, AI pioneer Herb Simon predicted in 1957 that computers would outplay humans at chess within a decade, an outcome that required 40 years to occur. Early predictions about automated language translation also proved wildly optimistic, with the technology only becoming usable (and by no means fully fluent) in the last several years. It is tempting but incorrect to extrapolate from the ability to solve one particular task to imagine machines with a much broader and deeper range of capabilities and to overlook the huge gap between narrow task-oriented performance and the type of general intelligence that people exhibit [2016 White House report on AI].
Consider that even minor changes to street sign graphics can fool machine learning algorithms into thinking the signs say something completely different. Imagine a driverless car whose vision system no longer recognizes a stop sign as a stop sign, but instead reads it as a speed limit sign for 45.

A First Step Towards Strong AI