(Note: this article was originally Appendix A of our O'Reilly book "Deep Learning: A Practitioner's Approach" (2018), and is included in this series with O'Reilly's permission.)
Cooper: Hey TARS, what's your honesty parameter?
TARS: 90 percent.
Cooper: 90 percent?
TARS: Absolute honesty isn't always the most diplomatic nor the safest form of communication with emotional beings.
Cooper: Okay, 90 percent it is.
Scene from the movie "Interstellar"
Artificial Intelligence (AI) is a discipline as old as the study of philosophy itself. It has evolved over time, yet we still
struggle to find its place in society, let alone grapple with its existential implications for the human race. One of the most salient lines written about the beginnings of AI came from author Pamela McCorduck, who wrote that AI began with "an ancient wish to forge the gods."[McCorduck. 2004. *Machines Who Think* (2nd ed.).]
While McCorduck brought high-minded prose to the topic, much of today's marketing plays on similarly aspirational themes while actually promoting far simpler functionality in terms of business results. Deep learning comes up regularly in discussions of AI, and that makes it difficult to have a directed conversation on the topic.
We added this appendix because the practitioner commonly needs to have a grounded conversation with customers, executives, and managers about what deep learning can do for them and how it fits into the AI landscape. The themes in this appendix are a mixture of the history of the discipline of AI (for context) and discussions we have had with customers and industry peers. We seek to provide you, the practitioner, with the tools to reset the narrative around deep learning, communicate realistic expectations to project stakeholders, and put those stakeholders on ground that will better support their deep learning efforts going forward. Simply put, the narrative around AI has become overhyped, and eventually the market will correct.
This appendix is also meant to be fun and thought provoking in a way that stokes the imagination of the researcher or practitioner, such that we can still dream but with our feet on the ground. We'll work through some basic definitions, provide a short history of the topic of AI, and then look forward to where this all could go. Hopefully we'll help folks avoid the pitfalls of previous cycles of AI interest and, by responsibly setting goals and expectations, make their deep learning projects more successful.
The Story So Far
The main topic of this book, deep learning, is constantly attached to
the term "Artificial Intelligence" in media and marketing. Definitions
are fluid at best and make it difficult to have a discussion on the
topic with other practitioners or stakeholders. Marketing departments
run with the theme of the current hype cycle. Here are some themes from
the not-too-distant past:
Smart grid
The cloud
Big data
When we work in domains like these or in deep learning, as practitioners
we need to delineate what is real from what is marketing hype. That
begins with understanding the history of a subject and having solid
definitions from which to build. Let's begin by reviewing what we
consider to be deep learning, and then we'll dive into the topic of
defining AI.
Defining Deep Learning
In Chapters 1 and 2, we laid out a working definition of deep learning as
neural networks with the following properties:
More neurons than previous neural networks
More complex ways of connecting layers
"Cambrian-like" explosion of computing power to train
Automatic feature learning
These networks perform the same modeling functions as other machine
learning models (regression, classification) but have also been shown to
be good at tasks such as the following:
Generative modeling (e.g., generating art and text)
Speech-recognition technology
Image-recognition technology
Another key driving feature of deep learning is that it is able to
automatically learn features (as opposed to hand-engineering features)
from data in a domain-agnostic fashion. These capabilities of deep
learning are driving many of the new technology applications and have
stimulated the imagination of many folks beyond technology circles. By
itself, however, deep learning does not have higher-level functions such
as "automatically understanding the most interesting question to ask a
dataset," let alone any type of sentient operation.
Defining Artificial Intelligence
The history of AI is fraught with myths, stories, and overzealous
marketing departments trying to tap into the latest technology
narrative. To define AI, we need some context around the history of the
study of intelligence, modern arguments, and how the discipline has
evolved over time.
Using McCorduck's line about forging the gods as our embarkation point, let's explore how the
discipline got its start and then evolved as an industry over the past
60 years.
The Study of Intelligence
The study of intelligence was formally initiated in 1956 at Dartmouth
yet is at least 2,000 years old. The field is based on understanding
intelligent entities and studying topics such as these:
Seeing
Learning
Remembering
Reasoning
These topics are components of what we'd consider intelligent function,
at least to the extent that we understand intelligence (and that is a
relative perspective). We can find the study of intelligence throughout
history if we look for it. The list that follows presents a few of the
building blocks of the study of intelligence over time:
Philosophy (400 BC)
Philosophers began to suggest that the mind works as a mechanical machine and that it encodes knowledge in some form inside the brain.
Mathematics
Mathematicians developed the core ideas of working with statements of logic along with the groundwork for reasoning about algorithms.
Psychology
This field of study is built on the idea that animals and humans have brains that process information.
Computer science
Practitioners came up with the hardware, data structures, and algorithms to support reverse-engineering basic components of the brain.
The study and application of AI techniques we see today are based on
these fundamentals. We typically see the study of AI broken into a focus
on either behaving or thinking in simulated intelligent systems. We see
these applied in areas such as machine learning applications, basic
knowledge systems, and game playing (e.g., Chess and Go).
However, there are limits to what we have been able to implement from the
study of intelligence. There are still no good models of higher-order brain functions
such as consciousness. Science has also yet to determine where in the
brain consciousness resides. This leads some to question if
consciousness is even a real function of the brain, but we'll leave
that to the philosophers and computer scientists to debate.
One of the best books written on the subject of AI (if not the best) is
Stuart Russell and Peter Norvig's [*Artificial Intelligence: A Modern
Approach*](http://aima.cs.berkeley.edu/). We can't recommend this book
enough for you to get a more complete idea of the depth and history of
AI.
Cognitive dissonance and modern definitions
When we're dealing with a topic such as AI that ties
into so many core definitions society depends on, we naturally find
dissonance around setting ground truth to work from. Beau Cronin
writes:
Like the Internet of Things, Web 2.0, and big data, AI is discussed and debated in many different contexts by people with all sorts of motives and backgrounds: academics, business types, journalists, and technologists. As with these other nebulous technologies, it's no wonder the meaning of AI can be hard to pin down; everyone sees what they want to see.
Beau Cronin, http://oreil.ly/2sODKk2
Part of the problem with these viewpoints is that defining intelligence
weaves liberally into defining consciousness, and then by extension into more
philosophical topics (e.g., "what is consciousness?") and religious ones
(e.g., "what is a soul?"). We're touching on some complex territory
at this point, and any discussion about the definition of the soul is
fraught with complications. Our best maps today for the definition of
intelligence are covered with regions marked "Here Be Dragons." Given that we
don't understand natural intelligence, it is even more difficult to
define the artificial variant.
Dr. Jason Baldridge wrote on the topic of AI
and machine learning and talked about how there are conflicting meanings
in play around the topic:
Regardless of any nuanced technical definition of AI, I'm pretty sure that when the public hears "artificial intelligence," they think of conscious nonbiological entities that interact with humans much as we interact with each other.
They don't think of an expert system that can analyze a complex domain-specific problem and provide interesting courses of action, or machine learning algorithms that find fascinating patterns in heaps of data.
Despite this, the general public seems to find it all too easy to mentally close the gap between these two very different levels of technological and scientific accomplishment on the spectrum of AI-related work.
Dr. Jason Baldridge, http://bit.ly/2tUwIt5
Dr. Baldridge goes on to define the difference between deep learning and
a full artificial model of the biological brain:
Despite all this progress, and for better or for worse, these are still far from sentient machines. Deep learning is inspired by the functioning of human neurons, but as far as I'm aware, artificial neural networks as yet have nothing like the architecture of meat-based intelligence.
Dr. Jason Baldridge
So, we struggle with these definitions because they are complicated and
touch on many topics from many viewpoints. Let's take a step toward a
better definition by segmenting the topic and breaking these segments
down into simpler topics.
Francois Chollet recently made the following [salient comment on
Twitter](http://bit.ly/2tUsJNl):
Artificial intelligence is a poorly defined thing, to which many people attribute wildly unrealistic abilities. It's a recipe for trouble.
Francois Chollet, http://bit.ly/2tUsJNl
And then a further tweet:
Part of the problem is that some companies and journalists are hyping it up, blurring the line between sci-fi and reality. Because it sells.
Francois Chollet, http://bit.ly/2uz6vhF
Chollet goes on to say that [we should "define" what we're
talking](http://bit.ly/2u05jXA) about:
When you talk about "AI", *define* what you are talking about. Make explicit what it can do, and what it can't do. Avoid brain analogies.
Francois Chollet, http://bit.ly/2u05jXA
This is sound advice and the industry needs to do a far better job of
locking down these definitions.
What AI is not
Folks who claim machine learning to be AI do the entire computer
science industry a disservice. Machine learning is classification and
regression; it in no way matches up to the lofty aspirations of an
all-knowing, self-aware system that can help the reader with their
marketing problem. As Francois Chollet mentioned earlier, it's just
best (for now) to avoid brain analogies.
Many times, AI is marketed as an application that has all of the answers. It
does not, and it will not anytime soon.
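To underline that point, here is what everyday machine learning actually looks like in code: a hypothetical scikit-learn snippet, with an arbitrary synthetic dataset and an arbitrary model choice, that fits a classifier to labeled feature vectors and scores it on held-out data. There is no reasoning or self-awareness involved; it is curve fitting:

```python
# What "machine learning" usually means in practice: fit a model to
# labeled feature vectors, then predict labels for new ones.
# The dataset and classifier below are arbitrary illustrative choices.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)         # estimate weights from the training data
print(clf.score(X_test, y_test))  # accuracy on held-out data
```

Rebranding a loop like this as "Artificial Intelligence" is precisely the kind of overreach this section warns against.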
Moving the goal posts
Psychologists habitually have had disdain for the metaphor of the human
brain as a computer. [In a 2016 article](http://bit.ly/2tABWqX), Robert
Epstein states:
No matter how hard they try, brain scientists and cognitive psychologists will never find a copy of Beethoven's 5th Symphony in the brain---or copies of words, pictures, grammatical rules or any other kinds of environmental stimuli.
Robert Epstein, http://bit.ly/2tABWqX
Unfortunately, Dr. Epstein has not seen the Convolutional Neural Network
(CNN) filter renders in our chapter on the major architectures of deep
networks. His central argument in the article is stated as:
Your brain does not process information, retrieve knowledge, or store memories. In short: your brain is not a computer.
Robert Epstein, http://bit.ly/2tABWqX
This isn't a new sentiment and has been echoed, by and large, by the
disciplines outside computer science for the past 60 years. As stated in
Russell and Norvig's book on AI:
The intellectual establishment, by and large, preferred to believe that a "machine can never do X."
"Introduction to AI", Russell and Norvig
They go on to demonstrate examples in which AI researchers have
systematically responded by demonstrating one X after another. Thus, the
study of AI and definitions for the discipline have long suffered from
the industry "moving the goal posts" with respect to what they really
mean by "Artificial Intelligence."
Segmenting the definitions of AI
It's useful to break down the different viewpoints of how folks talk
about AI today and list them. [A good writeup](http://oreil.ly/2sODKk2)
by Beau Cronin uses the following four major segmented definitions of
AI:
AI as interlocutor
HAL, Siri, Cortana, Watson
Conversational intelligence
Limited reasoning
AI as android
Machine in the form of a humanoid
AI as mechanically embodied
Examples would be The Terminator or C-3PO
Similar to the interlocutor but in a humanoid body
AI as the reasoner
Early AI pioneers were drawn to more refined and high-minded
tasks---playing chess, solving logical proofs, and planning
complex tasks
Still struggle with tasks simple for children
AI as the big data learner
More recent definition
Seeing many people talk about building "AI models"
Let's now look at these segmented definitions with a critical eye.
AI as the interlocutor and AI as the big data learner are both recent
definitions due to the incorporation of many machine learning techniques
into engineered commercial products. AI as the interlocutor can perform
basic functions based on voice recognition. It's the combination of
voice-to-text machine learning (or deep learning) and Natural Language
Processing (NLP) technology to determine what the user wants to
accomplish. AI as the interlocutor has limited reasoning capabilities
because it typically relies on a separate system that takes the
voice-to-text and NLP-processed results as input. That separate system
is many times as basic as a classic rule-based system or "expert
system."
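A rough sketch of that wiring is shown below. Every function here is a hypothetical placeholder rather than a real API; the point is simply that the apparent "intelligence" is a pipeline of narrow, separately engineered components handing results to a rule-based back end:

```python
# A hypothetical sketch of an "AI as interlocutor" pipeline.
# None of these functions are real APIs; each stands in for a narrow,
# separately engineered subsystem (speech-to-text, NLP intent parsing,
# and a rule-based back end that decides what to do).

def speech_to_text(audio_bytes: bytes) -> str:
    """Placeholder for a trained speech-recognition model."""
    return "what is the weather in chicago"

def parse_intent(utterance: str) -> dict:
    """Placeholder for NLP that extracts an intent and its slots."""
    return {"intent": "get_weather", "city": "chicago"}

def rule_based_backend(request: dict) -> str:
    """Classic expert-system-style dispatch: if/then rules, no learning."""
    if request["intent"] == "get_weather":
        return f"Looking up the forecast for {request['city']}."
    return "Sorry, I can't help with that."

def handle(audio_bytes: bytes) -> str:
    text = speech_to_text(audio_bytes)
    request = parse_intent(text)
    return rule_based_backend(request)

print(handle(b"..."))  # -> "Looking up the forecast for chicago."
```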
Even though users might initially be entertained by conversation with
the system and even fooled by its "intelligence," they quickly realize
the limitations of the interactions. Ultimately, AI as the interlocutor
is a well-engineered combination of machine learning techniques that
over time became just good enough to progressively be wired further into
useful consumer products.
AI as the android is an interesting embodiment of the concept yet is
ultimately dependent on a network of machine learning subsystems just as
the interlocutor is. AI as the reasoner is a classical implementation of
AI, yet industry interest in integrating it into products has plateaued
in recent years. It continues to be a core component in
intelligent systems that wire together multiple components to produce
value, as in the interlocutor example.
"AI as the big data learner" is a troublesome use of the term that has
gained popularity in the past few years (2010--2015). Many times, a
marketing department will rebrand a product's use of a basic machine
learning technique over customer data as "Artificial Intelligence."
Worse, other times the product is doing basic business intelligence
functions and is also lumped into this category. The practice of machine
learning (or deep learning) alone should not be considered a type of AI.
It is, however, a useful subsystem of an intelligent system.
You should show restraint in labeling a machine learning model---deep
learning model or not---as "Artificial Intelligence." Overselling
capabilities early might attract funding but will hamper your project in
the longer term.
A fifth aspirational definition of AI
Another way to frame the question "What is Artificial Intelligence?"
is to ask a different question: what would definitively end the debate
over "What is AI"?
If we were presented with a transcendent, conscious, self-aware
intelligence that understood our world (and data) far better than any
human, we'd probably call it "true Artificial Intelligence." Or, an
alien.
Unfortunately, chasing the mirage of AI is what brings on the unrealistic
expectations that always crash down on the industry, no matter how much
tangible progress has been made.
The AI winters
The AI industry has experienced multiple periods of up and down
interest and funding. These down periods of interest have been the
result of the sector being unrealistically overhyped followed by a cycle
of predictably underwhelming results. This down period is referred to as
an "AI winter" and involves cuts in academic research funding, reduced
venture capital interest, and stigma in the marketing realm around
anything connected to the term "artificial intelligence."
The result of these cycles is that the good technical advances (e.g.,
voice recognition or optical character recognition) are rebranded and
integrated into other products.
AI Winter I: 1974--1980
The lead-up to the first [AI
winter](https://en.wikipedia.org/wiki/AI_winter) saw machine translation
fail to live up to the hype. Connectionism (neural networks) interest
waned in the 1970s, and speech understanding research overpromised and
underdelivered.
In 1973, DARPA cut academic research in the AI domain. The Lighthill
report in the United Kingdom harshly criticized the field and research
funding was further curtailed.
AI Winter II: Late 1980s
In the late 1980s and early 1990s, there was overpromotion of
technologies such as expert systems and LISP machines, both of which
failed to live up to expectations. The Strategic Computing Initiative
canceled new spending at the end of this cycle. Japan's fifth-generation
computer project also failed to meet its goals.
The common patterns of AI Winters
The common pattern of these winters is an industry overhyped by
a series of promising successes. After conditions percolate enough, the
hype reaches its peak and money for academic and industrial
research pours into the AI domain. A few real projects based on solid
technology reach at least part of their goals and solve real problems.
Most of the marketed promises go unfulfilled, however, and the Trough of
Disillusionment looms.
Winter kills the weak.
A few interesting applications slowly emerge from the trough, are
rebranded ("voice recognition"), and are integrated into other
projects as features, generally under the "latent intelligence" class
of improvements. We've seen this with the following:
Informatics
Machine learning
Knowledge-based systems
Business rules management
Cognitive systems
Intelligent systems
Computational intelligence
The name changes might be partly because practitioners consider their
fields to be fundamentally different from AI. It is also true that the new names help
to procure funding by avoiding the stigma of the false promises attached
to the name "Artificial Intelligence."
Here's an interesting note from an AI meeting in the 1980s:
At the meeting, Roger Schank and Marvin Minsky---two leading AI researchers who had survived the "winter" of the 1970s---warned the business community that enthusiasm for AI had spiraled out of control in the '80s and that disappointment would certainly follow. Three years later, the billion-dollar AI industry began to collapse.
Wikipedia, https://en.wikipedia.org/wiki/AI_winter
What Is Driving Interest in AI Today?
Three major contributors are driving interest in AI today:
The big jump in computer-vision technology in the late 2000s
The big data wave of the early 2010s
Advancements in applications of deep learning by top technology firms
In 2006, Geoff Hinton and his team at the University of Toronto
published a key paper on Deep Belief Networks (DBNs).[Hinton, Osindero,
and Teh. 2006. ["A fast learning algorithm for deep belief
nets."](https://www.cs.toronto.edu/~hinton/absps/fastnc.pdf)]
This provided the industry with a spark of creativity on what could
possibly improve the state of the art. We've seen a tsunami of deep
learning publications at top journals over the succeeding decade. These
publications began improving top scores for accuracy in many areas
beyond just computer vision, and deep learning took over the applied
machine learning landscape in short order.
Large web firms such as Google, Facebook, and Amazon all watch top
journals for the best ideas. These firms saw the developments by Yann
LeCun, Hinton, and others and began implementing these ideas in their
own pipelines. These new applications (e.g., better face detection, or
Alexa at Amazon) were widely recognized in the technology media.
In the mid-2000s, much of the storage and ETL technology developed on the
West Coast by the large web firms began to be open sourced as projects
such as Hadoop and MongoDB.
Just as these web firms (Google, Yahoo!, etc.) had ramped up their
storage and ETL systems, they were also building out new machine
learning and deep learning techniques to better take advantage of these new
and larger datasets.
Traditional *Fortune 500* enterprise companies were bringing online
large distributed systems in the early 2010s to hold their growing
transactional datasets. These enterprise companies tend to follow what
the West Coast web firms do by about 5 to 10 years. This has given rise
to an interest in deep learning in the *Fortune 500* and in systems that
allow the traditional enterprise to better exploit their investments in
big data.
If we combine the aforementioned three factors and then mix in the very
public successes of projects such as Watson (winning *Jeopardy*),
AlphaGo (winning Go), and Google's self-driving cars, we create an
environment in which enthusiasm outstrips the reality of the road ahead.
It's high tide in coverage and enthusiasm around AI. Unfortunately, in
these types of cycles, we also see the tide go back out eventually.
There are real applications using complex datasets for deep learning.
Here are just some of these applications:
Healthcare (e.g., predicting patient length of stay)
Retail (e.g., analyzing the shopping experience)
Telecom/Financial Services (e.g., analyzing transactions for
fraudulent patterns)
We've touched on some of the aforementioned use cases (and more) in
this book. When you, as a practitioner, promote deep learning and AI, we
recommend finding real use cases such as these and standing on "solid
ground." We mention solid ground here metaphorically because eventually
the tide will go out and we hope our fellow practitioners will have
something to stand on when it does.
Winter Is Coming
Deep learning in and of itself has been grounded in reality over the
course of this book. It is a framework for performing industry-leading
neural network modeling on complex data types. Deep learning, by itself,
would not fulfill our aforementioned fifth aspirational definition of
AI, so we don't have much to worry about on that front.
We are seeing systems marketed in 2016 as "Artificial Intelligence for
X," which are plainly using basic machine learning. AlphaGo was a
tremendous advancement in game playing, but as we saw with Deep Blue and
Chess, game-playing advances do not always translate easily to business
use cases.[*Jeopardy* does seem to be a "solved problem," however.]
Marketing departments are setting up similar scenarios as occurred in
the previous two AI winters, unfortunately. The coals of the real
advances in the domain, as in previous winters, will keep the true
enthusiasts and hardcore researchers warm during the cold of the coming
third AI winter.
Next in Series
Why Does a Company Need to Grow?
Why is growth a prized metric in company operations?