Artificial intelligence, like nanotechnology, will reshape our future. Nanotechnology
means thorough, inexpensive control of the structure of matter, and early
assemblers will enable us to build better assemblers: this will make it
a powerful and self-applicable technology. Artificial intelligence (that
is, genuine, general-purpose artificial intelligence) will eventually bring
millionfold-faster problem-solving ability, and, like nanotechnology, it
will be self-applicable: early AI systems will help solve the problem of
building better, faster AI systems.
AI differs from nanotechnology in that its basic principles are not yet
well understood. Although we have the example of human brains to show that
physical systems can be (at least somewhat) intelligent, we don't understand
how brains work or how their principles might be generalized. In contrast,
we do understand how machines and molecules work and how to design many
kinds of molecular machines. In nanotechnology, the chief challenge is developing
tools so that we can build things; in AI, the chief challenge is knowing
what to build with the tools we have.
To get some sense of the possible future of AI--where research may go, and
how fast--one needs a broad view of where AI research is today. This article
gives a cursory survey of some major areas of activity, giving a rough picture
of the nature of the ideas being explored and of what has been accomplished.
It will inevitably be superficial and fragmentary. For descriptive purposes,
most current work can be clumped into three broad areas: classical AI, evolutionary
AI, and neural networks.
Since its inception, mainstream artificial intelligence work has tried to
model thought as symbol manipulation based on programmed rules. This field
has a huge literature; good sources of information include a textbook (Artificial
Intelligence by Patrick Winston, Addison-Wesley, 1984) and two compilations
of papers (Readings in Artificial Intelligence, Bonnie Lynn
Webber, Nils J. Nilsson, eds., Morgan Kaufmann, 1981, and Readings
in Knowledge Representation, Ronald J. Brachman, Hector J. Levesque,
eds., Morgan Kaufmann, 1985).
The standard criticism of AI systems of this sort is that they are brittle,
rather than flexible. One would like a system that can generalize from its
knowledge, know its limits, and learn from experience. Existing systems
lack this flexibility: they break down when confronted with problems outside
a narrow domain, and they must be programmed in painful detail. Work continues
on alternative ways to represent knowledge and action, seeking systems with
greater flexibility and a measure of common sense. (A learning program called
Soar, developed by Allen Newell of Carnegie Mellon University in collaboration
with John Laird and Paul Rosenbloom, is prominent in this regard.) In the
meantime, systems have been built that
can provide expert-level advice (diagnosis, etc.) within certain narrow
domains. Though not general and flexible, they represent achievements of
real value. Many of these so-called "expert
systems" are in commercial use, and many more are under construction.
When one reads "artificial intelligence" in the media, the term
typically refers to expert systems. If this were the whole of AI, it would
still be important, but not potentially revolutionary. The great potential
of AI lies in systems that can learn, going beyond the knowledge spoon-fed
to them by human experts.
The most flexible and promising learning schemes are based on evolutionary
processes, on the variation and selection of patterns. Doug
Lenat's EURISKO program used this principle, applying heuristics (rules
of thumb) to solve problems and to vary and select heuristics. It achieved
significant successes, but Lenat concluded that it lacked sufficient initial
knowledge. He has since turned to a different project, CYC,
which aims to encode the contents of a single-volume encyclopedia--along
with the commonsense knowledge needed to make sense of it--in representations
of the sort used in classical AI work.
Another approach to evolutionary AI, pioneered by John Holland, involves
classifier systems modified by genetic algorithms. A classifier system uses
a large collection of rules,
each defined by a sequence of ones, zeroes, and don't-care symbols. A rule
"fires" (produces an output sequence) when its sensor-sequence
matches the output of a previous rule; a collection of rules can support
complex behavior. Rules can be made to evolve through genetic algorithms,
which make use of mutation and recombination (like chromosome crossover
in biology) to generate new rules from old. This work, together with a broad
theoretical framework, is described in the book Induction: Processes
of Inference, Learning, and Discovery (by John H. Holland, Keith
J. Holyoak, Richard E. Nisbett, and Paul R. Thagard, MIT Press, 1986). So
far as I know, these systems are still limited to research use.
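The matching and variation steps described above can be sketched in a few lines of Python. This is only an illustrative toy, not Holland's actual system: the `#` don't-care symbol, the three-symbol rule length, and the function names `matches`, `fire`, `crossover`, and `mutate` are all assumptions made for the example, and the strength-based rule selection that a full classifier system needs is omitted.

```python
import random

def matches(condition: str, message: str) -> bool:
    """A condition matches when each position equals the message or is '#' (don't care)."""
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message)
    )

def fire(rules, message):
    """Return the output of every rule whose condition matches the message."""
    return [output for condition, output in rules if matches(condition, message)]

def crossover(parent_a: str, parent_b: str, point: int) -> str:
    """Single-point recombination: the head of one parent joined to the tail of the other."""
    return parent_a[:point] + parent_b[point:]

def mutate(condition: str, rate: float, rng: random.Random) -> str:
    """With probability `rate`, replace each symbol by a random draw from '0', '1', '#'."""
    return "".join(
        rng.choice("01#") if rng.random() < rate else symbol
        for symbol in condition
    )

# Hypothetical rule set: (condition, output) pairs.
rules = [("1#0", "011"), ("0##", "110"), ("11#", "000")]
print(fire(rules, "110"))            # the first and third rules match: ['011', '000']
print(crossover("1#0", "0##", 1))    # '1##'
```

In a fuller system, the outputs of fired rules would be posted as new messages for other rules to match, and the genetic operators would be applied preferentially to rules that have proved useful.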
Mark S. Miller and I have
proposed an agoric approach to evolving software, including AI software.
If one views complex, active systems as being composed of a network of active
parts, the problem of obtaining intelligent behavior from the system can
be recast as the problem of coordinating and guiding the evolution of those
parts. The agoric approach views this as analogous to the problem of coordinating
economic activity and rewarding valuable innovation; accordingly, it proposes
the thorough application of market mechanisms to computation. The broader
agoric open systems approach would invite and reward human involvement
in these computational markets, which distinguishes it from the "look
Ma--no hands!" approach to machine intelligence. These ideas are described
in three papers ("Comparative
Ecology: A Computational Perspective," "Markets
and Computation: Agoric Open Systems," and "Incentive
Engineering for Computational Resource Management") included in
a book on the broader issues of open computational systems (The Ecology
of Computation, B. A. Huberman, ed., in Studies in Computer Science
and Artificial Intelligence, North-Holland, 1988).
Ted Kaehler of Apple Computer
has used agoric concepts in an experimental learning system initially intended
to predict future characters in a stream of text (including written dates,
arithmetic problems, and the like). Called "Derby," in part because
it incorporates a parimutuel betting system, this system also makes use
of neural network principles.
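For readers unfamiliar with parimutuel betting, the following Python sketch shows the basic payout rule: all stakes go into one pool, and the pool is divided among those who backed the winning outcome in proportion to their stakes. It is not a description of how Derby itself works; the function name `parimutuel_payouts`, the bettor names, and the idea of betting on the next character are assumptions chosen purely for illustration.

```python
def parimutuel_payouts(bets, winner):
    """Split the whole pool among winning bets in proportion to stake.

    bets: list of (bettor, predicted_outcome, stake) tuples.
    Returns a dict mapping each winning bettor to a payout.
    """
    pool = sum(stake for _, _, stake in bets)
    winning_stake = sum(stake for _, outcome, stake in bets if outcome == winner)
    if winning_stake == 0:
        return {}  # nobody backed the winner; this sketch simply pays no one
    payouts = {}
    for bettor, outcome, stake in bets:
        if outcome == winner:
            payouts[bettor] = payouts.get(bettor, 0.0) + pool * stake / winning_stake
    return payouts

# Three hypothetical predictors bet on the next character in a text stream.
bets = [("a", "e", 2.0), ("b", "e", 1.0), ("c", "t", 3.0)]
print(parimutuel_payouts(bets, "e"))  # {'a': 4.0, 'b': 2.0}
```

A scheme of this kind rewards predictors that are right when most others are wrong, since a correct minority shares a pool funded largely by the mistaken majority.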
Classical AI systems work with symbols and cannot solve problems unless
they have been reduced to symbols. This can be a serious limitation.
For a machine to perceive things in the real world, it must interpret messy
information streams--taking information representing a sequence of sounds
and finding words, taking information representing a pattern of light and
color and finding objects, and so forth. To do this, it must work at a pre-symbolic
or sub-symbolic level; vision systems, for example, start their work by
seeking edges and textures in patterns of dots of light that individually
symbolize nothing.
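As a rough illustration of such sub-symbolic processing, the Python sketch below marks an edge wherever two horizontally adjacent brightness values differ sharply. The function name `horizontal_edges`, the threshold, and the tiny test image are assumptions for the example; real vision systems use far more elaborate filters.

```python
def horizontal_edges(image, threshold):
    """Mark an edge between horizontally adjacent pixels whose brightness differs by >= threshold.

    image: list of rows, each a list of brightness numbers.
    Returns rows of booleans, one entry per adjacent pixel pair.
    """
    return [
        [abs(row[x + 1] - row[x]) >= threshold for x in range(len(row) - 1)]
        for row in image
    ]

# A dark region on the left and a bright region on the right.
image = [
    [0, 0, 9, 9],
    [0, 1, 9, 8],
]
print(horizontal_edges(image, threshold=5))
# [[False, True, False], [False, True, False]]
```

No single pixel means anything here; the edge appears only in the relationship between neighbors, and the same simple comparison must be repeated across the whole image.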
Such tasks typically require a huge mass of
simple, repetitive operations before patterns can be seen in the input data.
Conventional computers simply do one operation at a time, but these operations
can be done by many simpler devices operating simultaneously. Indeed, these
operations can be done as they are in the brain--by neurons (or neuron-like
devices), each responding in a simple way to inputs from many neighbors,
and providing outputs in turn.
Recent years have seen a boom in neural network research. Different projects
follow diverse approaches, but all share a "connectionist" style
in which significant patterns and actions stem not from symbols and rules,
but from the cooperative behavior of large numbers of simple, interconnected
units. These units roughly resemble neurons, though they are typically simulated
on conventional computers, and the resemblance in behavior is often very
rough indeed. Neural networks have shown many brain-like properties, performing
pattern recognition, recovering complete memories from fragmentary hints,
tolerating noisy signals or internal damage, and learning--all within limits,
and subject to qualification. A variety of neural network models are described
in the two volumes of Parallel Distributed Processing: Explorations
in the Microstructure of Cognition (edited by David E. Rumelhart
and James L. McClelland, MIT Press, 1986). Neural network systems are beginning
to enter commercial use. Some characteristics of neural networks have been
captured in more conventional computer programs (Efficient
Algorithms with Neural Network Behavior, by Stephen
M. Omohundro, Report UIUCDCS-R-87-1331, Department of Computer Science,
University of Illinois at Urbana-Champaign, 1987).
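To make concrete the idea of recovering a complete memory from a fragmentary hint, here is a minimal Python sketch of a Hopfield-style associative memory. The article does not single out this model; it is chosen here as one well-known example. The function names `train` and `recall`, the two stored patterns, and the use of +1/-1 units are assumptions for illustration.

```python
def train(patterns):
    """Hebbian learning: strengthen the link between units that agree in a stored pattern."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights

def recall(weights, cue, sweeps=5):
    """Update units one at a time; each takes the sign of its weighted input."""
    state = list(cue)
    for _ in range(sweeps):
        for i in range(len(state)):
            total = sum(weights[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if total >= 0 else -1
    return state

# Two stored patterns of +1/-1 values (chosen to be orthogonal).
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([p1, p2])

damaged = [-1, 1, 1, 1, -1, -1, -1, -1]  # p1 with its first unit flipped
print(recall(weights, damaged) == p1)     # True
```

Each unit does nothing more than sum its inputs and take a sign, yet the network as a whole settles back onto the stored pattern. With many stored patterns, or heavily damaged cues, recall degrades, which is one of the limits and qualifications mentioned above.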
A major strength of the neural-network approach is that it is patterned
on something known to work--the brain. From this perspective a major weakness
of most current systems is that they don't very closely resemble real neuronal
networks. Computational models inspired by brain research are described
in a broad, readable book on AI, philosophy, and the neurosciences (Neurophilosophy,
by Patricia Smith Churchland, MIT Press, 1986) and in a more difficult work
presenting a specific theory (Neural Darwinism, by Gerald Edelman,
Basic Books, 1987). A bundle of insights based on AI and the neurosciences
appears in The Society of Mind (by Marvin Minsky, Simon and
Schuster, 1986).
For all its promise and successes, AI has hardly revolutionized the world.
Machines have done surprising things, but they still don't think in a flexible,
open-ended way. Why has success been so limited?
One reason is elementary: as robotics researcher Hans
Moravec of Carnegie Mellon University has noted, for most of its history,
AI research has attempted to embody human-like intelligence in computers
with no more raw computational power than the brain of an insect. Knowing
as little as we do about the requirements for intelligence, it makes sense
to try to embody it in novel and efficient ways. But if one fails to make
an insect's worth of computer behave with human intelligence--well, it's
certainly no surprise.
Machine capacity has increased exponentially for several decades, and if
trends continue, it will match the human brain (in terms of raw capacity,
not necessarily of intelligence!) in a few more decades. Meanwhile, researchers
work with machines that are typically in the sub-microbrain range. What
are the prospects for getting intelligent behavior from near-term machines?
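The arithmetic behind "a few more decades" can be sketched as follows. Closing a gap by steady doubling takes a number of doublings equal to the base-2 logarithm of the gap. The factor of one million is read off the "microbrain" figure above; the two-year doubling time is purely an assumption for illustration, not a figure from this article, and the function name `years_to_parity` is hypothetical.

```python
import math

def years_to_parity(gap_factor: float, doubling_years: float) -> float:
    """Years needed to close a capacity gap if capacity doubles every `doubling_years`."""
    return doubling_years * math.log2(gap_factor)

# Assumed figures: a millionfold gap and a doubling every two years.
print(round(years_to_parity(1e6, 2.0), 1))  # 39.9
```

A millionfold gap is about twenty doublings, so the answer scales directly with the assumed doubling time: faster doubling shortens the wait, and a "sub-microbrain" starting point lengthens it.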
If machine intelligence should require slavish imitation of brain activity
at the neural level, then machine intelligence will be a long time coming.
Since brains are the only known systems with general intelligence, this
is the proper conservative assumption, which I made for the sake
of argument at one point in Engines
of Creation. Nonetheless, just as assemblers will enable construction
of many materials and devices that biological evolution never stumbled across,
so human programmers may be able to build novel kinds of intelligent systems.
Here we cannot be so sure as in nanotechnology, since here we do not know
what to build, yet novel systems seem plausible. It is, I believe, reasonable
to speculate that there exist forms of spontaneous order in neural-style
systems that were never tested by evolution--indeed, that may make little
biological sense--and that some of these are orders of magnitude better
(in speed of learning, efficiency of computation, or similar measures) than
today's biological systems. Stepping outside the neural realm for a moment,
Steve Omohundro
(see above) has found algorithms that outperform conventional neural networks
in certain learning and mapping tasks by factors of millions or trillions.
Thus, although there is good reason to explore brain-like neural
networks, there is also good reason to explore novel systems. Indeed, some
of the greater successes in current neural network research involve multi-level
versions of "back-propagation" learning schemes that seem rather
nonbiological (and Omohundro's algorithms seem entirely nonbiological).
In summary, AI research is rich in diverse, promising approaches. Our ignorance
of our degree of ignorance precludes any accurate estimate of how long it
will take to develop genuine, flexible artificial intelligence (of the sort
that could build better AI systems and design novel computers and nanomechanisms).
If genuine AI requires understanding the brain and developing computers
a million times more powerful than today's, then it is likely to take a
long time. If genuine AI can emerge through the discovery of more efficient
spontaneous-order processes (or through the synergistic coupling of those
already being studied separately) then it might emerge next month, and shake
the world to its foundations the month after.
In this, as in so many areas of the future, it will not do to form a single
expectation and pretend that it is likely ("We will certainly have
genuine AI in about 20 years"--poppycock!). Rather, we must recognize
our uncertainty and keep in mind a range of expectations, a range of scenarios
for how the future may unfold. Genuine AI may come very soon, or very late;
it is more likely to come sometime in between. Since we don't know what
we're doing, it's hard to guess the rate of advance. Sound foresight in
this area means planning for multiple contingencies.