What happened next? How did the story turn out? We really want to know.
The story, Henry James once declared, is art’s spoiled child (1909). By that he surely meant how readily humans surrender to stories, whether listening to, reading, or telling them. We communicate by stories—“Did I tell you . . . ?”— and make up internal narratives as self-explanations. We create stories by shaping unrelated incidents into a sequence of cause-and-effect that can be utterly false. (Danny Hillis, a distinguished computer scientist and author, has said that because cause-and-effect is just an artifact of our brain’s penchant for storytelling, we should abandon the idea of it outright.) Stories are a special kind of compressed code. In a few lines, we can grasp a character’s lifetime and, in a few words, be inspired, uplifted, or cast down.
The Israeli historian Yuval Noah Harari (2015) goes further. Humans are the only species that trade fictive stories, he declares, which has enormous consequences. It allows us to cooperate in numbers well beyond the average of 150 individuals we can learn to know about personally, biologically, as it were. “Large numbers of strangers can cooperate successfully by believing in common myths,” Harari says, and offers religion, or nationalism, as examples. “There are no gods in the universe, no nations, no money, no human rights, no laws, and no justice outside the common imagination of human beings”. But “Telling effective stories is not easy. The difficulty lies not in telling the story, but in convincing everyone else to believe it”.
That imagined reality, that shared story, exerts great force in the world, Harari continues. Moreover, imagined realities, collective myths, and shared stories can change rapidly, adapting to new circumstances. Before the French revolution, people believed in the divine right of kings but “almost overnight” Harari says, they adopted a belief in the sovereignty of the people. Humans are open to a fast lane of cultural evolution, outstripping any other species in an ability to cooperate. By revising our shared stories to adapt to changing circumstances, humans can change their beliefs and behavior in a matter of decades, rather than waiting for the slow changes evolution brings about.
The foundation of stories is language. Text, which stands for words, which stand for—well, whatever they stand for—is one of our most powerful codes, and stories are one of its most powerful forms, because storytelling is one of our distinguishing characteristics as humans.
I went to talk to MIT’s Professor Patrick Winston, because of my own interest in stories and higher-level symbolic intelligence. Winston was a pink-skinned, affable, and trim man (not always—his website tells his tale of forcing himself to lose 60 pounds in 100 days). Winston had been at MIT since he was a freshman, loved his institution passionately, devoted much time to institute affairs, and loved to teach. He wrote a classic and best-selling textbook called Artificial Intelligence and, in 1972, succeeded his former dissertation advisor, Marvin Minsky, as the director of what was then known as the MIT Artificial Intelligence Laboratory, later the Computer Science and Artificial Intelligence Laboratory (CSAIL). Winston later stepped down as head of the lab but continued to teach and supervise research until his death in July 2019.
Winston’s research goal, a comprehensive computational account of human intelligence, was driven by two questions. First, what computational competences are uniquely human? Second, how do uniquely human competences support and benefit from the computational competences we share with other animals?
With his colleague Dylan Holmes, Winston writes:
Our answer to the uniquely human question is that we became the symbolic species and that becoming symbolic made it possible to become the story-understanding species. Our answer to the support-and-benefit question is that our symbolic competence, and the story-understanding competence that it enables, could not have evolved without myriad elements already in place. (Holmes & Winston, 2018)
This position is unusual—most AI today focuses on statistical mechanisms associated with machine learning, mechanisms that shed little light on aspects of intelligence that are uniquely human, as I’ve pointed out. Holmes and Winston elaborate on this point:
We believe that tomorrow’s AI will focus on an understanding of our uniquely human intelligence emerging from discoveries on par with the discoveries of Copernicus about our universe, Darwin about our evolution, and Watson and Crick about our biology. These cognitive mechanisms will take to another level applications aimed at reasoning, planning, control, and cooperation. Tomorrow’s AI applications will astonish the world because they will think and explain themselves, just as we humans think and explain.
Relying on work in linguistics and comparative anatomy by Robert Berwick and Noam Chomsky (2016), Winston and Holmes begin by emphasizing the merge operation, what they call “the sine qua non of being symbolic.” It’s the capability to combine two expressions to make a larger expression without disturbing the two merged expressions. For example, English speakers understand a bird is an animal with feathers that flies, and also understand the exception that an ostrich is an animal with feathers—a bird—but doesn’t fly. Moreover, they understand from poet Emily Dickinson that “hope is the thing with feathers,” which allows imaginations to think of hope as birdlike, one that probably flies (in some sense), without disturbing any of the other ideas about birds they hold. Merge gives us, and only us, an inner language with which we build complex, highly nested symbolic descriptions of classes, properties, relations, actions, and events. “When we write that we are symbolic, we mean that we have a merge-enabled inner language.”
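For readers who think in code, the defining property of merge (two expressions combined into a larger one, with neither original disturbed) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the names `Merged`, `merge`, and `depth` are hypothetical, and nothing here is taken from Genesis or from Berwick and Chomsky's formal account.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)  # frozen: a merged expression can never be altered
class Merged:
    left: "Expr"
    right: "Expr"


# An expression is either a bare symbol (a string) or a merged pair.
Expr = Union[str, Merged]


def merge(a: Expr, b: Expr) -> Merged:
    """Combine two expressions into a larger one, leaving both inputs intact."""
    return Merged(a, b)


def depth(e: Expr) -> int:
    """Nesting depth of an expression; a bare symbol has depth 0."""
    if isinstance(e, str):
        return 0
    return 1 + max(depth(e.left), depth(e.right))


# "A bird is an animal with feathers that flies."
bird = merge("animal", merge("has feathers", "flies"))

# "Hope is the thing with feathers": reuse the bird idea without changing it.
hope = merge("hope", merge("is like", bird))
```

After `hope` is built, `bird` is exactly what it was before, and `hope` simply contains it as a nested part. That is the sense in which merged expressions are not disturbed.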
Together with the competences humans share with other species, the merge operation enables storytelling, story understanding, story composition, and all that enables much, perhaps all, of education. The merge operation also enables religion, nationalism, currency systems, human rights, and the rest of Yuval Noah Harari’s list of fictive stories we tell each other (2015).
Our stories—the creating and the assimilating of them—are what make us different from other primates. They’re a marker of higher-level, symbolic intelligence, though this keystone competence could not have evolved without other elements already in place, elements we share with other species. “We developed the means to externalize our inner stories into outer communication languages, and to internalize stories presented to us in those outer communication languages” (Holmes & Winston, 2018). Thus the strong story hypothesis, first proposed by Winston in 2011: “The mechanisms that enable humans to tell, understand, and recombine stories separate our intelligence from that of other primates.”
Although other animals might have internal representations of some aspects of the world, they seem to lack these complex, highly nested symbolic descriptions. Work with Nim Chimpsky, a chimpanzee who learned American Sign Language, showed that while the chimp could understand names of things and memorize sign sequences, Nim did not exhibit any merge-enabled inner language of complex, highly nested symbolic descriptions. A comparison between children and chimpanzees shows that young humans generate novel combinations of words very freely, but Nim Chimpsky never provided evidence via signing that suggested he had this merge-enabled inner compositional capability. “Somehow we developed the means to externalize our internal stories into outer communication languages and to internalize stories presented to us in those outer communication languages. Being social animals, we started telling each other stories” (Holmes & Winston, 2018).
How did this capacity arise in humans? As Winston would tell it, it’s—well, a story. Until about 80,000 to 100,000 years ago, humans and other hominins (the group consisting of modern humans, extinct human species and all our immediate ancestors) were about the same. Ian Tattersall, a paleoanthropologist at the American Museum of Natural History, believes that sometime in that span, humans became symbolic, and parted from our other hominin cousins. Tattersall conjectures that rapid climate changes during that era forced hominins to adapt or die, and one of the most successful adaptations was the ability in a small, isolated band to manipulate symbols, in speech, in pictures, perhaps otherwise. “As far as anyone can tell, we are the only organisms that mentally deconstruct our surroundings and our internal experiences into a vocabulary of abstract symbols that we juggle in our minds to produce new versions of reality: we can envision what might be, as well as describe what is,” Tattersall writes (2014).
Winston said: “Tattersall is a bit vague about what he means by ‘symbolic.’ He’s a paleoanthropologist. But I’m a computer scientist, and I know exactly what symbolic means.” (Recall early Allen Newell and Herbert Simon: symbols are functional entities. They have access to meaning—designations, denotations, information a symbol might have about a concept, such as a pen, brotherhood, or quality. The physical symbol system, whether brain or computer, can act appropriately with those symbols. (McCorduck, 1979).) Winston went on: “Then I heard Noam Chomsky talk about how we humans developed the ability to combine concepts, thus making new concepts, without destroying the original concepts.” The Genesis story-understanding program was born.
The Genesis model is being built by studying and employing the kinds of computations required to translate stories of up to 100 sentences, expressed in simple English, into inner stories. Winston and his colleagues then studied how to use the inner stories to answer questions, describe conceptual content, summarize, compare and contrast, react with cultural biases, instruct, reason hypothetically, solve problems, and find useful precedents. Nothing would go into Genesis unless it was needed and seemed biologically plausible.
Winston and his colleagues have been devoted to doing this scientifically. They’ve avoided models that are so general they can explain anything (and so are not falsifiable). Instead, their models are narrow in scope because this is only the beginning: Genesis, its builders say, is analogous to the Wright Brothers airplane of 1903.
Genesis has learned from summaries of plays, such as Shakespeare’s Macbeth; fairy tales, such as Hansel and Gretel; and contemporary conflicts, such as the 2007 Estonia-Russia cyberwar. As Genesis reads simple concise stories, it connects causes to effects and means to actions, sorts membership in classes, and uses inference to elaborate on what is written. It reflects on its reading, looking for concepts and concept patterns that allow it to make abstractions. Thus Macbeth harms Macduff, and Macduff wants revenge, a word that doesn’t appear in the summary Genesis has read. The system can do more: it models personality traits and anticipates trouble. It aligns similar stories for analogical reasoning (using an algorithm from molecular biology!). For example, Genesis finds clear parallels between the onset of the Arab-Israeli War and the Tet Offensive in the Vietnam War. “In both cases, intelligence noted mobilization, intelligence determined that the attackers would lose, intelligence determined that the attackers knew they would lose, intelligence concluded there would be no attack, whereupon the attackers promptly attacked. Retrospectively, there were political rather than military motives” (Holmes & Winston, 2018).
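The text does not say which algorithm from molecular biology Genesis borrows. As an assumption, the sketch below uses Needleman-Wunsch global alignment, a standard method for lining up DNA or protein sequences, applied here to two short lists of event labels. The labels, the scoring values, and the function name `align_stories` are all hypothetical, chosen only to show how two stories might be lined up event by event.

```python
def align_stories(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two event sequences (Needleman-Wunsch style).

    Returns (score, pairs), where pairs lists aligned events and
    None marks an event with no counterpart in the other story.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning the first i events of a with the first j of b
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two events
                score[i - 1][j] + gap,            # event in a has no partner
                score[i][j - 1] + gap,            # event in b has no partner
            )

    # Trace back from the bottom-right corner to recover the alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy event labels (hypothetical), loosely following the pattern quoted above.
story_a = ["mobilize", "will_lose", "know_will_lose", "no_attack_predicted", "attack"]
story_b = ["mobilize", "will_lose", "no_attack_predicted", "attack"]

total, pairs = align_stories(story_a, story_b)
```

Here four events line up exactly and one event in the first story has no counterpart in the second, which the alignment reports as a pair with `None`. A real system would need a far richer notion of when two events "match" than string equality.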
“I’d go beyond that, though,” Winston told me, “and say the most important concepts to combine are event descriptions. We combine event descriptions into larger sequences; then we move backward and forward in remembered sequences. With that ability, we can tell stories, understand stories, and combine old stories to make new ones. That, I think, constitutes part of the answer to the question of what’s different about us.”
So humans developed the capacity for a complex inner story—possibly owing to a completed anatomical loop, incomplete in other animals, Berwick and Chomsky hypothesize—and then the ability to externalize those inner stories, and internalize stories presented to us, “and because we are social animals, externalization and internalization had a powerful amplifying effect.” In other words, what Yuval Noah Harari names as the unprecedented ability to cooperate, owing to shared stories.
Storytelling, Winston believed, makes it possible for humans to construct elaborate models of ourselves (possibly consciousness?) and the world outside us. “If we’re to understand human thinking,” he speculated to me, “we must model that story-manipulation, model-enabling capability. In the end, that’s what makes us different from species that have plenty of just-do-it, and simulation capabilities, but whose story manipulation capability, if any, is on another, much lower level.”
Being symbolic, Winston went on, allows humans to have an inner language that supports story understanding, the acquisition of common sense from perception, and the ability to communicate with others. Of course, we share much with other animals, too, which remains to be fully understood.
Although Winston appreciated efforts that have led to outstanding engineering, such as Rodney Brooks’s robot insects, not to mention Brooks’s wildly successful Roomba robot vacuum cleaner, Winston was personally more interested in what he called the science side of intelligence, symbolic capacities. The founding fathers of AI also believed symbolic capacities were central to intelligence. Winston thought the way forward was to ask “better, biologically inspired questions.” Good science informs good engineering or applications.
In Genesis, Winston believed he’d departed from early AI’s view of what it meant to be symbolic, that being symbolic meant only logical reasoning, and nothing else mattered. “I think that reasoning is recipe-following. Recipe-following is only a special case of story understanding,” he said to me.
Yes, the earliest AIs embodied logical reasoning, but given how much Newell and Simon, for example, honored and practiced storytelling themselves (remember “The Apple” and “Fairy Tales”), they never believed their reasoning-based programs were all there was to intelligence. Each of them said so explicitly. (As we saw earlier in this book, Simon called logical reasoning a “small but fairly important subset of what’s going on in mind”). Early AI was based on what was immediately accessible to cognitive psychologists in the mid-1950s, those thinking-aloud protocols of reasoning, as subjects tried to solve problems, coupled with the primitive computing technology of the time. That you might one day be able to read human or even rat brain waves, much less exhibit the electro-chemical behavior of the brain, was beyond anything at the time.
Newell and Simon declared explicitly that there was much more to thinking than what they could then simulate, and both would be comfortable, I think, with Winston’s emphasis on storytelling as an indisputable marker of human intelligence.
Winston and his colleagues worked with neuroscientists and psychologists to push these ideas further. Winston did so as an investigator participating in the Center for Brains, Minds, and Machines, an MIT-Harvard interdisciplinary group of computer scientists, neuroscientists, and psychologists that meets regularly to exchange ideas and findings about cognition.
Genesis does not aim to advance the state of the art in question-answering, as IBM’s Watson does, for example. Its creators intend to devise and build a plausible and scientific account of human story understanding, showing how a story-understanding system is able not only to answer questions, but also describe conceptual content, summarize, compare and contrast, react with cultural biases, instruct, reason hypothetically, solve problems, and find useful precedents. (This reminds me of the original Logic Theorist, which wasn’t built to be a killer logician, but to model how humans proved theorems in logic.)
The simple substrate of Genesis supports many competences, Winston declared. Some examples: Genesis answers questions about why and when, models personality traits, notes concept onsets, anticipates trouble, and can re-interpret stories with controllable allegiances and cultural biases. (It first views the cyberwar between Estonia and Russia as the aggression of a bully, from the Estonian point of view, and then as teaching a lesson from the Russian point of view.) Another example includes Genesis’ ability to persuade.
Winston viewed story understanding as foundational to human intelligence. To understand and model it in detail is a significant step toward constructing artificial intelligence. For now, Genesis reads and demonstrates all these capabilities only around stories that are adapted for it. The model cannot understand stories written by people for people. Critics complain that Genesis should learn, not be instructed, although most humans must be instructed—by their parents, by their schools, by experience—in many of the issues Genesis confronts.
My life has been shaped by stories. The most intimate and enduring transaction of my life has been to transfer an outer story to an inner one, an inner to an outer one. My mother read Enid Blyton to me, and so momentarily I became one of Blyton’s plucky children. But when my brother and sister, twins, were born, I was on my own. By then, I could read and took up my mother’s copy of unexpurgated Grimm’s Fairy Tales, transmuting horrors of cruelty and even death into inner stories, which would teach me far more about real life than the denatured “children’s books” I encountered when we arrived in the United States. Much later I became for a while Dorothea Brooke, Isabel Archer. I began to transform my own inner stories to outer ones, as I have in this book.
So I observe the steps toward story understanding that Winston and Holmes propose as precise and explicit: knowledge acquisition, concept formation, analogies to other stories Genesis knows, the ability to reason and summarize, to persuade a reader from a given cultural point of view, and more. Much work is to be done, but Genesis is only Kitty Hawk. Genesis is only a first draft.
Understanding stories takes many forms. Oren Etzioni, who in 2013 became the first head of the new Allen Institute for Artificial Intelligence in Seattle known as AI2 (founded by Paul Allen and mostly, but not entirely, funded by him), has also been long at work on text understanding. “Why text?” I asked, knowing that so many of his colleagues are working on other kinds of perception—machine learning, for example—as a route to intelligent machines. “For the same reason Willie Sutton went to the banks,” Etzioni laughed. “The banks are where the money is. Text is where the knowledge is—all over the world.”
AI2’s approach is called open information extraction. It isn’t just fact-finding, but fact understanding—finding both knowledge and meaning in text.
AI2’s efforts include a series of programs that can pass fourth-grade, eighth-grade, and twelfth-grade tests in science, language arts, and social studies. The programs must meet explicit benchmarks. For example, in fourth-grade arithmetic, to tease out the essence of word problems requires not only the ability to think through the problems, but also real-world knowledge—lifespans, what animals are, and so on. When we succeed at that, Etzioni says, we’ll only have an artifact. Which, of course, can be built upon.
The second goal at AI2 is for the common good: a better scientific search engine, called Semantic Scholar, that can “understand” and search semantically instead of using keywords in context, like Google Scholar. How will success be measured here? By how users behave: what they ask, how often the system is used, and whether and how often users return.
In 2017, AI2 announced a new project: giving computers common sense. Project Mosaic (first called Project Alexandria as a tribute to the great ancient library) builds on earlier programs the Institute has been working on, including machine reading and reasoning (Aristo), natural language understanding (Euclid), and computer vision (Prior), to create a new unified and extensive common-sense knowledge source. Mosaic will also draw on crowd-sourcing.
AI2 researchers work closely with the Allen Institute for the Brain because the long-term goal is to discover and define what intelligence is. “This is the grand question. It will take a long time,” Etzioni says. Meanwhile, AI2 will not only work on these specific shorter-term goals, but also sponsor distinguished investigator awards, stipends to individuals who are eager to go beyond incremental approaches to AI and think in larger, more comprehensive terms.
This voracious consumption of text in order to know and understand is underway (with variations in methods and ultimate goals) at all the major computer firms—IBM, Google, Microsoft, Apple, Facebook—and many research sites. Each effort takes a different approach. Carnegie Mellon’s Nell (Never Ending Language Learner) program is a machine-learning project that “reads,” or extracts facts from, text found in hundreds of millions of websites, to which it assigns different levels of confidence. It attempts to improve its competence so that it can learn better tomorrow, extract more facts more accurately. You can visit Nell’s website (http://rtw.ml.cmu.edu/rtw/) and see the categories it has extracted facts from, and whether you agree. Another program at Northwestern can convert numerical data (such as sports scores or profit-and-loss statements) into stories: a sports story about your youngster’s Little League game or a story to help a franchise manager understand why the branch across town is doing better.
In connection with machine learning, I’ve mentioned the largely unspoken assumption, maybe hope, that as machines accumulate the abilities that correspond to lower human faculties, the higher faculties will inevitably, or magically, emerge. We call this level of higher faculties symbolic intelligence, and emergence seems to be how it happened with humans, so why not?
Lots of reasons why not, and they aren’t necessarily about better hardware—although they could be about better software. Google’s Giant Brain seems to have as many connections as the human brain, but requires megawatts of power, whereas we’re still smarter in some ways with only 20 watts. (Although what “smarter” means is problematical.) The abstract kinds of language and thinking that might have emerged some 80,000 to 100,000 years ago from a single hominin group were a winning elaboration of the relatively simple communications our hominin cousins already had. As I’ve noted, Ian Tattersall (2014) conjectures one small, isolated group produced and sustained symbolic capabilities, and because they were few and isolated, the genes were allowed to flourish.
Leslie Valiant has said it’s impossible for human coders to do what machines eventually must do automatically to achieve intelligence (probably so), and Eric Baum posits an underlying structure in the world that is detectible and amenable to a compressed representation. Perhaps such a structure exists. If so, for millennia it has been science’s grand quest to find it. We’ll see.
In 2015, Kathleen M. Carley, a professor in the School of Computer Science at Carnegie Mellon, presented “Will Social Computers Dream?” at a symposium, where she shared her strong belief that, as interesting and capable as machine learning is, it can’t achieve human-like intelligence without social cognition, the ability to reason in a socio-cognitive-emotional fashion. These are a set of procedures and behaviors humans follow to reason about and respond to the world from both “social collective” and “individual affective” viewpoints. Social cognition is partly physiological, partly learned, requiring the actor to be in a rich socio-cultural environment, engaging in real interactions with multilevel actors and with multilevel and competing goals, histories, and culture. To give complete social cognition to computers is complicated, she says, and is unlikely to happen in the next fifty years. But computers with partial social cognition are likely to emerge before then, with generally positive advantages for humans themselves (Carley, 2015). Perhaps they will also have the kind of distributed intelligence Winston believed is needed for genuinely human-like intelligence, not in some central part of the brain, but using and reusing coded vision, language, and motor systems, which will lead to inner and outer narratives.
If video games are the new storytelling, video game designers also have high ambitions to incorporate social cognition into games (Stuart, 2018). First, they want to remove the interface—that is, get rid of knobs and joysticks and allow participants to play the game with voice commands. Developments in animation and motion capture will soon allow more than words to present the nuances of a character’s behavior. Rigid narrative conventions will be replaced by AI-driven reactive systems, intelligent games in which the game engine develops a sense of dramatic control that enables it to decide the best moment for a player to meet another character. Because the stories will be open-ended with new possibilities added daily, video-game designers will have to learn how to tell stories that evolve over months or years. Perhaps most ambitiously, the best narrative designers want to develop stories that speak to cultures all over the globe. Margaret Stohl, a successful designer, mentions how loneliness afflicts so many humans, and adds: “People don’t think of video games as emotionally progressive, but as online communities thrive around them, that’s a chance to be part of something”.
It remains to be seen whether general intelligence can be achieved or whether the white-hot research of machine learning will lead to the spontaneous emergence of symbolic intelligence (thinking slow). But sooner or later, I believe we’ll be facing human-level general intelligence in our machines, except that their powers will be faster, wider, and deeper than ours.
Of course, as with all aspects of AI, a troll lurks under the bridge. OpenAI, a nonprofit research group in Silicon Valley, has created GPT-2, a text generator that is so good at writing news stories and fiction that the organization has decided not to release it yet (though that’s the point of OpenAI) because the potential for malicious use is so great. Fed just a few starter lines or paragraphs, the system takes up the narrative and continues with a story so plausible (complete with fictitious quotes, if it’s a news story, from major figures concerned with the story’s topic) that for a reader to detect whether the story is real or fake is nearly impossible. GPT-2 has trained on very large data sets, and can be tweaked further to be positive or negative. OpenAI researchers are testing the system, to find out what it can and cannot do, especially maliciously (Hern, 2019). Meanwhile, the story—whether poem, novel, five-season TV series, or advanced video game—appears to us in combinations of words, images, and music, each a kind of code, each a technology of compression. (So is mathematics. So is music notation. So is computer programming.) A new field is beginning to view words and text (and music, and images) in just such terms. In a gratifying closure of my life’s circle, this new field is a wedding of the Two Cultures called the digital humanities.
- Max Tegmark would phrase it differently. In his book Life 3.0 (2017), he noted that unlike other animals, humans are able to rewrite their own software.
- Early AI researchers recognized this. In the 1970s, Roger Schank, then at Yale, worked on programs that generated stories, which he allowed me to show to my own undergraduate writing classes. My students judged them cartoonish, simplistic, and I tactfully didn’t say how close the computer’s efforts were to theirs.
- A few years ago, as Winston and I were meeting, he said, “Three days ago, I heard something really important. A colleague here at MIT has been able to put a probe into a rat hypothalamus. As the rat runs along a raised track, its brain waves show a sequence that corresponds to the curves in the track. As it reaches the end of the track (and its goal of food), its brain waves show that it’s negotiating the track again in its brain, even though it’s now standing still, eating. Moreover, sometimes it will stop on the track and play in its brain the patterns that correspond both to where it’s been and where it anticipates going. It sometimes even dreams about running the track.” “So this ability to imagine a sequence of events goes pretty far down the mammalian chain,” I said. Winston nodded. “And we know rats are very smart.” Winston further proposed that intelligence is within, not behind, our input/output channels, a view generally held at MIT for at least twenty years. This means intelligence lies not in some central part of the brain, but in the use and reuse of coded vision, language, and motor systems together. A major point of agreement at an AI Summit in February 2014, convened to discuss future directions of AI, and attended by researchers from the U.S., Europe, and Asia, was that it was time for integrated systems—vision, language, and motor systems to be combined into single entities. One dissenter since has been Stuart Russell, who thinks that might make machines too smart for our own good.
- Current papers of AI2 are posted on the Institute’s web site (http://allenai.org) so you can judge the progress of research.
- Access Semantic Scholar at https://allenai.org/semantic-scholar/