28

1.

The humanities willingly transform themselves, embracing the computer enthusiastically. Their embrace—whether they know it explicitly or not—incorporates principles Jeannette Wing has for more than a decade called “computational thinking.” Wing is the Avanessians Director of the Data Science Institute at Columbia University and a professor of computer science there. Computational thinking, she says, is a universally applicable attitude and set of skills that everyone, not just computer scientists, is eager to learn and use (Wing, 2006).[1] Computational thinking is part of the “durable intellectual content” that Mary Shaw, the Alan J. Perlis Professor of Computer Science at Carnegie Mellon, has called for, that makes computer science a science, beyond the technology of the moment (Togyer, 2014).

Computational thinking builds on both the power and limits of computing, whether executed by a human or by a machine. These methods and models give us

. . .the courage to solve problems and design systems that no one of us would be capable of tackling alone. Computational thinking confronts the riddle of machine intelligence: What can humans do better than computers? and What can computers do better than humans? (Wing, 2006)

Computational thinking is solving problems, designing systems, and understanding human behavior. It includes a range of mental tools that reflect the breadth of the field of computer science. So we ask of a particular problem: How difficult is it to solve? What’s the best way to solve it? These are questions that computer scientists can offer precise answers to. We ask, Is an approximate solution to the problem good enough? “Computational thinking is reformulating a seemingly difficult problem into one we know how to solve, perhaps by reduction, embedding, transformation, or simulation.” It’s thinking recursively; it’s parallel processing; it interprets code as data and data as code. It judges programs not just for correctness and efficiency but for aesthetics. It judges a system’s design for simplicity and elegance. And more. (Wing, 2006)

Computational thinking is about conceptualizing, not programming. It demands thinking at multiple levels of abstraction. Abstraction and decomposition are used to tackle large complex tasks or to design large complex systems. Computational thinking chooses an appropriate representation for a problem or models the relevant aspects of a problem to make it tractable. “It is planning, learning, and scheduling in the presence of uncertainty. It is search, search, and more search . . .” Wing writes.
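Wing’s habits of mind are concrete enough to show in a few lines of code. Here is a minimal sketch, with an invented three-text “corpus” and a hypothetical question, of two of them, decomposition and recursion: a question about a whole collection is reformulated as the same question about smaller and smaller pieces, each of which we already know how to answer.

```python
# A toy illustration of decomposition and recursion (not any particular
# digital humanities tool): how often does a term appear across a collection?

def count_term(texts, term):
    """Count occurrences of `term` across a list of texts."""
    if len(texts) == 0:          # nothing left to examine
        return 0
    if len(texts) == 1:          # a single text: a problem we know how to solve
        return texts[0].lower().split().count(term)
    mid = len(texts) // 2        # decomposition: split the collection in half
    # recursion: each half is the same problem, only smaller
    return count_term(texts[:mid], term) + count_term(texts[mid:], term)

corpus = [
    "The whale surfaced and the sea was calm",
    "Call me Ishmael",
    "The sea gives and the sea takes",
]
print(count_term(corpus, "sea"))  # -> 3
```

Each half could just as easily be handed to a separate processor, which is the parallel processing Wing also mentions.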

Above all, computational thinking is a fundamental. It’s not a rote skill. It’s about ideas, not artifacts. It’s a way that humans, not computers, think. It complements and combines mathematical and engineering thinking. It’s for everyone, everywhere, an intellectual adventure that will be commonplace to human thought in the future.

2.

To calculate the present extent of digital humanities projects would be hopeless—they spring up daily (and sometimes languish as quickly). A Google search will lead you into this vast territory. It includes dynamic maps of the encounter between European and indigenous peoples, multimedia projects exploring 19th century music in Victorian literature, and archives of performances of Greek and Roman drama. A Stanford project examines the Roman world, considering travel patterns and their effects on governance, art, and literature, and lays it out in graphic terms for you. The Homer Multitext Project,[2] housed at both the University of Leipzig and Tufts University, exhibits multiple texts of Homer’s work from all over the world, for scholars sitting in their studies to access and compare. Visual reconstructions abound of medieval cathedrals, prehistoric villages, destroyed works of art. An app for detecting allusions in literary text is available.

Other digital humanities projects transform other kinds of data, collected earlier, into quickly understood images, much as, thirty years ago, scientists turned to artists to help make images out of otherwise incomprehensible supercomputer scientific data. Transforming data is only one modest part of what the digital humanities will be in the future, but you must start somewhere. People who resist this will find their arguments, and perhaps, as Edmond Campion worries, their work, made obsolete by the field’s evolution.

Scale is another revelation of the digital humanities. An individual scholar or two can examine concepts or themes in thousands of books, not the tens once possible. In 2010, Stanford literary critic Franco Moretti began to urge his colleagues to try not close, but “distant” reading, computer-assisted reading of thousands of texts at a time. His Stanford Literary Lab has examined loudness in the 19th-century novel (Katsma, 2014) and the evolving language of World Bank reports (Moretti & Pestre, 2015). That World Bank study, which showed a drift over 60 years toward more abstract and self-referential language, led the bank’s chief economist, Paul Romer, to demand that its publications reduce their use of the word “and,” a demand that in turn led to the reduction of his management duties (Schuessler, 2017).[3] Since retiring, Moretti has gone to Lausanne, where he is helping set up a new digital humanities program at EPFL, the premier Swiss polytechnic there.
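The counting beneath a finding like the World Bank study can be quite plain. Here is a minimal sketch, with invented snippets standing in for real reports, of the kind of measurement distant reading often starts from: the relative frequency of a single word, grouped by year. Moretti and Pestre’s actual analysis was of course far more elaborate.

```python
# Toy distant reading: how large a share of all words is one word, year by year?
def relative_frequency(docs_by_year, word):
    """For each year, the fraction of all tokens equal to `word`."""
    shares = {}
    for year, docs in docs_by_year.items():
        tokens = [t for doc in docs for t in doc.lower().split()]
        shares[year] = tokens.count(word) / len(tokens) if tokens else 0.0
    return shares

# Invented examples, not actual World Bank prose.
reports = {
    1955: ["The loan will finance roads and ports and schools"],
    2015: ["Sustainable inclusive growth and governance and resilience and capacity"],
}
for year, share in sorted(relative_frequency(reports, "and").items()):
    print(year, round(share, 3))
```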

Ted Underwood of the University of Illinois, David Bamman of the University of California, Berkeley, and Sabrina Lee of the University of Illinois used a machine learning algorithm to examine characters and authors in 104,000 novels. They noticed some unexpected trends. Between 1800 and 1970, women declined from 50 percent to 25 percent of the authors of published novels, though the proportion later picked up; by 2000 they represented 40 percent. Women characters suffered a similar decline. Descriptors of women changed, too. (Eschner, 2018)

A literature post-doc at the University of Notre Dame, Dan Sinykin (2018) wrote an essay for the Perspective section of The Washington Post, which begins:

I earned a PhD in literature the traditional way, reading a lot and reading carefully. By the end, though, I began to wonder at the provenance of the books I studied. What led them to me? What forces guided me to read one book and not another? Hoping to find out, I followed the money. In 1960, basically every U.S. publisher was independent, not owned by a greater entity. By 2000, 80 percent of trade books were published by six global conglomerates. What had the shift done to literature?

Making sense of a problem at that scale would require tracing trends and patterns across thousands of books, a feat beyond the capacities of a single human mind, but computational analysis offered a way. Sinykin’s examination is underway.[4]

Most striking, many of these digital humanities studies are open. Anyone with the interest or the skills can participate. Projects are led by scholars but often gratefully crowd-sourced. This is very different from the humanities in which I came of age, where a priestly caste ruled.

MIT, in announcing its new College of Computing, to begin in the fall semester of 2019, said its goal is to educate “the bilinguals,” the people in fields like biology, chemistry, politics, history, and linguistics who are also skilled in the techniques of modern computing that can be applied to their fields. “We’re excited by the possibilities,” said Melissa Nobles, dean of MIT’s School of Humanities, Arts, and Social Sciences. “That’s how the humanities are going to survive, not by running from the future but by embracing it” (Lohr, 2018).[5]

Some in the humanities find the whole digital project worrying. They fear being disintermediated, or left out. They fear that scholarly or aesthetic judgments, which should be made by specialists, will be made by computer programs instead. They fear that the aesthetic encounter between humans and art will somehow disappear. I’m sanguine about the aesthetic encounter between humans and art, but the rest remains to be seen.

Whether the computer is mere instrument or has larger aims in the humanities is nowhere settled. Anne Burdick and her colleagues, in their book Digital Humanities, wrote:

Digital Humanities . . . asks what it means to be a human being in the networked information age and to participate in fluid communities of practice, asking and answering research questions that cannot be reduced to a single genre, medium, discipline, or institution . . . . It is a global, trans-historical, and transmedia approach to knowledge and meaning-making.

Digital Humanities is itself a model, a “collaboratively crafted work,” each of the five authors originating and editing the final product, trading the manuscript back and forth electronically. The authors claim this for the digital humanities in general: they are “conspicuously collaborative and generative” (Burdick et al., 2012).

So the digital humanities thrive. Stanford claims involvement in digital humanities (although sometimes under different names) since at least the late 1980s, encompassing literature, music, history, anthropology, and much more. At Harvard, the introductory computing course—for majors and nonmajors alike—fills venerable Sanders Theater every semester. At Columbia, a course called Computing in Context aims to teach humanities students programming and computer science logic within the context of three of their own disciplines, English, history, or economics. General lectures by a computer scientist occur twice weekly, and professors of English, history, and economics conduct the discussion sections.

A provocative collection of essays called Defining Digital Humanities includes a quote from Willard McCarty, a professor of humanities at King’s College London and a fellow of the Royal Anthropological Institute (and known fondly as the Obi-Wan Kenobi of the digital humanities). McCarty says, “I celebrate computing as one of our most potent speculative instruments, for its enabling in competent hands to force us all to rethink what we trusted that we knew” (Terras, Nyhan, & Vanhoutte, 2013, p. 5).

Rethinking what we trusted that we knew: this is the breathtaking challenge of the present-day humanities.

But all this has been about the encounter between the humanities and the computer. Where does AI fit in? One place is in AI’s great inferential abilities, reading and drawing conclusions from all those texts. But that’s only the beginning.

3.

Defining Digital Humanities includes the essay “What Is Humanities Computing and What Is Not?” first published ten years earlier by John Unsworth, now University Librarian and Dean of Libraries at the University of Virginia. In the essay, Unsworth considers what it means to reason intelligently and how we make inferences based on what we know, questions that have been central to the humanities.

Unsworth quotes extensively from a key 1993 AI paper, “What Is a Knowledge Representation?” by three MIT professors of computer science: Randall Davis (a former PhD student of Ed Feigenbaum’s), Howard Shrobe, and Peter Szolovits. In quoting the paper, Unsworth shows that artificial intelligence has brought these questions from the humanities into sharper focus because AI has had to deal with the same questions more exigently and precisely.

The humanities are very much about the representation of knowledge, Unsworth says, and it’s time humanities scholars acknowledged that: “In some form, the semantic web is our future, and it will require formal representations of the human record. Those representations—ontologies, schemas, knowledge representations, call them what you will—should be produced by people trained in the humanities.” I’d reply that representation has not been neglected in traditional humanities studies, but it was mostly about identifying genres and styles, or naming movements: allegory, the novel, irony, Modernism, post-Impressionism, identifications that were often vague and elastic.

Why does it matter? It matters, Unsworth says, because we are entering a new world. To navigate this new world, we need formal representations, which must be computable, because the computer mediates our access to this new world. Finally, those formal representations must be produced first-hand by those who know the terrain. Yes, he concedes, these will be maps, and maps are always schematic and simplified, but that’s what makes them useful.

Ontology—the nature of being, or existence—is seldom addressed outside a philosophy classroom, but it emerges as deeply important in computational models, because certain ontological questions must be answered: What exists? What categories can we sort existing things into? What’s universal and what’s specific? How can a body of knowledge maintain its consistency? AI has sharpened these questions for its own purposes and is useful in showing the humanities how to ask and answer such questions in computational models.
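Here is a minimal sketch of what that bookkeeping looks like once it is forced into a computational model. The categories, relations, and facts are invented for illustration, not drawn from any real humanities ontology, but the questions the check enforces (what exists, what category it belongs to, whether the whole body of assertions stays consistent) are the ones at issue.

```python
# A toy knowledge representation: entities sorted into categories,
# relations constrained by a schema, and a consistency check over all facts.

SCHEMA = {                       # which categories a relation may connect
    "wrote": ("Person", "Work"),
    "exemplifies": ("Work", "Genre"),
}

entities = {"Ovid": "Person", "Metamorphoses": "Work", "epic": "Genre"}
facts = [
    ("Ovid", "wrote", "Metamorphoses"),
    ("Metamorphoses", "exemplifies", "epic"),
]

def consistent(entities, facts):
    """Every fact must use known entities, in the categories its relation demands."""
    for subject, relation, obj in facts:
        expected = SCHEMA.get(relation)
        if expected is None:
            return False
        if (entities.get(subject), entities.get(obj)) != expected:
            return False
    return True

print(consistent(entities, facts))  # -> True
```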

Ten years after the essay’s first publication, Unsworth added a commentary to his paper in praise of AI as a humanities model:

It seems important to establish that ‘humanities computing’ is not just an instrumental term, with the focus on using the computer, but an intellectual activity in its own right. Or maybe not exactly in its own right: as an intellectual activity it appears to require validation in terms of another field of inquiry (artificial intelligence). (Terras, Nyhan, & Vanhoutte, 2013)

Well, now.

4.

There’s much to celebrate in these new efforts and what they represent: a recombination of scholarly pursuits from fields across the spectrum. As with all recombinations, new things will appear: new tools, new points of view, new knowledge.

For instance, for more than a decade, David Blei, a Columbia University computer scientist, has developed LDA (latent Dirichlet allocation), a powerful statistical tool for discovering and exploiting the hidden thematic structure in large archives of text. LDA aims to capture the intuitive belief that documents exhibit multiple topics, and that each document in a large collection can be situated within that thematic structure. Contrary to human-like approaches, LDA assumes that text can be considered a “bag of words”—the word order in a given document doesn’t matter, and neither does the document order in a collection. In short, LDA has no semantic understanding.
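A minimal sketch of that pipeline follows, using scikit-learn’s off-the-shelf LatentDirichletAllocation as a stand-in for Blei’s own implementations, and four toy “documents” in place of a real archive:

```python
# Toy LDA: counts of words (order discarded) in, a mixture of hidden topics out.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the treaty was signed after long negotiation between the two governments",
    "the ambassador cabled the ministry about the border negotiation",
    "the harvest failed and grain prices rose across the province",
    "drought ruined the grain harvest and farmers left the province",
]

# Bag of words: only counts survive; word order and document order are ignored.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(docs)

# Ask for two hidden topics; inference estimates how each document mixes them.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(counts)

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-4:][::-1]]
    print(f"topic {k}:", ", ".join(top))
print(doc_topic.round(2))  # each row: one document's mixture over the two topics
```

On a corpus this small the recovered topics are unstable, but the shape of the output is the point: for every document, an estimated mixture over hidden themes, which a human reader must still interpret.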

LDA has proved to be deeply helpful in teasing out hidden themes in large collections of documents, from survey data to population genetics. But the applications that amaze Blei most are in the digital humanities: historians and English professors and folklore scholars, who want to find patterns—recurring themes—that they wouldn’t otherwise have noticed (Krakovsky, 2014). For example, Blei says:

Matt Connelly, a historian at Columbia, studies the history of diplomacy, and he has a dataset of all the cables sent between different diplomatic stations in the ‘70s. He could use LDA to analyze it, or he could say ‘I know something about this,’ and I can sit down with him and build a topic model based on what he knows about the data. Finding the hidden structure that this data exhibits is a computational problem called inference—the problem of computing the conditional distribution of the hidden variables, given the observations. (Krakovsky, 2014)

Some humanities scholars have seized on LDA as one of many lenses for discovering patterns that close reading wouldn’t find and, moreover, for making tractable the uncovering of themes in thousands, perhaps millions, of documents, a job beyond any single human head, or team of human heads.

Inference was the great task of early expert systems of the 1980s, but those systems were painstakingly handcrafted. Nowadays, far more complex and sophisticated algorithms, infinitely faster processing, bigger memories, and orders of magnitude more available data have changed the game, an example of quantity that transforms the quality of research. Human experts like the historian Connelly can certainly help prune the search space, the way experts once contributed heuristics to simpler intelligent programs, but the machines obviate year-long trips to obscure archives and libraries (a pity!) at the same time they tease out themes in our own productions that would be difficult, perhaps impossible, for us to see otherwise. They do it in ways unlike how we think. It gives me delighted pause.

On the other hand, as Blei pointed out to me when we talked, in the old days you were limited by your data in a good way. Your models were parsimonious. Massive data sets change the game. Models are more inclusive, perhaps, but also more unruly, vexatious, and by virtue of their provenance, harder to validate. Machine learning still struggles with data that can’t be quantified, Blei said, and methods built for prediction, ML’s earliest task, are often shoehorned into problems that have little to do with prediction. A human intelligence must still look at the results of LDA and decide what they signify (just as humans must supply labels to images for ML).

Blei told me he loves to go to digital humanities workshops and discover what scholars are up to and what they need. His findings push him, and his students, to do research that makes better and more useful tools for research. “It’s a wonderful feedback loop for us.”

No mere trend, Wendell Piez at the University of Illinois assures us, digital humanities are humanities in the digital age. Strange as this all may seem, he argues, we’ve been here before:

Digital humanities represents nothing so much as the humanistic movement that instigated the European Renaissance, which was concerned not only with the revival of Classical scholarship in its time but also with the development and application of high technology [printing] to learning and its dissemination. Scholar-technologists, such as Nicolas Jenson and Aldus Manutius designed typefaces and scholarly apparatus, founded publishing houses, and invented the modern critical edition. In doing so, they pioneered the forms of knowledge that academics work within to this day. . . (Terras, Nyhan & Vanhoutte, 2013) [6]

Digital humanities are for the generations of students, eventually future scholars, for whom computers are not a specialized tool but “part of the tissue of the world,” writes Julia Flanders, head of the Digital Scholarship Group at Northeastern University. Moreover, because digital storage is so cheap (cheaper than making decisions), neglected and otherwise overlooked works are digitized and accessible, aggregated “into noticeable piles, so minority literatures, non-canonical literary works are now visible” (Terras, Nyhan, & Vanhoutte, 2013). Texts outside the canon—the largest body by far of human texts—are what Thomas Leonard, former University Librarian at Berkeley, once told me he calls “the great unread.” Until now, they’ve been all but inaccessible, often because they were filtered by publishers who calculated a book was inappropriate for commercial publication (e.g., because the book’s topic was too geographically localized, or its language had too few readers to be profitable). Self-publication allows innovation that can evade the conservatism of commercial publishers and editors, but it can also lead to prose that in any language is downright unreadable.

A glance at any of the new books or scholarly journals devoted to the digital humanities confirms that the new field has lively differences among its practitioners, to say nothing of its detractors and scoffers. “But the intellectual outcomes will not be judged by their power or speed, but by the same criteria used in humanities scholarship all along: does it make us think? does it make us keep thinking?” Flanders adds.

But these examples show the humanities seizing and employing digital tools of many kinds from information processing. AI professionals, for their part, implore the humanities to contribute to the AI enterprise, to make the project more successful and ethical, and to improve it in every way for human benefit (AI Index, 2017).

5.

We come back to C. P. Snow and his Two Cultures challenge. We might go further back, to Thomas Henry Huxley, the naturalist known as Darwin’s bulldog, who, three-quarters of the way through the 19th century, claimed the quarrel actually began in the 18th century between partisans of ancient literature and partisans of modern literature, and shifted in the 19th century to the humanities versus the sciences. Huxley observed this when speaking at the inauguration of a science college, eventually to become the University of Birmingham. Snow merely revived the theme in the 1950s. My college freshman reading included Huxley’s speech, but it fell on blind eyes then (Huxley, 1875). Now I see the sciences and, miraculous to say, engineering (that Cinderella of the universities, eternally sweeping up ashes, its neglect a result of 19th-century Romanticism, which defined anything practical as negligible) both occupying the center of 21st-century intellectual ferment.[7]

So the digital humanities borrow much of AI’s intellectual endowment. In AI, a Janus-faced entity emerges: its one face tools, its other, mirrors. The tools are to excavate more thoroughly and construct more precisely and inclusively who humans are, were, and might be. The mirrors reflect us and our everlasting preoccupation: it’s all about us. AI isn’t alien—though it could one day be, which would be a third face. It’s what we want to know and what we care about heightened and made more precise. How else could AI be, given our wired-in self-absorption, this remarkable adaptation that has helped us perform for eons the exacting dance between cooperation and competition among individuals and among groups?

A good thing, I’d say, for as we’ll see in a subsequent chapter, we’ll need every resource we have to meet what philosopher Nick Bostrom calls the essential task of the century.


  1. Wing has subsequently elaborated upon her ideas in many articles.
  2. You can find the Homer Multitext Project at http://www.homermultitext.org/
  3. Never mind. In 2018 Paul Romer won a Nobel Prize in Economics.
  4. Sinykin provides updates on his research via http://www.dansinykin.com/digital-humanities.html
  5. Actually, a College for Computing, but largely driven by rapid developments in AI. You may judge for yourself why this appeared in the business section, not the news or even the cultural section of the newspaper, even though the announcement stressed that this is a major intellectual turning point for the Massachusetts Institute of Technology.
  6. Quotations from Piez and Flanders appear in Terras, Nyhan, and Vanhoutte, 2013. The quotation from Thomas Leonard is from an author’s interview.
  7. Huxley on the topic: “How often have we not been told that the study of physical science is incompetent to confer culture; that it touches none of the higher problems of life; and what is worse, that the continual devotion to scientific studies tends to generate a narrow and bigoted belief in the applicability of scientific methods to the search after truth of all kinds” (Huxley, 1875. Science and Culture, and Other Essays. Project Gutenberg, www.gutenberg.org/ebooks/52344). And so on, with bracing Victorian confidence. You could even argue that the division goes back to the Greeks, who distinguished between episteme, theoretical knowledge, and tekhne, tools, methods for achieving results.
