When the humanities meet big data

Jacob Turcotte/Staff

How did four researchers read through 40,000 transcripts of speeches made during the early years of the French Revolution? They let a computer do it. A study published in the Proceedings of the National Academy of Sciences last month describes how specialists with backgrounds in informatics, European history, and astrophysics joined forces to develop a machine-learning program that quantifies the novelty and persistence of speech patterns. Their findings illustrate how new ways of thinking about governance emerged and spread at the dawn of our modern political era. The project is just one of many in an emerging field known as digital humanities, which brings scientists and humanists together to tackle questions in history, art, and literature from an entirely new perspective. “There’s no way that a single academic could have read all 10,000 bad pulpy novels published in the 19th century,” says Indiana University historian Rebecca Spang. “So you could ask different kinds of questions because you get different kinds of information.”

Why We Wrote This

New readings of history often reveal previously hidden insights. By enlisting computers to analyze historical texts, historians are spotting patterns in language that were once invisible.

Being a voracious reader is a prerequisite for academics in the humanities, but even the most dedicated bookworm needs to eat, sleep, and socialize.

Not so for computers, which are known for being tireless, thorough, and very fast. And, when asked the right kinds of questions, these electronic speed-readers can grasp patterns that would otherwise lie beyond the reach of human scholars.

That’s exactly what happened when a team of researchers used machine-learning techniques to plow through transcripts of 40,000 speeches in a parliamentary assembly during the first two years of the French Revolution, according to a paper published in the Proceedings of the National Academy of Sciences last month. By quantifying the novelty of speech patterns and the extent to which those patterns were copied by subsequent speakers, the researchers illustrated how much of the important intellectual work of the revolution was initially carried out in committees, rather than in the whole assembly.

“We’re really getting a quantitative sense of large-scale patterns,” says co-author Simon DeDeo, a professor at Carnegie Mellon University and the Santa Fe Institute, a research center in New Mexico that specializes in complexity science. “There’s a lot of data here. You couldn’t have run this on a machine from 2000 or 2005.... Now you can do this on a desktop.”

Professor DeDeo received his doctorate from Princeton University in 2005 – not in European history, but in astrophysics. That was the tail of an inflationary period in DeDeo’s chosen field, and opportunities to tackle cosmology’s big questions were dwindling. “It was the end of the golden age,” he says. “I went off [and] I spent some time at the Santa Fe Institute, and that’s where I kind of converted into whatever I am now.”

The academy still hasn’t quite settled on a name for what DeDeo does, but the leading contender is “digital humanities,” a term that captures the field’s deeply interdisciplinary approach. Other digital humanities projects have brought together historians, librarians, literary critics, mathematicians, and computer scientists to analyze the complete works of Shakespeare, Time magazine covers, the ancient graffiti of Pompeii, and one million pages of Japanese manga.

“One of the exciting things is, can the humanities and the sciences team up?” DeDeo asks. “There’s a huge amount of knowledge and wisdom that the humanists have that the scientists don’t.”

Digital humanities can be traced to beginnings that are as diverse as the disciplines of its practitioners. One influential figure was Roberto Busa, an Italian Jesuit priest who, in the 1940s, began rendering the works of St. Thomas Aquinas into a machine-readable format. Another is Franco Moretti, a Marxist-trained Italian literary critic who argues that understanding literature comes not from a close reading of the literary canon – literature’s equivalent to the one percent – but from a “distant reading” of the entire corpus.

Whether inspired by Thomistic completism, Marxist inclusivity, or something else entirely, digital humanities holds the potential to shift the way we look at history. “There’s no way that a single academic could have read all 10,000 bad pulpy novels published in the 19th century,” says Indiana University historian Rebecca Spang, a co-author on the French Revolution paper. “So you could ask different kinds of questions because you get different kinds of information.”

In the case of the French parliamentary assembly analysis, researchers found that, unlike Democrats and Republicans today, the bourgeoisie and the aristocrats tended to use the same language patterns. “There isn’t a sort of discursive spectrum that we can identify,” Professor Spang says, “where you’ve got speakers on the right who use one vocabulary and the speakers on the left using another.”

Distant reading also results in a different understanding of the subject matter, one that is more holistic but also stands at a greater remove.

From the point of view of the computer, says Professor Spang, “it doesn’t matter what ‘ghijk’ means or says, just that it’s not ‘abcdef.’”

“This kind of work is not going to give us a kind of emotionally or narratively satisfying historical explanation,” says David Andress, a historian at the University of Portsmouth in Britain and an expert on the French Revolution, “but it’s certainly going to show us things that we then have to explain, that we then have to explore why we’ve got that result.”

This explanatory gap is why Dr. Andress doesn’t see digital humanities as a threat to traditional scholarship. “The readers of history and the general public are always going to want to have the story told to them in terms of people,” he says.

[Editor's note: An earlier version misstated the year DeDeo was awarded his doctorate.]
