I was recently invited to provide an intro to digital humanities tools and methods to an English graduate class. This article is a transcript of that presentation (slides are on Prezi and are available here), with minor edits to accommodate text-based webbiness. If you don’t want to read the epic text, you can jump right to my public Zotero collection of resources.
Let’s start with Judy Malloy, whose childhood fascination with painting and visual art led to a degree in literature in the early 1960s, work in studio art and art history, and a job at the Library of Congress. In the 1970s her passion and her work as, essentially, a librarian and a technician led to the creation of a series of artists’ books. Judy was interested in how human memory is fragmented, repetitious, and non-sequential, and her art books explored that.
For example, some of them used library catalog cards to combine text and imagery in non-linear ways, letting people craft their own paths through material. In the early 80s she played with battery-operated address books that displayed images or text when buttons were pressed.
At about the same time she started working with a personal computer and started playing with databases; she realized that she could use them to further her work in non-sequential narratives. In 1986, Judy created Uncle Roger – a project that became, as far as we know, the first-ever commercially-published piece of electronic literature.
I’m in the process of introducing the “digital humanities,” and I began without mentioning the term because I actually don’t like defining it. In literature circles, where some say the field was born, there is a lot of pushback on the whole concept. “Defining the digital humanities” is really a field unto itself. There are whole books written on the subject. I would be foolish and arrogant to provide a concise definition now.
When the term emerged (somewhere around 2001 or so, as an evolution from other terms like humanities computing), the notion of “digital” content and tools was still somewhat novel in the humanities and social sciences. Today almost everything is digital, almost by default: in fact, some say we now live in a post-digital society.
Some define “digital humanities” as a way of bringing computers and technology to the humanities. I have an engineering and high tech background, though, and I disagree. To me, the field is actually about bringing the arts, humanities and social sciences to the world of technology. The order matters.
That is, after all, what artists like Judy Malloy have done for their entire careers. She has questions and curiosities, and she explores them through her work and her art. Her questions might be things like:
- What is reading?
- What more can you do with storytelling when you have interactive, visual interfaces?
- What gets lost when it’s not in a book, or when you’re not even sure how the thing works? What does it mean to explore the interface to a story?
- How do the form of the story and the story relate?
These questions are deep. They illustrate the complex relationships between technology and human thought. What I really want to do with this article is to encourage you to play, and to explore your own deep thoughts and questions.
There are three things I hope this session will allow you to do:
- Understand the scope of digital methods and their impact on contemporary scholarship
- Give examples of digital methods applicable to English graduate work
- Explain where digital methods can fit into your research process
Where does digital fit?
Digital humanities is broad and boundless. It comes with a blessing: you’re welcome to explore and use its methods anywhere and everywhere. I’m about to give you a whirlwind overview, and then afterwards I’ll circle back and focus on things that are easy to jump into and play with.
What about using “digital” as input?
You can collect digital data from digital archives, or from other digital sources such as websites or social media platforms like Twitter or Reddit. You can take “analog” sources, like old physical manuscripts, and convert them to digital versions. This could be simple scanning, or it could mean using techniques like text recognition to pull text from images. You can even transcribe data from one format to another, like taking the audio from a podcast and converting it to text.
There is also using “digital” as output
You could compose interactive digital fiction. Write content over time on a blog. You could create a multimedia piece that includes video. Record and distribute a podcast. Curate an online archive or a collection.
Those are just some examples, but what is nice about these forms of “output” is that you can make them public and encourage people to respond and participate.
Working with text
There is a long list of techniques here, some of which you probably already know. Popular ones include:
- close reading supported by technical tools
- distant reading by processing large amounts of text to explore patterns and structures.
- You can create visualizations that summarize a text:
- You can create maps to:
You can examine material from almost any perspective, remix it to create new material, or generate new narratives.
But wait, there’s more!
You can collaborate with other people on any of the things I’ve mentioned so far:
- create shared annotations and commentary on close readings of text
- manage projects with people from around the world
- use collaborative authoring tools to get feedback and commentary on your work as you’re doing it.
Moreover, we can always examine the pros and cons of the methods we’re using, or try variations, or critique existing work. In fact, one of the biggest anxieties in the digital humanities is over how easy it is to create projects that never end because they keep creating questions and possibilities.
At every step of any form of research process you can imagine, there are methods you can try and tools you can use.
This leads us to what is common to all this work. Josh Honn has put forth a set of “values” for all of this, based on some earlier work by Lisa Spiro. These values are part of a much larger shift in how people are thinking about scholarship and its connections to society.
- The field is experimental and somewhat playful. It encourages looping back and revisiting things. It encourages and celebrates trying things out and, yes, failing.
- It is intensely collaborative. Though the tools I’m going to show you are intended to be easy to use, you’ll see that many of them encourage working with others. And if you have the pleasure of getting involved with larger projects, you’ll see that it encourages working across disciplines.
- This work asks us to think about what scholarship looks like. Does it have to be a 300-page dissertation? Or can it be non-linear? In the form of a podcast, or a multimedia piece, or a virtual reality video game? If it’s not written out, how can it be peer reviewed?
- It leans towards openness and accessibility. This work is usually intended to be shared. More and more, it is crafted in a way that is meant to be understood and used by the general public. People want their work to be remixed and reused. They want their colleagues to interact and engage and respond.
- And of course, it’s critical and grounded in theory, just like the humanities have always been. Existing theories can be applied; new theories can be developed; and everything can be constantly interrogated to examine what is absent, what is under- or over-represented.
…and the dark side
If all of this sounds a little too optimistic and utopian, you have good company.
Novelist Stephen Marche wrote a critical essay in 2012 called Literature is Not Data. Though he doesn’t hate practitioners, he fears that using algorithms to look at literature is reductive; it prefers examining the forest over appreciating the trees. Timothy Brennan echoed this in 2017, and added that we tend to get too excited about technology and all the awesome things it can do, without asking “why” enough.
There has also recently been the claim that these methods are inherently neoliberal, celebrated because technology is finally creating “practical” value from the humanities by creating marketable technology projects and finding “efficiency” through the use of data.
More recently, Autumn Koe-Schnell celebrated the work of feminist and indigenous scholars, but pointed out that there is still much more inclusion required.
I am not going to contest any of these claims. I think they’re all valid and demand thoughtful discussion, the last point especially. Most of the time when people talk about the “history” of digital humanities they start with Father Busa. He started building a concordance of the works of Thomas Aquinas in 1949. But guess what? Photos of the people who actually did the bulk of the work (Italian women) were always in Busa’s personal archive but weren’t published until Melissa Terras posted them on her blog in 2013.
“Easy” methods overview
Let’s get specific, now. Since this is a research methods class and the aim is to incorporate some digital components into your work, I’m going to narrow all of this down to a few core methods for English studies, with a focus on methods that are easy to incorporate.
When I say “easy” I don’t mean trivial, but that my focus will be on tools that do not require you to be a skilled programmer or need a whole other course to explore. What I really mean is that these are methods that are “ready to go,” as in there are some really great tools that you can jump right into using.
The first area I’ll explore is annotation, which can help with “close reading,” but is also related to more specific methods like text encoding. Annotated editions of texts have always been popular, but web-based technologies, in particular, have made these techniques more approachable and collaborative. I have several examples to share here.
Poetry genius – genius.com – has turned into more of a song lyrics site, but it also features poetry. Lyrics are posted and anyone can add annotations to them, which are rated by other users on the site.
Another really interesting example is Hypothes.is, which does similar things but does it as an “overlay” on top of ANY document on the web, even PDF files. You can respond to other people’s annotations, add links and images, or even make annotations open for public response.
It would be bad for me not to mention the Text Encoding Initiative (TEI), which is all about making texts machine-readable. It’s actually a whole framework, with different schemas developed for different fields of study. There are tools you can use to encode text. Working with the tools is a great exercise, and there are many projects that exist just to create encodings for other people to use. Generally, though, text is encoded so that it can be processed further in other ways. Many, many, many projects rely on TEI to work.
One example is the Confederation Debates project from UVic, which uses TEI-encoded documents to explore the discourse and debate around the formation of Canada.
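To make “machine-readable” a little more concrete, here is a minimal sketch in Python of what processing an encoded text can look like. The fragment is a made-up illustration, not a complete or valid TEI document (a real one also carries a header and much more), and the `extract_entities` helper is a hypothetical name of my own. The element names (`persName`, `title`, `date`) and the namespace are standard TEI, and the parsing uses only Python’s built-in XML library.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; every TEI element lives under it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A tiny, simplified fragment for illustration only (assumption: a real
# TEI file would also include a <teiHeader> and far richer markup).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p><persName>Judy Malloy</persName> created
         <title>Uncle Roger</title> in
         <date when="1986">1986</date>.</p>
    </body>
  </text>
</TEI>"""


def extract_entities(xml_text):
    """Collect the people, titles, and dates that the encoder tagged."""
    root = ET.fromstring(xml_text)
    return {
        "people": [el.text for el in root.iterfind(".//tei:persName", TEI_NS)],
        "titles": [el.text for el in root.iterfind(".//tei:title", TEI_NS)],
        # The "when" attribute holds the normalized form of the date.
        "dates": [el.get("when") for el in root.iterfind(".//tei:date", TEI_NS)],
    }


if __name__ == "__main__":
    print(extract_entities(SAMPLE))
```

The point is that once a human has marked up who, what, and when, a few lines of code can pull those things back out of thousands of documents at once.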
The annotation systems I’ve mentioned are collaborative (in fact, more and more online tools are by default), but here I want to highlight two tools intended specifically for collaboration.
Take a look at Humanities Commons, which serves as a hub for communication between practitioners. Not only can you connect with people doing similar work, but you can share files. The real reason I love this site, though, is that it implements a tool called “Commons in a Box” that allows for really interesting collaborative authoring.
One example is Kathleen Fitzpatrick’s new book, Generous Thinking, which she wrote in a totally open way. All of the content was visible online as it was being written, and she invited scholars to read, comment, and critique her writing as she progressed. The result was a kind of “peer review” that took place while the book was being written, which is part of the larger debate about new forms of open, public scholarship.
A general resource for connecting scholars is Slack, which is a terrific online communications platform. There is an open DH community on Slack, with more than 50 channels dedicated to specific topics including text analysis, text encoding, and visualization.
Visualization and Mapping
You can do some really interesting things with visual explorations of text. For today I’ve lumped together general visualizations with mapping. This may be seen as a horrific decision, because each is a huge topic full of opportunities and options, but what unites them is their exploration of sight and space.
In this category we have tools like infographics, which can be used to present plot summaries, thematic analyses, critiques, etc. in a condensed way. We have creative visualizations, where some aspect of the text is used to inspire or to directly generate visual interpretations. And then we have a whole universe of spatial mapping projects, where some aspects of literature are placed on maps.
An advanced example of how far you can take this work is a Geographic Information Systems (GIS) project where the paths of two travel writers were mapped out. The researchers compared the travels of the writers and examined how topography and place names may have influenced the routes they took and the choices they made.
There are some accessible visualization tools I’d love to highlight here.
Next, a relatively easy and fun way to explore map-based visualizations is with ESRI StoryMaps. The current version of the tool gives you eight different “templates” for creating narratives that incorporate images and maps, and there’s a massive gallery of examples online. There’s a beta version of an updated tool that is even more flexible.
As with distant reading and text analysis work, there are pitfalls with information visualization and mapping that are rooted in bias and subjectivity. There’s an excellent article in the most recent Digital Humanities Quarterly that explores the ethics of data visualization, and it does a really great job of breaking down the kinds of misrepresentations that can easily creep into information visualizations.
Distant reading is a way of working with text whose foundations have largely been associated with Franco Moretti. Moretti (who founded the Stanford Literary Lab but is recently more controversial for other reasons) argues that you can learn a lot more about texts by aggregating them and studying them en masse. It could be done at the level of a single text, like a book, but it can also be done across an entire corpus of text: all the works of Jane Eyre, collected works of 20th Century Irish playwrights; “tragedy”…. You name it. Source texts can be almost anything you define.
With distant reading, scholars are largely interested in patterns. What ideas appear in a text? What words are used to describe those ideas? What feelings does the text evoke? How does language, idea or sentiment change throughout the text?
There are many different methods you can use to do distant reading work: text analysis, sentiment analysis, clustering, etc. A very popular and easy-to-use tool for exploring this kind of work is Voyant Tools, available for free on the web and developed by Stefan Sinclair and Geoffrey Rockwell. It gives you dozens of tools for analyzing documents or groups of documents, visualization tools, and statistics.
Every tool has “help” text that describes what the tool does, and the full documentation goes into the tools in detail. The web-based system has Shakespeare’s canon and Jane Austen’s books built in, but of course you can give it almost any text. It has excellent documentation and lets you dig into text analysis quickly.
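If you are curious what “looking for patterns” means at its very simplest, here is a toy sketch in Python. To be clear, this is not how Voyant is implemented; it is just my own illustration of two basic ideas behind this kind of work: counting the most frequent words, and tracking how often a word appears across successive slices of a text. The sample sentence and the function names (`tokenize`, `top_words`, `term_trend`) are invented for the example, and the tokenizer is deliberately crude (lowercase ASCII letters and apostrophes only).

```python
import re
from collections import Counter

# A made-up sample text, just for illustration.
SAMPLE = (
    "The sea was calm. The sea was grey. Then the storm came, "
    "and the storm grew. The storm broke the calm."
)


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(text, n=5, stopwords=frozenset()):
    """Return the n most frequent words, skipping any stopwords."""
    counts = Counter(t for t in tokenize(text) if t not in stopwords)
    return counts.most_common(n)


def term_trend(text, term, segments=4):
    """Count occurrences of `term` in each of `segments` consecutive slices."""
    tokens = tokenize(text)
    total = len(tokens)
    trend = []
    for k in range(segments):
        # Proportional boundaries, so we always get exactly `segments` slices.
        start = k * total // segments
        end = (k + 1) * total // segments
        trend.append(tokens[start:end].count(term.lower()))
    return trend


if __name__ == "__main__":
    common = frozenset({"the", "was", "and", "then"})
    print(top_words(SAMPLE, n=3, stopwords=common))
    print(term_trend(SAMPLE, "storm", segments=3))
    print(term_trend(SAMPLE, "sea", segments=3))
```

For the sample sentence, `term_trend(SAMPLE, "storm", segments=3)` gives `[0, 1, 2]` and the same call for “sea” gives `[2, 0, 0]`: the sea fades out as the storm rolls in. Notice how many choices are already baked in (what counts as a word, which words to ignore, how many slices to use), which is exactly where the critiques of the method come in.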
There are many, many critiques of distant reading as a method, ranging from the biases introduced by the selection of texts for a corpus, to questions of how scale influences analysis, to the perception that quantifying text in this way makes its observations objective and “scientific.”
Composition & Publishing
The concept of collaboration leads naturally to publishing and composition, which also opens up scholarly work and scholarship to new modes of writing and new forms of engagement. I’ll start off by pointing out that there are more and more free services that make it easy to create websites, and those are great for communicating about projects. If you want to do something in the form of a blog, for example, you can create a free site at WordPress. Wix and Weebly are also very popular, as long as you don’t mind having some of their branding and ads visible on their “free” offerings. If you’ve never made a webpage before, you’ll be blown away by how easy it is to build one at sites like these. You can really focus on content instead of messing with code.
Let’s say you want to create a scholarly publication but want to do it in a non-linear and interactive way. Scalar is an incredible tool for this, and is also very easy to use, but I’d definitely recommend looking at some tutorials before you jump in because of the way it handles information. You can create any number of “paths” through your content, or let people wander through it by branching off using keywords.
Scalar is one of those tools that can help inspire and inform new ways of thinking about publishing, and is a good example of how tool, method and methodology can blur in the digital humanities. Getting in and playing with Scalar on a simple project is a great way to explore what’s possible: the more you work with the platform and tinker with it, the more you will think about different ways of organizing and presenting material.
One example I want to show here is Jason Mittell’s Complex TV, which uses multimedia elements and non-linear organization as a platform to discuss complex narratives in television. Scalar was designed by scholars, for scholars, so it supports complete academic citations, Dublin Core metadata for content, and independent version control for every object.
Another fantastic tool, more in the realm of digital collections, is Omeka. There is a free version you can use on Omeka.net for simple projects, or you can set it up on a server for more complex projects. The free online service lets you create digital exhibits and simple web pages… but if you have access to a full server you can add modules that support maps, timelines, and more.
One Omeka project example is the University of Alberta’s online exhibition of Tinctor’s Foul Treatise, which is a 15th century manuscript intended to convince French countrymen to hunt down and persecute witches.
I have one more tool to show you that is designed to help create non-linear, interactive stories. You can create scholarly content with it, but it’s also one of many tools for creating e-lit (e-literature) or digital fiction.
Twine is an open-source tool that is being used by many digital fiction writers. Some Twine projects are called “games,” because they can incorporate some neat interactive features that blur the line between literature and gaming. Here is one quick (literally) example: Queers in Love at the End of the World, by anna anthropy, lets you decide what you want to tell your love when the world is ending. As soon as you start, a timer ticks down from 10 seconds to zero. The pressure of time influences choices and the story makes its point very clearly.
Twine is unbelievably flexible and powerful, and easier to learn than you might think. It’s also gone quite mainstream: if you watched the Black Mirror: Bandersnatch interactive film on Netflix, you should know its branching script was written with the help of Twine.
One of the biggest concerns about all of these systems, though they are very powerful and flexible, is that it’s quite easy for a simple project to explode into a very large one. As an example, two of my colleagues worked together to build a short interactive story in Twine, just to demonstrate it, and ended up spending about 40 hours developing and working through all the possible paths of a simple narrative. My recommendation, with all of these tools, is that if you want to try them out, keep. It. Simple.
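To see why small branching projects balloon, it helps to count. Below is a rough sketch in Python that treats a story as a set of passages, each linking to the passages you can reach next, and counts every distinct route from the start to an ending. This is my own toy model, not Twine’s file format or code; the passage names and the `count_paths` and `binary_story` helpers are hypothetical, and the counting assumes the story never loops back on itself.

```python
from functools import lru_cache

# A made-up story: each passage maps to the passages it links to.
# Passages with no links are endings.
STORY = {
    "start": ["stay", "leave"],
    "stay": ["speak", "silence"],
    "leave": ["return"],
    "return": ["speak"],
    "speak": [],
    "silence": [],
}


def count_paths(story, start):
    """Count distinct routes from `start` to any ending (no loops assumed)."""

    @lru_cache(maxsize=None)
    def walk(passage):
        choices = story[passage]
        if not choices:
            return 1  # reached an ending: exactly one way to finish here
        return sum(walk(nxt) for nxt in choices)

    return walk(start)


def binary_story(depth):
    """Build a story where every passage offers two choices, `depth` times."""
    story = {}
    frontier = ["s"]
    for _ in range(depth):
        next_frontier = []
        for passage in frontier:
            children = [passage + "a", passage + "b"]
            story[passage] = children
            next_frontier.extend(children)
        frontier = next_frontier
    for passage in frontier:
        story[passage] = []  # the last layer holds the endings
    return story


if __name__ == "__main__":
    print(count_paths(STORY, "start"))
    for depth in (3, 5, 10):
        print(depth, count_paths(binary_story(depth), "s"))
```

The little story above has only three routes through it, but a story with just ten two-way choices in a row, where no branches ever merge, already has 1,024 endings to write and test. Merging branches back together is how authors keep this under control.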
If what I’ve talked about so far is the tip of the iceberg, this will give you a sense of the rest of it. I want to show you TAPOR (the Text Analysis Portal for Research). The site features a large collection of tools for working with text. The categories on the right are aligned with common methodologies, and when you click on one you get a set of related tools that you can browse through. There’s also a search page you can use to search for specific tools. This is a crowdsourced site, so I encourage you to log in there, rate the tools you’ve worked with, add new ones you know of, etc. Note that the site also serves as a historical record of tools that are no longer used, so be aware of that when searching.
I started out with the role of digital technology in the humanities, moved into a very high-level and very fast look at aspects you might want to explore, and then circled back to talk about some specific methods that may be of immediate use to you.
It was lengthy, and possibly even overwhelming, but my hope is that one or two things might have piqued your interest and encouraged you to dive in and play. I have created a public Zotero collection related to this talk, containing a list of tools, tutorials, theoretical papers, project examples, and more. You need to register for a free account to get at it (sorry), but you are free to steal from and contribute to the collection.
The range of things you can do is vast, and it might seem daunting. However, some of the most popular tools (like Voyant) are popular because they’re accessible, easy to learn, and often fun to play with. So don’t be shy. Explore.