As anticipated on this very blog, I recently spent a week in Indianapolis attending a workshop on computational text analysis at HILT 2016. We spent our time surveying a number of different tools, techniques, and concepts related to text analysis, so I walked away with a greater appreciation for data cleaning, Weka, HathiTrust, metadata, Python, and much more. The most frustrating part of the workshop was that we visited each topic so briefly and that we had so few opportunities to apply these techniques to our own work. I can’t fault the workshop organizers for these decisions (helping participants take a dozen wildly different datasets through deep dives into a particular technique would have been difficult), but I was excited enough by a lot of the concepts we covered that I was itching to try them out myself.
This was most true of topic modeling, a technique for identifying different “topics” (or themes, or discourses, or…) in the documents of a particular corpus. As we tried out this technique on a corpus of slave narratives, I was amazed at how an algorithm was able to tease out what seemed to be clearly distinct themes within and across these narratives. One of our instructors warned us against being too impressed, explaining that the underlying math was actually really simple. He certainly had a point, and I know the importance of not being blindly wowed by what an algorithm seems to do, but refusing to think of topic modeling as amazing because it really comes down to conditional probabilities seemed to me akin to refusing to recognize the wonder of the French language because, at its roots, it’s an arbitrary collection of mouth sounds.
That said, neither French nor topic modeling can be really useful or truly amazing for me unless I spend some time figuring out how it works. I went to HILT hoping to learn a couple of neat tricks, but I came away convinced that topic modeling could have some real value for me.
Over the past few weeks, I’ve added a number of references to topic modeling to my notebook full of dissertation brainstorming scribbles, and over the next few months, I hope to learn more about the process, dive deeper into the details, and make this a part of the work that I do.
In a few days, I’ll be heading down to Indianapolis to attend a workshop on computational text analysis that runs from June 13th through June 16th. This shouldn’t come as too much of a surprise to anyone who’s had to listen to me ramble about my research: I’ve been waist-deep in R for a while now, there’s a digital methods category on my research blog, and I certainly haven’t been shy about wanting to pick up some text analysis skills to add to my repertoire. The one thing that could turn heads, though, is that I’m a PhD candidate in an Educational Psychology and Educational Technology program headed to a Humanities Intensive Learning and Teaching workshop. In many ways, though, this venture into Digital Humanities really isn’t that surprising. When, as a junior at Brigham Young University, it came time to ditch my major and find a new one, I was sorely tempted by Computer Science before settling on French Teaching. When, a year later, I learned that BYU’s CHum minor (which has since been replaced by a DH one) might let me combine both fields, it was tempting (but ultimately impractical) to change course again. To top it all off, all that work I did typing up all the dialogue in Astérix chez les Helvètes for a friend’s MA thesis looks, in hindsight, like it was going to be used for a text analysis not too different from the ones I’ll be learning to do next week. I even managed to attend some sessions at HASTAC last year under the not-entirely-wrong impression that it was an educational technology conference like the ones I’d been attending: I’d first heard of HASTAC because of their work with Mozilla Open Badges, one of the ideas that first piqued my interest in ed tech. So, maybe HILT is the natural next step given this series of near-misses with the field of digital humanities.
Or, more accurately, maybe this is part of my realization not only that an educational technology researcher can move in DH circles but also that I’ve been doing so for a while and would benefit from taking a closer look. After all, I’m constantly clamoring for education departments not to forget history, languages, and literature as they focus ever more on STEM, the MSU DH Slack has been a great resource for me to lurk in, and there’s something about the awe and wonder of studying teachers’ use of Twitter that reminds me a lot of the feeling I got studying French history and culture as an undergrad. I’ve still got a foot planted in the humanities, and my research is increasingly being defined by the digital, so going to HILT, taking a DH seminar in the fall, and continuing to explore this new world (for me) is probably the way to go.