This semester I encountered an interesting scenario in an undergraduate student essay. The student had counted the occurrences of two words in a poem and concluded that, because the counts for word 1 and word 2 were close, the two concepts the words expressed were thematically related in the poem.
What the student got wrong:
- The student failed to get the counts right because she did a simple word search that did not account for morphological/spelling variants (this was a Middle English text) or close synonyms.
- Assuming that two concepts are thematically related because the words that express them occur in nearly equal numbers in a text is a logical fallacy.
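The counting problem in the first point can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the variant lists are hypothetical (a real study would take them from a dictionary or concordance), the sample line is invented rather than quoted from any poem, and the function names are my own.

```python
import re
from collections import Counter

# Hypothetical variant lists, for illustration only. A real study would
# draw these from a dictionary or concordance, not from guesswork.
VARIANTS = {
    "trouthe": {"trouthe", "trowthe", "treuthe", "trawthe"},
    "love": {"love", "loue", "luf", "luve"},
}


def tokenize(text):
    """Lower-case the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def exact_counts(text, headwords):
    """Count only tokens identical to each headword (a naive search)."""
    tokens = Counter(tokenize(text))
    return {h: tokens[h] for h in headwords}


def variant_counts(text, variants):
    """Count every listed spelling variant under its headword."""
    tokens = Counter(tokenize(text))
    return {h: sum(tokens[f] for f in forms) for h, forms in variants.items()}


# Invented sample line, not a quotation from any real poem.
sample = "Trouthe and loue; his trawthe, hir luf, and love of treuthe."

print(exact_counts(sample, VARIANTS))
print(variant_counts(sample, VARIANTS))
```

On this invented line the naive search finds each headword once, while the variant-aware count finds three occurrences of each. Note that even a corrected count does nothing to address the second point: equal frequencies remain no evidence of a thematic relationship.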
What the student got right:
- The student tried to apply a quantitative method to understand the text in a new way.
Of course, there is nothing specifically “DH” about what the student did, other than using a browser’s search function. Before the days when this was possible, the technique would have been called “philology”, and that too is a good thing. Whether or not the student’s inspiration was facilitated by the availability of a digital tool, I was really delighted to see a student using, even in a very limited way, a methodology that is very unfamiliar to most literature students these days.
The fact that the student’s application of this methodology was not very sophisticated or critical (the student may not even have been aware of using a methodology) is largely the result of a lack of training. The title of this post derives from my thinking about the broader implications of this lack of training. If students enter the workforce with the impulse to analyse quantitatively but without having learned how to do so critically, the resulting misapplications of quantitative methods could have dire consequences. As for the “Humanities” side of Digital Humanities, it is quite clear that many types of cultural objects and experiences may be analysed quantitatively using tools familiar from the natural or social sciences, but not using the same methodologies. A student with experience in the Digital Humanities receives the best training for assessing (or creating) the methods appropriate to these types of analyses.
Update: Some of the issues raised above resonate in the discussion surrounding David Brooks’ recent op-ed “What Our Words Tell Us” (New York Times, 20 May, 2013). The problems with using Google ngrams have been much discussed, but the responses to this piece seem to make for a nicely self-contained teaching unit. Ted Underwood’s “How not to do things with words” is actually almost a year older than the Brooks piece but deals with two of the three articles cited by Brooks from a methodological point of view, particularly their use of “present-day patterns of association to define a wordlist that they then take as an index of the fortunes of some concept (morality, individualism, etc) over historical time.” Mark Liberman’s “Ngram Morality” draws attention to the ideological premises underlying the research of all three articles Brooks summarises. I’m going to try to make these articles the focus of discussion on the use of quantitative evidence in my Digital Humanities class next semester.