Title: Digging Into Data White Paper: Trading Consequences
Publication Type: Report
Year of Publication: 2014
Authors: Klein, Ewan; Alex, Beatrice; Grover, Claire; Tobin, Richard; Coates, Colin; Clifford, Jim; Quigley, Aaron; Hinrichs, Uta; Reid, James; Osborne, Nicola; Fieldhouse, Ian
Institution: Trading Consequences Project

Scholars interested in nineteenth-century global economic history face a voluminous historical record. Conventional approaches to primary source research on the economic and environmental implications of globalised commodity flows typically restrict researchers to specific locations or a small handful of commodities. By taking advantage of cutting-edge computational tools, the Trading Consequences project was able to work with much larger data sets than conventional historical research allows, and it thereby provides historians with the means to develop new data-driven research questions. In particular, the project has demonstrated that text mining techniques applied to tens of thousands of documents about nineteenth-century commodity trading can yield a novel understanding of how economic forces connected distant places all over the globe and how efforts to generate wealth from natural resources affected local environments.

The large-scale findings that result from the application of these new methodologies would be barely feasible using conventional research methods. Moreover, the project vividly demonstrates how the digital humanities can benefit from trans-disciplinary collaboration between humanists, computational linguists and information visualisation experts.

Important facets of this project include:

  • After considerable difficulty and lengthy negotiations, we acquired significantly more historical documents than we originally expected. The full corpus exceeds 7 billion word tokens, which is very large by the standards of humanities research.
  • Lexicon creation proved to be one of the most challenging and interesting aspects of the project, requiring interdisciplinary skills in archival research, linked data, text mining and knowledge of the historical context.
  • The project has identified almost 2,000 commodities that were regularly traded in the nineteenth century, two orders of magnitude more than historians typically study.
  • Historical sources that have undergone Optical Character Recognition (OCR) are challenging to process and this, in combination with the particular questions asked by historians, required the text mining team to develop new approaches and new text processing tools for the project.
  • The geospatial nature of the data lent itself well to an interactive visualisation that displays commodities in relation to locations on a world map. The same commodities can also be visualised on a timeline to show how trading evolved over the nineteenth century.
  • The relational database and visualisation software are well advanced and ready for use in historical research. The database can be used by historians for unguided research aimed at developing new research questions and identifying crucial primary source texts related to a specific commodity.
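
As a rough illustration of the lexicon-driven idea described in the list above, the following sketch counts commodity mentions per year in a handful of texts. It is not the project's pipeline: the lexicon entries, the sample documents, and the function names `build_pattern` and `count_mentions` are all hypothetical, and the sketch makes no attempt to handle OCR errors or to resolve place names, both of which the project addressed with its own tools.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: surface form (lower case) -> canonical commodity.
LEXICON = {
    "cinchona": "cinchona",
    "peruvian bark": "cinchona",
    "palm oil": "palm oil",
    "timber": "timber",
}

# Hypothetical sample documents: (publication year, text).
DOCUMENTS = [
    (1850, "Exports of Palm Oil from Lagos rose; timber was scarce."),
    (1850, "Peruvian bark and cinchona were shipped to London."),
    (1875, "Palm oil prices fell, while timber from Canada increased."),
]


def build_pattern(lexicon):
    """Compile one case-insensitive regex matching any lexicon surface form."""
    # Longest forms first, so multi-word terms are preferred over shorter ones.
    forms = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(form) for form in forms)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


def count_mentions(documents, lexicon):
    """Return a Counter keyed by (canonical commodity, year)."""
    pattern = build_pattern(lexicon)
    counts = Counter()
    for year, text in documents:
        for match in pattern.finditer(text):
            commodity = lexicon[match.group(1).lower()]
            counts[(commodity, year)] += 1
    return counts


counts = count_mentions(DOCUMENTS, LEXICON)
for (commodity, year), n in sorted(counts.items()):
    print(f"{year}  {commodity:<10} {n}")
```

Counts keyed by commodity and year are the kind of table that could feed a timeline view; adding a location column would be the analogous step towards a map view.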