The future of history
Dr Rachel Murphy, of University of Limerick’s Department of History, considers the impact artificial intelligence is having on digital historians and looks at what the future may hold
Dr Rachel Murphy, Department of History. Pictures: Sean Curtin/True Media
Digital historians incorporate digital tools and technologies into their work to analyse, synthesise and present their data; adding Artificial Intelligence into the mix could lead to some exciting outcomes.
Historians have used computers in their research since the 1960s, but those early adopters were very much a minority. Now, most historians engage with the “digital” in some way.
New findings in our field are increasingly published as online articles, or open access books, making this work available to more people.
Through metadata and underlying search algorithms, we can explore library catalogues, locating a wealth of studies and data collections.
Some primary sources we work with are digitised: entire newspaper collections and other printed works can be digitally imaged and converted into machine-readable text.
Digitised handwritten sources such as letters and census returns can be transcribed by historians, specialist agencies or through crowdsourcing. But for every digitised source, there are many more located in archives and other repositories.
Conversely, the advent of email, digital photography, and social media has led to a proliferation of ‘born digital’ sources.
Large datasets of digitised historical materials are called ‘historical big data’, and they lead historians and others to formulate new research questions and identify new patterns in data.
Digital historians often collaborate with colleagues from other disciplines, conducting linguistic, spatial, demographic, and social network analysis.
We also work with computer scientists to develop new specialised tools. Thanks to this open, collaborative, and interdisciplinary focus, historical big data is increasingly used to inform topics of current interest such as climate change, health, wealth formation and distribution, and migration flows.
AI and history
The term Artificial Intelligence (AI) was coined in the 1950s by John McCarthy, then at Dartmouth College, but it has only recently entered popular discourse.
Together with Machine Learning and its many variants, AI has brought a new level of investigative and even predictive power to historical datasets and documents.
A key skill of historians is to clearly articulate our research; we teach our students to write precisely and carefully. Generative tools like ChatGPT are a concern for a range of reasons including plagiarism, but this is just one, mostly negative, aspect of AI.
From an information science perspective, generative AI can provide fully referenced summaries of large bodies of scientific literature based on metadata and abstracts. It can also identify experts in specific topics.
Such tools can help us to forge new, often interdisciplinary, connections, paving the way for discovering future research potential.
There will of course be inaccuracies, so expertise and curation remain key to the management of sources: we still need the human mind to critique the information we are presented with.
Critical thinking is the foundation of a history degree, and its importance grows in a digital world.
AI tools for historians
AI has already been used in many large history projects.
Traditionally, transcription has been a time-consuming process for historians, but AI is increasingly used to read handwritten sources.
The names index that underpins the U.S. National Archives’ official 1950 census website was developed using Amazon Web Services’ artificial intelligence. There were errors, but volunteers assisted in correcting the data.
Genealogy companies Ancestry and FamilySearch provided an alternative offering, transcribing the same census using Ancestry’s AI handwriting recognition software.
It took just nine days to create over 150 million records, searchable not just on names, but on all fields. By comparison, manually transcribing the 1940 U.S. census took nine months. Again, volunteers checked for accuracy.
Transkribus, a popular AI platform, transcribes historical documents based on text recognition models (algorithms that are trained to read a particular script and language).
Machine learning can even read obscured historical texts: in October 2023 Nature reported that a machine-learning algorithm developed by Luke Farritor, a computer science student at the University of Nebraska-Lincoln, had successfully deciphered letters in unopened scrolls from Herculaneum.
Making an important contribution to art history, a multidisciplinary team led by Professor Hassan Ugail of the University of Bradford developed an algorithm that could authenticate Raphael’s works with 98 percent accuracy. Using AI they determined that Raphael had not painted Joseph’s face in the Madonna della Rosa.
Finally, Ithaca is a tool that uses deep learning to predict missing words from ancient inscriptions. Historians enter the inscription, with question marks representing missing words. Ithaca provides suggestions based on over 78,000 other inscriptions.
Ultimately the historian determines the most likely possibilities, but Ithaca assists by ranking suggestions in order of likelihood, based on where and when the inscriptions were created.
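For readers curious about what “ranking suggestions in order of likelihood” can look like in practice, here is a deliberately simple sketch. It is not Ithaca’s method: Ithaca uses deep learning and also draws on where and when an inscription was made, whereas this toy example only counts which words appear between the same neighbouring words in a tiny corpus. The corpus lines and the function name rank_candidates are invented for illustration.

```python
from collections import Counter

# Invented toy corpus (an assumption for illustration only); a real tool
# would learn from tens of thousands of transcribed inscriptions.
CORPUS = [
    "the council and the people decided",
    "the council and the people decided",
    "the council and the assembly decided",
    "the priests and the people gave thanks",
]


def rank_candidates(inscription, corpus=CORPUS):
    """Rank fillers for a single '?' word by how often each candidate
    occurs between the same neighbouring words in the corpus."""
    words = inscription.split()
    gap = words.index("?")
    before = words[gap - 1] if gap > 0 else None
    after = words[gap + 1] if gap + 1 < len(words) else None

    counts = Counter()
    for line in corpus:
        tokens = line.split()
        for i, token in enumerate(tokens):
            prev_tok = tokens[i - 1] if i > 0 else None
            next_tok = tokens[i + 1] if i + 1 < len(tokens) else None
            if prev_tok == before and next_tok == after:
                counts[token] += 1

    total = sum(counts.values())
    # Most frequent candidate first, each with its share of all matches.
    return [(word, count / total) for word, count in counts.most_common()]


ranking = rank_candidates("the council and the ? decided")
for word, share in ranking:
    print(f"{word}: {share:.2f}")
```

With this toy corpus the sketch lists “people” ahead of “assembly”. As with Ithaca, the ranked list is only a starting point: the historian still decides which reading is most plausible.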
These are just some of the many developments in AI for historians. But with these new techniques also come warnings.
Issues
Probably the greatest issue with using AI for historical research relates to ethical concerns.
We are trained to critique our sources and be aware of inherent biases.
Often, there is a lack of transparency around machine-learning algorithms. How was the information we are presented with selected? Are there inherent biases or errors? Which source material does the algorithm draw on?
Reputable sources will explain the datasets that they use.
Security is extremely important: both data and the algorithms themselves are susceptible to cyber-attacks. The Financial Times estimated that rebuilding the British Library’s digital services following the 2023 ransomware attack would cost £6-7m.
Likewise, AI algorithms themselves can be hacked, resulting in incorrect or biased outputs which could potentially go unnoticed for some time.
Other concerns relate to the use of data. It may be acceptable to use information that is out of copyright, but how is material that is still under copyright protected? What about attribution? Who owns the new data that is produced? All these aspects of AI must be addressed.
Future opportunities
Despite these potential issues, advances afforded by AI present historians with new possibilities. Our core skillset remains the same: we need to understand the sources, tools, and technologies we work with; we must critique our sources and not take everything at face value.
It would be wise for historians to become AI-literate (learning how to formulate prompts, and understanding how algorithms are constructed and which source material they draw on) so that we can use AI with a critical eye.
Just as AI can create links between scholarly articles, so machine-learning algorithms can connect machine-readable representations of historical sources in a range of ways: thematically, spatially, temporally.
These new capabilities might highlight connections not yet considered, not just within history, but between history and other disciplines.
I am excited about where potential new collaborations might lead.