Digitizing the Dictionary

From print to an electronic database

 

Over the nearly 50 years of drafting the Dictionary, differing editorial practices and conventions have inevitably produced a text that varies significantly from the earliest fascicules to the present day. Many of these variations have been the result of conscious decisions, others simply the result of the Dictionary being the work of many people over many years. This variation has presented the team with a considerable challenge as we move towards a complete electronic text.

Work on digitizing the Dictionary began in earnest in 2009, with a move from a traditional print-based workflow to an electronic XML-based workflow, first for material already drafted on slips but not yet keyed as electronic data, and subsequently with the introduction of full ab initio electronic drafting. Developing a new system that would provide the consistency and efficiency of working electronically, while still coping with the idiosyncrasies of the diverse material and retaining the scholarly rigour of the editorial process, was technically very demanding. Equally complex was phasing it in, with minimal disruption to the ongoing drafting work, in place of such a well-established system.

However, even then the majority of the Dictionary's content still existed only in print, in the thirteen fascicules (more than 2,500 three-column pages containing nearly 65,000 entries) published since 1965. Once the new workflow for the remaining material to be published was fully established within the team, work began on digitizing the earlier fascicules. The project retained a specialist firm, Data Standards Ltd., to carry out the capture and conversion work. Following the drawing up of a detailed specification and pilot tests on sample material, this mammoth task was completed in 2011.

The challenge we then faced was to bring all of our digital data into a single, clean dataset. In 2012, the project began examining ways of standardizing all of its data to allow the complete Dictionary to be published online in a user-friendly, searchable and browsable format. Shortly after the final fascicule was published in 2013, we had a full XML dataset of the Dictionary of Medieval Latin from British Sources in preparation for that eventual online publication. Our plans for developing and hosting our own online platform were discontinued in 2014 owing to a lack of technical support and funding, but partnerships have been established to ensure that online publication is achieved.

More information about the technical aspects of the electronic systems used by the project can be found here.

Further reading

On the digital preparation of the DMLCS in Ireland:

Anthony Harvey (2008) 'From full-text database to electronic lexicon and beyond: the role of computers in the Dictionary of Celtic Latin project', Listy filologické CXXXI: 469–91.