
Research at St Andrews

Exploiting historical registers: Automatic methods for coding c19th and c20th cause of death descriptions to standard classifications

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Jamie Kirk Carson, Graham Njal Cameron Kirby, Alan Dearle, Lee Williamson, Eilidh Garrett, Alice Reid, Christopher John Lloyd Dibben



The increasing availability of digitised registration records presents a significant opportunity for research. Returning to the original records allows researchers to classify descriptions, such as cause of death, according to modern medical understandings of illness and disease, rather than relying on contemporary registrars’ classifications. Linking an individual’s records together also allows the production of sparse life-course micro-datasets. Further linking these into family units then makes it possible to reconstruct family structures and produce multi-generational studies. We describe work to develop a method for automatically coding the causes of death from 8.3 million Scottish death certificates to standard classifications. We have evaluated a range of approaches using text processing and supervised machine learning, obtaining accuracies ranging from 72% to 96% on several test sets. We present results and speculate on further development that may be needed to classify the full data set.
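
As a rough illustration of what coding free-text causes of death with supervised machine learning can look like, the sketch below trains a small multinomial naive Bayes classifier on labelled descriptions and measures accuracy on held-out strings. The abstract does not say which algorithms were evaluated, so naive Bayes is only a simple stand-in here. The class `NaiveBayesCoder`, the helper names, the example strings and the category labels (`TB`, `HEART`, `ACCIDENT`) are all hypothetical and are not taken from the paper or from any standard classification.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a free-text description and keep alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCoder:
    """Multinomial naive Bayes with add-one smoothing (an illustrative choice)."""

    def fit(self, descriptions, codes):
        self.class_counts = Counter(codes)
        self.token_counts = defaultdict(Counter)
        for text, code in zip(descriptions, codes):
            self.token_counts[code].update(tokenize(text))
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence, so they are dropped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total = sum(self.class_counts.values())
        best_code, best_score = None, -math.inf
        for code, n in self.class_counts.items():
            counts = self.token_counts[code]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_code, best_score = code, score
        return best_code


def accuracy(coder, descriptions, codes):
    """Fraction of descriptions whose predicted code matches the given code."""
    hits = sum(coder.predict(d) == c for d, c in zip(descriptions, codes))
    return hits / len(codes)


# Hypothetical toy data: invented strings and invented labels.
TRAIN = [
    ("phthisis pulmonalis 2 years", "TB"),
    ("pulmonary tuberculosis", "TB"),
    ("tuberculosis of lungs", "TB"),
    ("heart disease", "HEART"),
    ("valvular disease of heart", "HEART"),
    ("cardiac failure", "HEART"),
    ("accidental drowning", "ACCIDENT"),
    ("fall from scaffold accidental", "ACCIDENT"),
]
TEST = [
    ("Phthisis", "TB"),
    ("disease of the heart", "HEART"),
    ("accidental fall", "ACCIDENT"),
]

coder = NaiveBayesCoder().fit([d for d, _ in TRAIN], [c for _, c in TRAIN])
test_accuracy = accuracy(coder, [d for d, _ in TEST], [c for _, c in TEST])
```

A real evaluation would use held-out sets of expert-coded records and report accuracy per test set. The toy figure computed here says nothing about the 72% to 96% range reported in the abstract.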


Original language: English
Title of host publication: New Techniques and Technologies for Statistics
Place of Publication
Number of pages: 10
Publication status: Published - 5 Mar 2013
Event: New Techniques and Technologies for Statistics (NTTS 2013) - Brussels, Belgium
Duration: 5 Mar 2013 to 7 Mar 2013


Conference: New Techniques and Technologies for Statistics (NTTS 2013)



ID: 44100399