Automatic extraction of multiple underlying causes from textual death records

Graham Njal Cameron Kirby, Masih Hajiarabderkani, Alan Dearle, Jamie Kirk Carson, Fraser Robin James Dunlop, Christopher John Lloyd Dibben, Lee Williamson

Research output: Contribution to conference, Poster, peer-review

Abstract

Data sets containing natural language strings are increasingly available as outputs from international initiatives to digitise historical population records. Analysis of such records, for example of historical causes of death, is facilitated by classifying them to standard coding systems such as ICD-10.
Most death record systems include multiple causes of death, typically a primary cause and optionally a number of contributing secondary causes. For example, the primary cause of death might be heart failure, with some underlying chronic condition as a contributing cause.
In modern records these separate causes are clearly differentiated by being entered in different fields on the recording form. In some historical records, however, there was no imposed structure, with the person recording the death having the freedom to use any form of language.
The Digitising Scotland project is in the process of transcribing all Scottish birth, death and marriage records from 1855 to 1973. Here we describe our approach to automatic extraction of multiple causes of death from the approximately 11M death records.
Original language: English
Number of pages: 1
Publication status: Published - 26 Aug 2015
Event: Farr Institute International Conference on Data Intensive Health Research and Care - St Andrews, United Kingdom
Duration: 26 Aug 2015 to 28 Aug 2015
http://farrinstandrews.org/

Conference

Conference: Farr Institute International Conference on Data Intensive Health Research and Care
Country/Territory: United Kingdom
City: St Andrews
Period: 26/08/15 to 28/08/15

Keywords

  • machine learning
  • classification
  • death record
  • ICD-10
  • multiple causes
