Abstract
‘Gold-standard’ data to evaluate linkage algorithms are rare. Synthetic data have the advantage that all the true links are known. In the domain of population reconstruction, the ability to synthesise populations on demand, with varying characteristics, allows a linkage approach to be evaluated across a wide range of data sets.
We present a micro-simulation model for generating such synthetic populations, taking as input a set of desired statistical properties. We then outline how these desired properties are verified in the generated populations, and the intended approach to using generated populations to evaluate linkage algorithms. We envisage a sequence of experiments in which a set of populations is generated to consider how linkage quality varies across different populations: with the same characteristics, with differing characteristics, and with differing types and levels of corruption. The performance of an approach at scale is also considered.
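The evaluation idea in the abstract can be illustrated with a minimal sketch. This is not the micro-simulation model itself: the record generator, the single-character-deletion corruption, the exact-match linker, and all names (`generate_records`, `corrupt`, `link_exact`, `evaluate`, `run_experiment`) are hypothetical stand-ins chosen for illustration. The point is only that, because every true link in synthetic data is known, precision, recall, and F-measure can be computed directly and compared across corruption levels.

```python
import random

# Hypothetical name pools; a real generator would draw from target statistics.
FORENAMES = ["ann", "john", "mary", "james", "isobel", "robert"]
SURNAMES = ["kirk", "reid", "munro", "baird", "gunn", "innes"]


def generate_records(n, rng):
    """Return n (person_id, name) records; person_id is the known truth."""
    return [
        (i, f"{rng.choice(FORENAMES)} {rng.choice(SURNAMES)} {i}")
        for i in range(n)
    ]


def corrupt(records, rate, rng):
    """Delete one random character from each name with probability `rate`."""
    corrupted = []
    for person_id, name in records:
        if rng.random() < rate:
            pos = rng.randrange(len(name))
            name = name[:pos] + name[pos + 1:]
        corrupted.append((person_id, name))
    return corrupted


def link_exact(records_a, records_b):
    """Toy linker: link every pair of records whose names match exactly."""
    index = {}
    for id_a, name in records_a:
        index.setdefault(name, []).append(id_a)
    return {
        (id_a, id_b)
        for id_b, name in records_b
        for id_a in index.get(name, [])
    }


def evaluate(predicted, truth):
    """Precision, recall and F-measure; each is 0.0 when undefined."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def run_experiment(n, corruption_rate, seed=0):
    """Generate, corrupt, link, and score one synthetic data set."""
    rng = random.Random(seed)
    records_a = generate_records(n, rng)
    records_b = corrupt(records_a, corruption_rate, rng)
    truth = {(person_id, person_id) for person_id, _ in records_a}
    return evaluate(link_exact(records_a, records_b), truth)


if __name__ == "__main__":
    for rate in (0.0, 0.2, 0.5):
        print(rate, run_experiment(200, rate, seed=1))
```

Varying `corruption_rate`, `n`, and `seed` mirrors, in a very reduced form, the envisaged sequence of experiments: repeated populations with the same characteristics, differing levels of corruption, and larger populations to examine behaviour at scale.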
Original language | English |
---|---|
Publication status | Published - 11 May 2017 |
Event | Workshop for the Systematic Linking of Historical Records - University of Guelph, Guelph, Canada Duration: 11 May 2017 → 13 May 2017 http://recordlink.org |
Workshop
Workshop | Workshop for the Systematic Linking of Historical Records |
---|---|
Country/Territory | Canada |
City | Guelph |
Period | 11/05/17 → 13/05/17 |
Internet address | http://recordlink.org |
Keywords
- record linkage
Fingerprint
Research topics of 'Evaluating record linkage: creating longitudinal synthetic data to provide gold-standard linked data sets'.
Projects
- 2 Finished
- Administrative Data Research Centres: ESRC - Admin Data Service - Scottish Consortium
  Kirby, G. N. C. (PI)
  1/11/13 → 31/10/18
  Project: Standard
- Digitising Scotland: Digitising Scotland
  Kirby, G. N. C. (PI)
  Economic & Social Research Council
  1/09/12 → 31/10/14
  Project: Standard