Research at St Andrews

Validating Synthetic Longitudinal Populations for evaluation of Population Data Linkage

Research output: Contribution to conference (Paper)

DOI: 10.23889/ijpds.v3i2.504

Open Access permissions: Open

Standard

Validating Synthetic Longitudinal Populations for evaluation of Population Data Linkage. / Dalton, Thomas Stanley; Kirby, Graham Njal Cameron; Dearle, Alan; Akgun, Ozgur; MacKenzie, Monique Lea.

2018.

Harvard

Dalton, TS, Kirby, GNC, Dearle, A, Akgun, O & MacKenzie, ML 2018, 'Validating Synthetic Longitudinal Populations for evaluation of Population Data Linkage'. DOI: 10.23889/ijpds.v3i2.504

APA

Dalton, T. S., Kirby, G. N. C., Dearle, A., Akgun, O., & MacKenzie, M. L. (2018). Validating Synthetic Longitudinal Populations for evaluation of Population Data Linkage. DOI: 10.23889/ijpds.v3i2.504

Vancouver

Dalton TS, Kirby GNC, Dearle A, Akgun O, MacKenzie ML. Validating Synthetic Longitudinal Populations for evaluation of Population Data Linkage. 2018. Available from: DOI 10.23889/ijpds.v3i2.504

Author

Dalton, Thomas Stanley ; Kirby, Graham Njal Cameron ; Dearle, Alan ; Akgun, Ozgur ; MacKenzie, Monique Lea. / Validating Synthetic Longitudinal Populations for evaluation of Population Data Linkage. 1 p.

BibTeX

@conference{c4f1f1048f82470c9f62be594dd5a3a3,
title = "Validating Synthetic Longitudinal Populations for evaluation of Population Data Linkage",
abstract = "Background: 'Gold-standard' data to evaluate linkage algorithms are rare. Synthetic data have the advantage that all the true links are known. In the domain of population reconstruction, the ability to synthesize populations on demand, with varying characteristics, allows a linkage approach to be evaluated across a wide range of data. We have implemented ValiPop, a microsimulation model, for this purpose. Approach: ValiPop can create many varied populations based upon sets of desired population statistics, thus allowing linkage algorithms to be evaluated across many populations, rather than across a limited number of real-world 'gold-standard' data sets. Given the potential interactions between different desired population statistics, the creation of a population does not necessarily imply that all desired population statistics have been met. To address this we have developed a statistical approach to validate the adherence of created populations to the desired statistics, using a generalized linear model. This talk will discuss the benefits of synthetic data for data linkage evaluation, the approach to validating created populations, and present the results of some initial linkage experiments using our synthetic data.",
author = "Dalton, {Thomas Stanley} and Kirby, {Graham Njal Cameron} and Alan Dearle and Ozgur Akgun and MacKenzie, {Monique Lea}",
year = "2018",
month = "6",
day = "11",
doi = "10.23889/ijpds.v3i2.504",
language = "English",

}

RIS (suitable for import to EndNote)

TY - CONF

T1 - Validating Synthetic Longitudinal Populations for evaluation of Population Data Linkage

AU - Dalton,Thomas Stanley

AU - Kirby,Graham Njal Cameron

AU - Dearle,Alan

AU - Akgun,Ozgur

AU - MacKenzie,Monique Lea

PY - 2018/6/11

Y1 - 2018/6/11

N2 - Background: 'Gold-standard' data to evaluate linkage algorithms are rare. Synthetic data have the advantage that all the true links are known. In the domain of population reconstruction, the ability to synthesize populations on demand, with varying characteristics, allows a linkage approach to be evaluated across a wide range of data. We have implemented ValiPop, a microsimulation model, for this purpose. Approach: ValiPop can create many varied populations based upon sets of desired population statistics, thus allowing linkage algorithms to be evaluated across many populations, rather than across a limited number of real-world 'gold-standard' data sets. Given the potential interactions between different desired population statistics, the creation of a population does not necessarily imply that all desired population statistics have been met. To address this we have developed a statistical approach to validate the adherence of created populations to the desired statistics, using a generalized linear model. This talk will discuss the benefits of synthetic data for data linkage evaluation, the approach to validating created populations, and present the results of some initial linkage experiments using our synthetic data.

AB - Background: 'Gold-standard' data to evaluate linkage algorithms are rare. Synthetic data have the advantage that all the true links are known. In the domain of population reconstruction, the ability to synthesize populations on demand, with varying characteristics, allows a linkage approach to be evaluated across a wide range of data. We have implemented ValiPop, a microsimulation model, for this purpose. Approach: ValiPop can create many varied populations based upon sets of desired population statistics, thus allowing linkage algorithms to be evaluated across many populations, rather than across a limited number of real-world 'gold-standard' data sets. Given the potential interactions between different desired population statistics, the creation of a population does not necessarily imply that all desired population statistics have been met. To address this we have developed a statistical approach to validate the adherence of created populations to the desired statistics, using a generalized linear model. This talk will discuss the benefits of synthetic data for data linkage evaluation, the approach to validating created populations, and present the results of some initial linkage experiments using our synthetic data.

U2 - 10.23889/ijpds.v3i2.504

DO - 10.23889/ijpds.v3i2.504

M3 - Paper

ER -

Related by author

  1. Probabilistic linkage of vital event records in Scotland using familial groups

    Akgun, O., Dalton, T. S., Dearle, A., Garrett, E. & Kirby, G. N. C. 11 May 2017

    Research output: Contribution to conference (Abstract)

  2. Record linking using metric space similarity search

    Dearle, A., Kirby, G. N. C., Akgun, O. & Dalton, T. S. 2 Apr 2017

    Research output: Contribution to conference (Abstract)

  3. Evaluating population data linkage: assessing stability, scalability, resilience and robustness across many data sets for comprehensive linkage evaluation

    Dalton, T. S., Akgun, O., Al-Sediqi, A., Christen, P., Dearle, A., Garrett, E., Gray, A., Kirby, G. N. C. & Reid, A. 2 Apr 2017

    Research output: Contribution to conference (Abstract)

  4. An identifier scheme for the Digitising Scotland project

    Akgun, O., Al-Sediqi, A., Christen, P., Dalton, T. S., Dearle, A., Dibben, C. J. L., Garrett, E., Gray, A., Kirby, G. N. C. & Reid, A. 2 Apr 2017

    Research output: Contribution to conference (Abstract)

ID: 254997588