Implementing a hash-based privacy-preserving record linkage tool in the OneFlorida clinical research network

© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association.

OBJECTIVE: To implement an open-source tool that performs deterministic privacy-preserving record linkage (RL) in a real-world setting within a large research network.

MATERIALS AND METHODS: We learned 2 efficient deterministic linkage rules using publicly available voter registration data. We then validated the 2 rules' performance with 2 manually curated gold-standard datasets linking electronic health records and claims data from 2 sources. We developed an open-source Python-based tool, OneFL Deduper, that (1) creates seeded hash codes of combinations of patients' quasi-identifiers using a cryptographic one-way hash function to achieve privacy protection and (2) links and deduplicates patient records through a central broker by matching hash codes, with high precision and reasonable recall.
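The two steps described above can be illustrated with a minimal Python sketch. It is not the OneFL Deduper code: the seed value, the field names, the two field combinations in `RULES`, the normalization step, and the choice of SHA-256 are all assumptions made for illustration, since the abstract does not specify them. Each site would compute the hash codes locally, and the broker would see only the codes, grouping records that share a code under either rule.

```python
import hashlib

# Hypothetical shared seed; in practice it would be a secret agreed on by the sites.
SEED = "example-shared-seed"

# Hypothetical linkage rules: each is a combination of quasi-identifier fields.
RULES = {
    "rule_1": ("first_name", "last_name", "dob", "zip"),
    "rule_2": ("first_name", "last_name", "dob", "city"),
}


def normalize(value):
    """Lowercase and keep only letters and digits (an assumed cleaning step)."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())


def hash_record(record, seed=SEED):
    """Site side: return one seeded SHA-256 hex digest per rule."""
    codes = {}
    for rule, fields in RULES.items():
        joined = "|".join(normalize(record[f]) for f in fields)
        payload = f"{seed}|{rule}|{joined}".encode("utf-8")
        codes[rule] = hashlib.sha256(payload).hexdigest()
    return codes


def link(hashed_records):
    """Broker side: map each record ID to a linked ID using hash codes only.

    hashed_records is a list of (record_id, codes) pairs with unique IDs.
    Records sharing a hash code under any rule end up with the same linked ID.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # hash code -> first record ID that produced it
    for rec_id, codes in hashed_records:
        parent[rec_id] = rec_id
        for code in codes.values():
            if code in first_seen:
                parent[find(rec_id)] = find(first_seen[code])
            else:
                first_seen[code] = rec_id
    return {rec_id: find(rec_id) for rec_id, _ in hashed_records}


# Toy records (invented): a and b differ in ZIP code and spelling but share rule_2.
site_a = {"first_name": "Ann", "last_name": "O'Neil", "dob": "1980-01-02",
          "zip": "32601", "city": "Gainesville"}
site_b = {"first_name": "ann", "last_name": "ONeil", "dob": "1980-01-02",
          "zip": "32608", "city": "Gainesville"}
site_c = {"first_name": "Bob", "last_name": "Smith", "dob": "1975-05-06",
          "zip": "33101", "city": "Miami"}

linked = link([
    ("a", hash_record(site_a)),
    ("b", hash_record(site_b)),
    ("c", hash_record(site_c)),
])
```

In this toy run, records `a` and `b` receive the same linked ID because their `rule_2` codes match, while `c` stays separate. Including the rule name in the hashed payload keeps codes from different rules from colliding by accident.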

RESULTS: We deployed the OneFL Deduper (https://github.com/ufbmi/onefl-deduper) in OneFlorida, a state-based clinical research network that is part of the national Patient-Centered Clinical Research Network (PCORnet). Using the gold-standard datasets, we achieved a precision of 97.25% to 99.7% and a recall of 75.5%. With the tool, we deduplicated ∼3.5 million (out of ∼15 million) records down to 1.7 million unique patients across 6 health care partners and the Florida Medicaid program. We demonstrated the benefits of RL by examining the different disease profiles of the linked cohorts.
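For readers unfamiliar with the metrics reported above, the following sketch shows how precision and recall are conventionally computed for record linkage against a gold standard of true matching pairs. The function name and the toy pairs are invented for illustration and do not come from the study.

```python
def precision_recall(predicted_pairs, gold_pairs):
    """Compare predicted record pairs with gold-standard pairs.

    Pairs are treated as unordered, so ("a1", "b1") equals ("b1", "a1").
    precision = true positives / predicted pairs
    recall    = true positives / gold-standard pairs
    """
    predicted = {frozenset(pair) for pair in predicted_pairs}
    gold = {frozenset(pair) for pair in gold_pairs}
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data (invented): 3 predicted links, 4 true links, 2 in common.
predicted = [("a1", "b1"), ("a2", "b2"), ("a3", "b9")]
gold = [("a1", "b1"), ("b2", "a2"), ("a4", "b4"), ("a5", "b5")]
precision, recall = precision_recall(predicted, gold)
```

With these toy pairs, precision is 2/3 and recall is 2/4. A deterministic exact-match approach tends toward this pattern of high precision and lower recall, because any discrepancy in a hashed field prevents a match.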

CONCLUSIONS: Many factors, including privacy risk considerations, policies and regulations, data availability and quality, and computing resources, can affect how an RL solution is constructed in a real-world setting. Nevertheless, RL is an important task for improving data quality in a network so that reliable scientific findings can be drawn from these massive data resources.

Media type:

E-article

Year of publication:

2019

Published:

2019

Contained in:

To the parent record - volume: 2

Contained in:

JAMIA open - 2(2019), 4, 31 Dec., pages 562-569

Language:

English

Contributors:

Bian, Jiang [Author]
Loiacono, Alexander [Author]
Sura, Andrei [Author]
Mendoza Viramontes, Tonatiuh [Author]
Lipori, Gloria [Author]
Guo, Yi [Author]
Shenkman, Elizabeth [Author]
Hogan, William [Author]

Links:

Full text

Subjects:

Clinical research network
Journal Article
PCORnet
Privacy-preserving record linkage

Notes:

Date Revised 12.04.2022

published: Electronic-eCollection

Citation Status PubMed-not-MEDLINE

doi:

10.1093/jamiaopen/ooz050

funding:

Funding institution / Project title:

PPN (catalog ID):

NLM306203782