CODA: an open-source platform for federated analysis and machine learning on distributed healthcare data

© The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association.

OBJECTIVES: Distributed computations facilitate multi-institutional data analysis while avoiding the costs and complexity of data pooling. Existing approaches lack crucial features, such as built-in medical standards and terminologies, no-code data visualizations, explicit disclosure control mechanisms, and support for basic statistical computations, in addition to gradient-based optimization capabilities.

MATERIALS AND METHODS: We describe the development of the Collaborative Data Analysis (CODA) platform and the design choices made to address the key needs identified during our survey of stakeholders. We use a public dataset (MIMIC-IV) to demonstrate end-to-end multi-modal federated learning (FL) using CODA. We assess the technical feasibility of deploying the CODA platform at 9 hospitals in Canada, describe implementation challenges, and evaluate its scalability on large patient populations.

RESULTS: The CODA platform was designed, developed, and deployed between January 2020 and January 2023. Software code, documentation, and technical documents were released under an open-source license. Multi-modal federated averaging is illustrated using the MIMIC-IV and MIMIC-CXR datasets. To date, 8 out of the 9 participating sites have successfully deployed the platform, with a total enrolment of >1M patients. Mapping data from legacy systems to FHIR was the biggest barrier to implementation.

DISCUSSION AND CONCLUSION: The CODA platform was developed and successfully deployed in a public healthcare setting in Canada, with heterogeneous information technology systems and capabilities. Ongoing efforts will use the platform to develop and prospectively validate models for risk assessment, proactive monitoring, and resource usage. Further work will also make tools available to facilitate migration from legacy formats to FHIR and DICOM.

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

Journal of the American Medical Informatics Association: JAMIA - 31(2024), 3, 16 Feb., pages 651-665

Language:

English

Contributors:

Mullie, Louis [Author]
Afilalo, Jonathan [Author]
Archambault, Patrick [Author]
Bouchakri, Rima [Author]
Brown, Kip [Author]
Buckeridge, David L [Author]
Cavayas, Yiorgos Alexandros [Author]
Turgeon, Alexis F [Author]
Martineau, Denis [Author]
Lamontagne, François [Author]
Lebrasseur, Martine [Author]
Lemieux, Renald [Author]
Li, Jeffrey [Author]
Sauthier, Michaël [Author]
St-Onge, Pascal [Author]
Tang, An [Author]
Witteman, William [Author]
Chassé, Michaël [Author]

Subjects:

Biomedical analytics
Distributed computing
Federated learning
Healthcare data management
Journal Article
Machine learning
Predictive models
Resource usage analysis

Notes:

Date Completed 19.02.2024

Date Revised 19.02.2024

published: Print

Citation Status MEDLINE

doi:

10.1093/jamia/ocad235

PPN (catalog ID):

NLM366187791