Unique insights from ClinicalTrials.gov by mining protein mutations and RSids in addition to applying the Human Phenotype Ontology

Researchers and clinicians face a significant challenge in keeping up-to-date with the rapid rate of new associations between genetic mutations and diseases. To remedy this problem, this research mined the ClinicalTrials.gov corpus to extract relevant biological insights, produce unique reports to summarize findings, and make the meta-data available via APIs. An automated text-analysis pipeline performed the following features: parsing the ClinicalTrials.gov files, extracting and analyzing mutations from the corpus, mapping clinical trials to Human Phenotype Ontology (HPO), and finding associations between clinical trials and HPO nodes. Unique reports were created for each mutation (SNPs and protein mutations) mentioned in the corpus, as well as for each clinical trial that references a mutation. These reports, which have been run over multiple time points, along with APIs to access meta-data, are freely available at http://snpminertrials.com. Additionally, HPO was used to normalize disease terms and associate clinical trials with relevant genes. The creation of the pipeline and reports, the association of clinical trials with HPO terms, and the insights, public repository, and APIs produced are all novel in this work. The freely-available resources present relevant biological information and novel insights between biomedical entities in a robust and accessible manner, mitigating the challenge of being informed about new associations between mutations, genes, and diseases.

Medienart:

E-Artikel

Erscheinungsjahr:

2020

Erschienen:

2020

Enthalten in:

Zur Gesamtaufnahme - volume:15

Enthalten in:

PloS one - 15(2020), 5 vom: 27., Seite e0233438

Sprache:

Englisch

Beteiligte Personen:

Alag, Shray [VerfasserIn]

Links:

Volltext

Themen:

Journal Article

Anmerkungen:

Date Completed 20.08.2020

Date Revised 20.08.2020

published: Electronic-eCollection

Citation Status MEDLINE

doi:

10.1371/journal.pone.0233438

funding:

Förderinstitution / Projekttitel:

PPN (Katalog-ID):

NLM310417597