COVID-19 Knowledge Extractor (COKE): A Tool and a Web Portal to Extract Drug - Target Protein Associations from the CORD-19 Corpus of Scientific Publications on COVID-19
Objective: The COVID-19 pandemic has catalyzed a widespread effort to identify drug candidates and biological targets relevant to SARS-CoV-2 infection, which has resulted in a large number of publications on this subject. We have built the COVID-19 Knowledge Extractor (COKE), a web application to extract, curate, and annotate essential drug-target relationships from the research literature on COVID-19 to assist drug repurposing efforts. Materials and Methods: SciBiteAI ontological tagging of the COVID Open Research Dataset (CORD-19), a repository of COVID-19 scientific publications, was employed to identify drug-target relationships. Entity identifiers were resolved through lookup routines using UniProt and DrugBank. A custom algorithm was used to identify co-occurrences of protein and drug terms, and confidence scores were calculated for each entity pair. Results: COKE processing of the current CORD-19 database identified about 3,000 drug-protein pairs, including 29 unique proteins and 500 investigational, experimental, and approved drugs. Some of these drugs are presently undergoing clinical trials for COVID-19. Discussion: The rapidly evolving COVID-19 pandemic has produced a dramatic growth of publications on this subject in a short period. These circumstances call for methods that can condense the literature into the key concepts and relationships necessary for insights into SARS-CoV-2 drug repurposing. Conclusion: The COKE repository and web application deliver key drug-target protein relationships to researchers studying SARS-CoV-2. The COKE portal may provide comprehensive and critical information on studies concerning drug repurposing against COVID-19. COKE is freely available at https://coke.mml.unc.edu/ and the code is available at https://github.com/DnlRKorn/CoKE.
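The abstract does not describe the custom co-occurrence algorithm or how the confidence scores are computed, so the following is only a minimal sketch of the general idea, not COKE's method. It assumes each sentence has already been tagged with sets of drug and protein identifiers, and it substitutes a Jaccard-style overlap for the unspecified confidence score. The function name `cooccurrence_scores` and the placeholder entity names are hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_scores(tagged_sentences):
    """Score drug-protein pairs by sentence-level co-occurrence.

    tagged_sentences: iterable of (drugs, proteins) pairs, where each
    element is a collection of entity identifiers found in one sentence.
    Returns a dict mapping (drug, protein) to a score in (0, 1].

    NOTE: the Jaccard-style score is an assumption made for illustration;
    the source does not state which confidence measure COKE uses.
    """
    pair_counts = Counter()
    drug_counts = Counter()
    protein_counts = Counter()

    for drugs, proteins in tagged_sentences:
        drugs, proteins = set(drugs), set(proteins)
        drug_counts.update(drugs)
        protein_counts.update(proteins)
        # Every drug/protein combination in the same sentence co-occurs once.
        pair_counts.update(product(drugs, proteins))

    scores = {}
    for (drug, protein), together in pair_counts.items():
        # Sentences mentioning either entity (inclusion-exclusion).
        either = drug_counts[drug] + protein_counts[protein] - together
        scores[(drug, protein)] = together / either
    return scores


# Toy input with placeholder identifiers (not data from CORD-19).
example = [
    ({"drug_A"}, {"protein_X"}),
    ({"drug_A"}, {"protein_X", "protein_Y"}),
    ({"drug_B"}, {"protein_Y"}),
]
scores = cooccurrence_scores(example)
```

A pair that always appears together scores 1.0, while a pair that shares only some of its sentences scores lower. A production pipeline would also need the identifier resolution step (UniProt and DrugBank lookups) that the abstract mentions, which is omitted here.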
Media type: | Preprint |
---|---|
Year of publication: | 2021 |
Published: | 2021 |
Contained in: | chemRxiv.org - (2021), dated 18 Nov. - year:2021 |
Language: | English |
Contributors: | Korn, Daniel [author] |
Links: | Full text [free of charge] |
Subjects: | |
doi: | 10.26434/chemrxiv.13289222 |
Funding institution / Project title: | |
PPN (catalog ID): | XCH019420102 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | XCH019420102 | ||
003 | DE-627 | ||
005 | 20230429143103.0 | ||
007 | cr uuu---uuuuu | ||
008 | 201127s2021 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.26434/chemrxiv.13289222 |2 doi | |
035 | |a (DE-627)XCH019420102 | ||
035 | |a (chemrXiv)10.26434/chemrxiv.13289222 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Korn, Daniel |e verfasserin |4 aut | |
245 | 1 | 0 | |a COVID-19 Knowledge Extractor (COKE): A Tool and a Web Portal to Extract Drug - Target Protein Associations from the CORD-19 Corpus of Scientific Publications on COVID-19 |
264 | 1 | |c 2021 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
520 | |a Objective: The COVID-19 pandemic has catalyzed a widespread effort to identify drug candidates and biological targets of relevance to SARS-COV-2 infection, which resulted in large numbers of publications on this subject. We have built the COVID-19 Knowledge Extractor (COKE), a web application to extract, curate, and annotate essential drug-target relationships from the research literature on COVID-19 to assist drug repurposing efforts. Materials and Methods: SciBiteAI ontological tagging of the COVID Open Research Dataset (CORD-19), a repository of COVID-19 scientific publications, was employed to identify drug-target relationships. Entity identifiers were resolved through lookup routines using UniProt and DrugBank. A custom algorithm was used to identify co-occurrences of protein and drug terms, and confidence scores were calculated for each entity pair. Results: COKE processing of the current CORD-19 database identified about 3,000 drug-protein pairs, including 29 unique proteins and 500 investigational, experimental, and approved drugs. Some of these drugs are presently undergoing clinical trials for COVID-19. Discussion: The rapidly evolving situation concerning the COVID-19 pandemic has resulted in a dramatic growth of publications on this subject in a short period. These circumstances call for methods that can condense the literature into the key concepts and relationships necessary for insights into SARS-CoV-2 drug repurposing. Conclusion: The COKE repository and web application deliver key drug - target protein relationships to researchers studying SARS-CoV-2. COKE portal may provide comprehensive and critical information on studies concerning drug repurposing against COVID-19. COKE is freely available at <a href="https://coke.mml.unc.edu/">https://coke.mml.unc.edu/</a> and the code is available at <a href="https://github.com/DnlRKorn/CoKE">https://github.com/DnlRKorn/CoKE</a>. | ||
650 | 4 | |a Chemistry |7 (dpeaa)DE-84 | |
650 | 4 | |a 540 |7 (dpeaa)DE-84 | |
700 | 1 | |a Pervitsky, Vera |e verfasserin |4 aut | |
700 | 1 | |a Bobrowski, Tesia |e verfasserin |4 aut | |
700 | 1 | |a Alves, Vinicius |e verfasserin |4 aut | |
700 | 1 | |a Schmitt, Charles |e verfasserin |4 aut | |
700 | 1 | |a Bizon, Cristopher |e verfasserin |4 aut | |
700 | 1 | |a Baker, Nancy |e verfasserin |4 aut | |
700 | 1 | |a Chirkova, Rada |e verfasserin |4 aut | |
700 | 1 | |a Cherkasov, Artem |e verfasserin |4 aut | |
700 | 1 | |a Muratov, Eugene |e verfasserin |4 aut | |
700 | 1 | |a Tropsha, Alexander |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t chemRxiv.org |g (2021) vom: 18. Nov. |
773 | 1 | 8 | |g year:2021 |g day:18 |g month:11 |
856 | 4 | 0 | |u http://dx.doi.org/10.26434/chemrxiv.13289222 |z kostenfrei |3 Volltext |
912 | |a GBV_XCH | ||
912 | |a SSG-OLC-PHA | ||
951 | |a AR | ||
952 | |j 2021 |b 18 |c 11 |