Text mining in a literature review of urothelial cancer using topic model

BACKGROUND: Urothelial cancer (UC) includes carcinomas of the bladder, ureters, and renal pelvis. New treatments and biomarkers for UC have emerged in this decade. Identifying the key information in a vast amount of literature can be challenging. In this study, we use text mining to explore UC publications and identify important information that may lead to new research directions.

METHOD: We used topic modeling to analyze the titles and abstracts of 29,883 articles on UC retrieved from PubMed, Web of Science, and Embase in March 2020. We applied latent Dirichlet allocation modeling to extract 15 topics and conducted trend analysis. Gene Ontology term enrichment analysis and Kyoto Encyclopedia of Genes and Genomes pathway analysis were performed to identify UC-related pathways.
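The topic-extraction step can be illustrated with a minimal sketch of latent Dirichlet allocation fitted by collapsed Gibbs sampling. This is not the authors' implementation: the record does not state which software, inference method, or hyperparameters were used, so the function names (`fit_lda`, `top_words`), the values of `alpha` and `beta`, the iteration count, and the toy token lists below are all assumptions chosen for illustration. The study extracted 15 topics from 29,883 titles and abstracts; the sketch uses 2 topics on six invented mini-documents so that it runs instantly.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    alpha, beta, iterations, and seed are illustrative assumptions.
    Returns the vocabulary, topic-word counts, and document-topic proportions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []                                            # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed topic proportions for each document.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return vocab, topic_word, theta


def top_words(vocab, topic_word, n=3):
    """Return the n most frequent words of each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Invented toy corpus (not data from the study).
toy_docs = [
    ["immune", "checkpoint", "therapy", "response"],
    ["checkpoint", "inhibitor", "therapy", "survival"],
    ["immune", "therapy", "inhibitor", "response"],
    ["smoking", "risk", "exposure", "incidence"],
    ["aristolochic", "acid", "risk", "exposure"],
    ["smoking", "incidence", "risk", "acid"],
]

vocab, topic_word, theta = fit_lda(toy_docs, num_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {words}")
```

With per-document topic proportions such as `theta` in hand, a trend analysis could average each topic's proportion over the articles published in each year; how the study actually computed its trends is not described in this record.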

RESULTS: Publications on UC treatment, especially immune checkpoint therapy, showed a growing trend, whereas those on UC staging did not. The risk factors of UC varied between countries, such as cigarette smoking in the United States and aristolochic acid in Taiwan and China. GMCSF, IL-5, Syndecan-1, ErbB receptor, integrin, c-Met, and TRAIL signaling pathways were the most relevant biological pathways associated with UC.

CONCLUSIONS: The risk factors of UC may depend on the country, and GMCSF, IL-5, Syndecan-1, ErbB receptor, integrin, c-Met, and TRAIL signaling pathways are the most relevant biological pathways associated with UC. These findings may suggest further directions for UC research.

Media type:

E-article

Year of publication:

2020

Published:

2020

Contained in:

To the parent record - volume:20

Contained in:

BMC cancer - 20(2020), 1, 24 May, page 462

Language:

English

Contributors:

Lin, Hsuan-Jen [Author]
Sheu, Phillip C-Y [Author]
Tsai, Jeffrey J P [Author]
Wang, Charles C N [Author]
Chou, Che-Yi [Author]

Links:

Full text

Subjects:

Journal Article
LDA2vec
Research trends
Review
Text mining
Topic modeling
Urothelial carcinoma

Notes:

Date Completed 02.02.2021

Date Revised 02.02.2021

published: Electronic

Citation Status MEDLINE

doi:

10.1186/s12885-020-06931-0

funding:

Funding institution / Project title:

PPN (catalog ID):

NLM310303508