Before and After: Comparison of Legacy and Harmonized TCGA Genomic Data Commons' Data
Copyright © 2019 The Authors. Published by Elsevier Inc. All rights reserved.
We present a systematic analysis of the effects of synchronizing a large-scale, deeply characterized, multi-omic dataset to the current human reference genome, using updated software, pipelines, and annotations. For each of 5 molecular data platforms in The Cancer Genome Atlas (TCGA), namely mRNA and miRNA expression, single nucleotide variants, DNA methylation, and copy number alterations, comprehensive sample-, gene-, and probe-level studies were performed to quantify the degree of similarity between the 'legacy' GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as 'harmonized' by the Genomic Data Commons. We offer gene lists to elucidate differences that remained after controlling for confounders, and strategies to mitigate their impact on biological interpretation. Our results demonstrate that the hg19 and hg38 TCGA datasets are very highly concordant, promote informed use of either legacy or harmonized omics data, and provide a rubric that encourages similar comparisons as new data emerge and reference data evolve.
Media type | E-article |
---|---|
Year of publication | 2019 |
Published | 2019 |
Contained in | To the parent record - volume:9 |
Contained in | Cell systems - 9 (2019), no. 1, 24 July, pages 24-34.e10 |
Language | English |
Contributors | Gao, Galen F [author] |
Notes | Date Completed 30.07.2020; Date Revised 20.07.2022; published: Print; Citation Status MEDLINE |
doi | 10.1016/j.cels.2019.06.006 |
PPN (catalog ID) | NLM299562336 |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM299562336 | ||
003 | DE-627 | ||
005 | 20231225100614.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231225s2019 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1016/j.cels.2019.06.006 |2 doi | |
028 | 5 | 2 | |a pubmed24n0998.xml |
035 | |a (DE-627)NLM299562336 | ||
035 | |a (NLM)31344359 | ||
035 | |a (PII)S2405-4712(19)30201-7 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Gao, Galen F |e verfasserin |4 aut | |
245 | 1 | 0 | |a Before and After |b Comparison of Legacy and Harmonized TCGA Genomic Data Commons' Data |
264 | 1 | |c 2019 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 30.07.2020 | ||
500 | |a Date Revised 20.07.2022 | ||
500 | |a published: Print | ||
500 | |a Citation Status MEDLINE | ||
520 | |a Copyright © 2019 The Authors. Published by Elsevier Inc. All rights reserved. | ||
520 | |a We present a systematic analysis of the effects of synchronizing a large-scale, deeply characterized, multi-omic dataset to the current human reference genome, using updated software, pipelines, and annotations. For each of 5 molecular data platforms in The Cancer Genome Atlas (TCGA)-mRNA and miRNA expression, single nucleotide variants, DNA methylation and copy number alterations-comprehensive sample, gene, and probe-level studies were performed, towards quantifying the degree of similarity between the 'legacy' GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as 'harmonized' by the Genomic Data Commons. We offer gene lists to elucidate differences that remained after controlling for confounders, and strategies to mitigate their impact on biological interpretation. Our results demonstrate that the hg19 and hg38 TCGA datasets are very highly concordant, promote informed use of either legacy or harmonized omics data, and provide a rubric that encourages similar comparisons as new data emerge and reference data evolve | ||
650 | 4 | |a Comparative Study | |
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, N.I.H., Extramural | |
650 | 4 | |a DNA methylation | |
650 | 4 | |a The Cancer Genome Atlas | |
650 | 4 | |a human reference genome | |
650 | 4 | |a mRNA expression | |
650 | 4 | |a microRNA expression | |
650 | 4 | |a quality control | |
650 | 4 | |a somatic copy number alteration | |
650 | 4 | |a somatic mutation | |
650 | 7 | |a MicroRNAs |2 NLM | |
700 | 1 | |a Parker, Joel S |e verfasserin |4 aut | |
700 | 1 | |a Reynolds, Sheila M |e verfasserin |4 aut | |
700 | 1 | |a Silva, Tiago C |e verfasserin |4 aut | |
700 | 1 | |a Wang, Liang-Bo |e verfasserin |4 aut | |
700 | 1 | |a Zhou, Wanding |e verfasserin |4 aut | |
700 | 1 | |a Akbani, Rehan |e verfasserin |4 aut | |
700 | 1 | |a Bailey, Matthew |e verfasserin |4 aut | |
700 | 1 | |a Balu, Saianand |e verfasserin |4 aut | |
700 | 1 | |a Berman, Benjamin P |e verfasserin |4 aut | |
700 | 1 | |a Brooks, Denise |e verfasserin |4 aut | |
700 | 1 | |a Chen, Hu |e verfasserin |4 aut | |
700 | 1 | |a Cherniack, Andrew D |e verfasserin |4 aut | |
700 | 1 | |a Demchok, John A |e verfasserin |4 aut | |
700 | 1 | |a Ding, Li |e verfasserin |4 aut | |
700 | 1 | |a Felau, Ina |e verfasserin |4 aut | |
700 | 1 | |a Gaheen, Sharon |e verfasserin |4 aut | |
700 | 1 | |a Gerhard, Daniela S |e verfasserin |4 aut | |
700 | 1 | |a Heiman, David I |e verfasserin |4 aut | |
700 | 1 | |a Hernandez, Kyle M |e verfasserin |4 aut | |
700 | 1 | |a Hoadley, Katherine A |e verfasserin |4 aut | |
700 | 1 | |a Jayasinghe, Reyka |e verfasserin |4 aut | |
700 | 1 | |a Kemal, Anab |e verfasserin |4 aut | |
700 | 1 | |a Knijnenburg, Theo A |e verfasserin |4 aut | |
700 | 1 | |a Laird, Peter W |e verfasserin |4 aut | |
700 | 1 | |a Mensah, Michael K A |e verfasserin |4 aut | |
700 | 1 | |a Mungall, Andrew J |e verfasserin |4 aut | |
700 | 1 | |a Robertson, A Gordon |e verfasserin |4 aut | |
700 | 1 | |a Shen, Hui |e verfasserin |4 aut | |
700 | 1 | |a Tarnuzzer, Roy |e verfasserin |4 aut | |
700 | 1 | |a Wang, Zhining |e verfasserin |4 aut | |
700 | 1 | |a Wyczalkowski, Matthew |e verfasserin |4 aut | |
700 | 1 | |a Yang, Liming |e verfasserin |4 aut | |
700 | 1 | |a Zenklusen, Jean C |e verfasserin |4 aut | |
700 | 1 | |a Zhang, Zhenyu |e verfasserin |4 aut | |
700 | 0 | |a Genomic Data Analysis Network |e verfasserin |4 aut | |
700 | 1 | |a Liang, Han |e verfasserin |4 aut | |
700 | 1 | |a Noble, Michael S |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Cell systems |d 2015 |g 9(2019), 1 vom: 24. Juli, Seite 24-34.e10 |w (DE-627)NLM251479099 |x 2405-4720 |7 nnns |
773 | 1 | 8 | |g volume:9 |g year:2019 |g number:1 |g day:24 |g month:07 |g pages:24-34.e10 |
856 | 4 | 0 | |u http://dx.doi.org/10.1016/j.cels.2019.06.006 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 9 |j 2019 |e 1 |b 24 |c 07 |h 24-34.e10 |