Performance and accuracy evaluation of reference panels for genotype imputation in sub-Saharan African populations
© 2023 The Author(s).
Based on evaluations of imputation performed on a genotype dataset consisting of about 11,000 sub-Saharan African (SSA) participants, we show Trans-Omics for Precision Medicine (TOPMed) and the African Genome Resource (AGR) to be currently the best panels for imputing SSA datasets. We report notable differences in the number of single-nucleotide polymorphisms (SNPs) that are imputed by different panels in datasets from East, West, and South Africa. Comparisons with a subset of 95 SSA high-coverage whole-genome sequences (WGSs) show that despite being about 20-fold smaller, the AGR imputed dataset has higher concordance with the WGSs. Moreover, the level of concordance between imputed and WGS datasets was strongly influenced by the extent of Khoe-San ancestry in a genome, highlighting the need for integration of not only geographically but also ancestrally diverse WGS data in reference panels for further improvement in imputation of SSA datasets. Approaches that integrate imputed data from different panels could also lead to better imputation.
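The abstract compares imputed genotypes against high-coverage WGS calls, and the record's subject terms include "non-reference discordance rate". As a rough illustration only, the sketch below computes overall concordance and a commonly used form of the non-reference discordance rate (discordant calls divided by all calls except those where both datasets agree on homozygous reference). The function name, the 0/1/2 alternate-allele coding, and the toy genotypes are assumptions made for this example; the record does not state the exact formula or pipeline the authors used.

```python
from typing import Sequence, Tuple


def concordance_and_ndr(truth: Sequence[int], imputed: Sequence[int]) -> Tuple[float, float]:
    """Compare two genotype vectors coded as alternate-allele counts (0, 1, 2).

    Returns (overall concordance, non-reference discordance rate).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(truth) != len(imputed):
        raise ValueError("genotype vectors must have the same length")
    if not truth:
        raise ValueError("genotype vectors must not be empty")

    pairs = list(zip(truth, imputed))
    matches = sum(1 for t, i in pairs if t == i)
    # Sites where both datasets call homozygous reference are excluded from the NDR.
    hom_ref_matches = sum(1 for t, i in pairs if t == 0 and i == 0)
    discordant = len(pairs) - matches
    non_ref_matches = matches - hom_ref_matches

    concordance = matches / len(pairs)
    denominator = discordant + non_ref_matches
    ndr = discordant / denominator if denominator else float("nan")
    return concordance, ndr


# Toy data (assumed): six sites for one sample, sequenced vs. imputed.
wgs_calls = [0, 0, 1, 2, 1, 0]
imputed_calls = [0, 0, 1, 2, 0, 1]
concordance, ndr = concordance_and_ndr(wgs_calls, imputed_calls)
print(f"concordance={concordance:.3f}, NDR={ndr:.3f}")
```

Excluding shared homozygous-reference calls keeps the metric from being inflated by the many sites where a sample simply carries the reference allele, which is why it is often reported alongside plain concordance.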
Media type | E-article |
---|---|
Year of publication | 2023 |
Published | 2023 |
Contained in | Cell genomics - 3(2023), 6, 14 June, page 100332 |
Language | English |
Contributors | Sengupta, Dhriti [author] |
Topics | AGR |
Notes | Date Revised 03.07.2023; published: Electronic-eCollection; Citation Status: PubMed-not-MEDLINE |
doi | 10.1016/j.xgen.2023.100332 |
PPN (catalog ID) | NLM35890160X |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM35890160X | ||
003 | DE-627 | ||
005 | 20231226075729.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1016/j.xgen.2023.100332 |2 doi | |
028 | 5 | 2 | |a pubmed24n1196.xml |
035 | |a (DE-627)NLM35890160X | ||
035 | |a (NLM)37388906 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Sengupta, Dhriti |e verfasserin |4 aut | |
245 | 1 | 0 | |a Performance and accuracy evaluation of reference panels for genotype imputation in sub-Saharan African populations |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Revised 03.07.2023 | ||
500 | |a published: Electronic-eCollection | ||
500 | |a Citation Status PubMed-not-MEDLINE | ||
520 | |a © 2023 The Author(s). | ||
520 | |a Based on evaluations of imputation performed on a genotype dataset consisting of about 11,000 sub-Saharan African (SSA) participants, we show Trans-Omics for Precision Medicine (TOPMed) and the African Genome Resource (AGR) to be currently the best panels for imputing SSA datasets. We report notable differences in the number of single-nucleotide polymorphisms (SNPs) that are imputed by different panels in datasets from East, West, and South Africa. Comparisons with a subset of 95 SSA high-coverage whole-genome sequences (WGSs) show that despite being about 20-fold smaller, the AGR imputed dataset has higher concordance with the WGSs. Moreover, the level of concordance between imputed and WGS datasets was strongly influenced by the extent of Khoe-San ancestry in a genome, highlighting the need for integration of not only geographically but also ancestrally diverse WGS data in reference panels for further improvement in imputation of SSA datasets. Approaches that integrate imputed data from different panels could also lead to better imputation | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a AGR | |
650 | 4 | |a Africa | |
650 | 4 | |a GWAS | |
650 | 4 | |a TOPMed | |
650 | 4 | |a imputation | |
650 | 4 | |a imputation accuracy | |
650 | 4 | |a non-reference discordance rate | |
650 | 4 | |a reference panel | |
650 | 4 | |a whole-genome sequence | |
700 | 1 | |a Botha, Gerrit |e verfasserin |4 aut | |
700 | 1 | |a Meintjes, Ayton |e verfasserin |4 aut | |
700 | 1 | |a Mbiyavanga, Mamana |e verfasserin |4 aut | |
700 | 0 | |a AWI-Gen Study |e verfasserin |4 aut | |
700 | 0 | |a H3Africa Consortium |e verfasserin |4 aut | |
700 | 1 | |a Hazelhurst, Scott |e verfasserin |4 aut | |
700 | 1 | |a Mulder, Nicola |e verfasserin |4 aut | |
700 | 1 | |a Ramsay, Michèle |e verfasserin |4 aut | |
700 | 1 | |a Choudhury, Ananyo |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Cell genomics |d 2021 |g 3(2023), 6 vom: 14. Juni, Seite 100332 |w (DE-627)NLM333578430 |x 2666-979X |7 nnns |
773 | 1 | 8 | |g volume:3 |g year:2023 |g number:6 |g day:14 |g month:06 |g pages:100332 |
856 | 4 | 0 | |u http://dx.doi.org/10.1016/j.xgen.2023.100332 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 3 |j 2023 |e 6 |b 14 |c 06 |h 100332 |