A pangenome reference of 36 Chinese populations
© 2023. The Author(s).
Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype-phased de novo assemblies based on 58 core samples representing 36 minority Chinese ethnic groups. With an average 30.65× high-fidelity long-read sequence coverage, an average contiguity N50 of more than 35.63 megabases and an average total size of 3.01 gigabases, the CPC core assemblies add 189 million base pairs of euchromatic polymorphic sequences and 1,367 protein-coding gene duplications to GRCh38. We identified 15.9 million small variants and 78,072 structural variants, of which 5.9 million small variants and 34,223 structural variants were not reported in a recently released pangenome reference¹. The Chinese Pangenome Consortium data demonstrate a remarkable increase in the discovery of novel and missing sequences when individuals are included from underrepresented minority ethnic groups. The missing reference sequences were enriched with archaic-derived alleles and genes that confer essential functions related to keratinization, response to ultraviolet radiation, DNA repair, immunological responses and lifespan, implying great potential for shedding new light on human evolution and recovering missing heritability in complex disease mapping.
Field | Value |
---|---|
Media type | E-article |
Year of publication | 2023 |
Published | 2023 |
Contained in | Nature - 619(2023), 7968, 14 July, pages 112-121 (volume:619) |
Language | English |
Contributors | Gao, Yang [author] |
Notes | Date Completed 24.07.2023; Date Revised 24.07.2023; published: Print-Electronic; Citation Status MEDLINE |
DOI | 10.1038/s41586-023-06173-7 |
PPN (catalog ID) | NLM35818259X |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM35818259X | ||
003 | DE-627 | ||
005 | 20231226074211.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1038/s41586-023-06173-7 |2 doi | |
028 | 5 | 2 | |a pubmed24n1193.xml |
035 | |a (DE-627)NLM35818259X | ||
035 | |a (NLM)37316654 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Gao, Yang |e verfasserin |4 aut | |
245 | 1 | 2 | |a A pangenome reference of 36 Chinese populations |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 24.07.2023 | ||
500 | |a Date Revised 24.07.2023 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a © 2023. The Author(s). | ||
520 | |a Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype-phased de novo assemblies based on 58 core samples representing 36 minority Chinese ethnic groups. With an average 30.65× high-fidelity long-read sequence coverage, an average contiguity N50 of more than 35.63 megabases and an average total size of 3.01 gigabases, the CPC core assemblies add 189 million base pairs of euchromatic polymorphic sequences and 1,367 protein-coding gene duplications to GRCh38. We identified 15.9 million small variants and 78,072 structural variants, of which 5.9 million small variants and 34,223 structural variants were not reported in a recently released pangenome reference¹. The Chinese Pangenome Consortium data demonstrate a remarkable increase in the discovery of novel and missing sequences when individuals are included from underrepresented minority ethnic groups. The missing reference sequences were enriched with archaic-derived alleles and genes that confer essential functions related to keratinization, response to ultraviolet radiation, DNA repair, immunological responses and lifespan, implying great potential for shedding new light on human evolution and recovering missing heritability in complex disease mapping. | ||
650 | 4 | |a Journal Article | |
650 | 7 | |a Euchromatin |2 NLM | |
650 | 7 | |a Keratins |2 NLM | |
650 | 7 | |a 68238-35-7 |2 NLM | |
700 | 1 | |a Yang, Xiaofei |e verfasserin |4 aut | |
700 | 1 | |a Chen, Hao |e verfasserin |4 aut | |
700 | 1 | |a Tan, Xinjiang |e verfasserin |4 aut | |
700 | 1 | |a Yang, Zhaoqing |e verfasserin |4 aut | |
700 | 1 | |a Deng, Lian |e verfasserin |4 aut | |
700 | 1 | |a Wang, Baonan |e verfasserin |4 aut | |
700 | 1 | |a Kong, Shuang |e verfasserin |4 aut | |
700 | 1 | |a Li, Songyang |e verfasserin |4 aut | |
700 | 1 | |a Cui, Yuhang |e verfasserin |4 aut | |
700 | 1 | |a Lei, Chang |e verfasserin |4 aut | |
700 | 1 | |a Wang, Yimin |e verfasserin |4 aut | |
700 | 1 | |a Pan, Yuwen |e verfasserin |4 aut | |
700 | 1 | |a Ma, Sen |e verfasserin |4 aut | |
700 | 1 | |a Sun, Hao |e verfasserin |4 aut | |
700 | 1 | |a Zhao, Xiaohan |e verfasserin |4 aut | |
700 | 1 | |a Shi, Yingbing |e verfasserin |4 aut | |
700 | 1 | |a Yang, Ziyi |e verfasserin |4 aut | |
700 | 1 | |a Wu, Dongdong |e verfasserin |4 aut | |
700 | 1 | |a Wu, Shaoyuan |e verfasserin |4 aut | |
700 | 1 | |a Zhao, Xingming |e verfasserin |4 aut | |
700 | 1 | |a Shi, Binyin |e verfasserin |4 aut | |
700 | 1 | |a Jin, Li |e verfasserin |4 aut | |
700 | 1 | |a Hu, Zhibin |e verfasserin |4 aut | |
700 | 0 | |a Chinese Pangenome Consortium (CPC) |e verfasserin |4 aut | |
700 | 1 | |a Lu, Yan |e verfasserin |4 aut | |
700 | 1 | |a Chu, Jiayou |e verfasserin |4 aut | |
700 | 1 | |a Ye, Kai |e verfasserin |4 aut | |
700 | 1 | |a Xu, Shuhua |e verfasserin |4 aut | |
700 | 1 | |a Mao, Chuangxue |e investigator |4 oth | |
700 | 1 | |a Fan, Shaohua |e investigator |4 oth | |
700 | 1 | |a Gao, Qiang |e investigator |4 oth | |
700 | 1 | |a Dai, Juncheng |e investigator |4 oth | |
700 | 1 | |a Bu, Fengxiao |e investigator |4 oth | |
700 | 1 | |a He, Guanglin |e investigator |4 oth | |
700 | 1 | |a Wu, Yang |e investigator |4 oth | |
700 | 1 | |a Yuan, Huijun |e investigator |4 oth | |
700 | 1 | |a Li, Jinchen |e investigator |4 oth | |
700 | 1 | |a Chen, Chao |e investigator |4 oth | |
700 | 1 | |a Yang, Jian |e investigator |4 oth | |
700 | 1 | |a Wei, Chaochun |e investigator |4 oth | |
700 | 1 | |a Jin, Xin |e investigator |4 oth | |
700 | 1 | |a Shen, Xia |e investigator |4 oth | |
773 | 0 | 8 | |i Enthalten in |t Nature |d 1945 |g 619(2023), 7968 vom: 14. Juli, Seite 112-121 |w (DE-627)NLM000008257 |x 1476-4687 |7 nnns |
773 | 1 | 8 | |g volume:619 |g year:2023 |g number:7968 |g day:14 |g month:07 |g pages:112-121 |
856 | 4 | 0 | |u http://dx.doi.org/10.1038/s41586-023-06173-7 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 619 |j 2023 |e 7968 |b 14 |c 07 |h 112-121 |