Overestimated prediction using polygenic prediction derived from summary statistics
© 2023. BioMed Central Ltd., part of Springer Nature.
BACKGROUND: When a polygenic risk score (PRS) is derived from summary statistics, independence between the discovery and test sets cannot be monitored. We compared two types of PRS studies: those derived from raw genetic data (denoted rPRS) and those derived from the IGAP summary statistics (denoted sPRS).
RESULTS: Two highly heritable traits in UK Biobank, hypertension and height, were used to derive a reference scale for the effect of PRS. sPRS without APOE, derived from the International Genomics of Alzheimer's Project (IGAP), yields ΔAUC and ΔR² of 0.051 ± 0.013 and 0.063 ± 0.015 for the Alzheimer's Disease Sequencing Project (ADSP) and 0.060 and 0.086 for the Accelerating Medicine Partnership - Alzheimer's Disease (AMP-AD). On UK Biobank, rPRS performance for hypertension, assuming discovery and test sets of similar size, is 0.0036 ± 0.0027 (ΔAUC) and 0.0032 ± 0.0028 (ΔR²). For height, ΔR² is 0.029 ± 0.0037.
CONCLUSION: Given the high heritability of hypertension and height and the sample size of UK Biobank, the sPRS results from the AD databases are inflated. Independence between discovery and test sets is a well-known basic requirement for PRS studies. However, many PRS studies cannot meet this requirement because the two sets cannot be compared directly when only summary statistics are available. Thus, for sPRS, potential sample duplication within the same ethnic group should be considered carefully.
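The inflation described in the conclusion can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' method: the "genotypes" are standardized Gaussian values, the phenotype is pure noise (no true genetic signal), the overlap between discovery and test sets is total, and the function name `simulate_r2` and its parameters are hypothetical. Under these assumptions any R² above roughly zero is an artifact, so comparing the two returned values shows how strongly sample overlap alone can inflate apparent PRS performance.

```python
import random


def simulate_r2(n=200, m=300, seed=0):
    """Return (R^2 on the overlapping set, R^2 on an independent set).

    Hypothetical toy setup: n samples, m independent standardized
    "variants", and a phenotype with no genetic signal at all.
    """
    rng = random.Random(seed)

    def genotypes():
        return [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]

    def phenotype():
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Discovery set and a fully independent test set.
    x_disc, y_disc = genotypes(), phenotype()
    x_new, y_new = genotypes(), phenotype()

    # Marginal per-variant effect estimates from the discovery set,
    # standing in for GWAS summary statistics.
    beta = [
        sum(x_disc[i][j] * y_disc[i] for i in range(n)) / n
        for j in range(m)
    ]

    def prs(x):
        # Score = weighted sum of genotypes using the estimated effects.
        return [sum(b * g for b, g in zip(beta, row)) for row in x]

    def r_squared(a, b):
        # Squared Pearson correlation between two equal-length lists.
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
        var_a = sum((p - mean_a) ** 2 for p in a)
        var_b = sum((q - mean_b) ** 2 for q in b)
        return cov * cov / (var_a * var_b)

    r2_overlap = r_squared(prs(x_disc), y_disc)  # test set == discovery set
    r2_indep = r_squared(prs(x_new), y_new)      # test set is independent
    return r2_overlap, r2_indep


if __name__ == "__main__":
    overlap, indep = simulate_r2()
    print(f"R^2 with full overlap:    {overlap:.3f}")
    print(f"R^2 with independent set: {indep:.3f}")
```

Real cohorts would overlap only partially and the traits would carry true signal, so the size of the effect here is exaggerated by design; the sketch only shows the direction of the bias.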
| Media type | E-article |
|---|---|
| Year of publication | 2023 |
| Published | 2023 |
| Contained in | Parent record - volume:24 |
| Contained in | BMC genomic data - 24(2023), 1, 14 Sept., page 52 |
| Language | English |
| Contributors | Park, David Keetae [author] |
| Subjects | Alzheimer's disease |
| Notes | Date Completed 18.09.2023; Date Revised 21.11.2023; published: Electronic; Citation Status MEDLINE |
| doi | 10.1186/s12863-023-01151-4 |
| PPN (catalog ID) | NLM362074011 |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM362074011 | ||
003 | DE-627 | ||
005 | 20231226090457.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1186/s12863-023-01151-4 |2 doi | |
028 | 5 | 2 | |a pubmed24n1206.xml |
035 | |a (DE-627)NLM362074011 | ||
035 | |a (NLM)37710206 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Park, David Keetae |e verfasserin |4 aut | |
245 | 1 | 0 | |a Overestimated prediction using polygenic prediction derived from summary statistics |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 18.09.2023 | ||
500 | |a Date Revised 21.11.2023 | ||
500 | |a published: Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a © 2023. BioMed Central Ltd., part of Springer Nature. | ||
520 | |a BACKGROUND: When polygenic risk score (PRS) is derived from summary statistics, independence between discovery and test sets cannot be monitored. We compared two types of PRS studies derived from raw genetic data (denoted as rPRS) and the summary statistics for IGAP (sPRS) | ||
520 | |a RESULTS: Two variables with the high heritability in UK Biobank, hypertension, and height, are used to derive an exemplary scale effect of PRS. sPRS without APOE is derived from International Genomics of Alzheimer's Project (IGAP), which records ΔAUC and ΔR2 of 0.051 ± 0.013 and 0.063 ± 0.015 for Alzheimer's Disease Sequencing Project (ADSP) and 0.060 and 0.086 for Accelerating Medicine Partnership - Alzheimer's Disease (AMP-AD). On UK Biobank, rPRS performances for hypertension assuming a similar size of discovery and test sets are 0.0036 ± 0.0027 (ΔAUC) and 0.0032 ± 0.0028 (ΔR2). For height, ΔR2 is 0.029 ± 0.0037 | ||
520 | |a CONCLUSION: Considering the high heritability of hypertension and height of UK Biobank and sample size of UK Biobank, sPRS results from AD databases are inflated. Independence between discovery and test sets is a well-known basic requirement for PRS studies. However, a lot of PRS studies cannot follow such requirements because of impossible direct comparisons when using summary statistics. Thus, for sPRS, potential duplications should be carefully considered within the same ethnic group | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, U.S. Gov't, Non-P.H.S. | |
650 | 4 | |a Research Support, Non-U.S. Gov't | |
650 | 4 | |a Alzheimer’s disease | |
650 | 4 | |a Complex genetic disease | |
650 | 4 | |a Overestimation bias | |
650 | 4 | |a Polygenic risk score | |
700 | 1 | |a Chen, Mingshen |e verfasserin |4 aut | |
700 | 1 | |a Kim, Seungsoo |e verfasserin |4 aut | |
700 | 1 | |a Joo, Yoonjung Yoonie |e verfasserin |4 aut | |
700 | 1 | |a Loving, Rebekah K |e verfasserin |4 aut | |
700 | 1 | |a Kim, Hyoung Seop |e verfasserin |4 aut | |
700 | 1 | |a Cha, Jiook |e verfasserin |4 aut | |
700 | 1 | |a Yoo, Shinjae |e verfasserin |4 aut | |
700 | 1 | |a Kim, Jong Hun |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t BMC genomic data |d 2021 |g 24(2023), 1 vom: 14. Sept., Seite 52 |w (DE-627)NLM32128156X |x 2730-6844 |7 nnns |
773 | 1 | 8 | |g volume:24 |g year:2023 |g number:1 |g day:14 |g month:09 |g pages:52 |
856 | 4 | 0 | |u http://dx.doi.org/10.1186/s12863-023-01151-4 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 24 |j 2023 |e 1 |b 14 |c 09 |h 52 |