The 1000 Chinese Pangenome empowers medical and population genetics.
Wang Yifei, Y Duan, Zhongqu Z et al.
Publication Details
Abstract
Pangenomes are revolutionizing our ability to resolve genomic regions with complex variations [1]. However, existing human pangenomes [2,3], constrained by small sample sizes, provide limited utility for medical and population genetic applications. Here we generated 1,116 diploid genome assemblies (55 de novo and 1,061 pangenome-informed) with an average size of 2.98 Gb and a mean quality value of 46 as part of the 1000 Chinese Pangenome (1KCP) project. On the basis of these assemblies, we constructed a pangenome comprising 405.3 million base pairs of sequences absent from the current references GRCh38 and CHM13, including 26.2 million base pairs of functional genic and predicted regulatory elements. We catalogued a full spectrum of genetic variation, including 35.4 million small variants, 110,530 structural variants (SVs), 485,575 tandem repeats (TRs) and 0.86 million nested variants embedded in non-reference sequences. This extensive dataset enabled detailed characterization of multiscale genic variations relevant to medical genetics, including gene-altering SVs, TR expansions, gene cluster variations and HLA gene haplotypes. Coupled with the 1KCP gene expression data, we conducted pan-variant expression quantitative trait locus (eQTL) mapping to analyse diverse variant types. We identified 3,256 eQTLs involving complex variants (SVs, TRs and nested variants) and elucidated their regulatory complexity. Finally, we developed a 1KCP pan-variant imputation reference panel, which provides multitype genetic markers to enhance the resolution of future association studies. This resource advances our understanding of complex variants and their functional implications to provide new insights into human health.
Analysis
Important Disclaimer: This review has been performed semi-automatically and is provided for informational purposes only. While we strive for accuracy, this analysis may contain errors, omissions, or misinterpretations of the original research. DNA Genics disclaims all liability for any inaccuracies, errors, or consequences arising from the use of this information. Users should independently verify all information and consult original research publications before making any decisions based on this content. This analysis is not intended as a substitute for professional scientific review or medical advice.