Unlock your genome's hidden variants
Genotype imputation uses linkage disequilibrium patterns from reference populations to statistically predict the ~29 million variants your DNA chip didn't directly measure, enabling finer-resolution ancestry analysis, archaeological comparisons, and academic research workflows.
Why impute your DNA data?
Your raw DNA chip captures ~700K SNPs, less than 3% of common human variation. Imputation predicts the remaining variants, unlocking advanced analysis capabilities.
Higher-Resolution PCA
More variants means finer population clustering in principal component analysis. Distinguish closely related populations that appear merged with chip-only data.
Archaeological Comparisons
Ancient DNA studies use imputed data for qpAdm, f-statistics, and admixture modeling. Match the SNP density of published aDNA datasets.
Academic Research Workflows
Imputed data integrates seamlessly with standard bioinformatics pipelines. Output files are PLINK-compatible and ready for downstream analysis.
Third-Party Platform Compatibility
Many ancestry platforms benefit from denser SNP coverage. Improve your results on sites that accept imputed or high-density raw files.
What is genotype imputation?
A statistical method that leverages population-level haplotype patterns to predict genetic variants not directly genotyped by your DNA chip.
The Simple Version
Your DNA chip tests specific positions (SNPs) across your genome, typically 600K-900K sites. But the human genome has over 80 million known variants. Imputation fills in the gaps by comparing your tested SNPs against a reference panel of fully sequenced individuals. Because nearby genetic variants tend to be inherited together in blocks (a correlation known as linkage disequilibrium), we can statistically infer what's in the untested regions with high confidence.
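The idea can be illustrated with a toy sketch. It is not the pipeline's actual algorithm: production imputation tools use probabilistic haplotype models, whereas this example simply finds the reference haplotypes that best match the typed sites and takes a majority vote at each untyped site. The panel, the `impute_haplotype` function, and the `?` placeholder for untyped sites are all made up for illustration.

```python
from collections import Counter

# Toy reference panel (made-up data): each string is one fully sequenced
# haplotype, with one allele (0 or 1) per site.
REFERENCE_PANEL = [
    "00110",
    "00110",
    "11001",
    "11001",
    "11011",
]


def impute_haplotype(observed, panel):
    """Fill '?' sites in `observed` from the closest panel haplotypes.

    observed: string of '0', '1', or '?' (untyped site), same length
    as every haplotype in `panel`.
    """
    typed = [i for i, allele in enumerate(observed) if allele != "?"]

    def mismatches(hap):
        # Count disagreements at the sites the chip actually measured.
        return sum(hap[i] != observed[i] for i in typed)

    best = min(mismatches(hap) for hap in panel)
    closest = [hap for hap in panel if mismatches(hap) == best]

    result = []
    for i, allele in enumerate(observed):
        if allele != "?":
            result.append(allele)  # keep measured alleles unchanged
        else:
            # Majority vote among the best-matching reference haplotypes.
            votes = Counter(hap[i] for hap in closest)
            result.append(votes.most_common(1)[0][0])
    return "".join(result)


print(impute_haplotype("0?1?0", REFERENCE_PANEL))  # prints 00110
```

Because the typed sites `0`, `1`, `0` match only the first two panel haplotypes, the untyped sites are copied from them. That is linkage disequilibrium at work in miniature: alleles at measured sites carry information about their unmeasured neighbours.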
Our Imputation Pipeline
Research-grade bioinformatics workflow using industry-standard tools
Before & After Imputation
See the dramatic increase in genomic coverage
~2.3% of common variants
~43× more variants
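The two figures follow from the counts quoted on this page. The short check below assumes a ~700K-SNP chip file and the ~30M-SNP imputed file, and assumes the percentage is measured against the imputed set; the variable names are illustrative.

```python
chip_snps = 700_000        # typical chip density quoted on this page
imputed_snps = 30_000_000  # size of the full imputed output

fraction = chip_snps / imputed_snps  # share covered before imputation
fold = imputed_snps / chip_snps      # increase after imputation

print(f"{fraction:.1%} of variants before imputation")  # 2.3%
print(f"~{fold:.0f}x more variants after imputation")   # ~43x
```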
Compatible Downstream Tools
What You'll Receive
Two output files optimized for different use cases, with clear documentation
30M SNPs - Research-Grade
- Complete imputed output (r² ≥ 0.3)
- 23andMe-style format, convertible to PLINK
- Suitable for PCA, admixture, f-statistics
- GRCh37/hg19, chromosomes 1-22
2.5M SNPs - Optimized Subset
- High-confidence variants (r² ≥ 0.8)
- Optimized for third-party platforms
- Fast uploads, quick processing
- Same format, smaller file size
Best for: GEDmatch, MyTrueAncestry, and other sites where you need a balance of density and file size.
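For readers who want to see what "23andMe-style, convertible to PLINK" can look like in practice, here is a minimal sketch. It assumes the usual tab-separated layout (rsid, chromosome, position, genotype, with `#` comment lines) and `--` for no-calls; the exact header and columns of the delivered files are not specified on this page. `parse_raw`, `to_plink_text`, the sample ID, and the sample rows are hypothetical.

```python
import io

# Illustrative sample in the assumed 23andMe-style layout.
SAMPLE = """# rsid\tchromosome\tposition\tgenotype
rs3094315\t1\t752566\tAG
rs12562034\t1\t768448\tGG
rs9999999\t2\t12345\t--
"""


def parse_raw(handle):
    """Return a list of (rsid, chrom, position, genotype) tuples."""
    records = []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        rsid, chrom, pos, genotype = line.split("\t")
        records.append((rsid, chrom, int(pos), genotype))
    return records


def to_plink_text(records, sample_id="SAMPLE1"):
    """Build PLINK text-format content: (.map text, single .ped line)."""
    map_lines = []
    alleles = []
    for rsid, chrom, pos, genotype in records:
        # .map columns: chromosome, variant ID, genetic distance, bp position
        map_lines.append(f"{chrom}\t{rsid}\t0\t{pos}")
        if len(genotype) == 2 and "-" not in genotype:
            alleles.extend([genotype[0], genotype[1]])
        else:
            alleles.extend(["0", "0"])  # PLINK's missing-allele code
    # .ped columns: family ID, individual ID, father, mother, sex, phenotype
    ped_line = " ".join([sample_id, sample_id, "0", "0", "0", "-9"] + alleles)
    return "\n".join(map_lines), ped_line


records = parse_raw(io.StringIO(SAMPLE))
map_text, ped_line = to_plink_text(records)
print(ped_line)  # SAMPLE1 SAMPLE1 0 0 0 -9 A G G G 0 0
```

For a 30M-SNP file, streaming the output to disk instead of building lists in memory would be the more practical design. PLINK 1.9 also documents a `--23file` input option that performs this kind of conversion directly.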
Compatible DNA Providers
Imputation quality depends on your chip's SNP density. Higher-density chips yield better results.
* WGS providers require conversion to RAW format first. Use our WGS to RAW service.
Important Technical Limitations
Imputation is powerful but not perfect. Here's what you should know.
Expected Error Rates
Imputation accuracy varies by allele frequency. Common variants (MAF > 5%) are imputed with ~98% accuracy; rare variants (MAF < 1%) have higher error rates.
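A frequency-stratified accuracy check of this kind can be sketched as follows. The example assumes you have a set of sites where both a directly measured genotype and an imputed genotype are available, coded as alternate-allele counts (0, 1, or 2). The data are toy values, not this service's measured accuracy, and the "low-frequency" bin for MAF between 1% and 5% is an added convention.

```python
def maf_bin(maf):
    """Assign a site to a frequency class using the cut-offs quoted above."""
    if maf > 0.05:
        return "common"
    if maf < 0.01:
        return "rare"
    return "low-frequency"


def concordance_by_bin(sites):
    """sites: iterable of (maf, true_genotype, imputed_genotype).

    Returns {bin label: fraction of sites where imputed == true}.
    """
    totals = {}
    matches = {}
    for maf, truth, imputed in sites:
        label = maf_bin(maf)
        totals[label] = totals.get(label, 0) + 1
        matches[label] = matches.get(label, 0) + (truth == imputed)
    return {label: matches[label] / totals[label] for label in totals}


# Toy data (made up): (minor allele frequency, true genotype, imputed genotype)
TOY_SITES = [
    (0.30, 1, 1), (0.20, 0, 0), (0.12, 2, 2), (0.08, 1, 1),
    (0.03, 0, 0), (0.02, 1, 0),
    (0.005, 1, 0), (0.004, 0, 0),
]

for label, rate in concordance_by_bin(TOY_SITES).items():
    print(f"{label}: {rate:.0%} concordant")
```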
Autosomes Only
We impute chromosomes 1-22 only. mtDNA and Y-DNA are not imputed as they require specialized reference panels and analysis pipelines.
GRCh37/hg19 Reference
All coordinates are in the GRCh37 (hg19) assembly. If you need GRCh38, you'll need to lift over the positions using tools like CrossMap or UCSC liftOver.
Not for Health/Medical Use
Imputed genotypes are statistical estimates, not direct measurements. They are not suitable for clinical decision-making or health-related interpretations.
Simple Pricing
One order includes both output files. No subscription.
DNA Imputation Service
- Full 30M SNP file (r² ≥ 0.3)
- High-confidence 2.5M file (r² ≥ 0.8)
- 1000 Genomes Phase 3 reference panel
- ~24 hour average turnaround
- Secure processing & private download
- Email support for questions
- Files deleted after 15 days
Legal & Disclaimer
Please review before ordering
Intended Use
This service is intended solely for ancestry research, genealogy, and academic purposes. It is not designed, intended, or suitable for health-related uses or medical decision-making.
Key Terms
- Statistical Nature: Imputed genotypes are probabilistic estimates, not direct measurements. Error rates of 1-5% are expected depending on variant frequency.
- No Health Claims: We explicitly disclaim responsibility for any health interpretations made from imputed data.
- Third-Party Compatibility: Compatibility with external platforms is provided as-is and not guaranteed.
- No Warranty: This service is provided "as is" without warranties of any kind.
- Data Handling: Your files are processed securely and deleted 15 days after delivery.
Unlock your genome's hidden variants
Get research-grade imputed data for advanced ancestry analysis and academic workflows.