The Story
The journey of mtDNA haplogroup C*
Origins and Evolution
mtDNA haplogroup C is a branch of macro-haplogroup M (often placed within the M8'CZ cluster) and is thought to have arisen in northeastern Asia or Siberia during the Late Pleistocene. C* denotes basal or unclassified lineages within haplogroup C that are not assigned to named downstream subclades (for example, C1, C4, C5). Based on phylogenetic placement and molecular-clock estimates for the C clade, the origin of C and its early diversification are commonly dated to about 30-40 kya, consistent with a Pleistocene expansion of maternal lineages across northern Asia.
Subclades
The broader haplogroup C includes several well-characterized subclades with distinct geographic patterns: C1 (important among Indigenous peoples of the Americas), C4 and C5 (widely reported in Siberia and parts of Central Asia), and other localized branches. C* refers to samples that carry diagnostic markers of haplogroup C but cannot be reliably placed into these named subclades with available sequence data; such basal C lineages can reflect either ancient diversity that predates major subclade splits or lineages that have not yet been fully resolved by modern phylogenies.
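The assignment logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marker names are hypothetical placeholders rather than real mtDNA positions, the `classify` function is invented for this example, and real haplogroup callers work against a curated phylogeny with many more markers. The sketch separates the two situations the text mentions: a sample that has been tested for every named subclade and excluded from all of them, and a sample that stays at C* only because the relevant positions were not sequenced.

```python
from typing import Dict, Set, Tuple

# Hypothetical placeholder marker names (NOT real mtDNA positions).
C_MARKERS: Set[str] = {"C_marker_1", "C_marker_2"}
SUBCLADE_MARKERS: Dict[str, Set[str]] = {
    "C1": {"C1_marker"},
    "C4": {"C4_marker"},
    "C5": {"C5_marker"},
}
ALL_SITES: Set[str] = C_MARKERS.union(*SUBCLADE_MARKERS.values())


def classify(observed: Set[str], covered: Set[str]) -> Tuple[str, str]:
    """Return (label, reason) for one sample.

    observed: markers found in the derived state in the sample.
    covered:  markers whose positions were actually sequenced.
    """
    # The sample must carry every clade-defining marker to count as C.
    if not C_MARKERS <= observed:
        return ("not C", "clade-defining markers missing")

    # Place the sample in the first named subclade whose markers it carries.
    for name, markers in SUBCLADE_MARKERS.items():
        if markers <= observed:
            return (name, "subclade markers observed")

    # No subclade matched: check whether that is due to missing coverage.
    untested = sorted(
        name for name, markers in SUBCLADE_MARKERS.items()
        if not markers <= covered
    )
    if untested:
        return ("C*", "unresolved: not sequenced for " + ", ".join(untested))
    return ("C*", "basal: all named subclades tested and excluded")


# Full coverage with a C1 marker, full coverage without subclade markers,
# and partial coverage that only reaches the clade-defining sites.
print(classify(C_MARKERS | {"C1_marker"}, ALL_SITES))
print(classify(C_MARKERS, ALL_SITES))
print(classify(C_MARKERS, C_MARKERS))
```

Returning a reason alongside the label keeps the distinction visible: the same C* label can mean genuinely basal diversity in one sample and incomplete sequence data in another.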
Geographical Distribution
Haplogroup C and its derivatives are distributed across northern and eastern Asia, Central Asia, and the Americas. C* specifically is most often detected at low-to-moderate frequencies among Siberian and northeastern Asian hunter-gatherer groups and appears in some modern East Asian and Central Asian populations. In the Americas, C (including certain subclades) is one of the founding maternal lineages, so samples from ancient or admixed contexts in North and South America may be reported as basal C when complete subclade resolution is not available. Ancient DNA studies have recovered C-lineage sequences in Pleistocene and Holocene contexts, supporting a role in prehistoric migrations such as the peopling of the Americas via Beringia.
Historical and Cultural Significance
Haplogroup C and its sublineages are tied to major prehistoric demographic events in northern Eurasia. The distribution of C and related lineages supports scenarios of Late Pleistocene population structure in Siberia, a possible Beringian standstill or differentiation phase, and one or more subsequent migrations into the Americas during the Late Pleistocene or early Holocene. In Siberia and adjacent regions, C-lineages appear among groups traditionally described as Paleo-Siberian, Tungusic, Mongolic, and some Turkic-speaking peoples, reflecting both deep continuity and later population movements. In Arctic regions, C and specific subclades have been associated with the maternal ancestry of Inuit and Yupik groups, linking genetic patterns to adaptations and cultural continuity in high-latitude environments.
Conclusion
C* represents the basal, unassigned portion of the mtDNA C clade and is a useful marker of deep maternal ancestry in northeastern Asia and related populations. While many meaningful geographic and historical inferences come from named C subclades (e.g., C1 in the Americas), C* occurrences highlight either undercharacterized ancient diversity or limitations of partial sequence data. Continued whole-mitochondrial sequencing and ancient DNA sampling will refine the placement of C* lineages and improve understanding of their roles in prehistoric migrations across Siberia, Central/East Asia, and into the Americas.