2/27 = 7.4% F-M89(xK-M9)
2/27 = 7.4% O-M175(xO1a-M119, O2a1-M95, O3-M122)
8/27 = 29.6% O3-M122(xM7, M134)
2/27 = 7.4% O3a2c1-M134(xM117)
2/27 = 7.4% O3a2c1a-M117
1/27 = 3.7% O1a-M119(xM110)
10/27 = 37.0% O2a1-M95(xM88)
Kinh from Hanoi, Vietnam (He et al. 2012)
9/76 = 11.8% C2-M217
1/76 = 1.3% K-P131(xN-M231, O-P191, Q1-P36, R-M207)
2/76 = 2.6% N-M231
5/76 = 6.6% O1a1-P203(xM101)
9/76 = 11.8% O2a1-M95(xM88)
23/76 = 30.3% O2a1a-M88
7/76 = 9.2% O3a-P200(xM121, M164, P201, JST002611)
2/76 = 2.6% O3a2-P201(xM7, M134)
8/76 = 10.5% O3a2b-M7
7/76 = 9.2% O3a2c1-M134
2/76 = 2.6% O3a1c-JST002611
1/76 = 1.3% R1a1a-M17
Vietnam (Karafet et al. 2010)
3/70 = 4.3% C2-M217
2/70 = 2.9% D1a1-M15
1/70 = 1.4% J-M304(xJ1-M267, J2-M172)
1/70 = 1.4% J2-M172(xJ2b-M12)
2/70 = 2.9% N-M231 [LLY22g+]
2/70 = 2.9% O3a-P197(xP201, JST002611)
10/70 = 14.3% O3a1c-JST002611
1/70 = 1.4% O3a2-P201(xM7, M134)
4/70 = 5.7% O3a2b-M7
11/70 = 15.7% O3a2c1-M134
4/70 = 5.7% O1a1-P203
1/70 = 1.4% O2-P31(xO2a1-M95, O2b-SRY465)
1/70 = 1.4% O2b-SRY465(x47z)
2/70 = 2.9% O2b1a-47z
5/70 = 7.1% O2a1-M95(xM111)
14/70 = 20.0% O2a1a-M111
5/70 = 7.1% Q1-P36(xM346)
1/70 = 1.4% R1a1a-M17
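The lists above give each haplogroup as a count over the sample size, followed by a rounded percentage. A minimal sketch of how those percentages can be re-derived and checked, using the counts from the 70-sample list; the dictionary name and helper function are illustrative assumptions, not part of any cited study:

```python
# Counts copied from the 70-sample list above (n = 70).
# VIETNAM_COUNTS and frequencies() are hypothetical names used for illustration.
VIETNAM_COUNTS = {
    "C2-M217": 3,
    "D1a1-M15": 2,
    "J-M304(xJ1-M267, J2-M172)": 1,
    "J2-M172(xJ2b-M12)": 1,
    "N-M231 [LLY22g+]": 2,
    "O3a-P197(xP201, JST002611)": 2,
    "O3a1c-JST002611": 10,
    "O3a2-P201(xM7, M134)": 1,
    "O3a2b-M7": 4,
    "O3a2c1-M134": 11,
    "O1a1-P203": 4,
    "O2-P31(xO2a1-M95, O2b-SRY465)": 1,
    "O2b-SRY465(x47z)": 1,
    "O2b1a-47z": 2,
    "O2a1-M95(xM111)": 5,
    "O2a1a-M111": 14,
    "Q1-P36(xM346)": 5,
    "R1a1a-M17": 1,
}


def frequencies(counts):
    """Return the sample size and each haplogroup's share in percent (1 decimal)."""
    total = sum(counts.values())
    shares = {hg: round(100 * n / total, 1) for hg, n in counts.items()}
    return total, shares


total, shares = frequencies(VIETNAM_COUNTS)
print(f"sample size: {total}")
for haplogroup, pct in shares.items():
    print(f"{VIETNAM_COUNTS[haplogroup]}/{total} = {pct}% {haplogroup}")
```

Checking that the counts add up to the stated sample size is a quick way to catch a dropped or duplicated row when such lists are copied between sources.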
Last Glacial Maximum, haplogroup O split up into the subclades O1 (MSY2.2), O2 (M268) and O3 (M122).
The three subclades can be putatively assigned to three geographical loci along an east-west axis without any claim to geographical precision. Whereas haplogroup O1 moved to the drainage of the Pearl River and its tributaries, the bearers of haplogroup O2 moved to southern Yunnan, whilst bearers of haplogroup O3 remained in the Eastern Himalaya. The O2 clade split into O2a (M95) and O2b (M176). Asian rice may have first been domesticated roughly in the area hypothetically imputed to O2 south of the central Yangtze.
The bearers of the subclade O2a became the ancestors of the Austroasiatics, who spread initially to the Salween drainage in northeastern Burma, to northern Thailand and to western Laos. In time, the Austroasiatics would spread as far as the Mekong delta, the Malay peninsula and the Nicobars. Later, early Austroasiatics would introduce both their language and their paternal lineage to indigenous peoples of eastern India, whose descendants are today’s Munda language communities.
Meanwhile, the bearers of the fraternal subclade O2b spread eastward, where they introduced rice agriculture to areas downstream south of the Yangtze. The bearers of the O2b haplogroup continued to sow seed as they continued to move ever further eastward, but they left no linguistic traces. This paternal lineage moved as far as the Korean peninsula and represents the second major wave of peopling attested in the Japanese genome. Yet the Japanese speak a language of the Altaic linguistic phylum, and the peopling of Japan is a distinct episode of prehistory.
At the dawn of the Holocene in the Eastern Himalaya, haplogroup O3 gave rise to the ancestral Trans-Himalayan paternal lineage O3a3c (M134) and the original Hmong-Mien paternal lineage O3a3b (M7). The bearers of haplogroup O3a3c stayed behind in the Eastern Himalaya, whilst bearers of the O3a3b lineage migrated east to settle in areas south of the Yangtze. On their way, the early Hmong-Mien encountered the ancient Austroasiatics, from whom they adopted rice agriculture.
The interaction between ancient Austroasiatics and the early Hmong-Mien not only involved the sharing of rice agriculture technology, but also left high frequencies of haplogroup O2a in today’s Hmong-Mien and haplogroup O3a3b in today’s Austroasiatic populations. The Austroasiatic paternal contribution to Hmong-Mien populations was modest, but the Hmong-Mien paternal contribution to Austroasiatic populations in Southeast Asia was significant. However, the incidence of haplogroup O3a3b in Austroasiatic communities of the Subcontinent is undetectably low. Subsequently, the Hmong-Mien continued to move eastward, as did bearers of haplogroup O2b.
Even further east, the O1 (MSY2.2) paternal lineage gave rise to the O1a (M119) subclade, which moved from the Pearl River to the Min river drainage in the Fujian hill tracts and then across the Taiwan Strait. Formosa consequently became the homeland of the Austronesians. The Malayo-Polynesian expansion via the Philippines into insular Southeast Asia must have entailed the introduction of Austronesian languages by bearers of haplogroup O1a to resident communities, whose original Austroasiatic paternal haplogroup O2a alongside other older paternal lineages would remain dominant even after linguistic assimilation. Similarly, Malagasy is an Austronesian language, but the Malagasy people trace their biological ancestries equally to Borneo and the African mainland.
Back in the Eastern Himalaya, the paternal spread of Trans-Himalayan is preserved in the distribution of Y-chromosomal haplogroup O3a3c (M134). The centre of phylogenetic diversity of the Trans-Himalayan language family is rooted squarely in the Eastern Himalaya, with outliers trailing off towards the loess plains of the Yellow River basin in the northeast.
While Van Driem posits a southerly origin of O and O3 on the basis of the Trans-Himalayan distribution of O3a3c, DNA studies of populations in and around the Central Plains show O3 to be a securely northern Han marker. In these studies Hengbei emerges as the centre of the northern Han and as a contact zone between northern and southern Han, where haplogroups O*, O2a, O3 and N are mixed in the population.
Zhao Y-B, Zhang Y, Zhang Q-C, Li H-J, Cui Y-Q, Xu Z, et al. (2015) Ancient DNA Reveals That the Genetic Structure of the Northern Han Chinese Was Shaped Prior to 3,000 Years Ago. PLoS ONE 10(5): e0125676. https://doi.org/10.1371/journal.pone.0125676
The Han Chinese are the largest ethnic group in the world, and their origins, development, and expansion are complex. Many genetic studies have shown that Han Chinese can be divided into two distinct groups: northern Han Chinese and southern Han Chinese. The genetic history of the southern Han Chinese has been well studied. However, the genetic history of the northern Han Chinese is still obscure. In order to gain insight into the genetic history of the northern Han Chinese, 89 human remains were sampled from the Hengbei site which is located in the Central Plain and dates back to a key transitional period during the rise of the Han Chinese (approximately 3,000 years ago). We used 64 authentic mtDNA data obtained in this study, 27 Y chromosome SNP data profiles from previously studied Hengbei samples, and genetic datasets of the current Chinese populations and two ancient northern Chinese populations to analyze the relationship between the ancient people of Hengbei and present-day northern Han Chinese. We used a wide range of population genetic analyses, including principal component analyses, shared mtDNA haplotype analyses, and geographic mapping of maternal genetic distances. The results show that the ancient people of Hengbei bore a strong genetic resemblance to present-day northern Han Chinese and were genetically distinct from other present-day Chinese populations and two ancient populations. These findings suggest that the genetic structure of northern Han Chinese was already shaped 3,000 years ago in the Central Plain area.
According to historical documents, the generally accepted view is that the Han Chinese can trace their origins to the Huaxia ethnic group, which formed during the Shang and Zhou dynasties (21st–8th centuries BC) in the Central Plain region of China (Fig 1). During the Han Dynasty (206 BC–220 AD), the Huaxia ethnic group developed into a tribe known as the Han Chinese. Because of their advanced agriculture and technology, this group migrated northward into regions inhabited by many ancient northern ethnic groups that were most likely Altaic in origin. In addition, they migrated south into regions originally inhabited by ancient southern ethnic groups, including those speaking the Daic, Austro-Asiatic, and Hmong-Mien languages. Historically, the Han Chinese dispersed across China, becoming the largest of the 56 officially recognized ethnic groups.
To date, studies of classic genetic markers and microsatellites have revealed that the Han Chinese can be divided into two distinct groups: the northern Han Chinese (NH) and the southern Han Chinese (SH) [9,10]. Based on present-day genetic data from NH, SH, and southern minorities, the genetic history of the SH group has been well studied. The consensus is that the Han Chinese migrated south and contributed greatly to the paternal gene pool of the SH, whereas the Han Chinese and ancient southern ethnic groups both contributed almost equally to the SH maternal gene pool. However, the genetic history of the NH is still obscure. Currently, NH populations inhabit much of northern China, including the Central Plain and many outer regions that were inhabited by ancient northern ethnic groups (Fig 1). The Han Chinese or their ancestors who migrated northward from the Central Plain might have mixed with ancient northern ethnic groups or culturally assimilated the native population. This scenario would indicate that the Han Chinese living in different areas should have genetic profiles that differ from each other. However, genetic analyses have shown that there are no significant differences among the northern Han Chinese populations, which has led to conflicting arguments on whether the genetic structure of the NH is the result of an earlier ethnogenesis or, instead, results from a combination of population admixture and continuous migration of the Han Chinese. The addition of ancient DNA analysis on ancient Han Chinese samples provides increased information that can be used to reconstruct recent human evolutionary events in ancient China.
Until now, only a few genetic studies have investigated the ancient Han Chinese or their ancestors. These studies have been restricted by small sample sizes [14,15], high levels of kinship among samples, and short fragments of mitochondrial DNA (mtDNA) [17,18] and thus provide limited insights into the genetic history of the Han Chinese. Recently, a large number of graves were excavated at a necropolis called Hengbei located in the southern part of Shanxi Province, China, on the Central Plain (Fig 1), that dates back to approximately 3,000 years ago (Zhou dynasty), a key transitional period for the rise of the Han Chinese. In a previous study investigating when haplogroup Q1a1 entered the genetic pool of the Han Chinese, we analyzed Y chromosome single nucleotide polymorphisms (SNPs) from human remains excavated from the Hengbei (HB) site and identified haplogroups for 27 samples. In the present study, we attempted to extract DNA from 89 human remains. Using a combination of Y chromosome SNPs and mtDNA genetic data, we uncover aspects of the genetic structure of the ancient people from the Central Plain region and begin to determine the genetic legacy of the northern Han Chinese in both the maternal and paternal lineages.
The Han Chinese originated from the Central Plain region, which is substantially smaller than the region the Han Chinese now occupy. According to historical documents, the Han Chinese suffered many conflicts with natives prior to expansion into their lands. The Han migrated northward into regions inhabited by many ancient northern ethnic groups. Based on the advanced agriculture, technology, and culture, the Han Chinese or their ancestors often had a greater demographic advantage over ancient northern ethnic groups. Thus, the Han Chinese or their ancestors might have played a predominant role in the genetic mixture of populations. This scenario would mean that the genetic structure of the NH was shaped a long time ago. In our study, the HB population showed great genetic affinities with the NH when maternal lineages were tested. First, the HB contained a distribution and component of mtDNA similar to that of the NH and clustered closely together with the NH in the PCA plot. Second, the HB shared more haplotypes with the NH than with other populations in the haplotype-sharing analysis. Third, the FST value from comparisons between the HB and NH populations was lowest and negative. In theory, FST values range between 0 and 1. However, if the estimated within-population diversity is larger than the estimated variance among groups, negative FST values can be obtained, and these are treated as equal to zero [48,49]. This indicates that HB bore a very high similarity to NH populations. Considering the location and culture of the HB, we suggest that the NH might have provided a significant contribution to the HB and find that the maternal genetic profiles of the NH were shaped 3,000 years ago.
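The point about negative FST values can be illustrated with a small sketch. It assumes the common variance-component form of the estimator, FST = σ²(among) / (σ²(among) + σ²(within)), in which the among-population component is a moment estimate that can come out slightly below zero when two samples are very similar. The function names and the numbers are hypothetical and are not taken from the paper:

```python
def fst_from_variance_components(sigma2_among, sigma2_within):
    """Raw FST estimate from among- and within-population variance components.

    Assumes the variance-component form of the estimator; the estimated
    among-population component may be negative for near-identical samples.
    """
    total = sigma2_among + sigma2_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return sigma2_among / total


def reported_fst(raw_fst):
    """Treat a negative raw estimate as zero, as described in the text."""
    return max(0.0, raw_fst)


# Hypothetical variance components for two very similar populations.
raw = fst_from_variance_components(sigma2_among=-0.02, sigma2_within=2.50)
print(f"raw estimate: {raw:.4f}")
print(f"reported value: {reported_fst(raw):.4f}")
```

A negative raw estimate therefore signals no detectable differentiation between the two samples, not a genuinely negative distance.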
These conclusions are further supported by the relationship between the HB and NM, XN, and XB. In our study, the PCA plot is consistent with the SH not only mixing with the SM but also with the NH, which is consistent with a previous genetic study that concluded that the SH was formed from almost equal contributions of southward migrating Han Chinese and southern natives. However, the NH and NM group into two separate clusters, which is not consistent with their current geographic distributions because these two populations often live together in the northern region of China. Moreover, XN, XB1 and XB2 pool into the NM and are far away from HB and NH. A haplotype-sharing analysis of the three ancient populations and each present-day Han Chinese population shows that the fraction of haplotypes from HB is significantly higher than that from XN, XB1 and XB2 (all of the p values of HB/XN, HB/XB1 and XB2 are less than 0.01, two-tailed t-test; S4 Fig). In the FST comparisons, the FST values of the XN/HB, XB/HB, XB/NH, XN/NH, and NM/NH are significantly higher, and all of the p values are less than 0.05, indicating that the XN and XB were distinct from the NH and HB (S3 Fig). This finding indicates that the ancient populations of the XN and XB had a limited maternal genetic impact on present-day Han Chinese.
Y chromosome SNP analysis was consistent with the conclusions drawn from studying the maternal lineages. In the paternal lineage, HB contained the haplogroups or sub-haplogroups N, O*, O2a, O3 and Q1a1. The total frequencies of these haplogroups reached high levels (66%–100%) in current Han Chinese [11,27,30,52,53]. Haplogroup Q1a1, which was predominant in HB, is highly specific to the Han Chinese. Haplogroup O3, the second highest frequency (33.34%) in HB, occupies the highest frequencies in almost all current Han Chinese populations (32.5%–76.92%) [11,27,30,52,53]. Moreover, in the PCA plot, HB groups closely with the Han Chinese. These results indicate that the 3,000-year-old ancient people from the Central Plain region share similar paternal genetic profiles with the current Han Chinese. In contrast, XN yielded three haplogroups (N3, Q, and C) but no haplogroup O. The frequency of O in NM is significantly lower than the frequency of O in NH, but the frequency of haplogroup N shows the inverse trend. Moreover, NM has a relatively high frequency of haplogroup R, but NH does not. Thus, the major paternal genetic component of NH was shaped in the Central Plain region of China prior to 3,000 years ago.
According to historical documents, most of the ancient populations that inhabited the northern region of China were nomads. With no permanent settlement, these populations often moved from place to place. In contrast, the ancestors of the Han Chinese were farming people, who often settled down in a region and seldom moved. Following increases in population size, the ancestors of the Han Chinese gradually expanded into the surrounding areas and conflicted with the ancient northern groups. Finally, most of the ancient northern groups gradually disappeared. Because of the large differences in lifestyle and culture between farmers and nomads, most of the ancient northern ethnic populations might have migrated to other areas when they were defeated, and their lands were gradually occupied by the Han Chinese. A similar population replacement model is also found in Europe, where the diffusion of agriculture and language from the Near East was concomitant with a large movement of farmers [13,55–58]. The Han Chinese have the largest population size relative to the populations they admixed with, suggesting a stable genetic structure in the northern Han Chinese for at least the past 3,000 years.
DISTRIBUTION OF MITOCHONDRIAL DNA HAPLOGROUPS
According to a previous study, the haplogroups of the Han Chinese can be classified into the northern East Asian-dominating haplogroups, including A, C, D, G, M8, M9, and Z, and the southern East Asian-dominating haplogroups, including B, F, M7, N*, and R. These haplogroups account for 52.7% and 33.85% of those in the NH, respectively. Among these haplogroups, D, B, F, and A were predominant in the NH, with frequencies of 25.77%, 11.54%, 11.54%, and 8.08%, respectively [11,23,24,28,51]. However, in the SH, the northern and southern East Asian-dominating haplogroups accounted for 35.62% and 51.91%, respectively. The frequencies of haplogroups D, B, F, and A reached 15.68%, 20.85%, 16.29%, and 5.63%, respectively. Notably, in the HB samples, haplogroups D, B, F, and A were also predominant and showed frequencies of 23.44%, 12.5%, 10.93%, and 10.93%, respectively. In addition, the frequency of haplogroup M* was high and reached 17.19%. Other haplogroups such as C, G, M7, M8, M9, Z, N9a and R had lower frequencies at 3.13%, 1.56%, 1.56%, 3.13%, 7.81%, 3.13%, 3.13% and 1.56%, respectively. The northern and southern East Asian-dominating haplogroups account for 50.04% and 26.56%, respectively, which is similar to the values in the NH (S2 Fig).
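The grouping described above can be tallied directly from the listed HB percentages. The sketch below is only an illustration: the dictionary and function names are hypothetical, the group memberships are taken from the lists in the paragraph, and a tally from rounded percentages need not reproduce the quoted group totals exactly, since the paper's assignment of individual haplogroups (for example N9a and M*) to each group is not spelled out here.

```python
# HB mtDNA haplogroup frequencies in percent, copied from the paragraph above.
HB_MTDNA = {
    "D": 23.44, "B": 12.5, "F": 10.93, "A": 10.93, "M*": 17.19,
    "C": 3.13, "G": 1.56, "M7": 1.56, "M8": 3.13, "M9": 7.81,
    "Z": 3.13, "N9a": 3.13, "R": 1.56,
}

# Group memberships as listed in the text.
NORTHERN = {"A", "C", "D", "G", "M8", "M9", "Z"}
SOUTHERN = {"B", "F", "M7", "N*", "R"}


def group_share(freqs, members):
    """Sum the percentages of the haplogroups that belong to one group."""
    return sum(pct for hg, pct in freqs.items() if hg in members)


northern = group_share(HB_MTDNA, NORTHERN)
southern = group_share(HB_MTDNA, SOUTHERN)
listed_total = sum(HB_MTDNA.values())

print(f"northern-dominating share: {northern:.2f}%")
print(f"southern-dominating share: {southern:.2f}%")
print(f"sum of all listed frequencies: {listed_total:.2f}%")
```

Haplogroups that fall into neither set (here M* and N9a) are left out of both shares, which is why the two group totals do not sum to 100%.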
PRINCIPAL COMPONENT ANALYSIS
To further identify the genetic affinities among the HB, two ancient populations and the present-day Chinese population, represented by 9 NH, 9 NM, 14 SH and 57 SM groups, the mtDNA haplogroup distributions were compared using a PCA. The PCA plot of the first and second components (31.81% of the total variance, Fig 2A) shows that the current populations largely segregate into three main clusters: NH (in orange); SH (in blue) and SM (in gray); and NM (in green). The distribution of populations in the PCA plot was in line with their geographic distribution, and these populations were separated by the first principal component. The populations living in northern China (NH and NM) are located on the right side of the PCA, and they contain the northern East Asian-dominating haplogroups A, C, D, G, M8, M9, and Z. In contrast, the populations living in southern China (SH and SM) are located on the left side of the PCA, and they contain the southern East Asian-dominating haplogroups B, F, M7, and R. Moreover, the NH can be separated from other populations, except for two SH (Hubei and Shanghai), using the second principal component. The HB population (PC1 value: 0.071; PC2 value: 1.453) groups closely with the NH (PC1 value: 0.239±0.269; PC2 value: 1.590±0.336). Overall, these results indicate that the HB population shares a similar genetic profile with the NH that is distinct from the NM and ancient northern ethnic groups.
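One way to read the statement that HB "groups closely" with the NH is to express the HB coordinates as a distance from the NH mean in units of the NH spread. The sketch below assumes the ± values quoted above are standard deviations, which the excerpt does not state explicitly; the function name is illustrative:

```python
def standardized_offset(value, mean, sd):
    """Distance of a value from a group mean, in units of the group's spread."""
    return (value - mean) / sd


# HB coordinates and NH mean ± spread, as quoted in the paragraph above.
pc1_offset = standardized_offset(0.071, mean=0.239, sd=0.269)
pc2_offset = standardized_offset(1.453, mean=1.590, sd=0.336)

print(f"PC1 offset of HB from NH mean: {pc1_offset:.2f}")
print(f"PC2 offset of HB from NH mean: {pc2_offset:.2f}")
```

Under that assumption, HB lies within one unit of spread of the NH mean on both components, consistent with the clustering described in the text.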
O2 and O2b-M176 (relabeled as O1b2-M176) patterns of expansion
Note: Emerging ancient DNA evidence shows a need to modify some of the earlier theories of southerly cradles for the O haplogroups, and to update the phylogenetic trees of haplogroup O that were drawn up before aDNA analysis, as seen in Zhao Y-B, Zhang Y, Zhang Q-C, Li H-J, Cui Y-Q, Xu Z, et al. (2015) Ancient DNA Reveals That the Genetic Structure of the Northern Han Chinese Was Shaped Prior to 3,000 Years Ago. PLoS ONE 10(5): e0125676. https://doi.org/10.1371/journal.pone.0125676
Banpo Village, Xi’an, Shaanxi Province, is considered by some Chinese to be the cradle of Chinese civilization.
See the updated phylogenetic tree of haplogroup O, especially the positions of O2a and of PK4, found in Pakistan. The older literature and studies may all have to be revised substantially.
Yali Xue, Tatiana Zerjal, Weidong Bao, Suling Zhu, Qunfang Shu, Jiujin Xu, Ruofu Du, Songbin Fu, Pu Li, Matthew E. Hurles, Huanming Yang and Chris Tyler-Smith (2006) Male Demography in East Asia: A North–South Contrast in Human Population Expansion Times. Genetics 172(4): 2431–2439. https://doi.org/10.1534/genetics.105.054270