Study of Taojiazhai remains reveals central Asian origins of O3 in the Di-Qiang populations

It is about time to review the standing of the 2005 paper by Hong Shi et al., “Y-Chromosome Evidence of Southern Origin of the East Asian–Specific Haplogroup O3-M122”, which has befuddled most theories on the origins of East Asians for a good 15 years, and internet discussions to this day. The idea that the cradle of Han culture lies in the south needs to be thrown out for good, as a large body of interdisciplinary evidence, including genetic data, now shows extremely ancient populations in more northerly parts. Clear evidence from ancient DNA and from the archaeological record is revealing more northerly prehistoric societies that entered and formed the Han Chinese as well as the surrounding populations of Northeast and East Asia (such as those of Korea and Japan). The origins of East Asians involve a much more complicated picture than the one modeled by the Hong Shi paper, which went on to be cited for a decade and beyond, and new models and theories need to be disseminated to all who are interested in Chinese origins.

This paper on Taojiazhai (excerpts follow), which shows mtDNA and Y-DNA that are predominantly northern in nature and concludes that Di-Qiang populations migrated south, is but one of many emerging studies that show the demic and cultural influences of Central Asia, “the Northern Zone”, Northeast Asia and coastal Southeast Asia merging to form what is now Han culture.

Location of Taojiazhai cemetery site in Qinghai

Here is the abstract of the 2011 paper, Ancient DNA Evidence Supports the Contribution of Di-Qiang People to the Han Chinese Gene Pool, American Journal of Physical Anthropology · February 20

Han Chinese is the largest ethnic group in the world. During its development, it gradually integrated with many neighboring populations. To uncover the origin of the Han Chinese, ancient DNA analysis was performed on the remains of 46 humans (1700 to 1900 years ago) excavated from the Taojiazhai site in Qinghai province, northwest of China, where the Di-Qiang populations had previously lived. In this study, eight mtDNA haplogroups (A, B, D, F, M*, M10, N9a, and Z) and one Y-chromosome haplogroup (O3) were identified. All analyses show that the Taojiazhai population presents close genetic affinity to Tibeto-Burman populations (descendants of Di-Qiang populations) and Han Chinese, suggesting that the Di-Qiang populations may have contributed to the Han Chinese genetic pool.

The paper’s analysis of the Taojiazhai cemetery remains says the following about the Di-Qiang contribution to the formation of the Han people:

A great deal of research using ancient DNA study of human remains has been reported (Hofreiter et al., 2001) since the genetic examination of an Egyptian mummy over 20 years ago (Paabo, 1985). Among many markers, the mitochondrial DNA (mtDNA) has been widely adopted. More recently, the Y chromosome nonrecombining region has also been used in ancient DNA research (Hummel and Herrmann, 1991; Kuch et al., 2007). The two markers have a great advantage in tracing the origins of populations in maternal lineages and paternal lineages, respectively, because of their monophyletic inheritance and absence of recombination (Giles et al., 1980; Jobling and Tyler-Smith, 1995).
The Hehuang area, the upper part of the Yellow River, was the cradle of many Chinese ethnic groups, according to historical documents. Ancient people migrated from Southeast Asia to this region and formed the Di-Qiang populations about 10,000–40,000 years ago (Su et al., 2000). During two periods, respectively, 4,000–5,000 and 2,000–2,500 years ago, the Di-Qiang people embarked on large-scale southward migrations into the southwest of China, where they mixed with southern natives, including those speaking Daic, Hmong-Mien, and Austro-Asiatic. They developed into such Tibeto-Burman populations as the Tibetan, Qiang, Yi, Pumi, Tujia, and so on (Yang and Ding, 2003). In addition, a branch of the Di-Qiang population migrated eastward to the central plain area, the middle and lower Yellow River Valley, and these integrated gradually with the natives around 5000–6000 years ago. During the Han Dynasty (206 B.C. to 220 A.D.), they developed into a large population known as Han Chinese. With the expansion, especially southward, of Han Chinese, this group became much the largest of the 56 officially recognized ethnic populations in China (Tian, 2001; Xu, 2003).
Genetic studies based on modern people have hitherto been reported to clarify the origin and development of the Han Chinese. The Han Chinese were divided into two different groups, northern Han and southern Han, through analysis of the classic markers (Zhao and Lee, 1989) and STR markers (Chu et al., 1998). The differentiation between southern and northern Han was also observed at mtDNA marker (Yao et al., 2002a; Wen et al., 2004a). According to Y-chromosome SNP analysis, however, both southern and northern Han people shared the same paternal lineages (Wen et al., 2004a). Although the genetic polymorphism of modern Han Chinese is now well understood, little is known about the genetics of the ancient Han Chinese.
The Taojiazhai site is located in the Hehuang area (Fig. 1). Archaeological studies show that it was occupied from the Han to the Jin Dynasty (1700–1900 years ago), and many excavated funerary objects (pottery vessels, ironware, copperware, gold and silver ornaments, and agate ornaments) belonged to the Han culture of that time. In addition, the special burial pattern, in which a male and a female individual were buried in one coffin, showed that the ancient Taojiazhai population had a native burial custom (Zhang, 2008).
In this study, we investigated both the maternal and paternal lineages of the human remains excavated from the Taojiazhai site. Through the analysis of mtDNA and Y-chromosome SNP, we have attempted to explore the origin of the Han Chinese.

Results of the study:

… Taojiazhai pooled into the northern Han Chinese and was also close to a few Tibeto-Burman populations. … The people who lived in northern China such as the northern Han people …, northern minorities … and ancient Taojiazhai people … clustered together. In contrast, those who lived in southern China such as southern Han people …, Tibeto-Burman …, Hmong-Mien …, and Daic populations … were near each other. In the plotting of PCA (Fig. 3), the northern minority populations, northern Han Chinese populations, and southern Han Chinese populations clustered, respectively. Hmong-Mien populations and Daic populations gathered together, whereas Tibeto-Burman populations scattered in both PC1 and PC2, and mixed with southern Han Chinese populations. Taojiazhai pooled into the northern Han Chinese and was also close to a few Tibeto-Burman populations.

AMOVA was used to evaluate maternal genetic differentiation between Taojiazhai and other populations (Table 6). The Fst between Taojiazhai people and Tibeto-Burman population was the lowest (-0.00095), indicating that Taojiazhai people bore a very high similarity to Tibeto-Burman population. In addition, Taojiazhai and Han Chinese (including northern Han and southern Han) were not significantly different (Fst value: 0.00359 and 0.00406, P > 0.05). In contrast, the Fst values of Taojiazhai/northern minorities, Taojiazhai/Hmong-Mien, and Taojiazhai/Daic (0.01085, 0.01210, and 0.03217) were much higher than that of Taojiazhai/Tibeto-Burman, Taojiazhai/northern Han, and Taojiazhai/southern Han.
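For readers wondering how an Fst can come out negative, as the Taojiazhai/Tibeto-Burman value does, here is a minimal sketch of a pairwise AMOVA-style fixation index for two haploid samples. It is not the authors' pipeline: the excerpt does not say which software or distance measure they used, so the function name `pairwise_phi_st`, the simple same-or-different haplogroup distance, and the sample counts in the usage note are all assumptions made for illustration.

```python
from collections import Counter


def pairwise_phi_st(counts_a, counts_b):
    """AMOVA-style fixation index between two haploid samples.

    counts_a, counts_b: dicts mapping haplogroup label -> number of individuals.
    Assumption: the distance between two individuals is 0 if they carry the
    same haplogroup and 1 otherwise (sequence-based distances are also common).
    Each sample must be non-empty and the two together must hold at least
    three individuals.
    """
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    n_total = n_a + n_b
    pooled = Counter(counts_a) + Counter(counts_b)

    def ssd(counts, n):
        # Sum of squared deviations: (1 / 2n) * total distance over ordered pairs.
        return (n * n - sum(c * c for c in counts.values())) / (2 * n)

    ssd_within = ssd(counts_a, n_a) + ssd(counts_b, n_b)
    ssd_among = ssd(pooled, n_total) - ssd_within

    # Mean squares: 1 degree of freedom among two groups, N - 2 within them.
    msd_among = ssd_among / 1
    msd_within = ssd_within / (n_total - 2)

    # Effective per-group sample size used to scale the among-group variance.
    n0 = n_total - (n_a ** 2 + n_b ** 2) / n_total

    var_among = (msd_among - msd_within) / n0
    var_within = msd_within
    total = var_among + var_within
    # Guard for the degenerate case where everyone carries the same haplogroup.
    return var_among / total if total != 0 else 0.0


# Hypothetical counts, NOT taken from the paper.
sample_1 = {"D": 5, "B": 5}
sample_2 = {"D": 5, "B": 5}
print(pairwise_phi_st(sample_1, sample_2))
```

With two samples of identical composition the estimate falls slightly below zero, because the sample-size correction subtracts the within-group mean square from an among-group mean square that is already zero. A small negative value is therefore usually read as "no detectable differentiation", which is how the paper treats the Taojiazhai/Tibeto-Burman comparison. Two samples with no haplogroups in common give a value of 1.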
Based on the mtDNA HVR I motifs, combined with the East Asian mtDNA classification tree (Kivisild et al., 2002; Yao et al., 2002a; Yao et al., 2002b; Kong et al., 2003; Yao et al., 2003), some mitochondrial haplotypes were further attributed to the subhaplogroups such as D4, D5, B4, B5, and F1b. The mtDNA haplogroups (A, B5, D4, D5, F1b, N9a, and Z) were analyzed in network (Fig. 4). B4 and M were not analyzed because the network profiles of them were not star-like but complex. …

The network for M10 was not constructed due to data scarcity. In the Maximum Parsimony trees, every haplogroup had one central node at least. In northern predominant haplogroups A, D4, D5 and Z, most of the Taojiazhai people shared the central haplotypes and fewer shared peripheral haplotypes. In contrast, Taojiazhai people were around the biggest central circle in the southern main haplogroups B5, F1b, and N9a. Moreover, regardless of whether they were in northern predominant haplogroups or southern predominant haplogroups, most of Taojiazhai samples were always nearby the Tibeto-Burman populations and Han Chinese. Some Daic and Hmong-Mien individuals also shared same nodes with Taojiazhai people but they accounted for only a small part of overall Daic and Hmong-Mien populations used in network analysis. Although two Taojiazhai individuals were close to Daic populations in haplogroup B5, they formed one cluster with some Tibeto-Burman populations, southern Han, and northern Han. It suggests that the ancient Taojiazhai people were close to the Tibeto-Burman populations and modern Han populations in the maternal lineages.

In a previous study, we analyzed an ancient Di-Qiang people (3800 years ago) from the Lajia site, which is approximately 100 kilometers distant from Taojiazhai (Gao et al., 2007). Comparing these two ancient populations, we found that the ancient Taojiazhai people contained four of the five haplogroups (B, D, M10, and M*) of Lajia; Haplogroup D was predominant in Taojiazhai as well as in Lajia; haplotypes 16223-16362 and 16223-16311 existed both in Taojiazhai and Lajia. There should be like similarities between the ancient Taojiazhai people and the ancient Lajia Di-Qiang populations. However, we cannot test this thesis through statistical analysis because of the small numbers and the kinship of the ancient Lajia people. Therefore, samples from modern Tibeto-Burman populations were retrieved to further clarify the relationship between the ancient Taojiazhai and Di-Qiang populations. According to historical records and genetic evidence (Du and Yip, 1993; Yao et al., 2002b), the Tibeto-Burman populations were formed by two parental groups: southward-immigrated Di-Qiang populations and native southerners (Hmong-Mien and Daic populations). Based on the mtDNA marker, some Tibeto-Burman populations, such as Tibetan, Tujia, and Hani, showed a higher proportion of contribution from Di-Qiang populations than from southern natives (Yao et al., 2002b; Wen et al., 2004b). In the plotting of the PCA, some Tibeto-Burman populations were near the northern natives although most of the Tibeto-Burman populations live in southern China. Two Tibeto-Burman populations, Tibetan and Hani, were the closest to the ancient Taojiazhai people, and four Tibeto-Burman populations (Tujia, Pumi, Yi, and Aini) were also closer to the Taojiazhai, whereas the southern natives, such as Hmong-Mien and Daic, were distant genetically from the Taojiazhai people. The results of AMOVA further indicated that the Taojiazhai people bore a high resemblance to the Tibeto-Burman populations in their maternal lineages (Fst = -0.00095, P > 0.05). In contrast, the Taojiazhai people and the Daic populations were significantly different in their mtDNA lineages (Fst = 0.03217, P < 0.05), and Fst value between the Taojiazhai people and the Hmong-Mien populations was also high (Fst = 0.01210, P = 0.05670). This means that those Tibeto-Burman populations who had high contribution of the Di-Qiang populations were close to the Taojiazhai people. Moreover, in the network analysis, there were some Tibeto-Burman populations who shared some nodes with the ancient Taojiazhai people. The close genetic affinity among them presented evidence that the Di-Qiang populations played a significant role in shaping the gene pool of the ancient Taojiazhai people.
However, this study showed that the ancient Taojiazhai people bore a high genetic resemblance to the northern Han Chinese in maternal lineages. First, the distribution frequency of Taojiazhai haplogroups was similar to that of the northern Han Chinese, and the Taojiazhai pooled into the northern Han group in PCA; second, the Taojiazhai people and the northern Hans were not significantly different (Fst = 0.00359, P > 0.05); third, a close genetic relationship between the Taojiazhai people and the Han Chinese was reflected in the network.
Although a few southern Han populations were close to the ancient Taojiazhai people in PCA, and Fst value of Taojiazhai/southern Han were low (Fst = 0.00406, P > 0.05), it should be pointed out that we did not find a very close relationship between the ancient Taojiazhai population and the most of the southern Han Chinese in maternal lineages in PCA. Even as the origin of Tibeto-Burman populations, the formation of southern Han Chinese was the result of a mixture between the northern Han Chinese immigrating southward and the southern natives. Consequently, a genetic similarity between the southern Han and the southern natives was evident in the maternal lineage (Wen et al., 2004a). Plotting of the PCA was quite consistent with this result. Most of the southern Han clustered into the southern native group, which were far from the northern Han and Taojiazhai, whereas the northern Han and Taojiazhai formed a tight cluster.

Almost all Han populations, however, have a high resemblance in paternal lineages because of the presence of a sex-biased admixture pattern among the southern Han Chinese (Wen et al., 2004a). The haplogroup O3 was the dominant haplogroup not only in the northern Han Chinese (the frequency of O3 was 54.8%) but also in the southern Han Chinese (the frequency of O3 was 56.4%) (Wen et al., 2004a). In addition, the Tibeto-Burman populations presented more male lineages from the Di-Qiang populations, because of the sex-biased admixture, and the high frequency of O3 (an average of 40.7%) was found in the Tibeto-Burman populations (Wen et al., 2004b). In contrast, the dominant haplogroups in the Daic populations were O1a and O2a* (Li, 2005). O3* and O2a* were frequent in the Hmong-Mien populations (Feng, 2007) (Table 5). In this study, all 12 male individuals were typed as haplogroup O3, suggesting that the Di-Qiang populations, ancient Taojiazhai people, and Han Chinese shared a genetic similarity in the male lineages.
According to historical documents, several periods of intermixing are known between the Di-Qiang populations and the Han Chinese (or their ancestors). There have been at least three explicit records of migration:

(1) a branch of the Di-Qiang population migrated eastward to the central plain area around 5000–6000 years ago;

(2) during the Western Han Dynasty (202 B.C. to 25 A.D.), many people from the central plain area expanded westward into Hehuang, and these greatly influenced the Di-Qiang populations;

(3) western groups living in the Hehuang area expanded into the central plain area and admixed with the Han in the Southern and Northern Dynasties (420 A.D. to 589 A.D.) (Du and Yip, 1993).

The Ancient Taojiazhai people, coincidently residing in the Hehuang area where the Di-Qiang populations had previously lived, shared a close genetic relationship with the Tibeto-Burman populations who have been identified as the descendants of Di-Qiang populations by genetic studies, indicating that Taojiazhai people might be descended from the Di-Qiang populations. In addition, the ancient Taojiazhai people also bore a strong resemblance to the Han Chinese who is the majority of inhabitants in Hehuang area now. The discussion above illuminates the contributions of the Di-Qiang populations to the gene pool of Han Chinese.

The results of genetic analysis, which the ancient Taojiazhai people bore a very high similarity to those Tibeto-Burman populations who had high contribution of the Di-Qiang populations, together with the geographic location of Taojiazhai site, suggested that the ancient Taojiazhai people was the descendant of the Di-Qiang populations.
Moreover, genetic and archaeological data of the ancient Taojiazhai people showed that they were close to the Han Chinese. These evidences are consistent with the history of the ethnic groups analyzed in this study. Therefore, we conclude that the ancient Di-Qiang populations may be one of the genetic contributors to the Han Chinese people.

Given its location in Qinghai, it may also be useful to consider Taojiazhai in relation to the Qinghai route and its geographic position, with Mongolia to the north, northern and southern China to the east, Tibet to the south, and Central Asia to the west. The Qinghai route, less well known than the Northern route and the Hexi corridor, has yet to be recognized as an obvious trade route, although it stands at a practical intersection of five routes. Research is now beginning to reveal the Northern Zone and these routes as important for how metallurgy, domesticated animals (such as cattle and sheep) and grain arrived from the West or from the Mongolian steppe.


Further readings:

Jiawei Li, 2017, Ancient DNA reveals genetic connections between early Di-Qiang and Han Chinese, BMC Evolutionary Biology, volume 17, Article number: 239 (2017)

Mogou mtDNA haplogroups were highly diverse, comprising 14 haplogroups: A, B, C, D (D*, D4, D5), F, G, M7, M8, M10, M13, M25, N*, N9a, and Z. In contrast, Mogou males were all Y-DNA haplogroup O3a2/P201; specifically one male was further assigned to O3a2c1a/M117 using targeted unique regions on the non-recombining region of the Y-chromosome. We compared Mogou to 7 other ancient and 38 modern Chinese groups, in a total of 1793 individuals, and found that Mogou shared close genetic distances with Taojiazhai (a more recent Di-Qiang population), Hengbei, and Northern Han. We modeled their interactions using Approximate Bayesian Computation, and support was given to a potential admixture of ~13-18% between the Mogou and Northern Han around 3300–3800 years ago.

Ye Zhang et al., 2016, Genetic diversity of two Neolithic populations provides evidence of farming expansions in North China, Journal of Human Genetics, volume 62, pages 199–204 (2017) – Individuals from the Jiangjialiang belonged to two Y haplogroups, N1 (not N1a or N1c) and N1c. The individuals from the Sanguan are Y haplogroup O3. The West Liao River Valley and the Yellow River Valley are recognized Neolithic farming centers in North China. The population dynamics between these two centers have significantly contributed to the present-day genetic patterns and the agricultural advances of North China, and the study concluded that populations from the West Liao River Valley spread south at about 3000 BC, and a second northward expansion from the Yellow River Valley occurred later (3000–1500 BC).

Yong Bin Zhao et al., 2014, Ancient DNA evidence reveals that the Y chromosome haplogroup Q1a1 admixed into the Han Chinese 3,000 years ago, Am. J. Hum. Biol. 26:813–821, 2014. Wiley Periodicals, Inc.

The origins of the observed haplotypes and their distribution in present day Han Chinese and in the samples suggest that haplogroup Q1a1 was probably introduced into the Han Chinese population approximately 3,000 years ago. Samples from 27 individuals were assigned to haplogroups N, O*, O2a, O3a, and Q1a1