New analysis suggests that Himalayans share a northeast Asian common ancestor and origins with the ancient Jomon

The Haplogroup D-M174 Y-chromosomes that are found among populations of the Japanese Archipelago (haplogroup D-M55 a.k.a. haplogroup D1b) are particularly distinctive, bearing a complex of at least five individual mutations along an internal branch of the Haplogroup D-M174 phylogeny, thus distinguishing them clearly from the other Haplogroup D-M174 chromosomes that are found among the Tibetans and Andaman Islanders and providing evidence that Y-chromosome Haplogroup D-M55 was the modal haplogroup in the ancestral population that developed the prehistoric Jōmon culture in the Japanese islands.

Previous literature suggested that the majority of D-M174 Y-chromosome carriers migrated from Central Asia to East Asia … that one group migrated into the Andaman Islands and mixed with the native Negrito population, thus forming today's Andamanese people (probably a male-only migration). Another group stayed in modern Tibet and southern China (today's Tibeto-Burman peoples), and another group migrated to Japan, possibly via the Korean Peninsula (the Jōmon people).

Bing Su et al., in “Y chromosome haplotypes reveal prehistorical migrations to the Himalayas”, however, suggested that:

“Our results showed that a T to C mutation at locus M122 is highly prevalent in almost all of the Sino-Tibetan populations, implying a strong genetic affinity among populations in the same language family. Furthermore, the extremely high frequency of H8, a haplotype derived from M122C, in the Sino-Tibetan speaking populations in the Himalayas including Tibet and northeast India indicated a strong bottleneck effect that occurred during a westward and then southward migration of the founding population of Tibeto-Burmans. We, therefore, postulate that the ancient people, who lived in the upper-middle Yellow River basin about 10,000 years ago and developed one of the earliest Neolithic cultures in East Asia, were the ancestors of modern Sino-Tibetan populations.”

According to Mitsuru Sakitani, Haplogroup D1 arrived in northern Kyushu from Central Asia via the Altai Mountains and the Korean Peninsula more than 40,000 years before present, and Haplogroup D-M55 (D1b) arose in the Japanese archipelago. Recently it was confirmed that the Japanese branch, haplogroup D-M55, has been distinct and isolated from other D-branches for more than 53,000 years. The split from D1a was thought to have happened in Central Asia, while others suggest an immediate split at the origin of haplogroup D itself, as the Japanese branch has five unique mutations not found in any other D-branch. Source: Haplogroup D-M174, Wikipedia

The 2012 Wang et al. paper also unexpectedly found an East Asian/East Eurasian ancestry source for the peopling of Nepal, as opposed to an exclusively northeast Indian one:

“To trace the origin of the Nepalese maternal genetic components, especially those of East Eurasian ancestry, and then to better understand the role of the Himalayas in peopling Nepal, we have studied the maternal genetic composition extensively, especially the East Eurasian lineages, in Nepalese and its surrounding populations. Our results revealed the closer affinity between the Nepalese and the Tibetans, specifically, the Nepalese lineages of the East Eurasian ancestry generally are phylogenetically closer with the ones from Tibet, albeit a few mitochondrial DNA haplotypes, likely resulted from recent gene flow, were shared between the Nepalese and northeast Indians. It seems that Tibet was most likely to be the homeland for most of the East Eurasian in the Nepalese. Taking into account the previous observation on Y chromosome, now it is convincing that bearer of the East Eurasian genetic components had entered Nepal across the Himalayas around 6 kilo years ago (kya), a scenario in good agreement with the previous results from linguistics and archeology.”

The Arciero et al. 2018 paper, “Demographic History and Genetic Adaptation in the Himalayan Region Inferred from Genome-Wide SNP Genotypes of 49 Populations”, finds recent gene flow from the north into the Himalayan region and that Himalayan populations cluster mostly with East Asians as well as South Asians.

A paper by Mondal et al., on the ancestry of Indian populations concluded:

The Jarawa and Onge shared haplogroup D lineages with each other within the last ~7000 years, but had diverged from Japanese haplogroup D Y-chromosomes ~53000 years ago, most likely by a split from a shared ancestral population.

Source: Mondal, Mayukh; et al. (2017). “Y-chromosomal sequences of diverse Indian populations and the ancestry of the Andamanese”. Human Genetics. 136 (5): 499–510.

Furthermore, a number of new studies are discounting Himalayan populations bearing hg D and the YAP+ Alu polymorphism (Tibetans, Sherpa, or Bhutanese) as ancestral source lineages, finding that these lineages more likely emanated from northeast Asian (or East Asian) ancestral sources. A recent find of DE-YAP* from Beshbetir (Mongolia) could support this view.

A 2012 thesis by Tenzin Gayden on the Himalayan and Tibetan populations suggests that the populations carrying the D-haplogroup and the YAP+ Alu polymorphism were Neolithic migrants who arrived at their current Himalayan positions from Northeast Asia. In that study, 17 Y-STR loci were typed in the three Nepalese populations of Tamang, Newar and Kathmandu as well as a general collection from Tibet to investigate their genetic ancestry and phylogenetic relationships to previously published geographically targeted groups from Northeast Asia, Southeast Asia, South Central Asia and Central Asia using 9-loci minimal haplotypes (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393 and DYS385a/b). The study concluded the following:

“Tamang and Tibet exhibit minimal percentages of Y-haplogroup R (8.8 and 2.5%, respectively), indicating that the Himalayas served as a formidable orographic barrier to gene flow from the south [1]. Findings from the current investigation lend support to the aforementioned statement as admixture results reveal a null contribution from South Asia to both Tamang and Bhutan and only a minor genetic impact onto the Tibetan collection (2.7%). The absence of the South Asian signature in the gene pools of Tamang and Bhutan may be the result of geographic isolation and/or founder effects from another source population(s).
Close genetic ties have been reported between the Tamang and Tibet [1]. It is likely that Tamangs are descendants of Tibetans who migrated south and settled in the southern region of the Himalayan range [1]. The genetic affinity between Tamang and Tibet is also reflected in both CA plots (Figs. 2 and 3) and NJ dendrograms (Figs. 4 and 5). In addition, the Tibetan connection to the Tamang is evident in their shared cultural and religious practices. The partitioning of these two populations with Bhutan and their proximity to the general collection from Nepal (Figs. 2, 3, 4 and 5) may be associated with Neolithic migrants carrying Y-haplogroup O3a5-M134, an East Asian specific marker, shared among TB populations [1, 3, 4, 9, 49]. …

The Himalayan populations, with the exception of Newar and Kathmandu, segregate close to the Northeast Asian cluster in agreement with the admixture analyses results (Table 3). Northeast Asia is the major contributor to both Tibet (63.4%) and Tamang (59.7%) while Newar (44.7%) and Bhutan (41.1%) received equivalent percentages, followed by Kathmandu (22.3%). These results corroborate studies indicating a shared common ancestry between Tibet and the Northeast Asian collections of Japan and Korea by a variety of marker systems, including classical [50, 51], autosomal [52], Y-chromosome … and mtDNA studies …

More than half of the Tibetan males possess the YAP polymorphic Alu insertion in their Y-chromosome, which is believed to have originated in Central Asia [1, 4, 11, 14], although its source remains highly debated [53, 57, 58]. In the present study, however, given the lack of representative Central Asian populations due to the paucity of the data available from the region, no clear connections were made between Tibet and its possible Central Asian genetic contributors. Afghanistan is the sole Central Asian collection included in the analyses and appears to make no contributions to any of the Himalayan groups except for a minor influence in Kathmandu (12.9%).
In order to evaluate the genetic relationships between the Himalayan collections and the neighboring TB speaking populations at the regional level, six Northeast Indian TB groups were included in the phylogenetic and statistical analyses performed using the 13 core CODIS STR loci. These Northeast Indian TB groups map distantly from both the Himalayan and East Asian populations in the CA graph (Fig. 3), inconsistent with previous Y-chromosome and mtDNA studies which report a high degree of genetic homogeneity between Himalayan and Northeast Indian TB groups [3, 4, 9, 59]. The discrepancy observed between Y-chromosome and microsatellite polymorphisms in the Northeast Indian TB groups may be explained by a male founder effect from Northeast Asia and their subsequent genetic isolation for an extended period of time following their arrival [9]. Alternatively, the practice of patrilocal residence among these groups may have introduced the maternal genes from the local populations leading to the microsatellite diversity patterns in these Northeast Indian TB collections that are phylogenetically different from their Y-chromosomal profiles [9].
Altogether, our results suggest a Northeast Asian ancestry for the Himalayan populations with subsequent genetic admixture in Kathmandu and Newar populations from South Asia. However, South Asian influences in Tibet and Tamang are negligible, most likely the result of the natural barrier presented by the Himalayas [1]. Tamang, Tibet and Bhutan display close genetic affiliations in all analyses possibly indicating a shared common ancestry. The biparental markers examined in the present study reveal unique genetic profiles for the Northeast Indian TB groups which are distinct from their Himalayan counterparts implying limited gene flow, geographic isolation and/or founder effects. The phylogeny also supports the genetic divergence between Northeast and Southeast Asian collections with a possible southern origin for East Asian populations.”

“Our previous studies of Y-chromosomal biallelic [16] and autosomal STR polymorphisms [17] of Tibet and Nepal have revealed that these Himalayan groups arrived in the area during the Neolithic time from the Northeast Asia with subsequent gene flow from the Indian subcontinent into the Kathmandu valley and Newar population. The latter conclusion is congruent with recent mtDNA studies [18, 19], which reported shared maternal lineage between Indian and Nepalese populations. In contrast, Tibet and the Tamang population display limited influence from the Indian subcontinent suggesting that the Himalayan massif has acted as a barrier for gene flow from the south into the Tibetan plateau [16, 17]. In addition to the Northeast Asian influence, the high frequency of the Asian-specific Alu insertion at the YAP (Y Alu polymorphism) locus in the Tibetan Y-chromosomes had previously led some researchers to argue for Central Asian contribution in the Tibetan gene pool [20-22] …

Phylogenetic relationships among the four Himalayan collections and other geographically targeted populations were assessed using CA (Figure 1) and NJ (Figure 2) analyses. Figure 1B represents the contribution of each of 94 alleles of 9 Y-STR loci to the partition of the populations. The Himalayans cluster loosely in the upper right quadrant of the plot (Figure 1A) and share the same clade in the tree (Figure 2), with the exception of Kathmandu which maps closer to the South Central Asian group in both the analyses [16, 17]. Newar also seems to display some affinity to the South Central Asian assemblage along the X axis, whereas Tamang and Haryana are outliers from their respective groups (Figure 1A). The genetic similarities observed within the Himalayas based on their Y-STR loci (Figures 1A and 2) are reflected in the high frequencies of Y-haplogroup O3a3c-M134, common among Tibeto-Burman speakers [16, 21, 52]. The above inference is supported by the fact that the most frequent minimal haplotype (14-12-28-23-10-14-12-13-19), as well as its one-step mutation neighbor at DYS385 (DYS385a/b = 13, 18 allele), both belong to haplogroup O3a3c-M134 [16]. On the other hand, Kathmandu and Newar’s affinity to the South Central Asian cluster in the CA and NJ tree (Figures 1A and 2, respectively) may be due to the presence of Indian Y-lineages (Haplogroups R, H and C5) in their gene pools [16].
The East Asian collections map toward the middle of the lower half of the graph while the Mongolians and Buryats segregate to the left of the chart (Figure 1A). There is no clear genetic partitioning between the northern and southern East Asian populations in both the CA and NJ tree (Figures 1A and 2) [2]. Overall, the NJ dendrogram mirrors the distributions of populations in the CA with the exceptions of Mongolia and Buryat which form a sister clade with the South Central Asian branch, and the general population of Nepal showing more affinities with Tamang than with Bhutan (Figure 2).
The lack of phylogenetic affinities exhibited by Tamang in relation to Tibet (Figures 1A and 2) are of interest given its proposed close genetic association with the latter in previous studies [16, 17]. Although both Tamang and Tibet share high frequencies of haplogroup O3a3c-M134 [16], their Y-STR profiles differ considerably. In order to gain further insight into the recent demographic history of these two groups, a median-joining network based solely on Y-haplogroup O3a3c-M134 was constructed at the level of the 15 Y-STR loci utilizing our four Himalayan populations (Figure 3). It is notable that Tamang and Newar form distinct clusters because of their shared or closely related haplotypes, while Tibet and Kathmandu are highly divergent (Figure 3). This finding suggests either a male founder effect in Tamang, possibly from Tibet, or a recent bottleneck event as they settled south of the Himalayas from Tibet, leading to their highly reduced Y-SNP [18] and Y-STR diversity (Table 2). On the other hand, Newar’s unique genetic profile may be due to isolation and/or drift [17].

In summary, our results confirmed previous Y-chromosomal and autosomal STR reports that Newar and Kathmandu experienced substantial gene flow from the India whereas Tamang and Tibet display no genetic influences from the subcontinent. A median-joining network of haplogroup O3a3c-M134 based on 15 Y-STR loci suggests recent bottleneck and/or founder effect in Tamang. A high value of combined haplotype diversity (0.9970) from the four Himalayan populations is indicative of genetic heterogeneity within the region. In addition, very high percentages (99.82-100%) of the maximum probability (dbmax) of finding two different Y-haplotypes when sampling a pair of individuals between two different Himalayan populations underscores the genetic singularity of the four Himalayan collections reported. The uniqueness of our four Himalayan populations argues for independent databases for forensic analysis and paternity testing….

Archaeological records indicate late Paleolithic inhabitation of the Tibetan plateau [3], while Y-chromosomal data … suggest that the peopling of the highland occurred during the Neolithic period. Recent articles on mtDNA genome diversity in Tibetan populations … revealed evidence of successful late Paleolithic settlement on the plateau, thereby bridging the gap between the findings from genetic and archaeological studies …

The phylogenetic relationships between the three Tibetan collections and 23 geographically targeted reference populations from the Himalayas, Southeast Asia, South Central Asia, Central Asia and Northeast Asia were assessed via CA (Fig. 1) and NJ (Fig. 2) analyses on the basis of their allele frequencies at 11 Y-STR loci. The three Tibetan provinces loosely cluster within the upper right quadrant of the CA plot, together with a Tibetan population from Qinghai and a collection from Lhasa. These five populations also share the same clade on the NJ tree, with Amdo and Kham sharing the terminal node of the branch and U-Tsang diverging earlier, supporting the results of a recent study using autosomal STRs [45]. While Bhutan, Newar and Nepal form a sister clade with the Tibetan populations in the NJ tree, the CA plot indicates that these Himalayan populations possess stronger genetic ties with the East Asian populations. Moreover, Kathmandu shows genetic affinity towards the South Asian populations, as reported previously… Overall, the phylogenetic relationships established in the NJ dendrogram reflect the distribution of the populations in the CA plot.”

Source: Genetic Diversity in the Himalayan Populations of Nepal and Tibet
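
The excerpts above mention a "one-step mutation neighbor" of the most frequent minimal haplotype and a combined haplotype diversity of 0.9970. For readers unfamiliar with these measures, here is a minimal Python sketch of how they are commonly computed. It is not the thesis author's code: the function names are hypothetical, the locus order is assumed to follow the order in which the loci are listed above, the third haplotype in the toy sample is invented, and the diversity formula is Nei's standard unbiased estimator, which the quoted text does not explicitly name as its method.

```python
from collections import Counter

# Assumed locus order (follows the order in which the loci are listed in the text).
LOCI = ["DYS19", "DYS389I", "DYS389II", "DYS390", "DYS391",
        "DYS392", "DYS393", "DYS385a", "DYS385b"]


def parse_haplotype(text):
    """Turn a string such as '14-12-28-...' into a tuple of repeat counts."""
    alleles = tuple(int(part) for part in text.split("-"))
    if len(alleles) != len(LOCI):
        raise ValueError(f"expected {len(LOCI)} alleles, got {len(alleles)}")
    return alleles


def step_distance(a, b):
    """Total number of single-repeat steps separating two haplotypes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def haplotype_diversity(haplotypes):
    """Nei's unbiased estimator: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# The two haplotypes named in the quoted passage.
modal = parse_haplotype("14-12-28-23-10-14-12-13-19")
neighbor = parse_haplotype("14-12-28-23-10-14-12-13-18")
one_step = step_distance(modal, neighbor)

# Toy sample only: the last haplotype is made up for illustration.
toy_sample = [modal, modal, neighbor,
              parse_haplotype("15-13-29-24-10-13-12-11-14")]
toy_diversity = haplotype_diversity(toy_sample)
```

A diversity close to 1, such as the reported 0.9970, means that two randomly drawn men almost never carry the same haplotype, whereas a founder effect like the one proposed for Tamang pulls the value down because a few haplotypes dominate the sample.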

Refer also to Gayden, Tenzin et al., The Himalayas: barrier and conduit for gene flow. Am J Phys Anthropol. 2013 Jun;151(2):169-82. doi: 10.1002/ajpa.22240. Epub 2013 Apr 12.

“Although previous Y-chromosome studies indicate that the Himalayas served as a natural barrier for gene flow from the south to the Tibetan plateau, this region is believed to have played an important role as a corridor for human migrations between East and West Eurasia along the ancient Silk Road.” The analysis of mitochondrial DNA variation in 344 samples from three Nepalese collections (Newar, Kathmandu and Tamang) and a general population of Tibet “revealed a predominantly East Asian-specific component in Tibet and Tamang, whereas Newar and Kathmandu are both characterized by a combination of East and South Central Asian lineages. Newar and Kathmandu harbor several deep-rooted Indian lineages, including M2, R5, and U2, whose coalescent times from this study (U2, >40 kya) and previous reports (M2 and R5, >50 kya) suggest that Nepal was inhabited during the initial peopling of South Central Asia.”

The study confirmed “that while the Himalayas acted as a geographic barrier for human movement from the Indian subcontinent to the Tibetan highland, it also served as a conduit for gene flow between Central and East Asia.”

Lastly, a Tibeto-Burman-related (i.e., Southeast Asian/East Asian) origin is suggested as the source of ancestry for the Sherpa populations of Tibet.

Kang L, et al., Northward genetic penetration across the Himalayas viewed from Sherpa people. Mitochondrial DNA. 2014 Mar 11.

“…genetic components from Indian Subcontinent have been observed in Sherpa people living in Tibet. … Those lineages with South Asian origin indicate that the Himalayas have been permeable to bidirectional gene flow.”

“…In this paper, we mainly focus on the origin and migration pattern of Sherpa people. According to the historical literature, Sherpa migrated from the Kham region in eastern Tibet and western Sichuan to the southern foot of the Himalayas (Oppitz, 1968). However, some folktales suggest the Sherpa as descendants of the Tangut Kingdom (1038 to 1227 AD) who fled their homeland in Muyag district to escape Mongol invasion (Gong-Bo, 2011). Here, we use informative Y chromosome and mtDNA markers to give a clue about the northward gene flow across the Himalayas and shed light on the origin of the Sherpa.”

“According to the nomenclature of Y Chromosome Consortium (YCC) (Karafet et al., 2008; Yan et al., 2011), nine SNP haplogroups were determined from the 84 male individual samples (Figure 1a and Table S1). Haplogroup D1-M15, which is supposed to be the Paleolithic genetic legacy with a wide distribution among most Tibeto-Burman, Tai-Kadai, and Hmong-Mien populations (Shi et al., 2008), is also prevalent in Sherpa (11.90%). Haplogroup D3-P99 and its sublineage D3a-P47 are almost exclusively distributed in Tibeto-Burman populations (Shi et al., 2008), and also found highly frequent in Sherpa (7.14% and 15.48%, respectively). Haplogroup O3a2c1a-M117, one of the three main sublineages of O3, accounts for about 16% of Han Chinese and also exhibits high frequencies in Tibeto-Burman populations (Wang & Li, 2013; Yan et al., 2011). In this study, O3a2c1a-M117 comprises nearly half of Sherpa people (45.24%). The frequencies of another two main components of Sino-Tibetan populations, O3a2c1*-M134 and O3a1c-002611 (Wang et al., 2013; Yan et al., 2011), are negligible in Sherpa (1.19% and 0, respectively). …

Almost all the Tibeto-Burman populations, including Sherpa, cluster together in the middle left corner of the plot, which is accounted for by the extensive sharing of haplogroup D1-M15, D3-P99, and O3a2c1a-M117 among them …
In haplogroup O3a2c1a-M117, most of the Tibetan populations cluster tightly together in the NJ tree, along with Sherpa and Tamang of Nepal. However, more haplotypes of Sherpa samples share ancestry with Tibetan and other Tibeto-Burman populations from East Asia other than from Nepal (Figure 3b). Similarly, haplotypes of D1-M15 of Sherpa share ancestry with Tibetan, northwestern Han, and Zhuang (included in Tai-Kadai) populations from East Asia, although Sherpa has tended to be segregated away from the Tibetan cluster in the NJ tree.

… As we have mentioned above, haplogroup D3-P99 and D3a-P47 are almost exclusively distributed in Tibeto-Burman populations; but not only that, haplotypes of D3 also show strong similarities among different populations with distinctive and specific seven repeats at locus DYS392 (Figure 3d and Table S1) …

The most common mtDNA haplogroups in Sherpa are A4, C4a3b, M9a1a, D4, and U (including U* and U2a), in order of frequency. The majority of the mtDNA lineages belong to eastern Eurasian specific groups, including those from Northeast Asia (A, D4, D5, G, C, and Z) (Derenko et al., 2003, 2007; Tanaka et al., 2004) and Southern China or Southeast Asia (F, M9, M12 and M13) (Li et al., 2007), accounting for 59.02% and 23.50%, respectively …

Most of the Tibeto-Burman populations, including Sherpa, cluster tightly in the upper left corner of the plot. … Similarly, with the Y chromosome PCA plot, the Altaic populations also aggregate intermediate between the East Asian Tibeto-Burman cluster and the South Asian groups. However, the results based on haplogroup frequency comparisons could be misleading because of the quickly changing frequencies of the mtDNA lineages (Lu et al., 2013). A network analysis of individual lineages will most likely offer a better investigation of maternal relationships among the Sherpa and Himalayan populations. Haplogroup A4, C4a, and M9a comprise more than 60% of Sherpa samples, and the networks of those haplogroups were analyzed based on the HVS-I motif (Figure 4). In haplogroup A4, most haplotypes of Sherpa are shared with Tibeto-Burman, Altaic, and Han Chinese and clustered in the main clade of the network. …


Tibeto-Burman origin of Sherpa
About 83% of Sherpa Y chromosomes, including haplogroup C, D, and O, can be assigned an East or Southeast Asian origin. Detailed genetic structures at haplotype level of those lineages reveal strong affinities between Sherpa and Tibeto-Burman populations (especially with Tibetans). From the maternal side, mtDNA lineages that can trace to the East or Southeast Asian origin comprise about 82.5% of Sherpa people, and most of HVS-I haplotypes are shared or closely connected with samples of Tibeto-Burman and Altaic populations. The internal homogeneity observed in some lineages suggests a possible founder effect during the origin of Sherpa, especially for Y chromosome haplogroup D1-M15 and O3a2c1a-M117, mtDNA haplogroup A4 and C4a; that is, Sherpa people of those haplogroups are derived from a small number of migrants from a Tibeto-Burman source population.”
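
As a quick sanity check on the Sherpa figures quoted above, the short Python sketch below converts the reported Y-haplogroup percentages back into whole-number counts out of the 84 male samples, and adds up the two mtDNA shares. The counts are inferred here from the percentages and are not taken from the paper's Table S1; the function and variable names are hypothetical.

```python
N_MALES = 84  # number of male samples stated in the quoted passage

# Y-haplogroup percentages as quoted from Kang et al.
reported_pct = {
    "D1-M15": 11.90,
    "D3-P99": 7.14,
    "D3a-P47": 15.48,
    "O3a2c1a-M117": 45.24,
    "O3a2c1*-M134": 1.19,
}


def implied_count(pct, n):
    """Nearest whole number of individuals matching a percentage of n."""
    return round(pct / 100 * n)


def pct_from_count(count, n):
    """Percentage of n, rounded to two decimals as in the quoted text."""
    return round(100 * count / n, 2)


# Inferred counts (an assumption: the quoted text reports percentages only).
implied = {hg: implied_count(p, N_MALES) for hg, p in reported_pct.items()}

# Each inferred count should reproduce the quoted percentage after rounding.
consistent = all(
    abs(pct_from_count(count, N_MALES) - reported_pct[hg]) < 0.01
    for hg, count in implied.items()
)

# mtDNA: Northeast Asian share plus Southern China / Southeast Asian share.
mtdna_east_total = round(59.02 + 23.50, 1)
```

Under these assumptions each percentage corresponds to a whole number of men (for example, 45.24% of 84 is 38 individuals), and the two mtDNA shares add up to the roughly 82.5% East or Southeast Asian maternal ancestry cited in the final paragraph.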