The Koreans are generally considered a northeast Asian group because of their geographical location. However, recent findings from Y chromosome studies showed that the Korean population contains lineages from both southern and northern parts of East Asia. To understand the genetic history and relationships of Korea more fully, additional data and analyses are necessary.
All the mtDNA profiles studied here were classified into subsets of haplogroups common in East Asia, with just two exceptions. In general, the Korean mtDNA profiles revealed similarities to other northeastern Asian populations through analysis of individual haplogroup distributions, genetic distances between populations or an analysis of molecular variance, although a minor southern contribution was also suggested. Reanalysis of Y-chromosomal data confirmed both the overall similarity to other northeastern populations, and also a larger paternal contribution from southeastern populations.
The present work provides evidence that the peopling of Korea can be seen as a complex process, interpreted as an early northern Asian settlement with at least one subsequent male-biased southern-to-northern migration, possibly associated with the spread of rice agriculture.
From the paper:
The highest (23.8%) frequency in the Korean mtDNA pool was observed for haplogroup D4, which is widespread in northern East Asia and especially in the Korean-Chinese (21.6%), and Manchurians (20.0%). In total, haplogroup D lineages including the subhaplogroups (D4, D4a, D4b, D5, and D5a) accounted for 32.4% of the Korean mtDNA pool. In addition, the Koreans present moderate frequencies of (sub)haplogroup A (8.1%) and (sub)haplogroup G (10.3%) lineages, mostly prevalent in northeast Asia and southeast Siberia [20,55–57]. Other Siberian and Mongolian-prevalent haplogroups from the C, Y and Z lineages make up less than 4% of the Korean mtDNA pool. Haplogroups A5a and Y2 are found almost exclusively in Korea but were present at extremely low frequencies. In total, these northern haplogroups account for ~60% of the mtDNA gene pool of the Koreans. In addition, southeast Asian-prevalent mtDNA lineages of (sub)haplogroups B (14.6%), M7 (10.3%), and F (9.7%) are also found at moderate frequencies in the Korean population (Table 2). These findings suggest that more than 30% of the Korean mtDNA pool is attributable to maternal lineages with a more southern origin. We also found the haplogroup M7a1 exclusively in the Korean population. This result is consistent with previous reports that haplogroup M7a is restricted to Japan and South Korea [18,20]. Thus, the distribution pattern of mtDNA haplogroups leads us to consider that the peopling of Korea is likely to have involved multiple sources.
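As a quick arithmetic check on the excerpt above, the itemised frequencies can be tallied directly. This is only a sketch using the percentages quoted in the passage; the C, Y and Z lineages are left out because the excerpt gives them only as a combined "less than 4%", and the grouping into `northern` and `southern` dictionaries is my own labelling.

```python
# Frequencies (%) quoted in the excerpt for the Korean mtDNA pool.
# C, Y and Z are omitted: the excerpt only says they total "less than 4%".
northern = {"D (all subhaplogroups)": 32.4, "A": 8.1, "G": 10.3}
southern = {"B": 14.6, "M7": 10.3, "F": 9.7}

# Round to one decimal place to hide floating-point noise.
northern_listed = round(sum(northern.values()), 1)
southern_listed = round(sum(southern.values()), 1)

print(f"Northern haplogroups with quoted frequencies: {northern_listed}%")
print(f"Southern haplogroups with quoted frequencies: {southern_listed}%")
```

The southern total of the three itemised haplogroups matches the excerpt's "more than 30%". The northern figure here covers only the haplogroups with an explicit percentage in the quoted text.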
The FST distances of mtDNA markers (mtDNA haplogroups and HVR-I sequences) of Korean populations showed close relationships with Manchurians, Japanese, Mongolians and northern Han Chinese but not with southern Asians (Supplementary Tables S4 and S5; Figure 2A, B). In the MDS plots, the Korean samples lay entirely within the cluster of northern populations. In contrast, the results of Y chromosome analyses (based on Y-SNPs and Y-STRs) of Korean populations revealed closer relationships with both northeast and southeast Asian populations (Supplementary Tables S6 and S7; Figure 2C, D). Like the mtDNA distances, Y-chromosomal distances from Manchurian, Japanese and northern Han Chinese populations were usually not significantly greater than zero, but some distances from southern Han populations (e.g. Yunnan Han, Y haplogroups; Meixian Han, Y-STRs) or other southern populations (e.g. Vietnamese, Y haplogroups) were also not significantly above zero (Supplementary Tables S6 and S7), as noted previously. In the MDS plots, the Korean samples lay at the border between the northern and southern clusters, rather than within the northern cluster (Figure 2C, D). In order to investigate Y-chromosomal relationships in more detail, we visualized STR haplotypes within a common predominantly northern haplogroup (C*) and southern haplogroup (O3) using networks constructed with the seven Y-STRs common to all datasets (Figure 3). These networks did not show striking geographical structure, so we calculated, for each Korean haplotype, the distance to the closest northern and southern haplotype. In both haplogroups, the mean distance to the southern haplotypes was lower than to the northern haplotypes (C* Korean-north 5.0 steps, Korean-south 4.5 steps; O3 Korean-north 3.5 steps, Korean-south 2.2 steps). This finding is particularly striking for haplogroup C* because it is far more prevalent in the north (Figure 3A).
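The "distance to the closest northern and southern haplotype" comparison can be sketched in a few lines. The excerpt does not say how a "step" is counted, so this assumes the distance between two Y-STR haplotypes is the sum of absolute repeat-count differences across loci. The function names and the seven-locus haplotypes below are hypothetical, made up purely for illustration, and are not data from the paper.

```python
from statistics import mean

def step_distance(h1, h2):
    """Sum of absolute repeat-count differences across loci (an assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def mean_nearest_distance(queries, references):
    """Average, over query haplotypes, of the distance to the closest reference."""
    return mean(min(step_distance(q, r) for r in references) for q in queries)

# Hypothetical seven-locus Y-STR haplotypes (repeat counts), illustration only.
korean = [(14, 12, 24, 10, 13, 13, 11), (15, 12, 23, 10, 13, 14, 11)]
northern = [(16, 13, 24, 10, 14, 13, 11), (14, 12, 25, 11, 13, 13, 12)]
southern = [(14, 12, 24, 10, 13, 14, 11), (15, 13, 23, 10, 13, 14, 11)]

print("Korean-north:", mean_nearest_distance(korean, northern), "steps")
print("Korean-south:", mean_nearest_distance(korean, southern), "steps")
```

With these toy haplotypes the Korean set sits closer to the southern set, which is the same direction of result the excerpt reports for C* and O3, though the numbers themselves mean nothing.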
The genetic differences between the Koreans and other East Asians were examined by AMOVA (Table 4). When samples were grouped into northeast Asians and southeast Asians (excluding Koreans), a highly significant difference was found between the two groups with all markers. Thus there is significant genetic differentiation within the region, and we could then compare each group separately with the Koreans. With mtDNA, Koreans were not significantly different from either group when HVR-I sequences were compared, although they were distinct from the southeast Asians in the haplogroup comparisons. With the Y chromosomes, they were again not distinct from either group when haplogroup comparisons were made, but were distinct from the southeast Asians in the STR-based comparison (Table 4).
Our study documents the genetic relationships of the Koreans with their neighboring populations in unprecedented detail. Two major findings emerge. First, the Koreans are overall more similar to northeast Asians than to southeast Asians. This conclusion would be expected from the general correlation between genetic variation and geography observed for human populations, and is supported here by an examination of individual mtDNA haplogroups (Table 2), genetic distances between populations derived from mtDNA or Y-chromosomal data (Figure 2), and the apportionment of genetic diversity between different groups of populations (Table 4). Second, the conclusions from mtDNA and Y-chromosomal analyses differ. Sex-biased admixture is common in human expansions such as that of Bantu-speaking farmers in Africa, the spread of the Han ethnic group in China or the post-Columbian peopling of the Americas. The effects in Korea are more subtle, but show a larger male than female contribution from southern East Asia to the population of Korea, most clearly revealed by the admixture estimates, where a 35% contribution from the south was estimated for mtDNA, compared with an 83% contribution for the Y chromosome (Table 5).
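To make the idea of an admixture estimate concrete, here is a minimal sketch of one simple approach: a least-squares fit that treats the mixed population's haplogroup frequencies as a weighted average of two source populations. This is not necessarily the method behind Table 5 (the excerpt does not name one), and the function name, haplogroup labels and frequencies are all hypothetical.

```python
def admixture_proportion(mixed, source_a, source_b):
    """Least-squares share of source_a, assuming mixed = m*a + (1 - m)*b."""
    haplogroups = set(mixed) | set(source_a) | set(source_b)
    num = den = 0.0
    for h in haplogroups:
        diff = source_a.get(h, 0.0) - source_b.get(h, 0.0)
        num += (mixed.get(h, 0.0) - source_b.get(h, 0.0)) * diff
        den += diff * diff
    if den == 0:
        raise ValueError("source populations have identical frequencies")
    return num / den

# Hypothetical haplogroup frequencies, illustration only.
south = {"hg1": 0.6, "hg2": 0.3, "hg3": 0.1}
north = {"hg1": 0.2, "hg2": 0.3, "hg3": 0.5}
mixed = {"hg1": 0.5, "hg2": 0.3, "hg3": 0.2}  # built as 0.75*south + 0.25*north

m = admixture_proportion(mixed, south, north)
print(f"Estimated southern contribution: {m:.0%}")
```

Running the same estimator separately on maternal (mtDNA) and paternal (Y-chromosome) frequencies is what would expose a sex-biased contribution of the kind the excerpt describes. Real estimators also account for sampling error and molecular distances between haplotypes, which this sketch ignores.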
The predominant genetic relationship with northern East Asians is consistent with other lines of evidence. Xue et al. reported that the northern East Asian populations started to expand in number before the last glacial maximum at 21-18 KYA, while the southern populations all started to expand after it, but then grew faster, and they suggested that the northern populations expanded earlier because they could exploit the abundant megafauna of the “Mammoth Steppe,” while the southern populations could increase in number only when a warmer and more stable climate led to more plentiful plant resources such as tubers. By this criterion, the Koreans, expanding at about 30 KYA, also resemble other northern populations. Historical evidence suggests that the Ancient Chosun, the first state-level society, was established in the region of southern Manchuria and later moved into the Pyongyang area of the northwestern Korean Peninsula. Based on archeological and anthropological data, the early Korean population possibly had an origin in the northern regions of the Altai-Sayan and Baikal regions of Southeast Siberia [7,8,61]. What could be the origin of the male-biased southern contribution to the Korean gene pool illustrated, for example, by haplogroups O-M122 (42.2%) and O-SRY465 (20.1%)? Recent molecular genetic analyses and the geographical distribution of haplogroup O-M122 lineages, found widely throughout East Asia at high frequencies (especially in southern populations and China), have suggested a link between these Y-chromosome expansions and the spread of rice agriculture in East Asia [62–64]. In general, Y-chromosomes might be spread via a process of demic diffusion during the early agricultural expansion period [65,66]. If this interpretation were substantiated, the spatial pattern of Y-haplogroup O would imply a genetic contribution to Korea through the spread of male-mediated agriculture.
Large-scale genetic analyses thus begin to reveal some of the complexities of the peopling of Korea, and further studies of individual autosomal loci or genomewide genotyping and sequencing are expected to provide further insights.
Table 2 has a very useful listing of mtDNA haplogroup frequencies in several East Asian populations.
Table 5 has admixture estimates of NE and SE Asians in Korean populations; notice the sex asymmetry, with paternal lineages showing a larger southern contribution than maternal lineages.
Table S3 (Excel) has Y-chromosome haplogroup frequencies.
“There were dolmens and Germanic Caucasian Population Settlements in North Korea. North Koreans are Germanic Caucasian Settlers of R1B from the Bronze age. 43% of North Koreans are genetically proven to have R1B which is similar to the amount of 40~50% R1A in India and the Middle East.” – Korean Nobility, FamilyTree DNA
According to Allentoft et al. and Haak et al. (2015), eastern Yamnaya populations carry Y-haplogroup R1b, although one of the five samples belongs to Y-haplogroup I2a. See Allentoft et al., Bronze Age population dynamics, selection, and the formation of Eurasian genetic structure, Nature 522, 167–172 (11 June 2015), doi:10.1038/nature14507.
Abstract: The Bronze Age of Eurasia (around 3000–1000 BC) was a period of major cultural changes. However, there is debate about whether these changes resulted from the circulation of ideas or from human migrations, potentially also facilitating the spread of languages and certain phenotypic traits. We investigated this by using new, improved methods to sequence low-coverage genomes from 101 ancient humans from across Eurasia. We show that the Bronze Age was a highly dynamic period involving large-scale population migrations and replacements, responsible for shaping major parts of present-day demographic structure in both Europe and Asia. Our findings are consistent with the hypothesized spread of Indo-European languages during the Early Bronze Age. We also demonstrate that light skin pigmentation in Europeans was already present at high frequency in the Bronze Age, but not lactose tolerance, indicating a more recent onset of positive selection on lactose tolerance than previously thought.