Y chromosome haplogroups of Japanese population analysed in a 2013 forensic paper

Y-chromosome phylogenetic tree

Fig 1 Phylogenetic tree of 22 Y chromosome binary polymorphisms analyzed in this study

Yuta Harayama et al., “Analysis of Y chromosome haplogroups in Japanese population using short amplicons and its application in forensic analysis,” Leg Med (Tokyo), January 2014, Vol. 16(1):20–25, Epub 2013 Nov 1. doi:10.1016/j.legalmed.2013.10.005. Excerpts follow below.


We designed three mini multiplex PCR systems using single-base extension reactions to identify Japanese Y chromosome haplogroups. We selected a group of 22 Y chromosome single nucleotide polymorphisms (SNPs) from the haplogroups most commonly reported in East Asia. To make the systems more useful in analyzing degraded DNA samples, we designed primers to render amplicons of ⩽150bp. Applying these systems, we classified the Japanese population into major haplogroups and confirmed the applicability of these systems in forensic DNA analysis.

1. Introduction

Short tandem repeat (STR) markers are highly effective in determining personal identity, and Y chromosome STR loci and population genetic data from a wide range of ethnic groups are now routinely used in forensics [1], [2], [3], [4]. While SNPs have also been applied in kinship testing, more of these binary markers than STRs are required to be useful. However, SNPs have certain advantages over STRs, including much greater mutational stability and good performance when typing highly degraded DNA [5], [6], [7]. The Y chromosome carries the largest amount of non-recombining DNA and contains stable binary markers that can be used in evolutionary studies. Y chromosome SNP typing can help trace the origins and history of human populations by tracking migrational patterns [8].

The human Y chromosome tree contains 20 major clades, consisting of 311 distinct haplogroups defined by hundreds of binary markers [9], [10]. To classify the Japanese population, we selected haplogroups C, D, and O, reported as major haplogroups in Japan and East Asia, and haplogroups N and Q, found at low frequencies in Japan [9], [11]. Recent reports indicate that much of the Japanese population can be subdivided into sub-haplogroups D2 and O2 [9]. We established three mini multiplex PCR systems to classify the Japanese population. System 1 is capable of classifying the Japanese population into the major clades C, D, D1, D2, D3, O, O1a, O2, O3, N, and Q. System 2 subdivides clade D2; System 3 subdivides clade O2. These PCR systems use single-base extension (SBE) reactions.

The goal of this study was to develop methods for analyzing difficult DNA samples encountered in forensics. The analysis of highly fragmented DNA or samples containing PCR inhibitors using commercially available STR typing kits often fails to resolve informative profiles. Several methods have been proposed to remove the inhibitors or reduce their effects. MiniSTR analysis allows us to analyze degraded DNA samples efficiently by obtaining short PCR products [12], [13], [14], [15], [16]. We focused on Y chromosome SNPs used in haplogroup classification and applied the present systems to personal identification tasks. The advantage of using the simultaneously detected multiplex system of Y chromosome SNPs is the capacity to predict the haplogroup even if the typing of the alleles is incomplete. To make the systems more useful with such samples, we designed primers to render amplicons of ⩽150bp. We analyzed highly degraded DNA to determine the efficacy of these multiplex systems.

2. Materials and methods

2.1. Samples and DNA extraction

This study was approved by the Ethics Committee of Shinshu University. After obtaining informed consent, we collected samples from 432 healthy unrelated adult Japanese males representing virtually every prefecture in Japan (including Hokkaido and Okinawa) and extracted DNA from blood or buccal mucosa cells using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany). We also extracted DNA from various male bone samples by SDS-proteinase K treatment followed by phenol/chloroform extraction.

2.2. Primer design and multiplex PCR amplification

We selected 22 SNPs from the non-coding regions of the Y chromosome using the phylogenetic tree of Y chromosome haplogroups, focusing on Japanese groups (Fig. 1). Each primer set was designed using Primer3Plus software (http://primer3plus.com/) to generate amplicons (including each SNP) of ⩽150bp by setting each primer binding site close to the SNP. Each primer was checked for potential self-dimer structures using AutoDimer software (http://www.cstl.nist.gov/biotech/strbase/AutoDimerHomepage/AutoDimerProgramHomepage.htm). We checked each PCR primer set by agarose gel electrophoresis to confirm that each product was specific to male DNA and confirmed the allele typing of single base extension products by DNA sequencing with a 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA).

Fig. 1. Phylogenetic tree of 22 Y chromosome binary polymorphisms analyzed in this study. Marker names are indicated above the lines. SNPs are indicated by red letters in System 1, blue letters in System 2, and green letters in System 3.

We ran three mini multiplex PCR systems. System 1 (undecaplex M15, RPS4Y711, M231, P31, P191, M119, IMS-JST021355, M242, P99, M179, and M122) roughly subdivided the Japanese population into haplogroups C, D, D1, D2, D3, O, O1a, O2, O3, N, and Q. System 2 (octaplex IMS-JST022457, M116.1, M125, P151, P120, P42, M179, and P12) further subdivided haplogroup D2, while System 3 (pentaplex SRY465, M95, P31, M88, and PK4) further subdivided haplogroup O2. …
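
As a quick consistency check on the panel composition, the three marker lists above can be written out as sets. This is only a bookkeeping sketch: the marker names are copied from the paragraph above, while the variable names are made up for illustration. It shows that the 11 + 8 + 5 assays cover 22 distinct markers because two markers appear in two systems each.

```python
# Marker lists copied from the description of the three systems.
SYSTEM_1 = ["M15", "RPS4Y711", "M231", "P31", "P191", "M119",
            "IMS-JST021355", "M242", "P99", "M179", "M122"]
SYSTEM_2 = ["IMS-JST022457", "M116.1", "M125", "P151", "P120",
            "P42", "M179", "P12"]
SYSTEM_3 = ["SRY465", "M95", "P31", "M88", "PK4"]

# Distinct markers across all three systems.
all_markers = set(SYSTEM_1) | set(SYSTEM_2) | set(SYSTEM_3)

# Markers typed in more than one system.
shared_1_2 = set(SYSTEM_1) & set(SYSTEM_2)
shared_1_3 = set(SYSTEM_1) & set(SYSTEM_3)

print(len(SYSTEM_1), len(SYSTEM_2), len(SYSTEM_3))  # 11 8 5
print(len(all_markers))                             # 22
print(shared_1_2, shared_1_3)                       # {'M179'} {'P31'}
```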

3. Results

We established three mini Y chromosome SNP multiplex systems using 22 Y chromosome binary markers to identify 23 haplogroups in the Japanese population. Sensitivity studies detected allele peaks at >150 relative fluorescence units. In investigating template DNA concentrations with System 1, we observed several additional peaks with 50pg of template DNA. Interpretation of analyses with 50pg of template DNA in Systems 2 and 3 proved difficult due to low peaks. To avoid mistyping attributable to extra peaks, we set the lower template limit between 50 and 100pg. While allele typing was successful in the group with 5ng of template DNA, the target peaks were too high, and extra peaks were observed. Thus, we set the maximum template level at <2ng. Typing proved possible for all samples with template DNA amounts between 100pg and 2ng, and no significant extra peaks were observed. Within these limits established by DNA detection range analysis, allele typing for all selected SNPs proved successful with each system. Fig. 3 shows the results for 9948 DNA (Promega, Madison, WI) obtained using our systems. When SNP analysis was performed using female DNA as a template or negative control, no PCR bands were detected. Unexpected peaks were occasionally visible, but these peaks did not affect SNP evaluations.
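
The working range described above can be expressed as a simple check. This is a sketch under assumptions, not part of the published protocol: it treats 100pg and 2ng as inclusive bounds (the text gives the upper limit as <2ng but also reports successful typing "between 100pg and 2ng"), and the function and constant names are hypothetical.

```python
# ASSUMPTION: bounds taken from "between 100pg and 2ng", treated as inclusive.
MIN_TEMPLATE_PG = 100.0
MAX_TEMPLATE_PG = 2000.0


def template_status(amount_pg):
    """Classify a template DNA amount (in picograms) against the working range."""
    if amount_pg < MIN_TEMPLATE_PG:
        return "too low"       # e.g. 50pg: low or additional peaks reported
    if amount_pg > MAX_TEMPLATE_PG:
        return "too high"      # e.g. 5ng: overly high target peaks, extra peaks
    return "within range"


for amount in (50, 100, 500, 2000, 5000):
    print(amount, template_status(amount))
```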


Fig. 3. Electropherograms for 9948 DNA obtained using the present SNP systems.

Most of the Japanese population can be classified using these three mini Y chromosome SNP multiplex systems. Table 1 shows the haplogroup frequencies for the Japanese population. The frequencies of mutations RPS4Y711 (haplogroup C), IMS-JST021355 (haplogroup D), and P191 (haplogroup O) were 8.3%, 30.3%, and 59.0%, respectively, similar to the haplogroup frequencies found in past studies [18], [19], [20], [21]. Mutations M231 (haplogroup N) and M242 (haplogroup Q) were rarely found in this study. Using Systems 2 and 3, we subdivided populations of haplogroups D2 and O2. In this survey, haplogroup D2a1b (16.2%) was the most frequent in Japanese haplogroup D populations and haplogroup O2b (32.2%) the most frequent in the haplogroup O population. The haplogroup frequencies observed in haplogroup D2 and O2 were similar to those reported in previous studies [19], [20].

Table 1. The haplogroup frequencies for Japanese population.
Haplogroup       No. of samples   Frequency
C                36               0.083
D                0                0
D1               0                0
D2               22               0.051
D2a              29               0.067
D2a1             3                0.007
D2a1a            6                0.014
D2a1a1           0                0
D2a1b            70               0.162
D2a2             0                0
D2a3             1                0.002
D3               0                0
N                4                0.009
O                0                0
O1               0                0
O1a              7                0.016
O2               15               0.035
O2a              0                0
O2a1             0                0
O2a1a            0                0
O2b              139              0.322
O3               94               0.218
Q                2                0.005
Not determined   4                0.009
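
The major-clade percentages quoted in the text (8.3%, 30.3%, and 59.0%) follow from Table 1 by summing the subclade counts. The sketch below reproduces that arithmetic. The counts are copied from the non-zero rows of Table 1; grouping by the first letter of the haplogroup name is a convenience chosen for this illustration, not something the paper describes.

```python
# Non-zero rows of Table 1: haplogroup -> number of samples.
COUNTS = {
    "C": 36,
    "D2": 22, "D2a": 29, "D2a1": 3, "D2a1a": 6, "D2a1b": 70, "D2a3": 1,
    "N": 4,
    "O1a": 7, "O2": 15, "O2b": 139, "O3": 94,
    "Q": 2,
    "Not determined": 4,
}

total = sum(COUNTS.values())


def major_clade(name):
    """Map a haplogroup name to its major clade (first letter)."""
    return name if name == "Not determined" else name[0]


# Sum subclade counts within each major clade.
major = {}
for name, n in COUNTS.items():
    key = major_clade(name)
    major[key] = major.get(key, 0) + n

percent = {clade: round(100 * n / total, 1) for clade, n in major.items()}

print(total)    # 432
print(major)
print(percent)
```

The total of 432 matches the number of donors given in section 2.1, and the D and O sums (131 and 255 samples) give the 30.3% and 59.0% quoted above.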

We also investigated the effectiveness of our systems in analyzing degraded samples. We re-analyzed a set of 30 hard tissue samples unsuccessfully examined using the protocol for a commercially available AmpFLSTR Yfiler Kit. This protocol had produced unsatisfactory results for at least 7 of the 16 loci. Fig. 4 shows the results of our analysis of the degraded DNA samples. Only 8 alleles were successfully typed using the AmpFLSTR Yfiler Kit; in contrast, the present systems proved able to detect all alleles and define the haplogroup. Table 2 presents the results of our analysis. The present systems proved capable of classifying 29 of 30 degraded DNA samples previously examined unsuccessfully using the AmpFLSTR Yfiler Kit. We also used the three systems to analyze an artificially degraded DNA sample (Table 3). In tests of degraded DNA digested with DNase, typing had failed for more than half the loci. In contrast, the present systems also proved effective with these degraded samples (Supplementary data 3).

Fig. 4. (a) Electropherograms for degraded DNA sample from a male, extracted from a hard tissue sample and obtained using the AmpFLSTR Yfiler Kit (Applied Biosystems). (b) Electropherograms for degraded DNA samples from a male, extracted from hard tissue sample and obtained using the present SNP systems.

Table 2. The result of allele typing using the Y-SNPs multiplex systems for degraded DNA.
Sample No.   AmpFLSTR Yfiler Kit (a)   System 1 (b)   System 2   System 3   Haplogroup
1            9                         9              3          3          (–) (c)
2            7                         9              8          5          O2b
3            3                         11             6          5          O2b
4            8                         10             6          5          O2b
5            9                         10             6          5          D2a1
6            9                         11             7          5          D2a1a
7            9                         10             8          5          D2a1b
8            9                         10             8          5          O2b
9            8                         8              7          5          O2b
10           9                         8              7          5          D2a1b
11           9                         8              7          4          D2a1b
12           9                         8              8          5          D2a1a
13           9                         10             6          4          Q
14           6                         7              7          5          O2b
15           6                         5              5          5          O2b
16           7                         8              5          4          O3
17           3                         9              6          4          D2a1a
18           6                         11             8          5          O3
19           2                         8              7          2          O1a
20           9                         11             7          4          O3
21           8                         7              8          5          D2a1b
22           4                         8              8          5          D2
23           7                         10             8          5          D2a
24           7                         10             8          5          O2b
25           9                         9              8          5          O2
26           9                         9              6          4          O3
27           6                         8              7          5          O2b
28           8                         8              8          5          D2a
29           6                         9              7          5          D2a1b
30           9                         7              8          5          D2a

(a) Number of loci typed successfully in the AmpFLSTR Yfiler Kit.

(b) Number of loci typed successfully in miniY-SNP systems.

(c) (–) indicates incomplete information.
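
The statement that 29 of 30 degraded samples could be classified can be checked directly against Table 2. The sketch below is only a tally of the table: the rows are copied as (sample, Yfiler loci, System 1, System 2, System 3, haplogroup), with None standing in for the "(–)" entry. The variable names are illustrative.

```python
from collections import Counter

# Rows of Table 2: (sample, Yfiler loci, System 1, System 2, System 3, haplogroup).
ROWS = [
    (1, 9, 9, 3, 3, None), (2, 7, 9, 8, 5, "O2b"), (3, 3, 11, 6, 5, "O2b"),
    (4, 8, 10, 6, 5, "O2b"), (5, 9, 10, 6, 5, "D2a1"), (6, 9, 11, 7, 5, "D2a1a"),
    (7, 9, 10, 8, 5, "D2a1b"), (8, 9, 10, 8, 5, "O2b"), (9, 8, 8, 7, 5, "O2b"),
    (10, 9, 8, 7, 5, "D2a1b"), (11, 9, 8, 7, 4, "D2a1b"), (12, 9, 8, 8, 5, "D2a1a"),
    (13, 9, 10, 6, 4, "Q"), (14, 6, 7, 7, 5, "O2b"), (15, 6, 5, 5, 5, "O2b"),
    (16, 7, 8, 5, 4, "O3"), (17, 3, 9, 6, 4, "D2a1a"), (18, 6, 11, 8, 5, "O3"),
    (19, 2, 8, 7, 2, "O1a"), (20, 9, 11, 7, 4, "O3"), (21, 8, 7, 8, 5, "D2a1b"),
    (22, 4, 8, 8, 5, "D2"), (23, 7, 10, 8, 5, "D2a"), (24, 7, 10, 8, 5, "O2b"),
    (25, 9, 9, 8, 5, "O2"), (26, 9, 9, 6, 4, "O3"), (27, 6, 8, 7, 5, "O2b"),
    (28, 8, 8, 8, 5, "D2a"), (29, 6, 9, 7, 5, "D2a1b"), (30, 9, 7, 8, 5, "D2a"),
]

# Samples for which a haplogroup was assigned.
classified = sum(1 for row in ROWS if row[5] is not None)

# Best Yfiler result among these samples (out of 16 loci).
max_yfiler = max(row[1] for row in ROWS)

# How often each haplogroup was assigned.
haplogroup_counts = Counter(row[5] for row in ROWS if row[5] is not None)

print(f"{classified} of {len(ROWS)} samples classified")
print("best Yfiler result:", max_yfiler, "of 16 loci")
print(haplogroup_counts.most_common())
```

The maximum of 9 Yfiler loci agrees with the selection criterion stated above, namely that the Yfiler protocol had failed for at least 7 of the 16 loci in every sample.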

Table 3. The result of allele typing using the Y-SNPs multiplex systems for artificially degraded DNA.
Enzyme reaction time (min)   Yfiler Kit (a)   System 1 (b)   System 2   System 3
0                            16               11             8          5
2                            16               11             8          5
5                            16               11             8          5
10                           16               11             8          5
30                           16               11             8          5
60                           11               11             8          5
90                           7                11             8          5
120                          7                11             8          5

(a) Number of loci typed successfully in the AmpFLSTR Yfiler Kit.

(b) Number of loci typed successfully in miniY-SNP systems.
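
Table 3 can be turned into typing success rates to make the contrast explicit. The sketch below is a plain recalculation of the table under two stated assumptions: 16 loci as the Yfiler total (taken from the Results text) and 11 + 8 + 5 assays as the SNP total (taken from the system descriptions). The names are hypothetical.

```python
YFILER_TOTAL = 16          # loci in the Yfiler kit, per the Results text
SNP_TOTAL = 11 + 8 + 5     # assays across Systems 1, 2, and 3

# Rows of Table 3: (DNase time in min, Yfiler loci, System 1, System 2, System 3).
ROWS = [
    (0, 16, 11, 8, 5), (2, 16, 11, 8, 5), (5, 16, 11, 8, 5), (10, 16, 11, 8, 5),
    (30, 16, 11, 8, 5), (60, 11, 11, 8, 5), (90, 7, 11, 8, 5), (120, 7, 11, 8, 5),
]


def fraction_typed(row):
    """Return (Yfiler fraction, mini Y-SNP fraction) of successfully typed loci."""
    _, yfiler, s1, s2, s3 = row
    return yfiler / YFILER_TOTAL, (s1 + s2 + s3) / SNP_TOTAL


# First digestion time at which Yfiler typing fails for more than half the loci.
first_majority_failure = next(t for t, y, *_ in ROWS if y < YFILER_TOTAL / 2)

for row in ROWS:
    str_frac, snp_frac = fraction_typed(row)
    print(f"{row[0]:>3} min  Yfiler {str_frac:.2f}  mini Y-SNP {snp_frac:.2f}")
print("Yfiler drops below half at", first_majority_failure, "min")
```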

4. Discussion

STR and SNP analyses have become essential tools for determining personal identity based on biological samples. Current research is especially active in the area of autosomal and Y chromosome STRs and SNPs [22], [23], [24], [25], [26], [27]. Commercially available STR multiplex kits are not specifically manufactured for the analysis of forensic samples; forensic scientists often encounter problems during analysis of degraded DNA. To date, multiplex SNP analysis of degraded DNA samples has not been investigated extensively. We configured three systems to perform simultaneous analysis of biallelic markers on the Y chromosome that classify haplogroups in the Japanese population and began by evaluating the performance of our systems with Japanese haplogroup classification. We applied the newly devised mini Y chromosome SNP multiplex PCR systems to the analysis of samples from 432 Japanese men. The results indicated frequencies of major haplogroups consistent with those found in previous studies [18], [19], [20], [21]. For 0.9% of the Japanese population, we failed to discover any mutations using our three Y chromosome SNP analysis systems. These samples appear to belong to haplogroups I and R [11]. The haplogroup D lineage occurs most frequently in Central Asia and in Japan; the haplogroup D2 lineage is rarely found outside Japan [11]. In this survey, all haplogroup D instances belonged to haplogroup D2, while the frequencies of subhaplogroups D2, D2a1, and D2a1b showed no significant differences from previous reports and fine classifications, suggesting that System 2 may be very useful in subdividing the Japanese haplogroup D2 population. Where further classification is required, IMS-JST022456 may help define the subclades of haplogroup D2 [20].

Haplogroup O, the most prevalent haplogroup in Japan, was divided by System 1 and further divided by System 3. In System 1, 21.8% of samples branched into haplogroup O3. Using System 3, we demonstrated that haplogroup O2 branched into haplogroup O2b (32.2%). Haplogroups O2b and O3 accounted for more than half the Japanese population. Introducing still another system to subdivide haplogroups O2b and O3 should make the approach still more useful for personal identification. Reports indicate many individuals in the Japanese haplogroup O2b have the 47z mutation (haplogroup O2b1) [11], [25]. Additionally, the Japanese haplogroup O3 can be divided into further subgroups [27], [28].

We found that …

– haplogroup O accounted for 59.0% of the samples;

– haplogroup D for 30.3% of the samples; and

– haplogroup C for 8.3%.

Several studies indicate haplogroups C, D, and O are found in more than 95% of the East Asian population [18], [28], but at differing proportions from country to country. Japan features high proportions of haplogroup D, while South Korea features high proportions of haplogroup C [28]. Genetic differences between East Asians are also evident in mitochondrial DNA haplogroups. Mitochondrial DNA is an excellent tool for forensic genetics due to the high copy numbers per cell and maternal inheritance. Certain mitochondrial haplogroups, such as M7a and N9b, occur frequently in the Japanese population but are rarely encountered in other East Asian populations [29]. Using mitochondrial and Y chromosome SNPs, we can exploit these differences to categorize East Asian populations into the appropriate haplogroups.

Personal identification requires further classification; forensic scientists often encounter major difficulties in analyzing degraded DNA samples. Quite often, degraded DNA samples cannot be successfully analyzed using commercially available kits subject to sample volume limitations. In forensic examinations, an additional system capable of fine sub-classification may help. The objective of the present study is to apply these methods to analyze degraded samples for forensic purposes. Allele typing by Y chromosome SNP analysis is easier than with autosomal or X chromosome SNPs because there is no heterozygosity, and systems that detect Y chromosome SNPs simultaneously can often predict haplogroups, even with incomplete allele typing. STRs are known to produce stutter artifacts differing from true alleles that may complicate analysis; on the other hand, SNP analysis is very simple.

Our new systems containing 22 Y chromosome SNPs promise effective and efficient analysis of highly degraded DNA samples in the Japanese population. The short amplicons used in this study offer the potential to become the tool of choice for analyzing degraded DNA samples [17], [30]. To test these hypotheses, we used amplification product lengths between 77bp (M122) and 150bp (M231 and M95) for all Y chromosome SNPs. On this basis, our systems proved capable of generating favorable results with highly degraded DNA samples. … these systems proved effective with samples in which STRs could be detected in >2 loci. Analytical results for artificially degraded samples substituting for highly degraded forensic DNA sources were also superior to those obtained using commercial STR kits. For degraded DNA samples for which alleles were not completely detected, this means these systems can easily determine haplogroups and that even if haplogroups are not determined to precise subgroups, the detected SNPs can help achieve personal identification.

Selecting 22 Y chromosome SNPs and developing Y chromosome SNP multiplex systems (mini Y chromosome SNP) to analyze degraded DNA samples, we demonstrated these systems are capable of identifying polymorphisms in Japanese subjects and of analyzing highly degraded samples for personal identification in forensic studies.


One response to “Y chromosome haplogroups of Japanese population analysed in a 2013 forensic paper”

  1. I’m curious why the Y chromosome haplogroups C-M130, C2-M217, D-M174, D2-M55, etc. found in some Far East Asian males are so distantly related to Y chromosome haplogroups NO-M214, N-M231, O-M175, O1a-M119, O1b-M256 / P31, O2-M122, O2***…., Q-M242, Q2-M129 and so on. The same goes for the Far East Asian maternal-line mitochondrial DNA haplogroups M*, M7, M8, M8a, CZ, C, Z, M9, E, M11, G (M type) and the mtDNA haplogroups N9a, A and Y (N type). As with the male Y haplogroups NO, N, O and Q, are the Far East Asian maternal mitochondrial haplogroups N9a, Y, A, R11, B, R9, F, … (N and R type) quite different from the East Asian maternal haplogroup M*?
    I asked Dr Miguel Vilar, who became leader of the Genographic Project at NatGeo after Dr Spencer Wells, about East and Southeast Asian uniparental haplogroups. He told me it is possible that the Asian Y haplogroups C-M130, C2-M217, D-M174, D2-M55, etc. were the counterparts of the Asian mitochondrial haplogroup M* and its subclades. He said the best way to assess this is to compare the ages of these Y-DNA and mtDNA haplogroups.

    On the Japanese Jomon, Ainu and Yayoi groups: Ainu people have “Sundadonty” dental morphology, wet earwax, hairier bodies, curly hair, deep-set eyes (dolichocephalic to mesocephalic), a more prominent nose bridge, a higher risk of baldness, etc. Even so, the closest autosomal matches for the Ainu and Jomon peoples are East Siberians, Nivkhs, and perhaps Tungusic peoples from northern Manchuria near the Amur River. “Pure” Ainu people do not have Y haplogroups O-M175, N-M231, Q-M242 or mtDNA haplogroups R, B and F at all. “Old” Japanese males at least have Y chromosome haplogroup D2-M55 and, rarely, C1-M216 (?) and C2-M217 (around 5 – 15%).
