Human Genetics

© Springer-Verlag 2003
10.1007/s00439-003-0948-y

Original Investigation
The origins and genetic structure of three co-resident Chinese Muslim populations: the Salar, Bo'an and Dongxiang
Wei Wang1, 2, Cheryl Wise1, Tom Baric1, Michael L. Black1 and Alan H. Bittles1

(1) Centre for Human Genetics, Edith Cowan University, 100 Joondalup Drive, WA 6027 Perth, Australia
(2) Medical Centre of Peking University-Hong Kong University of Sciences and Technology, 518036 ShenZhen, PR China


Alan H. Bittles
Email: a.bittles@ecu.edu.au
Phone: +61-8-94005623
Fax: +61-8-94005851

Received: 2 December 2002 Accepted: 1 March 2003 Published online: 21 May 2003

Abstract A genome-based investigation of three Muslim populations, the Salar, Bo'an, and Dongxiang, was conducted on 212 individuals (148 males, 64 females) co-resident in Jishisan County, a minority autonomous region located in the province of Gansu, PR China. The Salar are believed to be of Turkic origin, whereas the Bo'an and Dongxiang both speak Mongolian. Biparental dinucleotide markers on chromosomes 13 and 15 indicated elevated mean homozygosity in the Salar (0.32), Bo'an (0.32), and Dongxiang (0.27), equivalent to inbreeding coefficients (Fis) of 0.16, 0.12 and 0.01, respectively, confirming varying levels of endogamous and consanguineous marriage in all three communities. Y-chromosome unique event polymorphisms (UEPs) showed that males in the three communities shared common ancient origins, with 80–90% of haplotypes in common. However, the high levels of community-specific Y-chromosome STR haplotypes strongly suggested the action of founder effect, genetic drift and preferential consanguinity during more recent historical time. By comparison with the marked inter-community differentiation revealed by the Y-chromosome STRs (29.4%), the mtDNA data indicated similarity between the female lineages of each community, with just 1.2% inter-community variation. The combined use of these different marker systems gives an in-depth historical perspective, and provides evidence of past inter-marriage between genetically diverse male founders of each community and Han Chinese females, with subsequent community endogamy.

--------------------------------------------------------------------------------

Introduction
The history of Chinese Muslims dates back some 1,400 years to the Tang Dynasty (618–907 AD), when followers of the Islamic faith are reported to have variously entered China as soldiers, merchants and political emissaries from Arabia and Persia (Du and Yip 1993; Gladney 1996, 1998; Leslie 1986; Lipman 1997; Wong and Dajani 1988). These groups and individuals subsequently settled throughout China, and contributed appreciably towards local and national development. There are ten officially recognized Muslim minorities in the People's Republic of China, the Bo'an, Dongxiang, Hui, Kazakh, Kirghiz, Salar, Tatar, Tajik, Uygur and Uzbek, with a combined current population of 91 million (Family Planning Commission 1997). Although their individual histories and population sizes vary, the origins of the ten populations have been traced to Arab, Iranian and Central Asian sources, and/or the Mongol peoples (Gladney 1996; Wong and Dajani 1988).

A number of studies have been conducted into the genetic history of Chinese groups, each of which primarily focused on selection, founder and/or bottleneck effects, genetic drift, and mutation, and addressed human migration on an evolutionary time scale (Chu et al. 1998; Su et al. 1999; Ding et al. 2000). Only limited attention has been paid to factors such as endogamy, past and current population sizes, polygyny and polyandry, and kin-structured migration, which could significantly shape the pattern of genetic diversity within the relatively shorter time frame of historical events (Bittles and Neel 1994; Seielstad et al. 1998; Stoneking 1998).

The present study examined genomic variation in three endogamous Chinese Muslim communities, the Salar, Bo'an, and Dongxiang, by genotyping uni- and biparental markers. The three ethnic groups are typical of contemporary Chinese Muslim communities; each has maintained many of the cultural and religious traditions of their founders and all favour community endogamy and permit consanguineous marriage (Gladney 1996; Wong and Dajani 1988). However, the Salar are a Turkic language community, whereas the Bo'an and Dongxiang are Mongolian-speaking (Gladney 1998). In 1998 the populations of the three communities were reported as 11,683 Bo'an, 87,546 Salar, and 373,600 Dongxiang (Family Planning Commission 1997).


--------------------------------------------------------------------------------

Subjects and methods
Ethical approval
As in a large majority of traditional Asian populations, family and community support was an essential prerequisite to the study and a three-stage process of approval was followed. Acting on the advice of the Chinese Academy of Sciences, the cooperation of local community religious leaders (Ahong) was first requested, with details of the sampling procedure provided. The Ahong who expressed interest in the project obtained parental approval for the collection of samples and then organized recruitment of the subjects, all of whom were volunteers. On this basis, approval for the study was given by the Ethics Committee of Tongji Medical University, PR China and Edith Cowan University, Australia.

Analytical methods
Finger-prick blood samples from 81 Salar (52 male, 29 female), 67 Bo'an (47 male, 20 female) and 64 Dongxiang (49 male, 15 female) aged 12–18 years were collected from randomly selected individuals in each community. All subjects were resident in Jishisan County, a minority autonomous region located in Gansu province, northwest China (Fig. 1).

Fig. 1. Geographical location of Jishisan County, a minority autonomous region located in Gansu province, PR China

--------------------------------------------------------------------------------

A set of ten dinucleotide STR markers on chromosomes 13 and 15 (D13S126, D13S133, D13S192, D13S270, D15S11, D15S97, D15S98, D15S101, D15S108 and GABRB3) was investigated in all samples following procedures previously standardized on East and South Asian populations (Wang et al. 2000; Black et al. 2001). For Y-chromosome analysis, one tri- and three tetranucleotide STR markers (DYS19, DYS388, DYS389I, DYS393) were chosen from a panel of markers widely used in forensic examinations (de Knijff et al. 1997). PCR conditions for the autosomal markers and for the Y-chromosome markers were as previously described (Wang et al. 2000). Separation of PCR products was performed on 6% denaturing polyacrylamide gels using an ABI 373A automatic sequencer (Applied Biosystems). The Genotyper program was used to size STR alleles by reference to an internal size standard Genescan-500 TAMRA (Applied Biosystems).

Male samples were analysed for 15 unique event polymorphisms (UEPs) on the Y-chromosome: M1 (YAP), M216/M130 (RPS4Y), M89/M213, M172, M170, M9, M175/M214, M122, M134, M159, M119, M95, M45, M173, and M17 (Underhill et al. 2001). PCR primers were as described elsewhere (Underhill et al. 2001), with the exception of the forward primers for M45 (5′-ATTGGCAGTGAAAAATTATAGCTA-3′) and M17 (5′-GTGGTTGCTGGTTGTTACGTG-3′), which were designed with a 3′ mismatch, creating a restriction fragment length polymorphism (RFLP) site. M45 was typed by BfaI restriction digestion of the PCR product and M17 by AflIII digestion. Five other markers, M130 (BslI), M213 (NlaIII), M9 (HinfI), M175 (MboII) and M122 (NlaIII), were typed using PCR-RFLP assays according to the manufacturer's instructions (New England BioLabs). M1 was detected by PCR amplification of either a 455-bp (YAP+) or 150-bp (YAP–) fragment that can be resolved by electrophoresis on 2% agarose. The remaining polymorphisms were analysed using a modified version of the primer-extension assay (protocol available on request) and matrix-assisted laser desorption/ionization mass spectrometry (Haff and Smirnov 1997). Mass spectra were collected using a Voyager-DE PRO MALDI-TOF instrument (Applied Biosystems). Genotypes were determined by calculating the mass of the dideoxynucleotide incorporated at the variant site.

The mtDNA hypervariable region I (HV-I) was amplified and sequenced in a subset of 30 samples (n=10 from each community) according to Hopgood et al. (1992), using ABI Prism dye primer kits running on 6% denaturing polyacrylamide gels. Sequence data were analysed using ABI DNA analysis software and SEQUENCER (Gene Codes).

Statistical methods
Basic statistical computations on the autosomal data included allele frequency, heterozygosity and gene diversity, with Hardy-Weinberg equilibrium (HWE) tests performed using the GENEPOP program (Rousset 1995). An exact probability test was employed to assess the significance of deviation from HWE (Guo and Thompson 1992). Where deviation from HWE was confirmed, a U-test was used to further assess whether it was due to heterozygote deficiency (Rousset and Raymond 1995). Both the exact probability test and the U-test are based on Markov chain Monte Carlo type algorithms (Guo and Thompson 1992). The correlation of genes of individuals within populations (Fis) was calculated for each population (Weir and Cockerham 1984).
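
As a rough illustration of the quantity reported as Fis, the sketch below computes a simple single-locus estimate, 1 − Ho/He, from a list of diploid genotypes. It is not the Weir and Cockerham (1984) estimator used in the study, which includes sample-size corrections; the function name and the toy genotypes are assumptions made for illustration only.

```python
from collections import Counter

def simple_fis(genotypes):
    """Return (Ho, He, Fis) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Fis is estimated as 1 - Ho/He, with He taken from sample allele
    frequencies (no small-sample correction).
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies from the 2n gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum(p * p for p in freqs)
    return ho, he, 1.0 - ho / he

# Hypothetical genotypes (allele sizes in bp) with an excess of homozygotes.
toy = [(104, 104), (104, 104), (106, 106), (106, 106), (104, 106), (104, 106)]
ho, he, fis = simple_fis(toy)
```

A positive value, as in this toy sample, indicates fewer heterozygotes than expected under HWE, which is the pattern the U-test is designed to detect.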

Haplotypes for the Y-chromosome UEPs were assigned, based on the typing results and the defined evolutionary relationship of the markers. As described in Wells et al. (2001), haplotypes were named for their most derived marker. For example, individuals typed as M9G, M45A, and M17delG were classified as haplotype M17. Haplotype frequencies were determined by direct counting with diversities calculated according to the method of Nei (1987).
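
The naming rule and the diversity calculation can be sketched as follows. The marker order used here (M9, M45, M173, M17) is a simplified single lineage chosen for illustration, and the five-male sample is hypothetical; the diversity formula is the unbiased form n/(n−1)·(1 − Σp²) usually attributed to Nei (1987), which is assumed to be the one applied in the study.

```python
from collections import Counter

# Simplified, assumed ordering of four markers from ancestral to most derived
# along one lineage (M9 -> M45 -> M173 -> M17); illustrative only.
LINEAGE = ["M9", "M45", "M173", "M17"]

def name_haplotype(derived_markers):
    """Name a haplotype after the most derived marker carried on LINEAGE."""
    carried = [m for m in LINEAGE if m in derived_markers]
    return carried[-1] if carried else None

def nei_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    freqs = [c / n for c in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))

# Hypothetical sample of five males, each given as the set of derived markers typed.
sample = [name_haplotype(s) for s in (
    {"M9", "M45", "M17"}, {"M9", "M45", "M17"}, {"M9", "M45"}, {"M9"}, {"M9"},
)]
diversity = nei_diversity(sample)
```

Here the first two males are named M17, the third M45 and the last two M9, matching the rule that a haplotype takes the name of its most derived marker.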

The mtDNA sequences were edited, aligned and compared with the published reference sequence (Anderson et al. 1981) using Mitodesc, a software package developed and kindly made available by Dr. Francesc Calafell, Pompeu Fabra University, Barcelona. Descriptive statistics, pairwise difference comparisons, and polymorphic sites were calculated using both Mitodesc and the Arlequin software program (Excoffier et al. 1992).

The degree of population differentiation for the autosomal, Y-chromosome and mtDNA data sets was ascertained by analysis of molecular variance (AMOVA), calculated using Arlequin. Nei's standard distance (Ds) was calculated for autosomal and Y-chromosome STR data (Nei 1987) using the Microsat software package (Minch et al. 1997). Tamura and Nei's distance (Tamura and Nei 1993) was calculated for the mtDNA data using MEGA2.1 (Kumar et al. 2001). This program was also used to generate neighbour-joining (NJ) phylogenetic trees from the autosomal, Y-chromosome STR and mtDNA distance matrices. The statistical robustness of the trees was assessed with 1,000 bootstrap replicates (Felsenstein 1985).
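
For orientation, Nei's standard distance can be written as Ds = −ln(Jxy / √(Jx·Jy)), where Jx and Jy are the summed squared allele frequencies in each population and Jxy is the summed product of frequencies across populations. The sketch below is a minimal implementation under that definition; it is not the Microsat code, and the allele frequencies are invented for illustration.

```python
import math

def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance Ds between two populations.

    pop_x, pop_y: lists (one entry per locus) of dicts mapping allele -> frequency.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        alleles = set(fx) | set(fy)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
    # Averaging over loci cancels in the ratio, so raw sums are used.
    return -math.log(jxy / math.sqrt(jx * jy))

# Hypothetical single-locus allele frequencies (alleles labelled by repeat number).
pop_a = [{12: 0.5, 13: 0.5}]
pop_b = [{12: 0.9, 13: 0.1}]
d_same = nei_standard_distance(pop_a, pop_a)
d_diff = nei_standard_distance(pop_a, pop_b)
```

Identical frequency profiles give a distance of zero, and the distance grows as the profiles diverge.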


--------------------------------------------------------------------------------

Results
The three Muslim populations showed broadly similar allelic distributions across all ten autosomal loci. A total of 160 alleles were identified in the 212 individuals tested, with 53 alleles (33.1%) shared by all three communities. The mean number of alleles observed at each locus varied by STR marker, and ranged from 6.3 alleles for D13S126 to 18.0 for D13S133. On average, 1.7 community-specific alleles were detected at each locus.

As indicated in Table 1, significant deviations (P<0.05) from HWE were observed at six loci (D13S133, D13S192, D13S270, D15S108, D15S11, D15S98) in the Salar, seven loci (D13S133, D13S270, D15S11, D15S98, D15S101, D15S108, GABRB3) in the Bo'an, and two loci (D13S270, D15S108) in the Dongxiang. U-tests confirmed that all deviations were due to heterozygote deficiency. The mean Fis values, which describe the inbreeding effect within a sub-population, were 0.16 for Salar, 0.12 for Bo'an and 0.01 for the numerically larger Dongxiang community (Table 1). The components of autosomal genetic variation within and between the three communities examined by AMOVA indicated a negative inter-community variation of –2.2% (equal to an Fst of –0.02), suggesting they have similar autosomal gene pools (Table 2).
Table 1. Hardy-Weinberg equilibrium tests and Fis values in the Salar, Bo'an and Dongxiang
Locus      Salar               Bo'an               Dongxiang
           P value   Fis       P value   Fis       P value   Fis
D13S126    0.0804    –0.0237   0.1067    0.0739    0.5343    0.0394
D13S133    0.0000    0.1772    0.0000    0.2595    0.1849    –0.0133
D13S192    0.0000    0.1401    0.0817    0.0669    0.7687    –0.0008
D13S270    0.0000    0.3956    0.0175    0.0054    0.0000    –0.0387
D15S11     0.0003    0.3394    0.0268    –0.0386   0.4398    0.0089
D15S97     0.0553    0.2293    0.4492    0.0190    0.0935    0.0588
D15S98     0.0009    0.1656    0.0000    0.1027    0.1807    0.1100
D15S101    0.3771    0.0609    0.0083    0.1442    0.1019    –0.1246
D15S108    0.0016    0.2434    0.0022    0.2823    0.0000    0.1356
GABRB3     0.9682    –0.1742   0.0014    0.3330    0.7473    –0.0876
Mean       0.1484    0.1554    0.0694    0.1248    0.3051    0.0088
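
The mean Fis values quoted in the text can be checked directly against the per-locus entries of Table 1. The short sketch below does this with values transcribed from the table; the variable names are illustrative.

```python
from statistics import mean

# Per-locus Fis values transcribed from Table 1, in the table's locus order.
fis = {
    "Salar": [-0.0237, 0.1772, 0.1401, 0.3956, 0.3394,
              0.2293, 0.1656, 0.0609, 0.2434, -0.1742],
    "Bo'an": [0.0739, 0.2595, 0.0669, 0.0054, -0.0386,
              0.0190, 0.1027, 0.1442, 0.2823, 0.3330],
    "Dongxiang": [0.0394, -0.0133, -0.0008, -0.0387, 0.0089,
                  0.0588, 0.1100, -0.1246, 0.1356, -0.0876],
}

# Unweighted mean across the ten loci, rounded as in the table's final row.
mean_fis = {pop: round(mean(values), 4) for pop, values in fis.items()}
```

The unweighted means reproduce the final row of the table (0.1554, 0.1248 and 0.0088), which round to the 0.16, 0.12 and 0.01 cited in the text.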


Table 2. Structure of variance components for the Salar, Bo'an and Dongxiang
Gene pool       Variance components (AMOVA) (%)
                Within population    Between population
Autosome        102.17               –2.17
Y-chromosome    70.61                29.39
MtDNA           98.85                1.15
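
With a single level of grouping, the Fst values cited in the text correspond to the between-population share of the total variance in Table 2. The sketch below makes that conversion explicit; it assumes this simple relationship and does not reproduce the Arlequin calculation itself.

```python
# Between-population variance components (%) transcribed from Table 2.
between_pct = {"Autosome": -2.17, "Y-chromosome": 29.39, "MtDNA": 1.15}

# Fst taken as the between-population fraction of total variance.
fst = {pool: round(pct / 100.0, 2) for pool, pct in between_pct.items()}

# The within-population component is the remainder of the total (100%).
within_pct = {pool: round(100.0 - pct, 2) for pool, pct in between_pct.items()}
```

A slightly negative component, as for the autosomal markers, is usually read as indistinguishable from zero rather than as a real negative variance.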


The average gene diversity calculated for the Y-chromosome STRs was 0.59 for the Salar, 0.52 for the Bo'an, and 0.40 for the Dongxiang. Haplotypes were constructed from the Y-chromosome data (Table 3). Of the 39 haplotypes identified, 12% were shared by all three communities, 34% were shared by two of the three communities, and 54% were community-specific (Table 3). Haplotype diversity was 0.40 for the Salar, 0.45 for the Bo'an, and 0.38 for the Dongxiang. The Salar had the highest mean number of pairwise differences (1.76), followed by the Bo'an (1.57) and Dongxiang (0.81). AMOVA showed that 29.4% of variation was between-population, which gave an accumulated Fst value of 0.29 and indicated significant inter-population diversity (Table 2). The corresponding intra-population variation was 70.6%.
Table 3. Haplotype frequencies of the Y-chromosome STRs among the Salar, Bo'an and Dongxiang (–, no value given)
Haplotype  STRs (allele size in bp)            Population
           DYS19  DYS388  DYS389I  DYS393      Salar    Bo'an    Dongxiang
                                               (n=52)   (n=47)   (n=49)
H1         14     12      13       13          0.100    0.175    0.085
H2         14     13      12       12          –        0.025    0.085
H3         14     12      13       12          0.040    0.200    –
H4         14     12      12       12          0.100    0.150    0.064
H5         16     13      13       13          –        0.100    0.064
H6         16     12      12       13          0.040    0.025    0.191
H7         16     14      13       13          –        0.050    0.000
H8         17     12      13       14          0.060    0.075    0.021
H9         14     12      12       14          0.020    0.050    0.064
H10        14     14      14       14          –        0.025    –
H11        14     10      12       12          –        0.025    0.064
H12        14     14      13       12          –        0.075    0.043
H13        14     12      13       15          –        0.025    –
H14        14     12      14       11          0.100    –        –
H15        14     12      14       11          0.020    –        –
H16        16     13      13       11          0.020    –        –
H17        14     15      13       11          0.060    –        –
H18        14     12      12       11          0.160    –        0.064
H19        16     12      13       11          0.120    –        0.064
H20        14     11      14       11          0.020    –        –
H21        14     14      13       11          0.120    –        0.021
H22        14     10      12       11          0.040    –        0.021
H23        14     12      13       11          0.080    –        –
H24        14     14      13       15          0.020    –        0.021
H25        14     14      14       13          0.020    –        –
H26        14     14      14       11          0.020    –        –
H27        14     12      14       12          0.060    –        0.021
H28        14     10      13       12          –        –        0.021
H29        14     13      14       12          –        –        0.043
H30        14     15      13       12          –        –        0.021
H31        14     13      13       12          –        –        0.043
H32        14     13      14       14          –        –        0.021
H33        14     12      14       13          0.020    –        0.043
H34        14     12      14       13          0.020    –        0.064
H35        14     13      14       13          –        –        0.021
H36        14     15      14       12          –        –        0.021
H37        14     15      12       12          –        –        0.064
H38        14     11      12       12          –        –        0.064
H39        14     10      14       12          –        –        0.043
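
The sharing pattern described in the text can be tallied from Table 3. The sketch below records, for each haplotype, the communities in which a non-zero frequency is listed; this presence coding is a transcription made for illustration, and an empty cell or a frequency of 0.000 is treated as absence.

```python
from collections import Counter

# Presence of each Table 3 haplotype by community
# (S = Salar, B = Bo'an, D = Dongxiang).
presence = {
    "H1": "SBD", "H2": "BD", "H3": "SB", "H4": "SBD", "H5": "BD",
    "H6": "SBD", "H7": "B", "H8": "SBD", "H9": "SBD", "H10": "B",
    "H11": "BD", "H12": "BD", "H13": "B", "H14": "S", "H15": "S",
    "H16": "S", "H17": "S", "H18": "SD", "H19": "SD", "H20": "S",
    "H21": "SD", "H22": "SD", "H23": "S", "H24": "SD", "H25": "S",
    "H26": "S", "H27": "SD", "H28": "D", "H29": "D", "H30": "D",
    "H31": "D", "H32": "D", "H33": "SD", "H34": "SD", "H35": "D",
    "H36": "D", "H37": "D", "H38": "D", "H39": "D",
}

# Number of haplotypes found in one, two or all three communities.
sharing = Counter(len(pops) for pops in presence.values())
total = len(presence)
share_pct = {k: round(100 * v / total, 1) for k, v in sharing.items()}
```

On this transcription, 5 haplotypes occur in all three communities, 13 in two and 21 in only one (12.8%, 33.3% and 53.8% of 39), close to the 12%, 34% and 54% given in the text.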


Haplotype diversity for the Y-chromosome UEPs was 0.88 for the Salar, 0.86 for the Bo'an, and 0.87 for the Dongxiang, comparable to the mean regional diversities reported for North East Asia (0.84), South East Asia (0.86), and Central Asia (0.86) (Karafet et al. 2001). Ten haplotypes were observed in each of the three Muslim populations, with eight or nine haplotypes shared between the different communities. The haplotype frequencies in the present study populations and selected populations from Asia (Karafet et al. 2001; Wells et al. 2001) are shown graphically in Fig. 2, with haplotypes grouped according to Underhill et al. (2001). The two most common haplotypes in the Salar, Bo'an and Dongxiang were M122 (including M134 and LINE-1/M159), with frequencies ranging from 24–30%, and M17, which occurs at similar frequencies in the Bo'an (26%) and Dongxiang (28%), and at a slightly lower frequency in the Salar (17%).

Fig. 2. Y-chromosome UEP haplotype frequencies in three Chinese Muslim populations (Salar, n=46; Bo'an, n=35; Dongxiang, n=46) and seven previously described Asian populations (Karafet et al. 2001; Wells et al. 2001). Haplotypes are grouped according to Underhill et al. (2001). For comparison purposes, YAP also includes markers M96, M174 and M15. Haplotype M130 (RPS4Y) (phylogenetically equivalent to M216) includes M217 and M48. Haplotype M89 (equivalent to M213 and P14 (DYS188)) includes M52, P15 (DYS221) and p12f2. Haplotype M9 includes M20 and M46 (Tat). Haplotype M175 (equivalent to M214) includes P31 (ARSEP71227) and M176 (SRY465). Haplotype M122 includes M134 and LINE-1. Haplotype M45 (equivalent to P27 (DYS257)) includes M124 and M207 (UTY2). Haplotype M173 includes P25 (DYS194) and M17 includes SRY10831 and M87

--------------------------------------------------------------------------------

M122 haplotypes are found at high frequency in North East Asia (22%), but are relatively rare in Central Asia (3% on average, although higher in some populations of Uzbekistan and the Uygurs) (Karafet et al. 2001; Wells et al. 2001). The M17 (or SRY-1532, also known as SRY10831) haplotypes, on the other hand, are found at high frequency throughout Central Asia (26%), but are rare in North East Asia (5%). Because the Bo'an and Dongxiang are Mongolian-speaking, M130 (including M217) was another haplotype of potential interest; it occurs at frequencies of >50% in North Asia (Mongolians and Siberians), at moderate frequency in Central Asia (25%), and at lower frequency in the Hui (17%) and Northern Han (5%) (Karafet et al. 2001). In fact, this haplotype was rare in the Salar (7%) and Bo'an (3%) and absent from the Dongxiang (0%).

A 360-nucleotide sequence of HV-I in the mitochondrial D-loop (position 16,024–16,383) was analysed in 30 samples, ten from each community. As indicated in Table 4, at least two polymorphic sites were common in all samples. These comprised a site characterised by a T at position 16,223 (Salar, 90%; Bo'an, 70%; Dongxiang, 70%), and a C at position 16,362 (Salar 60%; Bo'an, 70%; Dongxiang, 30%). Of the 30 mtDNA sequences examined, 23 were unique, two were found twice in the Bo'an and Salar, and one sequence was found three times in the Salar (Table 4). A total of 44 polymorphic sites were identified among the three communities, equivalent to 12.2% of the overall 360-bp sequence. The pairwise mtDNA nucleotide differences for the Salar, Bo'an and Dongxiang were 5.70, 6.42 and 5.90 (SD=2.47), respectively. The mtDNA AMOVA results indicated that inter-community variation was low (1.2%), with 98.8% diversity distributed within the communities (Table 2).
Table 4. Comparative alignment of mitochondrial DNA sequences

As shown in Fig. 3, the three communities branched differently when compared with two Chinese reference populations, the Hui and Han (Black et al. 2001). The mtDNA tree placed the Salar, Bo'an and Dongxiang equidistantly in a star-like shape (Fig. 3a), whereas the autosomal and Y-chromosome STR trees both showed deep divergence (Fig. 3b, c). Genetic distances estimated from the autosomal and Y-chromosome STRs and mtDNA differed significantly, with the differences in branch patterns most striking between the Y-chromosome STRs and mtDNA data (Fig. 3).

Fig. 3. Unrooted neighbour-joining trees showing mtDNA (a), autosomal (b) and Y-chromosome (c) phylogenetic affinities between three co-resident populations, the Salar, Bo'an, Dongxiang and the Han and Hui reference populations, based on Nei's distance

--------------------------------------------------------------------------------


--------------------------------------------------------------------------------

Discussion
The elevated homozygosity levels and the significant deviations from HWE observed at multiple loci in the Salar, Bo'an and Dongxiang are suggestive of high levels of endogamous and/or consanguineous marriage (Table 1). This is consistent with a previous genealogical study, which reported that all three Muslim populations were endogamous, with mean coefficient of inbreeding (α) values of 0.0023–0.00558 (Du and Zhao 1981).

None of the ten loci investigated showed significant differences in allele size distribution and, for the most part, mean allele numbers were similar throughout the three populations. These findings add weight to the concept of shared genetic identity at autosomal loci. There were several high frequency shared alleles distributed across all three populations, e.g., D13S126, 104 bp; D13S133, 132 bp; D13S192, 103 bp; D13S270, 81 bp; D15S108, 145 bp; D15S11, 243 bp; D15S97, 183 bp; D15S98, 157 bp; GABRB3, 185 bp. These alleles may be older in evolutionary terms, i.e. they may have existed prior to the subdivision of modern human populations. Alternatively, they could have originated in the most recent common ancestors (MRCA) of the three populations. The latter explanation may be more probable given the claimed Han background of the founding females in each community (Wong and Dajani 1988; Du and Yip 1993; Rahman 1997). It is further supported by the negative inter-community Fst value (Table 2), and the similar allelic distributions of the three populations when compared with Han Chinese across the ten STR loci tested (Black et al. 2001).

AMOVA showed that the inter-population variation of Y-chromosome STRs greatly exceeded that for autosomes and mtDNA. Lower male-transgeneration migration and/or patrilocality have been identified as major genetic forces for the higher Fst values in Hindu castes and other populations (Bamshad et al. 1998; Seielstad et al. 1998; Hammer et al. 2001). However, in the present study, besides patrilocality it seems that diverse paternal lineages and a shared maternal gene pool have contributed significantly to the observed differences in Y-chromosome, autosome and mtDNA Fst values in the three communities.

The high levels of unique Y-chromosome STR haplotypes within each community (54% of the haplotypes were community-specific) could not have arisen solely from random mutational events, given an average mutation rate of ~0.18×10⁻² to 10⁻³ for the STR loci (Heyer et al. 1997) and the relatively short time frame of 1,350 years (~50 generations) since the introduction of Islam into China in 651 AD (Wong and Dajani 1988). This time period would, however, have been sufficient for founder effect, genetic drift and preferential inbreeding to develop the high degree of genetic differentiation observed between the males of the three populations.
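
The scale of the mutation argument can be illustrated with a back-of-envelope calculation. The sketch below assumes that the quoted range corresponds to per-locus, per-generation rates between 1.0×10⁻³ and 1.8×10⁻³, that the four Y-chromosome STRs mutate independently at a constant rate, and that 50 generations have elapsed; the function name and these simplifications are illustrative and are not part of the original analysis.

```python
def prob_unmutated(mu, loci=4, generations=50):
    """Probability that a haplotype of `loci` STRs passes through
    `generations` father-to-son transmissions with no mutation at any locus,
    assuming a constant, independent per-locus rate `mu`."""
    return (1.0 - mu) ** (loci * generations)

# Bounds of the per-locus mutation-rate range, as read from the text.
for mu in (1.0e-3, 1.8e-3):
    p = prob_unmutated(mu)
    expected = 4 * 50 * mu  # expected number of mutation events per lineage
    print(f"mu={mu:.1e}: P(no mutation)={p:.2f}, expected mutations={expected:.2f}")
```

Under these assumptions most male lineages would be expected to carry an unchanged four-locus haplotype after 50 generations, which is in line with the argument that mutation alone is unlikely to account for the proportion of community-specific haplotypes.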

In contrast to the Y-chromosome STRs, the more ancient Y-chromosome lineages represented by UEPs occur at quite similar frequencies in the three populations, with 80–90% of the haplotypes in common, suggesting shared male origins on an evolutionary time scale. The high frequency of the M17 haplotype is in keeping with a Central Asian origin, with frequencies similar to those of the Altaic/Turkic-speaking Uzbeks and Uygurs (Wells et al. 2001). The frequency decreases eastward across Siberia and Mongolia, with a low frequency in North East Asia (Hammer et al. 2001; Karafet et al. 2001; Wells et al. 2001). An analogue of the M17 haplotype (also known as SRY-1532 or SRY10831) is also seen at high frequency in Eastern Iranian populations and may have spread from Central Asia into modern Iran, Pakistan and northern India (Quintana-Murci et al. 2001; McElreavey and Quintana-Murci 2002). This is consistent with the claimed historical origins of Muslim minorities in PR China (Gladney 1996; Wong and Dajani 1988), but suggests that the Bo'an and Dongxiang acquired the Mongolian language with relatively little Mongolian genetic admixture.

The mtDNA profiles in the three Muslim communities were comparable to those of Han Chinese and Central Asian populations, in particular the C→T transition at position 16,223 and the T→C transition at position 16,362 (Comas et al. 1998). The average mtDNA nucleotide diversity index for the three populations was 0.017, which is similar to the values reported for East Asian (0.017), Mongolian (0.018) and Turkish (0.015) populations (Comas et al. 1998). The levels of nucleotide diversity within each community were higher than expected for isolated endogamous communities, perhaps reflecting the prevalence of the two common mutations at positions 16,223 and 16,362.
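
The relationship between the pairwise differences reported in the Results and the diversity index quoted here can be sketched as follows. The calculation assumes that nucleotide diversity is approximated by the mean number of pairwise differences divided by the length of the sequence analysed; the software used in the study may apply a different estimator.

```python
# Mean pairwise nucleotide differences reported in the Results, and the
# length of the HV-I segment analysed (positions 16,024-16,383).
pairwise = {"Salar": 5.70, "Bo'an": 6.42, "Dongxiang": 5.90}
SEQ_LENGTH = 16383 - 16024 + 1  # 360 bp

# Nucleotide diversity approximated as mean pairwise differences per site.
pi = {pop: k / SEQ_LENGTH for pop, k in pairwise.items()}
mean_pi = sum(pi.values()) / len(pi)
mean_pairwise = sum(pairwise.values()) / len(pairwise)
```

On this approximation the three-community average is about 0.017 per site, and the mean number of pairwise differences is about 6.0.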

According to recorded history, the Salar, Bo'an and Dongxiang communities have been largely sedentary and intra-community marriage was traditional (Gladney 1996, 1998; Ma 1998). Endogamous marriage would promote low mean pairwise nucleotide differences between the individual mtDNA sequences of each population, and the level observed (6.0) was indeed lower than in other Middle Eastern, European and Central Asian populations (7.0–14.0) (Comas et al. 1998; Budowle et al. 1999).

In summary, analysis of the UEPs indicates that all three Muslim populations shared common ancient origins, but the Y-chromosome STR data show that their male gene pools were significantly altered by subsequent historical events, resulting in the establishment of community-specific Y-chromosome STR haplotypes. In contrast, the close genetic distances between the mtDNA profiles of the three communities suggest very similar female founder gene pools, mostly Chinese Han females. Only detailed demographic histories for the three populations can identify the main driving forces for the differences between the autosomes, mtDNA and Y-chromosomes. This is especially pertinent in Chinese Muslim populations given the founder hypotheses associated with the Y-chromosome and mtDNA lineages. Historical accounts suggest that it was predominantly the transit of males into China and their partial incorporation into Chinese society that led to the formation of the majority of present-day ethnic minorities. Besides the observed common genetic background of ancient UEPs, these historical sources are validated by the high heterogeneity of the Y-chromosomal STRs and the cohesive female lineages of the three Muslim populations.

Acknowledgements The cooperation of the Salar, Bo'an and Dongxiang communities is acknowledged with gratitude. Assistance in sample collection was provided by the Institute of Genetics, Chinese Academy of Sciences. Thanks go to Miss S. G. Sullivan for her technical support in the genotyping work. Financial support was provided by an Australian Research Council Small Grant (number A350 352).


--------------------------------------------------------------------------------

References
Anderson S, Bankier AT, Barrell BG, de Bruijn MH, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJ, Staden R, Young IG (1981) Sequence and organization of the human mitochondrial genome. Nature 290:457–465

Bamshad MJ, Watkins WS, Dixon ME, Bhaskara BR, Naidu JM, Rasanayagam A, Hammer ME, Jorde LB (1998) Female gene flow stratifies Hindu castes. Nature 395:651–652

Bittles AH, Neel JV (1994) The costs of human inbreeding and their implications for variations at the DNA level. Nat Genet 8:117–121

Black ML, Wang W, Bittles AH (2001) A genome-based study of the Muslim Hui and the Han population of Liaoning province, PR China. Hum Biol 73:801–803

Budowle B, Wilson MP, DiZinno JA, Stauffre C, Fasano MA, Holland MM, Monson KL (1999) Mitochondrial DNA regions HVI and HVII population data. Forensic Sci Int 103:23–25

Comas D, Calafell F, Mateu E, Perez-Lezaun A, Bosch E, Martinez-Arias R, Clarimon J, Facchini F, Fiori G, Luiselli D, Pettener D, Bertranpetit J (1998) Trading genes along the Silk Road: mtDNA sequences and origin of Central Asian populations. Am J Hum Genet 63:1824–1838

Chu JY, Huang W, Kuang SQ, Wang JM, Xu JJ, Chu ZT, Yang ZQ, Lin KQ, Li P, Wu M, Geng ZC, Tan CC, Du RF, Jin L (1998) Genetic relationship of populations in China. Proc Natl Acad Sci USA 95:11763–11768

de Knijff P, Kayser M, Caglia A, Corach D, Fretwell N, Gehrig C, Graziosi G, Heidorn F, Herrmann S, Herzog B, Hidding M, Honda K, Jobling M, Krawczak M, Leim K, Meuser S, Meyer E, Oesterreich W, Pandya A, Parson W, Penacino G, Perez-Lezaun A, Piccinini A, Prinz M, Roewer L, et al. (1997) Chromosome Y microsatellites: population genetic and evolutionary aspects. Int J Legal Med 110:134–149

Ding YC, Wooding S, Harpending HC, Chi HC, Li HP, Fu YX, Pang JF, Yao YG, Yu JG, Moyzis R, Zhang Y (2000) Population structure and history in East Asia. Proc Natl Acad Sci USA 97:14003–14006

Du R, Yip VF (1993) Ethnic groups in China. Science Press, Beijing and New York

Du R, Zhao ZL (1981) Percentage and types of consanguineous marriages of different nationalities and regions in China (in Chinese). Natl Med J China 61:723–728

Ebrey PB (1999) Cambridge illustrated history of China. Cambridge University Press, London

Excoffier L, Smouse PE, Quattro J (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 131:479–491

Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791

Family Planning Commission (1997) Chinese family planning yearbook 1997. Family Planning Commission, Beijing

Gladney DC (1996) Muslim Chinese: ethnic nationalism in the People's Republic. Harvard University Press, Cambridge, Massachusetts

Gladney DC (1998) Ethnic identity in China: the making of a Muslim minority nationality. Harcourt Brace, Fort Worth, Texas

Guo S, Thompson EA (1992) Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics 48:361–372

Grant JC, Bittles AH (1997) The comparative role of consanguinity in infant and child mortality in Pakistan. Ann Hum Genet 61:143–149

Haff LA, Smirnov IP (1997) Single-nucleotide polymorphism identification assays using a thermostable DNA polymerase and delayed extraction MALDI-TOF mass spectrometry. Genome Res 7:378–388

Hammer MF, Karafet TM, Redd AJ, Jarjanazi H, Santachiara-Benerecetti S, Soodyall H, Zegura SL (2001) Hierarchical patterns of global human Y-chromosome diversity. Mol Biol Evol 18:1189–1203

Heyer E, Puymirat J, Dielties P, Bakker E, de Knijff P (1997) Estimating Y chromosome specific microsatellite mutation frequency using deep rooting pedigrees. Hum Mol Genet 6:799–803

Hopgood R, Sullivan KM, Gill P (1992) Strategies for automated sequencing of human mitochondrial DNA directly from PCR products. Biotechniques 13:82–92

Karafet T, Xu L, Du R, Wang W, Feng S, Wells RS, Redd AJ, Zegura SL, Hammer MF (2001) Paternal population history of East Asia: sources, patterns, and microevolutionary processes. Am J Hum Genet 69:615–628

Kumar S, Tamura K, Jakobsen IB, Nei M (2001) MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244–1245

Leslie DD (1986) Islam in traditional China: a short history to 1800. Canberra College of Advanced Education, Canberra

Lipman JN (1997) Familiar strangers: a history of Muslims in Northwest China. Hong Kong University Press, Hong Kong

Ma P (1998) The taboo of the Hui woman to marry non-Hui men in the Hui people in Northwest China. The 14th International Congress of Anthropological and Ethnological Sciences, p 231

McElreavey K, Quintana-Murci L (2002) Understanding inherited disease through human migrations: a south-west Asian perspective. Community Genet 5:153–156

Minch E, Ruiz-Linares A, Goldstein D, Feldman M, Cavalli-Sforza LL (1997) Microsat v.1.5d: a computer program for calculating various statistics on microsatellite allele data (http://lotka.stanford.edu/microsat/microsat.html)

Nei M (1987) Molecular evolutionary genetics. Columbia University Press, New York

Quintana-Murci L, Krausz C, Zerjal T, Sayar SH, Hammer MF, Mehdi SQ, Ayub Q, Qamar R, Mohyuddin A, Radhakrishna U, Jobling MA, Tyler-Smith C, McElreavey K (2001) Y-chromosome lineages trace diffusion of people and languages in Southwestern Asia. Am J Hum Genet 68:537–542

Rahman YA (1997) Islam in China, http://www.erols.com/ameen/islchina

Rousset F (1995) Population genetics software for exact tests and ecumenicalism. J Hered 83:239

Rousset F, Raymond M (1995) Testing heterozygote excess and deficiency. Genetics 140:1413–1419

Seielstad MT, Minch E, Cavalli-Sforza LL (1998) Genetic evidence for a higher female migration rate in humans. Nat Genet 20:278–280

Stoneking M (1998) Women on the move. Nat Genet 20:219–220

Su B, Xiao J, Underhill P, Deka R, Zhang W, Akey J, Huang W, Shen D, Lu D, Luo J, Chu J, Tan J, Shen P, Davis R, Cavalli-Sforza L, Chakraborty R, Xiong M, Du R, Oefner P, Chen Z, Jin L (1999) Y chromosome evidence for a northward migration of modern humans in East Asia during the last ice age. Am J Hum Genet 65:1718–1724

Tamura K, Nei M (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 10:512–526

Underhill PA, Passarino G, Lin AA, Shen P, Lahr MM, Foley RA, Oefner PJ, Cavalli-Sforza LL (2001) The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations. Ann Hum Genet 65:43–62

Wang W, Sullivan SG, Ahmed A, Chandler D, Zhivotovsky LA, Bittles AH (2000) A genome-based study of consanguinity in three co-resident endogamous Pakistan communities. Ann Hum Genet 64:41–49

Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution 38:1358–1370

Wells RS, Yuldasheva N, Ruzibakiev R, Underhill PA, Evseeva I, Blue-Smith J, Jin L, Su B, Pitchappan R, Shanmugalakshmi S, Balakrishnan K, Read M, Pearson NM, Zerjal T, Webster MT, Zholoshvili I, Jamarjashvili E, Gambarov S, Nikbin B, Dostiev A, Aknazarov O, Zalloua P, Tsoy I, Kitaev M, Mirrakhimov M, Chariev A, Bodmer WF (2001) The Eurasian heartland: a continental perspective on Y-chromosome diversity. Proc Natl Acad Sci USA 98:10244–10249

Wong HM, Dajani AA (1988) Islamic frontiers in China. Scorpion, London