Linking Italy and the Balkans. A Y-chromosome perspective from the Arbereshe of Calabria.

From the paper:

The Arbereshe are one of the largest linguistic minorities in Italy. They are the result of complicated movements of Albanians around the end of the 15th and beginning of the 16th century, often linked to the invasion of the Balkans by the Ottoman Empire. Despite that, it is generally agreed that most of the immigrants started moving from the south of Albania (Toskeria), with, very often, intermediate steps in Greece, particularly in the Peloponnese (Zangari 1941). Further evidence is provided by linguistic research, according to which Arberisht, the language spoken by Arbereshe, is part of the Tosk dialect group of Albanian, a language originally spoken in Toskeria (Babiniotis 1998).

On the sample:

The Arbereshe Y-chromosome variation was investigated by sampling individuals from different villages of the Pollino area (Calabria) who bear one of the founding surnames of the population. The genotyping was performed using 12 microsatellites (STRs) and 31 unique event polymorphisms (UEPs), defining, respectively, haplotypes and haplogroups.

The Italian and Balkan genetic backgrounds were explored using the large amount of data provided by recent Y-chromosome studies in the two peninsulas and by literature data on STRs from forensic research.

Comparison of Y-haplogroup frequency and diversity between Albanians from Tirana and Arbereshe from Calabria (from Table III):

The presence of F*(xG,I,J,K) in Albanians is interesting, as this paragroup also occurs in Romania and Bosnia-Herzegovina (all groups) and in South Apulia. It could potentially be haplogroup H and may reflect a Gypsy element that was not present when the Arbereshe moved to Italy from the Balkans.

Haplogroup I shows similar frequencies, but:
I-M170 is the most common Balkan haplogroup (Pericic et al. 2005a,b) and the second most frequent Arbereshe clade. Nevertheless, analysis of its network reveals unexpected results: most of the Arbereshe I-M170 haplotypes are not included in the Balkan cluster (Figure 3), but are located in the long branches containing mainly Italian chromosomes.

Comparisons with literature data (Semino et al. 2000; Barac et al. 2003; Rootsi et al. 2004) show that the core haplotype of the Balkan cluster (16-14-15-13-31-24-11-11-13; locus order as above) is consistent with the almost Balkan exclusive I2a (formerly I1b) clade. The proposed interpretation of the Arbereshe as a proxy of the founder Albanian population leads us to hypothesize that the I2a clade was less common in the southern Balkans 500 years ago than nowadays. The very tight shape of the I2a cluster in the network suggests a very recent expansion of this haplogroup in the southern Balkans. Furthermore, I2a is still rare in mountain populations such as the Albanians of Kosovo (Pericic et al. 2005a,b) and in a randomly selected Arbereshe sample from Rootsi et al. (2004).
This is an interesting finding in the light of recent evidence for selection in Y-haplogroup I.

The situation with J2 is also quite interesting, as it is rarer in the Arbereshe (3%) than in Albanians (17%):

The scarcity of J2 chromosomes in the Arbereshe sample (1/40) is very difficult to explain, given that they are very common in both the Italian peninsula and the southern Balkans. Literature data on J2 indicate that most of the haplotypes included in the Balkan (B) cluster of the network (Figure 3) have an STR configuration consistent with the J2-M12 sub-clade (Di Giacomo et al. 2004; Semino et al. 2004; Cruciani et al. 2007). In contrast, most of the haplotypes in the other clusters agree with the STR configuration given for the J2-M67 clade, with its sub-clade J2-M92 (Di Giacomo et al. 2004). It is unconvincing to attribute the rarity of J2 in the Arbereshe to random sampling or to the effect of genetic drift.

Furthermore, the Arbereshe sample analysed by Semino et al. (2004) also completely lacks the typically Balkan J2-M12 chromosomes. If we interpret our Arbereshe sample as representative of the founding Albanian population, we may hypothesize that the J2 haplogroup was considerably less diffuse in the southern Balkans five centuries ago than today.
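
As a rough check on the sampling argument, one can ask how likely it would be to draw at most one J2 chromosome in a sample of 40 if the founders had carried J2 at the modern Albanian frequency. The sketch below is my own back-of-the-envelope calculation, not the authors' analysis. It assumes a simple binomial model, treats the 17% Tirana figure as the true founder frequency, and ignores both drift since the migration and the sampling error in the Tirana estimate; the function and variable names are illustrative.

```python
from math import comb


def prob_at_most(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Observed in the paper: 1 J2 chromosome among 40 Arbereshe.
n_sampled = 40
j2_observed = 1

# Assumption: the founders carried J2 at the modern Tirana frequency (17%).
assumed_freq = 0.17

p_sampling = prob_at_most(j2_observed, n_sampled, assumed_freq)
print(f"P(at most {j2_observed} J2 in {n_sampled}) = {p_sampling:.4f}")
```

Under these assumptions the probability comes out at roughly half a percent, which agrees with the authors' view that random sampling alone is an unconvincing explanation. Drift in a small founding population is a separate question that this simple model does not address.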

What we can conclude from this study is that the founding Albanian population was J2- and I2a-lite compared to modern Albanians. The source of the I2a seems to be the Albanization of people from the West Balkans, selection, or both, although it is difficult to see how its frequency could have increased so massively in only five centuries. The I2a deficiency of the Arbereshe also lends support to the theory that the Albanians are relatively recent arrivals from the northeast; this theory has been upheld in the past on the basis of (i) their historical obscurity until the last millennium, and (ii) the paucity of native sea terms and Greek loanwords in Albanian, which is difficult to explain if Albanians always occupied their current location on the Adriatic.

The source of J2 is less clear, and could be either the Albanization of Greeks (the only Balkan population with a sizeable J2 frequency) or remnants of Muslim Anatolians from Ottoman times. However, modern Albanians belong mainly to clade J2b, while Anatolians belong to J2a. Thus, I tend to dismiss the Anatolian connection.

The low frequency of R1*(xR1a1) in the Arbereshe, together with the high E1b1b1a frequency, is quite convincing evidence of the Balkan origins of this population.