Genetic epidemiology of single-nucleotide polymorphisms.

Collins A, Lonjou C, Morton NE.

Human Genetics, University of Southampton, Southampton General Hospital, Tremona Road, Southampton SO16 6YD, United Kingdom.

Under the causal hypothesis, most genetic determinants of disease are single-nucleotide polymorphisms (SNPs) that are likely to be selected as markers for positional cloning. Under the proximity hypothesis, most disease determinants will not be included among the markers but may be detected through linkage disequilibrium with other SNPs. In that event, allelic association among SNPs is an essential factor in positional cloning. A recent simulation based on monotonic population expansion suggests that useful association does not usually extend beyond 3 kb. This is contradicted by significant disequilibrium observed at much greater distances, which correspondingly reduces the number of SNPs required for a cost-effective genome scan. A plausible explanation is that cyclical expansions follow population bottlenecks that establish new disequilibria. Data on more than 1,000 locus pairs indicate that most disequilibria trace to the Neolithic, with no apparent difference between haplotypes that are random and those selected through a major disease gene. Short duration may be characteristic both of alleles contributing to disease susceptibility and of haplotypes characteristic of particular ethnic groups. Alleles that are highly polymorphic in all ethnic groups may be older, neutral, or advantageous, in weak disequilibrium with nearby markers, and therefore less useful for positional cloning of disease genes. Significant disequilibrium at large distances makes the number of suitably chosen SNPs required for genome screening as small as 30,000, or 1 per 100 kb, with greater density (including less common SNPs) reserved for candidate regions.
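
The closing estimate is simple arithmetic and can be checked with a minimal sketch. It assumes a haploid human genome of roughly 3,000,000 kb (a figure not stated in the abstract) and evenly spaced markers; the name snps_for_scan is hypothetical and chosen only for illustration.

```python
import math

# Assumption: haploid human genome of about 3 Gb, expressed in kilobases.
# This value is not given in the abstract.
GENOME_KB = 3_000_000


def snps_for_scan(spacing_kb, genome_kb=GENOME_KB):
    """Return the number of evenly spaced SNPs needed to cover the genome
    at one marker per `spacing_kb` kilobases (rounded up)."""
    if spacing_kb <= 0:
        raise ValueError("spacing_kb must be positive")
    return math.ceil(genome_kb / spacing_kb)


# One SNP per 100 kb, the density quoted in the abstract.
print(snps_for_scan(100))  # 30000

# One SNP per 3 kb, the range suggested by the simulation.
print(snps_for_scan(3))    # 1000000
```

Under the same assumptions, a 3 kb limit on useful association would call for about one million SNPs, against 30,000 at 1 per 100 kb, which illustrates the scale of the reduction described above.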