Recently many investigators have used microsatellite DNA loci for studying the evolutionary relationships of closely related populations or species, and some authors have proposed new genetic distance measures for this purpose. However, the efficiencies of these distance measures in obtaining the correct tree topology remain unclear. We therefore investigated the probability of obtaining the correct topology (PC) for these new distances as well as traditional distance measures by using computer simulation. We used both the infinite-allele model (IAM) and the stepwise mutation model (SMM), which seem to be appropriate for classical markers and microsatellite loci, respectively. The results show that in both the IAM and SMM, CAVALLI-SFORZA and EDWARDS' chord distance (DC) and NEI et al.'s DA distance generally show higher PC values than other distance measures, whether the bottleneck effect exists or not. For estimating evolutionary times, however, NEI's standard distance and GOLDSTEIN et al.'s (δμ)² are more appropriate than other distances. Microsatellite DNA seems to be very useful for clarifying the evolutionary relationships of closely related populations.
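The four distance measures named above can be illustrated with a minimal sketch. It assumes the textbook definitions of DA, DC, NEI's standard distance, and (δμ)², and it represents each population as a list of loci, each locus being a dict from allele (repeat number) to frequency. The function names and the toy frequencies are hypothetical, and no sample-size bias correction is applied.

```python
import math

def _sqrt_overlap(locus_x, locus_y):
    """Sum over alleles of sqrt(x_i * y_i) at one locus."""
    alleles = set(locus_x) | set(locus_y)
    return sum(math.sqrt(locus_x.get(a, 0.0) * locus_y.get(a, 0.0)) for a in alleles)

def nei_da(pop_x, pop_y):
    """DA = 1 - (1/r) * sum over loci of sum_i sqrt(x_i * y_i)."""
    r = len(pop_x)
    return 1.0 - sum(_sqrt_overlap(x, y) for x, y in zip(pop_x, pop_y)) / r

def chord_dc(pop_x, pop_y):
    """DC = (2 / (pi * r)) * sum over loci of sqrt(2 * (1 - sum_i sqrt(x_i * y_i)))."""
    r = len(pop_x)
    # max(0, ...) guards against tiny negative values from rounding.
    total = sum(math.sqrt(2.0 * max(0.0, 1.0 - _sqrt_overlap(x, y)))
                for x, y in zip(pop_x, pop_y))
    return 2.0 / (math.pi * r) * total

def nei_standard(pop_x, pop_y):
    """D = -ln(Jxy / sqrt(Jx * Jy)); undefined if no alleles are shared at any locus."""
    r = len(pop_x)
    jx = sum(sum(p * p for p in x.values()) for x in pop_x) / r
    jy = sum(sum(q * q for q in y.values()) for y in pop_y) / r
    jxy = sum(sum(p * y.get(a, 0.0) for a, p in x.items())
              for x, y in zip(pop_x, pop_y)) / r
    return -math.log(jxy / math.sqrt(jx * jy))

def delta_mu_sq(pop_x, pop_y):
    """(delta mu)^2: mean over loci of the squared difference in mean allele size."""
    def mean_size(locus):
        return sum(size * p for size, p in locus.items())
    r = len(pop_x)
    return sum((mean_size(x) - mean_size(y)) ** 2 for x, y in zip(pop_x, pop_y)) / r

# Toy data (invented): two loci, alleles keyed by repeat number.
pop_a = [{10: 0.5, 11: 0.5}, {20: 1.0}]
pop_b = [{10: 0.5, 11: 0.5}, {22: 1.0}]
```

In this toy example the two populations are identical at the first locus and share no alleles at the second. DA and DC depend only on allele sharing, whereas (δμ)² also uses the size difference between alleles.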
Currently, there is a demand in many fields of research for software with an easily accessible interface to analyze polymorphism data such as microsatellite DNA and single nucleotide polymorphisms. In this article, we announce POPTREE2, a computer program package that can perform evolutionary analyses of allele frequency data. The original version (POPTREE) was a command-line program that runs from the command prompt on Windows and Unix. In POPTREE2, genetic distances (measures of the extent of genetic differentiation between populations) for constructing phylogenetic trees, average heterozygosities (H) (a measure of genetic variation within populations), and G(ST) (a measure of genetic differentiation of subdivided populations) are computed through a simple and intuitive Windows interface. It will facilitate statistical analyses of polymorphism data for researchers in many different fields. POPTREE2 is available at http://www.med.kagawa-u.ac.jp/~genomelb/takezaki/poptree2/index.html.
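The two summary statistics mentioned above, H and G(ST), can be sketched in a few lines. This is an illustration and not POPTREE2's implementation: it assumes the simple (biased) estimators, equal weights for all subpopulations, and G(ST) computed as (HT - HS) / HT after summing HS and HT over loci. All function names and the toy data are hypothetical.

```python
def heterozygosity(locus):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in locus.values())

def average_heterozygosity(pop):
    """Average heterozygosity H over all loci of one population."""
    return sum(heterozygosity(locus) for locus in pop) / len(pop)

def gst(pops):
    """G(ST) = (HT - HS) / HT, with HS and HT summed over loci.

    pops: list of populations; each population is a list of loci,
    each locus a dict mapping allele -> frequency.
    Raises ZeroDivisionError if every locus is fixed for the same allele.
    """
    n_loci = len(pops[0])
    hs_total = 0.0
    ht_total = 0.0
    for j in range(n_loci):
        loci = [pop[j] for pop in pops]
        # HS: mean heterozygosity within subpopulations.
        hs_total += sum(heterozygosity(locus) for locus in loci) / len(loci)
        # HT: heterozygosity of the unweighted mean allele frequencies.
        alleles = set().union(*loci)
        mean_freq = {a: sum(locus.get(a, 0.0) for locus in loci) / len(loci)
                     for a in alleles}
        ht_total += heterozygosity(mean_freq)
    return (ht_total - hs_total) / ht_total

# Toy data (invented): one locus, two subpopulations.
fully_differentiated = [[{"A": 1.0}], [{"B": 1.0}]]
identical = [[{"A": 0.5, "B": 0.5}], [{"A": 0.5, "B": 0.5}]]
```

With these toy inputs, `gst(fully_differentiated)` is at its maximum and `gst(identical)` is zero, which brackets the range of the statistic.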
In recent years, copy number variation (CNV) of DNA segments has become a hot topic in the study of genetic variation, and a large number of CNVs have been uncovered in human populations. The CNVs involving the smallest units of DNA segments are microsatellite DNAs, and the evolutionary change of microsatellite DNAs is believed to occur mostly by the increase or decrease of one repeat unit at a time in a more or less neutral fashion. If we note that eukaryotic genomes contain millions of microsatellite loci, this pattern of nucleotide change is expected to generate random changes of genome size, that is, genomic drift, and will provide a neutral model of CNV evolution. We therefore investigated the amount of variation of the total number of repeats (TNR) per individual across 145 microsatellite loci in three human populations: Africans, Europeans, and Asians. It was shown that the TNR follows the normal distribution in all three populations and that the extent of variation of TNR is more than 50% greater in Africans than in Europeans and Asians, as expected from the hypothesis of African origin of modern humans. If we consider all microsatellite loci in the human genome and compute the variation of the total number of nucleotides involved (TNN), it is possible to study the contribution of microsatellite loci to the genome size variation. This study has shown that the genome sizes of human individuals are affected considerably by genomic drift of microsatellite DNA alone. This pattern of evolution is similar to that of olfactory receptor (OR) genes previously studied in human populations and supports the idea that the number of OR genes has evolved in a more or less neutral fashion. However, this conclusion does not necessarily apply to the genomewide CNVs of various DNA segments, and it appears that long variant DNA fragments are deleterious and under purifying selection.
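The TNR statistic described above can be sketched as follows. The sketch assumes that each individual's genotype is given as one pair of repeat counts per locus (diploid data) and that the TNR is the sum of both alleles over all loci. The function names and the three toy genotypes are invented for illustration and are not data from the study.

```python
from statistics import mean, variance

def total_repeats(genotype):
    """TNR of one individual: sum of repeat counts of both alleles over all loci."""
    return sum(a + b for a, b in genotype)

def tnr_summary(individuals):
    """Return (mean, sample variance) of TNR across the individuals of a population."""
    tnr = [total_repeats(g) for g in individuals]
    return mean(tnr), variance(tnr)

# Toy population (invented): three individuals typed at two loci.
toy_population = [
    [(10, 11), (20, 20)],
    [(10, 10), (21, 20)],
    [(11, 11), (20, 22)],
]
tnr_mean, tnr_var = tnr_summary(toy_population)
```

Comparing `tnr_var` between populations is the kind of contrast the abstract reports, although the real analysis used 145 loci and many more individuals.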
Summary
We investigated Y chromosomal binary and STR polymorphisms in 263 unrelated male individuals from the Japanese population and further examined the relationships between the two separate types of data. Using 47 biallelic markers we distinguished 20 haplogroups, four of which (D2b1/‐022457, O3/‐002611*, O3/‐LINE1 del, and O3/‐021354*) were newly defined in this study. Most haplogroups in the Japanese population are found in one of the three major clades, C, D, or O. Among these, two major lineages, D2b and O2b, account for 66% of Japanese Y chromosomes. Haplotype diversity of binary markers was calculated at 86.3%. The addition of 16 Y‐STR markers increased the number of haplotypes to 225, yielding a haplotype diversity of 99.40%. A comparison of binary haplogroups and Y‐STR type revealed a close association between certain binary haplogroups and Y‐STR allelic or conformational differences, such as those at the DXYS156Y, DYS390m, DYS392, DYS437, DYS438 and DYS388 loci. Based on our data on the relationships between binary and STR polymorphisms, we estimated the binary haplogroups of individuals from STR haplotypes and frequencies of binary haplogroups in other Japanese, Korean and Taiwanese Han populations. The present data will enable researchers to connect data from binary haplogrouping in anthropological studies and Y‐STR typing in forensic studies in East Asian populations, especially those in and around Japan.
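Haplotype diversity, as reported above, is commonly computed with Nei's estimator h = n/(n - 1) × (1 - Σ p_i²), where p_i is the frequency of the i-th haplotype among n sampled chromosomes. The sketch below assumes that estimator; the abstract does not state which formula the authors used. The function name and the toy haplotype labels are hypothetical.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity with the n / (n - 1) sample-size correction.

    haplotypes: one hashable haplotype per sampled individual,
    for example a haplogroup label or a tuple of Y-STR alleles.
    Requires at least two individuals.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Toy samples (invented labels): one repeated haplotype versus all distinct.
sample_repeated = ["h1", "h1", "h2", "h3"]
sample_distinct = [(14, 23), (15, 23), (14, 24), (16, 25)]
```

Adding markers can only split existing haplotypes, never merge them, so diversity computed on combined binary and STR haplotypes is at least as high as on binary markers alone. This is the direction of the change from 86.3% to 99.40% reported above.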
The relative efficiencies of different protein-coding genes of the mitochondrial genome and different tree-building methods in recovering a known vertebrate phylogeny (two whale species, cow, rat, mouse, opossum, chicken, frog, and three bony fish species) were evaluated. The tree-building methods examined were the neighbor-joining (NJ), minimum evolution (ME), maximum parsimony (MP), and maximum likelihood (ML) methods, and both nucleotide sequences and deduced amino acid sequences were analyzed. Generally speaking, amino acid sequences were better than nucleotide sequences in obtaining the true tree (topology) or trees close to the true tree. However, when only first and second codon position data were used, nucleotide sequences produced reasonably good trees. Among the 13 genes examined, Nd5 produced the true tree in all tree-building methods or algorithms for both amino acid and nucleotide sequence data. Genes Cytb and Nd4 also produced the correct tree in most tree-building algorithms when amino acid sequence data were used. By contrast, Co2, Nd1, and Nd4l showed a poor performance. In general, large genes produced better results, and when the entire set of genes was used, all tree-building methods generated the true tree. In each tree-building method, several distance measures or algorithms were used, but all these distance measures or algorithms produced essentially the same results. The ME method, in which many different topologies are examined, was no better than the NJ method, which generates a single final tree. Similarly, an ML method, in which many topologies are examined, was no better than the ML star decomposition algorithm that generates a single final tree. In ML the best substitution model chosen by using the Akaike information criterion produced no better results than simpler substitution models. These results question the utility of the currently used optimization principles in phylogenetic construction.
Relatively simple methods such as the NJ and ML star decomposition algorithms seem to produce as good results as those obtained by more sophisticated methods. The efficiencies of the NJ, ME, MP, and ML methods in obtaining the correct tree were nearly the same when amino acid sequence data were used. The most important factor in constructing reliable phylogenetic trees seems to be the number of amino acids or nucleotides used.
Although African populations have been shown to be most divergent from any other human populations, it has been difficult to establish the root of the phylogenetic tree of human populations since the rate of evolutionary change may vary from population to population owing to the fluctuation of population size and other factors. However, the root can be determined by using the chimpanzee as an outgroup and by employing proper statistical methods. Using this strategy, we constructed phylogenetic trees of human populations for five different sets of gene frequency data. The data sets used were two sets of microsatellite loci data (25 and 8 loci, respectively), restriction fragment length polymorphism (RFLP) data (79 loci), protein polymorphism data (15 loci), and Alu insertion frequency data (4 loci). All these data sets showed that the root is located in the branch connecting African and non-African populations, and in four of the data sets the root was established at a significant level. These results indicate that Africans are the first group of people that split from the rest of the human populations.
Concatenated sequences of all protein-coding genes in mitochondria recovered a known phylogeny of 11 vertebrate species correctly with statistical significance. However, when it was rooted by lampreys or sea urchins, the root of the vertebrate tree was placed between the mammal cluster and the chicken-frog-fish cluster or between the mammal-chicken cluster and the frog-fish cluster, depending on the tree-making method used. Although the frog-fish or chicken-frog-fish cluster was biologically incorrect, it was again supported with a significantly high bootstrap value. In this study, we investigated the reasons why this happened. It has been suggested that an incorrect phylogeny may be constructed due to a change of amino acid composition in different lineages or due to homoplasies at sites with hydrophobic amino acids. However, our results indicated that these were not the causes of the incorrect rooting of the vertebrate tree. Rather, it was important to take into account an extensive rate variation across sites and different probabilities of substitution among different amino acids. The substitution rates for mitochondrial sequences vary considerably for different vertebrate lineages. In such a case, it is known to be important to use the model that reflects the actual substitution probability to obtain a correct tree topology. The correct rooting of the vertebrate tree was recovered when rate variation across sites was properly accounted for.
The genetic relationships of seven Japanese and four mainland-Asian horse populations, as well as two European horse populations, were estimated using data for 20 microsatellite loci. Mongolian horses showed the highest average heterozygosities (0.75–0.77) among all populations. Phylogenetic analysis showed the existence of three distinct clusters supported by high bootstrap values: the European cluster (Anglo-Arab and thoroughbreds), the Hokkaido-Kiso cluster, and the Mongolian cluster. The relationships of these clusters were consistent with their geographical distributions. On the basis of the phylogenetic tree and the genetic variation of horse populations, we suggest that Japanese horses originated from Mongolian horses migrating through the Korean Peninsula. The genetic relationship of Japanese horses corresponded to their geographical distribution. Microsatellite polymorphism data were shown to be useful for estimating the genetic relationships between Japanese horses and Asian horses.
Abstract
Allele frequencies for 15 short tandem repeat (STR) loci, D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, vWA, TPOX, D18S51, D5S818 and FGA (AmpFlSTR Identifiler PCR Amplification kit, PE Applied Biosystems), were obtained from a sample of 110 unrelated individuals from the Malay population living in and around Kuala Lumpur, Malaysia, and the characteristics of the population were compared with those of other East Asian populations.
In examining genetic data in recent publications, Backeljau et al. showed cases in which two or more different trees (tie trees) were constructed from a single data set for the neighbor-joining (NJ) method and the unweighted pair group method with arithmetic mean (UPGMA). However, it is still unclear how often and under what conditions tie trees are generated. Therefore, I examined these problems by computer simulation. Examination of cases in which tie trees occur shows that tie trees can appear when no substitutions occur along some interior branch(es) on a tree. However, even when some substitutions occur along interior branches, tie trees can appear by chance if parallel or backward substitutions occur at some sites. The simulation results showed that tie trees occur relatively frequently for sequences with low divergence levels or with small numbers of sites. For such data, UPGMA sometimes produced tie trees quite frequently, whereas tie trees for the NJ method were generally rare. In the simulation, bootstrap values for clusters (tie clusters) that differed among tie trees were mostly low (< 60%). With a small probability, relatively high bootstrap values (at most 70%-80%) appeared for tie clusters. The bias of the bootstrap values caused by the input order of sequences can be avoided if one of the different paths in the cycles of making an NJ or UPGMA tree is chosen at random in each bootstrap replication.
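The random choice among tied paths described above can be sketched for UPGMA, where a tie arises whenever two or more cluster pairs share the minimum distance in a clustering cycle. The sketch is a minimal illustration and not the author's program. It assumes distances are stored in a dict keyed by `frozenset` pairs of cluster labels, and the function names and toy matrices are hypothetical. NJ would need the same tie handling applied to its own pair-selection criterion instead of the raw distances.

```python
import random

def tied_min_pairs(dist, tol=1e-12):
    """Return every cluster pair whose distance equals the minimum (within tol)."""
    dmin = min(dist.values())
    return sorted(tuple(sorted(pair)) for pair, d in dist.items() if d - dmin <= tol)

def upgma(names, d, rng):
    """UPGMA with random tie-breaking; returns (tree string, whether any tie occurred)."""
    dist = dict(d)
    size = {n: 1 for n in names}
    had_tie = False
    while len(size) > 1:
        ties = tied_min_pairs(dist)
        had_tie = had_tie or len(ties) > 1
        a, b = rng.choice(ties)  # choose one of the tied paths at random
        new = "(" + ",".join(sorted((a, b))) + ")"
        # Size-weighted average distance from the merged cluster to every other cluster.
        for c in size:
            if c not in (a, b):
                dist[frozenset((new, c))] = (
                    size[a] * dist[frozenset((a, c))]
                    + size[b] * dist[frozenset((b, c))]
                ) / (size[a] + size[b])
        # Drop all distances that involve the two merged clusters.
        dist = {pair: v for pair, v in dist.items() if a not in pair and b not in pair}
        size[new] = size.pop(a) + size.pop(b)
    return next(iter(size)), had_tie

# Toy matrices (invented): one with a three-way tie, one without ties.
tie_matrix = {frozenset(("A", "B")): 1.0, frozenset(("A", "C")): 1.0, frozenset(("B", "C")): 1.0}
clear_matrix = {frozenset(("A", "B")): 1.0, frozenset(("A", "C")): 4.0, frozenset(("B", "C")): 4.0}

tie_tree, tie_flag = upgma(["A", "B", "C"], tie_matrix, random.Random(0))
clear_tree, clear_flag = upgma(["A", "B", "C"], clear_matrix, random.Random(0))
```

Passing a fresh random generator state into each bootstrap replication means that no tied cluster is systematically favored by the order in which the sequences were entered.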