Summary
The Biological Science Group has one CCS staff member, Dr. Y. Inagaki, who is officially affiliated with the Graduate School of Life and Environmental Sciences and works at CCS. In addition, a campus-affiliated staff member, Dr. T. Hashimoto of the Graduate School of Life and Environmental Sciences, collaborates with Dr. Inagaki in the group.
The Biological Science Group studies the molecular evolution of microorganisms, focusing on two major subjects.
- Analysis of global phylogeny
- We are particularly interested in the global phylogeny of eukaryotes (organisms with nuclei).
- Methodology for molecular phylogenetic analyses
- Phylogenetic analyses can be misled for various reasons, so it is important to investigate how to avoid biased estimates. Knowledge gained from these studies may allow us to reconstruct a more robust global eukaryotic phylogeny.
Achievements
The projects carried out during the 2008 academic year are summarized below.
Global phylogeny of eukaryotes
Resolving the global phylogeny of eukaryotes is one of the most fundamental problems in biology to have been addressed with molecular phylogenetic techniques. Recently, with the accumulation of large-scale sequence data from various eukaryotic microorganisms, multi-gene phylogenies have frequently been used to infer deep relationships in the eukaryotic tree. This approach has successfully provided novel insights into deep splits in eukaryotic evolution. This year we conducted multi-gene analyses to resolve the phylogenetic position of Centrohelida, whose position in the eukaryotic tree has long been unknown. In collaboration with a research group at the University of Geneva, we sequenced 60,000 cDNA clones from a centrohelid, Raphidiophrys contractilis, by pyrosequencing. In further collaboration with a research group at the University of Oslo, which provided large-scale sequence data from another eukaryotic microorganism, Telonema subtilis, whose phylogenetic position has also been unknown, we assembled a large multi-gene data matrix of 76 taxa and approximately 30,000 amino acid positions (127 genes). On the basis of maximum likelihood (ML) analyses of this multi-gene data set, we suggested for the first time that Centrohelida (Raphidiophrys contractilis) and Telonema subtilis are each other's closest relatives, and that their common ancestor is sister to the clade comprising Cryptophyta and Haptophyta.
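As a rough illustration of what a multi-gene data matrix is, the sketch below concatenates per-gene alignments into a single matrix and pads taxa that lack a gene with missing-data characters. It is not the pipeline used in this project: the function name `concatenate_alignments`, the taxon labels, the toy sequences, and the choice of "?" as the missing-data symbol are all assumptions made for the example.

```python
from typing import Dict, List


def concatenate_alignments(genes: List[Dict[str, str]], missing: str = "?") -> Dict[str, str]:
    """Concatenate per-gene alignments (taxon -> aligned sequence) into one matrix.

    Taxa absent from a gene are padded with the missing-data symbol so that
    every row of the resulting matrix has the same length.
    """
    # Collect every taxon that appears in at least one gene alignment.
    taxa = sorted({taxon for gene in genes for taxon in gene})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for gene in genes:
        lengths = {len(seq) for seq in gene.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene alignment must have equal length")
        length = lengths.pop()
        for taxon in taxa:
            # Use the aligned sequence if present, otherwise a block of missing data.
            parts[taxon].append(gene.get(taxon, missing * length))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input (hypothetical taxa and sequences, for illustration only).
gene_1 = {"taxon_A": "MKV", "taxon_B": "MRV", "taxon_C": "MKI"}
gene_2 = {"taxon_A": "GA", "taxon_C": "GS"}  # taxon_B lacks this gene

supermatrix = concatenate_alignments([gene_1, gene_2])
for taxon, row in supermatrix.items():
    print(taxon, row)
```

In this toy case taxon_B has no sequence for the second gene, so its row ends in two missing-data characters. A real matrix of 76 taxa and 127 genes would typically contain many such gaps.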
Gene sampling bias in multi-gene phylogenetic inference
Multi-gene phylogenies are sometimes misled by the presence of unusual genes whose phylogenetic signals differ markedly from those of the other genes. Gene sampling bias can therefore strongly influence the conclusions of a multi-gene phylogeny, especially when the number of selected positions is not large enough. To evaluate the extent to which gene sampling affects phylogenetic inference from data sets with relatively few positions, we conducted a case study of eukaryotic phylogeny focusing on the recovery of the monophyly of Plantae (green plants + red algae). The monophyly of Plantae has been recovered in recent phylogenetic analyses of large multi-gene data sets (e.g., those including >30,000 amino acid positions). On the other hand, it has not been stably recovered in inferences from multi-gene data sets with fewer than 10,000 positions. Indeed, significant incongruity has been observed between two different studies, one favoring and the other rejecting the Plantae monophyly hypothesis. Using 27 multi-gene data sets, we performed extensive comparisons among multi-gene analyses with different gene samplings and found that recovery of a sister-group relationship between green plants and red algae depends primarily on gene sampling.
Efficiency of heuristic tree search methods
In theory, the maximum likelihood (ML) tree should be selected from all possible trees, the number of which depends on the number of operational taxonomic units (OTUs) in a dataset. Such an exhaustive search (ES) is, however, impractical for analyses of 10 or more OTUs, so the ML tree is generally sought with various heuristic search (HS) methods. Here we conducted ES on 10-taxon datasets (requiring lnL calculations for 2,027,025 trees per dataset) and compared the results with the trees inferred by four HS methods. Significantly, we identified a condition under which the HS methods showed low success rates (approximately 30%). We also discussed a way to increase the success rates of HS methods.
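The figure of 2,027,025 trees for 10 taxa matches the standard count of unrooted, fully bifurcating trees on n labeled tips, (2n-5)!! = 3 × 5 × ... × (2n-5). The short sketch below reproduces that number and shows how quickly the search space grows. The function name `count_unrooted_trees` is chosen here for illustration, and treating the searched trees as unrooted and bifurcating is an assumption inferred from the number itself.

```python
def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted, fully bifurcating trees on n_taxa labeled tips: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("at least 3 taxa are required")
    count = 1  # exactly one unrooted tree exists for 3 taxa
    # A tree on k-1 taxa has 2k-5 branches; the k-th taxon can attach to any of them.
    for k in range(4, n_taxa + 1):
        count *= 2 * k - 5
    return count


for n in (5, 10, 20):
    print(n, count_unrooted_trees(n))
```

For 10 taxa the function returns 2,027,025, the number of likelihood calculations quoted above. Each additional taxon multiplies the count by the number of branches available for attachment, which is why exhaustive search quickly becomes impractical.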
Future plan
Analysis of global eukaryotic phylogeny
Multi-gene phylogeny for uncovering the positions of Centrohelida, catablepharids, and several excavate lineages; chloroplast phylogeny and the origin of the apicoplast; global phylogeny of EF1α/EFL genes
Methodology for molecular phylogenetic analyses
Analyses of the efficiency of heuristic tree search methods