Research Fellow of the Japan Society for the Promotion of Science (PD)
School of Science, University of Tokyo
Iwasaki Wataru Lab
Please see below and check "CV_SAI.pdf"
1, Methodological studies on phylogenetic artifacts caused by compositional biases of sequences
In phylogenetic analyses of nucleotide sequences, ‘homogeneous’ substitution models, which assume that base composition is stationary across lineages, are widely used. However, a homogeneous model-based analysis can yield an artifactual tree when the data exhibit heterogeneous base compositions among sequences. Such artifacts can be countered by two approaches, ‘RY-coding’ and ‘non-homogeneous (NH)’ models. The former converts the four bases into two-state characters, purine (R) and pyrimidine (Y), to homogenize compositions among sequences (Phillips and Penny, 2003). In contrast, the latter explicitly incorporates compositional heterogeneity by allocating free model parameters in a branch-by-branch fashion (Galtier and Gouy, 1998; Dutheil and Boussau, 2008). Although these approaches have been applied to several real-world data analyses, their basic properties had not been fully examined by simulation studies.
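The RY-coding step described above can be illustrated with a minimal Python sketch. The function name `ry_recode`, the handling of gaps and ambiguity codes (left unchanged), and the two toy sequences are assumptions made for illustration; they are not taken from the published analyses.

```python
# Purines (A, G) become R; pyrimidines (C, T, and RNA's U) become Y.
RY_MAP = str.maketrans("AGCTUagctu", "RRYYYRRYYY")


def ry_recode(seq: str) -> str:
    """Recode a nucleotide sequence into two-state R/Y characters.

    Assumption: gaps and ambiguity codes are passed through unchanged.
    """
    return seq.translate(RY_MAP)


def frequency(seq: str, states: str) -> float:
    """Fraction of characters in `seq` that belong to `states`."""
    return sum(seq.count(s) for s in states) / len(seq)


# Two hypothetical sequences with very different AT content.
at_rich = "ATATATATGC"
gc_rich = "GCGCGCGCAT"

at_before = (frequency(at_rich, "AT"), frequency(gc_rich, "AT"))
r_after = (frequency(ry_recode(at_rich), "R"), frequency(ry_recode(gc_rich), "R"))

print(at_before)  # AT content differs strongly between the two sequences
print(r_after)    # purine (R) frequency is identical after recoding
```

In this toy case both sequences recode to the same R/Y string, so the compositional difference disappears entirely. Real data are rarely this clean, which is why the effectiveness of recoding depends on the pattern of heterogeneity.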
In this study, I conducted what is effectively the first simulation study to assess the performance of maximum-likelihood phylogenetic analyses incorporating RY-coding and NH models in the presence of compositional heterogeneity. The two methods were applied to ‘4-taxon’ datasets bearing various degrees of heterogeneity in adenine and thymine (AT) content. In the presence of ~20% AT content difference among sequences, both RY-coding and NH model-based analyses reconstructed the true phylogenetic relationships more accurately than a homogeneous model-based analysis. Nevertheless, I found that the accuracy of phylogenetic inference based on RY-coding depends, at least to some extent, on the substitution process that generated the sequence data of interest (e.g., the transition/transversion ratio). Furthermore, inferences from RY-coding-based analyses can be severely biased when the recoding cannot ameliorate complex patterns of compositional heterogeneity in the data. On the other hand, NH models appeared to be robust against all types of compositional heterogeneity examined in this study, and are widely applicable to phylogenetic analyses of various empirical datasets. For more information, please refer to Ishikawa, Inagaki, and Hashimoto (2012), listed in my CV.
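As a rough illustration of what an AT content difference among sequences means for a 4-taxon dataset, the following Python sketch computes per-sequence AT content and the largest difference between any two sequences. The function names and the toy alignment are hypothetical, and this is not the procedure used to generate the simulated data in the paper.

```python
def at_content(seq: str) -> float:
    """AT content of a sequence, counting only unambiguous A, C, G, T."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return at / (at + gc)


def max_at_difference(alignment: dict) -> float:
    """Largest AT content difference between any two sequences."""
    values = [at_content(seq) for seq in alignment.values()]
    return max(values) - min(values)


# Hypothetical 4-taxon alignment: two AT-richer and two AT-poorer sequences.
toy_alignment = {
    "taxon1": "AATTAATTGC",
    "taxon2": "AATTAATTGC",
    "taxon3": "AATTGCGCAT",
    "taxon4": "AATTGCGCAT",
}

print(round(max_at_difference(toy_alignment), 2))  # 0.2
```

A summary like this is only a first diagnostic; it does not indicate whether the heterogeneity is arranged in a way that misleads tree reconstruction.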
2, Computational challenges in the efficient parallelization of phylogenetic inferences with non-homogeneous models on current supercomputing systems
3, Detection of gene conversion (recombination) events among bacterial sequences, based on phylogenetic methods
Bacteria have two paralogs of peptide-chain release factor, RF1 and RF2, which differ from each other in stop-codon recognition. The two RF families are generally expected to have taken independent evolutionary paths after they arose from a single gene-duplication event in the ancestral bacterial genome. However, my survey based on phylogenetic and statistical methods detected inter- or intra-genomic conversions between RF1 and RF2 genes in diverse bacterial genomes. These conversions encompass a domain that plays a key role in the interaction with the ribosome during the translation termination process. Structural analyses suggested that conversions of the corresponding region are functionally neutral for both RF1 and RF2, implying that 'partial' conversion between paralogous genes is more frequent than we generally assume. For more detailed information, please check Ishikawa, Kamikawa, and Inagaki (2015), listed in my CV.
4, Collaborations on large-scale phylogenetic analyses
Peer-reviewed Journal Papers
†: Equally contributed authors
1. Templeton T, Asada M, Jiratanh M, Sohta A. Ishikawa, Tiawsirisup S, Sivakumar T, Namangala B, Takeda M, Mohkaew K, Ngamjituea S, Inoue N, Sugimoto C, Inagaki Y, Suzuki Y, Yokoyama N, Kaewthamasorn M, Kaneko O. (2016), Ungulate malaria parasites. Accepted for publication in Scientific Reports.
2. Sohta A. Ishikawa, Kamikawa R, Inagaki Y. (2015), Multiple conversion between the genes encoding bacterial class-I release factors. Scientific Reports, 5:12406.
3. Kamikawa R, Tanifuji G, Sohta A. Ishikawa, Ishii K, Matsuo Y, Onodera N, Ishida K, Hashimoto T, Miyashita H, Mayama S, Inagaki Y. (2015), Proposal of a Twin-arginine translocator system–mediated constraint against loss of ATP synthase genes from nonphotosynthetic plastid genomes. Molecular Biology and Evolution, 32(10):2598–2604.
4. Sohta A. Ishikawa, Nakao M, Inagaki Y, Hashimoto T, Sato M. (2014), MPI/OpenMP HYBRID Parallelization of Phylogenetic Analyses based on Non-Homogeneous Substitution Models: Implementation and Performance Evaluation for Large-Scale Computing Systems. IPSJ Transactions on Advanced Computing Systems, 7(3), pp 13–24. written in Japanese
5. Yabuki A†, Kamikawa R†, Sohta A. Ishikawa, Kolisko M, Kim E, Tanabe AS, Kume K, Ishida K, Inagaki Y. (2014), Palpitomonas bilix presents a basal cryptist lineage: insight into the character evolution in Cryptista. Scientific Reports, 4:4641.
6. Kamikawa R, Kolisko M, Nishimura Y, Yabuki A, Brown MW, Sohta A. Ishikawa, Ishida K, Roger AJ, Hashimoto T, Inagaki Y. (2014), Gene-content evolution in discobid mitochondria deduced from the phylogenetic position and complete mitochondrial genome of Tsukubamonas globosa. Genome Biology and Evolution, 6(2), pp 306-315.
7. Nagayasu E, Sohta A. Ishikawa, Taketani S, Chakraborty G, Yoshida A, Inagaki Y, Maruyama H. (2013), Identification of a bacteria-like ferrochelatase in Strongyloides venezuelensis, an animal parasitic Nematode. PLOS ONE, 8(3), e58458.
8. Sohta A. Ishikawa, Hashimoto T. (2012), Assessment of the performance of phylogenetic inference based on simulated protein-coding sequences with significant compositional heterogeneity. Proceedings of the Institute of Statistical Mathematics, 60(2), pp 289-303. written in Japanese
9. Sohta A. Ishikawa, Inagaki Y, Hashimoto T. (2012), RY-coding and non-homogeneous models can ameliorate the maximum-likelihood inferences from nucleotide sequence data with parallel compositional heterogeneity. Evolutionary Bioinformatics, 8, pp 357-371.
10. Ishitani Y†, Sohta A. Ishikawa†, Inagaki Y, Tsuchiya M, Takahashi K, Takishita K. (2011), Multigene phylogenetic analyses including diverse radiolarian species support the "Retaria" hypothesis - the sister relationship of Radiolaria and Foraminifera. Marine Micropaleontology, 81(1), pp 32-42.
11. Matsumoto T, Sohta A. Ishikawa, Hashimoto T, Inagaki Y. (2011), A deviant genetic code in the green alga-derived plastid in the dinoflagellate Lepidodinium chlorophorum. Molecular Phylogenetics and Evolution, 60(1), pp 68-72.
12. Reimer JD, Sohta A. Ishikawa, Hirose M. (2011), New records and molecular characterization of Acrozoanthus (Cnidaria: Anthozoa: Hexacorallia) and its endosymbionts (Symbiodinium spp.) from Taiwan. Marine Biodiversity, 41(2), pp 313-323.
Peer-reviewed Conference Papers
1. Sohta A. Ishikawa, Nakao M, Inagaki Y, Hashimoto T, Sato M. (2014), Hybrid MPI/OpenMP parallelization of a phylogenetic program with Non-Homogeneous models: toward the analyses of large-scale sequence datasets. High Performance Computing Symposium 2014, pp 10-20. written in Japanese
Please check Ishikawa_Presentations
Please check Ishikawa_Products
mail: saishi＠b.s.u-tokyo.ac.jp, or s.ishikawa.biol.phylo＠gmail.com
＊please convert a full-width "＠" to a half-width one