Author

Kei-ichi Kuma

Other affiliations: Kyushu University, National Institute of Informatics

Bio: Kei-ichi Kuma is an academic researcher from Kyoto University. The author has contributed to research in topics: Phylogenetic tree & Phylogenetics. The author has an h-index of 29 and has co-authored 42 publications, which have received 17,462 citations. Previous affiliations of Kei-ichi Kuma include Kyushu University & National Institute of Informatics.

Papers

Journal Article

MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform

Kazutaka Katoh¹, Kazuharu Misawa, Kei-ichi Kuma¹, Takashi Miyata¹

Kyoto University¹

15 Jul 2002-Nucleic Acids Research

TL;DR: A simplified scoring system is proposed that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length.

Abstract: A multiple sequence alignment program, MAFFT, has been developed. The CPU time is drastically reduced as compared with existing methods. MAFFT includes two novel techniques. (i) Homologous regions are rapidly identified by the fast Fourier transform (FFT), in which an amino acid sequence is converted to a sequence composed of volume and polarity values of each amino acid residue. (ii) We propose a simplified scoring system that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length. Two different heuristics, the progressive method (FFT-NS-2) and the iterative refinement method (FFT-NS-i), are implemented in MAFFT. The performances of FFT-NS-2 and FFT-NS-i were compared with other methods by computer simulations and benchmark tests; the CPU time of FFT-NS-2 is drastically reduced as compared with CLUSTALW with comparable accuracy. FFT-NS-i is over 100 times faster than T-COFFEE, when the number of input sequences exceeds 60, without sacrificing the accuracy.
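As a rough illustration of point (i) in the abstract above, the following sketch encodes two amino acid sequences as numeric tracks and uses an FFT to compute their correlation at every relative offset; the offset with the highest score is a candidate position for a homologous region. The table `TOY_PROPERTY`, the function names, and the use of a single numeric track are assumptions made for illustration. They are not MAFFT's actual volume and polarity values, and this is not its implementation.

```python
import cmath

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Placeholder property values (assumption): each residue gets an arbitrary
# centered number. These are NOT real volume or polarity values.
TOY_PROPERTY = {aa: i - (len(AMINO_ACIDS) - 1) / 2 for i, aa in enumerate(AMINO_ACIDS)}


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is returned unscaled (caller divides by n).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlation_by_lag(seq_a, seq_b):
    """Return {lag: score}, where score = sum_n a[n] * b[n + lag]."""
    size = 1
    while size < len(seq_a) + len(seq_b):  # zero-pad to avoid wrap-around overlap
        size *= 2
    a = [TOY_PROPERTY[c] for c in seq_a] + [0.0] * (size - len(seq_a))
    b = [TOY_PROPERTY[c] for c in seq_b] + [0.0] * (size - len(seq_b))
    spectrum = [x.conjugate() * y for x, y in zip(fft(a), fft(b))]
    circular = [v.real / size for v in fft(spectrum, inverse=True)]
    lags = range(-(len(seq_a) - 1), len(seq_b))
    # Negative lags are stored at the end of the circular result.
    return {lag: circular[lag % size] for lag in lags}


def best_lag(seq_a, seq_b):
    """Lag with the highest correlation score."""
    scores = correlation_by_lag(seq_a, seq_b)
    return max(scores, key=scores.get)
```

For example, `best_lag("ACDEFGHIK", "WWWACDEFGHIK")` returns 3, because the first sequence reappears in the second one shifted by three residues. The abstract describes two property values per residue; a sketch closer to that description would presumably compute one correlation per property and combine them.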

12,003 citations

Journal Article

MAFFT version 5: improvement in accuracy of multiple sequence alignment

Kazutaka Katoh¹, Kei-ichi Kuma, Hiroyuki Toh, Takashi Miyata

Kyoto University¹

01 Jan 2005-Nucleic Acids Research

TL;DR: Improvement in accuracy was generally observed for most methods, but remarkably large for the new options of MAFFT proposed here, which showed higher accuracy than currently available methods including TCoffee version 2 and CLUSTAL W in benchmark tests consisting of alignments of >50 sequences.

Abstract: The accuracy of the multiple sequence alignment program MAFFT has been improved. The new version (5.3) of MAFFT offers new iterative refinement options, H-INS-i, F-INS-i and G-INS-i, in which pairwise alignment information is incorporated into the objective function. These new options of MAFFT showed higher accuracy than currently available methods including TCoffee version 2 and CLUSTAL W in benchmark tests consisting of alignments of >50 sequences. Like the previously available options, the new options of MAFFT can handle hundreds of sequences on a standard desktop computer. We also examined the effect of the number of homologues included in an alignment. For a multiple alignment consisting of ∼8 sequences with low similarity, the accuracy was improved (2–10 percentage points) when the sequences were aligned together with dozens of their close homologues (E-value < 10⁻⁵–10⁻²⁰) collected from a database. Such improvement was generally observed for most methods, but remarkably large for the new options of MAFFT proposed here. Thus, we made a Ruby script, mafftE.rb, which aligns the input sequences together with their close homologues collected from SwissProt using NCBI-BLAST.

4,528 citations

Journal Article

Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes

Naoyuki Iwabe¹, Kei-ichi Kuma, Masami Hasegawa, Syozo Osawa, Takashi Miyata

Kyushu University¹

01 Dec 1989-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: Composite phylogenetic trees, each with two clusters corresponding to different proteins, are inferred from pairs of duplicated genes. These trees determine the evolutionary relationship of the primary kingdoms uniquely and reveal that archaebacteria are more closely related to eukaryotes than to eubacteria in all cases.

Abstract: All extant organisms are thought to be classified into three primary kingdoms, eubacteria, eukaryotes, and archaebacteria. The molecular evolutionary studies on the origin and evolution of archaebacteria to date have been carried out by inferring a molecular phylogenetic tree of the primary kingdoms based on comparison of a single molecule from a variety of extant species. From such comparison, it was not possible to derive the exact evolutionary relationship among the primary kingdoms, because the root of the tree could not be determined uniquely. To overcome this difficulty, we compared a pair of duplicated genes, elongation factors Tu and G, and the alpha and beta subunits of ATPase, which are thought to have diverged by gene duplication before divergence of the primary kingdoms. Using each protein pair, we inferred a composite phylogenetic tree with two clusters corresponding to different proteins, from which the evolutionary relationship of the primary kingdoms is determined uniquely. The inferred composite trees reveal that archaebacteria are more closely related to eukaryotes than to eubacteria for all the cases. By bootstrap resamplings, this relationship is reproduced with probabilities of 0.96, 0.79, 1.0, and 1.0 for elongation factors Tu and G and for ATPase subunits alpha and beta, respectively. There are also several lines of evidence for the close sequence similarity between archaebacteria and eukaryotes. Thus we propose that this tree topology represents the general evolutionary relationship among the three primary kingdoms.

800 citations

Journal Article

The Complete Nucleotide Sequence of the Human Immunoglobulin Heavy Chain Variable Region Locus

Fumihiko Matsuda¹, Kazuo Ishii¹, Patrice Bourvagnet², Kei-ichi Kuma¹, Hidenori Hayashida³, Takashi Miyata¹, Tasuku Honjo¹

Kyoto University¹, Centre national de la recherche scientifique², Nara Medical University³

07 Dec 1998-Journal of Experimental Medicine

TL;DR: Comparison between different copies of homologous units that appear repeatedly across the locus clearly demonstrates that dynamic DNA reorganization of the locus took place at least eight times between 133 and 10 million years ago.

Abstract: The complete nucleotide sequence of the 957-kb DNA of the human immunoglobulin heavy chain variable (VH) region locus was determined and 43 novel VH segments were identified. The region contains 123 VH segments classifiable into seven different families, of which 79 are pseudogenes. Of the 44 VH segments with an open reading frame, 39 are expressed as heavy chain proteins and 1 as mRNA, while the remaining 4 are not found in immunoglobulin cDNAs. Combinatorial diversity of VH region was calculated to be ∼6,000. Conservation of the promoter and recombination signal sequences was observed to be higher in functional VH segments than in pseudogenes. Phylogenetic analysis of 114 VH segments clearly showed clustering of the VH segments of each family. However, an independent branch in the tree contained a single VH, V4-44.1P, sharing similar levels of homology to human VH families and to those of other vertebrates. Comparison between different copies of homologous units that appear repeatedly across the locus clearly demonstrates that dynamic DNA reorganization of the locus took place at least eight times between 133 and 10 million years ago. One nonimmunoglobulin gene of unknown function was identified in the intergenic region.

458 citations

Journal Article

A giant nucleopore protein that binds Ran/TC4.

Nobuhiko Yokoyama¹, Naoyuki Hayashi¹, Tatsuya Seki¹, Nelly Panté², Tomoyuki Ohba¹, K. Nishii¹, Kei-ichi Kuma³, Toshiro Hayashida¹, Takashi Miyata³, Ueli Aebi⁴, Ueli Aebi², M. Fukui¹, Takeharu Nishimoto¹

Kyushu University¹, University of Basel², Kyoto University³, Johns Hopkins University School of Medicine⁴

13 Jul 1995-Nature

TL;DR: The identification of RanBP2, a novel protein of 3,224 residues, is reported. The protein contains the XFXFG pentapeptide motif characteristic of nuclear pore complex (NPC) proteins, and immunolocalization suggests that RanBP2 is a constituent of the NPC.

Abstract: Ran/TC4 is a small nuclear G protein (ref. 1) that forms a complex with the chromatin-bound guanine nucleotide release factor RCC1 (ref. 2). Loss of RCC1 causes defects in cell cycle progression (refs 3, 4), RNA export (refs 5-7) and nuclear protein import (ref. 8). Some of these can be suppressed by overexpression of Ran/TC4 (ref. 1), suggesting that Ran/TC4 functions downstream of RCC1. We have searched for proteins that bind Ran/TC4 by using a two-hybrid screen, and here we report the identification of RanBP2, a novel protein of 3,224 residues. This giant protein comprises an amino-terminal 700-residue leucine-rich region, four RanBP1-homologous (refs 9, 10) domains, eight zinc-finger motifs similar to those of NUP153 (refs 11, 12), and a carboxy terminus with high homology to cyclophilin (ref. 13). The molecule contains the XFXFG pentapeptide motif characteristic of nuclear pore complex (NPC) proteins (ref. 14), and immunolocalization suggests that RanBP2 is a constituent of the NPC. The fact that NLS-mediated nuclear import can be inhibited by an antibody directed against RanBP2 supports a functional role in protein import through the NPC.

454 citations

Cited by

Journal Article

MUSCLE: multiple sequence alignment with high accuracy and high throughput

Robert C. Edgar

01 Mar 2004-Nucleic Acids Research

TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.

Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.
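To make the idea of distance estimation by kmer counting concrete, here is a minimal sketch that scores two unaligned sequences by the fraction of kmers they share. The function names, the default `k=3`, and the normalization by the shorter sequence are assumptions chosen for illustration; the abstract does not give MUSCLE's exact formula, and this sketch does not claim to reproduce it.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(seq_a, seq_b, k=3):
    """Return 1 - (shared kmers / kmers in the shorter sequence).

    0.0 means the shorter sequence's kmers all occur in the other one;
    1.0 means the two sequences share no kmer at all.
    """
    shorter = min(len(seq_a), len(seq_b)) - k + 1
    if shorter <= 0:
        raise ValueError("sequences must be at least k residues long")
    # Counter intersection keeps the minimum count of each shared kmer.
    shared = sum((kmer_counts(seq_a, k) & kmer_counts(seq_b, k)).values())
    return 1.0 - shared / shorter
```

For instance, `kmer_distance("ACDEFG", "ACDEWW")` gives 0.5, since two of the four 3-mers (`ACD`, `CDE`) are shared. No alignment is needed to compute such a distance, which is what makes this kind of estimate fast.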

37,524 citations

Journal Article

QIIME allows analysis of high-throughput community sequencing data.

J. Gregory Caporaso¹, Justin Kuczynski¹, Jesse Stombaugh¹, Kyle Bittinger², Frederic D. Bushman², Elizabeth K. Costello¹, Noah Fierer³, Antonio Gonzalez Peña¹, Julia K. Goodrich¹, Jeffrey I. Gordon⁴, Gavin A. Huttley⁵, Scott T. Kelley⁶, Dan Knights¹, Jeremy E. Koenig⁷, Ruth E. Ley⁷, Catherine A. Lozupone¹, Daniel McDonald¹, Brian D. Muegge⁴, Meg Pirrung¹, Jens Reeder¹, Joel Sevinsky, Peter J. Turnbaugh⁴, William A. Walters¹, Jeremy Widmann¹, Tanya Yatsunenko⁴, Jesse R. Zaneveld¹, Rob Knight⁸, Rob Knight¹

University of Colorado Boulder¹, University of Pennsylvania², Cooperative Institute for Research in Environmental Sciences³, Washington University in St. Louis⁴, Australian National University⁵, San Diego State University⁶, Cornell University⁷, Howard Hughes Medical Institute⁸

11 Apr 2010-Nature Methods

TL;DR: An overview of the analysis pipeline and links to raw data and processed output from the runs with and without denoising are provided.

Abstract: Supplementary Figure 1 Overview of the analysis pipeline. Supplementary Table 1 Details of conventionally raised and conventionalized mouse samples. Supplementary Discussion Expanded discussion of QIIME analyses presented in the main text; Sequencing of 16S rRNA gene amplicons; QIIME analysis notes; Expanded Figure 1 legend; Links to raw data and processed output from the runs with and without denoising.

28,911 citations

Journal Article

MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability

Kazutaka Katoh¹, Daron M. Standley¹

Osaka University¹

01 Apr 2013-Molecular Biology and Evolution

TL;DR: This version of MAFFT has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update.

Abstract: We report a major update of the MAFFT multiple sequence alignment program. This version has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update. This report shows actual examples to explain how these features work, alone and in combination. Some examples incorrectly aligned by MAFFT are also shown to clarify its limitations. We discuss how to avoid misalignments, and our ongoing efforts to overcome such limitations.

27,771 citations

Journal Article

Clustal W and Clustal X version 2.0

Mark A. Larkin¹, Gordon Blackshields², Nigel P. Brown², R. Chenna², Paul A. McGettigan², Hamish McWilliam², Franck Valentin², Iain M. Wallace², Andreas Wilm², Rodrigo Lopez², J.D. Thompson², Toby J. Gibson², Desmond G. Higgins²

University College Dublin¹, European Bioinformatics Institute²

01 Nov 2007-Bioinformatics

TL;DR: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++, which should facilitate further development of the alignment algorithms and has allowed proper porting of the programs to the latest versions of Linux, Macintosh and Windows operating systems.

Abstract: Summary: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of Linux, Macintosh and Windows operating systems. Availability: The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/ Contact: clustalw@ucd.ie

25,325 citations

Journal Article

Fast, scalable generation of high‐quality protein multiple sequence alignments using Clustal Omega

Fabian Sievers¹, Andreas Wilm², David Dineen¹, Toby J. Gibson, Kevin Karplus³, Weizhong Li⁴, Rodrigo Lopez⁴, Hamish McWilliam⁴, Michael Remmert⁵, Johannes Söding⁵, Julie D. Thompson⁶, Desmond G. Higgins¹

University College Dublin¹, Genome Institute of Singapore², University of California, Santa Cruz³, European Bioinformatics Institute⁴, Ludwig Maximilian University of Munich⁵, University of Strasbourg⁶

01 Jan 2011-Molecular Systems Biology

TL;DR: A new program called Clustal Omega is described, which can align virtually any number of protein sequences quickly and that delivers accurate alignments, and which outperforms other packages in terms of execution time and quality.

Abstract: Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high-quality alignments, but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and that delivers accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high-quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to and exploiting information in existing alignments, making use of the vast amount of precomputed information in public databases like Pfam.

12,489 citations
