Author

Taisuke Boku

Bio: Taisuke Boku is an academic researcher at the University of Tsukuba. The author has contributed to research in topics including Supercomputer & Benchmark (computing). The author has an h-index of 21 and has co-authored 180 publications receiving 2,358 citations. Previous affiliations of Taisuke Boku include Kyoto Prefectural University & the University of Tokyo.


Papers
Journal ArticleDOI
01 Feb 2011
TL;DR: Describes the work of the community to prepare for the challenges of exascale computing, ultimately combining its efforts in a coordinated International Exascale Software Project.
Abstract: Over the last 20 years, the open-source community has provided more and more software on which the world’s high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and years of effort to build key components. However, although the investments in these separate software elements have been tremendously valuable, a great deal of productivity has also been lost because of the lack of planning, coordination, and key integration of technologies necessary to make them work together smoothly and efficiently, both within individual petascale systems and between different systems. It seems clear that this completely uncoordinated development model will not provide the software needed to support the unprecedented parallelism required for peta/exascale computation on millions of cores, or the flexibility required to exploit new hardware models and features, such as transactional memory, speculative execution, and graphics processing units. This report describes the work of the community to prepare for the challenges of exascale computing, ultimately combining their efforts in a coordinated International Exascale Software Project.

736 citations

Proceedings ArticleDOI
25 Apr 2006
TL;DR: This paper proposes an optimization algorithm that selects a gear from the execution and power profiles while taking the transition overhead into account, achieving an almost 40% reduction in EDP without performance impact compared to running at the standard clock frequency.
Abstract: Currently, several of the high-performance processors used in PC clusters have a DVS (dynamic voltage scaling) architecture that can dynamically scale processor voltage and frequency. Adaptive scheduling of the voltage and frequency enables us to reduce power dissipation without a performance slowdown during communication and memory access. In this paper, we propose a method of profile-based power-performance optimization by DVS scheduling in a high-performance PC cluster. We divide the program execution into several regions and select the best gear for power efficiency. Selecting the best gear is not straightforward since the overhead of DVS transitions is not free. We propose an optimization algorithm to select a gear using the execution and power profiles, taking the transition overhead into account. We have designed and built a power-profiling system, PowerWatch. With this system we examined the effectiveness of our optimization algorithm on two types of power-scalable clusters (Crusoe and Turion). According to the results of benchmark tests, we achieved an almost 40% reduction in terms of EDP (energy-delay product) with minimal performance impact (less than 5%) compared to results using the standard clock frequency.

129 citations
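
The core of the optimization can be illustrated with a small sketch: given per-region execution-time and power profiles for each gear, choose a gear per region so that the energy-delay product of the whole run, including the cost of switching gears between adjacent regions, is minimized. The profile numbers, overhead constants, and the brute-force search below are illustrative assumptions; the paper's actual algorithm is more sophisticated, but the point that the transition overhead belongs in the objective is the same.

```python
# Minimal sketch of profile-based gear selection with transition overhead.
# All names and numbers are illustrative; they do not come from the paper.
# Each region has a measured (time, power) profile per gear; a gear change
# between adjacent regions costs a fixed transition time and energy.
from itertools import product

# profile[region][gear] = (seconds, watts) measured for that region at that gear
profile = [
    [(1.00, 30.0), (1.05, 22.0), (1.40, 15.0)],   # compute-bound region
    [(2.00, 28.0), (2.02, 20.0), (2.05, 14.0)],   # memory/communication-bound region
    [(0.50, 30.0), (0.55, 22.0), (0.80, 15.0)],
]
TRANSITION_TIME = 0.01      # seconds lost per gear switch (assumed)
TRANSITION_ENERGY = 0.2     # joules per gear switch (assumed)

def edp(gears):
    """Energy-delay product of one gear assignment over all regions."""
    time = energy = 0.0
    prev = None
    for region, gear in enumerate(gears):
        sec, watt = profile[region][gear]
        time += sec
        energy += sec * watt
        if prev is not None and gear != prev:
            time += TRANSITION_TIME
            energy += TRANSITION_ENERGY
        prev = gear
    return energy * time

# Exhaustive search is fine for a handful of regions; the interesting part is
# that switching costs are charged inside the objective being minimized.
n_gears = len(profile[0])
best = min(product(range(n_gears), repeat=len(profile)), key=edp)
print("best gear per region:", best, "EDP:", round(edp(best), 2))
```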

Journal ArticleDOI
TL;DR: A first-principles density functional program is developed that efficiently performs large-scale calculations on massively parallel computers and obtains a self-consistent electronic structure within a few hundred hours.

126 citations

Proceedings ArticleDOI
01 Sep 2006
TL;DR: A new algorithm is proposed that reduces the energy consumption of a parallel program executed on a power-scalable cluster using DVFS: it reclaims slack time by changing the voltage and frequency, allowing a reduction in energy consumption without impacting the performance of the program.
Abstract: It has become important to improve the energy efficiency of high-performance PC clusters. In PC clusters, high-performance microprocessors have a dynamic voltage and frequency scaling (DVFS) mechanism, which allows the voltage and frequency to be set to reduce energy consumption. In this paper, we propose a new algorithm that reduces energy consumption in a parallel program executed on a power-scalable cluster using DVFS. Whenever the computational load is not balanced, parallel programs encounter slack time; that is, they must wait for synchronization of the tasks. Our algorithm reclaims slack time by changing the voltage and frequency, which allows a reduction in energy consumption without impacting the performance of the program. Our algorithm can be applied to parallel programs represented by a directed acyclic task graph (DAG). It selects an appropriate set of voltages and frequencies (called the gear) that allows each task to execute at the lowest frequency that does not increase the overall execution time, while keeping the frequencies as uniform as possible across tasks. We built two different types of power-scalable clusters using AMD Turion and Transmeta Crusoe. For the empirical study of energy reduction in PC clusters, we designed a toolkit called PowerWatch that includes power-monitoring tools and a DVFS control library. This toolkit precisely measures the power consumption of the entire cluster in real time. The experimental results using benchmark problems show that our algorithm reduces energy consumption by 25% with only a 1% loss in performance.

98 citations
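
A minimal sketch of the slack-reclamation idea, reduced to a single barrier-synchronized phase rather than a full DAG: each task is slowed to the lowest gear whose stretched runtime still fits within the phase set by the slowest task. The gear table, task times, and the assumption that runtime scales inversely with frequency are illustrative simplifications, not the paper's model.

```python
# Minimal sketch of DVFS slack reclamation for one synchronization phase.
# Assumes task runtime scales inversely with clock frequency, which is a
# simplification; gears and task data are illustrative, not from the paper.

GEARS_GHZ = [2.0, 1.8, 1.6, 1.2, 0.8]   # available frequency gears, highest first
F_MAX = GEARS_GHZ[0]

def pick_gear(runtime_at_fmax, slack):
    """Lowest gear whose stretched runtime still fits within runtime + slack."""
    budget = runtime_at_fmax + slack
    best = F_MAX
    for f in GEARS_GHZ:
        if runtime_at_fmax * (F_MAX / f) <= budget:
            best = f                      # keep the lowest feasible frequency
    return best

# tasks: (name, runtime at f_max in seconds); the phase ends at a barrier,
# so each task's slack is its gap to the slowest task.
tasks = [("t0", 4.0), ("t1", 2.5), ("t2", 3.0), ("t3", 4.0)]
phase_len = max(r for _, r in tasks)
for name, runtime in tasks:
    gear = pick_gear(runtime, phase_len - runtime)
    print(f"{name}: run at {gear} GHz (was {runtime}s of a {phase_len}s phase)")
```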

Journal ArticleDOI
TL;DR: An overview of the capabilities of the SALMON software package is provided, with several sample calculations of real-time, real-space electron dynamics induced in molecules and solids by an external electric field, obtained by solving the time-dependent Kohn–Sham equation.

93 citations
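
For reference, the time-dependent Kohn–Sham equation mentioned in the TL;DR has the following standard form (atomic units, length-gauge coupling to a uniform field; the equations actually solved by SALMON include pseudopotentials and, for solids, a vector-potential formulation, so this is only the schematic version):

```latex
i\,\frac{\partial}{\partial t}\,\psi_j(\mathbf{r},t)
  = \left[ -\tfrac{1}{2}\nabla^2 + v_{\mathrm{ion}}(\mathbf{r})
          + v_{\mathrm{H}}[\rho](\mathbf{r},t)
          + v_{\mathrm{xc}}[\rho](\mathbf{r},t)
          + \mathbf{r}\cdot\mathbf{E}(t) \right] \psi_j(\mathbf{r},t),
\qquad
\rho(\mathbf{r},t) = \sum_j \lvert \psi_j(\mathbf{r},t) \rvert^2 .
```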


Cited by
01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently, namely those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers: the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 times faster than a single Cray C90 processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

29,323 citations
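
A toy sketch of the third (spatial-decomposition) algorithm, which the abstract reports scales best for large problems: each processor owns a fixed region of the simulation box and the atoms currently inside it, so only atoms near subdomain boundaries need to be communicated. The box size, processor grid, and rank mapping below are invented for illustration, not taken from the paper.

```python
# Toy sketch of spatial decomposition: each processor owns a fixed region of
# the simulation box and the atoms currently inside it. Box size, processor
# grid, and atom coordinates are illustrative only.
import random

BOX = 10.0                 # cubic box edge length (assumed)
PGRID = (2, 2, 2)          # processor grid: 8 ranks, one subdomain each

def owner_rank(pos):
    """Map an (x, y, z) position to the rank owning that subdomain."""
    ix = min(int(pos[0] / BOX * PGRID[0]), PGRID[0] - 1)
    iy = min(int(pos[1] / BOX * PGRID[1]), PGRID[1] - 1)
    iz = min(int(pos[2] / BOX * PGRID[2]), PGRID[2] - 1)
    return (ix * PGRID[1] + iy) * PGRID[2] + iz

random.seed(0)
atoms = [tuple(random.uniform(0.0, BOX) for _ in range(3)) for _ in range(500)]

# Per-rank atom lists; as atoms move they migrate between neighboring ranks,
# and only boundary ("ghost") atoms need to be exchanged each timestep.
domains = {}
for atom in atoms:
    domains.setdefault(owner_rank(atom), []).append(atom)
print({rank: len(a) for rank, a in sorted(domains.items())})
```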

Proceedings ArticleDOI
28 Apr 2010
TL;DR: Hedera is presented, a scalable, dynamic flow scheduling system that adaptively schedules a multi-stage switching fabric to efficiently utilize aggregate network resources and delivers bisection bandwidth that is 96% of optimal and up to 113% better than static load-balancing methods.
Abstract: Today's data centers offer tremendous aggregate bandwidth to clusters of tens of thousands of machines. However, because of limited port densities in even the highest-end switches, data center topologies typically consist of multi-rooted trees with many equal-cost paths between any given pair of hosts. Existing IP multipathing protocols usually rely on per-flow static hashing and can cause substantial bandwidth losses due to long-term collisions. In this paper, we present Hedera, a scalable, dynamic flow scheduling system that adaptively schedules a multi-stage switching fabric to efficiently utilize aggregate network resources. We describe our implementation using commodity switches and unmodified hosts, and show that for a simulated 8,192 host data center, Hedera delivers bisection bandwidth that is 96% of optimal and up to 113% better than static load-balancing methods.

1,602 citations
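
The flavor of central, dynamic flow placement can be sketched with a toy greedy scheduler that assigns each large flow to the first equal-cost core path with enough spare capacity, falling back to default hashing when none fits. Link capacities, flow demands, and path names are invented; Hedera's actual demand estimation and placement heuristics are more elaborate than this sketch.

```python
# Toy sketch of central flow placement over equal-cost core paths. Link
# capacities, flow demands, and path names are invented for illustration.

LINK_CAPACITY = 10.0                       # Gb/s per core path (assumed)
core_load = {"core0": 0.0, "core1": 0.0, "core2": 0.0, "core3": 0.0}

# (flow id, estimated demand in Gb/s); a central scheduler works from
# estimated demands rather than instantaneous sending rates.
flows = [("f1", 4.0), ("f2", 6.0), ("f3", 3.0), ("f4", 5.0), ("f5", 2.0)]

placement = {}
for flow, demand in flows:
    # pick the first core path with enough spare capacity for this flow
    for core, load in core_load.items():
        if load + demand <= LINK_CAPACITY:
            core_load[core] += demand
            placement[flow] = core
            break
    else:
        placement[flow] = None             # no single path fits; leave to static hashing

print(placement)
print(core_load)
```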

Book
01 Jan 1996

1,170 citations

Journal ArticleDOI
TL;DR: An in-depth study of the existing literature on data center power modeling is presented, covering more than 200 models organized in a hierarchical structure with two main branches focusing on hardware-centric and software-centric power models.
Abstract: Data centers are critical, energy-hungry infrastructures that run large-scale Internet-based services. Energy consumption models are pivotal in designing and optimizing energy-efficient operations to curb excessive energy consumption in data centers. In this paper, we survey the state-of-the-art techniques used for energy consumption modeling and prediction for data centers and their components. We conduct an in-depth study of the existing literature on data center power modeling, covering more than 200 models. We organize these models in a hierarchical structure with two main branches focusing on hardware-centric and software-centric power models. Under hardware-centric approaches we start from the digital circuit level and move on to describe higher-level energy consumption models at the hardware component level, server level, data center level, and finally systems of systems level. Under the software-centric approaches we investigate power models developed for operating systems, virtual machines and software applications. This systematic approach allows us to identify multiple issues prevalent in power modeling of different levels of data center systems, including: i) few modeling efforts are targeted at power consumption of the entire data center; ii) many state-of-the-art power models are based on only a few CPU or server metrics; and iii) the effectiveness and accuracy of these power models remain open questions. Based on these observations, we conclude the survey by describing key challenges for future research on constructing effective and accurate data center power models.

741 citations
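
As a concrete example of the "few CPU or server metrics" class of models the survey flags, a widely used server-level model interpolates linearly between idle and peak power as a function of CPU utilization. The wattage figures and the facility-level PUE adjustment below are placeholder assumptions, not values from the paper.

```python
# Minimal sketch of a utilization-based server power model of the kind the
# survey classifies at the server level. Idle and peak figures are placeholders.

P_IDLE_W = 120.0    # server power at 0% CPU utilization (assumed)
P_PEAK_W = 300.0    # server power at 100% CPU utilization (assumed)

def server_power(cpu_util):
    """Linear power model: P(u) = P_idle + (P_peak - P_idle) * u, with u in [0, 1]."""
    u = min(max(cpu_util, 0.0), 1.0)
    return P_IDLE_W + (P_PEAK_W - P_IDLE_W) * u

# A rough facility-level estimate scales the IT load by a PUE factor
# (a common convention, used here only for illustration).
PUE = 1.5
servers_util = [0.10, 0.35, 0.80, 0.60]
it_power = sum(server_power(u) for u in servers_util)
print(f"IT load: {it_power:.0f} W, facility estimate: {it_power * PUE:.0f} W")
```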

01 Jan 2017
TL;DR: The 2017 roadmap on terahertz frequency electromagnetic radiation (100 GHz–30 THz) provides a snapshot of the present state of THz science and technology in 2017 and an opinion on the challenges and opportunities that the future holds.
Abstract: Science and technologies based on terahertz frequency electromagnetic radiation (100 GHz–30 THz) have developed rapidly over the last 30 years. For most of the 20th century, terahertz radiation, then referred to as sub-millimeter wave or far-infrared radiation, was mainly utilized by astronomers and some spectroscopists. Following the development of laser-based terahertz time-domain spectroscopy in the 1980s and 1990s, the field of THz science and technology expanded rapidly, to the extent that it now touches many areas from fundamental science to 'real world' applications. For example, THz radiation is being used to optimize materials for new solar cells, and may also be a key technology for the next generation of airport security scanners. While the field was emerging it was possible to keep track of all new developments; however, the field has now grown so much that it is increasingly difficult to follow the diverse range of new discoveries and applications that are appearing. At this point in time, when the field of THz science and technology is moving from an emerging to a more established and interdisciplinary field, it is apt to present a roadmap to help identify the breadth and future directions of the field. The aim of this roadmap is to present a snapshot of the present state of THz science and technology in 2017, and provide an opinion on the challenges and opportunities that the future holds. To be able to achieve this aim, we have invited a group of international experts to write 18 sections that cover most of the key areas of THz science and technology. We hope that The 2017 Roadmap on THz science and technology will prove to be a useful resource by providing a wide-ranging introduction to the capabilities of THz radiation for those outside or just entering the field, as well as providing perspective and breadth for those who are well established. We also feel that this review should serve as a useful guide for government and funding agencies.

690 citations