Author

Rezaul Chowdhury

Other affiliations: University of Texas at Austin, Boston University, Bangladesh University of Engineering and Technology

Bio: Rezaul Chowdhury is an academic researcher at Stony Brook University. The author has contributed to research on topics including caches and cache-oblivious algorithms. The author has an h-index of 17 and has co-authored 83 publications receiving 1419 citations. Previous affiliations of Rezaul Chowdhury include University of Texas at Austin and Boston University.

Papers published on a yearly basis (chart; years 1998 to 2023)

Papers

Journal Article•DOI•

The heap-mergesort

Rezaul Chowdhury¹, Suman Nath¹, Mohammad Kaykobad¹•Institutions (1)

Bangladesh University of Engineering and Technology¹

01 Apr 2000-Computers & Mathematics With Applications

TL;DR: A new mergesort algorithm which can sort $n = 2^{h+1} - 1$ elements using no more than $n \log_2(n+1) - \frac{13}{12}n - 1$ element comparisons in the worst case is presented.

Abstract: In this paper, we present a new mergesort algorithm which can sort $n = 2^{h+1} - 1$ elements using no more than $n \log_2(n+1) - \frac{13}{12}n - 1$ element comparisons in the worst case. This algorithm includes the heap (fine heap) creation phase as a pre-processing step, and for each internal node $v$, its left and right subheaps are merged into a sorted list of the elements under that node. Experimental results show that this algorithm requires only $n \log_2(n+1) - 1.2n$ element comparisons in the average case. But it requires extra space for $n$ LINK fields.
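
The idea of merging subheaps bottom-up can be illustrated with a minimal Python sketch. It is not the authors' implementation: it builds an ordinary max-heap rather than a fine heap, uses Python lists instead of LINK fields, and makes no attempt to match the stated comparison bound. The function names `sift_down`, `heap_mergesort`, and `worst_case_bound` are hypothetical.

```python
import math


def sift_down(a, i, n):
    """Restore the max-heap property for the subtree rooted at index i."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[largest]:
                largest = child
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_mergesort(values):
    """Sort by building a max-heap, then merging subheaps at every node.

    Assumption: a plain binary max-heap stands in for the paper's fine heap.
    """
    a = list(values)
    n = len(a)
    # Pre-processing step: bottom-up heap creation.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)

    def sorted_under(i):
        """Return the elements under node i in descending order."""
        if i >= n:
            return []
        left = sorted_under(2 * i + 1)
        right = sorted_under(2 * i + 2)
        # The node itself is the maximum of its subtree, so it goes first;
        # the two sorted sublists are then merged behind it.
        merged = [a[i]]
        li = ri = 0
        while li < len(left) and ri < len(right):
            if left[li] >= right[ri]:
                merged.append(left[li])
                li += 1
            else:
                merged.append(right[ri])
                ri += 1
        merged.extend(left[li:])
        merged.extend(right[ri:])
        return merged

    return sorted_under(0)[::-1]


def worst_case_bound(h):
    """Evaluate n*log2(n+1) - (13/12)*n - 1 for n = 2^(h+1) - 1."""
    n = 2 ** (h + 1) - 1
    return n * math.log2(n + 1) - (13 / 12) * n - 1
```

For example, `heap_mergesort([5, 3, 8, 1, 9, 2, 7])` returns the seven values in ascending order, and `worst_case_bound(2)` evaluates the abstract's worst-case formula for $n = 7$.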

Proceedings Article•DOI•

Poster: Polarization Energy on a Cluster of Multicores

Jesmin Jahan Tithi¹, Rezaul Chowdhury¹•Institutions (1)

Stony Brook University¹

10 Nov 2012

TL;DR: An octree-based hierarchical algorithm, built on a Greengard-Rokhlin type near and far decomposition of data points, which calculates the polarization energy of a molecule using the $r^6$ approximation of Generalized Born (GB) radii of atoms.

Abstract: When a molecule experiences an electric field, its charge distribution is relaxed in response to that field. The energy associated with this relaxation is known as the polarization energy. Computing the polarization energy between a ligand (i.e., a small molecule such as a drug molecule) and a receptor (e.g., a virus molecule) is of utmost importance in drug design, protein-protein docking, virus/bacterium cell analysis, and molecular dynamics simulations for determining the molecular conformation with minimal total free energy. We have implemented distributed-memory and distributed shared-memory parallel algorithms for approximating the polarization energy of a molecule by extending prior work for shared-memory (multicore) architectures. This is an octree-based hierarchical algorithm, built on a Greengard-Rokhlin type near and far decomposition of data points (i.e., atoms and points sampled from the molecular surface), which calculates the polarization energy of a molecule using the $r^6$ approximation of Generalized Born (GB) radii of atoms. Both Poisson-Boltzmann (PB) and Generalized Born (GB) models can be used for approximating polarization energy. However, due to its high computational cost, the PB method is rarely used for large molecules such as proteins.

Journal Article•DOI•

An Optimal Level-synchronous Shared-memory Parallel BFS Algorithm with Optimal parallel Prefix-sum Algorithm and its Implications for Energy Consumption

Jesmin Jahan Tithi, Yonatan Fogel, Rezaul Chowdhury

19 Sep 2022-arXiv.org

TL;DR: A work-efficient parallel level-synchronous Breadth First Search (BFS) algorithm for shared-memory architectures which achieves the theoretical lower bound on parallel running time; the optimality holds regardless of the shape of the graph.

Abstract: We present a work-efficient parallel level-synchronous Breadth First Search (BFS) algorithm for shared-memory architectures which achieves the theoretical lower bound on parallel running time. The optimality holds regardless of the shape of the graph. We also demonstrate the implication of this optimality for the energy consumption of the program empirically. The key idea is to never use more processing cores than necessary to complete the work in any computation step efficiently. We keep the rest of the cores idle to save energy and to reduce other resource contentions (e.g., bandwidth, shared caches, etc.). Our BFS uses neither locks nor atomic instructions and is easily extendible to shared-memory coprocessors.
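
As a rough illustration of the level-synchronous structure, the following serial Python sketch expands one frontier per step and uses a prefix sum over per-vertex neighbor counts to pack the next frontier into a contiguous array. It is only a sketch under stated assumptions: the loops run sequentially here, whereas the paper's algorithm runs them in parallel and chooses how many cores to use per step, which this code does not model. The name `level_synchronous_bfs` and the adjacency-list input format are hypothetical.

```python
from itertools import accumulate


def level_synchronous_bfs(adj, source):
    """Return BFS distances from source; unreachable vertices get -1.

    adj is assumed to be a list of adjacency lists indexed by vertex id.
    """
    dist = [-1] * len(adj)
    dist[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        # Step 1: every frontier vertex collects its undiscovered neighbors.
        # (In a parallel setting this loop is the per-level parallel step.)
        claimed = []
        for u in frontier:
            mine = []
            for v in adj[u]:
                if dist[v] == -1:
                    dist[v] = level
                    mine.append(v)
            claimed.append(mine)
        # Step 2: a prefix sum over the counts gives each frontier vertex
        # the offset at which it writes into the next frontier.
        counts = [len(mine) for mine in claimed]
        offsets = [0] + list(accumulate(counts))
        next_frontier = [0] * offsets[-1]
        for i, mine in enumerate(claimed):
            next_frontier[offsets[i]:offsets[i] + len(mine)] = mine
        frontier = next_frontier
    return dist
```

Because every frontier vertex knows its write offset in advance, the writes into `next_frontier` touch disjoint slots, which is the property that lets the packing step proceed without locks in a parallel implementation.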

Journal Article•DOI•

Fast Option Pricing using Nonlinear Stencils

Zafar Nazir Ahmad, Rezaul Chowdhury, Rathish Das, Yushen Huang, Yimin Zhu, +1 more

04 Mar 2023-arXiv.org

TL;DR: The binomial and Black-Scholes-Merton option pricing problems are transformed into nonlinear stencil computation problems and solved with FFT-based stencil algorithms that asymptotically improve work and span.

Abstract: We study the binomial option pricing model and the Black-Scholes-Merton pricing model. In the binomial option pricing model, we concentrate on two widely-used call options: (1) European and (2) American. Under the Black-Scholes-Merton model, we investigate pricing American put options. Our contributions are two-fold: First, we transform the option pricing problems into nonlinear stencil computation problems and present efficient algorithms to solve them. Second, using our new FFT-based nonlinear stencil algorithms, we improve the work and span asymptotically for the option pricing problems we consider. In particular, we perform $O(T\log^2 T)$ work for both American call and put option pricing, where $T$ is the number of time steps.
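
To see why binomial option pricing is a nonlinear stencil, consider the textbook backward-induction loop below. It is a baseline sketch, not the paper's FFT-based algorithm: it performs $O(T^2)$ work, and it assumes the standard Cox-Ross-Rubinstein parameterization. The function name `binomial_call_price` and all parameter names are hypothetical.

```python
import math


def binomial_call_price(spot, strike, rate, sigma, maturity, steps,
                        american=True, dividend_yield=0.0):
    """Price a call on a Cox-Ross-Rubinstein lattice by backward induction.

    Assumption: textbook quadratic-work baseline, shown only to expose the
    stencil structure; it is not the FFT-based method from the abstract.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    # Risk-neutral probability of an up move.
    p = (math.exp((rate - dividend_yield) * dt) - down) / (up - down)
    discount = math.exp(-rate * dt)

    # Payoffs at expiry; index j counts the number of up moves.
    values = [max(spot * up ** j * down ** (steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    for t in range(steps - 1, -1, -1):
        for j in range(t + 1):
            # Linear 2-point stencil: discounted expectation of the two
            # successor cells from time step t + 1.
            value = discount * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                # The max with the early-exercise payoff is what makes
                # the stencil nonlinear.
                exercise = spot * up ** j * down ** (t - j) - strike
                value = max(value, exercise)
            values[j] = value
    return values[0]
```

With `american=False` the update is a plain linear stencil; the `max` against the exercise value is the only nonlinear step, and it is the part that rules out applying a linear FFT-based stencil solver directly.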

Journal Article•DOI•

Cache-Oblivious Parallel Convex Hull in the Binary Forking Model

Rezaul Chowdhury, Shih-Yu Tsai, Yimin Zhu

17 May 2023-arXiv.org

TL;DR: The authors present two cache-oblivious sorting-based convex hull algorithms in the Binary Forking Model; the first achieves $O(n)$ work, $O(\log n)$ span, and $O(n/B)$ serial cache complexity, where $B$ is the cache line size.

Abstract: We present two cache-oblivious sorting-based convex hull algorithms in the Binary Forking Model. The first is an algorithm for a presorted set of points which achieves $O(n)$ work, $O(\log n)$ span, and $O(n/B)$ serial cache complexity, where $B$ is the cache line size. These are all optimal worst-case bounds for cache-oblivious algorithms in the Binary Forking Model. The second adapts Cole and Ramachandran's cache-oblivious sorting algorithm, matching its properties including achieving $O(n \log n)$ work, $O(\log n \log \log n)$ span, and $O(n/B \log_M n)$ serial cache complexity. Here $M$ is the size of the private cache.
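
For context on the presorted case, the following serial sketch computes a convex hull in $O(n)$ work once the points are sorted, using the well-known monotone chain scan. It is a swapped-in sequential technique, not the parallel cache-oblivious algorithm from the abstract, and it assumes the input points are distinct and already sorted lexicographically by $(x, y)$. The names `cross` and `convex_hull_presorted` are hypothetical.

```python
def cross(o, a, b):
    """Cross product of vectors OA and OB; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_presorted(points):
    """Return hull vertices in counter-clockwise order.

    Assumption: points are distinct (x, y) tuples sorted lexicographically.
    Serial monotone chain scan; linear work after sorting.
    """
    if len(points) <= 2:
        return list(points)

    def half_hull(seq):
        chain = []
        for p in seq:
            # Pop points that would make a clockwise or collinear turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(points)
    upper = half_hull(reversed(points))
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]
```

Each point is pushed and popped at most once per half, which gives the linear work; the span of this serial scan is also linear, unlike the $O(\log n)$ span the abstract reports for the parallel algorithm.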

Cited by

Fast parallel algorithms for short-range molecular dynamics

Steven J. Plimpton¹•Institutions (1)

Sandia National Laboratories¹

01 May 1993

TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.

Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently: those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers: the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 times faster than a single Cray C90 processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

29,323 citations

Book•

Computational geometry

F. Frances Yao

02 Jan 1991

1,377 citations

Proceedings Article•DOI•

Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines

Jonathan Ragan-Kelley¹, Connelly Barnes², Andrew Adams¹, Sylvain Paris², Frédo Durand¹, Saman Amarasinghe¹•Institutions (2)

Massachusetts Institute of Technology¹, Adobe Systems²

16 Jun 2013

TL;DR: Presents a systematic model of the tradeoff space fundamental to stencil pipelines, a schedule representation which describes concrete points in this space for each stage in an image processing pipeline, and an optimizing compiler for the Halide image processing language that synthesizes high performance implementations from a Halide algorithm and a schedule.

Abstract: Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Because of their complex structure, the performance difference between a naive implementation of a pipeline and an optimized one is often an order of magnitude. Efficient implementations require optimization of both parallelism and locality, but due to the nature of stencils, there is a fundamental tension between parallelism, locality, and introducing redundant recomputation of shared values. We present a systematic model of the tradeoff space fundamental to stencil pipelines, a schedule representation which describes concrete points in this space for each stage in an image processing pipeline, and an optimizing compiler for the Halide image processing language that synthesizes high performance implementations from a Halide algorithm and a schedule. Combining this compiler with stochastic search over the space of schedules enables terse, composable programs to achieve state-of-the-art performance on a wide range of real image processing pipelines, and across different hardware architectures, including multicores with SIMD, and heterogeneous CPU+GPU execution. From simple Halide programs written in a few hours, we demonstrate performance up to 5x faster than hand-tuned C, intrinsics, and CUDA implementations optimized by experts over weeks or months, for image processing applications beyond the reach of past automatic compilers.

1,074 citations

Journal Article•DOI•

End-Point Binding Free Energy Calculation with MM/PBSA and MM/GBSA: Strategies and Applications in Drug Design

Ercheng Wang¹, Huiyong Sun¹, Junmei Wang², Zhe Wang¹, Hui Liu¹, John Z. H. Zhang, Tingjun Hou¹•Institutions (2)

Zhejiang University¹, University of Pittsburgh²

24 Jun 2019-Chemical Reviews

TL;DR: In this review, methods to adjust the polar solvation energy and to improve the performance of MM/PBSA and MM/GBSA calculations are reviewed and discussed and guidance is provided for practically applying these methods in drug design and related research fields.

Abstract: Molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics generalized Born surface area (MM/GBSA) are arguably very popular methods for binding free energy prediction since they are more accurate than most scoring functions of molecular docking and less computationally demanding than alchemical free energy methods. MM/PBSA and MM/GBSA have been widely used in biomolecular studies such as protein folding, protein-ligand binding, protein-protein interaction, etc. In this review, methods to adjust the polar solvation energy and to improve the performance of MM/PBSA and MM/GBSA calculations are reviewed and discussed. The latest applications of MM/GBSA and MM/PBSA in drug design are also presented. This review intends to provide readers with guidance for practically applying MM/PBSA and MM/GBSA in drug design and related research fields.

822 citations

Journal Article•DOI•

Software for molecular docking: a review

Nataraj Sekhar Pagadala¹, Khajamohiddin Syed², Jack A. Tuszynski¹,³•Institutions (3)

University of Alberta¹, Central University of Technology², Cross Cancer Institute³

16 Jan 2017-Biophysical Reviews

TL;DR: Docking against homology-modeled targets also becomes possible for proteins whose structures are not known, and the druggability of the compounds and their specificity against a particular target can be calculated for further lead optimization processes.

Abstract: Molecular docking methodology explores the behavior of small molecules in the binding site of a target protein. As more protein structures are determined experimentally using X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy, molecular docking is increasingly used as a tool in drug discovery. Docking against homology-modeled targets also becomes possible for proteins whose structures are not known. With the docking strategies, the druggability of the compounds and their specificity against a particular target can be calculated for further lead optimization processes. Molecular docking programs perform a search algorithm in which the conformation of the ligand is evaluated recursively until the convergence to the minimum energy is reached. Finally, an affinity scoring function, ΔG [U total in kcal/mol], is employed to rank the candidate poses as the sum of the electrostatic and van der Waals energies. The driving forces for these specific interactions in biological systems aim toward complementarities between the shape and electrostatics of the binding site surfaces and the ligand or substrate.

817 citations
