Author

Xiaodong Zhang

Bio: Xiaodong Zhang is an academic researcher from the University of Texas MD Anderson Cancer Center. The author has contributed to research on topics including cache systems and proton therapy. The author has an h-index of 66 and has co-authored 390 publications receiving 16,459 citations. Previous affiliations of Xiaodong Zhang include the University of Texas at San Antonio and the College of William & Mary.


Papers
Proceedings ArticleDOI
01 Jun 2002
TL;DR: LIRS effectively addresses the limits of LRU by using recency to evaluate Inter-Reference Recency (IRR) when making replacement decisions; it significantly outperforms LRU and outperforms other existing replacement algorithms in most cases.
Abstract: Although the LRU replacement policy has been commonly used in buffer cache management, it is well known for its inability to cope with access patterns with weak locality. Previous work, such as LRU-K and 2Q, attempts to enhance LRU by making use of additional history information about previous block references beyond the recency information used in LRU. These algorithms greatly increase complexity and/or cannot consistently provide performance improvements. Many recently proposed policies, such as UBM and SEQ, improve replacement performance by exploiting access regularities in references, but they address LRU's problems only in certain specific and well-defined cases, such as sequential and looping access patterns. Motivated by the limits of previous studies, we propose an efficient buffer cache replacement policy called Low Inter-reference Recency Set (LIRS). LIRS effectively addresses the limits of LRU by using recency to evaluate Inter-Reference Recency (IRR) when making a replacement decision. This is in contrast to what LRU does: directly using recency to predict the next reference timing. At the same time, LIRS largely retains the simple assumptions LRU uses to predict the future access behavior of blocks. Our objectives are to effectively address the limits of LRU for general-purpose use, to retain the low-overhead merit of LRU, and to outperform replacement policies that rely on detecting access regularities. Conducting simulations with a variety of traces and a wide range of cache sizes, we show that LIRS significantly outperforms LRU and outperforms other existing replacement algorithms in most cases. Furthermore, we show that the additional cost of implementing LIRS is trivial in comparison with LRU.
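
To make the IRR idea concrete, here is a minimal Python sketch of an IRR-driven replacement policy. It is a simplification for illustration only, not the paper's algorithm: the actual LIRS maintains LIR/HIR block sets with a stack and a queue to achieve O(1) work per access, whereas this version recomputes IRR naively, and all names are hypothetical.

```python
# Simplified sketch: evict the resident block whose Inter-Reference Recency
# (IRR, the number of distinct blocks seen between its last two references)
# is largest. Real LIRS gets the same effect with LIR/HIR sets in O(1) time.

class SimplifiedIRRCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = set()       # blocks currently cached
        self.history = []           # global reference stream, most recent last
        self.irr = {}               # block -> IRR at its most recent reference

    def _recency(self, block):
        # Distinct blocks referenced since `block` was last referenced.
        seen = set()
        for b in reversed(self.history):
            if b == block:
                return len(seen)
            seen.add(b)
        return float("inf")         # never referenced before

    def access(self, block):
        hit = block in self.resident
        # The recency observed at a new reference is exactly the IRR of the
        # block's latest reference pair.
        self.irr[block] = self._recency(block)
        self.history.append(block)
        if not hit:
            if len(self.resident) >= self.capacity:
                victim = max(self.resident,
                             key=lambda b: self.irr.get(b, float("inf")))
                self.resident.discard(victim)
            self.resident.add(block)
        return hit
```

Under a looping access pattern mixed with one-off scans, this IRR-based choice keeps the looping blocks resident where LRU would evict them in favor of the scan.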

575 citations

Journal ArticleDOI
01 Aug 2013
TL;DR: Hadoop-GIS, a scalable and high-performance spatial data warehousing system for running large-scale spatial queries on Hadoop, is presented; it is integrated into Hive to support declarative spatial queries within an integrated architecture.
Abstract: Support for high-performance queries on large volumes of spatial data is becoming increasingly important in many application domains, including geospatial problems in numerous fields, location-based services, and emerging scientific applications that are increasingly data- and compute-intensive. The emergence of massive-scale spatial data is due to the proliferation of cost-effective and ubiquitous positioning technologies, the development of high-resolution imaging technologies, and contributions from a large number of community users. There are two major challenges in managing and querying massive spatial data: the explosion of spatial data and the high computational complexity of spatial queries. In this paper, we present Hadoop-GIS, a scalable and high-performance spatial data warehousing system for running large-scale spatial queries on Hadoop. Hadoop-GIS supports multiple types of spatial queries on MapReduce through spatial partitioning, the customizable spatial query engine RESQUE, implicit parallel spatial query execution on MapReduce, and effective methods for amending query results by handling boundary objects. Hadoop-GIS utilizes global partition indexing and customizable on-demand local spatial indexing to achieve efficient query processing. Hadoop-GIS is integrated into Hive to support declarative spatial queries within an integrated architecture. Our experiments have demonstrated the high efficiency of Hadoop-GIS in query response time and its high scalability on commodity clusters. Our comparative experiments have shown that the performance of Hadoop-GIS is on par with parallel SDBMSs and that it outperforms SDBMSs for compute-intensive queries. Hadoop-GIS is available as a set of libraries for processing spatial queries and as an integrated software package in Hive.
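
To illustrate the partition-then-amend idea, here is a small Python sketch assuming a uniform grid over rectangular objects. It is not the RESQUE engine or the Hive integration; the function names and the de-duplication step are illustrative of how objects that cross tile boundaries can be replicated into every overlapping tile and duplicate join pairs removed afterwards.

```python
# Hypothetical sketch of grid-based spatial partitioning with boundary-object
# handling. Rectangles are (xmin, ymin, xmax, ymax) tuples.

def tiles_for(rect, tile_size):
    """Yield every (tx, ty) grid tile overlapped by rect."""
    xmin, ymin, xmax, ymax = rect
    for tx in range(int(xmin // tile_size), int(xmax // tile_size) + 1):
        for ty in range(int(ymin // tile_size), int(ymax // tile_size) + 1):
            yield (tx, ty)

def partition(objects, tile_size):
    """Map phase: replicate each object into every tile it overlaps."""
    parts = {}
    for oid, rect in objects.items():
        for tile in tiles_for(rect, tile_size):
            parts.setdefault(tile, []).append((oid, rect))
    return parts

def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def spatial_join(parts_a, parts_b):
    """Per-tile join; the result set de-duplicates pairs of boundary objects
    that were found in more than one tile (the 'amending' step)."""
    result = set()
    for tile, objs_a in parts_a.items():
        for oid_a, ra in objs_a:
            for oid_b, rb in parts_b.get(tile, []):
                if intersects(ra, rb):
                    result.add((oid_a, oid_b))
    return result
```

In the real system each tile's join runs as an independent MapReduce task; the sketch's per-tile loop stands in for that parallel execution.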

571 citations

Proceedings ArticleDOI
15 Jun 2009
TL;DR: This study reveals several unanticipated aspects in the performance dynamics of SSD technology that must be addressed by system designers and data-intensive application users in order to effectively place it in the storage hierarchy.
Abstract: The flash-memory-based solid state drive (SSD) has been called a "pivotal technology" that could revolutionize data storage systems. Since the SSD shares a common interface with the traditional hard disk drive (HDD), both physically and logically, an effective integration of SSDs into the storage hierarchy is very important. However, details of SSD hardware implementations tend to be hidden behind such narrow interfaces. In fact, since sophisticated algorithms are usually, of necessity, adopted in SSD controller firmware, more complex performance dynamics are to be expected in SSDs than in HDD systems. Most existing literature and product specifications on SSDs provide only high-level descriptions and standard performance data, such as bandwidth and latency. In order to gain insight into the unique performance characteristics of SSDs, we have conducted intensive experiments and measurements on different types of state-of-the-art SSDs, from low-end to high-end products. We have observed several unexpected performance issues and uncertain behaviors of SSDs that have not been reported in the literature. For example, we found that fragmentation can seriously impact performance -- by a factor of more than 14 on a recently announced SSD. Moreover, contrary to the common belief that SSD performance is uncorrelated with access patterns, we found a strong correlation between performance and the randomness of data accesses, for both reads and writes. In the worst case, average latency could increase by a factor of 89 and bandwidth could drop to only 0.025 MB/sec. Our study reveals several unanticipated aspects of the performance dynamics of SSD technology that must be addressed by system designers and data-intensive application users in order to effectively place it in the storage hierarchy.
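
As a rough illustration of this kind of measurement, here is a hedged Python sketch comparing sequential and random 4 KiB read latency on a pre-created test file. The path, block size, and counts are illustrative assumptions, and a faithful benchmark would also bypass the OS page cache (e.g., O_DIRECT with aligned buffers), which this sketch omits for simplicity.

```python
import os
import random
import time

# Hypothetical microbenchmark sketch: per-read latency of sequential vs.
# random 4 KiB reads. PATH must point to a large file pre-created on the SSD
# under test. NOTE: without page-cache bypass the numbers reflect the cache,
# not the device.

PATH = "/path/to/testfile"   # assumption: large test file on the device
BLOCK = 4096
COUNT = 10_000

def avg_read_latency(offsets):
    fd = os.open(PATH, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for off in offsets:
            os.pread(fd, BLOCK, off)   # positioned read, no separate seek
        return (time.perf_counter() - start) / len(offsets)
    finally:
        os.close(fd)

total_blocks = os.path.getsize(PATH) // BLOCK
sequential = [i * BLOCK for i in range(min(COUNT, total_blocks))]
random_offs = [random.randrange(total_blocks) * BLOCK for _ in range(COUNT)]

print("sequential: %.1f us/read" % (avg_read_latency(sequential) * 1e6))
print("random:     %.1f us/read" % (avg_read_latency(random_offs) * 1e6))
```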

529 citations

Proceedings ArticleDOI
19 Oct 2005
TL;DR: An analysis of representative BitTorrent traffic provides several new findings regarding the limitations of BitTorrent systems: due to the exponentially decreasing peer arrival rate observed in practice, service availability in such systems degrades quickly, after which it becomes difficult for a file to be located and downloaded.
Abstract: Existing studies of BitTorrent systems are single-torrent based, while according to our trace analysis more than 85% of all peers participate in multiple torrents. In addition, these studies are not sufficiently insightful or accurate even for single-torrent models, due to some unrealistic assumptions. Our analysis of representative BitTorrent traffic provides several new findings regarding the limitations of BitTorrent systems: (1) Due to the exponentially decreasing peer arrival rate in reality, service availability in such systems degrades quickly, after which it is difficult for a file to be located and downloaded. (2) Client performance in BitTorrent-like systems is unstable and fluctuates widely with the peer population. (3) Existing systems can provide unfair services to peers, where peers with high downloading speeds tend to download more and upload less. In this paper, we study these limitations of torrent evolution in realistic environments. Motivated by the analysis and modeling results, we further build a graph-based multi-torrent model to study inter-torrent collaboration. Our model quantitatively provides strong motivation for inter-torrent collaboration instead of directly stimulating seeds to stay longer. We also discuss a system design to show the feasibility of multi-torrent collaboration.
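
To see why an exponentially decaying arrival rate degrades availability so quickly, here is a small Python simulation sketch. All parameters (initial rate, decay constant, peer lifetime, seed departure time) are illustrative assumptions, not values from the paper; arrivals are drawn from a non-homogeneous Poisson process via thinning.

```python
import math
import random

# Hypothetical parameters: peers arrive at rate LAMBDA0 * exp(-t / TAU)
# (peers/day), stay online SERVICE_TIME days, and the original seed departs
# at day SEED_LEAVES. The torrent is effectively dead at the first moment
# after the seed leaves when no peer is online.

LAMBDA0, TAU = 20.0, 10.0
SERVICE_TIME = 1.0
SEED_LEAVES = 30.0

def arrivals(horizon, rng):
    """Non-homogeneous Poisson arrival times via thinning (Lewis-Shedler)."""
    t, times = 0.0, []
    while True:
        t += rng.expovariate(LAMBDA0)            # candidate at the peak rate
        if t > horizon:
            return times
        if rng.random() < math.exp(-t / TAU):    # keep with prob lambda(t)/LAMBDA0
            times.append(t)

def death_time(arrival_times):
    covered_until = SEED_LEAVES                  # seed keeps the file available
    for a in sorted(arrival_times):
        if a > covered_until:
            return covered_until                 # gap: nobody online, torrent dies
        covered_until = max(covered_until, a + SERVICE_TIME)
    return covered_until

rng = random.Random(42)
print("torrent dies around day %.1f" % death_time(arrivals(365.0, rng)))
```

With these toy numbers the arrival rate falls below one peer per day by about day 30, so coverage gaps appear soon after the seed departs, matching the qualitative finding above.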

432 citations

Proceedings ArticleDOI
24 Oct 2008
TL;DR: This paper comprehensively evaluates several representative cache partitioning schemes with different optimization objectives, including performance, fairness, and quality of service (QoS), and provides new insights into dynamic behaviors and interaction effects.
Abstract: Cache partitioning and sharing are critical to the effective utilization of multicore processors. However, almost all existing studies have been evaluated by simulation, which often has several limitations, such as excessive simulation time, the absence of OS activities, and proneness to inaccuracy. To address these issues, we have taken an efficient software approach to supporting both static and dynamic cache partitioning in the OS through memory address mapping. We have comprehensively evaluated several representative cache partitioning schemes with different optimization objectives, including performance, fairness, and quality of service (QoS). Our software approach makes it possible to run the SPEC CPU2006 benchmark suite to completion. Besides confirming important conclusions from previous work, we are able to gain several insights from whole-program executions, which are infeasible to obtain from simulation. For example, giving up some cache space in one program to help another may improve the performance of both programs for certain workloads, due to reduced contention for memory bandwidth. Our evaluation of previously proposed fairness metrics also differs significantly from simulation-based studies. The contributions of this study are threefold. (1) To the best of our knowledge, this is a highly comprehensive execution- and measurement-based study of multicore cache partitioning. This paper not only confirms important conclusions from simulation-based studies, but also provides new insights into dynamic behaviors and interaction effects. (2) Our approach provides a unique and efficient option for evaluating multicore cache partitioning. The implemented software layer can be used as a tool in multicore performance evaluation and hardware design. (3) The proposed schemes can be further refined in OS kernels to improve performance.
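
The "memory address mapping" technique here is commonly realized as page coloring: on a physically indexed cache, part of the set index comes from the physical page number, so restricting a process to pages of certain "colors" confines it to a slice of the shared cache. The following Python sketch shows the arithmetic; the cache geometry (4 MiB, 16-way, 64-byte lines, 4 KiB pages) is an illustrative assumption, not the paper's platform.

```python
# Page-coloring sketch: compute colors for an assumed cache geometry and
# allocate physical pages by color to statically partition the cache.

PAGE_SIZE  = 4096
LINE_SIZE  = 64
CACHE_SIZE = 4 * 1024 * 1024
WAYS       = 16

sets_total    = CACHE_SIZE // (LINE_SIZE * WAYS)   # 4096 sets
sets_per_page = PAGE_SIZE // LINE_SIZE             # 64 consecutive sets per page
num_colors    = sets_total // sets_per_page        # 64 colors

def color_of(phys_page_number):
    """The low-order page-number bits select the page's cache color."""
    return phys_page_number % num_colors

def allocate(free_pages, allowed_colors):
    """Static partitioning: give a process only pages of its assigned colors."""
    return [p for p in free_pages if color_of(p) in allowed_colors]

free_pages = list(range(100000, 100512))
pages_a = allocate(free_pages, set(range(16)))              # 1/4 of the cache
pages_b = allocate(free_pages, set(range(16, num_colors)))  # remaining 3/4
print(len(pages_a), len(pages_b))                           # 128 384
```

Under such a scheme, dynamic repartitioning would amount to recoloring, i.e., migrating, pages at run time, which is the more expensive case a software layer like this must also handle.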

382 citations


Cited by
01 Jan 2014
TL;DR: Lymphedema is a common complication after treatment for breast cancer; factors associated with an increased risk of lymphedema include the extent of axillary surgery, axillary radiation, infection, and patient obesity.

1,988 citations

Journal ArticleDOI
01 Apr 2011 - Science
TL;DR: An inventory of the world's technological capacity from 1986 to 2007 reveals the evolution from analog to digital technologies; the majority of the world's technological memory has been in digital format since the early 2000s.
Abstract: We estimated the world's technological capacity to store, communicate, and compute information, tracking 60 analog and digital technologies during the period from 1986 to 2007. In 2007, humankind was able to store 2.9 × 10^20 optimally compressed bytes, communicate almost 2 × 10^21 bytes, and carry out 6.4 × 10^18 instructions per second on general-purpose computers. General-purpose computing capacity grew at an annual rate of 58%. The world's capacity for bidirectional telecommunication grew at 28% per year, closely followed by the increase in globally stored information (23%). Humankind's capacity for unidirectional information diffusion through broadcasting channels has experienced comparatively modest annual growth (6%). Telecommunication has been dominated by digital technologies since 1990 (99.9% in digital format in 2007), and the majority of our technological memory has been in digital format since the early 2000s (94% digital in 2007).
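
As a quick sanity check of what these compound annual rates imply over the 21-year window, a few lines of Python (the rates come from the abstract; the multipliers are derived from them):

```python
# Compound growth over 1986-2007 (21 years) at the annual rates quoted above.
rates = {"computation": 0.58, "telecom": 0.28, "storage": 0.23, "broadcast": 0.06}
for name, r in rates.items():
    print("%-11s grew roughly %.0fx over 21 years" % (name, (1 + r) ** 21))
# computation ~15,000x, telecom ~180x, storage ~77x, broadcast ~3x
```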

1,450 citations

Journal ArticleDOI
TL;DR: This text is a general introduction to radiation biology and a complete, self-contained course following the RSNA Syllabus in Radiation Biology, intended especially for residents in diagnostic radiology and nuclear medicine.
Abstract: The text consists of two sections, one for those studying or practicing diagnostic radiology, nuclear medicine and radiation oncology; the other for those engaged in the study or clinical practice of radiation oncology--a new chapter, on radiologic terrorism, is specifically for those in the radiation sciences who would manage exposed individuals in the event of a terrorist event. The 17 chapters in Section I represent a general introduction to radiation biology and a complete, self-contained course especially for residents in diagnostic radiology and nuclear medicine that follows the Syllabus in Radiation Biology of the RSNA. The 11 chapters in Section II address more in-depth topics in radiation oncology, such as cancer biology, retreatment after radiotherapy, chemotherapeutic agents and hyperthermia.

1,359 citations

Book
01 Jan 1996

1,170 citations

Book
05 Mar 2012
TL;DR: Computer Networking: A Top-Down Approach Featuring the Internet explains the engineering problems inherent in communicating digital information from point to point; it presents the mathematics that determine the best path, shows code that implements those algorithms, and illustrates the logic with clear conceptual diagrams.
Abstract: Certain data-communication protocols hog the spotlight, but all of them have a lot in common. Computer Networking: A Top-Down Approach Featuring the Internet explains the engineering problems that are inherent in communicating digital information from point to point. The top-down approach mentioned in the subtitle means that the book starts at the top of the protocol stack -- at the application layer -- and works its way down through the other layers, until it reaches bare wire. The authors, for the most part, shun the well-known seven-layer Open Systems Interconnection (OSI) protocol stack in favor of their own five-layer (application, transport, network, link, and physical) model. It's an effective approach that helps clear away some of the hand-waving traditionally associated with the more obtuse layers in the OSI model. The approach is definitely theoretical -- don't look here for instructions on configuring Windows 2000 or a Cisco router -- but it's relevant to reality, and should help anyone who needs to understand networking as a programmer, system architect, or even administration guru.

The treatment of the network layer, at which routing takes place, is typical of the overall style. In discussing routing, authors James Kurose and Keith Ross explain (by way of lots of clear, definition-packed text) what routing protocols need to do: find the best route to a destination. Then they present the mathematics that determine the best path, show some code that implements those algorithms, and illustrate the logic by using excellent conceptual diagrams. Real-life implementations of the algorithms -- including Internet Protocol (both IPv4 and IPv6) and several popular IP routing protocols -- help you make the transition from pure theory to networking technologies. --David Wall

Topics covered: The theory behind data networks, with thorough discussion of the problems posed at each level (the application layer gets plenty of attention). For each layer, there is academic coverage of networking problems and solutions, followed by discussion of real technologies. Special sections deal with network security and the transmission of digital multimedia.
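
As an illustration of the "mathematics that determine the best path" the review mentions, here is a minimal Python sketch of Dijkstra's shortest-path algorithm, the core of link-state routing protocols such as OSPF. The toy topology is an invented example, not taken from the book.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source in a weighted graph given as
    {node: {neighbor: cost}}, using a min-heap of (distance, node) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry, already improved
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

net = {"u": {"v": 2, "x": 1}, "v": {"u": 2, "x": 3, "w": 3},
       "x": {"u": 1, "v": 3, "w": 1}, "w": {"v": 3, "x": 1}}
print(dijkstra(net, "u"))   # {'u': 0, 'v': 2, 'x': 1, 'w': 2}
```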

1,079 citations