Author

Ju Wang

Other affiliations: University of California, San Diego

Bio: Ju Wang is an academic researcher at Microsoft. The author has contributed to research on the topics of virtual machines and scalability. The author has an h-index of 20 and has co-authored 44 publications receiving 1944 citations. Previous affiliations of Ju Wang include the University of California, San Diego.

Papers

Proceedings Article

Windows Azure Storage: a highly available cloud storage service with strong consistency

Brad Calder¹, Ju Wang¹, Aaron W. Ogus¹, Niranjan Nilakantan¹, Arild E. Skjolsvold¹, Sam McKelvie¹, Yikang Xu¹, Shashwat Srivastav¹, Jiesheng Wu¹, Huseyin Simitci¹, Jaidev Haridas¹, Chakravarthy Uddaraju¹, Hemal Khatri¹, Andrew James Edwards¹, Vaman Bedekar¹, Mainali Shane Kumar¹, Rafay Abbasi¹, Arpit Agarwal¹, Mian Fahim ul Haq¹, Muhammad Ikram ul Haq¹, Deepali Bhardwaj¹, Sowmya Dayanand¹, Anitha Adusumilli¹, Marvin McNett¹, Sriram Sankaran¹, Kavitha Manivannan¹, Leonidas Rigas¹

Microsoft¹

23 Oct 2011

TL;DR: The WAS architecture, global namespace, and data model are described, as well as its resource provisioning, load balancing, and replication systems.

Abstract: Windows Azure Storage (WAS) is a cloud storage system that provides customers the ability to store seemingly limitless amounts of data for any duration of time. WAS customers have access to their data from anywhere at any time and only pay for what they use and store. In WAS, data is stored durably using both local and geographic replication to facilitate disaster recovery. Currently, WAS storage comes in the form of Blobs (files), Tables (structured storage), and Queues (message delivery). In this paper, we describe the WAS architecture, global namespace, and data model, as well as its resource provisioning, load balancing, and replication systems.

871 citations

Patent

Partition management in a partitioned, scalable, and available structured storage

Bradley Gene Calder¹, Ju Wang¹, Arild E. Skjolsvold¹, Shashwat Srivastav¹, Niranjan Nilakantan¹, Deepali Bhardwaj¹

Microsoft¹

23 Oct 2009

TL;DR: A table is partitioned into a number of partitions, each including a contiguous range of rows; the partitions are served by table servers and managed by a table master.

Abstract: Partition management for a scalable, structured storage system is provided. The storage system provides storage represented by one or more tables, each of which includes rows that represent data entities. A table is partitioned into a number of partitions, each partition including a contiguous range of rows. The partitions are served by table servers and managed by a table master. Load distribution information for the table servers and partitions is tracked, and the table master determines to split and/or merge partitions based on the load distribution information.

126 citations
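
The split and merge decision described in the partition management abstract above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Partition` class, the load thresholds, the midpoint split rule, and the halved-load estimate are assumptions made for illustration, not the method claimed in the patent.

```python
from dataclasses import dataclass

# Hypothetical thresholds (requests per second); the source gives no numbers.
SPLIT_LOAD = 1000.0
MERGE_LOAD = 100.0


@dataclass
class Partition:
    """A contiguous range of row keys [low, high) and its observed load."""
    low: int
    high: int
    load: float


def rebalance(partitions: list[Partition]) -> list[Partition]:
    """One pass of a table-master-style split/merge decision.

    Hot partitions are split at the key midpoint (assumed rule, with the
    load halved as a crude estimate); a cold partition is merged into its
    left neighbor when the ranges are adjacent and the combined load is low.
    """
    result: list[Partition] = []
    for p in sorted(partitions, key=lambda q: q.low):
        if p.load > SPLIT_LOAD and p.high - p.low > 1:
            mid = (p.low + p.high) // 2
            result.append(Partition(p.low, mid, p.load / 2))
            result.append(Partition(mid, p.high, p.load / 2))
        elif (result and result[-1].high == p.low
              and result[-1].load + p.load < MERGE_LOAD):
            prev = result.pop()
            result.append(Partition(prev.low, p.high, prev.load + p.load))
        else:
            result.append(p)
    return result
```

Keeping each partition a contiguous key range is what makes both operations cheap to describe: a split only picks a boundary key, and a merge only requires that two ranges touch.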

Patent

PaaS hierarchical scheduling and auto-scaling

Bradley Gene Calder¹, Ju Wang¹, Vaman Bedekar¹, Sriram Sankaran¹, Marvin McNett II¹, Pradeep Kumar Gunda¹, Yang Zhang¹, Shyam Antony¹, Kavitha Manivannan¹, Arild E. Skjolsvold¹, Hemal Khatri¹

Microsoft¹

28 Dec 2012

TL;DR: A platform as a service (PaaS) provides resources in a distributed computing environment to perform a job; the system may comprise components such as a task machine, a task location service machine, and a high-level location service machine.

Abstract: In various embodiments, systems and methods are presented for providing resources by way of a platform as a service in a distributed computing environment to perform a job. The system may be comprised of a number of components, such as a task machine, a task location service machine, and a high-level location service machine, that in combination are usable to accomplish functions provided herein. It is contemplated that the system performs methods for providing resources by determining resources of the system, such as virtual machines, and applying auto-scaling rules to the system to scale those resources. Based on the determination of the auto-scaling rules, the resources may be allocated to achieve a desired result.

109 citations
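
As a rough illustration of what "applying auto-scaling rules" to a pool of virtual machines could look like, here is a minimal sketch. The rule itself (size the pool to the pending task count, clamped to pool limits), the function names, and all default values are assumptions for illustration; the abstract above does not specify any particular rule.

```python
import math


def target_vm_count(pending_tasks: int, tasks_per_vm: int,
                    min_vms: int, max_vms: int) -> int:
    """Hypothetical rule: enough VMs to cover pending tasks, clamped to pool limits."""
    if tasks_per_vm <= 0:
        raise ValueError("tasks_per_vm must be positive")
    wanted = math.ceil(pending_tasks / tasks_per_vm)
    return max(min_vms, min(max_vms, wanted))


def scaling_delta(current_vms: int, pending_tasks: int, tasks_per_vm: int = 4,
                  min_vms: int = 1, max_vms: int = 20) -> int:
    """Positive result: allocate that many VMs; negative: release that many."""
    target = target_vm_count(pending_tasks, tasks_per_vm, min_vms, max_vms)
    return target - current_vms
```

Separating the target computation from the delta keeps the rule itself stateless, so the same rule can be evaluated against any pool.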

Patent

Decoupling PaaS resources, jobs, and scheduling

Microsoft¹

07 Jan 2013

TL;DR: Resources, jobs, and job schedulers in a platform as a service (PaaS) are decoupled so that a job can migrate among pools of resources without human intervention.

Abstract: Systems and methods are presented for providing resources by way of a platform as a service in a distributed computing environment to perform a job. Resources of the system, jobs performing on the system, and schedulers of the jobs performing on the system are decoupled in a manner that allows a job to easily migrate among resources. It is contemplated that the migration of jobs from a first pool of resources to a second pool of resources is performed by the system without human intervention. The migration of a job may utilize different schedulers for the different resources. Further, it is contemplated that a pool of resources may automatically allocate additional or fewer resources in response to a migration of a job.

96 citations

Patent

Platform as a service job scheduling

Microsoft¹

09 Jan 2012

TL;DR: A system provides resources by way of a platform as a service (PaaS) in a distributed computing environment to perform a job; a user may submit a work item that results in a job being processed on a pool of virtual machines.

Abstract: Systems and methods are presented for providing resources by way of a platform as a service in a distributed computing environment to perform a job. A user may submit a work item to the system that results in a job being processed on a pool of virtual machines. The pool may be automatically established by the system in response to the work item and other information associated with the work item, the user, and/or the account. Further, it is contemplated that resources associated with the pool, such as virtual machines, may be automatically allocated based, at least in part, on information associated with the work item, the user, the account, the pool, and/or the system.

91 citations

Cited by

Proceedings Article

Random graphs

Alan Frieze¹

Carnegie Mellon University¹

22 Jan 2006

TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.

Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Proceedings Article

Erasure coding in Windows Azure Storage

Cheng Huang¹, Huseyin Simitci¹, Yikang Xu¹, Aaron W. Ogus¹, Brad Calder¹, Parikshit Gopalan¹, Jin Li¹, Sergey Yekhanin¹

Microsoft¹

13 Jun 2012

TL;DR: This paper describes how LRC is used in WAS to provide low overhead durable storage with consistently low read latencies, and introduces a new set of codes for erasure coding called Local Reconstruction Codes (LRC).

Abstract: Windows Azure Storage (WAS) is a cloud storage system that provides customers the ability to store seemingly limitless amounts of data for any duration of time. WAS customers have access to their data from anywhere, at any time, and only pay for what they use and store. To provide durability for that data and to keep the cost of storage low, WAS uses erasure coding. In this paper we introduce a new set of codes for erasure coding called Local Reconstruction Codes (LRC). LRC reduces the number of erasure coding fragments that need to be read when reconstructing data fragments that are offline, while still keeping the storage overhead low. The important benefits of LRC are that it reduces the bandwidth and I/Os required for repair reads over prior codes, while still allowing a significant reduction in storage overhead. We describe how LRC is used in WAS to provide low overhead durable storage with consistently low read latencies.

1,002 citations
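
The local-repair idea in the LRC abstract above, reading fewer fragments to rebuild one that is offline, can be shown with a simplified sketch. It is an illustration under stated assumptions, not the code from the paper: it keeps only one XOR parity per local group and omits the global parities that an actual LRC also carries, and the function names and group size are hypothetical.

```python
from functools import reduce


def xor_bytes(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def encode_local_parities(data, group_size):
    """Split data fragments into local groups; compute one XOR parity per group."""
    groups = [data[i:i + group_size] for i in range(0, len(data), group_size)]
    return [xor_bytes(g) for g in groups]


def repair(data, parities, lost, group_size):
    """Rebuild fragment `lost` by reading only its own local group.

    Returns the rebuilt fragment and the number of fragments read.
    `data[lost]` itself is never read.
    """
    g = lost // group_size
    start = g * group_size
    end = min(start + group_size, len(data))
    survivors = [data[i] for i in range(start, end) if i != lost]
    reads = survivors + [parities[g]]
    return xor_bytes(reads), len(reads)
```

With six data fragments in two groups of three, a single lost fragment is rebuilt from three reads (two surviving group members plus the local parity) instead of reading across all the data fragments.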

Proceedings Article

Naiad: a timely dataflow system

Derek G. Murray¹, Frank McSherry¹, Rebecca Isaacs¹, Michael Isard¹, Paul Barham¹, Martín Abadi¹

Microsoft¹

03 Nov 2013

TL;DR: It is shown that many powerful high-level programming models can be built on Naiad's low-level primitives, enabling such diverse tasks as streaming data analysis, iterative machine learning, and interactive graph mining.

Abstract: Naiad is a distributed system for executing data parallel, cyclic dataflow programs. It offers the high throughput of batch processors, the low latency of stream processors, and the ability to perform iterative and incremental computations. Although existing systems offer some of these features, applications that require all three have relied on multiple platforms, at the expense of efficiency, maintainability, and simplicity. Naiad resolves the complexities of combining these features in one framework. A new computational model, timely dataflow, underlies Naiad and captures opportunities for parallelism across a wide class of algorithms. This model enriches dataflow computation with timestamps that represent logical points in the computation and provide the basis for an efficient, lightweight coordination mechanism. We show that many powerful high-level programming models can be built on Naiad's low-level primitives, enabling such diverse tasks as streaming data analysis, iterative machine learning, and interactive graph mining. Naiad outperforms specialized systems in their target application domains, and its unique features enable the development of new high-performance applications.

779 citations

Journal Article

Big Data computing and clouds

Marcos Dias De Assuncao¹, Rodrigo N. Calheiros², Silvia Cristina Sardela Bianchi³, Marco A. S. Netto³, Rajkumar Buyya²

École normale supérieure de Lyon¹, University of Melbourne², IBM³

01 May 2015 - Journal of Parallel and Distributed Computing

TL;DR: This paper discusses approaches and environments for carrying out analytics on Clouds for Big Data applications, and identifies possible gaps in technology and provides recommendations for the research community on future directions on Cloud-supported Big Data computing and analytics solutions.

773 citations

Journal Article

XORing elephants: novel erasure codes for big data

Maheswaran Sathiamoorthy¹, Megasthenis Asteris¹, Dimitris S. Papailiopoulos², Alexandros G. Dimakis², Ramkumar Vadali³, Scott Chen³, Dhruba Borthakur³

University of Southern California¹, University of Texas at Austin², Facebook³

01 Mar 2013

TL;DR: The authors present a family of erasure codes that are efficiently repairable and offer higher reliability than Reed-Solomon codes, the standard design choice whose high repair cost is often considered an unavoidable price to pay for high storage efficiency and high reliability.

Abstract: Distributed storage systems for large clusters typically use replication to provide reliability. Recently, erasure codes have been used to reduce the large storage overhead of three-replicated systems. Reed-Solomon codes are the standard design choice and their high repair cost is often considered an unavoidable price to pay for high storage efficiency and high reliability. This paper shows how to overcome this limitation. We present a novel family of erasure codes that are efficiently repairable and offer higher reliability compared to Reed-Solomon codes. We show analytically that our codes are optimal on a recently identified tradeoff between locality and minimum distance. We implement our new codes in Hadoop HDFS and compare to a currently deployed HDFS module that uses Reed-Solomon codes. Our modified HDFS implementation shows a reduction of approximately 2× on the repair disk I/O and repair network traffic. The disadvantage of the new coding scheme is that it requires 14% more storage compared to Reed-Solomon codes, an overhead shown to be information theoretically optimal to obtain locality. Because the new codes repair failures faster, this provides higher reliability, which is orders of magnitude higher compared to replication.

742 citations
