Author

Ludmila Cherkasova

Other affiliations: Russian Academy of Sciences
Bio: Ludmila Cherkasova is an academic researcher from Hewlett-Packard. The author has contributed to research in topics including Workload and Capacity planning, has an h-index of 44, and has co-authored 210 publications receiving 5,706 citations. Previous affiliations of Ludmila Cherkasova include the Russian Academy of Sciences.


Papers
Patent
29 Nov 2001
TL;DR: In this patent, the admission controller relays to the server the messages in the stream that correspond to sessions already underway between the clients and the server, and relays messages that would open new sessions only if a hybrid, predictive admission control strategy using information provided by a resource monitor indicates that additional sessions can be handled by the server.
Abstract: An admission control system for a server including an admission controller that receives a stream of messages from one or more clients targeted for the server. The admission controller relays to the server the messages in the stream that correspond to a number of sessions already underway between the clients and the server. The admission controller also relays to the server the messages in the stream that do not correspond to sessions already underway if a hybrid and predictive admission control strategy using information provided by a resource monitor indicates that additional sessions can be handled by the server. The admission controller defers the messages otherwise.

152 citations
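
The patent describes this admission policy only at a high level. The following Python sketch is one plausible reading of it, not the patented design itself: the resource-monitor interface, the utilization threshold, and the message fields are invented for illustration.

```python
# Hedged sketch of the admission-control idea in the patent above:
# messages for sessions already underway are always relayed, while messages
# that would open a new session are relayed only when a resource monitor
# predicts spare capacity; everything else is deferred. The monitor API and
# the 0.85 threshold are assumptions, not part of the patent.

class AdmissionController:
    def __init__(self, server, monitor, max_predicted_util=0.85):
        self.server = server
        self.monitor = monitor            # assumed to report predicted utilization in [0, 1]
        self.max_predicted_util = max_predicted_util
        self.active_sessions = set()      # ids of sessions already admitted
        self.deferred = []                # messages held back, not dropped

    def handle(self, message):
        sid = message["session_id"]
        if sid in self.active_sessions:
            # Traffic of sessions already underway is always relayed.
            self.server.process(message)
        elif self.monitor.predicted_utilization() < self.max_predicted_util:
            # A new session is admitted only if the monitor predicts headroom.
            self.active_sessions.add(sid)
            self.server.process(message)
        else:
            # Otherwise the message is deferred.
            self.deferred.append(message)
```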

Patent
27 Apr 2012
TL;DR: In this patent, the authors present a method for estimating the resource costs required to process a workload using at least two different cloud computing models: t-shirt and time-sharing.
Abstract: At least one embodiment is a method for estimating the resource costs required to process a workload using at least two different cloud computing models. Historical trace data of at least one completed workload that is similar to the workload to be completed is received by the computer. The processing of the completed workload is simulated using a t-shirt cloud computing model and a time-sharing model. The t-shirt and time-sharing resource costs are estimated based on their respective simulations. The t-shirt and time-sharing resource costs are then compared.

118 citations
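
A rough sense of the comparison the patent describes can be given with a toy replay of a demand trace under the two pricing models; the instance sizes, prices, and trace values below are made-up assumptions, not figures from the patent.

```python
# Toy cost comparison: replay an hourly CPU-demand trace under (a) a "t-shirt"
# model, where a fixed instance size is billed every hour regardless of use,
# and (b) a time-sharing model, where only consumed core-hours are billed.
# All sizes, prices, and trace values are illustrative assumptions.

TSHIRT_SIZES = {"small": (2, 0.10), "medium": (4, 0.20), "large": (8, 0.40)}  # (cores, $/hour)
TIME_SHARING_PRICE = 0.06  # assumed $ per core-hour actually consumed

def tshirt_cost(trace):
    """Bill the cheapest fixed size whose capacity covers the peak demand."""
    peak = max(trace)
    name, (_cores, price) = min(
        ((n, spec) for n, spec in TSHIRT_SIZES.items() if spec[0] >= peak),
        key=lambda item: item[1][1],
    )
    return len(trace) * price, name

def time_sharing_cost(trace):
    """Bill only the core-hours actually consumed."""
    return sum(trace) * TIME_SHARING_PRICE

trace = [1.0, 1.5, 3.5, 2.0, 0.5, 0.5]          # cores used in each hour
fixed_cost, size = tshirt_cost(trace)
metered_cost = time_sharing_cost(trace)
print(f"t-shirt ({size}): ${fixed_cost:.2f}  vs  time-sharing: ${metered_cost:.2f}")
```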

Journal ArticleDOI
TL;DR: This paper describes an automated capacity and workload management system that integrates multiple resource controllers at three different scopes and time scales and confirms that such an integrated solution ensures efficient and effective use of data center resources while reducing service level violations for high priority applications.
Abstract: Recent advances in hardware and software virtualization offer unprecedented management capabilities for the mapping of virtual resources to physical resources. It is highly desirable to further create a "service hosting abstraction" that allows application owners to focus on service level objectives (SLOs) for their applications. This calls for a resource management solution that achieves the SLOs for many applications in response to changing data center conditions and hides the complexity from both application owners and data center operators. In this paper, we describe an automated capacity and workload management system that integrates multiple resource controllers at three different scopes and time scales. Simulation and experimental results confirm that such an integrated solution ensures efficient and effective use of data center resources while reducing service level violations for high priority applications.

116 citations
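
The paper presents the architecture at a high level; the sketch below only illustrates the general shape of such an integration, with three controllers running at different scopes and time scales. The controller names, intervals, and (empty) actions are assumptions for illustration and do not reproduce the paper's actual designs.

```python
# Illustrative skeleton: three resource controllers integrated at different
# scopes and time scales, each re-run at its own cadence. The intervals and
# controller responsibilities are assumed, not taken from the paper.

import sched, time

def node_controller():
    """Fast loop (seconds): re-divide resource shares among workloads on one host."""

def pod_controller():
    """Medium loop (minutes): migrate workloads between hosts in a pod."""

def capacity_planner():
    """Slow loop (hours): re-plan how much capacity each application group needs."""

scheduler = sched.scheduler(time.time, time.sleep)

def every(interval_s, fn):
    """Re-schedule fn to run every interval_s seconds."""
    def run():
        fn()
        scheduler.enter(interval_s, 1, run)
    scheduler.enter(interval_s, 1, run)

every(10, node_controller)       # narrowest scope, fastest time scale
every(300, pod_controller)       # wider scope, slower time scale
every(3600, capacity_planner)    # whole-pool scope, slowest time scale
# scheduler.run()  # would block and keep cycling the three controllers
```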

Patent
27 Mar 1998
TL;DR: In this patent, the server stores incoming requests in a queue and determines a priority value for each request; the priority value is the sum of a counter value and a cost value, the cost value being monotonically related to the quantity of server resources needed to service the request.
Abstract: A method for operating a server on a computer network to supply data stored on the server in response to requests received on the network. The server stores the requests in a queue and determines a priority value for each request. The priority value includes the sum of a counter value and a cost value, the cost value being monotonically related to the quantity of server resources needed to service the request. When the server selects one of the requests stored in the queue for servicing, it picks the request having the lowest priority value of the requests stored in the queue. The selected request is removed from the queue and the counter value is incremented by a value proportional to the cost value associated with the request selected for servicing. In one embodiment, the cost value is proportional to the length of a file specified in the received request. In one embodiment, one of the received requests also includes information specifying a class for the request. The server also determines a maximum priority value for that class, the maximum priority value being at least as great as the priority value having the highest value for any request of that class currently stored in the queue. The server compares the determined priority value for the received request with the maximum priority value and changes the determined priority value to a value greater than the maximum value if the determined priority value was less than or equal to the maximum priority value.

112 citations
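
The abstract above already specifies the core of the scheduling rule, so a compact sketch is possible; the heap layout, the per-byte cost constant, and the proportionality factor below are implementation assumptions, and the per-class maximum-priority adjustment from the patent is omitted.

```python
# Sketch of the queueing rule in the patent above: a request's priority is the
# counter value at enqueue time plus a cost that grows with the resources it
# needs (here, proportional to file length); the pending request with the
# lowest priority is served next, and the counter then advances in proportion
# to that request's cost. Constants and data layout are assumptions.

import heapq

class CostBasedScheduler:
    def __init__(self, cost_per_byte=1.0, counter_step=1.0):
        self.counter = 0.0
        self.cost_per_byte = cost_per_byte
        self.counter_step = counter_step     # proportionality factor (assumed)
        self.queue = []                      # heap of (priority, seq, cost, request)
        self.seq = 0                         # tie-breaker for equal priorities

    def enqueue(self, request):
        cost = self.cost_per_byte * request["file_length"]
        priority = self.counter + cost
        heapq.heappush(self.queue, (priority, self.seq, cost, request))
        self.seq += 1

    def next_request(self):
        if not self.queue:
            return None
        _priority, _, cost, request = heapq.heappop(self.queue)
        # The counter advances with the cost just served, so later arrivals are
        # enqueued on a higher base and older, expensive requests are not starved.
        self.counter += self.counter_step * cost
        return request
```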

Proceedings ArticleDOI
12 Jul 2005
TL;DR: The Quartermaster capacity manager service implements a trace-based technique that models workload resource demands, their corresponding resource allocations, and resource access quality of service, and is significantly more accurate at estimating per-server required capacity than a benchmark method used in practice to manage a resource pool.
Abstract: Resource pools are computing environments that offer virtualized access to shared resources. When used effectively they can align the use of capacity with business needs (flexibility), lower infrastructure costs (via resource sharing), and lower operating costs (via automation). This paper describes the Quartermaster capacity manager service for managing such pools. It implements a trace-based technique that models workload (e.g., application) resource demands, their corresponding resource allocations, and resource access quality of service. The primary advantages of the technique are its accuracy, generality, support for resource access qualities of service, and optimizing search method. We pose general capacity management questions for resource pools and explain how the capacity manager helps to address them in an automated manner. A case study demonstrates and validates the method on empirical data from an enterprise application. We show that the technique exploits much of the resource savings to be achieved from resource sharing and is significantly more accurate at estimating per-server required capacity than a benchmark method used in practice to manage a resource pool. Finally, we explain how the problems relate to other practices regarding enterprise capacity management and software performance engineering.

109 citations
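
One intuition behind trace-based capacity planning for resource pools can be shown in a few lines: when workload peaks do not coincide, sizing shared servers from the combined trace needs far less capacity than the common rule of summing each workload's own peak. The code below illustrates only that observation with invented numbers; it is not the Quartermaster method itself.

```python
# Compare two estimates of the capacity a shared server needs, using CPU
# demand traces sampled at the same instants. Numbers are illustrative.

def sum_of_peaks(traces):
    """Rule-of-thumb estimate: provision every workload's own peak."""
    return sum(max(t) for t in traces)

def peak_of_aggregate(traces):
    """Trace-based estimate: provision the peak of the combined demand."""
    return max(sum(sample) for sample in zip(*traces))

traces = [                       # cores demanded by three workloads over time
    [4, 1, 1, 2, 1],
    [1, 5, 1, 1, 2],
    [1, 1, 6, 1, 1],
]
print("sum of peaks      :", sum_of_peaks(traces))       # 15 cores
print("peak of aggregate :", peak_of_aggregate(traces))  # 8 cores
```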


Cited by
Journal ArticleDOI
01 Apr 1989
TL;DR: The author proceeds with introductory modeling examples, behavioral and structural properties, three methods of analysis, and subclasses of Petri nets and their analysis; one section is devoted to marked graphs, the concurrent system model most amenable to analysis.
Abstract: Starts with a brief review of the history and the application areas considered in the literature. The author then proceeds with introductory modeling examples, behavioral and structural properties, three methods of analysis, subclasses of Petri nets and their analysis. In particular, one section is devoted to marked graphs, the concurrent system model most amenable to analysis. Introductory discussions on stochastic nets with their application to performance modeling, and on high-level nets with their application to logic programming, are provided. Also included are recent results on reachability criteria. Suggestions are provided for further reading on many subject areas of Petri nets.

10,755 citations

Book
28 Nov 1995
TL;DR: This book presents a unified theory of Generalized Stochastic Petri Nets together with a set of illustrative examples from different application fields to show how this methodology can be applied in a range of domains.
Abstract: From the Publisher: This book presents a unified theory of Generalized Stochastic Petri Nets (GSPNs) together with a set of illustrative examples from different application fields. The continuing success of GSPNs and the increasing interest in using them as a modelling paradigm for the quantitative analysis of distributed systems suggested the preparation of this volume with the intent of providing newcomers to the field with a useful tool for their first approach. Readers will find a clear and informal explanation of the concepts followed by formal definitions when necessary or helpful. The largest section of the book however is devoted to showing how this methodology can be applied in a range of domains.

1,487 citations

Proceedings ArticleDOI
17 Apr 2015
TL;DR: A summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it are presented.
Abstract: Google's Borg system is a cluster manager that runs hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters each with up to tens of thousands of machines. It achieves high utilization by combining admission control, efficient task-packing, over-commitment, and machine sharing with process-level performance isolation. It supports high-availability applications with runtime features that minimize fault-recovery time, and scheduling policies that reduce the probability of correlated failures. Borg simplifies life for its users by offering a declarative job specification language, name service integration, real-time job monitoring, and tools to analyze and simulate system behavior. We present a summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it.

1,185 citations
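
Of the mechanisms the abstract credits for Borg's high utilization, task-packing is the easiest to illustrate; the toy best-fit loop below shows the general idea only and is not Borg's scheduler, which uses much richer scoring, constraints, priorities, and preemption.

```python
# Toy best-fit packing over a single CPU dimension: each task goes to the
# feasible machine that would be left with the least free CPU. Machine and
# task figures are invented for illustration.

def best_fit(tasks, machines):
    placement = {}
    for task_id, cpu_needed in tasks:
        candidates = [
            (free - cpu_needed, m_id)
            for m_id, free in machines.items()
            if free >= cpu_needed
        ]
        if not candidates:
            placement[task_id] = None        # pending: nothing fits right now
            continue
        _leftover, m_id = min(candidates)    # tightest fit wins
        machines[m_id] -= cpu_needed
        placement[task_id] = m_id
    return placement

machines = {"m1": 8.0, "m2": 4.0}            # free CPU cores per machine
tasks = [("t1", 3.0), ("t2", 4.0), ("t3", 2.0)]
print(best_fit(tasks, machines))             # {'t1': 'm2', 't2': 'm1', 't3': 'm1'}
```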

Journal ArticleDOI
19 Oct 2003
TL;DR: It is demonstrated that, unlike the Web, whose workload is driven by document change, multimedia workloads such as Kazaa are driven primarily by clients' fetch-at-most-once behavior, the creation of new objects, and the addition of new clients to the system.
Abstract: Peer-to-peer (P2P) file sharing accounts for an astonishing volume of current Internet traffic. This paper probes deeply into modern P2P file sharing systems and the forces that drive them. By doing so, we seek to increase our understanding of P2P file sharing workloads and their implications for future multimedia workloads. Our research uses a three-tiered approach. First, we analyze a 200-day trace of over 20 terabytes of Kazaa P2P traffic collected at the University of Washington. Second, we develop a model of multimedia workloads that lets us isolate, vary, and explore the impact of key system parameters. Our model, which we parameterize with statistics from our trace, lets us confirm various hypotheses about file-sharing behavior observed in the trace. Third, we explore the potential impact of locality-awareness in Kazaa. Our results reveal dramatic differences between P2P file sharing and Web traffic. For example, we show how the immutability of Kazaa's multimedia objects leads clients to fetch objects at most once; in contrast, a World-Wide Web client may fetch a popular page (e.g., CNN or Google) thousands of times. Moreover, we demonstrate that: (1) this "fetch-at-most-once" behavior causes the Kazaa popularity distribution to deviate substantially from Zipf curves we see for the Web, and (2) this deviation has significant implications for the performance of multimedia file-sharing systems. Unlike the Web, whose workload is driven by document change, we demonstrate that clients' fetch-at-most-once behavior, the creation of new objects, and the addition of new clients to the system are the primary forces that drive multimedia workloads such as Kazaa. We also show that there is substantial untapped locality in the Kazaa workload. Finally, we quantify the potential bandwidth savings that locality-aware P2P file-sharing architectures would achieve.

941 citations
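
The fetch-at-most-once effect the paper analyzes is easy to reproduce in a small simulation: if clients never re-fetch an object, the head of a Zipf popularity curve flattens. The object counts, client counts, and Zipf exponent below are arbitrary choices for illustration, not the paper's measured parameters.

```python
# Simulate object popularity with and without fetch-at-most-once clients.
# Requests are drawn from a Zipf-like distribution; a fetch-at-most-once
# client skips any object it has already downloaded.

import random
from collections import Counter

def zipf_weights(n, alpha=1.0):
    return [1.0 / rank ** alpha for rank in range(1, n + 1)]

def simulate(n_objects=1000, n_clients=500, requests_per_client=100,
             fetch_at_most_once=True, seed=0):
    rng = random.Random(seed)
    objects = list(range(n_objects))
    weights = zipf_weights(n_objects)
    hits = Counter()
    for _ in range(n_clients):
        seen = set()
        for _ in range(requests_per_client):
            obj = rng.choices(objects, weights=weights)[0]
            if fetch_at_most_once and obj in seen:
                continue              # the client already has this object
            seen.add(obj)
            hits[obj] += 1
    return hits

web_like = simulate(fetch_at_most_once=False)   # clients may re-fetch
p2p_like = simulate(fetch_at_most_once=True)    # clients fetch at most once
print("most fetched object, web-like:", web_like.most_common(1))
print("most fetched object, p2p-like:", p2p_like.most_common(1))
```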

Proceedings ArticleDOI
02 Jun 2008
TL;DR: The social networking in YouTube videos is investigated, finding that the links to related videos generated by uploaders' choices have clear small-world characteristics, indicating that the videos have strong correlations with each other and creating opportunities for developing novel techniques to enhance the service quality.
Abstract: YouTube has become the most successful Internet website providing a new generation of short video sharing service since its establishment in early 2005. YouTube has a great impact on Internet traffic nowadays, yet itself is suffering from a severe problem of scalability. Therefore, understanding the characteristics of YouTube and similar sites is essential to network traffic engineering and to their sustainable development. To this end, we have crawled the YouTube site for four months, collecting more than 3 million YouTube videos' data. In this paper, we present a systematic and in-depth measurement study on the statistics of YouTube videos. We have found that YouTube videos have noticeably different statistics compared to traditional streaming videos, ranging from length and access pattern, to their growth trend and active life span. We investigate the social networking in YouTube videos, as this is a key driving force toward its success. In particular, we find that the links to related videos generated by uploaders' choices have clear small-world characteristics. This indicates that the videos have strong correlations with each other, and creates opportunities for developing novel techniques to enhance the service quality.

773 citations