Showing papers in "Sigmetrics Performance Evaluation Review in 2022"


Journal ArticleDOI
TL;DR: The square root scaling laws for the amount of traffic injected by a covert attacker into a network from a set of homes are stated and proved under the assumption that traffic descriptors follow a multivariate Gaussian distribution.
Abstract: We state and prove the square root scaling laws for the amount of traffic injected by a covert attacker into a network from a set of homes under the assumption that traffic descriptors follow a multivariate Gaussian distribution. We numerically evaluate the obtained result under realistic settings wherein traffic is collected from real users, leveraging detectors that exploit multiple features. Under such circumstances, we observe that phase transitions predicted by the model still hold.

6 citations


Journal ArticleDOI
TL;DR: An automatic online log parsing method, named LogStamp, which achieves high accuracy with only a small portion of the training set, reaching an average accuracy of 0.956 when using only 10% of the data for training.
Abstract: Logs are among the most critical data for service management. They contain rich runtime information for both services and users. Since logs are often enormous in size and have free handwritten constructions, a typical log-based analysis needs to parse logs into a structured format first. However, we observe that most existing log parsing methods cannot parse logs online, which is essential for online services. In this paper, we present an automatic online log parsing method, named LogStamp. We extensively evaluate LogStamp on five public datasets to demonstrate the effectiveness of our proposed method. The experiments show that our proposed method can achieve high accuracy with only a small portion of the training set. For example, it can achieve an average accuracy of 0.956 when using only 10% of the data for training.

4 citations


Journal ArticleDOI
TL;DR: This article presents a tool, called rmf tool, that takes the description of a mean field model and numerically computes its mean field approximation and refinement.
Abstract: Mean field approximation is a powerful technique to study the performance of large stochastic systems represented as systems of interacting objects. Applications include load balancing models, epidemic spreading, cache replacement policies, or large-scale data centers, for which mean field approximation gives very accurate estimates of the transient or steady-state behaviors. In a series of recent papers [9, 7], a new and more accurate approximation, called the refined mean field approximation is presented. Yet, computing this new approximation can be cumbersome. The purpose of this paper is to present a tool, called rmf tool, that takes the description of a mean field model, and can numerically compute its mean field approximations and refinement.

2 citations


Journal ArticleDOI
TL;DR: This paper analyzes the performance of the EDF scheduling policy for charging electrical vehicles when the exact deadlines are not known by the scheduler, characterizes the average gain for a given uncertainty model, and devises a policy to curtail strategic users.
Abstract: In this paper, we analyze the performance of the EDF scheduling policy for charging electrical vehicles when the exact deadlines are not known by the scheduler. Instead, they are declared by users. We quantify the effect of this uncertainty in a mean field regime, and show that incentives appear for users to under-report their sojourn time. We characterize the average gain for a given uncertainty model and devise a policy to curtail strategic users.

2 citations


Journal ArticleDOI
TL;DR: This paper considers queueing models for multi-server jobs in a scaling regime where the number of servers in the system becomes large, and shows that a Priority policy achieves order optimality for minimizing mean queueing time and the Priority policy is strictly better than the First-Come-First-Serve policy.
Abstract: Multi-server jobs, which are jobs that occupy multiple servers simultaneously during service, are prevalent in today's computing clusters. But little is known about the delay performance of systems with multi-server jobs. In this paper, we consider queueing models for multi-server jobs in a scaling regime where the number of servers in the system becomes large. Prior work has derived upper bounds on the queueing probability in this scaling regime. But without proper lower bounds, the results cannot be used to differentiate between policies. We focus on the mean queueing time of multi-server jobs, and establish both upper and lower bounds under various scheduling policies. Our results show that a Priority policy achieves order optimality for minimizing mean queueing time, and the Priority policy is strictly better than the First-Come-First-Serve policy.

2 citations


Journal ArticleDOI
TL;DR: This paper proposes correlation-aware flow consolidation, i.e. aggregating inversely correlated flows into superflows and using them as building blocks for load balancing because superflows are smoother than individual flows, and thus are easier to estimate with a higher confidence, and can reduce overshooting of link capacities.
Abstract: Existing load balancing solutions rely on direct or indirect measurement of rates (or congestion) averaged over short periods of time. Sudden fluctuations in flow rates can lead to significant undershooting/overshooting of target link loads. In this paper, we make the case for taking variations and correlations of flows into account in load balancing. We propose correlation-aware flow consolidation, i.e. aggregating inversely correlated (or uncorrelated) flows into superflows and using them as building blocks for load balancing. Superflows are smoother than individual flows, and thus are easier to estimate with a higher confidence, and can reduce overshooting/undershooting of link capacities. We present heuristic methods combined with predictive models to consolidate flows and show they can lead to significant reductions in rate standard deviations compared to correlation-agnostic solutions (up to 33% and 12% improvements at the 50th and 99th percentiles respectively for 20 superflows based on real traffic traces).
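As a rough illustration of why inversely correlated flows consolidate well, the sketch below sums toy rate series and compares the standard deviation of the resulting aggregates. The rate values and the helper name `superflow_std` are assumptions made for this example; the paper's heuristics and predictive models are not reproduced here.

```python
from statistics import pstdev

def superflow_std(flows):
    """Population standard deviation of the aggregate rate of a group of flows."""
    # Sum the flows sample by sample to form the superflow's rate series.
    aggregate = [sum(samples) for samples in zip(*flows)]
    return pstdev(aggregate)

# Toy rate series (hypothetical values): a and b move in opposite directions,
# while a and c move together.
flow_a = [1.0, 3.0, 1.0, 3.0]
flow_b = [3.0, 1.0, 3.0, 1.0]
flow_c = [2.0, 6.0, 2.0, 6.0]

anti_correlated = superflow_std([flow_a, flow_b])  # aggregate is constant
correlated = superflow_std([flow_a, flow_c])       # fluctuations add up
```

In this toy case the anti-correlated pair yields a perfectly flat superflow, while the positively correlated pair fluctuates more than either flow alone.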

1 citation


Journal ArticleDOI
TL;DR: This paper considers a quantum entanglement distribution switch serving a set of users in a star topology with equal-length links, where the function of the switch is to create bipartite or tripartite entangled states among users at the highest possible rates at a fixed ratio.
Abstract: We study a quantum entanglement distribution switch serving a set of users in a star topology with equal-length links. The quantum switch, much like a quantum repeater, can perform entanglement swapping to extend entanglement across longer distances. Additionally, the switch is equipped with entanglement switching logic, enabling it to implement switching policies to better serve the needs of the network. In this work, the function of the switch is to create bipartite or tripartite entangled states among users at the highest possible rates at a fixed ratio. Using Markov chains, we model a set of randomized switching policies. Discovering that some are better than others, we present analytical results for the case where the switch stores one qubit per user, and find that the best policies outperform a time division multiplexing policy for sharing the switch between bipartite and tripartite state generation. This performance improvement decreases as the number of users grows. The model is easily augmented to study the capacity region in the presence of quantum state decoherence and associated cut-off times for qubit storage, obtaining similar results. Moreover, decoherence-associated quantum storage cut-off times appear to have little effect on capacity in our identical-link system. We also study a smaller class of policies when the switch stores two qubits per user.

1 citation


Journal ArticleDOI
TL;DR: Ghosh and Squillante present an unbiased gradient estimator for robust optimization.
Unbiased Gradient Estimation for Robust Optimization. Authors: Soumyadip Ghosh and Mark S. Squillante (Mathematical Sciences, IBM Research Thomas J. Watson Research Center, Yorktown Heights, NY, USA). ACM SIGMETRICS Performance Evaluation Review, Volume 49, Issue 2, September 2021, pp 39–41. https://doi.org/10.1145/3512798.3512813. Online: 20 January 2022.

Journal ArticleDOI
TL;DR: This paper examines five performance questions which are repeatedly asked by practitioners in industry: (i) My system utilization is very low, so why are job delays so high? (ii) What should I do to lower job delays? (iii) How can I favor short jobs if I don't know which jobs are short? (iv) If some jobs are more important than others, how do I negotiate importance versus size? (v) How do answers change when dealing with a closed-loop system, rather than an open system?
Abstract: This document examines five performance questions which are repeatedly asked by practitioners in industry: (i) My system utilization is very low, so why are job delays so high? (ii) What should I do to lower job delays? (iii) How can I favor short jobs if I don't know which jobs are short? (iv) If some jobs are more important than others, how do I negotiate importance versus size? (v) How do answers change when dealing with a closed-loop system, rather than an open system? All these questions have simple answers through queueing theory. This short paper elaborates on the questions and their answers. To keep things readable, our tone is purposely informal throughout. For more formal statements of these questions and answers, please see [14].

Journal ArticleDOI
TL;DR: TauSSA, a discrete-event simulation tool for stochastic queueing networks integrated in the LINE solver, is presented, and various strategies for handling ordering and illegal states in tau leaping that arise specifically within queueing network models are conceptualized.
Abstract: In this paper, we present TauSSA, a discrete-event simulation tool for stochastic queueing networks integrated in the LINE solver. TauSSA combines Gillespie's stochastic simulation algorithm with tau leaping, a methodology for optimistic simulation acceleration. Although tau leaping is frequently used in chemical reaction network simulation, it has so far found limited application in queueing theory. TauSSA offers one of the very first attempts to make this method broadly applicable to analyze extended queueing network models, which include class switching, fork-join, and non-exponential service and arrival distributions. We conceptualize various strategies for handling ordering and illegal states in tau leaping that arise specifically within queueing network models, and compare their performance through numerical experiments. Our main finding is that strategies that sort events based on the network topological order incur a better trade-off between speedup and approximation error.

Journal ArticleDOI
TL;DR: This paper analyzes a simple two dimensional Markov chain model of a queueing system in which multiple servers can arrive to increase service capacity, and depart if a server has been idle for too long.
Abstract: In many systems, in order to fulfill demand (computing or other services) that varies over time, service capacities often change accordingly. In this paper, we analyze a simple two dimensional Markov chain model of a queueing system in which multiple servers can arrive to increase service capacity, and depart if a server has been idle for too long. It is well known that multi-dimensional Markov chains are in general difficult to analyze. Our focus is on an approximation method of stationary performance of the system via the Stein method. For this purpose, innovative methods are developed to estimate the moments of the Markov chain, as well as the solution to the Poisson equation with a partial differential operator.

Journal ArticleDOI
TL;DR: The workshop aims to revisit the development and the application of reinforcement learning techniques in the various application areas covered by the SIGMETRICS conference.
Abstract: The workshop aims to revisit the development and the application of reinforcement learning techniques in the various application areas covered by the SIGMETRICS conference. Topics include but are not limited to queueing networks (scheduling, resource allocations), cloud computing, cyberphysical systems (including the smart grid), computer and communication networks, etc. This workshop aims to bring together researchers working on the theoretical aspects and the application of reinforcement learning techniques. It is intended to provide a focus on reinforcement learning techniques at SIGMETRICS conferences for talks on early research on the subject. We aim to gather talks based on recent research results (including work in progress or work that has been submitted to a journal) as well as recently published results in other conferences or journals. Thus, part of the goal is to complement and supplement the SIGMETRICS Conference program with such talks without removing any theoretical contributions from the main technical program.

Journal ArticleDOI
TL;DR: Online dispatching refers to the process (or an algorithm) that dispatches incoming jobs to available servers in real time.
Abstract: Online dispatching refers to the process (or an algorithm) that dispatches incoming jobs to available servers in real time. The problem arises in many different fields. Examples include routing customer calls to representatives in a call center, assigning patients to wards in a hospital, dispatching goods to different shipping companies, scheduling packets over multiple frequency channels in wireless communications, routing search queries to servers in a data center, selecting an advertisement to display to an Internet user, and allocating jobs to workers in crowdsourcing.

Journal ArticleDOI
TL;DR: The methodology is based on a node embedding method that models and unveils the nodes' importance in mobility and connectivity patterns while preserving their spatial and temporal characteristics, and the results show that it provides a rich representation for extracting different mobility and connectivity patterns.
Abstract: Motivated by the growing number of mobile devices capable of connecting and exchanging messages, we propose a methodology aiming to model and analyze node mobility in networks. We note that many existing solutions in the literature rely on topological measurements calculated directly on the graph of node contacts, aiming to capture the notion of the node's importance in terms of connectivity and mobility patterns beneficial for prototyping, design, and deployment of mobile networks. However, each measure has its specificity and fails to generalize the node importance notions that ultimately change over time. Unlike previous approaches, our methodology is based on a node embedding method that models and unveils the nodes' importance in mobility and connectivity patterns while preserving their spatial and temporal characteristics. We focus on a case study based on a trace of group meetings. The results show that our methodology provides a rich representation for extracting different mobility and connectivity patterns, which can be helpful for various applications and services in mobile networks.

Journal ArticleDOI
TL;DR: The Age of Information is the time elapsed since the generation of the last successfully received packet by the monitor containing information about the source; a large number of queueing models remain open research problems regarding this metric.
Abstract: Timely information is a crucial factor in a wide range of information, communication, and control systems. For instance, in autonomous driving systems, the state of the traffic and the location of the vehicles must be as recent as possible. The Age of Information is a relatively new metric that measures the freshness of the knowledge we have about the status of a remote system. More specifically, the Age of Information is the time elapsed since the generation of the last successfully received packet by the monitor containing information about the source. Since the seminal paper [2], in several models it has been observed that the policies that optimize performance metrics of interest in queueing theory do not necessarily minimize the Age of Information. Hence, there is a large number of queueing models that are open research problems regarding the Age of Information metric. We refer to [6] for a recent survey of the Age of Information.
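The definition above can be made concrete with a small sketch. It assumes packets are given as (generation time, reception time) pairs and reads "last successfully received packet" as the most recently generated packet among those already delivered; the function name `age_of_information` and the sample times are hypothetical.

```python
def age_of_information(t, packets):
    """Age at time t, given (generation_time, reception_time) pairs.

    Returns None if no packet has been received by time t.
    """
    # Only packets already delivered at time t carry usable information.
    delivered = [gen for gen, rec in packets if rec <= t]
    if not delivered:
        return None
    # The freshest delivered packet determines the age (an assumption of
    # this sketch; under in-order delivery it is also the last one received).
    return t - max(delivered)

# Hypothetical trace: the packet generated at time 3.0 overtakes the one
# generated at time 2.0.
packets = [(0.0, 1.0), (2.0, 5.0), (3.0, 4.0)]
age_before_any = age_of_information(0.5, packets)
age_mid = age_of_information(4.5, packets)
age_late = age_of_information(6.0, packets)
```

The age grows linearly between receptions and drops whenever a fresher packet arrives, which is why policies tuned for delay or throughput need not minimize it.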

Journal ArticleDOI
TL;DR: This report summarizes a workshop, held online in conjunction with IFIP Performance 2021, on whether the way performance analysis is taught needs to change.
Abstract: The teaching of performance analysis started in the early 70’s. The importance of such courses is obvious, considering the immense changes in computing systems over the years. However, rarely do academics meet to discuss and take stock of performance modeling and analysis in their teaching. Moreover, in the last two decades, an economic crisis has involved the educational system, and many changes are happening without the awareness of the parties involved. These reflections point to the need for this Workshop. Some issues to be discussed included: Are we teaching what our students and the industry want or need? Do we need to change the way we teach performance analysis? Can our teaching ride on the contents of popular courses in the curriculum? The Workshop was held in conjunction with IFIP Performance 2021. Since the Conference was online (because of COVID-19), the Workshop followed suit. This meant the program schedule was severely constrained by time differences. The 5 talks and 2 discussions were fast-paced (with just two 5-minute breaks) and finished in under 5 hours. The number of participants fluctuated between 20 and 25, and the discussions covered much ground and generated several ideas. This report summarizes the invited talks, records the discussions and lists some recommendations. (The extended abstracts and slides are available at the conference website.)

Journal ArticleDOI
TL;DR: This article analyzes the performance of the Earliest-Deadline-First (EDF) and Least-Laxity-First (LLF) policies under uncertain departure times.
Abstract: In an EV charging facility, with multiple vehicles requesting charge simultaneously, scheduling becomes crucial to provide adequate service under vehicle sojourn time constraints. However, these departure times may not be known accurately, and typical policies such as Earliest-Deadline-First or Least-Laxity-First are affected by this uncertainty in information. In this paper, we analyze the performance of these policies under uncertain deadlines, using a mean field approach. We characterize the deviation in individual attained service as a function of the uncertainty. Since incentives appear to under-report deadlines in order to be prioritized, we analyze a simple modification of the policies to enforce incentive compatibility. Simulation experiments are carried out with a practical data set.
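To make the two policies and the under-reporting incentive concrete, here is a minimal sketch. The `Vehicle` fields, the units, and the numbers are assumptions chosen for illustration; it does not reproduce the paper's mean field analysis or its incentive-compatible modification.

```python
from dataclasses import dataclass

@dataclass
class Vehicle:
    name: str
    declared_departure: float  # hours, as reported by the user
    remaining_energy: float    # kWh still to be delivered

def edf_pick(vehicles):
    """Earliest-Deadline-First: serve the earliest declared departure."""
    return min(vehicles, key=lambda v: v.declared_departure)

def llf_pick(vehicles, now, charge_rate):
    """Least-Laxity-First: serve the vehicle with the least slack.

    Laxity is the time left until the declared departure minus the time
    needed to finish charging at charge_rate (kW).
    """
    def laxity(v):
        return (v.declared_departure - now) - v.remaining_energy / charge_rate
    return min(vehicles, key=laxity)

# Truthful declarations: A leaves first, but B has less slack.
honest = [Vehicle("A", 5.0, 10.0), Vehicle("B", 6.0, 30.0)]
edf_honest = edf_pick(honest).name
llf_honest = llf_pick(honest, now=0.0, charge_rate=10.0).name

# B under-reports its departure (6.0 declared as 4.0) and jumps ahead under EDF.
strategic = [Vehicle("A", 5.0, 10.0), Vehicle("B", 4.0, 30.0)]
edf_strategic = edf_pick(strategic).name
```

Because both policies rank vehicles by declared values only, a user who declares an earlier departure than the true one gains priority, which is the incentive the abstract refers to.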

Journal ArticleDOI
TL;DR: The 3rd International Workshop on AI in Networks and Distributed Systems (WAIN) aims to present high-quality research leveraging machine learning and data analysis solutions to take full advantage of network systems and infrastructures.
Abstract: We are pleased to welcome you to the 3rd International Workshop on AI in Networks and Distributed Systems (WAIN). The workshop aims to present high-quality research leveraging machine learning and data analysis solutions to take full advantage of network systems and infrastructures. In detail, this year WAIN presents innovations for data center scheduling and management and network monitoring and security, using, among others, node embedding techniques, GNN and federated learning. This year, our technical program is rich and varied, with 1 keynote speech and 6 accepted papers thoroughly evaluated and selected by the 18 members of the Technical Program Committee. All papers received at least 3 reviews, and decisions were made after a vigorous online discussion. In an effort to increase the technical soundness of the contributions, we have increased the paper page limit to 5 pages plus references. For the second year, and hopefully the last one, the workshop is entirely virtual with live streaming presentations to encourage researchers to interact, share their experiences and ideas and discuss the open issues. We would like to thank both the Authors and the Technical Program Committee for their hard work and precious contribution to making WAIN a fruitful workshop.

Journal ArticleDOI
TL;DR: Minimizing mean slowdown, the average ratio between a job's response time and its size, is solved by the "RS" policy in the single-server M/G/1 but remains open for multiserver systems, including the M/G/k and load-balancing systems.
Abstract: Recent progress in queueing theory has made it possible to analyze the mean response time of multiserver queueing systems under advanced scheduling policies. However, this progress has so far been limited to the metric of mean response time. In practice, there are a wide variety of other metrics that can be more important. One such metric is mean slowdown, which is the average ratio between a job's response time and its size. While it is known that the "RS" policy minimizes mean slowdown in the single-server M/G/1, the problem is open for multiserver systems, including the M/G/k and load-balancing systems.
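A short sketch shows how mean slowdown differs from mean response time. The two-job example, its sizes, and the function names are assumptions for illustration; it does not implement the "RS" policy or any multiserver model.

```python
def mean_response_time(jobs):
    """Average response time over (response_time, size) pairs."""
    return sum(response for response, _ in jobs) / len(jobs)

def mean_slowdown(jobs):
    """Average of response_time / size over (response_time, size) pairs."""
    return sum(response / size for response, size in jobs) / len(jobs)

# Two jobs of sizes 1 and 10 arrive together at a single server and run
# to completion one after the other (hypothetical example).
short_first = [(1.0, 1.0), (11.0, 10.0)]   # short job runs first
long_first = [(11.0, 1.0), (10.0, 10.0)]   # long job runs first

slowdown_short_first = mean_slowdown(short_first)
slowdown_long_first = mean_slowdown(long_first)
response_short_first = mean_response_time(short_first)
response_long_first = mean_response_time(long_first)
```

Making the short job wait less than doubles mean response time here but raises mean slowdown several-fold, because slowdown weights each job's delay by the inverse of its size.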

Journal ArticleDOI
TL;DR: Performance modeling and analysis has become a common practice to assist the development of modern information networks and service systems.
Abstract: Performance modeling and analysis has become a common practice to assist the development of modern information networks and service systems.

Journal ArticleDOI
TL;DR: This paper analyzes a model of a homogeneous bike sharing system where two classes of bikes interact only through the finite capacity of stations.
Abstract: Electric bikes are deployed massively in preexisting bike sharing systems in order to attract new users and replace cars on a larger scale (see [2]). But this causes interactions between the two populations of bikes. In this paper, we analyze a model of a homogeneous bike sharing system where two classes of bikes interact only through the finite capacity of stations. It models systems with both electric and normal bikes, these classes requiring different subscriptions. As far as we know (see [7]), it is the first stochastic large-scale analysis for integrated e-bike and bike sharing systems. The aim of the paper is to derive explicitly the limiting stationary distribution of the state of a station when the number of stations and the fleet size of each class increase at the same rate. Analysis for a spatially heterogeneous network is in preparation and discussed in Section 4.

Journal ArticleDOI
TL;DR: Almost all queueing-theoretic literature on preemptive scheduling assumes that pausing or resuming a job is instant, whereas switching in real computer systems incurs some overhead, which causes a divide between models in research and the real world.
Abstract: The study of preemptive scheduling is essential to computer systems [15, 12, 3, 4]. Motivated by this, decades of queueing theory research have been done on the subject [19, 18, 16, 13, 21, 8, 17, 2, 11, 20, 10, 1]. However, almost all queueing-theoretic literature on preemptive scheduling concerns systems without switching overhead: pausing or resuming a job is assumed to be instant. Practically speaking, switching in computer systems incurs some overhead [14], which causes a divide between models in research and the real world.

Journal ArticleDOI
TL;DR: This work constructs novel exact and approximate solutions for mean value analysis and probabilistic evaluation of closed queueing network models with limited load-dependent (LLD) nodes, and provides an explicit formula for the normalizing constant that applies to models with arbitrary LLD functions, whilst retaining constant complexity with respect to the total population size.
Abstract: We construct novel exact and approximate solutions for mean value analysis and probabilistic evaluation of closed queueing network models with limited load-dependent (LLD) nodes. In this setting, load-dependent functions are assumed to become constant after a finite queue-length threshold. For single-class models, we provide an explicit formula for the normalizing constant that applies to models with arbitrary LLD functions, whilst retaining constant complexity with respect to the total population size. From this result, we then derive corresponding closed-form solutions for the multiclass case and show that these yield a novel mean value analysis approach for LLD models. Significantly, this allows us to determine exactly the correction factor between a load-independent solution and a limited load-dependent one, enabling the reuse of state-of-the-art methods for load-independent models in the analysis of load-dependent networks.