Author

Dan Tsafrir

Bio: Dan Tsafrir is an academic researcher at the Technion – Israel Institute of Technology. He has contributed to research on topics including Cache and Virtualization, has an h-index of 27, and has co-authored 80 publications receiving 3,188 citations. His previous affiliations include the Hebrew University of Jerusalem and IBM.


Papers
Journal ArticleDOI
01 Sep 2013
TL;DR: By analyzing the spot price histories of Amazon's EC2 cloud, this work reverse engineers how prices are set and constructs a model that generates prices consistent with existing price traces, finding that prices are usually not market-driven, as sometimes previously assumed.
Abstract: Cloud providers possessing large quantities of spare capacity must either incentivize clients to purchase it or suffer losses. Amazon is the first cloud provider to address this challenge, by allowing clients to bid on spare capacity and by granting resources to bidders while their bids exceed a periodically changing spot price. Amazon publicizes the spot price but does not disclose how it is determined. By analyzing the spot price histories of Amazon's EC2 cloud, we reverse engineer how prices are set and construct a model that generates prices consistent with existing price traces. Our findings suggest that prices are usually not market-driven, as sometimes previously assumed. Rather, they are likely to be generated most of the time at random from within a tight price range via a dynamic hidden reserve price mechanism. Our model could help clients make informed bids, cloud providers design profitable systems, and researchers design pricing algorithms.
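To make the mechanism concrete, here is a minimal sketch of a dynamic hidden reserve price generator in the spirit of the paper's model; the band, the random-walk update rule, and all numbers are illustrative assumptions, not the authors' calibrated parameters.

```python
import random

# Illustrative sketch of a "dynamic hidden reserve price" generator.
# The band, the update rule, and the clearing price below are
# hypothetical, not the calibrated parameters from the paper.

FLOOR, CEILING = 0.030, 0.040   # hypothetical tight price band ($/hour)

def next_reserve(prev):
    """Random-walk the hidden reserve, clamped to the band."""
    step = random.uniform(-0.002, 0.002)
    return min(CEILING, max(FLOOR, prev + step))

def spot_price(reserve, market_clearing):
    """Published price: the market price, but never below the reserve."""
    return max(reserve, market_clearing)

reserve = random.uniform(FLOOR, CEILING)
for epoch in range(5):
    reserve = next_reserve(reserve)
    # With abundant spare capacity the clearing price sits below the band,
    # so the hidden reserve (not demand) effectively sets the spot price.
    print(f"epoch {epoch}: spot = ${spot_price(reserve, 0.020):.4f}")
```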

372 citations

Journal ArticleDOI
TL;DR: The end result is a surprisingly simple scheduler, which requires minimal deviations from current practices and behaves exactly like EASY as far as users are concerned; nevertheless, it achieves significant improvements in performance, predictability, and accuracy.
Abstract: The most commonly used scheduling algorithm for parallel supercomputers is FCFS with backfilling, as originally introduced in the EASY scheduler. Backfilling means that short jobs are allowed to run ahead of their time provided they do not delay previously queued jobs (or at least the first queued job). Backfilling requires knowing job runtimes in advance; production schedulers rely on user-supplied estimates, even though system-generated predictions based on history have repeatedly been shown to be more accurate. However, predictions have not been incorporated into production schedulers, partially due to a misconception (which we resolve) claiming that inaccuracy actually improves performance, but mainly because underprediction is technically unacceptable: users will not tolerate jobs being killed just because system predictions were too short. We solve this problem by divorcing kill-time from the runtime prediction and correcting predictions adaptively as needed if they are proved wrong. The end result is a surprisingly simple scheduler, which requires minimal deviations from current practices (e.g., using FCFS as the basis) and behaves exactly like EASY as far as users are concerned; nevertheless, it achieves significant improvements in performance, predictability, and accuracy. Notably, this is based on a very simple runtime predictor that just averages the runtimes of the last two jobs by the same user; counterintuitively, our results indicate that using recent data is more important than mining the history for similar jobs. All the techniques suggested in this paper can be used to enhance any backfilling algorithm and are not limited to EASY.
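The following sketch illustrates the two ideas the abstract highlights: the last-two-jobs-per-user predictor and the EASY backfill test driven by predictions. The function names and the simplified reservation check are illustrative assumptions, not the paper's exact implementation.

```python
from collections import defaultdict, deque

# Sketch of the per-user predictor and the EASY backfill test described
# above; names and the simplified reservation check are illustrative.

history = defaultdict(lambda: deque(maxlen=2))  # user -> last two runtimes

def predict_runtime(user, user_estimate):
    """Average the user's last two runtimes; fall back to the estimate.
    Note: the paper divorces the kill-time from this prediction, so an
    underprediction is corrected adaptively instead of killing the job."""
    runs = history[user]
    return sum(runs) / len(runs) if runs else user_estimate

def record_completion(user, actual_runtime):
    history[user].append(actual_runtime)

def can_backfill(now, predicted_runtime, needed_nodes, free_nodes,
                 first_job_reservation):
    """EASY rule: a short job may jump ahead only if it fits in the
    currently free nodes and is predicted to finish before the
    reservation made for the first queued job."""
    return (needed_nodes <= free_nodes and
            now + predicted_runtime <= first_job_reservation)

# Example: after jobs of 100s and 140s, the next job by "alice" is
# predicted to run 120s regardless of her (typically inflated) estimate.
record_completion("alice", 100)
record_completion("alice", 140)
print(predict_runtime("alice", user_estimate=3600))  # -> 120.0
```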

354 citations

Proceedings ArticleDOI
29 Nov 2011
TL;DR: By analyzing the spot price histories of Amazon's EC2 cloud, this work reverse engineers how prices are set and constructs a model that generates prices consistent with existing price traces, finding that prices are usually not market-driven, as sometimes previously assumed.
Abstract: Cloud providers possessing large quantities of spare capacity must either incentivize clients to purchase it or suffer losses. Amazon is the first cloud provider to address this challenge, by allowing clients to bid on spare capacity and by granting resources to bidders while their bids exceed a periodically changing spot price. Amazon publicizes the spot price but does not disclose how it is determined. By analyzing the spot price histories of Amazon's EC2 cloud, we reverse engineer how prices are set and construct a model that generates prices consistent with existing price traces. We find that prices are usually not market-driven as sometimes previously assumed. Rather, they are typically generated at random from within a tight price interval via a dynamic hidden reserve price. Our model could help clients make informed bids, cloud providers design profitable systems, and researchers design pricing algorithms.

250 citations

Proceedings ArticleDOI
03 Mar 2012
TL;DR: ELI (ExitLess Interrupts), a software-only approach for handling interrupts within guest virtual machines directly and securely, manages to improve the throughput and latency of unmodified, untrusted guests by 1.3x-1.6x, allowing them to reach 97%-100% of bare-metal performance even for the most demanding I/O-intensive workloads.
Abstract: Direct device assignment enhances the performance of guest virtual machines by allowing them to communicate with I/O devices without host involvement. But even with device assignment, guests are still unable to approach bare-metal performance, because the host intercepts all interrupts, including those interrupts generated by assigned devices to signal to guests the completion of their I/O requests. The host involvement induces multiple unwarranted guest/host context switches, which significantly hamper the performance of I/O-intensive workloads. To solve this problem, we present ELI (ExitLess Interrupts), a software-only approach for handling interrupts within guest virtual machines directly and securely. By removing the host from the interrupt handling path, ELI manages to improve the throughput and latency of unmodified, untrusted guests by 1.3x-1.6x, allowing them to reach 97%-100% of bare-metal performance even for the most demanding I/O-intensive workloads.
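As a rough way to see why removing exits matters, here is a back-of-the-envelope model of the cost of paying one guest/host context switch per interrupt; the interrupt rate and per-exit cost are hypothetical numbers for illustration, not measurements from the paper.

```python
# Back-of-the-envelope model of why exitless interrupt delivery matters.
# All numbers below are hypothetical, not measurements from the paper.

def effective_throughput(interrupts_per_sec, exit_cost_us):
    """Fraction of bare-metal throughput left after paying one
    guest/host context switch (exit + re-entry) per interrupt."""
    overhead = interrupts_per_sec * exit_cost_us * 1e-6  # CPU fraction lost
    return max(0.0, 1.0 - overhead)

# A fast NIC might generate ~100K interrupts/s; at ~5 microseconds per
# exit round-trip, the guest loses ~50% of a core to exits alone.
print(f"with exits:    {effective_throughput(100_000, 5.0):.0%}")
# An ELI-style direct delivery path removes the exit from the hot path.
print(f"exitless path: {effective_throughput(100_000, 0.0):.0%}")
```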

215 citations

Journal ArticleDOI
TL;DR: This work considers issues like missing data, inconsistent data, erroneous data, system configuration changes during the logging period, and unrepresentative user behavior in the Parallel Workloads Archive, a repository of job-level usage data from large-scale parallel supercomputing systems.
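Sanitization of this kind is easy to picture on the archive's Standard Workload Format (SWF) logs, which store one whitespace-separated job record per line and use -1 for missing values. The sketch below filters obviously erroneous records; the field positions follow my reading of the SWF specification and should be treated as an assumption.

```python
def clean_swf(path):
    """Yield plausible job records from an SWF trace, skipping records
    with missing or erroneous fields (a simplified illustration of the
    sanitization concerns the paper discusses)."""
    with open(path) as trace:
        for line in trace:
            if line.startswith(";") or not line.strip():
                continue                      # header comment or blank line
            fields = [float(x) for x in line.split()]
            # Assumed SWF layout: fields[3] = run time, fields[4] = procs.
            run_time, procs = fields[3], fields[4]
            if run_time < 0 or procs <= 0:    # -1 marks missing data
                continue                      # drop erroneous/missing records
            yield fields

# Example (hypothetical trace file name):
# jobs = list(clean_swf("SDSC-SP2-1998-4.2-cln.swf"))
```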

210 citations


Cited by
Journal ArticleDOI
06 Mar 2015, Science
TL;DR: Large-scale single-cell RNA sequencing is used to classify cells in the mouse somatosensory cortex and hippocampal CA1 region, revealing 47 molecularly distinct subclasses that comprise all known major cell types in the cortex.
Abstract: The mammalian cerebral cortex supports cognitive functions such as sensorimotor integration, memory, and social behaviors. Normal brain function relies on a diverse set of differentiated cell types, including neurons, glia, and vasculature. Here, we have used large-scale single-cell RNA sequencing (RNA-seq) to classify cells in the mouse somatosensory cortex and hippocampal CA1 region. We found 47 molecularly distinct subclasses, comprising all known major cell types in the cortex. We identified numerous marker genes, which allowed alignment with known cell types, morphology, and location. We found a layer I interneuron expressing Pax6 and a distinct postmitotic oligodendrocyte subclass marked by Itpr2. Across the diversity of cortical cell types, transcription factors formed a complex, layered regulatory code, suggesting a mechanism for the maintenance of adult cell type identity.

2,675 citations

01 Mar 2001
TL;DR: Using singular value decomposition to transform genome-wide expression data from genes × arrays space to reduced diagonalized "eigengenes" × "eigenarrays" space gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype.
Abstract: We describe the use of singular value decomposition in transforming genome-wide expression data from genes × arrays space to reduced diagonalized "eigengenes" × "eigenarrays" space, where the eigengenes (or eigenarrays) are unique orthonormal superpositions of the genes (or arrays). Normalizing the data by filtering out the eigengenes (and eigenarrays) that are inferred to represent noise or experimental artifacts enables meaningful comparison of the expression of different genes across different arrays in different experiments. Sorting the data according to the eigengenes and eigenarrays gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype, respectively. After normalization and sorting, the significant eigengenes and eigenarrays can be associated with observed genome-wide effects of regulators, or with measured samples, in which these regulators are overactive or underactive, respectively.
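A minimal NumPy sketch of the decomposition described above; the synthetic matrix and the noise-filtering threshold are illustrative assumptions, not the paper's data or selection criteria.

```python
import numpy as np

# Minimal sketch of the eigengene/eigenarray decomposition described
# above; the synthetic matrix and the filtering threshold are
# illustrative assumptions, not the paper's datasets or criteria.

rng = np.random.default_rng(0)
expression = rng.normal(size=(1000, 12))   # genes x arrays

# SVD: rows of vt are "eigengenes", columns of u are "eigenarrays".
u, s, vt = np.linalg.svd(expression, full_matrices=False)

# "Fraction of eigenexpression": each eigengene's share of the variance.
fraction = s**2 / np.sum(s**2)

# Filter out eigengenes inferred to represent noise or artifacts
# (here, simply those with a low fraction; a hypothetical criterion).
keep = fraction > 1.5 / len(s)
normalized = (u[:, keep] * s[keep]) @ vt[keep, :]

print(f"kept {keep.sum()} of {len(s)} eigengenes")
```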

1,815 citations