Author

Shivani Tripathy

Bio: Shivani Tripathy is an academic researcher from the Indian Institute of Technology Bhubaneswar. The author has contributed to research in the topics of Cache and DRAM. The author has an h-index of 1, having co-authored 7 publications receiving 3 citations.

Papers
Proceedings ArticleDOI
29 Jun 2020
TL;DR: This work proposes a fuzzy logic-based fairness control mechanism that characterizes the degree of flow intensity of a workload and assigns priorities to the workloads, and observes that the proposed mechanism improves the fairness, weighted speedup, and harmonic speedup of the SSD by 29.84%, 11.24%, and 24.90% on average over the state of the art.
Abstract: Modern NVMe SSDs are widely deployed in diverse domains due to characteristics like high performance, robustness, and energy efficiency. It has been observed that the impact of interference among concurrently running workloads on their overall response time differs significantly in these devices, which leads to unfairness. Workload intensity is a dominant factor influencing the interference. Prior works use a threshold value to characterize a workload as high-intensity or low-intensity; this type of characterization has drawbacks due to the lack of information about the degree of low or high intensity. A data cache in an SSD controller (usually based on DRAM) plays a crucial role in improving device throughput and lifetime. However, the degree of parallelism is limited at this level compared to the SSD back-end, which consists of several channels, chips, and planes. Therefore, the impact of interference can be more pronounced at the data cache level. To the best of our knowledge, no prior work has addressed the fairness issue at the data cache level. In this work, we address this issue by proposing a fuzzy logic-based fairness control mechanism. A fuzzy fairness controller characterizes the degree of flow intensity (i.e., the rate at which requests are generated) of a workload and assigns priorities to the workloads. We implement the proposed mechanism in the MQSim framework and observe that our technique improves the fairness, weighted speedup, and harmonic speedup of the SSD by 29.84%, 11.24%, and 24.90% on average over the state of the art, respectively. The peak gains in fairness, weighted speedup, and harmonic speedup are 2.02x, 29.44%, and 56.30%, respectively.
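The controller's core idea lends itself to a compact illustration. Below is a minimal Python sketch of how a fuzzy flow-intensity controller might map a workload's request rate to a cache priority; the membership breakpoints, rule base, and defuzzification step are illustrative assumptions, not the exact design implemented in MQSim.

```python
# Minimal sketch of a fuzzy flow-intensity controller for an SSD data
# cache. All breakpoints, the rule base, and the crisp priority scale
# are illustrative assumptions; the paper's design may differ.

def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def ramp_down(x, a, b):
    """Left shoulder: 1 below a, falling to 0 at b."""
    return min(1.0, max(0.0, (b - x) / (b - a)))

def ramp_up(x, a, b):
    """Right shoulder: 0 below a, rising to 1 at b."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

def flow_intensity_priority(iops):
    """Map a workload's request rate (IOPS) to a cache priority in [0, 1].
    Low-intensity flows get high priority so they are not starved at the
    shared data cache by high-intensity flows."""
    mu = {
        "low":    ramp_down(iops, 2_000, 10_000),
        "medium": tri(iops, 5_000, 20_000, 60_000),
        "high":   ramp_up(iops, 40_000, 100_000),
    }
    crisp = {"low": 0.9, "medium": 0.5, "high": 0.2}  # rule consequents
    den = sum(mu.values()) or 1.0
    return sum(mu[k] * crisp[k] for k in mu) / den    # weighted average

# A light 3k-IOPS flow is prioritized; a bursty 70k-IOPS flow is not.
for rate in (3_000, 25_000, 70_000):
    print(rate, round(flow_intensity_priority(rate), 2))
```

Unlike a single threshold, the overlapping membership functions give a graded priority for workloads whose intensity falls between the "low" and "high" extremes.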

11 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a methodological survey of cache management policies for the three types of internal caches in SSDs, derive a set of guidelines for future cache designers, and enumerate a number of future research directions for designing an optimal SSD internal cache management policy.

5 citations

Proceedings ArticleDOI
01 Nov 2019
TL;DR: This research considers a recent ONFI standard and successfully models the NAND flash memory device with the advanced operations in addition to many basic operations, in contrast to existing research works; all the properties obtained from the standard are encoded in LTL and proved using symbolic model checking.
Abstract: NAND flash memory has become the de facto standard for several non-volatile storages used in commercial and mission-critical applications. The advanced operations supported by NAND flash memory contribute to device performance and also influence the design of some of the crucial mechanisms of the flash memory device controller. In this research, we consider a recent ONFI standard, i.e., ONFI-3.2, and successfully model the NAND flash memory device with the advanced operations in addition to many basic operations, in contrast to the existing research works. We encode all the properties obtained from the standard in LTL and prove them using symbolic model checking. Our modeling approach simplifies the state machines and reduces resource requirements while capturing the essential information needed to verify the requirement-based properties.
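To make the verification flow concrete, here is a toy Python sketch that checks a safety property over a drastically simplified flash-die state machine. The paper verifies LTL properties of the full ONFI-3.2 model with a symbolic model checker; this explicit-state search over a hand-written transition relation is only an assumption-laden stand-in for that flow.

```python
# Toy explicit-state check of a safety property (an LTL "G p" formula)
# over a simplified NAND flash die state machine. The states, commands,
# and transition relation below are invented for illustration and are
# far simpler than the ONFI-3.2 model verified in the paper.

from collections import deque

STATES = {"IDLE", "READING", "PROGRAMMING", "ERASING"}

# (state, command) -> next state. Commands absent from the table are
# rejected, modelling the "no new command while busy" interface rule.
TRANS = {
    ("IDLE", "READ"):        "READING",
    ("IDLE", "PROGRAM"):     "PROGRAMMING",
    ("IDLE", "ERASE"):       "ERASING",
    ("READING", "DONE"):     "IDLE",
    ("PROGRAMMING", "DONE"): "IDLE",
    ("ERASING", "DONE"):     "IDLE",
}

def check_invariant(init, invariant):
    """BFS over the reachable state space, checking that the invariant
    holds in every reachable state (safety only; no liveness here)."""
    seen, queue = {init}, deque([init])
    while queue:
        s = queue.popleft()
        if not invariant(s):
            return False, s          # counterexample state
        for (src, _cmd), dst in TRANS.items():
            if src == s and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return True, None

# Check a simple invariant: every reachable state is a legal die state.
ok, bad = check_invariant("IDLE", lambda s: s in STATES)
print("property holds" if ok else f"violated in {bad}")
```

A symbolic checker represents the same reachable set with BDDs or SAT formulas rather than enumerating states, which is what makes the full standard's state machines tractable.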

3 citations

Proceedings ArticleDOI
01 Jan 2019
TL;DR: A multidimensional grid-aware address predictor that takes advantage of SM-level concurrency to correctly predict the memory address references of future thread blocks well in advance, providing a cooperative approach where information, once learned, is shared with all the SMs.
Abstract: GPGPUs are predominantly used as accelerators for general-purpose data-parallel applications. Most GPU applications are likely to exhibit regular memory access patterns. It has been observed that warps within a thread block show striding behavior in their memory accesses corresponding to the same load instruction. However, determining this inter-warp stride at thread block boundaries is not trivial. We observed that thread blocks along different dimensions have different stride values. Leveraging this observation, we characterize the relationship between the memory address references of warps from different thread blocks. Based on this relationship, we propose a multidimensional grid-aware address predictor that takes advantage of SM-level concurrency to correctly predict the memory address references of future thread blocks well in advance. Our technique provides a cooperative approach where information, once learned, is shared with all the SMs. When compared with the CTA-aware technique, our predictor enhances average prediction coverage by 36% while showing similar prediction accuracy.
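A small sketch can clarify the prediction idea. Assuming a per-load-PC table and a 2D grid, the following Python fragment learns one address stride per grid dimension from the first few observed thread blocks and extrapolates addresses for blocks that have not issued the load yet; the table layout and training rule are illustrative, not the paper's exact hardware design.

```python
# Illustrative sketch of a multidimensional grid-aware stride predictor.
# Learns, per load PC, one stride along each thread-block dimension and
# extrapolates the address a future block will touch. The observation
# scheme ((0,0), (1,0), (0,1)) is an assumption for exposition.

class GridStridePredictor:
    def __init__(self):
        self.obs = {}   # pc -> {(bx, by): first observed address}

    def train(self, pc, block, addr):
        """Record the address the load at `pc` touched in `block`."""
        self.obs.setdefault(pc, {})[block] = addr

    def predict(self, pc, block):
        """Predict the address for `block`, or None if strides unknown."""
        seen = self.obs.get(pc, {})
        if (0, 0) not in seen:
            return None
        base = seen[(0, 0)]
        sx = seen[(1, 0)] - base if (1, 0) in seen else None  # x-stride
        sy = seen[(0, 1)] - base if (0, 1) in seen else None  # y-stride
        if sx is None or sy is None:
            return None
        bx, by = block
        return base + bx * sx + by * sy

# Example: after observing three neighboring blocks, the predictor can
# compute block (2,3)'s address before that block is even scheduled.
p = GridStridePredictor()
p.train(0x42, (0, 0), 0x1000)
p.train(0x42, (1, 0), 0x1400)   # x-stride = 0x400
p.train(0x42, (0, 1), 0x9000)   # y-stride = 0x8000
print(hex(p.predict(0x42, (2, 3))))   # 0x1000 + 2*0x400 + 3*0x8000
```

The cooperative aspect of the paper's design corresponds to keeping this table globally visible, so strides learned by one SM benefit blocks dispatched to every other SM.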

1 citations

Proceedings ArticleDOI
30 Sep 2018
TL;DR: This work discusses an orthogonal problem in the design space exploration of the DRAM cache related to the cache block allocation policy, and the authors believe it is the first work to analyze this aspect of the DRAM cache.
Abstract: Emerging 3D stacking technologies have enabled the use of DRAMs as the last-level cache of CPUs. Several designs have been proposed in the existing DRAM cache literature towards its design space exploration [1]. While the debate on the design trade-offs between block-based and page-based DRAM caches continues, we discuss an orthogonal problem in the design space exploration of the DRAM cache related to the cache block allocation policy. We believe ours is the first work to analyze this aspect of the DRAM cache.

Cited by
Journal ArticleDOI
TL;DR: PAVER is a priority-aware vertex scheduler that takes a graph-theoretic approach to thread scheduling, analyzing the cache locality behavior among thread blocks (TBs) through just-in-time compilation.
Abstract: The massive parallelism present in GPUs comes at the cost of reduced L1 and L2 cache sizes per thread, leading to serious cache contention problems such as thrashing. Hence, the data access locality of an application should be considered during thread scheduling to improve execution time and energy consumption. Recent works have tried to use the locality behavior of regular and structured applications in thread scheduling, but the difficult case of irregular and unstructured parallel applications remains to be explored. We present PAVER, a Priority-Aware Vertex schedulER, which takes a graph-theoretic approach toward thread scheduling. We analyze the cache locality behavior among thread blocks (TBs) through just-in-time compilation, and represent the problem using a graph of the TBs and the locality among them. This graph is then partitioned into TB groups that display maximum data sharing, which are assigned to the same streaming multiprocessor by the locality-aware TB scheduler. Through exhaustive simulation on the Fermi, Pascal, and Volta architectures using a number of scheduling techniques, we show that PAVER reduces L2 accesses by 43.3%, 48.5%, and 40.21% and increases the average performance benefit by 29%, 49.1%, and 41.2% for the benchmarks with high inter-TB locality.
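The grouping step can be approximated in a few lines. The sketch below builds a TB-sharing graph from per-TB cache-line footprints and grows groups greedily around the heaviest edges; the greedy heuristic stands in for PAVER's actual partitioning algorithm, and the footprint representation is an assumption.

```python
# Hedged sketch of locality-aware TB grouping: nodes are thread blocks,
# edge weights count shared cache lines, and groups with high internal
# sharing are pinned to one SM. Greedy growth is a stand-in for the
# paper's real partitioning algorithm.

from collections import defaultdict

def build_sharing_graph(tb_footprints):
    """tb_footprints: {tb_id: set of cache lines}. Edge weight = overlap."""
    w = defaultdict(int)
    tbs = list(tb_footprints)
    for i, a in enumerate(tbs):
        for b in tbs[i + 1:]:
            shared = len(tb_footprints[a] & tb_footprints[b])
            if shared:
                w[(a, b)] = w[(b, a)] = shared
    return w

def greedy_groups(tb_footprints, group_size):
    """Grow each group around the TB with the heaviest remaining sharing."""
    w = build_sharing_graph(tb_footprints)
    unassigned = set(tb_footprints)
    groups = []
    while unassigned:
        seed = max(unassigned,
                   key=lambda t: sum(w[(t, u)] for u in unassigned if u != t))
        group = {seed}
        unassigned.remove(seed)
        while len(group) < group_size and unassigned:
            nxt = max(unassigned, key=lambda t: sum(w[(t, g)] for g in group))
            group.add(nxt)
            unassigned.remove(nxt)
        groups.append(group)   # the scheduler maps each group to one SM
    return groups

# TBs 0 and 1 share lines {2,3}; TBs 2 and 3 share line {10}.
footprints = {0: {1, 2, 3}, 1: {2, 3, 4}, 2: {9, 10}, 3: {10, 11}}
print(greedy_groups(footprints, 2))   # pairs 0 with 1, and 2 with 3
```

Co-scheduling each group on a single SM turns inter-TB sharing into L1 hits instead of redundant L2 traffic, which is the source of the reported L2 access reductions.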

10 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a methodological survey of cache management policies for the three types of internal caches in SSDs, derive a set of guidelines for future cache designers, and enumerate a number of future research directions for designing an optimal SSD internal cache management policy.

5 citations

Journal ArticleDOI
TL;DR: In this article, a write-optimized edge storage system based on concurrent micro-write merging is proposed to solve the problems of frequent competition for cache blocks, massive fragmentation caused by merging, and cache pollution due to cache updating.

2 citations

Journal ArticleDOI
TL;DR: A DRAM-based Over-Provisioning (OP) cache management mechanism, named Justitia, is proposed to reduce data cache contention and improve fairness for modern SSDs.
Abstract: Modern NVMe SSDs have been widely deployed in multi-tenant cloud computing environments and multi-programming systems. When multiple applications concurrently access one SSD, unfairness within the shared SSD can slow down individual applications significantly and lead to violations of service-level objectives. However, traditional data cache management within SSDs mainly focuses on improving the cache hit ratio, which causes data cache contention and sacrifices fairness among applications. In this paper, we propose a DRAM-based Over-Provisioning (OP) cache management mechanism, named Justitia, to reduce data cache contention and improve fairness for modern SSDs. Justitia consists of two stages: a Static-OP stage and a Dynamic-OP stage. Through the novel OP mechanism in the two stages, Justitia reduces the maximum slowdown by 4.5x on average. At the same time, Justitia increases fairness by 20.6x and the buffer hit ratio by 19.6% on average, compared with the traditional shared mechanism.
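A hedged sketch of the two-stage OP idea follows: reserve a static slice of the cache per tenant, then periodically shift over-provisioned blocks toward the most slowed-down tenant. The quota sizes, slowdown metric usage, and rebalancing step are illustrative assumptions, not Justitia's exact algorithm.

```python
# Illustrative two-stage over-provisioned cache partitioning in the
# spirit of Justitia. The OP fraction, grant step, and rebalancing rule
# are invented for exposition; the paper's mechanism may differ.

def static_op(total_blocks, tenants, op_fraction=0.2):
    """Stage 1 (Static-OP): split (1 - op_fraction) of the data cache
    evenly among tenants; keep the rest as a shared OP pool."""
    reserved = int(total_blocks * (1 - op_fraction)) // len(tenants)
    quotas = {t: reserved for t in tenants}
    op_pool = total_blocks - reserved * len(tenants)
    return quotas, op_pool

def dynamic_op(quotas, op_pool, slowdowns, step=64):
    """Stage 2 (Dynamic-OP): periodically grant OP blocks to the most
    slowed-down tenant (slowdown = shared latency / isolated latency)."""
    victim = max(slowdowns, key=slowdowns.get)
    grant = min(step, op_pool)
    quotas[victim] += grant
    return quotas, op_pool - grant

# Three tenants share a 4096-block cache; tenant B suffers the largest
# slowdown this epoch, so it receives extra blocks from the OP pool.
quotas, pool = static_op(4096, ["A", "B", "C"])
quotas, pool = dynamic_op(quotas, pool, {"A": 1.1, "B": 3.4, "C": 1.6})
print(quotas, pool)
```

Keeping hard per-tenant quotas is what prevents a high-intensity tenant from evicting everyone else's cached data, trading a little hit ratio for the large fairness gains the abstract reports.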

2 citations