Adaptive-latency DRAM: Optimizing DRAM timing for the common-case
Citations
Design of Analog CMOS Integrated Circuits
Ramulator: A Fast and Extensible DRAM Simulator
Ambit: in-memory accelerator for bulk bitwise operations using commodity DRAM technology
Error Characterization, Mitigation, and Recovery in Flash-Memory-Based Solid-State Drives
AVATAR: A Variable-Retention-Time (VRT) Aware Refresh for DRAM Systems
References
Design of Analog CMOS Integrated Circuits
Flipping bits in memory without accessing them: an experimental study of DRAM disturbance errors
RAIDR: Retention-Aware Intelligent DRAM Refresh
Self-Optimizing Memory Controllers: A Reinforcement Learning Approach
Frequently Asked Questions (16)
Q2. Why does the bitline delay the cell's ability to be fully charged?
Owing to the large resistance and large capacitance of the bitline, the cell experiences a large RC-delay, which increases the time it takes for the cell to become fully charged.
Q3. Why do some outlier cells suffer from a larger RC-delay than others?
Due to process variation, some outlier cells suffer from a larger RC-delay than other cells and require more time to be charged.
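The RC-delay argument in the two answers above can be illustrated with a minimal sketch. It assumes a lumped first-order RC model, which is a simplification and not the paper's circuit model, and the resistance and capacitance values are hypothetical, chosen only to show the scaling.

```python
import math


def time_to_charge(resistance_ohm, capacitance_f, fraction):
    """Time for a first-order RC circuit to reach `fraction` of its final voltage.

    Follows V(t) = V_final * (1 - exp(-t / (R * C))), solved for t.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    tau = resistance_ohm * capacitance_f  # RC time constant in seconds
    return -tau * math.log(1.0 - fraction)


# Hypothetical values: a typical cell and an outlier whose effective
# resistance is doubled by process variation.
typical = time_to_charge(10e3, 100e-15, 0.95)
outlier = time_to_charge(20e3, 100e-15, 0.95)
```

Under this model the charging time scales linearly with the RC product, so the outlier cell needs twice as long as the typical cell to reach the same charge level.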
Q4. What is the common reason why a cell is not fully charged?
In practice, the cell is usually not completely charged because of a phenomenon called leakage, wherein the cell capacitor loses charge over time.
Q5. How did the authors measure the temperature of the DRAM in a server cluster?
The authors measured the DRAM ambient temperature in a server cluster running a memory-intensive benchmark and found that the temperature never exceeds 34°C and never changes by more than 0.1°C per second.
Q6. Why do manufacturers adopt pessimistic timing parameters instead of discarding chips with the slowest cells?
The manufacturers, in turn, are driven by the extremely cost-sensitive nature of the DRAM market, which encourages them to adopt pessimistic timing parameters rather than to (i) discard chips with the slowest cells or (ii) test chips at lower temperatures.
Q7. What are the two physical phenomena that impact a DRAM cell’s ability to receive and retain charge?
The authors examine two physical phenomena that critically impact a DRAM cell’s ability to receive and retain charge: (i) process variation and (ii) temperature dependence.
Q8. How do the authors measure the safety-margin of a DRAM module?
The authors first measure the safety-margin of a DRAM module by sweeping the refresh interval at the worst operating temperature (85°C), using the standard timing parameters.
Q9. How do the authors determine the safe refresh interval for a DRAM module?
Based on this experiment, the authors define the safe refresh interval for a DRAM module as the maximum refresh interval that leads to no errors, minus an additional margin of 8 ms, which is the increment at which the authors sweep the refresh interval.
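The definition above can be sketched as a small calculation. This is an illustration only: the sweep data are hypothetical, and the sketch assumes that "maximum refresh interval that leads to no errors" means the largest interval below the first failing one.

```python
def safe_refresh_interval_ms(error_counts, step_ms=8):
    """Return the safe refresh interval for one module, or None.

    error_counts maps each swept refresh interval (ms) to the number of
    errors observed at that interval. step_ms is the sweep increment,
    which is also subtracted as the extra margin.
    """
    largest_clean = None
    for interval in sorted(error_counts):
        if error_counts[interval] > 0:
            break  # stop at the first interval that shows errors
        largest_clean = interval
    if largest_clean is None:
        return None  # even the shortest swept interval produced errors
    return largest_clean - step_ms


# Hypothetical sweep in 8 ms steps (not measurements from the paper).
sweep = {64: 0, 72: 0, 80: 0, 88: 0, 96: 3, 104: 17}
safe_ms = safe_refresh_interval_ms(sweep)  # 88 ms is the last clean interval
```

With this hypothetical sweep, the last error-free interval is 88 ms, so the safe refresh interval is 80 ms.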
Q10. How can the authors reduce DRAM latency without sacrificing any observed degree of reliability?
Using an FPGA-based testing platform [29, 31, 41], the authors then demonstrate that DRAM timing parameters can be shortened to reduce DRAM latency without sacrificing any observed degree of DRAM reliability.
Q11. How much time is spent on removing the last 5% of the charge from the bitline?
At the end of the precharging phase, nearly half the time (45%) is spent on extracting the last 5% of the charge from the bitline.
Q12. What motivates the pessimism of DRAM manufacturers?
Such pessimism on the part of the DRAM manufacturers is motivated by their desire to (i) increase chip yield and (ii) reduce chip testing time.
Q13. Why do the read and write operations need to be profiled separately?
Read and write operations need to be profiled separately because they are likely to sensitize errors in different sets of cells.
Q14. What is the mechanism for identifying and enforcing the timing parameters for each DIMM?
Their mechanism consists of two steps: (i) identification of the best timing parameters for each DIMM/temperature, and (ii) enforcement, wherein the memory controller dynamically extracts each DIMM’s operating temperature and uses the best timing parameters for each DIMM/temperature combination.
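The enforcement step described above could look roughly like the following lookup. This is a sketch under stated assumptions, not the authors' implementation: the `Timings` record, the `PROFILE` table layout, the temperature bins, and every numeric value are hypothetical placeholders.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Timings:
    tRCD_ns: float
    tRAS_ns: float
    tWR_ns: float
    tRP_ns: float


# Placeholder values standing in for the standard (pessimistic) timings.
STANDARD = Timings(13.75, 35.0, 15.0, 13.75)

# Step (i), identification: per-DIMM profile, sorted by the upper bound
# of each temperature bin in degrees Celsius. All entries are hypothetical.
PROFILE = {
    "dimm0": [
        (55.0, Timings(10.0, 23.75, 10.0, 11.25)),
        (85.0, Timings(12.5, 30.0, 12.5, 12.5)),
    ],
}


def select_timings(profile, dimm_id, temperature_c, fallback=STANDARD):
    """Step (ii), enforcement: pick timings for a DIMM at its current temperature."""
    bins = profile.get(dimm_id)
    if not bins:
        return fallback  # unprofiled DIMM: keep the standard timings
    upper_bounds = [bound for bound, _ in bins]
    index = bisect_left(upper_bounds, temperature_c)
    if index == len(bins):
        return fallback  # hotter than any profiled bin
    return bins[index][1]
```

Falling back to the standard timings for unprofiled DIMMs or out-of-range temperatures is a design choice of this sketch, made so that a missing profile entry never shortens a timing parameter.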
Q15. Which timing parameter expresses the additional time that a write operation requires?
If there is a write operation, some additional time is required for the bitline and the cell to reach this state, which is expressed as a timing parameter called tWR.
Q16. How much margin should the authors strip away from the DRAM timing parameters?
The remaining margin should be enough for DRAM to achieve correctness by overcoming process variation and temperature dependence (as the authors discussed in Section 4.3).