
Showing papers on "Memory controller published in 2009"


Proceedings ArticleDOI
26 Apr 2009
TL;DR: In this paper, the performance of non-graphics applications written in NVIDIA's CUDA programming model is evaluated on a microarchitecture performance simulator that runs NVIDIA's parallel thread execution (PTX) virtual instruction set.
Abstract: Modern Graphic Processing Units (GPUs) provide sufficiently flexible programming models that understanding their performance can provide insight in designing tomorrow's manycore processors, whether those are GPUs or otherwise. The combination of multiple, multithreaded, SIMD cores makes studying these GPUs useful in understanding tradeoffs among memory, data, and thread level parallelism. While modern GPUs offer orders of magnitude more raw computing power than contemporary CPUs, many important applications, even those with abundant data level parallelism, do not achieve peak performance. This paper characterizes several non-graphics applications written in NVIDIA's CUDA programming model by running them on a novel detailed microarchitecture performance simulator that runs NVIDIA's parallel thread execution (PTX) virtual instruction set. For this study, we selected twelve non-trivial CUDA applications demonstrating varying levels of performance improvement on GPU hardware (versus a CPU-only sequential version of the application). We study the performance of these applications on our GPU performance simulator with configurations comparable to contemporary high-end graphics cards. We characterize the performance impact of several microarchitecture design choices including choice of interconnect topology, use of caches, design of memory controller, parallel workload distribution mechanisms, and memory request coalescing hardware. Two observations we make are (1) that for the applications we study, performance is more sensitive to interconnect bisection bandwidth rather than latency, and (2) that, for some applications, running fewer threads concurrently than on-chip resources might otherwise allow can improve performance by reducing contention in the memory system.
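The memory request coalescing hardware evaluated above merges the per-thread accesses of a SIMD group into as few wide transactions as possible. A minimal sketch of the idea (the function name and the 128-byte segment size are illustrative assumptions, not the simulator's actual logic):

```python
def coalesce(addresses, segment=128):
    """Merge per-thread byte addresses from one warp into the set of
    aligned memory segments (transactions) they touch."""
    return sorted({addr // segment * segment for addr in addresses})

# 32 threads reading consecutive 4-byte words: one 128-byte transaction.
unit = coalesce([4 * t for t in range(32)])
# 32 threads striding by 128 bytes: one transaction per thread.
scattered = coalesce([128 * t for t in range(32)])
```

Contiguous accesses collapse into a single transaction while strided accesses generate one per thread, which is one reason applications with abundant data-level parallelism can still fall short of peak bandwidth.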

1,558 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper analyzes a PCM-based hybrid main memory system using an architecture-level model of PCM and proposes simple organizational and management solutions for the hybrid memory that reduce the write traffic to PCM, boosting its lifetime from 3 years to 9.7 years.
Abstract: The memory subsystem accounts for a significant cost and power budget of a computer system. Current DRAM-based main memory systems are starting to hit the power and cost limit. An alternative memory technology that uses resistance contrast in phase-change materials is being actively investigated in the circuits community. Phase Change Memory (PCM) devices offer more density relative to DRAM, and can help increase main memory capacity of future systems while remaining within the cost and power constraints. In this paper, we analyze a PCM-based hybrid main memory system using an architecture-level model of PCM. We explore the trade-offs for a main memory system consisting of PCM storage coupled with a small DRAM buffer. Such an architecture has the latency benefits of DRAM and the capacity benefits of PCM. Our evaluations for a baseline system of 16 cores with 8GB DRAM show that, on average, PCM can reduce page faults by 5X and provide a speedup of 3X. As PCM is projected to have limited write endurance, we also propose simple organizational and management solutions for the hybrid memory that reduce the write traffic to PCM, boosting its lifetime from 3 years to 9.7 years.
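The write-filtering effect of a DRAM buffer in front of PCM can be illustrated with a toy model; this is a generic LRU write-back cache sketch under assumed parameters, not the paper's actual organization:

```python
from collections import OrderedDict

class HybridMemory:
    """DRAM buffer (LRU, write-back) in front of PCM: PCM is written only
    when a dirty page is evicted, filtering out repeated writes."""
    def __init__(self, dram_pages):
        self.dram = OrderedDict()      # page -> dirty flag, LRU order
        self.capacity = dram_pages
        self.pcm_writes = 0

    def write(self, page):
        if page in self.dram:
            self.dram.pop(page)        # refresh LRU position
        elif len(self.dram) >= self.capacity:
            victim, was_dirty = self.dram.popitem(last=False)
            if was_dirty:
                self.pcm_writes += 1   # write-back to PCM on eviction
        self.dram[page] = True         # page is now dirty in DRAM

mem = HybridMemory(dram_pages=4)
for _ in range(100):                   # 400 writes to a hot working set
    for page in range(4):
        mem.write(page)
writes_after_hot = mem.pcm_writes      # the buffer absorbed every rewrite
for page in range(4, 9):               # cold pages force 5 dirty evictions
    mem.write(page)
```

Repeated writes to a working set that fits in the buffer never reach PCM; only evictions do, which is the basic lever for extending PCM lifetime.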

1,451 citations


18 May 2009
TL;DR: Preliminary experiments are described suggesting that building main memory as a hybrid between DRAM and non-volatile memory, such as flash or PC-RAM, is a viable approach.
Abstract: Technology trends may soon favor building main memory as a hybrid between DRAM and non-volatile memory, such as flash or PC-RAM. We describe how the operating system might manage such hybrid memories, using semantic information not available in other layers. We describe preliminary experiments suggesting that this approach is viable.

248 citations


Proceedings ArticleDOI
12 Sep 2009
TL;DR: This paper presents fundamental details of the newly introduced Intel Nehalem microarchitecture with its integrated memory controller, Quick Path Interconnect, and ccNUMA architecture, based on sophisticated benchmarks that measure the latency and bandwidth between different locations in the memory subsystem.
Abstract: Today's microprocessors have complex memory subsystems with several cache levels. The efficient use of this memory hierarchy is crucial to gain optimal performance, especially on multicore processors. Unfortunately, many implementation details of these processors are not publicly available. In this paper we present such fundamental details of the newly introduced Intel Nehalem microarchitecture with its integrated memory controller, Quick Path Interconnect, and ccNUMA architecture. Our analysis is based on sophisticated benchmarks to measure the latency and bandwidth between different locations in the memory subsystem. Special care is taken to control the coherency state of the data to gain insight into performance relevant implementation details of the cache coherency protocol. Based on these benchmarks we present undocumented performance data and architectural properties.
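Latency benchmarks of this kind typically pointer-chase a random cyclic permutation so that hardware prefetchers cannot hide memory latency. A sketch of the access-pattern construction (illustrative only; an actual benchmark would run the chase in native code over buffers sized to each cache level):

```python
import random

def make_chain(n, seed=1):
    """Build a single random cycle: slot i stores the next index to visit.
    Chasing it defeats hardware prefetching, exposing raw memory latency."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    chain = [0] * n
    for here, nxt in zip(order, order[1:] + order[:1]):
        chain[here] = nxt
    return chain

def chase(chain, steps):
    """Follow the chain for a fixed number of dependent loads."""
    idx = 0
    for _ in range(steps):
        idx = chain[idx]
    return idx

chain = make_chain(1 << 10)
end = chase(chain, 1 << 10)   # one full lap of the cycle returns to index 0
```

Because every load depends on the previous one, the time per step approaches the access latency of whichever level of the hierarchy holds the buffer.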

243 citations


Patent
04 Mar 2009
TL;DR: In this article, a method for data storage includes storing data in a group of analog memory cells by writing respective input storage values to the memory cells in the group, and then reading the output storage values from the analog memory cells in the group.
Abstract: A method for data storage includes storing data in a group of analog memory cells by writing respective input storage values to the memory cells in the group. After storing the data, respective output storage values are read from the analog memory cells in the group. Respective confidence levels of the output storage values are estimated, and the confidence levels are compressed. The output storage values and the compressed confidence levels are transferred from the memory cells over an interface to a memory controller.

238 citations


Patent
Robert Haas, Xiao-Yu Hu, Roman A. Pletka
30 Nov 2009
TL;DR: In this paper, a method for fast reconstruction of metadata structures on a memory storage device includes writing a plurality of checkpoints holding a root of the metadata structures, in increasing order of timestamps.
Abstract: A method for facilitating fast reconstruction of metadata structures on a memory storage device includes writing a plurality of checkpoints holding a root of metadata structures in an increasing order of timestamps to a plurality of blocks respectively on the memory storage device utilizing a memory controller, where each checkpoint is associated with a timestamp, and wherein the last-written checkpoint contains a root to the latest metadata information from where metadata structures are reconstructed.

199 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper shows how the location of the memory controllers can reduce contention (hot spots) in the on-chip fabric and lower the variance in reference latency, which provides predictable performance for memory-intensive applications regardless of the processing core on which a thread is scheduled.
Abstract: In the near term, Moore's law will continue to provide an increasing number of transistors and therefore an increasing number of on-chip cores. Limited pin bandwidth prevents the integration of a large number of memory controllers on-chip. With many cores, and few memory controllers, where to locate the memory controllers in the on-chip interconnection fabric becomes an important and as yet unexplored question. In this paper we show how the location of the memory controllers can reduce contention (hot spots) in the on-chip fabric and lower the variance in reference latency. This in turn provides predictable performance for memory-intensive applications regardless of the processing core on which a thread is scheduled. We explore the design space of on-chip fabrics to find optimal memory controller placement relative to different topologies (i.e. mesh and torus), routing algorithms, and workloads.
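The placement question can be illustrated by comparing the mean and variance of hop counts to the nearest controller under two hypothetical placements on an 8x8 mesh (the coordinates below are assumptions for illustration, not the paper's evaluated configurations):

```python
from statistics import mean, pvariance

def hop_stats(n, controllers):
    """Mean and variance of the Manhattan distance from each tile of an
    n x n mesh to its nearest memory controller."""
    dists = [min(abs(x - cx) + abs(y - cy) for cx, cy in controllers)
             for x in range(n) for y in range(n)]
    return mean(dists), pvariance(dists)

# Four controllers crowded into one corner vs. spread across the die.
corner = hop_stats(8, [(0, 0), (0, 1), (1, 0), (1, 1)])
spread = hop_stats(8, [(1, 1), (1, 6), (6, 1), (6, 6)])
```

Spreading the controllers lowers both the average distance and its variance, which is the predictable-latency property the paper targets, regardless of which core a thread runs on.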

167 citations


Patent
23 Apr 2009
TL;DR: In this paper, the authors propose a semiconductor memory device whose memory controller manages the memory using each memory block as a deletion unit, allowing the device to serve a plurality of storage regions whose requirement levels differ from one another.
Abstract: PROBLEM TO BE SOLVED: To provide a highly reliable semiconductor memory device adapted to a plurality of storage regions whose requirement levels differ from one another. SOLUTION: The semiconductor memory device 20 comprises: a memory 21, which has a plurality of memory blocks with memory cells capable of storing a plurality of different kinds of data that require memory areas with different characteristics; and a memory controller 22 that manages the memory using each memory block as a deletion unit. The memory controller 22 converts logical addresses of the memory 21 into physical addresses identifying memory blocks, and replaces a memory block with a preregistered free block when rewriting it. The memory controller 22 manages the different kinds of data stored in the memory 21 so that each memory block and free block stores the same kind of data as before, even after being rewritten.

161 citations


Journal ArticleDOI
TL;DR: An analyzable JEDEC-compliant DDRx SDRAM memory controller (AMC) for hard real-time CMPs is proposed, that reduces the impact of memory interferences caused by other tasks on WCET estimation, providing a predictable memory access time and allowing the computation of tight WCET estimations.
Abstract: Multicore processors (CMPs) represent a good solution to provide the performance required by current and future hard real-time systems. However, it is difficult to compute a tight WCET estimation for CMPs due to interferences that tasks suffer when accessing shared hardware resources. We propose an analyzable JEDEC-compliant DDRx SDRAM memory controller (AMC) for hard real-time CMPs, that reduces the impact of memory interferences caused by other tasks on WCET estimation, providing a predictable memory access time and allowing the computation of tight WCET estimations.
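The predictability argument can be sketched with a simple time-division-multiplexing bound: if each requestor owns a fixed bus slot, a request waits at most for every other requestor's slot before being serviced, regardless of what the other cores are doing. The numbers below are illustrative, not the AMC's actual timing analysis:

```python
def worst_case_latency(requestors, slot_cycles, service_cycles):
    """Upper bound on one memory request's latency under TDM arbitration:
    wait for every other requestor's slot, then get serviced."""
    return (requestors - 1) * slot_cycles + service_cycles

# With 4 cores, a 10-cycle slot, and a 10-cycle service time, a request
# completes within 40 cycles independently of co-running tasks.
bound = worst_case_latency(4, 10, 10)
```

A bound of this form is what makes the memory access time composable into a tight WCET estimate, at the cost of some average-case bandwidth.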

161 citations


Patent
29 Sep 2009
TL;DR: In this paper, loading a boot image from a solid state drive to an operating memory of a computing system during an initialization operation is described, where the initialization operation initializes components of the computing system.
Abstract: One embodiment of a method includes loading, by a memory controller, a boot image from a solid state drive to an operating memory of a computing system during an initialization operation of the computing system. The initialization operation initializes components of the computing system.

153 citations


Patent
Mei-Man L. Syu
04 Mar 2009
TL;DR: In this paper, it is determined that a second erase counter associated with a second zip code is low relative to at least one other erase counter; based on this determination, data from blocks in the second zip code may be written to new blocks as part of a wear-leveling operation.
Abstract: A solid state drive includes a plurality of flash memory devices, and a memory controller coupled to the plurality of flash memory devices. The memory controller is configured to logically associate blocks from the plurality of flash memory devices to form zip codes, the zip codes associated with corresponding erase counters. The solid state drive further includes a processor and a computer-readable memory having instructions stored thereon. The processor may perform a wear-leveling operation by determining that blocks in a first zip code have been erased and incrementing a first erase counter associated with the first zip code. It may then be determined that a second erase counter associated with a second zip code is low relative to at least one other erase counter, and based on this determination, data from blocks in the second zip code may be written to new blocks as part of a wear-leveling operation.
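A minimal sketch of the relative-wear test described above (the names and the threshold are hypothetical, not taken from the patent):

```python
def stale_zip_codes(erase_counters, threshold):
    """Zip codes whose erase count lags the most-worn zip code by at least
    `threshold`; their (cold) data is relocated so the blocks rejoin
    the pool of blocks available for fresh writes."""
    hottest = max(erase_counters.values())
    return [z for z, c in erase_counters.items() if hottest - c >= threshold]

counters = {"zip0": 120, "zip1": 30, "zip2": 110, "zip3": 25}
to_relocate = stale_zip_codes(counters, threshold=50)
```

Grouping blocks into zip codes means the controller tracks one counter per group rather than one per block, trading wear-leveling granularity for bookkeeping cost.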

Proceedings ArticleDOI
12 Dec 2009
TL;DR: This paper proposes a complexity-effective solution to DRAM request scheduling which recovers most of the performance loss incurred by a naive in-order first-in first-out (FIFO) DRAM Scheduler compared to an aggressive out-of-order DRAM scheduler.
Abstract: Modern DRAM systems rely on memory controllers that employ out-of-order scheduling to maximize row access locality and bank-level parallelism, which in turn maximizes DRAM bandwidth. This is especially important in graphics processing unit (GPU) architectures, where the large quantity of parallelism places a heavy demand on the memory system. The logic needed for out-of-order scheduling can be expensive in terms of area, especially when compared to an in-order scheduling approach. In this paper, we propose a complexity-effective solution to DRAM request scheduling which recovers most of the performance loss incurred by a naive in-order first-in first-out (FIFO) DRAM scheduler compared to an aggressive out-of-order DRAM scheduler. We observe that the memory request stream from individual GPU "shader cores" tends to have sufficient row access locality to maximize DRAM efficiency in most applications without significant reordering. However, the interconnection network across which memory requests are sent from the shader cores to the DRAM controller tends to finely interleave the numerous memory request streams in a way that destroys the row access locality of the resultant stream seen at the DRAM controller. To address this, we employ an interconnection network arbitration scheme that preserves the row access locality of individual memory request streams and, in doing so, achieves DRAM efficiency and system performance close to that achievable by using out-of-order memory request scheduling while doing so with a simpler design. We evaluate our interconnection network arbitration scheme using crossbar, mesh, and ring networks for a baseline architecture of 8 memory channels, each controlled by its own DRAM controller and 28 shader cores (224 ALUs), supporting up to 1,792 in-flight memory requests. 
Our results show that our interconnect arbitration scheme coupled with a banked FIFO in-order scheduler obtains up to 91% of the performance obtainable with an out-of-order memory scheduler for a crossbar network with eight-entry DRAM controller queues.
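The row-access-locality argument can be demonstrated with a toy row-buffer model: finely interleaving two streams that are individually local destroys every row hit, while keeping each stream intact preserves them. This is an illustrative sketch, not the paper's arbitration scheme:

```python
def row_hits(stream):
    """Count row-buffer hits: a request hits when it targets the row
    that the previous request left open."""
    hits, open_row = 0, None
    for row in stream:
        hits += (row == open_row)
        open_row = row
    return hits

core_a = [1, 1, 1, 1]                 # each core's stream has locality
core_b = [2, 2, 2, 2]
interleaved = [r for pair in zip(core_a, core_b) for r in pair]
preserved = core_a + core_b           # arbitration keeps streams intact
hits_interleaved = row_hits(interleaved)
hits_preserved = row_hits(preserved)
```

An arbitration policy that forwards each core's requests in runs therefore lets even a simple in-order FIFO scheduler achieve high row-buffer hit rates.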


Journal ArticleDOI
TL;DR: The Multicore DIMM is designed to improve the energy efficiency of memory systems with small impact on system performance, where DRAM chips are grouped into multiple virtual memory devices, each of which has its own data path and receives separate commands.
Abstract: Demand for memory capacity and bandwidth keeps increasing rapidly in modern computer systems, and memory power consumption is becoming a considerable portion of the system power budget. However, the current DDR DIMM standard is not well suited to effectively serve CMP memory requests from both a power and performance perspective. We propose a new memory module called a Multicore DIMM, where DRAM chips are grouped into multiple virtual memory devices, each of which has its own data path and receives separate commands. The Multicore DIMM is designed to improve the energy efficiency of memory systems with small impact on system performance. Dividing each memory module into 4 virtual memory devices brings a simultaneous 22%, 7.6%, and 18% improvement in memory power, IPC, and system energy-delay product, respectively, on a set of multithreaded applications and consolidated workloads.
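One way such a module could steer requests is by interleaving cache lines across the virtual memory devices; the line size and device count below are assumptions for illustration, not the Multicore DIMM specification:

```python
def virtual_device(addr, line_bytes=64, devices=4):
    """Which virtual memory device of the module serves a given byte
    address, with consecutive lines interleaved across the devices."""
    return (addr // line_bytes) % devices

first = virtual_device(0)        # line 0 -> device 0
second = virtual_device(64)      # the next line lands on device 1
wrap = virtual_device(64 * 4)    # line 4 wraps back to device 0
```

Interleaving spreads independent requests over the narrower per-device data paths, which is how the module can keep performance up while activating fewer DRAM chips per access.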

Patent
17 Feb 2009
TL;DR: In this article, a non-volatile memory storage system with two-stage controller is described, consisting of a plurality of flash memory devices, a plurality first-stage controllers coupled to the plurality of devices, and a storage adapter communicating with the first stage controllers through one or more internal interfaces.
Abstract: The present invention discloses a non-volatile memory storage system with two-stage controller, comprising: a plurality of flash memory devices; a plurality of first stage controllers coupled to the plurality of flash memory devices, respectively, wherein each of the first stage controllers performs data integrity management as well as writes and reads data to and from a corresponding flash memory device; and a storage adapter communicating with the plurality of first stage controllers through one or more internal interfaces.

Patent
10 Sep 2009
TL;DR: A flash memory device and method as discussed by the authors include a memory having a plurality of nonvolatile memory cells for storing stored values of user data, and a memory controller with an encoder for encoding user write data into code values stored in the memory.
Abstract: A memory device and method, such as a flash memory device and method, includes a memory having a plurality of nonvolatile memory cells for storing stored values of user data. The memory device and method includes a memory controller for controlling the memory. The memory controller includes an encoder for encoding user write data for storage of code values as the stored values in the memory. The encoder includes an inserter for insertion of an indicator as part of the stored values for use in determining when the stored values are or are not in an erased state. The memory controller includes a decoder for reading the stored values from the memory to form user read data values when the stored values are not in the erased state.

Patent
27 Jul 2009
TL;DR: In this paper, a memory device comprises first and second integrated circuit dies. A speed test is conducted on the memory core integrated circuit die, and the interface integrated circuit die is electrically coupled to the memory core integrated circuit die based on the speed of the memory core.
Abstract: A memory device comprises first and second integrated circuit dies. The first integrated circuit die comprises a memory core as well as a first interface circuit. The first interface circuit permits full access to the memory cells (e.g., reading, writing, activating, pre-charging and refreshing operations). The second integrated circuit die comprises a second interface circuit that interfaces the memory core, via the first interface circuit, to an external bus, for example through a synchronous interface. A technique combines memory core integrated circuit dies with interface integrated circuit dies to configure a memory device. A speed test on the memory core integrated circuit die is conducted, and the interface integrated circuit die is electrically coupled to the memory core integrated circuit die based on the speed of the memory core integrated circuit die.

Patent
17 Sep 2009
TL;DR: One embodiment described in this paper is main memory that includes a combination of non-volatile memory (NVM) and dynamic random access memory (DRAM), with an operating system that migrates data between the NVM and the DRAM.
Abstract: One embodiment is main memory that includes a combination of non-volatile memory (NVM) and dynamic random access memory (DRAM). An operating system migrates data between the NVM and the DRAM.

Patent
12 May 2009
TL;DR: In this article, a three dimensional memory module and system are formed with at least one slave chip stacked over a master chip, which includes a memory core for increased capacity of the memory module/system.
Abstract: A three dimensional memory module and system are formed with at least one slave chip stacked over a master chip. Through semiconductor vias (TSVs) are formed through at least one of the master and slave chips. The master chip includes a memory core for increased capacity of the memory module/system. In addition, capacity organizations of the three dimensional memory module/system that result in efficient wiring are disclosed for forming multiple memory banks, multiple bank groups, and/or multiple ranks of the three dimensional memory module/system.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: A new memory system design called decoupled DIMM is proposed that allows the memory bus to operate at a data rate much higher than that of the DRAM devices, and improves reliability, power efficiency, and cost effectiveness by using relatively slow memory devices.
Abstract: The widespread use of multicore processors has dramatically increased the demands on high bandwidth and large capacity from memory systems. In a conventional DDR2/DDR3 DRAM memory system, the memory bus and DRAM devices run at the same data rate. To improve memory bandwidth, we propose a new memory system design called decoupled DIMM that allows the memory bus to operate at a data rate much higher than that of the DRAM devices. In the design, a synchronization buffer is added to relay data between the slow DRAM devices and the fast memory bus; and memory access scheduling is revised to avoid access conflicts on memory ranks. The design not only improves memory bandwidth beyond what can be supported by current memory devices, but also improves reliability, power efficiency, and cost effectiveness by using relatively slow memory devices. The idea of decoupling, namely decoupling the bandwidth match between the memory bus and a single rank of devices, can also be applied to other types of memory systems, including FB-DIMM. Our experimental results show that a decoupled DIMM system of 2667MT/s bus data rate and 1333MT/s device data rate improves the performance of memory-intensive workloads by 51% on average over a conventional memory system of 1333MT/s data rate. Alternatively, a decoupled DIMM system of 1600MT/s bus data rate and 800MT/s device data rate incurs only 8% performance loss when compared with a conventional system of 1600MT/s data rate, with a 16% reduction in memory power consumption and 9% saving in memory energy.

Patent
11 Feb 2009
TL;DR: In this paper, a non-volatile memory is used to move data from a volatile memory to a nonvolatile one upon a loss of power of a primary power source of the volatile memory.
Abstract: A device includes: non-volatile memory; a controller in communication with the non-volatile memory, wherein the controller is programmed to move data from a volatile memory to the non-volatile memory upon a loss of power of a primary power source of the volatile memory; and a backup power supply providing temporary power to the controller and the volatile memory upon the loss of power of the primary power source, including: a capacitor bank with an output terminal; a connection to a voltage source that charges the capacitor bank to a normal operating voltage; and a state-of-health monitor that is programmed to generate a failure signal based on a voltage at the output terminal of the capacitor bank.

Proceedings ArticleDOI
03 May 2009
TL;DR: This work proposes an adaptive-rate ECC scheme with BCH codes, implemented on the flash memory controller, that lets flash memory trade storage space for higher error correction capability so it remains usable even at high noise levels.
Abstract: ECC has been widely used to enhance flash memory endurance and reliability. In this work, we propose an adaptive-rate ECC scheme with BCH codes that is implemented on the flash memory controller. With this scheme, flash memory can trade storage space for higher error correction capability to keep it usable even when there is a high noise level.
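The rate-adaptation idea can be sketched by choosing the smallest correction strength t whose residual page-failure probability, under an independent-error binomial model, meets a target. This is a generic model with assumed parameters (codeword size, error rates, target), not the paper's BCH construction:

```python
from math import comb

def page_fail_prob(n_bits, ber, t):
    """Probability that a codeword of n_bits suffers more than t bit
    errors, assuming independent errors at raw bit error rate `ber`."""
    ok = sum(comb(n_bits, k) * ber**k * (1 - ber)**(n_bits - k)
             for k in range(t + 1))
    return 1 - ok

def pick_strength(n_bits, ber, target, t_max=40):
    """Smallest correction strength t meeting the target failure rate;
    a worn, noisier block demands a stronger (higher-overhead) code."""
    for t in range(t_max + 1):
        if page_fail_prob(n_bits, ber, t) <= target:
            return t
    return None

fresh = pick_strength(4096, 1e-5, 1e-9)   # lightly worn block
worn = pick_strength(4096, 1e-3, 1e-9)    # heavily worn, noisier block
```

As the raw bit error rate climbs with wear, the selected t grows, and with it the parity overhead, which is exactly the storage-for-reliability trade the scheme exploits.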

Patent
14 May 2009
TL;DR: A memory controller for phase change memory (PCM) that can be used on a storage bus interface is described. In one example, the memory controller includes an external bus interface coupled to an external bus to communicate read and write instructions with an external device.
Abstract: A memory controller for a phase change memory (PCM) that can be used on a storage bus interface is described. In one example, the memory controller includes an external bus interface coupled to an external bus to communicate read and write instructions with an external device, a memory array interface coupled to a memory array to perform reads and writes on a memory array, and an overwrite module to write a desired value to a desired address of the memory array.

Patent
13 Apr 2009
TL;DR: A self-testing memory module as discussed by the authors includes a printed circuit board configured to be operatively coupled to a memory controller of a computer system and includes a plurality of memory devices, each memory device of the plurality comprising data, address, and control ports.
Abstract: A self-testing memory module includes a printed circuit board configured to be operatively coupled to a memory controller of a computer system and includes a plurality of memory devices on the printed circuit board, each memory device of the plurality of memory devices comprising data, address, and control ports. The memory module also includes a control module configured to generate address and control signals for testing the memory devices. The memory module includes a data module comprising a plurality of data handlers. Each data handler is operable independently from each of the other data handlers of the plurality of data handlers. Each data handler is operatively coupled to a corresponding plurality of the data ports of one or more of the memory devices and is configured to generate data for writing to the corresponding plurality of data ports.

Proceedings ArticleDOI
29 May 2009
TL;DR: A floating-body Z-RAM® memory cell is presented to fabricate a high-density, low-latency, and high-bandwidth 4Mb memory macro building block, targeted at the requirements of microprocessor caches.
Abstract: To meet advancing market demands, microprocessor embedded memory applications require denser and faster memory arrays with each process generation. Recent work presented an 18.5ns 128Mb DRAM with a floating body cell for conventional DRAM products [1] and a 4Mb memory macro using a memory cell built with two floating body transistors [2]. This paper presents a floating-body Z-RAM® memory cell [3] to fabricate a high-density low-latency and high-bandwidth 4Mb memory macro building block, targeted at the requirements of microprocessor caches. It uses a single transistor (1T), unlike traditional 1T1C DRAM [4], or six transistor 6T-SRAM memory cells [5].

Patent
16 Jun 2009
TL;DR: In this article, a file system that is supported by a nonvolatile memory that is directly connected to a memory bus, and placed side by side with a dynamic random access memory (DRAM), is described.
Abstract: Implementations of a file system that is supported by a non-volatile memory that is directly connected to a memory bus, and placed side by side with a dynamic random access memory (DRAM), are described.

Patent
Chris Nga Yee Avila, Jonathan Hsu, Alexander Kwok-Tung Mak, Jian Chen, Grishma Shah
16 Jun 2009
TL;DR: In a nonvolatile memory system, as described in this paper, data received from a host by a memory controller is transferred to an on-chip cache, and new data from the host displaces the previous data before it is written to the nonvolatile memory array.
Abstract: In a nonvolatile memory system, data received from a host by a memory controller is transferred to an on-chip cache, and new data from the host displaces the previous data before it is written to the nonvolatile memory array. A safe copy is maintained in on-chip cache so that if a program failure occurs, the data can be recovered and written to an alternative location in the nonvolatile memory array.
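A toy sketch of the recovery path: the controller keeps the safe copy and, on a program failure, retries at an alternative location so the data is not lost (all names, the failure model, and the block numbers are hypothetical):

```python
class FlakyArray:
    """Toy flash array in which block 7 always fails to program."""
    def __init__(self):
        self.cells = {}

    def program(self, block, data):
        if block == 7:
            return False          # simulated program failure
        self.cells[block] = data
        return True

def program_with_backup(cache, array, block, data, alternates):
    """Keep a safe copy in the controller's cache; on program failure,
    retry at an alternative block instead of losing the data."""
    cache[block] = data           # safe copy survives a failed program
    for target in [block] + list(alternates):
        if array.program(target, data):
            return target
    raise IOError("all candidate blocks failed to program")

arr = FlakyArray()
written_at = program_with_backup({}, arr, block=7, data=b"payload",
                                 alternates=[9])
```

The safe copy is only discarded once some location reports a successful program, so a failure mid-write never leaves the host's data unrecoverable.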

Patent
20 May 2009
TL;DR: In this article, a method is described comprising: determining that a minimum operation level of an integrated circuit (100) has been reached and that a sleep mode is therefore allowable; storing minimum operation context information to a RAM (115) in response to that determination; switching to sleep mode code (116) in the RAM; and transferring memory control from a primary memory controller (104) to a secondary memory controller (112), where only the secondary memory controller can control the RAM.
Abstract: A method comprising determining that a minimum operation level of an integrated circuit (100) has been reached and that a sleep mode is therefore allowable; storing minimum operation context information to a RAM (115) in response to determining that the minimum operation level has been reached; switching to a sleep mode code (116) in the RAM (115); and transferring memory control from a primary memory controller (104) to a secondary memory controller (112) wherein only the secondary memory controller (112) controls the RAM (115). The method may include storing the sleep mode code (116) and a wakeup code (117) in the RAM (115) in response to determining that sleep mode is allowable, where the wakeup code (117) restores a minimum operation context using the minimum operation context information stored in the RAM (115). The method may also include placing a plurality of integrated circuit power islands into a sleep mode and leaving a secondary memory controller power island (109) in a normal power mode.

Patent
16 Oct 2009
TL;DR: In this paper, the authors propose to use bit-map memory to reduce the amount of memory-to-memory copying required to establish a checkpoint in a post-image checkpointing scenario.
Abstract: System-directed checkpointing is enabled in otherwise standard computers through relatively straightforward augmentations to the computer's memory controller hub. Firmware routines executed by a control and dispatch unit that is normally part of any memory controller hub enable it to implement any of six different checkpointing strategies: post-image checkpointing in which an image of the system state at the time of the last checkpoint is maintained in a local shadow memory; post-image checkpointing in which an image of the system state at the time of the last checkpoint is maintained in a shadow memory located in a second, backup computer; post-image checkpointing using a bit-map memory, having one bit representing each data block in system memory, to reduce the amount of memory-to-memory copying required to establish a checkpoint; post-image checkpointing to a local shadow memory using two bit-map memories to enable normal processing to continue while the shadow is being updated; post-image checkpointing to a local shadow memory using a block-state memory that eliminates the need for any memory-to-memory copying; and local pre-image checkpointing that does not require a shadow memory. Since each of these implementations has advantages and disadvantages relative to the others, and since similar mechanisms are used in the memory controller hub for all of these options, it can be designed to support all of them, with hardwired or settable status bits defining which is to be supported in a given situation.
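The bit-map strategy copies only blocks written since the last checkpoint. A minimal sketch of that mechanism (using dictionaries as stand-ins for system and shadow memory is an illustrative simplification):

```python
def take_checkpoint(main_mem, shadow, dirty):
    """Copy only the blocks whose dirty bit was set since the last
    checkpoint into the shadow, then clear the bitmap."""
    copied = 0
    for blk in sorted(dirty):
        shadow[blk] = main_mem[blk]
        copied += 1
    dirty.clear()
    return copied

main_mem = {i: f"data{i}" for i in range(8)}
shadow = dict(main_mem)      # shadow holds the previous checkpoint's image
dirty = {2, 5}               # the controller hub marked two writes
main_mem[2], main_mem[5] = "new2", "new5"
copied = take_checkpoint(main_mem, shadow, dirty)
```

Only the two written blocks cross to the shadow, rather than the whole memory image, which is the copying reduction the bit-map memory buys.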