Project Proposal for CS740: Computer Architecture

Fast Block Operation in DRAM

Group Members: Ningning Hu (hnn@cs.cmu.edu)
               Jichuan Chang (cjc@cs.cmu.edu)
Project Home Page: http://www.cs.cmu.edu/~hnn/cs740-project.html

Project Description: In a traditional DRAM chip, an entire row of bits is read into a latch upon a RAS signal (subsequent CAS signals are used to access individual bits within this row). Because reading a row is a destructive operation, the latched row values must be written back to the DRAM row after each access. We plan to modify the DRAM chip so that the current contents of the latch can be written back to an arbitrary row of the DRAM cell array. This would be a variation on the RAS signal that causes a write rather than a read of a row. If all of the DRAM chips in the system can do this simultaneously, a whole block of bits could potentially be copied in just two DRAM cycles, which could greatly improve system performance. Fast block operations in DRAM can move large blocks of data quickly from one region of memory to another, and can quickly clear a large block of memory (for example, when allocating a new page). We will study how to implement these functions in DRAM and evaluate the resulting performance improvement. We will also consider other kinds of operations that could be executed directly in DRAM, and try to improve their performance.
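The two-cycle row copy described above can be illustrated with a toy software model. This is only a sketch under stated assumptions: the type dram_bank, the functions ras_read(), ras_write() and row_copy(), and the sizes NUM_ROWS and ROW_BYTES are hypothetical names chosen for illustration, and the model counts one "cycle" per row-level operation without modeling real DRAM timing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: sizes and names are illustrative assumptions. */
#define NUM_ROWS  8
#define ROW_BYTES 16

typedef struct {
    uint8_t  rows[NUM_ROWS][ROW_BYTES]; /* DRAM cell array */
    uint8_t  latch[ROW_BYTES];          /* row latch filled by a RAS */
    unsigned cycles;                    /* row-level operations performed */
} dram_bank;

/* Conventional RAS: read an entire row into the latch.
 * (In real DRAM the read is destructive and the latch is written back
 * to the same row; this model simply leaves the row intact.) */
void ras_read(dram_bank *b, unsigned row)
{
    assert(row < NUM_ROWS);
    memcpy(b->latch, b->rows[row], ROW_BYTES);
    b->cycles++;
}

/* Proposed RAS variant: write the current latch contents to an
 * arbitrary row instead of reading that row. */
void ras_write(dram_bank *b, unsigned row)
{
    assert(row < NUM_ROWS);
    memcpy(b->rows[row], b->latch, ROW_BYTES);
    b->cycles++;
}

/* Copy one row to another in two row-level operations. */
void row_copy(dram_bank *b, unsigned src, unsigned dst)
{
    ras_read(b, src);
    ras_write(b, dst);
}
```

In this model, calling row_copy(&bank, 2, 5) on a zero-initialized bank leaves row 5 identical to row 2 and advances the cycle counter by two, regardless of the row width.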

Plan of Attack: We will first add a special instruction to SimpleScalar that implements the fast memory copy operation. Our idea is to use this instruction in place of the existing copy code in the most commonly used memory routines of the C library (glibc), such as memcpy() and memset(), so that they take full advantage of the special block operations in DRAM. When copying consecutive memory blocks, the data will not be read into registers or the cache; instead, it will be written directly to the destination address using the DRAM read-write mechanism. We hope this will speed up typical memory operations and thereby improve the performance of the whole system (the operating system also performs memory copies frequently).
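One way a memcpy()-style routine might use such an instruction is sketched below. Everything here is an assumption made for illustration: sim_mem, sim_row_copy(), sim_byte_copy(), sim_memcpy() and the row geometry are hypothetical, sim_row_copy() merely stands in for the proposed instruction, and the sketch assumes a whole row can be block-copied only when source and destination share the same offset within their rows. It does not model caches or the actual SimpleScalar instruction set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical geometry for the simulated memory. */
#define SIM_ROWS      16
#define SIM_ROW_BYTES 32
#define SIM_MEM_BYTES (SIM_ROWS * SIM_ROW_BYTES)

typedef struct {
    uint8_t  bytes[SIM_MEM_BYTES]; /* row r starts at r * SIM_ROW_BYTES */
    unsigned row_ops;              /* simulated block-copy instructions */
    unsigned byte_ops;             /* bytes moved through the ordinary path */
} sim_mem;

/* Stand-in for the proposed instruction: copy one whole row inside DRAM. */
void sim_row_copy(sim_mem *m, size_t src_row, size_t dst_row)
{
    assert(src_row < SIM_ROWS && dst_row < SIM_ROWS);
    memmove(m->bytes + dst_row * SIM_ROW_BYTES,
            m->bytes + src_row * SIM_ROW_BYTES, SIM_ROW_BYTES);
    m->row_ops++;
}

/* Ordinary path: data moves through the processor byte by byte. */
void sim_byte_copy(sim_mem *m, size_t dst, size_t src, size_t n)
{
    memmove(m->bytes + dst, m->bytes + src, n);
    m->byte_ops += (unsigned)n;
}

/* memcpy-like routine over simulated addresses; regions must not overlap. */
void sim_memcpy(sim_mem *m, size_t dst, size_t src, size_t n)
{
    assert(n <= SIM_MEM_BYTES);
    assert(dst <= SIM_MEM_BYTES - n && src <= SIM_MEM_BYTES - n);
    assert(dst + n <= src || src + n <= dst);

    /* Rows can be block-copied only if both addresses have the same
     * offset within a row; otherwise fall back to the ordinary path. */
    if (dst % SIM_ROW_BYTES != src % SIM_ROW_BYTES) {
        sim_byte_copy(m, dst, src, n);
        return;
    }

    /* Head: bytes before the first row boundary. */
    size_t head = (SIM_ROW_BYTES - src % SIM_ROW_BYTES) % SIM_ROW_BYTES;
    if (head > n)
        head = n;
    sim_byte_copy(m, dst, src, head);
    dst += head; src += head; n -= head;

    /* Body: whole rows, one block-copy instruction each. */
    while (n >= SIM_ROW_BYTES) {
        sim_row_copy(m, src / SIM_ROW_BYTES, dst / SIM_ROW_BYTES);
        dst += SIM_ROW_BYTES; src += SIM_ROW_BYTES; n -= SIM_ROW_BYTES;
    }

    /* Tail: bytes after the last whole row. */
    sim_byte_copy(m, dst, src, n);
}
```

The head/body/tail split keeps partial rows on the ordinary path, so the block instruction only ever moves complete rows. A memset()-style routine could follow the same pattern, but how the row latch would be cleared is left open here.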

Schedule:

Jichuan Chang and Ningning Hu will work on the above tasks together.

Milestone: By Nov. 20, we should have finished implementing the new memory instruction in the simulator and should have started the evaluation.


Literature Search:

Resources Needed:

Getting Started: We have already read the related papers on advanced memory systems and have partially installed SimpleScalar (we ran into some trouble installing SimpleScalar's GCC and glibc on Linux).