ABSTRACT

    Carnegie Mellon, School of Computer Science

    Weaving Relations for Cache Performance

    Anastassia Ailamaki, David J. DeWitt, Mark D. Hill, Marios Skounakis

    Carnegie Mellon University
    Pittsburgh, PA 15213

    Relational database systems have traditionally optimized for I/O performance and organized records sequentially on disk pages using the N-ary Storage Model (NSM) (a.k.a. slotted pages). Recent research, however, indicates that cache utilization and performance are becoming increasingly important on modern platforms. In this paper, we first demonstrate that in-page data placement is the key to high cache performance and that NSM exhibits low cache utilization on modern platforms. Next, we propose a new data organization model called PAX (Partition Attributes Across) that significantly improves cache performance by grouping together all values of each attribute within each page. Because PAX only affects layout inside the pages, it incurs no storage penalty and does not affect I/O behavior. According to our experimental results, when compared to NSM, (a) PAX exhibits superior cache and memory bandwidth utilization, saving at least 75% of NSM’s stall time due to data cache accesses, (b) range selection queries and updates on memory-resident relations execute 17-25% faster, and (c) TPC-H queries involving I/O execute 11-48% faster.
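
    The difference between the two in-page layouts can be illustrated with a toy model. The following is a minimal sketch, not the paper's implementation: the function names, the flat-list "page", and the sample records are all hypothetical, and real pages also carry headers, offsets, and variable-length fields that are omitted here. It only shows that NSM interleaves attributes record by record, while PAX keeps the same records on the same page but groups each attribute's values together, so a scan of one attribute touches contiguous data.

```python
# Toy model of in-page layout (illustrative assumption, not the paper's code).
# A "page" is a plain Python list; real pages are fixed-size byte arrays.

def nsm_page(records):
    """NSM: store records one after another, so the attributes of
    each record sit next to each other on the page."""
    page = []
    for rec in records:
        page.extend(rec)
    return page


def pax_page(records):
    """PAX-style: same records on the same page, but all values of
    each attribute are grouped together (one group per attribute)."""
    if not records:
        return []
    return [list(column) for column in zip(*records)]


def scan_attribute_nsm(page, n_attrs, attr):
    """Read one attribute from an NSM page: a strided walk that
    skips over the other attributes of every record."""
    return page[attr::n_attrs]


def scan_attribute_pax(page, attr):
    """Read one attribute from a PAX-style page: one contiguous group."""
    return page[attr]


# Hypothetical sample data: (id, name, age)
records = [(1, "Ann", 30), (2, "Bob", 45), (3, "Cy", 20)]

nsm = nsm_page(records)
pax = pax_page(records)

ages_nsm = scan_attribute_nsm(nsm, n_attrs=3, attr=2)
ages_pax = scan_attribute_pax(pax, attr=2)
```

    Both scans return the same values; what differs is the access pattern. In the NSM page the ages are separated by the other attributes of each record, whereas in the PAX-style page they are adjacent, which is the property the abstract credits for better cache utilization. Both layouts hold the same number of values per page, consistent with the claim that only placement inside the page changes.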

    FULL PAPER: pdf


    Last updated 10 September, 2004