15-721 Assignment 2:
Heap Files

Out: September 24th -- Due: October 8th

Introduction

This assignment has two major parts. You have to do the following:

  1. Familiarize yourself with the heap file and heap file page interfaces. You should already be familiar with the buffer manager and disk space manager interfaces from the previous assignment.
  2. Implement the HeapFile class--your solution for this part should be contained in heapfile.C (along with any modifications you may want to make in heapfile.h).
  3. Implement the HFPage class--your solution for this part should be contained in hfpage.[Ch].

The main contact person for this assignment is Spiros Papadimitriou. Please direct your questions, complaints and/or flames to spapadim+@cs.cmu.edu!

Prerequisites

From the previous assignment, you should be familiar with how the buffer manager works. Make sure that you know how the heap file is organized and works. You should also be familiar with the slotted page model and how variable length records can be stored in such a page. In addition, HTML documentation is available for Minibase (mirrored locally on the course webpages).

Classes to Familiarize Yourself With First

There are four main classes with which you should be familiar: HeapFile, HFPage, BufferMgr and DB. A heap file is seen as a collection of records. Internally, records are stored on a collection of HFPage objects.

Compiling the Code and Running the Tests

Copy all files from /afs/cs/academic/class/15721-f01/mini_hwk/assign/Heap/src into your own local working directory and study them carefully. The files are:

First of all, create the files for your solution, ie. heapfile.C and hfpage.C. Then do not forget to run `make depend'! You need to run this only once (or whenever you change included files etc.) to produce Makefile.depend.

Run `make heap' (or just `make'). This will compile the .C files in your directory, and create a program called heap. Run this command, and you should see output from running all tests.

Besides the default heap target, the provided Makefile also offers the following options:

Note that if you submit a modified Makefile, your default target must be heap and produce an executable of the same name--our automated testing scripts will expect this, so please follow this!

Finally, as in the previous assignment, you should use the GNU version of `make'!

Part One: Heap File

In this part you will implement the overall heap file manager. You can ignore concurrency control and recovery issues, and concentrate on implementing the interface routines.

A heap file is an unordered set of records that supports the following set of operations:

  1. Heap files can be created and destroyed.
  2. Existing heap files can be opened and closed.
  3. Records can be inserted and deleted.
  4. Each record is uniquely identified by a record ID (RID). A specific record can be retrieved by using its record ID.
The above correspond to the public member functions we expect you to implement--private members are entirely up to you. Note that the destructor should simply call deleteFile(), which in turn permanently wipes out the heap file from the database.

There are two main pieces of code to be written:

  1. The free space manager: You must deal with free space intelligently, using a directory based structure (recommended) or a linked list to identify pages with room for records. When a record is deleted, you must update your information about available free space, and when a record is inserted, you must try to use free space on existing pages before allocating a new page.
  2. Implement the class Scan (instances of which are returned by openScan()), which performs a scan on the heap file. A scan iterator supports a simple getNext() interface. Successive calls to the getNext() function return successive elements in the heap file. You have to maintain state information (which depends on your particular implementation of your heapfile) in the scan class in order to achieve this effect.

Part Two: Heap File Page

In this part, you will implement the page structure for the heap file layer (ie. the member function for the HFPage class). You will be given libraries for the lower layers (Buffer Manager and Disk Space Manager), code for the HeapFile layer, and some driver routines to test that code.

Design Overview and Implementation Details

Have a look at the file hfpage.h. It contains the interfaces for the HFPage class. This class implements a "heap-file page" object. Note that the protected data members of the page are given to you. All you should need to do is implement the public member functions. You should put all your code into the file hfpage.C.

A note on the slot directory: The first entry of the slot directory (i.e., HFPage::slot[0]) is right at the end of the data area of the page. If you want to add more slots to the page (and you will), you will have to put them in the data area of the page, which means two things:

  1. New slots will be at negative offsets in the slot directory array, i.e. the next slot after 0 will be found at HFPage::slot[-1], the one after that at HFPage::slot[-2], etc.
  2. In order to add a record to a page, there has to be room for the record itself in the data area, and also room for a new slot in the data area (unless there happens to be pre-allocated slot that's empty).

The Methods to be Implemented

Here is an overview of the API you have to implement:

What to Turn In, and When

You should turn in copies of your code (ie. at least the files heapfile.[Ch] and hfpage.[Ch]). Please use the same scheme as for the previous assignment, ie. copy your files into:

/afs/cs/academic/class/15721-f01/handin/[AndrewID]/Heap.
If you have not created any more files and/or modified the Makefile, then you should be able to use the `make handin' shorthand.

If anything goes terribly wrong with the AFS handin directories, you can email your files to one of the TAs before the deadline is over. The uuencode target is provided once again, should you need it (although standard MIME attachments are preferred).

The assignment is due at midnight on October 8th. The solution will be made public after that time, and solutions turned in after that time will not receive any credit. So be sure to turn in whatever you have, even if it is not working fully, at that time.

About this document ...

This document was generated using the LaTeX2HTML translator Version 99.1 release (March 30, 1999)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.