Matthew Wachs

Fourth-Year Ph.D. Student (as of Fall 2007)

Computer Science Department
School of Computer Science
Carnegie Mellon University

Parallel Data Laboratory

E-mail: Look at the URL of this page. Take everything between the tilde (~) and the following slash (/), append "+web" to it, then append "@cs.cmu.edu" to it. Alternatively, look for my email address at or near the top of this list.



Current Research

I'm working on performance insulation for shared storage servers. Shared storage servers are an appealing alternative to per-application, dedicated storage systems. However, it is essential that applications sharing a server receive good performance, fairness, and efficiency. Unfortunately, interference between workloads may reduce all three. With a combination of three techniques, we've been able to approach the goal of providing each of n clients with 1/n of its standalone throughput, while keeping response times reasonable [read more | web site]. The next step is to provide similar guarantees to a workload that uses multiple servers to store its data (using RAID or erasure coding).
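To make the 1/n goal concrete, here is a minimal sketch of how one might score a sharing experiment against it. It is only an illustration of the metric, not the techniques themselves; the function name `insulation_efficiency` and the sample throughput numbers are made up for this example.

```python
def insulation_efficiency(standalone, shared):
    """Score each workload against the 1/n goal.

    standalone: dict mapping workload name -> throughput when run alone
    shared:     dict mapping workload name -> throughput when sharing the server
    Returns a dict mapping workload name -> shared throughput divided by
    (standalone throughput / n). A value of 1.0 means the workload received
    exactly 1/n of its standalone throughput.
    """
    n = len(standalone)
    scores = {}
    for name, alone in standalone.items():
        target = alone / n  # the 1/n share of standalone throughput
        scores[name] = shared[name] / target
    return scores


if __name__ == "__main__":
    # Hypothetical numbers (e.g. MB/s) for two workloads sharing one server.
    standalone = {"sequential_scan": 100.0, "random_reads": 40.0}
    shared = {"sequential_scan": 45.0, "random_reads": 20.0}
    for name, score in insulation_efficiency(standalone, shared).items():
        print(f"{name}: {score:.2f} of its 1/n target")
```

A score below 1.0 for a workload suggests that interference is costing it more than its fair share of the server.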

I'm also working on parallel application I/O tracing for benchmarking. The best benchmark for a real application is the real application, or trace replay based on traces from that application. Unfortunately, running the real application against a new or different storage system can be difficult, or even impossible if the application or data set is classified, confidential, or sensitive. Trace replay can be significantly more straightforward and can be done with 'dummy' data. For parallel applications, however, accurate trace replay requires respecting the dependencies between multiple nodes. Thus, it is necessary to discover these dependencies during the trace extraction process. We've proposed and implemented a black-box technique to do this by running a parallel application, slowing down nodes, and observing how other nodes react [read more | web site]. The next step is to design a technique that involves less tracing time and can handle applications whose request patterns change when the timing of nodes is altered.
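The idea of slowing down one node and watching the others can be illustrated with a toy simulation. This is a simplified sketch, not the actual tool: the simulator `run`, the function `discover_influence`, and the three-node example are hypothetical, and the hidden `deps` table stands in for a real parallel application whose dependencies are unknown to the observer. The sketch assumes the dependencies are acyclic, and it reports every node whose timing shifts, so it finds transitive influence rather than direct dependencies.

```python
def run(deps, work, delay=None):
    """Toy stand-in for running the application once.

    deps:  dict mapping node -> list of nodes it must wait for (hidden truth)
    work:  dict mapping node -> time the node spends on its own work
    delay: optional dict mapping node -> artificial slowdown added to it
    Returns a dict mapping node -> observed finish time.
    """
    delay = delay or {}
    finish = {}

    def finish_time(node):
        if node not in finish:
            # A node starts once every node it depends on has finished.
            start = max((finish_time(d) for d in deps.get(node, [])), default=0.0)
            finish[node] = start + work[node] + delay.get(node, 0.0)
        return finish[node]

    for node in work:
        finish_time(node)
    return finish


def discover_influence(deps, work, slowdown=5.0):
    """Slow each node in turn and record which other nodes finish later."""
    baseline = run(deps, work)
    influenced = {}
    for node in work:
        perturbed = run(deps, work, {node: slowdown})
        influenced[node] = {
            other for other in work
            if other != node and perturbed[other] > baseline[other]
        }
    return influenced


if __name__ == "__main__":
    # Hypothetical chain: n1 waits for n0, and n2 waits for n1.
    deps = {"n1": ["n0"], "n2": ["n1"]}
    work = {"n0": 1.0, "n1": 1.0, "n2": 1.0}
    for node, others in discover_influence(deps, work).items():
        print(node, "slows down:", sorted(others))
```

Note that this requires one extra run per node, which hints at why reducing tracing time is a natural next step.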

My advisor is Professor Greg Ganger in the Parallel Data Laboratory at CMU.



Publications
Please visit the PDL web site if the above links do not work.



Support

I appreciate the support of a National Defense Science and Engineering Graduate (NDSEG) Fellowship, thanks to the Air Force Office of Scientific Research (AFOSR).



Education

I double-majored in Computer Science and Math in the College of Arts and Sciences at Cornell University. I graduated in May 2004.



Teaching

I was a teaching assistant for 15-213 (Fall 2007), Carnegie Mellon's course on computer architecture from a programmer's perspective (covering topics such as the representation of ints and floats, assembly language, and buffer overflows). It was taught by Professor Todd Mowry and Professor Greg Ganger.

I was a teaching assistant for CS 381 in Fall 2003 with Professor John Hopcroft. CS 381 is Cornell's required CS theory course covering finite automata, context-free languages, and Turing machines.

I was a teaching assistant for CS 482 in Spring 2004 with Professor Jon Kleinberg. CS 482 is Cornell's required CS theory course covering algorithms topics such as greedy algorithms, dynamic programming, network flow, and NP-completeness.



Last Modified: February 15, 2008