RECITATION NOTES 15-451 Algorithms 10/22/03

* hand out and go over midterm
* related problems
* more on graph matrix algorithms [if time]
==========================================================================
Midterm solutions:

1. Can ask people why for each one.
   (a) O(n log n)
   (b) O(n^3)
   (c) O(n)
   (d) O(n)
   (e) m/2
   (f) false (for several reasons: Omega allows a multiplicative
       constant, it has the "n_0" built in, etc.)
   (g) any level-order of the tree will do.
   (h) all but the first one.
   (i) store the number of elements below it. Why not the number of
       elements less than it? Because we would need to update that even
       when something is inserted far away. How can you use the number
       of elements below a node to find the kth smallest? Look at the
       counts stored in the left and right children of the root and use
       them to decide which way to go.

   Related issues: properties of DFS trees: all non-tree edges must go
   from a node to one of its ancestors. Can use DFS to easily tell if a
   graph has cycles.

2. (a) There are several ways to do the proof.

   Proof #1: hash everything using h_1. Since the range is {0,1}, we
   can view this as putting everything into two buckets. Take the
   larger bucket and hash its elements using h_2. Again take the larger
   bucket, and so on. Each step cuts the bucket size down by at most a
   factor of 2, so since |U| > 2^k, after all k steps there must still
   be a bucket with at least 2 elements. Call two of them x and y. They
   hashed the same way under all k functions.

   Proof #2: for each x in U, look at its "profile"
   (h_1(x), ..., h_k(x)). There are only 2^k possible profiles. Since
   |U| > 2^k, there must exist two elements x and y with the same
   profile. By definition of "profile", they collide under every h in H.

   (b) We know there must exist x and y that collide under all of the
   first n-1 functions in H. We need the probability of collision to be
   at most 1/2, so we need at least n-1 other functions where x and y
   don't collide. That is at least 2(n-1) functions total.

3. There are a couple of ways to do this. For simplicity we will just
   talk about *sizes*.
   Method #1: for each v, compute
     (1) C_v  = size of best VC for T_v.
     (2) C'_v = size of best VC for T_v given that it must contain v.
   We can now combine these as follows:
     C'_v = 1 + sum_{children w of v} C_w
     C_v  = min(C'_v, sum_{children w of v} C'_w)
   Base case: if v is a leaf then C_v = 0, C'_v = 1.

   Method #2: we could instead define
     (1) C_v  = size of best VC for T_v given that it must NOT contain v.
     (2) C'_v = size of best VC for T_v given that it must contain v.
   These combine in this way:
     C'_v = 1 + sum_{children w of v} min(C_w, C'_w)
     C_v  = sum_{children w of v} C'_w
   Both of these algorithms are linear time. Why?

   Method #3: actually this problem can be solved without DP. For each
   leaf w, we know we must have either w or its parent v in the VC, so
   we might as well put in v. Then we can strip v and w out of the tree
   and recurse on the forest left over.

4. (a) If all edges have different lengths, then there exists a unique
   MST. Proof: say M is different from Prim's tree. Consider the first
   edge e that Prim puts in (adding it to its current tree T) that is
   not in M. If we add e to M we get a cycle in M. Walk around the
   cycle until we get back to T. The edge e' we take to get back into T
   must be longer than e, by definition of Prim's algorithm. So
   replacing e' with e in M gives a better tree, which is a
   contradiction.

   Common problems: if you go in the other direction (pick some e' in M
   but not in Prim's tree) and then try to find an e in Prim's tree to
   swap it with, you need to prove that there is an edge e that will
   work; not just any arbitrary one will do. Also, some people said
   that since all edge lengths are distinct, if Prim's tree and M
   differ in only one edge swap, they must have different costs. But
   that doesn't hold if they differ in more than one edge swap.

   (b) False. E.g., maybe all edges in G have length 1, but G itself is
   a tree, so all of them have to be in the MST.

   (c) True. To prove this, we need to argue that each time we delete
   an edge, there is still an MST consistent with the remaining graph.
   Say we are about to delete edge (u,v), and M is an MST containing
   (u,v). By definition of our algorithm, since removing (u,v) doesn't
   disconnect our current graph, we can remove (u,v) from M and then
   reconnect the two pieces of M using some other edge in our current
   graph. This edge can only be shorter than (or equal to) the length
   of (u,v).

=======================
[some recitations didn't get to this]

Other topics: questions about graph matrix algorithms?

In lecture we did a Prim-like algorithm for finding the maximum
bottleneck path. Suppose we wanted to find this for every pair of
nodes. Say each edge e has a capacity c_e. What we want, for each pair
of vertices (s,t), is the capacity of the highest-capacity path from s
to t. (Or, to use a more Pittsburgh-like example, say the edges are
tunnels, and c_e is the height of the tallest truck that can use
tunnel e. We are trying to compute, for each pair (s,t), the height of
the tallest truck that can get from s to t.) We are just trying to
compute the values, not the actual paths.

How could we do this by changing the definition of matrix
multiplication? To start, we want A[i,j] = max capacity of a path from
i to j using 1 or fewer edges. How to construct it? Put infinities on
the diagonal; for each edge e=(i,j) in E, set A[i,j] = c_e; the rest of
the entries are 0. Now, say we want to compute the widest path using 2
or fewer edges. How do we want to change the definition of matrix
multiplication? Ans: we change "*" to "min" and change "+" to "max".
Then we just do repeated squaring of the matrix log(n) times.
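The max-min squaring idea above can be sketched in a few lines of code. This is a minimal illustration (function and variable names are my own, not from lecture), assuming an undirected graph given as a list of (i, j, capacity) triples:

```python
# All-pairs widest (max-bottleneck) path by repeated squaring, with
# "*" replaced by min and "+" replaced by max, as described above.
import math

def widest_paths(n, edges):
    """n vertices 0..n-1; edges is a list of (i, j, capacity) triples
    (undirected). Returns W with W[i][j] = capacity of the widest
    path from i to j."""
    # A = widest path using 1 or fewer edges: infinity on the
    # diagonal, c_e for each edge, 0 (meaning "no path") elsewhere.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = math.inf
    for i, j, c in edges:
        A[i][j] = max(A[i][j], c)
        A[j][i] = max(A[j][i], c)

    def maxmin_square(M):
        # (M^2)[i][j] = max over k of min(M[i][k], M[k][j])
        return [[max(min(M[i][k], M[k][j]) for k in range(n))
                 for j in range(n)] for i in range(n)]

    # Each squaring doubles the number of edges allowed on a path, so
    # ceil(log2(n)) rounds cover all simple paths (<= n-1 edges).
    for _ in range(max(1, math.ceil(math.log2(n)))):
        A = maxmin_square(A)
    return A
```

For example, on the 4-node graph with edges (0,1,5), (1,2,3), (0,2,1), (2,3,4), the widest path from 0 to 3 goes 0-1-2-3 with bottleneck capacity 3, beating the direct route 0-2-3 whose bottleneck is 1. Each squaring costs O(n^3), so the whole computation is O(n^3 log n).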