RECITATION NOTES 15-451 Algorithms 10/22/03

* hand out and go over midterm
* related problems
* more on graph matrix algorithms [if time]
==========================================================================
Midterm solutions:

1. Can ask people why for each one.
   (a) O(n log n)
   (b) O(n^3)
   (c) O(n)
   (d) O(n)
   (e) m/2
   (f) false (for several reasons: Omega allows a multiplicative
       constant, it has the "n_0" built in, etc.)
   (g) any level-order of the tree will do.
   (h) all but the first one.
   (i) store the number of elements below it. Why not the number of
       elements less than it? Because we would need to update that even
       when something is inserted far away. How can you use the number
       of elements below a node to find the kth smallest? Look at the
       counts stored in the left and right children of the root and use
       them to decide which way to go.

   Related issues: properties of DFS trees: all non-tree edges must go
   from a node to one of its ancestors. Can use DFS to easily tell if a
   graph has cycles.

2. (a) There are several ways to do the proof.

   Proof #1: hash everything using h_1. Since the range is {0,1}, we
   can view this as putting everything into two buckets. Take the
   larger bucket and hash its elements using h_2. Again take the larger
   bucket, and so on. Each step cuts the bucket size down by at most a
   factor of 2, so since |U| > 2^k, after all k steps there must still
   be a bucket with at least 2 elements. Call two of them x and y. They
   hashed the same way under all k functions.

   Proof #2: for each x in U, look at its "profile"
   (h_1(x), ..., h_k(x)). There are only 2^k possible profiles. Since
   |U| > 2^k, there must exist two elements x and y with the same
   profile. By definition of "profile", they collide under every h in H.

   (b) We know there must exist x and y that collide under all of the
   first n-1 functions in H. We need the probability of collision to be
   at most 1/2, so we need at least n-1 other functions where x and y
   don't collide. That is at least 2(n-1) functions total.

3. There are a couple of ways to do this. For simplicity we will just
   talk about *sizes*.
   Method #1: for each v, compute
     (1) C_v  = size of best VC for T_v.
     (2) C'_v = size of best VC for T_v given that it must contain v.
   We can now combine these as follows:
     C'_v = 1 + sum_{children w of v} C_w
     C_v  = min(C'_v, sum_{children w of v} C'_w)
   Base case: if v is a leaf then C_v = 0, C'_v = 1.

   Method #2: we could instead define
     (1) C_v  = size of best VC for T_v given that it must NOT contain v.
     (2) C'_v = size of best VC for T_v given that it must contain v.
   These combine in this way:
     C'_v = 1 + sum_{children w of v} min(C_w, C'_w)
     C_v  = sum_{children w of v} C'_w
   Both of these algorithms are linear time. Why?

   Method #3: actually this problem can be solved without DP. For each
   leaf w, we know we must have either w or its parent v in the VC, so
   we might as well put in v. Then we can strip v and w out of the tree
   and recurse on the forest left over.

4. (a) If all edges have different lengths, then there exists a unique
   MST. Proof: say M is different from Prim's tree. Consider the first
   edge e that Prim puts in (adding it to its current tree T) that is
   not in M. If we add e to M we get a cycle in M. Walk around the
   cycle until we get back to T. The edge e' we take to get back into T
   must be longer than e, by definition of Prim's algorithm. So
   replacing e' with e in M gives a better tree, which is a
   contradiction.

   Common problems: if you go in the other direction (pick some e' in M
   but not in Prim's tree) and then try to find an e in Prim's tree to
   swap it with, you need to prove that there is an edge e that will
   work; not just any arbitrary one will do. Also, some people said
   that since all edge lengths are distinct, if Prim's tree and M
   differ in only one edge swap, they must have different costs. But
   that doesn't hold if they differ in more than one edge swap.

   (b) False. E.g., maybe all edges in G have length 1, but G itself is
   a tree, so all of them have to be in the MST.

   (c) True. To prove this, we need to argue that each time we delete
   an edge, there is still an MST consistent with the remaining graph.
   Say we are about to delete edge (u,v), and M is an MST containing
   (u,v). By definition of our algorithm, since removing (u,v) doesn't
   disconnect our current graph, we can remove (u,v) from M and then
   reconnect the two pieces of M using some other edge in our current
   graph. This edge can only be shorter than (or equal to) the length
   of (u,v).

=======================
[some recitations didn't get to this]

Other topics: questions about graph matrix algorithms?

In lecture we did a Prim-like algorithm for finding the maximum
bottleneck path. Suppose we wanted to find this for every pair of
nodes. Say each edge e has a capacity c_e. What we want, for each pair
of vertices (s,t), is the capacity of the highest-capacity path from s
to t. (Or, to use a more Pittsburgh-like example, say the edges are
tunnels, and c_e is the height of the tallest truck that can use
tunnel e. We are trying to compute, for each pair (s,t), the height of
the tallest truck that can get from s to t.) We are just trying to
compute the values, not the actual paths.

How could we do this by changing the definition of matrix
multiplication? To start, we want A[i,j] = max capacity of a path from
i to j using 1 or fewer edges. How to construct it? Put infinities on
the diagonal; for each edge e=(i,j) in E, set A[i,j] = c_e; the rest of
the entries are 0. Now, say we want to compute the widest path using 2
or fewer edges. How do we want to change the definition of matrix
multiplication? Ans: we change "*" to "min" and change "+" to "max".
Then we just do repeated squaring of the matrix log(n) times.
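The max-min squaring idea above can be sketched in a few lines of code. This is a minimal illustration (function and variable names are my own, not from lecture), assuming an undirected graph given as a list of (i, j, capacity) triples:

```python
# All-pairs widest (max-bottleneck) path by repeated squaring, with
# "*" replaced by min and "+" replaced by max, as described above.
import math

def widest_paths(n, edges):
    """n vertices 0..n-1; edges is a list of (i, j, capacity) triples
    (undirected). Returns W with W[i][j] = capacity of the widest
    path from i to j."""
    # A = widest path using 1 or fewer edges: infinity on the
    # diagonal, c_e for each edge, 0 (meaning "no path") elsewhere.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = math.inf
    for i, j, c in edges:
        A[i][j] = max(A[i][j], c)
        A[j][i] = max(A[j][i], c)

    def maxmin_square(M):
        # (M^2)[i][j] = max over k of min(M[i][k], M[k][j])
        return [[max(min(M[i][k], M[k][j]) for k in range(n))
                 for j in range(n)] for i in range(n)]

    # Each squaring doubles the number of edges allowed on a path, so
    # ceil(log2(n)) rounds cover all simple paths (<= n-1 edges).
    for _ in range(max(1, math.ceil(math.log2(n)))):
        A = maxmin_square(A)
    return A
```

For example, on the 4-node graph with edges (0,1,5), (1,2,3), (0,2,1), (2,3,4), the widest path from 0 to 3 goes 0-1-2-3 with bottleneck capacity 3, beating the direct route 0-2-3 whose bottleneck is 1. Each squaring costs O(n^3), so the whole computation is O(n^3 log n).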