-----------------------------------------------------------------------------
COMP 731 - Data Struc & Algorithms - Lecture 12 - Thursday, July 20, 2000
-----------------------------------------------------------------------------

Today:
   o Kruskal's algorithm
   o Union - Find

Kruskal1 (G,w)    // G=(V,E), w: E -> R; n=|V|, m=|E|
1  Sort E by edge weights, w, and relabel so that w(e_1) <= ... <= w(e_m)
2  T = emptyset
   for i = 1 to m do
       if |T|=n-1 then break;
       if T u {e_i} is acyclic, then T = T u {e_i}
   return T

Correctness: we will omit or defer

Talked last time about efficient implementation

Kruskal2 (G,w)    // G=(V,E), w: E -> R; n=|V|, m=|E|
1  Sort E by edge weights, w, and relabel so that w(e_1) <= ... <= w(e_m)
                                                  // e_i = {v_i, w_i}
2  T = emptyset
   Every vertex v comprises its own component
   for i = 1 to m do
       if |T|=n-1 then break;
       if v_i and w_i are in different components then
           merge those components
           set T = T u {e_i}
   return T

One more refinement -

Kruskal3 (G,w)    // G=(V,E), w: E -> R; n=|V|, m=|E|
1  Sort E by edge weights, w, and relabel so that w(e_1) <= ... <= w(e_m)
                                                  // e_i = {v_i, w_i}
2  for i = 1 to m do T[i] = NO
   for i = 1 to n do MAKESET(i)
   count = 0;
   for i = 1 to m do
       if count==n-1 then break;
       a = FIND(v_i)
       b = FIND(w_i)
       if a != b then          // endpoints lie in different components
           UNION (a, b)
           T[i] = YES
           count++
   return T

ADT DisjointSets

Data: A number k>=0 and a family of k disjoint sets, S_1, ..., S_k,
      each set S_i having a canonical "name", name(S_i).
      Initially, k=0 -- there are no sets

Operations:

   MAKESET(item x):
       if x \in Union_{i=1}^k S_i then undefined
       else
           k++
           S_k = {x}
           name(S_k) = something which is not the name of any other set S_i
           // in implementation, we will let name(S_k)=k

   FIND (x):
       if x \not\in Union_{i=1}^k S_i then undefined
       else return name(S_i) where x in S_i.
       // in implementation, we will let the name of the set
       // that contains x be some particular element of that set

   UNION(a,b):
       if a is not the name of some set S_i or
          b is not the name of some set S_j then undefined.
       else
           Replace S_i and S_j by S_i union S_j, thereby decreasing k
           (unless a==b). Let the name of this new (merged) set be
           something different from the name of any other set.
           // in implementation, we will let the name be one of a or b

Implementation

Assume that the elements are numbers 1..n, like they are in Kruskal's
algorithm. Assume that initially we will MAKESET(i) for i = 1, ..., n.
For the result of the MAKESETs, create an array "name"

          1   2   3   4   5   6   7   8   9   10  11  12  13
        -----------------------------------------------------
   name | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13|
        -----------------------------------------------------

FIND(i):       return name[i]

UNION (a, b):  for i = 1 to n do
                   if name[i] == b then name[i] = a;

Analysis: MAKESET(1..n) - O(n) total
          FIND            each O(1)
          UNION           each O(n)

Kruskal's algorithm:
   O(m lg m) = O(m lg n) to sort, and then
   O(n) to do the n MAKESETs
   2m FIND operations    --> O(m) for FINDs
   n-1 UNION operations  --> O(n^2) for UNIONs

   Total: O(n^2 + m lg n). Would like to improve to O(m lg n)

Forest-of-Directed Trees view

Making UNION cheap and FIND expensive:

   UNION (a, b)   make a point to b
                  -- O(1) time per operation
   FIND (x)       follow the chain of pointers until you get to the root
                  -- O(n) time per operation

   [Figure: "nice" tree, 10 nodes]
   [Figure: "bad" tree, 7 nodes]
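
A rough sketch of the two implementations above, in Python, with Kruskal3 run on top of either one. This is only an illustration under stated assumptions, not code from the lecture: the class names ArrayDisjointSets and ForestDisjointSets, the function kruskal, and the encoding of an edge as a (weight, v, w) tuple are all made up here. The result is returned as a list of chosen edges rather than the YES/NO array T.

```python
class ArrayDisjointSets:
    """The "name" array view: FIND is O(1), UNION is O(n)."""

    def __init__(self, n):
        # MAKESET(i) for i = 1..n; slot 0 is unused so indices match the notes.
        self.name = list(range(n + 1))

    def find(self, x):
        return self.name[x]

    def union(self, a, b):
        # Rename every member of the set named b so that it is named a.
        for i in range(1, len(self.name)):
            if self.name[i] == b:
                self.name[i] = a


class ForestDisjointSets:
    """The forest view: UNION is O(1), FIND follows pointers to the root."""

    def __init__(self, n):
        # Every element starts as a root; a root points to itself.
        self.parent = list(range(n + 1))

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        # a and b must be set names (roots); make a point to b.
        self.parent[a] = b


def kruskal(n, edges, sets_class=ArrayDisjointSets):
    """Kruskal3 on vertices 1..n; edges is a list of (weight, v, w) tuples."""
    sets = sets_class(n)
    tree = []
    for weight, v, w in sorted(edges):   # ascending by weight
        if len(tree) == n - 1:
            break
        a, b = sets.find(v), sets.find(w)
        if a != b:                       # different components: keep the edge
            sets.union(a, b)
            tree.append((weight, v, w))
    return tree


# Hypothetical 4-vertex example: a triangle 1-2-3 plus a pendant vertex 4.
example_edges = [(3, 1, 3), (1, 1, 2), (4, 3, 4), (2, 2, 3)]
print(kruskal(4, example_edges))
print(kruskal(4, example_edges, ForestDisjointSets))
```

Both calls pick the same three edges on this input; only the cost profile of FIND and UNION differs between the two classes.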