-----------------------------------------------------------------------------
COMP 731 - Data Struc & Algorithms - Lecture 12 - Thursday, July 20, 2000
-----------------------------------------------------------------------------

Today: 

o Kruskal's algorithm
o Union - Find


Kruskal1 (G,w)     // G=(V,E), w: E -> R; n=|V|, m=|E|

   1 Sort E by edge weights, w, and relabel so that
       w(e_1) <= ... <= w(e_m)
   
   2 T = emptyset
     for i = 1 to m do 
        if |T|=n-1 then break;
        if T u {e_i} is acyclic, then T=T u {e_i}
     return T

Correctness: we will omit or defer

Talked last time about efficient implementation

Kruskal2 (G,w)     // G=(V,E), w: E -> R; n=|V|, m=|E|

   1 Sort E by edge weights, w, and relabel so that
       w(e_1) <= ... <= w(e_m) // e_i = {v_i, w_i}
   
   2 T = emptyset
     Every vertex v comprises its own component
     for i = 1 to m do 
        if |T|=n-1 then break;
        if v_i and w_i are in different components then
             merge those components
             set T = T u {e_i}
     return T


One more refinement - 


Kruskal3 (G,w)     // G=(V,E), w: E -> R; n=|V|, m=|E|

   1 Sort E by edge weights, w, and relabel so that
       w(e_1) <= ... <= w(e_m) // e_i = {v_i, w_i}
   
   2 for i = 1 to m do T[i] = NO
     for i = 1 to n do MAKESET(i)
     count = 0;
     for i = 1 to m do 
        if count==n-1 then break;
        a = FIND(v_i)
        b = FIND(w_i) 
        if a != b then 
             UNION (a, b)
             T[i] = YES
             count++
     return T


ADT DisjointSets
 Data:   A number k>=0 and a family of k disjoint sets, S_1, ..., S_k,
         each set i having a canonical "name", name(S_i).
         Initially, k=0 -- there are no sets
 Operations:
                                   k
       MAKESET(item x): if x \in Union  S_i then undefined  else
                                 i = 1
                        k++
                        S_k = {x}
                        name(S_k) = something which is not the name 
                                   of any other set S_i
                        // in implementation, we will let name(S_k)=x

       FIND (x):        if x \not\in Union_{i=1}^k S_i then undefined else
                        return name(S_i) where x in S_i.
                        // in implementation, we will let the name of the set
                        // that contains x be some particular element of that set

       UNION(a,b):      if a is not the name of some set S_i or 
                           b is not the name of some set S_j then undefined. Else
                        Replace S_i and S_j by S_i union S_j, thereby decreasing k
                        (unless a==b).
                        Let the name of this new (merged) set be something 
                        different from the name of any other set.
                        // in implementation, we will let the name 
                        // be one of a or b


Implementation

Assume that the elements  are numbers 1..n, 
like they are in Kruskal's algorithm.

Assume that initially we will MAKESET(i) for i = 1, ... n.

To represent the result of the MAKESETs, create an array "name",
where name[i] is the name of the set that contains element i

        1   2   3   4   5   6   7   8   9   10  11  12  13
      -----------------------------------------------------
name  | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13|   
      -----------------------------------------------------

FIND(i):  return name[i]

UNION (a, b) : 
     for i = 1 to n do if name[i] == b then name[i]=a;
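
The array scheme above can be sketched in Python roughly as follows. This
is only an illustration: the class name QuickFind and the unused slot at
index 0 (so that elements stay 1-based, as in the notes) are assumptions
of this sketch, not part of the lecture.

```python
class QuickFind:
    """Disjoint sets over elements 1..n, stored as an array of set names."""

    def __init__(self, n):
        # MAKESET(i) for i = 1..n: each element starts as the name of its own set.
        # Index 0 is unused so that elements match the 1-based notes.
        self.name = list(range(n + 1))

    def find(self, x):
        # O(1): the name of x's set is stored directly.
        return self.name[x]

    def union(self, a, b):
        # O(n): relabel every member of the set named b so it belongs to set a.
        for i in range(1, len(self.name)):
            if self.name[i] == b:
                self.name[i] = a


ds = QuickFind(13)
ds.union(ds.find(3), ds.find(7))   # sets {3} and {7} merge under the name 3
ds.union(ds.find(7), ds.find(9))   # set {3, 7} absorbs {9}
```

After these two unions, elements 3, 7 and 9 all report the name 3, and
every other element is still in its own singleton set.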

Analysis:

  MAKESET(1..n)  - O(n) total
  FIND  each O(1)
  UNION each O(n)

Kruskal's algorithm:

    O(m lg m) = O(m lg n) to sort, and then
    O(n) to do the n MAKESETs
    2m FIND operations -->  O(m) for FINDs
    n-1 UNION operations --> O(n^2) for UNIONs

  Total: O(n^2 + m lg n).
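
To see where each term of the bound comes from, here is a minimal Python
sketch of Kruskal's algorithm on top of the name array. The function name
kruskal, the (weight, v, w) edge tuples, and the small example graph are
all assumptions made for illustration.

```python
def kruskal(n, edges):
    """Return spanning-tree edges for vertices 1..n; edges are (weight, v, w)."""
    name = list(range(n + 1))            # n MAKESETs: O(n); index 0 unused
    tree = []
    for weight, v, w in sorted(edges):   # sort: O(m lg m) = O(m lg n)
        if len(tree) == n - 1:
            break
        a, b = name[v], name[w]          # two FINDs, O(1) each
        if a != b:
            for i in range(1, n + 1):    # UNION(a, b): O(n)
                if name[i] == b:
                    name[i] = a
            tree.append((weight, v, w))
    return tree


# Hypothetical 4-vertex example graph.
edges = [(4, 1, 2), (1, 2, 3), (3, 1, 3), (2, 3, 4), (5, 2, 4)]
mst = kruskal(4, edges)
```

The inner relabeling loop runs once per accepted edge, so at most n-1
times, which is the source of the O(n^2) term.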

Would like to improve to O(m lg n)


Forest-of-Directed Trees view

Making UNION cheap and FIND expensive:
 
    UNION (a, b)   make a point to b -- O(1) time per operation
    FIND (x)       follow the chain of pointers until you get to 
                   the root -- O(n) time per operation
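
A rough Python sketch of the forest view, assuming each element stores a
parent pointer and a root points to itself (the class name Forest and the
self-pointing-root convention are assumptions of this sketch):

```python
class Forest:
    """Disjoint sets over 1..n as a forest of trees with parent pointers."""

    def __init__(self, n):
        # parent[i] == i marks i as a root; index 0 is unused.
        self.parent = list(range(n + 1))

    def find(self, x):
        # Follow the chain of pointers to the root; cost is the depth of x.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        # a and b must be roots (set names). O(1): make a point to b.
        self.parent[a] = b


f = Forest(5)
f.union(f.find(1), f.find(2))   # 1 -> 2
f.union(f.find(2), f.find(3))   # 2 -> 3, so 1 -> 2 -> 3
```

Here the name of a set is its root, so after the two unions elements 1, 2
and 3 all have name 3.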

Nice tree
                o

        o o o o o o o o o 


Bad tree
                o 
                o
                o
                o
                o
                o
                o
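
The two shapes can be compared with a small Python experiment. This is
only a sketch: the helper depth and the way the two parent arrays are
built by hand are assumptions for illustration.

```python
def depth(parent, x):
    """Number of pointers followed from x to its root (the cost of FIND)."""
    steps = 0
    while parent[x] != x:
        x = parent[x]
        steps += 1
    return steps


n = 8

# Nice tree: every other element points directly at root 1.
nice = list(range(n + 1))       # index 0 unused
for i in range(2, n + 1):
    nice[i] = 1

# Bad tree: a single chain 1 -> 2 -> ... -> n, with n as the root.
bad = list(range(n + 1))
for i in range(1, n):
    bad[i] = i + 1

nice_depth = max(depth(nice, x) for x in range(1, n + 1))
bad_depth = max(depth(bad, x) for x in range(1, n + 1))
```

With n = 8 the nice tree has depth 1 and the bad tree has depth 7 = n-1,
which is why a FIND can cost O(n) in the worst case.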