---------------------------------------------------------------------------
COMP SCI 731 - Data Struc & Algorithms - Lecture 3 - Tuesday, June 27, 2000
---------------------------------------------------------------------------

Today:
  [ ] More on the use of stacks: graph connectivity / DFS
  [ ] Review of asymptotic efficiency measures
  [ ] Implementing stacks


DFS - connectivity
------------------

Last time we were reviewing graph representations. The adjacency list
representation of a graph provides, for each vertex v, a list Adj(v)
(in no particular order) of the neighbor set of v, N(v).

We commented last time that the lists in the adjacency list of each
vertex may be ordinary linked lists, or they may be (dynamically
allocated) arrays. Or it may be one big array into which we have
pointers. The array representations are often faster. Either way, we
can list the neighbors of a vertex v, N(v), in O(|N(v)|) time.

Remind students what it means for a graph to be connected.

Algorithm to decide if G is connected:

** note: this is done quite differently than in lecture. **
** Here it is done in a way that generalizes better.     **

First, do an example, with colors to indicate the status of each
vertex. Initially, each vertex is UNSEEN. Later it will be PENDING,
and finally it will be VISITED.

int status[n+1]     // will mark the status of each
                    // vertex as UNSEEN, PENDING, VISITED

Boolean Connected(G)  // recursive version. G=(V,E) has n vertices and m edges.
                      // For concreteness, V={1,2,...,n}
    for i=1 to n do
        status[i] = UNSEEN
    dfs(1)
    /* alternatively, count the visited vertices above and see if it is n */
    for i=1 to n do
        if status[i] == UNSEEN then
            return false        // graph is not connected
    return true                 // graph is connected

void dfs(v)
|
|   status[v] = PENDING
|
|   for i in N(v) do
|       if status[i] == UNSEEN then
|           dfs(i)
|
|   status[v] = VISITED

>> Show example <<

>> Rewrite to increment a counter k (initially 0) when each vertex is
   visited, and change the fragment which looks for UNSEEN vertices to
   the simpler " return (k==n) ".

This program is using a stack. Where? The run-time stack in the
execution environment, of course! Let's rewrite it non-recursively,
to explicitly manage our stack. This is usually faster. This won't
be a DFS, as we are only trying to decide connectivity.

** This is quite different from what I did in class. What was done in
   class was not good; having a vertex appear multiple times on the
   stack is a bad idea. The following does not run in exactly the same
   way as the above, but it still decides connectivity. **

int status[n+1]     // will mark the status of each
                    // vertex as UNSEEN, PENDING, VISITED
stack s             // of vertices we have yet to visit

Boolean Connected(G)        // non-recursive version
    for i=1 to n do
        status[i] = UNSEEN
    s.Push(1)
    status[1] = PENDING
    k = 0
    while !s.Empty() do
        i = s.Pop()
        status[i] = VISITED
        k = k+1                         // count the visited vertices
        for each j in N(i) do
            if status[j] == UNSEEN then
            |   status[j] = PENDING     // pushed, but not yet visited
            |   s.Push(j)
    return (k==n)
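To make the non-recursive version concrete, here is a rough C sketch of
the same connectivity check. It is only a sketch: the Graph struct, the
field names deg and adj, and the assumption that vertices are numbered
1..n (with n >= 1) are illustrative choices, not from lecture, and
malloc error checks are omitted.

    #include <stdlib.h>

    enum { UNSEEN, PENDING, VISITED };

    /* Adjacency-list graph on vertices 1..n: adj[v] points to an array
       holding the deg[v] neighbors of v.  (Illustrative layout only.) */
    typedef struct {
        int n;        /* number of vertices                  */
        int *deg;     /* deg[v] = |N(v)|,          v = 1..n  */
        int **adj;    /* adj[v] = neighbors of v,  v = 1..n  */
    } Graph;

    /* Returns 1 if G is connected, 0 otherwise (assumes n >= 1).
       Same idea as the non-recursive Connected() above. */
    int connected(const Graph *G)
    {
        int n = G->n, k = 0, top = 0;
        int *status = malloc((n + 1) * sizeof *status);
        int *stack  = malloc(n * sizeof *stack);   /* the explicit stack */

        for (int i = 1; i <= n; i++)
            status[i] = UNSEEN;

        stack[top++] = 1;          /* Push(1): start the search at vertex 1 */
        status[1] = PENDING;

        while (top > 0) {
            int i = stack[--top];  /* Pop() */
            status[i] = VISITED;
            k++;                   /* count visited vertices */
            for (int a = 0; a < G->deg[i]; a++) {
                int j = G->adj[i][a];
                if (status[j] == UNSEEN) {   /* push each vertex at most once */
                    status[j] = PENDING;
                    stack[top++] = j;
                }
            }
        }

        free(status);
        free(stack);
        return k == n;             /* connected iff every vertex was visited */
    }

Because a vertex is pushed only when it goes from UNSEEN to PENDING,
each vertex enters the stack at most once, so an array of size n is
enough for the stack and the whole check runs in O(n + m) time.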
Stack Implementations
---------------------

Linked list implementation (tos = top of stack):

     tos
      |
     \./
    ------      ------      ------      ------      ------      ------
    |  --|----> |  --|----> |  --|----> |  --|----> |  --|----> | \  |
    ------      ------      ------      ------      ------      ------

Array implementation:

           1    2    3    4    5    6    7    8    9    10
         ---------------------------------------------------
  n = 3  |  8 |  3 |  7 |    |    |    |    |    |    |    |
         ---------------------------------------------------

initially, n=0 and max = 10, say.

Breaks the abstraction: a fixed maximum size is not part of the stack ADT.

    Push(item)                          Pop()
        if n==max then Error()              if n==0 then Error()
        n++                                 save = stack[n]
        stack[n] = item                     n--
                                            return save

Dynamic approach. Start with an array with some number of elements, max.
If it fills up, allocate a new array with (say) twice as many elements,
copy the old elements into the new array, and push the new element.

Worst case time: expensive -- over a sequence of n Push() operations, a
single operation could take O(n) time. But is it really so bad?? Over a
sequence of operations, the average time per operation will still be O(1).

Example: room for 10 items to start, 1 usec for an "ordinary" push,
n usec for a push which results in a max --> 2*max stack reallocation
(n = the number of elements copied).

                        // now the array has 10 spaces, none used
    10 x Push  -> cost 10
                        // now all 10 are used
     1   Push* -> cost 10
                        // now there are 20 elements in the array, 11 used
     9 x Push  -> cost  9
                        // now all 20 elements of the array are used
     1   Push* -> cost 20
                        // now there are 40 elements in the array, 21 used
    19 x Push  -> cost 19
                        // now there are 40 elements in the array, all used
     1   Push* -> cost 40
                        // now there are 80 elements in the array, 41 used
    ------------------------
    41 operations, cost 108     (under 3 usec per operation on average)

In general, the cost will remain O(1) per operation over any sequence
of operations (prove this!).

Amortized running time: the average time per operation over a worst
case sequence of operations. We will often fall back to amortized
running time analysis.


O Notation
----------

We've been using big-O notation. Let me remind you: O(f(n)) is a set
of functions (from N to N, say). A function g is in this set if there
is a constant C such that g(n) <= C f(n) for almost all n (that is,
for all but finitely many n).

Give examples, e.g.,

  o 7n^2           is O(n^2).  It is also Theta(n^2).
  o 7n^2 + 100n    is O(n^2).  It is also Theta(n^2).
  o 7n^2 / log n   is O(n^2).  It is not Theta(n^2).

T/F:

  o if f is Theta(g) then g is Theta(f)                    TRUE
  o O(n lg n) = O(n log n)                                 TRUE
  o if f is O(g) and f' is O(g) then f+f' is O(g)          TRUE
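Returning to the doubling stack from the implementation section above:
here is a minimal C sketch of that idea for a stack of ints. The type
and function names (DynStack, ds_init, ds_push, ds_pop) and the starting
capacity of 10 are illustrative choices, not from lecture, and error
handling is left out.

    #include <stdlib.h>

    /* Growable array stack of ints. */
    typedef struct {
        int *a;      /* backing array            */
        int  n;      /* number of items in use   */
        int  max;    /* current capacity         */
    } DynStack;

    void ds_init(DynStack *s)
    {
        s->n   = 0;
        s->max = 10;                            /* "n=0 and max = 10, say" */
        s->a   = malloc(s->max * sizeof *s->a);
    }

    /* Ordinary push: O(1).  When the array is full, reallocate with twice
       the capacity and copy the old elements over (the Push* case above),
       which costs O(n) for that one push but O(1) amortized overall. */
    void ds_push(DynStack *s, int item)
    {
        if (s->n == s->max) {                   /* max --> 2*max reallocation */
            s->max *= 2;
            s->a = realloc(s->a, s->max * sizeof *s->a);
        }
        s->a[s->n++] = item;
    }

    int ds_pop(DynStack *s)
    {
        /* caller must not pop an empty stack (the Error() case above) */
        return s->a[--s->n];
    }

Each reallocation copies all current elements, but because the capacity
doubles every time, the total copying over any sequence of pushes is at
most a constant times the number of pushes -- which is the amortized
O(1) bound discussed above.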