---------------------------------------------------------------------------
COMP SCI 731 - Data Struc & Algorithms - Lecture 3 - Tuesday, June 27, 2000
---------------------------------------------------------------------------

Today:
  [ ] More on the use of stacks: graph connectivity / DFS
  [ ] Review of asymptotic efficiency measures
  [ ] Implementing stacks


DFS - connectivity
------------------

Last time we were reviewing graph representations. The adjacency list
representation of a graph provides, for each vertex v, a list Adj(v)
(in no particular order) of the neighbor set of v, N(v).

We commented last time that the lists in the adjacency list of each
vertex may be ordinary linked lists, or they may be (dynamically
allocated) arrays. Or it may be one big array into which we have
pointers. The array representations are often faster. Either way, we
can list the neighbors of a vertex v, N(v), in O(|N(v)|) time.

Remind students what it means for a graph to be connected.

Algorithm to decide if G is connected:

** note: this is done quite differently than in lecture. **
** Here it is done in a way that generalizes better.     **

First, do an example, with colors to indicate the status of each
vertex. Initially, each vertex is UNSEEN. Later it will be PENDING,
and finally it will be VISITED.

int status[n+1]     // will mark the status of each
                    // vertex as UNSEEN, PENDING, VISITED

Boolean Connected(G)  // recursive version. G=(V,E) has n vertices and m edges.
                      // For concreteness, V={1,2,...,n}
    for i=1 to n do
        status[i] = UNSEEN
    dfs(1)
    /* alternatively, count the visited vertices above and see if it is n */
    for i=1 to n do
        if status[i] == UNSEEN then
            return false        // graph is not connected
    return true                 // graph is connected

void dfs(v)
|
|   status[v] = PENDING
|
|   for i in N(v) do
|       if status[i] == UNSEEN then
|           dfs(i)
|
|   status[v] = VISITED

>> Show example <<

>> Rewrite to increment a counter k (initially 0) when each vertex is
   visited, and change the fragment which looks for UNSEEN vertices to
   the simpler " return (k==n) ".

This program is using a stack. Where? The run-time stack in the
execution environment, of course! Let's rewrite it non-recursively,
to explicitly manage our stack. This is usually faster. This won't
be a DFS, as we are only trying to decide connectivity.

** This is quite different from what I did in class. What was done in
   class was not good; having a vertex appear multiple times on the
   stack is a bad idea. The following does not run in exactly the same
   way as the above, but it still decides connectivity. **

int status[n+1]     // will mark the status of each
                    // vertex as UNSEEN, PENDING, VISITED
stack s             // of vertices we have yet to visit

Boolean Connected(G)        // non-recursive version
    for i=1 to n do
        status[i] = UNSEEN
    s.Push(1)
    status[1] = PENDING
    k = 0
    while !s.Empty() do
        i = s.Pop()
        status[i] = VISITED
        k = k+1                         // count the visited vertices
        for each j in N(i) do
            if status[j] == UNSEEN then
            |   status[j] = PENDING     // pushed, but not yet visited
            |   s.Push(j)
    return (k==n)
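To make the non-recursive version concrete, here is a rough C sketch of
the same connectivity check. It is only a sketch: the Graph struct, the
field names deg and adj, and the assumption that vertices are numbered
1..n (with n >= 1) are illustrative choices, not from lecture, and
malloc error checks are omitted.

    #include <stdlib.h>

    enum { UNSEEN, PENDING, VISITED };

    /* Adjacency-list graph on vertices 1..n: adj[v] points to an array
       holding the deg[v] neighbors of v.  (Illustrative layout only.) */
    typedef struct {
        int n;        /* number of vertices                  */
        int *deg;     /* deg[v] = |N(v)|,          v = 1..n  */
        int **adj;    /* adj[v] = neighbors of v,  v = 1..n  */
    } Graph;

    /* Returns 1 if G is connected, 0 otherwise (assumes n >= 1).
       Same idea as the non-recursive Connected() above. */
    int connected(const Graph *G)
    {
        int n = G->n, k = 0, top = 0;
        int *status = malloc((n + 1) * sizeof *status);
        int *stack  = malloc(n * sizeof *stack);   /* the explicit stack */

        for (int i = 1; i <= n; i++)
            status[i] = UNSEEN;

        stack[top++] = 1;          /* Push(1): start the search at vertex 1 */
        status[1] = PENDING;

        while (top > 0) {
            int i = stack[--top];  /* Pop() */
            status[i] = VISITED;
            k++;                   /* count visited vertices */
            for (int a = 0; a < G->deg[i]; a++) {
                int j = G->adj[i][a];
                if (status[j] == UNSEEN) {   /* push each vertex at most once */
                    status[j] = PENDING;
                    stack[top++] = j;
                }
            }
        }

        free(status);
        free(stack);
        return k == n;             /* connected iff every vertex was visited */
    }

Because a vertex is pushed only when it goes from UNSEEN to PENDING,
each vertex enters the stack at most once, so an array of size n is
enough for the stack and the whole check runs in O(n + m) time.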
Stack Implementations
---------------------

Linked list implementation (tos = top of stack):

     tos
      |
     \./
    ------      ------      ------      ------      ------      ------
    |  --|----> |  --|----> |  --|----> |  --|----> |  --|----> | \  |
    ------      ------      ------      ------      ------      ------

Array implementation:

           1    2    3    4    5    6    7    8    9    10
         ---------------------------------------------------
  n = 3  |  8 |  3 |  7 |    |    |    |    |    |    |    |
         ---------------------------------------------------

initially, n=0 and max = 10, say.

Breaks the abstraction: a fixed maximum size is not part of the stack ADT.

    Push(item)                          Pop()
        if n==max then Error()              if n==0 then Error()
        n++                                 save = stack[n]
        stack[n] = item                     n--
                                            return save

Dynamic approach. Start with an array with some number of elements, max.
If it fills up, allocate a new array with (say) twice as many elements,
copy the old elements into the new array, and push the new element.

Worst case time: expensive -- over a sequence of n Push() operations, a
single operation could take O(n) time. But is it really so bad?? Over a
sequence of operations, the average time per operation will still be O(1).

Example: room for 10 items to start, 1 usec for an "ordinary" push,
n usec for a push which results in a max --> 2*max stack reallocation
(n = the number of elements copied).

                        // now the array has 10 spaces, none used
    10 x Push  -> cost 10
                        // now all 10 are used
     1   Push* -> cost 10
                        // now there are 20 elements in the array, 11 used
     9 x Push  -> cost  9
                        // now all 20 elements of the array are used
     1   Push* -> cost 20
                        // now there are 40 elements in the array, 21 used
    19 x Push  -> cost 19
                        // now there are 40 elements in the array, all used
     1   Push* -> cost 40
                        // now there are 80 elements in the array, 41 used
    ------------------------
    41 operations, cost 108     (under 3 usec per operation on average)

In general, the cost will remain O(1) per operation over any sequence
of operations (prove this!).

Amortized running time: the average time per operation over a worst
case sequence of operations. We will often fall back to amortized
running time analysis.


O Notation
----------

We've been using big-O notation. Let me remind you: O(f(n)) is a set
of functions (from N to N, say). A function g is in this set if there
is a constant C such that g(n) <= C f(n) for almost all n (that is,
for all but finitely many n).

Give examples, e.g.,

  o 7n^2           is O(n^2).  It is also Theta(n^2).
  o 7n^2 + 100n    is O(n^2).  It is also Theta(n^2).
  o 7n^2 / log n   is O(n^2).  It is not Theta(n^2).

T/F:

  o if f is Theta(g) then g is Theta(f)                    TRUE
  o O(n lg n) = O(n log n)                                 TRUE
  o if f is O(g) and f' is O(g) then f+f' is O(g)          TRUE
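Returning to the doubling stack from the implementation section above:
here is a minimal C sketch of that idea for a stack of ints. The type
and function names (DynStack, ds_init, ds_push, ds_pop) and the starting
capacity of 10 are illustrative choices, not from lecture, and error
handling is left out.

    #include <stdlib.h>

    /* Growable array stack of ints. */
    typedef struct {
        int *a;      /* backing array            */
        int  n;      /* number of items in use   */
        int  max;    /* current capacity         */
    } DynStack;

    void ds_init(DynStack *s)
    {
        s->n   = 0;
        s->max = 10;                            /* "n=0 and max = 10, say" */
        s->a   = malloc(s->max * sizeof *s->a);
    }

    /* Ordinary push: O(1).  When the array is full, reallocate with twice
       the capacity and copy the old elements over (the Push* case above),
       which costs O(n) for that one push but O(1) amortized overall. */
    void ds_push(DynStack *s, int item)
    {
        if (s->n == s->max) {                   /* max --> 2*max reallocation */
            s->max *= 2;
            s->a = realloc(s->a, s->max * sizeof *s->a);
        }
        s->a[s->n++] = item;
    }

    int ds_pop(DynStack *s)
    {
        /* caller must not pop an empty stack (the Error() case above) */
        return s->a[--s->n];
    }

Each reallocation copies all current elements, but because the capacity
doubles every time, the total copying over any sequence of pushes is at
most a constant times the number of pushes -- which is the amortized
O(1) bound discussed above.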