ECS 110 Lecture Notes for Monday November 13, 1995
Professor Rogaway Conducting

Lecture 19
Scribe of the day: Brian C. Lovrin



Hashing


Readings:
Sort                   Worst Case    Average       Space
Bubble, Insertion      n^2           n^2           1
Merge                  n log n       n log n       n
Quick                  n^2           n log n       log n
Heap                   n log n       n log n       1
Radix (all flavors)    r + nd        r + nd        r + n
All times are Theta of the function shown.

Hashing is commonly used to represent a Dictionary ADT.
// DATA - A set of elements S, each element having an associated key
//        drawn from some universe, U, of keys.  No two elements have
//        the same key.
// OPERATIONS
    void insert(Elem i);    // adds i to S.
    int in(Elem i);         // returns 1 if i is in S, otherwise 0.
    void delete(Elem i);    // removes i from S.
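This ADT may be implemented with a linked list (slow: time proportional to n per operation in the worst case) or a binary search tree (a bit faster: O(log n) per operation when the tree is balanced).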

However, a constant-time implementation would involve the following scheme:

0 null
1 k4
2 k3
27 k1
98 k2
99 null
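Here k1, k2, ... are the keys of the elements of S, drawn from universe U, and the number on the left is the table slot that holds each key.

1) Assign each key, as if at random, to one of the m slots of the table.
2) That assignment is the hash function, h: U -> [0 ... m-1], and the array of m slots it indexes is the hash table.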
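
The notes do not say which hash function was used. As a minimal sketch, assuming integer or string keys, the two common choices below both map a key into [0 ... m-1]. The names hashInt and hashString and the multiplier 31 are illustrative assumptions, not part of the lecture.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical hash for integer keys: the division method, h(k) = k mod m.
std::size_t hashInt(unsigned long key, std::size_t m) {
    assert(m > 0);  // a table needs at least one slot
    return key % m;
}

// Hypothetical hash for string keys: treat the characters as digits of a
// base-31 number and reduce mod m at every step so the value never overflows.
std::size_t hashString(const std::string& key, std::size_t m) {
    assert(m > 0);
    std::size_t h = 0;
    for (unsigned char c : key) {
        h = (h * 31 + c) % m;
    }
    return h;
}
```

With m = 100, hashInt(127, 100) is 27, so a key with value 127 would land in slot 27 of a table like the one pictured above.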


"Collision Resolution by Chaining"
Example:
0 null
1 k4 k6
2 k3
27 k1 k5 k8
98 k2 k7
99 null
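
In the example above, k1, k5, and k8 all hash to slot 27 and share one chain. A minimal sketch of such a table follows, assuming that each element is an int that serves as its own key and that h(k) = k mod m. The class name ChainedSet is hypothetical, and the ADT's delete is called remove here because delete is a reserved word in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Sketch of a dictionary with collision resolution by chaining.
class ChainedSet {
public:
    explicit ChainedSet(std::size_t m) : table_(m), n_(0) { assert(m > 0); }

    // Adds key to S; does nothing if key is already present.
    void insert(int key) {
        if (in(key)) return;
        table_[slot(key)].push_front(key);
        ++n_;
    }

    // Returns 1 if key is in S, otherwise 0. Only one chain is scanned.
    int in(int key) const {
        for (int k : table_[slot(key)])
            if (k == key) return 1;
        return 0;
    }

    // Removes key from S if it is present.
    void remove(int key) {
        std::list<int>& chain = table_[slot(key)];
        std::size_t before = chain.size();
        chain.remove(key);
        n_ -= before - chain.size();
    }

    std::size_t size() const { return n_; }

private:
    // Assumed hash function: k mod m, shifted so negative keys also map
    // into [0 .. m-1].
    std::size_t slot(int key) const {
        long long m = static_cast<long long>(table_.size());
        return static_cast<std::size_t>(((key % m) + m) % m);
    }

    std::vector<std::list<int> > table_;  // one chain per slot
    std::size_t n_;                       // number of keys stored
};
```

For instance, after inserting 27, 127, and 227 into a ChainedSet of 100 slots, all three keys sit in the chain for slot 27, and in(327) scans that chain before returning 0.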


Load Factor, alpha = n/m, where n=# of entries in dictionary and m=size of hash table.

When alpha is big, searching performance is comparable to a linked list. Therefore,
rehash the table (move every entry into a bigger table) periodically, when alpha reaches 3 or 4 or so.
Better, to avoid one long pause, rehash dynamically: begin moving entries into the bigger table
"a little bit" at a time when alpha becomes, say, 2, so that by the time alpha becomes 4
the big table is already ready to accept more data.

Perform a similar fix if the table is too big, i.e., alpha gets small.

All this will achieve O(n) time in total on a sequence of n operations, i.e., O(1) time per operation on average.
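
The rehashing policy can be sketched as follows. This is the simpler all-at-once variant, not the "a little bit" at a time scheme, and it makes several assumptions that are not in the notes: unsigned keys with h(k) = k mod m, doubling when alpha exceeds 4, halving when alpha drops below 1/2, and a minimum of 4 slots. The names ResizingSet, kMinSlots, and kMaxLoad are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

const std::size_t kMinSlots = 4;  // assumed smallest table size
const std::size_t kMaxLoad = 4;   // grow when alpha = n/m exceeds this

class ResizingSet {
public:
    ResizingSet() : table_(kMinSlots), n_(0) {}

    void insert(unsigned key) {
        if (in(key)) return;
        table_[key % table_.size()].push_front(key);
        ++n_;
        // alpha > 4: move everything into a table twice as large.
        if (n_ > kMaxLoad * table_.size()) rehash(2 * table_.size());
    }

    int in(unsigned key) const {
        for (unsigned k : table_[key % table_.size()])
            if (k == key) return 1;
        return 0;
    }

    void remove(unsigned key) {
        std::list<unsigned>& chain = table_[key % table_.size()];
        std::size_t before = chain.size();
        chain.remove(key);
        n_ -= before - chain.size();
        // alpha < 1/2 (assumed threshold): halve the table, but never
        // go below kMinSlots.
        if (table_.size() > kMinSlots && 2 * n_ < table_.size())
            rehash(table_.size() / 2);
    }

    double loadFactor() const { return double(n_) / table_.size(); }
    std::size_t slots() const { return table_.size(); }

private:
    // Rebuilds the table with m slots; costs time proportional to n + m.
    void rehash(std::size_t m) {
        assert(m > 0);
        std::vector<std::list<unsigned> > fresh(m);
        for (const std::list<unsigned>& chain : table_)
            for (unsigned k : chain) fresh[k % m].push_front(k);
        table_.swap(fresh);
    }

    std::vector<std::list<unsigned> > table_;
    std::size_t n_;
};
```

Because the table doubles or halves, each expensive rehash is preceded by a number of cheap operations proportional to its cost, which is what keeps the average cost per operation constant.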


Theorem: In a hash table in which collisions are resolved by chaining, a successful in() takes ~ 1 + alpha/2 comparisons, and an unsuccessful in() takes ~ 1 + alpha comparisons.
This assumes a uniform hashing procedure: if a key is drawn from U according to a distribution µ_U, then
the induced distribution of its hash value, h ° µ_U, is uniform on [0 .. m-1].
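
The theorem can be checked empirically with a small simulation. The sketch below assumes that uniform hashing is modeled by a pseudo-random slot choice, that a key at position i of its chain costs i comparisons to find, and that an unsuccessful search costs the chain length plus one for the final null check. The names SearchCost and simulate are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct SearchCost {
    double successful;    // average comparisons over all stored keys
    double unsuccessful;  // average comparisons over all slots
};

// Throws n keys into m slots uniformly at random and measures chain costs.
SearchCost simulate(std::size_t n, std::size_t m, unsigned seed) {
    assert(n > 0 && m > 0);
    std::mt19937 rng(seed);
    std::vector<std::size_t> len(m, 0);
    // rng() % m stands in for a uniform hash; the modulo bias is
    // negligible when m is far smaller than 2^32.
    for (std::size_t i = 0; i < n; ++i) ++len[rng() % m];

    double foundTotal = 0.0;  // sum of 1 + 2 + ... + L over every chain
    double chainTotal = 0.0;  // sum of all chain lengths
    for (std::size_t L : len) {
        foundTotal += L * (L + 1) / 2.0;
        chainTotal += L;
    }

    SearchCost c;
    c.successful = foundTotal / n;
    c.unsuccessful = 1.0 + chainTotal / m;
    return c;
}
```

With n = 40000 and m = 10000, so alpha = 4, the successful cost should come out close to 1 + alpha/2 = 3, and the unsuccessful cost is 1 + alpha = 5.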

Last Updated: Monday, November 13, 1995
by Brian C. Lovrin