ECS110 Lecture Notes for Monday November 13, 1995

Professor Rogaway Conducting

Lecture 19
Scribe of the day: Brian C. Lovrin

Hashing

Readings:

8.1
8.2
omit 8.3

Sort                 Worst-Case Time   Average Time   Space
Bubble, Insertion    Θ(n²)             Θ(n²)          Θ(1)
Merge                Θ(n log n)        Θ(n log n)     Θ(n)
Quick                Θ(n²)             Θ(n log n)     Θ(log n)
Heap                 Θ(n log n)        Θ(n log n)     Θ(1)
Radix (all flavors)  Θ(r + nd)         Θ(r + nd)      Θ(r + n)

All times are Theta of that function.

Hashing is commonly used to represent a Dictionary ADT.

// DATA - A set of elements S, each element having an associated key
//        drawn from some universe, U, of keys.  No two elements have
//        the same key.
// OPERATIONS
    void insert(Elem i);    // adds i to S.
    int in(Elem i);         // returns 1 if i is in S, otherwise 0.
    void delete(Elem i);    // removes i from S.

This ADT may be implemented with a linked list (slow: O(n) time per operation) or a binary search tree (somewhat faster: O(log n) time per operation when the tree is balanced).

However, an implementation with constant expected time per operation would use the following scheme:

0	null
1	k4
2	k3
...
27	k1
...
98	k2
99	null

Here k1, k2, ... are keys from the universe U, each stored in one of the m = 100 slots.

1) Pick a hash function h: U -> [0 ... m-1] that scatters the keys over the slot numbers as if at random.
2) Store each element of S in the slot numbered h(its key). The resulting array of m slots is the hash table.


insert(Elem i);     // compute the slot h(key of i) and change it from null to hold i.
in(Elem i);         // look in slot h(key of i); i is in S if that slot is non-null and holds i.
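
As a small illustration of a hash function from keys to slot numbers, here is one possible choice for integer keys. The rule h(k) = k mod m and the name slot_of are assumptions made for this sketch; the lecture does not fix a particular hash function.

```cpp
#include <cstddef>

// Hypothetical hash function for unsigned integer keys: h(k) = k mod m.
// Maps any key to a slot number in [0 .. m-1]; m must be nonzero.
std::size_t slot_of(unsigned long key, std::size_t m) {
    return static_cast<std::size_t>(key % m);
}
```

With m = 100, slot_of(127, 100) and slot_of(227, 100) both give 27, so two different keys can land in the same slot.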

"Collision Resolution by Chaining"

When two keys map to the same slot, just "chain" them: each slot holds a linked list of all the elements whose keys hash to it. Example:

0	null
1	k4	k6
2	k3
...
27	k1	k5	k8
...
98	k2	k7
99	null
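
The chaining picture above can be sketched in C++ roughly as follows. This is only a minimal sketch under stated assumptions: the elements are bare unsigned keys, the hash function is k mod m, and the names ChainedTable, contains and remove are made up here (in and delete from the ADT are not usable as C++ function names as written, since delete is a keyword).

```cpp
#include <cstddef>
#include <forward_list>
#include <vector>

// Sketch of a dictionary of unsigned keys with collision resolution by chaining.
// Assumptions: h(k) = k mod m, and m (the number of slots) is nonzero.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t m) : slots_(m) {}

    // Adds key to the set if it is not already present.
    void insert(unsigned key) {
        if (!contains(key)) slots_[slot_of(key)].push_front(key);
    }

    // Returns true if key is in the set; only the chain key hashes to is walked.
    bool contains(unsigned key) const {
        for (unsigned k : slots_[slot_of(key)])
            if (k == key) return true;
        return false;
    }

    // Removes key from its chain, if present.
    void remove(unsigned key) { slots_[slot_of(key)].remove(key); }

private:
    std::size_t slot_of(unsigned key) const { return key % slots_.size(); }

    std::vector<std::forward_list<unsigned>> slots_;  // one chain per slot
};
```

For example, with ChainedTable t(100), inserting 27, 127 and 227 puts all three keys on the chain of slot 27, much like k1, k5 and k8 in the picture.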

Load factor: alpha = n/m, where n = number of entries in the dictionary and m = number of slots in the hash table. In the example above, n = 8 and m = 100, so alpha = 0.08.

When alpha is big, the chains are long and searching degrades toward the performance of a linked list. Therefore, rehash into a bigger table when alpha reaches 3 or 4 or so.
To avoid one long pause, the rehashing can be done incrementally: start building the bigger table when alpha becomes, say, 2, and move "a little bit" of the old table over on each operation.
By the time alpha becomes 4, the big table will already be ready to accept more data.

Perform a similar fix if the table is too big, i.e., alpha gets small.

All this will achieve O(n) expected total time on a sequence of n operations, i.e., O(1) amortized time per operation.
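
A rough sketch of the growing step is given below. It uses the simpler all-at-once rehash rather than the incremental "a little bit" scheme, and it omits the shrinking case. The name GrowingTable, the starting size of 4 slots, the doubling rule, the threshold alpha > 4 and the hash k mod m are all assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <forward_list>
#include <utility>
#include <vector>

// Sketch: chained table of unsigned keys that doubles its size when alpha > 4.
// All-at-once rehash (not incremental); sizes and thresholds are assumptions.
struct GrowingTable {
    std::vector<std::forward_list<unsigned>> slots =
        std::vector<std::forward_list<unsigned>>(4);  // m = 4 slots initially
    std::size_t n = 0;                                 // number of stored keys

    // Load factor alpha = n / m.
    double alpha() const { return static_cast<double>(n) / slots.size(); }

    // Rebuilds the table with new_m slots, re-inserting every stored key.
    void rehash(std::size_t new_m) {
        std::vector<std::forward_list<unsigned>> bigger(new_m);
        for (const auto& chain : slots)
            for (unsigned k : chain) bigger[k % new_m].push_front(k);
        slots = std::move(bigger);
    }

    // Inserts key (assumed not already present), then grows if alpha > 4.
    void insert(unsigned key) {
        slots[key % slots.size()].push_front(key);
        ++n;
        if (alpha() > 4.0) rehash(2 * slots.size());
    }
};
```

A rehash touches all n keys, but because the table doubles each time, the next rehash cannot happen until about as many new keys again have been inserted. Spread over those insertions, the extra work is a constant per operation.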

Theorem: In a hash table in which collisions are resolved by chaining, a successful in() takes about 1 + alpha/2 comparisons on average, and an unsuccessful in() takes about 1 + alpha comparisons on average.
This assumes uniform hashing: when keys are drawn from U according to a distribution µ_U, the induced distribution on slots, h ∘ µ_U, is uniform on [0 .. m-1].
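
The two estimates can be checked with a small experiment. The sketch below does not hash real keys; it simulates uniform hashing by dropping n keys into m slots at random with a fixed seed. The names SearchCosts and measure, the seed, and the counting convention spelled out in the comments are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Average comparison counts for a chained table under simulated uniform hashing.
struct SearchCosts {
    double successful;    // averaged over all stored keys
    double unsuccessful;  // averaged over all slots
};

// Drops n keys into m slots at random (fixed seed, so the run is repeatable)
// and measures both averages. Requires n > 0 and m > 0.
// Counting convention (an assumption of this sketch):
//   successful   = position of the key in its chain (1 = first in chain);
//   unsuccessful = chain length, plus 1 for reaching the end of the chain.
SearchCosts measure(std::size_t n, std::size_t m, unsigned seed = 1995) {
    std::mt19937 rng(seed);
    std::vector<std::size_t> len(m, 0);  // chain length of each slot
    for (std::size_t i = 0; i < n; ++i) ++len[rng() % m];

    double succ = 0.0, unsucc = 0.0;
    for (std::size_t L : len) {
        succ += L * (L + 1) / 2.0;  // positions 1 + 2 + ... + L
        unsucc += L + 1.0;
    }
    return {succ / n, unsucc / m};
}
```

With n = 40000 and m = 10000 (alpha = 4), the unsuccessful average is exactly 1 + alpha = 5 under this convention, since the chain lengths always sum to n. The successful average depends on how evenly the keys spread and should come out close to 1 + alpha/2 = 3.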

Last Updated: Monday, November 13, 1995
by Brian C. Lovrin
