Hash Tables

1. Hashing

  • Save items in a key-indexed table (index is a function of the key).
  • Hash function. Method for computing array index from key.
  • Issues.

    • Computing the hash function.
    • Equality test: Method for checking whether two keys are equal.
    • Collision resolution: Algorithm and data structure to handle two keys that hash to the same array index.
  • Implementing hash code: strings

      public final class String {
          private int hash = 0;
          private final char[] s;
          ...
          public int hashCode() {
              int h = hash;
              if(h != 0) return h;
              int hash = 0;
              for (int i = 0; i < length(); i++)
                  hash = s[i] + (31 * hash);
              hash = h;
              return h;
          }
      }
    
  • Modular hashing

    • An int between -2^31 and 2^31 - 1.
    • Hash function. An int between 0 and M - 1 (for use as array index).

        private int hash(Key key) { return (key.hashCode() & 0x7fffffff) % M;}
      

1.1. Collisions

  • Two distinct keys hashing to same index.
  • Solutions: separate chaining and linear probing.

Collision resolution: Separate chaining symbol table

  • Use an array of M < N linked lists.
    • Hash: map key to integer i between 0 and M - 1.
    • Insert: put at front of i^th chain (if not already there).
    • Search: need to search only i^th chain.
  • Typical choice: M ~ N / 5 ⇒ constant-time ops.

Collision resolution: Open addressing

  • When a new key collides, find next empty slot, and put it there.
    • Hash. Map key to integer i between 0 and M-1.
    • Insert. Put at table index i if free; if not try i+1, i+2, etc.
    • Search. Search table index i; if occupied but no match, try i+1, i+2, etc.
    • Note. Array size M must be greater than number of key-value pairs N.
  • Typical choice: α = N / M ~ ½.

results matching ""

    No results matching ""