Wednesday, November 02, 2005

Hashtable and HashCode

I am a big fan of Richard Monson-Haefel's technical books (EJB, JMS and J2EE Web Services).

Here is an explanation of Hashtable and Hashcodes that few developers grasp in practice:
A Hashtable is designed to provide fast lookups by binding an object to a key. Given any object's key, looking the object up in a hash table is a very quick operation. For the lookup, the key is converted to an integer value using the key's hashCode() method.


Hash codes do not need to be unique, only well-distributed. By "well-distributed," we mean that given any two keys, the chances are very good that the hash codes for the keys will be different. A well-distributed hash code algorithm reduces, but does not eliminate, the possibility that different keys evaluate to the same hash code. When keys evaluate to the same hash code, they are stored together and uniquely identified by their equals() method. If you look up an object using a key that evaluates to a hash code that is shared by several other keys, the Hashtable locates the group of objects that have been stored with the same hash code; then it uses the key's equals() method to determine which key (and hence, which object) you want. (That's why you have to override the equals() method in primary keys, as well as the hashCode() method.) Therefore, the emphasis in designing a good hash code algorithm is on producing codes that are well-distributed rather than unique. This allows you to design an index for associating keys with objects that is easy to compute, and therefore fast.


Source: Enterprise Java Beans, Second Edition



A decent writeup on HashCode and Equals is here:
HashCode and Equals in Java

No comments: