Deep in hashCode()

hashCode() is a method that we override in our custom classes and its mandatory to override if the "equals" method is overridden. Hashcode is also used in hashing based collection classes such as Hashtable, HashMap, HashSet etc. Understanding hashCode is very simple.

Lets first see why and when we use hashCode?

As we know, hashCode is a numeric value. It is used in retrieval of the respective object and it need not be unique which we see later. Now since its a number its always fast to retrieve an object using a number rather than an alphabetic key. How it will do? Assume we created a new object by passing some value which is already available in someother object. Now the new object will return the same hash value as of another object because the value passed is same. Once the same hash value is returned, JVM will go to the same memory address every time and if in case there are more than one objects present for the same hash value it will use equals() method to identify the correct object.


Note: A step of caution that needs to be taken is that while implementing the hashcode() method the fields that are present in the hashcode() should not be the one which could change the state of object. I will explain this later with example.

In what scenarios we override equals and hashCode methods.

Here are few.
In Hashtable, the hashCode method only gets invoked when you use the Object as the key to a Hashtable. It is not used when the Object is a Hashtable value. Most of the time your Hashtable keys are simple Strings, so you rarely need to write custom equals and hashCode methods. But in the case of HashSet, to help you check for duplicate Objects, we need a custom equals and hashCode method because here our Objects act as both key and value.

Lets see how HashMap uses the hash code.

In HashMap, when you try to get the object using get(key) method, the HashMap first gets the hash code of the key (remember that key is an object in the HashMap and it should have hashCode() method overridden) and generates again the hashcode to just defend against poor quality hash functions. Using this hashcode, an index for hash code will be created and this index is used to look up the entry table which is where all the hash entries are available. Now, once you get the Entry object from the Entry table using index, check if the key of the Entry object is same as the key we sent as parameter to get method. If yes, then return the value of the Entry object. If no, then go to next Entry object and continue the same.

Below is the code in the get method of HashMap

int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
	e != null;
	e = e.next) {
	Object k;
	if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
		return e.value;
}
					
					

Now lets go into some more details of hashCode.

In the Java API documentation, the general contract of hashCode is given as:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.


  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.


  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.


Mmmm fine. So what it says is equal objects must produce the same hash code anytime but unequal objects does not need to produce different hash code. They can produce same or different hashcode. Lets see an example.


public class Student {

    private String name;
    private int age;
    private int totalMarks;

    public Student(){}

    public Student(String name, int age, int marks) {
        this.name = name;
        this.age = age;
        this.marks = marks;
    }

    public void setMarks(int marks) {
        this.marks = marks;
    }

    @Override
    public boolean equals(Object obj)
    {
        if(this == obj)
            return true;
        if((obj == null) || (obj.getClass() != this.getClass()))
            return false;
        // object must be Test at this point
        Student stud = (Student)obj;
        return age == stud.age &&
            (name != null && name.equals(stud.name));
    }

    @Override
    public int hashCode() {
        final int seed = 37;
        int result = 1;
        result = seed * result + ((name == null) ? 0 : name.hashCode());
        result = seed * result + age;
        return result;
    }
}

In the above example I am overriding "equals" and "hashCode" methods. If you closely observe, whatever the fields I am comparing in "equals" method, the same fields I am using to generate hash code in "hashCode" method. Ofcourse, this is not mandatory. You can use subset of the variables that participate in the equals method comparison to improve performance of the hashCode method.

Lets check below SCJP question

Q: Given:
11. public class Person {
12. private String name, comment;
13. private int age;
14. public Person(String n, int a, String c) {
15. name = n; age = a; comment = c;
16. }
17. public boolean equals(Object o) {
18. if (! (o instanceof Person)) return false;
19, Person p = (Person)o;
20. return age == p.age && name.equals(p.name);
21. }
22. }

What is the appropriate definition of the hashCode method in class Person?
A. return super.hashCode();
B. return name.hashCode() + age * 7;
C. return name.hashCode() + comment.hashCode() / 2;
D. return name.hashCode() + comment.hashCode() / 2 - age * 3;

Can you guess? Yes, you are right. Its B. Why is it B? Before looking at hashCode method, look at equals method. In equals method the comparision is made between person's age and name to check if the two objects are equal. So we need to consider only person's age and name when generating hashCode.

Got it? Now again go back to previous example of Student class. In the "equals" method or "hashCode" method I didn't use "marks" field to compare or generate the hash code. This I have indicated in the note above that "while implementing the hashcode() method the fields that are present in the hashcode() should not be the one which could change the state of object.". Since "marks" field can change the state of the object, I didn't use marks field in generating hash code. See below example.


@Override
public int hashCode() {
    final int seed = 37;
    int result = 1;
    result = seed * result + ((name == null) ? 0 : name.hashCode());
    result = seed * result + age;
    result = seed * result + marks;
    return result;
}

Student studObj = new Student("Julie",24,80);
map.put(studObj,"First Year");
studObj.setMarks(85);
System.out.println(map.get(studObj));

The output for the above System.out.println is null. 

Finally, it is incorrect to involve a random number directly while computing the hash code of the class object, as it would not consistently return the same hash code for multiple invocations during the same execution





blog comments powered by Disqus