Overriding hashcode() and equals() in Java - Java @ Desk

Thursday, December 18, 2014

Overriding hashcode() and equals() in Java


hashCode() and equals() are two methods defined in the class Object. Because every class in Java extends Object, directly or indirectly, we can override these methods in our own classes.

Please note that we will be using the Student class defined below for illustration throughout this article.

In our example, the Student class overrides both equals() and hashCode(). Our requirement is that each student has a unique id, so two Student instances with the same id should be equal.

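class Student {

    private Integer id;
    private String name;

    public Student(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        // Same reference: trivially equal.
        if (this == obj) {
            return true;
        }
        // Return false for null and for non-Student arguments instead of throwing.
        if (!(obj instanceof Student)) {
            return false;
        }
        // Two students are equal when their ids match.
        return id.equals(((Student) obj).id);
    }

    @Override
    public int hashCode() {
        // Uses the same field as equals(), so equal students share a hash code.
        return id.hashCode();
    }

    @Override
    public String toString() {
        return id + "  " + name;
    }
}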


equals()
The equals() method compares two objects for equality. We should override equals() whenever equality is defined by business logic rather than by object identity. As per our requirement, two students with the same id should be equal, so we override equals() (as well as hashCode()) to implement this. If we do not override equals(), the default implementation inherited from Object is used.

Default implementation of equals()
The default implementation of equals() tests whether the two references being compared are equal, i.e., whether both refer to the exact same object. Its behavior is the same as the "==" check on two objects, which compares the references themselves rather than the contents of the objects.
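As a small sketch of this default behavior, the hypothetical PlainStudent class below (not part of the original example) overrides neither method, so equals() falls back to the reference comparison inherited from Object. The class and method names are made up for illustration.

```java
// Hypothetical class for illustration: no equals() or hashCode() override.
class PlainStudent {
    private final int id;
    private final String name;

    PlainStudent(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class DefaultEqualsDemo {
    // Compares two distinct instances that hold identical data.
    static boolean distinctInstancesAreEqual() {
        PlainStudent a = new PlainStudent(1, "jerin");
        PlainStudent b = new PlainStudent(1, "jerin");
        return a.equals(b); // Object.equals behaves like a == b
    }

    // Compares an instance with a second reference to the same object.
    static boolean sameInstanceIsEqual() {
        PlainStudent a = new PlainStudent(1, "jerin");
        PlainStudent alias = a;
        return a.equals(alias);
    }

    public static void main(String[] args) {
        System.out.println(distinctInstancesAreEqual()); // false
        System.out.println(sameInstanceIsEqual());       // true
    }
}
```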

Let us now see how the following code works for the class Student defined above:
Student std1 = new Student(1, "jerin");
Student std2 = new Student(1, "jerin");
System.out.println("By == Check: " + (std1 == std2));
System.out.println("By overridden equals check: " + std1.equals(std2));


Output:
By == Check: false
By overridden equals check: true

As you can see, the overridden equals() method returns true, unlike its default implementation, where, like the "==" check, it would return false.
Internally, Java collections use the equals() method to check whether two objects are equal.
Say we place a Student object into a collection of Students. All the Students in this collection will have a unique id. At a later stage, say we need to remove a Student object from this list. We may not have the exact instance of the Student object placed in the list, but we have a unique id associated with each Student instance, based on which we can remove the respective Student object.



Consider the following sample code:
List<Student> studentList = new ArrayList<Student>();
studentList.add(new Student(1 , "Jerin"));
studentList.add(new Student(2 , "Joseph"));
studentList.add(new Student(3 , "Mark"));
studentList.add(new Student(4 , "Antony"));
   
Student studentJerin = new Student(1, "Jerin");
studentList.remove(studentJerin);
 
System.out.println(studentList);


Here, we pass the instance of Student to be removed from the list. The list internally checks whether the passed object is present or not based on the equals() method, and if it is, removes it from the list.

If equals is not overridden, then the output is:
[1  Jerin, 2  Joseph, 3  Mark, 4  Antony]

Explanation: With the default equals() implementation, the check is made for the exact instance of Student. In the example above, we pass a different instance of Student, so the list does not consider it equal to the "Jerin" instance it holds (even though both have the same id) and removes nothing.

If equals is overridden correctly (as shown in Student class above), then the result is:
[2  Joseph, 3  Mark, 4  Antony]

Explanation: The overridden equals() method in our class equates two instances of Student based on their id. The list therefore treats the stored "Jerin" as equal to the instance we passed in and removes it. Note that remove(Object) removes only the first matching element.

Contracts to be kept in mind while overriding equals():
1. Reflexive: For any non-null reference value x, x.equals(x) must always return true.
2. Symmetric: For any non-null reference values x and y, x.equals(y) must return true if and only if y.equals(x) returns true.
3. Transitive: For any non-null reference values x, y and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must also return true.
4. Consistent: For any non-null reference values x and y, multiple invocations of x.equals(y) must consistently return true or consistently return false, provided no information used in the equals() comparison is modified.
5. Non-null: For any non-null reference value x, x.equals(null) must return false.
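The rules above can be spot-checked in code. The following is a minimal sketch, assuming a hypothetical KeyedStudent class with id-based equality; it tests a few sample values only and does not prove the contract in general.

```java
// Hypothetical id-based class, written only to exercise the contract.
class KeyedStudent {
    private final Integer id;

    KeyedStudent(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof KeyedStudent)) return false; // also covers null
        return id.equals(((KeyedStudent) obj).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

class EqualsContractCheck {
    // Returns true only if every rule holds for these sample values.
    static boolean contractHolds() {
        KeyedStudent x = new KeyedStudent(1);
        KeyedStudent y = new KeyedStudent(1);
        KeyedStudent z = new KeyedStudent(1);

        boolean reflexive = x.equals(x);
        boolean symmetric = x.equals(y) == y.equals(x);
        boolean transitive = !(x.equals(y) && y.equals(z)) || x.equals(z);
        boolean consistent = x.equals(y) == x.equals(y);
        boolean nullSafe = !x.equals(null);
        return reflexive && symmetric && transitive && consistent && nullSafe;
    }

    public static void main(String[] args) {
        System.out.println(contractHolds()); // true
    }
}
```

The instanceof check is what keeps rule 5 intact: it returns false for null and for arguments of another type instead of throwing an exception.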




hashCode()

hashCode() returns an integer value that is used in hash-based collections, such as HashMap. In hash-based collections, the data is stored in buckets.

Basic working of hashing based algorithms:
Hash-based algorithms use a set of buckets to hold the objects. Let us consider a simple example where we have 100 paper strips (analogous to Student objects), with a student name and grade written on each of them. Say we have a total of ten grades, one through ten, and we have one bucket for each grade, i.e., a total of 10 buckets. We place these 100 paper strips in the buckets based on the grade, i.e., the grade 1 bucket will hold all the papers of grade 1 students, and so on. Now, say we get a piece of paper with a student name and grade and we have to find that student. First, we find the bucket using the grade (analogous to finding the hash bucket using the hash code), and then look into the bucket to find the student based on the student name (analogous to finding the object in the bucket using the equals() method).

Whenever we put an object into a hash-based collection, we first find the bucket in which to place the object based on its hash code value, and then place the object in that bucket. Whenever we need to find a particular object, we first need to find the bucket in which it was placed. It is this hash code value that is used to find the appropriate bucket, after which we search for the particular object inside the bucket using the equals() method. So, to retrieve objects correctly from hash-based collections, we need to implement both hashCode() and equals() correctly.
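The two-step lookup described above can be sketched with a toy hash set. ToyHashSet is a hypothetical class written for this illustration; it is not how java.util.HashSet is actually implemented, which is considerably more refined.

```java
import java.util.ArrayList;
import java.util.List;

// Toy hash set for illustration only.
class ToyHashSet<T> {
    private final List<List<T>> buckets = new ArrayList<>();

    ToyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: the hash code picks the bucket.
    private List<T> bucketFor(Object o) {
        int index = Math.floorMod(o.hashCode(), buckets.size());
        return buckets.get(index);
    }

    void add(T item) {
        List<T> bucket = bucketFor(item);
        if (!bucket.contains(item)) { // List.contains relies on equals()
            bucket.add(item);
        }
    }

    // Step 2: equals() finds the object inside that bucket.
    boolean contains(Object o) {
        for (T candidate : bucketFor(o)) {
            if (candidate.equals(o)) {
                return true;
            }
        }
        return false;
    }
}

class ToyHashSetDemo {
    public static void main(String[] args) {
        ToyHashSet<String> names = new ToyHashSet<>(10);
        names.add("Jerin");
        names.add("Joseph");
        System.out.println(names.contains("Jerin")); // true
        System.out.println(names.contains("Mark"));  // false
    }
}
```

If hashCode() returned different values for two equal objects, step 1 could pick a different bucket for each, and step 2 would never get the chance to compare them.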

Default implementation of hashCode()
By default, hashCode() is often described as converting the internal address of the object into an integer. However, the actual mechanism is JVM-specific and should not be relied upon. In most cases, distinct objects will return different hash codes.
Just like the equals() method, if we do not override the hashCode() method correctly, we may not be able to retrieve a particular object placed in a collection.

Contracts to override hashCode():
1. Consistency: Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode() method must consistently return the same integer, provided no information used in equals() comparisons on the object is modified.
2. If two objects are equal according to the equals() method, then the hash code must also be the same for both objects.
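A minimal sketch of the second rule in practice, assuming a hypothetical EnrolledStudent class whose equals() and hashCode() both use only the id. Because equal instances share a hash code, a HashSet can find a stored student through a different but equal instance.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: equals() and hashCode() are both based on id alone.
class EnrolledStudent {
    private final int id;
    private final String name;

    EnrolledStudent(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof EnrolledStudent)) return false;
        return id == ((EnrolledStudent) obj).id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

class HashCodeContractDemo {
    // Equal objects must report the same hash code.
    static boolean equalObjectsShareHashCode() {
        EnrolledStudent a = new EnrolledStudent(1, "Jerin");
        EnrolledStudent b = new EnrolledStudent(1, "Jerin");
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    // A HashSet lookup succeeds with a different but equal instance.
    static boolean foundByEqualInstance() {
        Set<EnrolledStudent> students = new HashSet<>();
        students.add(new EnrolledStudent(1, "Jerin"));
        return students.contains(new EnrolledStudent(1, "Jerin"));
    }

    public static void main(String[] args) {
        System.out.println(equalObjectsShareHashCode()); // true
        System.out.println(foundByEqualInstance());      // true
    }
}
```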

Conclusion
Throughout the collections framework, Java uses equals() to check the equality of two objects, and hash codes to locate buckets in collections that use hashing. Therefore, whenever objects stored in a collection should be compared by their contents rather than by identity, we must override equals() and hashCode() correctly.

While overriding equals() and hashCode(), always keep the following in mind. If two objects:
1. Return true for equals(), then they must also have the same hash code.
2. Have the same hash code, they need not return true for equals() (different objects can have the same hash code, i.e., multiple objects can be in the same bucket).
3. Have different hash codes, then they must be unequal.
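As a quick illustration of the second point, the strings "Aa" and "BB" are a well-known pair that produce the same hash code under the formula documented for String.hashCode(), yet they are not equal. The class name below is made up for this sketch.

```java
class HashCollisionDemo {
    // "Aa" -> 65 * 31 + 97 = 2112, "BB" -> 66 * 31 + 66 = 2112
    static boolean sameHashCode() {
        return "Aa".hashCode() == "BB".hashCode();
    }

    static boolean areEqual() {
        return "Aa".equals("BB");
    }

    public static void main(String[] args) {
        System.out.println(sameHashCode()); // true
        System.out.println(areEqual());     // false
    }
}
```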

Important Note:
If we use mutable objects as keys in hash-based maps, we need to take extra care, in addition to overriding hashCode() and equals() correctly. If a change in the state of a key alters its hash code or violates the contract above, we may not be able to retrieve the correct value from the map using that key. As a practice, it is advisable not to use any field in the calculation of the hash code that is not also used in equals(). Such a field may differ between two objects that are equal according to equals(), or may change later, giving those objects different hash codes. They would then point to different buckets, and we would not be able to find the object.
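The risk with mutable keys can be sketched as follows. MutableKey is a hypothetical class invented for this example. The Map documentation leaves the behavior unspecified when a key changes in a way that affects equals() while it is in the map; the result noted in the comments is what the OpenJDK HashMap does in this case, since it looks for the entry under the key's new hash code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key whose hash code depends on a field that can change.
class MutableKey {
    private int id;

    MutableKey(int id) {
        this.id = id;
    }

    void setId(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof MutableKey)) return false;
        return id == ((MutableKey) obj).id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

class MutableKeyDemo {
    // Looks the entry up without touching the key.
    static String lookupBeforeMutation() {
        Map<MutableKey, String> names = new HashMap<>();
        MutableKey key = new MutableKey(1);
        names.put(key, "Jerin");
        return names.get(key);
    }

    // Mutates the key after insertion, then looks the entry up again.
    static String lookupAfterMutation() {
        Map<MutableKey, String> names = new HashMap<>();
        MutableKey key = new MutableKey(1);
        names.put(key, "Jerin");
        key.setId(2); // hash code changes; the entry was filed under the old one
        return names.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupBeforeMutation()); // Jerin
        System.out.println(lookupAfterMutation());  // null on OpenJDK's HashMap
    }
}
```

Making the fields that feed equals() and hashCode() final, as far as the design allows, avoids this problem altogether.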

This post was written by Jerin Joseph. He is a freelance writer who loves to explore the latest features in Java technology.





