Saturday, August 2, 2014

How does HashSet determine duplicate objects?

When an object is added to a HashSet, the set first checks whether it already holds an object whose hash code matches that of the new object. If no matching hash code is found, the set concludes that the new object is not a duplicate. If a matching hash code is found, the HashSet then calls one of the objects' equals() methods to check whether the two hash-code-matched objects are really equal. If they are, the new object is treated as a duplicate and is not added.
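The two-step check described above can be observed with a small experiment. This is only a sketch: LookupTrace and Probe are hypothetical names invented for the illustration, and the hash codes are hard-coded so that the first and third objects collide on purpose.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo (not part of the original post): records the calls HashSet makes.
class LookupTrace {
    // Shared log of method calls, in the order they happen.
    static final List<String> LOG = new ArrayList<>();

    static final class Probe {
        private final String name;
        private final int hash;

        Probe(String name, int hash) {
            this.name = name;
            this.hash = hash;
        }

        @Override
        public int hashCode() {
            LOG.add(name + ".hashCode");
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            LOG.add(name + ".equals");
            return other instanceof Probe && ((Probe) other).hash == hash;
        }
    }

    // Adds three probes and returns how many the set kept.
    static int run() {
        LOG.clear();
        Set<Probe> set = new HashSet<>();
        set.add(new Probe("first", 1));  // empty set: only the hash code is needed
        set.add(new Probe("second", 2)); // different hash code: equals is not consulted
        set.add(new Probe("third", 1));  // same hash code as "first": equals decides
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("size = " + run());
        System.out.println(LOG);
    }
}
```

In the printed log, hashCode should appear once per add, while equals should appear only for the third add, the one whose hash code matched an element already in the set.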

Classes such as String and Integer override hashCode() and equals(Object) themselves, which is why two equal Strings or Integers cannot both be added to a HashSet.
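A quick way to see this for the built-in classes is the sketch below. The class name BuiltInDuplicates and the sample values are made up for the illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo (not part of the original post).
class BuiltInDuplicates {
    static int distinctStrings() {
        Set<String> words = new HashSet<>();
        words.add("java");
        words.add(new String("java")); // a different object with the same characters
        return words.size();
    }

    static int distinctIntegers() {
        Set<Integer> numbers = new HashSet<>();
        numbers.add(Integer.valueOf(1000));
        numbers.add(Integer.valueOf(1000)); // same value, so it counts as a duplicate
        return numbers.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctStrings());  // 1
        System.out.println(distinctIntegers()); // 1
    }
}
```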

If the elements of the set are instances of your own class, you must define what makes two instances equal by overriding the hashCode() and equals(Object) methods.

Let's write some code to make this clearer. Suppose we want a set of Books in which two Book instances count as duplicates if they have the same title. We achieve this by overriding hashCode() and equals(Object), and then we try to add two objects with the same title to the Set.
Book.java
public class Book {
    private String title;

    public Book(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    @Override
    public String toString() {
        return getTitle();
    }

    // Books with the same title must return the same hash code.
    @Override
    public int hashCode() {
        return getTitle().hashCode();
    }

    // Two books are equal when their titles are equal.
    @Override
    public boolean equals(Object object) {
        // Guard against null and unrelated types instead of failing on the cast.
        if (!(object instanceof Book)) {
            return false;
        }
        Book anotherBook = (Book) object;
        return this.getTitle().equals(anotherBook.getTitle());
    }
}
In the Book class above, notice the hashCode() and equals(Object) methods. Since the title is a String, both methods simply delegate to String's own hashCode() and equals(). So if two titles are the same, the two books return the same hash code and equals() returns true.
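One caveat with delegating directly to the title is that a null title would cause a NullPointerException. As a hedged variation, not the approach used in this post, the helpers in java.util.Objects can do the same delegation while tolerating null. SafeBook is a hypothetical name used only for this sketch.

```java
import java.util.Objects;

// Hypothetical variant of Book (not from the post) that tolerates a null title.
class SafeBook {
    private final String title;

    SafeBook(String title) {
        this.title = title;
    }

    @Override
    public int hashCode() {
        // Returns 0 for a null title, otherwise title.hashCode().
        return Objects.hashCode(title);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SafeBook)) {
            return false;
        }
        // Null-safe comparison of the two titles.
        return Objects.equals(title, ((SafeBook) other).title);
    }

    public static void main(String[] args) {
        SafeBook a = new SafeBook("D");
        SafeBook b = new SafeBook("D");
        System.out.println(a.hashCode() == b.hashCode());                   // true
        System.out.println(a.equals(b));                                    // true
        System.out.println(new SafeBook(null).equals(new SafeBook(null))); // true
    }
}
```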
TestSet.java
import java.util.HashSet;
import java.util.Set;

public class TestSet {
    public static void main(String[] args) {
        Book one = new Book("A");
        Book two = new Book("B");
        Book three = new Book("C");
        Book four = new Book("D");
        Book five = new Book("D"); // a different object with the same title as four
        Set<Book> bookSet = new HashSet<Book>();
        bookSet.add(one);
        bookSet.add(two);
        bookSet.add(three);
        bookSet.add(four);
        bookSet.add(four); // the same object added again
        bookSet.add(five); // an equal object: treated as a duplicate
        System.out.println(bookSet);
    }
}
Running the above code prints: [D, A, B, C]. The order of the elements may differ on your machine, because HashSet does not guarantee any iteration order.

Notice that D appears only once in the output, even though we added two different Book objects titled D.
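The set also reports this directly: add() returns false when an equal element is already present. The sketch below uses a trimmed-down stand-in for the Book class so that it runs on its own. AddResultDemo and its helper method are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo (not part of the original post).
class AddResultDemo {
    // Trimmed-down stand-in for Book: equality is based on the title only.
    static final class Book {
        private final String title;

        Book(String title) {
            this.title = title;
        }

        @Override
        public int hashCode() {
            return title.hashCode();
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Book && title.equals(((Book) other).title);
        }
    }

    // Adds two equal books and returns what add() reported each time.
    static boolean[] addTwice() {
        Set<Book> bookSet = new HashSet<>();
        boolean first = bookSet.add(new Book("D"));  // the set changed
        boolean second = bookSet.add(new Book("D")); // an equal book is already present
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] results = addTwice();
        System.out.println(results[0]); // true
        System.out.println(results[1]); // false
    }
}
```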

Now comment out the hashCode() and equals(Object) methods in Book.java and run TestSet.java again.
This time the output is: [D, D, B, C, A] (the order may vary from run to run).

What happened?
Since we commented out equals() and hashCode() in the Book class, the versions inherited from the Object class are called instead, because every class implicitly extends Object.

"The default behavior (from Object) is to give each object a unique hashcode value. So you must override hashCode() to be sure that two equivalent objects return the same hashcode. And the default behavior of equals method is to do an == comparison. So if you don't override equals method no two objects can ever be considered equal since references to two different objects will always contain a different bit pattern. So you must also override equals(Object) method so that if you call it on either object, passing in the other object, always returns true."
Strictly speaking, the default hash codes are typically distinct rather than guaranteed to be unique, but the conclusion is the same. Because we treat two different Book objects as equal when they have the same title, we must override both hashCode() and equals(Object).
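To illustrate why both methods are needed, here is a minimal sketch. PlainBook and HashOnlyBook are hypothetical classes invented for the example: the first overrides nothing, the second overrides hashCode() only.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo (not part of the original post).
class DefaultEqualsDemo {
    // Inherits both equals and hashCode from Object.
    static final class PlainBook {
        final String title;

        PlainBook(String title) {
            this.title = title;
        }
    }

    // Overrides hashCode only; equals is still the == check from Object.
    static final class HashOnlyBook {
        final String title;

        HashOnlyBook(String title) {
            this.title = title;
        }

        @Override
        public int hashCode() {
            return title.hashCode();
        }
    }

    // Adds two same-titled HashOnlyBooks and returns the resulting set size.
    static int hashOnlySetSize() {
        Set<HashOnlyBook> set = new HashSet<>();
        set.add(new HashOnlyBook("D"));
        set.add(new HashOnlyBook("D")); // same hash code, but equals sees a different object
        return set.size();
    }

    public static void main(String[] args) {
        PlainBook first = new PlainBook("D");
        PlainBook second = new PlainBook("D");
        System.out.println(first.equals(second)); // false: two distinct objects
        System.out.println(first.equals(first));  // true: same reference
        System.out.println(hashOnlySetSize());    // 2: a matching hash code alone is not enough
    }
}
```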



