Why does the Collection interface have equals() and hashCode()?
Why does the Collection interface declare equals(Object o) and hashCode(), given that any implementation will have those by default (inherited from Object)?
While the Collection interface adds no stipulations to the general contract for the Object.equals, programmers who implement the Collection interface "directly" (in other words, create a class that is a Collection but is not a Set or a List) must exercise care if they choose to override the Object.equals. It is not necessary to do so, and the simplest course of action is to rely on Object's implementation, but the implementor may wish to implement a "value comparison" in place of the default "reference comparison." (The List and Set interfaces mandate such value comparisons.)
The general contract for the Object.equals method states that equals must be symmetric (in other words, a.equals(b) if and only if b.equals(a)). The contracts for List.equals and Set.equals state that lists are only equal to other lists, and sets to other sets. Thus, a custom equals method for a collection class that implements neither the List nor Set interface must return false when this collection is compared to any list or set. (By the same logic, it is not possible to write a class that correctly implements both the Set and List interfaces.)
While the Collection interface adds no stipulations to the general contract for the Object.hashCode method, programmers should take note that any class that overrides the Object.equals method must also override the Object.hashCode method in order to satisfy the general contract for the Object.hashCode method. In particular, c1.equals(c2) implies that c1.hashCode()==c2.hashCode().
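The symmetry argument can be made concrete with a minimal sketch. Batch and BatchDemo are hypothetical names invented for this illustration, and the order-sensitive comparison is just one possible choice of value comparison, not something the Javadoc prescribes.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Hypothetical collection that implements Collection "directly":
// it is neither a List nor a Set.
class Batch<E> extends AbstractCollection<E> {
    private final List<E> items = new ArrayList<>();

    Batch(Collection<? extends E> source) {
        items.addAll(source);
    }

    @Override
    public Iterator<E> iterator() {
        return items.iterator();
    }

    @Override
    public int size() {
        return items.size();
    }

    // Value comparison, restricted to other Batch instances. A List or Set
    // never reports itself equal to a Batch, so symmetry requires that a
    // Batch never reports itself equal to a List or Set either.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Batch)) return false;
        return items.equals(((Batch<?>) o).items);
    }

    // Overridden together with equals so that equal batches share a hash code.
    @Override
    public int hashCode() {
        return items.hashCode();
    }
}

class BatchDemo {
    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b");
        Batch<String> batch = new Batch<>(list);
        Batch<String> copy = new Batch<>(list);

        System.out.println(list.equals(batch));  // false: a list only equals other lists
        System.out.println(batch.equals(list));  // false: keeps equals symmetric
        System.out.println(batch.equals(copy));  // true: value comparison between batches
    }
}
```

Had Batch.equals accepted any Collection with the same elements, batch.equals(list) would be true while list.equals(batch) stays false, which is exactly the asymmetry the Javadoc warns about.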
To answer your specific question: why does it have these methods? They are redeclared simply for convenience, so that the Javadoc can give hints as to what implementers should do with these methods (e.g. comparing equality of values rather than references).
To add to the other great answers: the Collection interface redeclares the equals method so that it can specify how comparing two collection instances for equality should work. From the Java 8 documentation:
More generally, implementations of the various Collections Framework interfaces are free to take advantage of the specified behavior of underlying Object methods wherever the implementor deems it appropriate.
So there is no reason to redeclare methods from the Object class in an interface other than to make the Javadoc more definite. This is also why those methods are not counted among the abstract methods of an interface.
Moreover, in Java 8, along the same line of reasoning, an interface is not allowed to declare a default method that overrides a method of the Object class; doing so generates a compile error. I believe this was done to prevent this type of confusion. So if you try to create a default method called hashCode(), for example, it will not compile.
Here is a more in-depth explanation for this behavior in Java 8 from the Lambda FAQ:
An interface cannot provide a default implementation for any of the methods of the Object class. This is a consequence of the “class wins” rule for method resolution: a method found on the superclass chain always takes precedence over any default methods that appear in any superinterface. In particular, this means one cannot provide a default implementation for equals, hashCode, or toString from within an interface.
This seems odd at first, given that some interfaces actually define their equals behavior in documentation. The List interface is an example. So, why not allow this?
One reason is that it would become more difficult to reason about when a default method is invoked. The current rules are simple: if a class implements a method, that always wins over a default implementation. Since all instances of interfaces are subclasses of Object, all instances of interfaces have non-default implementations of equals, hashCode, and toString already. Therefore, a default version of these on an interface is always useless, and it may as well not compile.
Another reason is that providing default implementations of these methods in an interface is most likely misguided. These methods perform computations over the object’s state, but the interface, in general, has no access to state; only the implementing class has access to this state. Therefore, the class itself should provide the implementations, and default methods are unlikely to be useful.
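The "class wins" rule quoted above can be seen with an ordinary method, since a default equals, hashCode, or toString would not compile at all. This is a small sketch; Greeter, BaseGreeter, and ConcreteGreeter are hypothetical names made up for the illustration.

```java
// An interface offering a default implementation.
interface Greeter {
    default String greet() {
        return "default from interface";
    }
}

// An unrelated superclass that happens to declare the same method.
class BaseGreeter {
    public String greet() {
        return "from superclass";
    }
}

// Inherits greet() from both sides; the method found on the superclass
// chain takes precedence over the interface default.
class ConcreteGreeter extends BaseGreeter implements Greeter {
}

class ClassWinsDemo {
    public static void main(String[] args) {
        System.out.println(new ConcreteGreeter().greet()); // from superclass
    }
}
```

Because every class has Object on its superclass chain, the same rule would always pick Object's equals, hashCode, and toString over any interface default, which is why such defaults are rejected outright.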
Java hashCode() and equals()
The methods hashCode() and equals() play a distinct role in the objects you insert into Java collections. The specific contract rules of these two methods are best described in the JavaDoc. Here I will just explain what role they play and what they are used for, so you know why their implementations are important.
equals()
equals() is used in most collections to determine if a collection contains a given element. For instance:
    List<String> list = new ArrayList<>();
    list.add("123");
    boolean contains123 = list.contains("123");
The ArrayList iterates over its elements and executes "123".equals(element) on each one to determine whether the element is equal to the parameter object "123". It is the String.equals() implementation that determines whether two strings are equal.
The equals() method is also used when removing elements. For instance:
    List<String> list = new ArrayList<>();
    list.add("123");
    boolean removed = list.remove("123");
The ArrayList again iterates over its elements and executes "123".equals(element) on each one to determine whether the element is equal to the parameter object "123". The first element it finds that is equal to the given parameter "123" is removed.
As you can see, a proper implementation of equals() is essential for your own classes to work well with the Java Collection classes. So how do you implement equals() "properly"?
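To see why this matters for your own classes, here is a small sketch comparing a class that inherits Object's reference comparison with one that overrides equals(). RawTag, ValueTag, and ContainsDemo are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// No equals() override: two instances are equal only if they are the same object.
class RawTag {
    final String name;

    RawTag(String name) {
        this.name = name;
    }
}

// Overrides equals() (and hashCode()) to compare by value.
class ValueTag {
    final String name;

    ValueTag(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ValueTag && name.equals(((ValueTag) o).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

class ContainsDemo {
    // A different RawTag instance with the same name is not found.
    static boolean rawFound() {
        List<RawTag> list = new ArrayList<>();
        list.add(new RawTag("123"));
        return list.contains(new RawTag("123"));
    }

    // A different ValueTag instance with the same name is found.
    static boolean valueFound() {
        List<ValueTag> list = new ArrayList<>();
        list.add(new ValueTag("123"));
        return list.contains(new ValueTag("123"));
    }

    // remove(Object) relies on equals() in the same way.
    static boolean valueRemoved() {
        List<ValueTag> list = new ArrayList<>();
        list.add(new ValueTag("123"));
        return list.remove(new ValueTag("123"));
    }
}
```

With these definitions, rawFound() returns false while valueFound() and valueRemoved() return true, even though all three searches use a freshly created object with the same name.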
So, when are two objects equal? That depends on your application, the classes, and what you are trying to do. For instance, let’s say you are loading and processing Employee objects stored in a database. Here is a simple example of such an Employee class:
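The listing is not present in this copy of the text. As a stand-in, a minimal sketch of such a class might look like the following; the field names come from the surrounding text, while the field types and the constructor are assumptions.

```java
// Hypothetical Employee class; employeeId is assumed to be a long and
// the names are assumed to be Strings.
class Employee {
    protected long employeeId;
    protected String firstName;
    protected String lastName;

    Employee(long employeeId, String firstName, String lastName) {
        this.employeeId = employeeId;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}
```

As written, this class inherits equals() from Object, so two Employee objects with identical field values are still not considered equal unless they are the same instance.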
You could decide that two Employee objects are equal to each other if just their employeeIds are equal. Or, you could decide that all fields must be equal: employeeId, firstName and lastName. Here are two example implementations of equals() matching these criteria:
Which of these two implementations is "proper" depends on what you need to do. Sometimes you need to look up an Employee object from a cache. In that case perhaps all you need is for the employeeId to be equal. In other cases you may need more than that, for instance to determine if a copy of an Employee object has changed from the original.
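The two listings are missing from this copy, so the following is a sketch of what they could look like rather than the original code. Because a class can only have one equals(), the two variants are shown as two hypothetical classes, EmployeeById and EmployeeByAllFields; the field types are assumptions.

```java
import java.util.Objects;

// Variant 1: two employees are equal if their employeeId values match.
class EmployeeById {
    final long employeeId;
    final String firstName;
    final String lastName;

    EmployeeById(long employeeId, String firstName, String lastName) {
        this.employeeId = employeeId;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EmployeeById)) return false;
        EmployeeById other = (EmployeeById) o;
        return employeeId == other.employeeId;
    }
    // A matching hashCode() must accompany any equals() override;
    // it is left out here only to keep the focus on equals().
}

// Variant 2: all three fields must match.
class EmployeeByAllFields {
    final long employeeId;
    final String firstName;
    final String lastName;

    EmployeeByAllFields(long employeeId, String firstName, String lastName) {
        this.employeeId = employeeId;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EmployeeByAllFields)) return false;
        EmployeeByAllFields other = (EmployeeByAllFields) o;
        // Objects.equals tolerates null names.
        return employeeId == other.employeeId
                && Objects.equals(firstName, other.firstName)
                && Objects.equals(lastName, other.lastName);
    }
    // A matching hashCode() must accompany any equals() override;
    // it is left out here only to keep the focus on equals().
}
```

With the first variant, an employee whose last name changed still equals the original record; with the second variant it does not.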
hashCode()
The hashCode() method of objects is used when you insert them into a Hashtable, HashMap or HashSet. If you do not know the theory of how a hashtable works internally, you can read about hash tables on Wikipedia.
When inserting an object into a hashtable you use a key. The hash code of this key is calculated, and used to determine where to store the object internally. When you need to look up an object in a hashtable you also use a key. The hash code of this key is calculated and used to determine where to search for the object.
The hash code only points to a certain "area" (a list, bucket, etc.) internally. Since different key objects could potentially have the same hash code, the hash code itself is no guarantee that the right key is found. The hashtable then iterates over this area (all keys that map to it) and uses the key's equals() method to find the right key. Once the right key is found, the object stored for that key is returned.
So, as you can see, a combination of the hashCode() and equals() methods is used when storing and when looking up objects in a hashtable.
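The two-step lookup described above can be sketched as a deliberately simplified map. TinyHashMap is a hypothetical class written for this illustration; it is not how java.util.HashMap is actually implemented, and it assumes non-null keys and a positive bucket count.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified hashtable: hashCode() selects a bucket, equals() finds the key in it.
class TinyHashMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    // bucketCount is assumed to be greater than zero.
    TinyHashMap(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<Entry<K, V>>());
        }
    }

    // Step 1: the hash code only decides which bucket to look in.
    private List<Entry<K, V>> bucketFor(Object key) {
        int index = Math.floorMod(key.hashCode(), buckets.size());
        return buckets.get(index);
    }

    void put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> entry : bucket) {
            if (entry.key.equals(key)) {   // Step 2: equals() identifies the key
                entry.value = value;
                return;
            }
        }
        bucket.add(new Entry<>(key, value));
    }

    // Returns the stored value, or null if no equal key is present.
    V get(Object key) {
        for (Entry<K, V> entry : bucketFor(key)) {
            if (entry.key.equals(key)) {   // Step 2 again on lookup
                return entry.value;
            }
        }
        return null;
    }
}
```

For example, the strings "Aa" and "BB" have the same hash code and therefore land in the same bucket, yet get("Aa") and get("BB") still return their own values because equals() tells the keys apart.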
Here are two rules that are good to know about implementing the hashCode() method in your own classes, if the hashtables in the Java Collections API are to work correctly:
- If object1 and object2 are equal according to their equals() method, they must also have the same hash code.
- If object1 and object2 have the same hash code, they do NOT have to be equal too.
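Both rules can be observed with plain strings, whose hash code formula is fixed by the String Javadoc. HashRulesDemo is a hypothetical helper class written for this sketch.

```java
import java.util.HashSet;
import java.util.Set;

class HashRulesDemo {
    // Rule 1: two distinct but equal String objects report the same hash code.
    static boolean equalObjectsShareHash() {
        String a = new String("Aa");
        String b = new String("Aa");
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    // Rule 2: "Aa" and "BB" have the same hash code without being equal.
    static boolean sameHashButNotEqual() {
        return "Aa".hashCode() == "BB".hashCode() && !"Aa".equals("BB");
    }

    // A HashSet still keeps both colliding strings as separate elements.
    static int distinctElements() {
        Set<String> set = new HashSet<>();
        set.add("Aa");
        set.add("BB");
        return set.size();
    }
}
```

All three checks succeed: equalObjectsShareHash() and sameHashButNotEqual() return true, and distinctElements() returns 2, because the set falls back on equals() when hash codes collide.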
Here are two example implementations of the hashCode() method matching the equals() methods shown earlier:
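The listings are missing from this copy. The sketch below reconstructs what they might look like from the description that follows in the text (the hash code is the employeeId narrowed to an int); the class names EmployeeById and EmployeeByAllFields and the field types are assumptions.

```java
import java.util.Objects;

// equals() compares employeeId only; hashCode() is derived from the same field.
class EmployeeById {
    final long employeeId;
    final String firstName;
    final String lastName;

    EmployeeById(long employeeId, String firstName, String lastName) {
        this.employeeId = employeeId;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EmployeeById)) return false;
        return employeeId == ((EmployeeById) o).employeeId;
    }

    @Override
    public int hashCode() {
        // Narrowing cast: keeps only the low 32 bits of the id.
        return (int) employeeId;
    }
}

// equals() compares all fields; hashCode() may still use just employeeId,
// because equal objects necessarily have equal ids.
class EmployeeByAllFields {
    final long employeeId;
    final String firstName;
    final String lastName;

    EmployeeByAllFields(long employeeId, String firstName, String lastName) {
        this.employeeId = employeeId;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EmployeeByAllFields)) return false;
        EmployeeByAllFields other = (EmployeeByAllFields) o;
        return employeeId == other.employeeId
                && Objects.equals(firstName, other.firstName)
                && Objects.equals(lastName, other.lastName);
    }

    @Override
    public int hashCode() {
        // Narrowing cast: keeps only the low 32 bits of the id.
        return (int) employeeId;
    }
}
```

Using fewer fields in hashCode() than in equals() is allowed; it only increases the chance of collisions, which the hashtable resolves through equals().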
Notice that if two Employee objects are equal, they will also have the same hash code. But, as is especially easy to see in the first example, two Employee objects can be unequal and still have the same hash code.
In both examples the hash code is the employeeId truncated to an int. That means that many employee ids could result in the same hash code, but these Employee objects would still not be equal, since they don't have the same employee id.
More Detail in the JavaDoc
For a 100% precise description of how to implement equals() and hashCode() you should check out the official JavaDoc. The purpose of this text was mostly to explain how they are used by the Java Collection classes. Understanding this makes it easier to implement them to suit your purposes.