What is the difference between ArrayList.clear() and ArrayList.removeAll()?
Assuming that arraylist is defined as ArrayList
@Corey: when might one every want to use arraylist.removeAll(arraylist) ? I see absolutely no reason to do that.
@Joachim Sauer That’s exactly what I wanted to verify. Thanks +2. But is the difference between elementData[i] = null and e.remove() significant?
There’s no sane reason to do arrList.removeAll(arrList) instead of arrList.clear() . arrList1.removeAll(arrList2) is a different matter.
If only removeAll()’s implementation started with this line, then this whole discussion could have been so much more entertaining. if (c == this && !isEmpty()) < clear(); return true; >. I’ll have to submit this to OpenJDK as a patch! 😉
9 Answers 9
The source code for clear() :
The source code for removeAll() (As defined in AbstractCollection ):
public boolean removeAll(Collection c) < boolean modified = false; Iteratore = iterator(); while (e.hasNext()) < if (c.contains(e.next())) < e.remove(); modified = true; >> return modified; >
clear() is much faster since it doesn’t have to deal with all those extra method calls.
And as Atrey points out, c.contains(..) increases the time complexity of removeAll to O(n 2 ) as opposed to clear ‘s O(n).
A note that c.contains(. ) squares the time complexity of the operation would make this answer complete.
The source is strong in this one. (To all other answers: Use the source, Luke.) Notice how clear() could be implemented as just the one line, size=0; but garbage-collection wouldn’t know to collect the elements in the unreachable portions of the array.
e.remove() is way more complex! e.remove() also squares the complexity, just as c.contains(. ) does. On an ArrayList, e.remove() calls ArrayList.remove(int index), which has to shift the remainder of the array over by one.
@ateiob e.remove() is two extra method calls, a range check, and an object return (internal to AbstractList.Itr.remove() and ArrayList.remove(int) ), too
@julius If it did this: size = 0; elementData = new Object[10]; all the rest would be garbage collected, since the backing array has no outside references.
The time complexity of ArrayList.clear() is O(n) and of removeAll is O(n^2) .
So yes, ArrayList.clear is much faster.
The clear() method removes all the elements of a single ArrayList . It’s a fast operation, as it just sets the array elements to null .
The removeAll(Collection) method, which is inherited from AbstractCollection , removes all the elements that are in the argument collection from the collection you call the method on. It’s a relatively slow operation, as it has to search through one of the collections involved.
I thought it just sets all, not some elements to null. If it’s not the case, how it decided on which elements should be set to null?
@Farid sorry, my English is just too informal here. I indeed meant that it sets all of the elements to null. I’ll fix it!
Unless there is a specific optimization that checks if the argument passed to removeAll() is the collection itself (and I highly doubt that such an optimization is there) it will be significantly slower than a simple .clear() .
Apart from that (and at least equally important): arraylist.removeAll(arraylist) is just obtuse, confusing code. It is a very backwards way of saying «clear this collection». What advantage would it have over the very understandable arraylist.clear() ?
They serve different purposes. clear() clears an instance of the class, removeAll() removes all the given objects and returns the state of the operation.
clear() will go through the underlying Array and set each entry to null;
removeAll(collection) will go through the ArrayList checking for collection and remove(Object) it if it exists.
I would imagine that clear() is way faster then removeAll because it’s not comparing, etc.
Clear is faster because it does not loop over elements to delete. This method can assume that ALL elements can be deleted.
Remove all does not necessarily mean delete all elements in the list, only those provided as parameters SHOULD be delete. Hence, more effort is required to keep those which should not be deleted.
CLARIFICATION
By ‘loop’, I mean it does not have to check whether the element should be kept or not. It can set the reference to null without searching through the provided lists of elements to delete.
Clear IS faster than deleteall .
Interface List
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
Unlike sets, lists typically allow duplicate elements. More formally, lists typically allow pairs of elements e1 and e2 such that e1.equals(e2) , and they typically allow multiple null elements if they allow null elements at all. It is not inconceivable that someone might wish to implement a list that prohibits duplicates, by throwing runtime exceptions when the user attempts to insert them, but we expect this usage to be rare.
The List interface places additional stipulations, beyond those specified in the Collection interface, on the contracts of the iterator , add , remove , equals , and hashCode methods. Declarations for other inherited methods are also included here for convenience.
The List interface provides four methods for positional (indexed) access to list elements. Lists (like Java arrays) are zero based. Note that these operations may execute in time proportional to the index value for some implementations (the LinkedList class, for example). Thus, iterating over the elements in a list is typically preferable to indexing through it if the caller does not know the implementation.
The List interface provides a special iterator, called a ListIterator , that allows element insertion and replacement, and bidirectional access in addition to the normal operations that the Iterator interface provides. A method is provided to obtain a list iterator that starts at a specified position in the list.
The List interface provides two methods to search for a specified object. From a performance standpoint, these methods should be used with caution. In many implementations they will perform costly linear searches.
The List interface provides two methods to efficiently insert and remove multiple elements at an arbitrary point in the list.
Note: While it is permissible for lists to contain themselves as elements, extreme caution is advised: the equals and hashCode methods are no longer well defined on such a list.
Some list implementations have restrictions on the elements that they may contain. For example, some implementations prohibit null elements, and some have restrictions on the types of their elements. Attempting to add an ineligible element throws an unchecked exception, typically NullPointerException or ClassCastException . Attempting to query the presence of an ineligible element may throw an exception, or it may simply return false; some implementations will exhibit the former behavior and some will exhibit the latter. More generally, attempting an operation on an ineligible element whose completion would not result in the insertion of an ineligible element into the list may throw an exception or it may succeed, at the option of the implementation. Such exceptions are marked as «optional» in the specification for this interface.
Unmodifiable Lists
- They are unmodifiable. Elements cannot be added, removed, or replaced. Calling any mutator method on the List will always cause UnsupportedOperationException to be thrown. However, if the contained elements are themselves mutable, this may cause the List’s contents to appear to change.
- They disallow null elements. Attempts to create them with null elements result in NullPointerException .
- They are serializable if all elements are serializable.
- The order of elements in the list is the same as the order of the provided arguments, or of the elements in the provided array.
- The lists and their subList views implement the RandomAccess interface.
- They are value-based. Programmers should treat instances that are equal as interchangeable and should not use them for synchronization, or unpredictable behavior may occur. For example, in a future release, synchronization may fail. Callers should make no assumptions about the identity of the returned instances. Factories are free to create new instances or reuse existing ones.
- They are serialized as specified on the Serialized Form page.
This interface is a member of the Java Collections Framework.
Class ArrayList
Type Parameters: E — the type of elements in this list All Implemented Interfaces: Serializable , Cloneable , Iterable , Collection , List , RandomAccess Direct Known Subclasses: AttributeList , RoleList , RoleUnresolvedList
Resizable-array implementation of the List interface. Implements all optional list operations, and permits all elements, including null . In addition to implementing the List interface, this class provides methods to manipulate the size of the array that is used internally to store the list. (This class is roughly equivalent to Vector , except that it is unsynchronized.)
The size , isEmpty , get , set , iterator , and listIterator operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking). The constant factor is low compared to that for the LinkedList implementation.
Each ArrayList instance has a capacity. The capacity is the size of the array used to store the elements in the list. It is always at least as large as the list size. As elements are added to an ArrayList, its capacity grows automatically. The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost.
An application can increase the capacity of an ArrayList instance before adding a large number of elements using the ensureCapacity operation. This may reduce the amount of incremental reallocation.
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be «wrapped» using the Collections.synchronizedList method. This is best done at creation time, to prevent accidental unsynchronized access to the list:
List list = Collections.synchronizedList(new ArrayList(. ));
The iterators returned by this class’s iterator and listIterator methods are fail-fast: if the list is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove or add methods, the iterator will throw a ConcurrentModificationException . Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
This class is a member of the Java Collections Framework.