Java stream distinct example

Interface Stream

A sequence of elements supporting sequential and parallel aggregate operations. The following example illustrates an aggregate operation using Stream and IntStream:

     int sum = widgets.stream()
                      .filter(w -> w.getColor() == RED)
                      .mapToInt(w -> w.getWeight())
                      .sum();

In this example, widgets is a Collection<Widget>. We create a stream of Widget objects via Collection.stream(), filter it to produce a stream containing only the red widgets, and then transform it into a stream of int values representing the weight of each red widget. Then this stream is summed to produce a total weight.
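The pipeline described above can be tried out with a small self-contained sketch. The Widget class, the Color enum, and the sample weights are hypothetical stand-ins; the text only names getColor(), getWeight(), and RED.

```java
import java.util.Collection;
import java.util.List;

class WidgetWeights {
    // Sums the weights of the red widgets, mirroring the pipeline in the text.
    static int totalRedWeight(Collection<Widget> widgets) {
        return widgets.stream()
                .filter(w -> w.getColor() == Color.RED)
                .mapToInt(Widget::getWeight)
                .sum();
    }

    public static void main(String[] args) {
        List<Widget> widgets = List.of(
                new Widget(Color.RED, 10),
                new Widget(Color.BLUE, 7),
                new Widget(Color.RED, 5));
        System.out.println(totalRedWeight(widgets)); // prints 15
    }
}

// Hypothetical types assumed by the example; not part of the JDK.
enum Color { RED, BLUE }

final class Widget {
    private final Color color;
    private final int weight;

    Widget(Color color, int weight) {
        this.color = color;
        this.weight = weight;
    }

    Color getColor() { return color; }

    int getWeight() { return weight; }
}
```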

In addition to Stream, which is a stream of object references, there are primitive specializations for IntStream, LongStream, and DoubleStream, all of which are referred to as "streams" and conform to the characteristics and restrictions described here.

To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline consists of a source (which might be an array, a collection, a generator function, an I/O channel, etc.), zero or more intermediate operations (which transform a stream into another stream, such as filter(Predicate)), and a terminal operation (which produces a result or side-effect, such as count() or forEach(Consumer)). Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed.
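Laziness can be observed with a small experiment. The sketch below records which elements a filter inspects; the class and method names are made up for illustration, and the side effect in the lambda exists only to make evaluation visible (it is not a pattern to copy into production code).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

class LazyPipelineDemo {
    // Returns the elements the filter inspected while findFirst() was running.
    static List<String> inspectedByFindFirst() {
        List<String> inspected = new ArrayList<>();

        // Building the pipeline runs nothing yet.
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .filter(s -> {
                    inspected.add(s); // observation only
                    return true;
                });
        if (!inspected.isEmpty()) {
            throw new IllegalStateException("filter ran before the terminal operation");
        }

        // The terminal operation starts the traversal and stops at the first match.
        Optional<String> first = pipeline.findFirst();
        if (!first.isPresent()) {
            throw new IllegalStateException("expected a first element");
        }
        return inspected;
    }

    public static void main(String[] args) {
        System.out.println(inspectedByFindFirst()); // prints [a]
    }
}
```

Only the first element is pulled from the source because findFirst() is a short-circuiting terminal operation on a sequential stream.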

A stream implementation is permitted significant latitude in optimizing the computation of the result. For example, a stream implementation is free to elide operations (or entire stages) from a stream pipeline, and therefore elide invocation of behavioral parameters, if it can prove that it would not affect the result of the computation. This means that side-effects of behavioral parameters may not always be executed and should not be relied upon, unless otherwise specified (such as by the terminal operations forEach and forEachOrdered). (For a specific example of such an optimization, see the API note documented on the count() operation. For more detail, see the side-effects section of the stream package documentation.)


Collections and streams, while bearing some superficial similarities, have different goals. Collections are primarily concerned with the efficient management of, and access to, their elements. By contrast, streams do not provide a means to directly access or manipulate their elements, and are instead concerned with declaratively describing their source and the computational operations which will be performed in aggregate on that source. However, if the provided stream operations do not offer the desired functionality, the BaseStream.iterator() and BaseStream.spliterator() operations can be used to perform a controlled traversal.
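As a rough illustration of a controlled traversal, the sketch below uses iterator() to pull elements one at a time and stop on a condition that depends on a running total. The class name, method name, and the stopping rule are invented for the example.

```java
import java.util.Iterator;
import java.util.stream.Stream;

class ControlledTraversal {
    // Consumes elements until the running total reaches the limit
    // and reports how many elements were needed.
    static int countUntilTotal(Stream<Integer> numbers, int limit) {
        Iterator<Integer> it = numbers.iterator(); // iterator() is a terminal operation
        int total = 0;
        int consumed = 0;
        while (total < limit && it.hasNext()) {
            total += it.next();
            consumed++;
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(countUntilTotal(Stream.of(4, 3, 5, 9), 10)); // prints 3
    }
}
```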

A stream pipeline, like the "widgets" example above, can be viewed as a query on the stream source. Unless the source was explicitly designed for concurrent modification (such as a ConcurrentHashMap), unpredictable or erroneous behavior may result from modifying the stream source while it is being queried.
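The query only begins when the terminal operation starts. For well-behaved sources such as ArrayList, a modification made before that point is therefore reflected in the result, as the java.util.stream package documentation describes. A minimal sketch (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LateBindingDemo {
    static String joinAfterLateAdd() {
        List<String> words = new ArrayList<>(List.of("one", "two"));
        Stream<String> stream = words.stream();
        words.add("three"); // happens before the terminal operation, so it is safe here
        return stream.collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(joinAfterLateAdd()); // prints one two three
    }
}
```

Modifying the list from inside one of the pipeline's lambdas, by contrast, is the kind of interference the paragraph above warns about.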

To preserve correct behavior, the behavioral parameters passed to stream operations:

  • must be non-interfering (they do not modify the stream source); and
  • in most cases must be stateless (their result should not depend on any state that might change during execution of the stream pipeline).

Such parameters are always instances of a functional interface such as Function, and are often lambda expressions or method references. Unless otherwise specified, these parameters must be non-null.
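A sketch of behavioral parameters that meet both requirements: each lambda depends only on its argument, and the result container is managed by collect() rather than by shared state touched from the lambdas. The class and method names are illustrative.

```java
import java.util.List;
import java.util.stream.Collectors;

class StatelessParameters {
    // Safe to run in parallel: no lambda reads or writes shared mutable state,
    // and none of them modifies the source list.
    static List<Integer> squaresOfEvens(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(squaresOfEvens(List.of(1, 2, 3, 4, 5, 6))); // prints [4, 16, 36]
    }
}
```

The common mistake this avoids is appending to an external ArrayList from forEach(), which is stateful and not thread-safe in a parallel pipeline.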

A stream should be operated on (invoking an intermediate or terminal stream operation) only once. This rules out, for example, "forked" streams, where the same source feeds two or more pipelines, or multiple traversals of the same stream. A stream implementation may throw IllegalStateException if it detects that the stream is being reused. However, since some stream operations may return their receiver rather than a new stream object, it may not be possible to detect reuse in all cases.
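The sketch below shows reuse being rejected and one common way around it: asking a Supplier for a fresh stream per pipeline. The names are illustrative, and the first method relies on the JDK's standard stream implementation detecting the reuse, which the specification permits but does not require.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class SingleUseStreams {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean reuseIsRejected() {
        Stream<String> stream = Stream.of("a", "b", "c");
        stream.count();
        try {
            stream.count(); // second traversal of the same stream object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Each call to get() creates a new stream over the same source.
    static long[] totalAndDistinct(List<String> source) {
        Supplier<Stream<String>> streams = source::stream;
        return new long[] {
            streams.get().count(),
            streams.get().distinct().count()
        };
    }

    public static void main(String[] args) {
        System.out.println(reuseIsRejected());
        long[] counts = totalAndDistinct(List.of("a", "b", "a"));
        System.out.println(counts[0] + " total, " + counts[1] + " distinct");
    }
}
```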

Streams have a BaseStream.close() method and implement AutoCloseable. Operating on a stream after it has been closed will throw IllegalStateException. Most stream instances do not actually need to be closed after use, as they are backed by collections, arrays, or generating functions, which require no special resource management. Generally, only streams whose source is an I/O channel, such as those returned by Files.lines(Path), will require closing. If a stream does require closing, it must be opened as a resource within a try-with-resources statement or similar control structure to ensure that it is closed promptly after its operations have completed.
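A minimal sketch of the try-with-resources pattern for a file-backed stream. The class name, the temporary file, and its contents are made up for the example; Files.lines(Path) is the call named in the text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class ClosingStreams {
    // Counts the distinct lines of a file. The try-with-resources block
    // closes the stream, and with it the underlying file, when the block exits.
    static long countDistinctLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.distinct().count();
        }
    }

    // Writes a small temporary file, counts its distinct lines, and cleans up.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("distinct-lines", ".txt");
            try {
                Files.write(tmp, List.of("x", "y", "x"));
                return countDistinctLines(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```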

Stream pipelines may execute either sequentially or in parallel. This execution mode is a property of the stream. Streams are created with an initial choice of sequential or parallel execution. (For example, Collection.stream() creates a sequential stream, and Collection.parallelStream() creates a parallel one.) This choice of execution mode may be modified by the BaseStream.sequential() or BaseStream.parallel() methods, and may be queried with the BaseStream.isParallel() method.
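The execution mode can be inspected directly. In this sketch (names are illustrative), the second result assumes a standard collection whose parallelStream() actually returns a parallel stream; the interface only promises a "possibly parallel" one.

```java
import java.util.List;

class ExecutionModeDemo {
    // Reports the execution mode of four differently constructed streams.
    static boolean[] modes(List<Integer> source) {
        return new boolean[] {
            source.stream().isParallel(),                      // sequential by default
            source.parallelStream().isParallel(),              // parallel for standard collections
            source.parallelStream().sequential().isParallel(), // switched back to sequential
            source.stream().parallel().isParallel()            // switched to parallel
        };
    }

    public static void main(String[] args) {
        for (boolean mode : modes(List.of(1, 2, 3))) {
            System.out.println(mode);
        }
    }
}
```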




Java Stream distinct() Function to Remove Duplicates


The Java Stream distinct() method returns a new stream of distinct elements. It is useful for removing duplicate elements from a collection before processing them.

Java Stream distinct() Method

  • Elements are compared using the equals() method, so the stream elements need a proper implementation of equals().
  • If the stream is ordered, the encounter order is preserved: of several equal elements, the one that occurs first is kept in the resulting stream.
  • If the stream is unordered, there is no guarantee about which of several equal elements is kept, or about the order of the resulting stream.
  • Stream distinct() is a stateful intermediate operation.
  • Using distinct() with an ordered parallel stream can perform poorly because of significant buffering overhead. In that case, consider sequential processing or, if the encounter order does not matter, dropping the ordering constraint with unordered().
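The ordering rules above can be checked with a short sketch. The class and method names are illustrative; the second method collects into a Set because the order of an unordered parallel result should not be relied on.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class DistinctOrdering {
    // Ordered source: the first occurrence of each element is kept, in encounter order.
    static List<String> distinctInOrder(List<String> items) {
        return items.stream().distinct().collect(Collectors.toList());
    }

    // Parallel pipeline with the ordering constraint dropped via unordered():
    // the same elements survive, but their order is not guaranteed.
    static Set<String> distinctUnordered(List<String> items) {
        return items.parallelStream().unordered().distinct().collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<String> items = List.of("b", "a", "b", "c", "a");
        System.out.println(distinctInOrder(items)); // prints [b, a, c]
        System.out.println(distinctUnordered(items).equals(Set.of("a", "b", "c"))); // prints true
    }
}
```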

Remove Duplicate Elements using distinct()

Let's see how to use the stream distinct() method to remove duplicate elements from a collection.

jshell> List<Integer> list = List.of(1, 2, 3, 4, 3, 2, 1);
list ==> [1, 2, 3, 4, 3, 2, 1]

jshell> List<Integer> distinctInts = list.stream().distinct().collect(Collectors.toList());
distinctInts ==> [1, 2, 3, 4]


Processing only Unique Elements using Stream distinct() and forEach()

Since distinct() is an intermediate operation, we can chain a terminal operation such as forEach() after it to process only the unique elements.

jshell> List<Integer> list = List.of(1, 2, 3, 4, 3, 2, 1);
list ==> [1, 2, 3, 4, 3, 2, 1]

jshell> list.stream().distinct().forEach(x -> System.out.println("Processing " + x));
Processing 1
Processing 2
Processing 3
Processing 4


Stream distinct() with custom objects

Let's look at a simple example of using distinct() to remove duplicate elements from a list of custom objects.

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class JavaStreamDistinct {

    public static void main(String[] args) {
        List<Data> dataList = new ArrayList<>();
        dataList.add(new Data(10));
        dataList.add(new Data(20));
        dataList.add(new Data(10));
        dataList.add(new Data(20));

        System.out.println("Data List = " + dataList);

        // Data does not override equals() yet, so nothing is removed here.
        List<Data> uniqueDataList = dataList.stream().distinct().collect(Collectors.toList());

        System.out.println("Unique Data List = " + uniqueDataList);
    }
}

class Data {
    private int id;

    Data(int i) {
        this.setId(i);
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    @Override
    public String toString() {
        return String.format("Data[%d]", this.id);
    }
}

Output:

Data List = [Data[10], Data[20], Data[10], Data[20]]
Unique Data List = [Data[10], Data[20], Data[10], Data[20]]

The distinct() method didn't remove the duplicate elements because we didn't implement the equals() method in the Data class, so the equals() method inherited from Object was used to identify equal elements. The Object class implementation of equals() is:

public boolean equals(Object obj) {
    return (this == obj);
}

Since the Data objects had the same ids but were different object instances, they were considered not equal. That's why it is important to implement the equals() method if you plan to use the stream distinct() method with custom objects. Note that the Collections API uses both equals() and hashCode() to check whether two objects are equal, so it's better to provide an implementation of both.

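@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + id;
    return result;
}

@Override
public boolean equals(Object obj) {
    // Body reconstructed: the source is cut off after the signature.
    // The trace line matches the "Data equals method" lines in the output shown later.
    System.out.println("Data equals method");
    if (this == obj)
        return true;
    if (obj == null || getClass() != obj.getClass())
        return false;
    return id == ((Data) obj).id;
}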

Tip: You can generate the equals() and hashCode() methods in Eclipse using the Source > Generate hashCode() and equals() menu option.

The output after adding the equals() and hashCode() implementation is:

Data List = [Data[10], Data[20], Data[10], Data[20]]
Data equals method
Data equals method
Unique Data List = [Data[10], Data[20]]
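For reference, here is a condensed, self-contained variant of the corrected example. It is a sketch rather than the article's exact code: the id is made final, the trace output is dropped, hashCode() uses Integer.hashCode(), and the wrapper class DistinctWithEquals is a hypothetical name.

```java
import java.util.List;
import java.util.stream.Collectors;

class DistinctWithEquals {

    static final class Data {
        private final int id;

        Data(int id) {
            this.id = id;
        }

        int getId() {
            return id;
        }

        // Two Data objects are equal when their ids match.
        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Data)) return false;
            return id == ((Data) obj).id;
        }

        // Kept consistent with equals(): equal objects produce equal hash codes.
        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }

        @Override
        public String toString() {
            return String.format("Data[%d]", id);
        }
    }

    static List<Data> unique(List<Data> data) {
        return data.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Data> dataList = List.of(new Data(10), new Data(20), new Data(10), new Data(20));
        System.out.println(unique(dataList)); // prints [Data[10], Data[20]]
    }
}
```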
