Java Stream: Removing Duplicates

Java Stream – Find, Count and Remove Duplicates

Here are a few simple examples that find and count the duplicates in a Stream and remove them, using the Stream API available since Java 8. We will use an ArrayList to provide a Stream of elements that includes duplicates.

1. Stream.distinct() – To Remove Duplicates

1.1. Remove Duplicate Strings

The distinct() method returns a Stream consisting of the distinct elements of the given stream. The object equality is checked according to the object’s equals() method.

    List<String> list = Arrays.asList("A", "B", "C", "D", "A", "B", "C");

    // Get list without duplicates
    List<String> distinctItems = list.stream()
        .distinct()
        .collect(Collectors.toList());

    // Let's verify distinct elements
    System.out.println(distinctItems);

1.2. Remove Duplicate Custom Objects

The same syntax can be used to remove duplicate objects from a List. To do so, we need to be very careful about the object’s equals() method (and a hashCode() that is consistent with it), because it decides whether an object is a duplicate or unique.

Consider the below example where two Person instances are considered equal if both have the same id value.
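A minimal sketch of such a class might look like the following. The field names (id, fname, lname) are assumptions chosen to match the getFname() and getLname() accessors used further down in this article; only id takes part in equals() and hashCode().

```java
import java.util.Objects;

// Hypothetical Person class for illustration; the field names are assumptions.
class Person {
    private final Integer id;
    private final String fname;
    private final String lname;

    Person(Integer id, String fname, String lname) {
        this.id = id;
        this.fname = fname;
        this.lname = lname;
    }

    public Integer getId() { return id; }
    public String getFname() { return fname; }
    public String getLname() { return lname; }

    // Two persons are equal when their ids are equal; names are ignored.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        return Objects.equals(id, ((Person) o).id);
    }

    // hashCode() must agree with equals(), so it is derived from id only.
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }

    @Override
    public String toString() {
        return "Person[" + id + ", " + fname + " " + lname + "]";
    }
}
```

With this definition, new Person(1, "Ann", "Lee") and new Person(1, "Bob", "Ray") count as the same element for distinct().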

Let us see an example of how we can remove duplicate Person objects from a List.

    // Add some random persons
    Collection<Person> list = Arrays.asList(p1, p2, p3, p4, p5, p6);

    // Get distinct people by id
    List<Person> distinctElements = list.stream()
        .distinct()
        .collect(Collectors.toList());

To find all unique objects using a different equality condition, we can use a helper method such as distinctByKey(), defined at the end of the following snippet. For example, here we find all unique objects by a Person’s full name.

    // Add some random persons
    List<Person> list = Arrays.asList(p1, p2, p3, p4, p5, p6);

    // Get distinct people by full name
    List<Person> distinctPeople = list.stream()
        .filter(distinctByKey(p -> p.getFname() + " " + p.getLname()))
        .collect(Collectors.toList());

    // The distinctByKey() method needs to be created.
    // The returned predicate is stateful: it remembers every key it has seen.
    public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> map = new ConcurrentHashMap<>();
        return t -> map.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }
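To try the distinctByKey() idea end to end, here is a self-contained sketch. It uses plain strings and a case-insensitive key instead of Person objects; the class name DistinctByKeyDemo and the method distinctIgnoringCase are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyDemo {

    // Returns a stateful predicate that accepts an element only the first
    // time its key is seen. Keys must be non-null, because
    // ConcurrentHashMap rejects null keys.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Hypothetical helper: keeps the first spelling of each name, ignoring case.
    static List<String> distinctIgnoringCase(List<String> names) {
        return names.stream()
                .filter(distinctByKey(name -> name.toLowerCase()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "ALICE", "Bob", "bob", "Carol");
        System.out.println(distinctIgnoringCase(names)); // prints [Alice, Bob, Carol]
    }
}
```

Because the predicate keeps state, it should be created once per stream pipeline and not reused. On a parallel stream the result still has one element per key, but which of the duplicates survives is not deterministic.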

2. Collectors.toSet() – To Remove Duplicates

Another simple and very useful way is to store all the elements in a Set. Sets, by definition, store only distinct elements. Note that a Set decides whether two items are the same by comparing them with the equals() method.

With this approach, we cannot supply a custom equality condition.

    ArrayList<Integer> numbersList
        = new ArrayList<>(Arrays.asList(1, 1, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 8));

    Set<Integer> setWithoutDuplicates = numbersList.stream()
        .collect(Collectors.toSet());

    System.out.println(setWithoutDuplicates);
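Collectors.toSet() makes no guarantee about the type of Set it returns, so the iteration order of the result should not be relied upon. When the encounter order matters, one option is Collectors.toCollection() with a LinkedHashSet. The sketch below illustrates this; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class OrderedSetDemo {

    // Collects into a LinkedHashSet, which drops duplicates and iterates
    // in insertion order (here: the stream's encounter order).
    static Set<Integer> distinctInEncounterOrder(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 1, 3, 2, 1);
        System.out.println(distinctInEncounterOrder(numbers)); // prints [3, 1, 2]
    }
}
```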

3. Collectors.toMap() – To Count Duplicates

Sometimes, we are interested in finding out which elements are duplicates and how many times they appeared in the original list. We can use a Map to store this information.

We iterate over the list, use each element as a Map key, and store the number of its occurrences as the Map value. The merge function Long::sum adds 1 to the count every time the same key is seen again.

    // ArrayList with duplicate elements
    ArrayList<Integer> numbersList
        = new ArrayList<>(Arrays.asList(1, 1, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 8));

    Map<Integer, Long> elementCountMap = numbersList.stream()
        .collect(Collectors.toMap(Function.identity(), v -> 1L, Long::sum));

    System.out.println(elementCountMap);
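To list only the elements that actually occur more than once, the count map can be filtered afterwards. The sketch below uses Collectors.groupingBy() with Collectors.counting() as an alternative to toMap(), and a LinkedHashMap so that keys keep their first-seen order; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class FindDuplicatesDemo {

    // Maps each element to the number of times it occurs in the list.
    static Map<Integer, Long> countOccurrences(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    // Keeps only the elements whose count is greater than one.
    static List<Integer> duplicatesOf(List<Integer> numbers) {
        return countOccurrences(numbers).entrySet().stream()
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 1, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 8);
        System.out.println(countOccurrences(numbers)); // prints {1=2, 2=1, 3=3, 4=1, 5=1, 6=3, 7=1, 8=1}
        System.out.println(duplicatesOf(numbers));     // prints [1, 3, 6]
    }
}
```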

Removing All Duplicates From a List in Java

1. Introduction

In this quick tutorial, we’re going to learn how to clean up the duplicate elements from a List. First, we’ll use plain Java, then Guava, and finally, a Java 8 Lambda-based solution.

This tutorial is part of the “Java – Back to Basic” series here on Baeldung.

2. Remove Duplicates From a List Using Plain Java

We can easily remove the duplicate elements from a List with the standard Java Collections Framework through a Set:

    public void givenListContainsDuplicates_whenRemovingDuplicatesWithPlainJava_thenCorrect() {
        List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
        List<Integer> listWithoutDuplicates = new ArrayList<>(
          new HashSet<>(listWithDuplicates));

        assertThat(listWithoutDuplicates, hasSize(5));
        assertThat(listWithoutDuplicates, containsInAnyOrder(5, 0, 3, 1, 2));
    }

As we can see, the original list remains unchanged.

In the example above, we used the HashSet implementation, which is an unordered collection. As a result, the order of the cleaned-up listWithoutDuplicates might differ from the order of the original listWithDuplicates.

If we need to preserve the order, we can use LinkedHashSet instead:

    public void givenListContainsDuplicates_whenRemovingDuplicatesPreservingOrderWithPlainJava_thenCorrect() {
        List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
        List<Integer> listWithoutDuplicates = new ArrayList<>(
          new LinkedHashSet<>(listWithDuplicates));

        assertThat(listWithoutDuplicates, hasSize(5));
        assertThat(listWithoutDuplicates, containsInRelativeOrder(5, 0, 3, 1, 2));
    }
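The test snippets in this section use Guava's Lists.newArrayList for setup and Hamcrest matchers for the assertions. For a version that depends on the JDK only, a sketch along these lines should work; the class name PlainJavaDedupDemo and the method withoutDuplicates are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class PlainJavaDedupDemo {

    // Copies the list through a LinkedHashSet: duplicates are dropped and
    // the first occurrence of each element keeps its position.
    // The source list itself is not modified.
    static <T> List<T> withoutDuplicates(List<T> source) {
        return new ArrayList<>(new LinkedHashSet<>(source));
    }

    public static void main(String[] args) {
        List<Integer> listWithDuplicates = Arrays.asList(5, 0, 3, 1, 2, 3, 0, 0);
        System.out.println(withoutDuplicates(listWithDuplicates)); // prints [5, 0, 3, 1, 2]
        System.out.println(listWithDuplicates.size());             // prints 8
    }
}
```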

3. Remove Duplicates From a List Using Guava

We can do the same thing using Guava as well:

    public void givenListContainsDuplicates_whenRemovingDuplicatesWithGuava_thenCorrect() {
        List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
        List<Integer> listWithoutDuplicates
          = Lists.newArrayList(Sets.newHashSet(listWithDuplicates));

        assertThat(listWithoutDuplicates, hasSize(5));
        assertThat(listWithoutDuplicates, containsInAnyOrder(5, 0, 3, 1, 2));
    }

Here too, the original list remains unchanged.

Again, the order of elements in the cleaned-up list is not guaranteed, because a HashSet does not maintain insertion order.

If we use the LinkedHashSet implementation, we’ll preserve the initial order:

    public void givenListContainsDuplicates_whenRemovingDuplicatesPreservingOrderWithGuava_thenCorrect() {
        List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
        List<Integer> listWithoutDuplicates
          = Lists.newArrayList(Sets.newLinkedHashSet(listWithDuplicates));

        assertThat(listWithoutDuplicates, hasSize(5));
        assertThat(listWithoutDuplicates, containsInRelativeOrder(5, 0, 3, 1, 2));
    }

4. Remove Duplicates From a List Using Java 8 Lambdas

Finally, let’s look at a new solution, using Lambdas in Java 8. We’ll use the distinct() method from the Stream API, which returns a stream consisting of distinct elements based on the result returned by the equals() method.

Additionally, for ordered streams, the selection of distinct elements is stable. This means that for duplicated elements, the element appearing first in the encounter order is preserved:

    public void givenListContainsDuplicates_whenRemovingDuplicatesWithJava8_thenCorrect() {
        List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
        List<Integer> listWithoutDuplicates = listWithDuplicates.stream()
          .distinct()
          .collect(Collectors.toList());

        assertThat(listWithoutDuplicates, hasSize(5));
        assertThat(listWithoutDuplicates, containsInAnyOrder(5, 0, 3, 1, 2));
    }
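With integers, the stability of distinct() is hard to observe, because equal values are indistinguishable. The sketch below makes it visible with a hypothetical Item class whose equals() compares only the id, so the surviving label shows which of the duplicates was kept. All names here are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StableDistinctDemo {

    // Hypothetical element type: equality is based on id only, so two
    // items with the same id but different labels count as duplicates.
    static final class Item {
        final int id;
        final String label;

        Item(int id, String label) {
            this.id = id;
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Item && ((Item) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    // Returns the labels of the items that survive distinct().
    static List<String> labelsAfterDistinct(List<Item> items) {
        return items.stream()
                .distinct()
                .map(item -> item.label)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(1, "first-1"), new Item(2, "first-2"),
                new Item(1, "second-1"), new Item(2, "second-2"));
        System.out.println(labelsAfterDistinct(items)); // prints [first-1, first-2]
    }
}
```

A List provides an ordered stream, so the first occurrence of each id is the one that remains.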

There we have it, three quick ways to clean up all the duplicate items from a List.

5. Conclusion

In this article, we demonstrated how easy it is to remove duplicates from a list using plain Java, Google Guava, and Java 8.

The implementation of all of these examples and snippets can be found in the GitHub project. This is a Maven-based project, so it should be easy to import and run.
