Python finding duplicates in list

Contents

Python: Find duplicates in a list with frequency count & index positions
Step 1: Get duplicate elements in a list with a frequency count
Use collections.Counter() to find duplicates in a list with frequency count
Step 2: Get indices of each duplicate element in a list along with frequency count
Python Find Duplicates in List
Find Duplicates in a List in Python
Using set() Function
Count Duplicates in List in Python
Using the Brute Force approach

Python: Find duplicates in a list with frequency count & index positions

In this article, we will discuss how to find duplicates in a list along with their frequency count and their index positions in the list.

Let’s do this step by step,

Step 1: Get duplicate elements in a list with a frequency count

Suppose we have a list of strings i.e.

# List of strings
listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

We have created a function that accepts a list and returns a dictionary of duplicate elements in that list along with their frequency count,


def getDuplicatesWithCount(listOfElems):
    ''' Get frequency count of duplicate elements in the given list '''
    dictOfElems = dict()
    # Iterate over each element in list
    for elem in listOfElems:
        # If element exists in dict then increment its value else add it in dict
        if elem in dictOfElems:
            dictOfElems[elem] += 1
        else:
            dictOfElems[elem] = 1

    # Filter key-value pairs in dictionary. Keep pairs whose value is greater than 1 i.e. only duplicate elements from list.
    dictOfElems = { key:value for key, value in dictOfElems.items() if value > 1}
    # Returns a dict of duplicate elements and their frequency count
    return dictOfElems

Let’s call this function to find out the duplicate elements in list with their frequency,

# List of strings
listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

# Get a dictionary containing duplicate elements in list and their frequency count
dictOfElems = getDuplicatesWithCount(listOfElems)

for key, value in dictOfElems.items():
    print(key , ' :: ', value)

What is this function doing?

When called, this function creates a new dictionary. It then iterates over the elements of the given list one by one. For each element, it checks whether the element already exists in the dictionary keys:

If the element does not exist in the dictionary keys, then it adds the element as a key in the dictionary with the value 1.
If the element exists in the dictionary keys, then it increments the value of that key by 1.

Once the iteration over the list ends, the dictionary holds the frequency count of each element in the list. But we are interested in duplicates only, i.e. elements with a frequency count greater than 1. So the function removes from the dictionary every element whose value is 1 and keeps those whose value is greater than 1. In the end, it returns a dictionary containing the duplicate elements as keys and their frequency counts as values.
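The counting and filtering described above can also be sketched more compactly with dict.get(). This is only a minimal sketch, not the article's function: the name count_duplicates is hypothetical and chosen purely for illustration, and it assumes the list elements are hashable.

```python
def count_duplicates(items):
    """Return {element: count} for elements that occur more than once.

    Hypothetical helper for illustration; elements must be hashable.
    """
    counts = {}
    for item in items:
        # dict.get() returns 0 for an unseen element, so one line
        # covers both the "new key" and the "existing key" case.
        counts[item] = counts.get(item, 0) + 1
    # Keep only the entries whose count is greater than 1.
    return {item: n for item, n in counts.items() if n > 1}


words = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']
duplicates = count_duplicates(words)
print(duplicates)  # {'Ok': 2, 'is': 2, 'test': 2}
```

Since dictionaries keep insertion order (Python 3.7 and later), the duplicates appear in the order in which they were first seen in the list.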

We can achieve the same using collections.Counter() too,

Use collections.Counter() to find duplicates in a list with frequency count

class collections.Counter( [ iterable-or-mapping ] )

We can create an object of Counter class, using an iterable or any dict like mapping. This Counter object keeps the count of each element in iterable. Let’s use this Counter object to find duplicates in a list and their count,

from collections import Counter

# List of strings
listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

# Create a dictionary of elements & their frequency count
dictOfElems = dict(Counter(listOfElems))

# Remove elements from dictionary whose value is 1, i.e. non duplicate items
dictOfElems = { key:value for key, value in dictOfElems.items() if value > 1}

for key, value in dictOfElems.items():
    print('Element = ' , key , ' :: Repeated Count = ', value)

Output:

Element =  Ok  :: Repeated Count =  2
Element =  is  :: Repeated Count =  2
Element =  test  :: Repeated Count =  2
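As noted above, a Counter can be built from an iterable or from a dict-like mapping. The sketch below shows both forms and wraps the duplicate filter in a function. The name duplicates_with_counter is hypothetical and used only for this example.

```python
from collections import Counter


def duplicates_with_counter(items):
    """Return {element: count} for elements seen more than once.

    Hypothetical helper for illustration only.
    """
    # Counter is a dict subclass, so .items() works as on a plain dict.
    return {item: n for item, n in Counter(items).items() if n > 1}


from_iterable = Counter(['a', 'b', 'a'])   # counts taken from a list
from_mapping = Counter({'a': 2, 'b': 1})   # counts taken from a dict
print(from_iterable == from_mapping)       # True

words = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']
print(duplicates_with_counter(words))
# {'Ok': 2, 'is': 2, 'test': 2}
```

Looking up an element that was never counted returns 0 instead of raising KeyError, which is one practical difference between a Counter and a plain dict.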

Now we know the frequency count of each duplicate element in the list. But what if we want to know the index position of these duplicate elements in the list? Let’s see how to do that,

Step 2: Get indices of each duplicate element in a list along with frequency count

# List of strings
listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

Now we want to know indices of each duplicate element in list and also their frequency count. Something like this,

Element =  Ok  :: Repeated Count =  2  :: Index Positions =  [1, 3]
Element =  is  :: Repeated Count =  2  :: Index Positions =  [2, 6]
Element =  test  :: Repeated Count =  2  :: Index Positions =  [4, 8]

So, to achieve that we have created a function,

def getDuplicatesWithInfo(listOfElems):
    ''' Get duplicate element in a list along with their indices in list
     and frequency count'''
    dictOfElems = dict()
    index = 0
    # Iterate over each element in list and keep track of index
    for elem in listOfElems:
        # If element exists in dict then keep its index in list & increment its frequency
        if elem in dictOfElems:
            dictOfElems[elem][0] += 1
            dictOfElems[elem][1].append(index)
        else:
            # Add a new entry in dictionary
            dictOfElems[elem] = [1, [index]]
        index += 1

    # Keep only the entries whose frequency count is greater than 1
    dictOfElems = { key:value for key, value in dictOfElems.items() if value[0] > 1}
    return dictOfElems

This function accepts a list of items and then iterates over the items in the list one by one to build a dictionary. In this dictionary, the key is the element and the value is a list of two items: the frequency count and a list of index positions.

Let’s call this function to find out the duplicate elements in a list, their index positions, and their frequency,

# List of strings
listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

dictOfElems = getDuplicatesWithInfo(listOfElems)

for key, value in dictOfElems.items():
    print('Element = ', key , ' :: Repeated Count = ', value[0] , ' :: Index Positions = ', value[1])

Output:

Element =  Ok  :: Repeated Count =  2  :: Index Positions =  [1, 3]
Element =  is  :: Repeated Count =  2  :: Index Positions =  [2, 6]
Element =  test  :: Repeated Count =  2  :: Index Positions =  [4, 8]

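What is this function doing?

When we call this function with a list argument, it performs the following steps:

First of all, it creates a new dictionary.
Then it iterates over all the elements in the list one by one and keeps track of their index positions.

Then, for each element, it checks whether the element exists in the dictionary keys or not.

If the element does not exist in the dictionary keys, then it adds a new key-value pair to the dictionary, where the key is the element and the value is a list object of 2 items, i.e.

Frequency count 1
List with current index position

If the element exists in the dictionary keys, then it increments the frequency count and appends the current index position to that element's list.

Once the iteration ends, it keeps only the pairs whose frequency count is greater than 1 and returns that dictionary.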
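The steps above can also be sketched with enumerate(), which supplies the index so that no manual counter is needed. This is an alternative sketch rather than the article's function: duplicate_positions is a hypothetical name, and it stores only the list of indices, on the assumption that the frequency count can be read off as the length of that list.

```python
from collections import defaultdict


def duplicate_positions(items):
    """Map each duplicated element to the list of indices where it occurs.

    Hypothetical helper for illustration only.
    """
    positions = defaultdict(list)
    # enumerate() yields (index, element) pairs, starting at index 0.
    for index, item in enumerate(items):
        positions[item].append(index)
    # Keep only elements that occur at more than one index.
    return {item: idx for item, idx in positions.items() if len(idx) > 1}


words = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']
for item, idx in duplicate_positions(words).items():
    print(item, len(idx), idx)
# Ok 2 [1, 3]
# is 2 [2, 6]
# test 2 [4, 8]
```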

The complete example is as follows,

from collections import Counter

def getDuplicatesWithCount(listOfElems):
    ''' Get frequency count of duplicate elements in the given list '''
    dictOfElems = dict()
    # Iterate over each element in list
    for elem in listOfElems:
        # If element exists in dict then increment its value else add it in dict
        if elem in dictOfElems:
            dictOfElems[elem] += 1
        else:
            dictOfElems[elem] = 1

    # Filter key-value pairs in dictionary. Keep pairs whose value is greater than 1 i.e. only duplicate elements from list.
    dictOfElems = { key:value for key, value in dictOfElems.items() if value > 1}
    # Returns a dict of duplicate elements and their frequency count
    return dictOfElems

def getDuplicatesWithInfo(listOfElems):
    ''' Get duplicate element in a list along with their indices in list
     and frequency count'''
    dictOfElems = dict()
    index = 0
    # Iterate over each element in list and keep track of index
    for elem in listOfElems:
        # If element exists in dict then keep its index in list & increment its frequency
        if elem in dictOfElems:
            dictOfElems[elem][0] += 1
            dictOfElems[elem][1].append(index)
        else:
            # Add a new entry in dictionary
            dictOfElems[elem] = [1, [index]]
        index += 1

    # Keep only the entries whose frequency count is greater than 1
    dictOfElems = { key:value for key, value in dictOfElems.items() if value[0] > 1}
    return dictOfElems

def main():
    # List of strings
    listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

    print('**** Get duplicate elements with repeated count ****')

    # get a dictionary containing duplicate elements in list and their frequency count
    dictOfElems = getDuplicatesWithCount(listOfElems)

    for key, value in dictOfElems.items():
        print(key , ' :: ', value)

    print('** Use Counter to get the frequency of duplicate items in list **')

    # Create a dictionary of elements & their frequency count
    dictOfElems = dict(Counter(listOfElems))

    # Remove elements from dictionary whose value is 1, i.e. non duplicate items
    dictOfElems = { key:value for key, value in dictOfElems.items() if value > 1}

    for key, value in dictOfElems.items():
        print('Element = ' , key , ' :: Repeated Count = ', value)

    print('Get duplicate elements with repeated count and index position of duplicates')

    dictOfElems = getDuplicatesWithInfo(listOfElems)

    for key, value in dictOfElems.items():
        print('Element = ', key , ' :: Repeated Count = ', value[0] , ' :: Index Positions = ', value[1])

if __name__ == '__main__':
    main()

Output:

**** Get duplicate elements with repeated count ****
Ok  ::  2
is  ::  2
test  ::  2
** Use Counter to get the frequency of duplicate items in list **
Element =  Ok  :: Repeated Count =  2
Element =  is  :: Repeated Count =  2
Element =  test  :: Repeated Count =  2
Get duplicate elements with repeated count and index position of duplicates
Element =  Ok  :: Repeated Count =  2  :: Index Positions =  [1, 3]
Element =  is  :: Repeated Count =  2  :: Index Positions =  [2, 6]
Element =  test  :: Repeated Count =  2  :: Index Positions =  [4, 8]


Python Find Duplicates in List

Python find duplicates in list | We will discuss how to find duplicate items or elements in a list. Python offers several ways to find the duplicate elements of a given list. In this post, we are using set(), count(), list comprehension, enumerate(), slicing + in operator, and the Brute Force approach.

We will define the list when declaring the variables. The Python program then finds the duplicate elements in the list and displays them on the screen.

Find Duplicates in a List in Python

Using set() Function

Python provides the built-in function set(), which creates a set: an unordered collection of unique elements. Because each element is stored only once, a set never contains duplicates. The elements themselves must be hashable (in practice, immutable values such as numbers, strings, and tuples), but the set itself is mutable, which means we can modify it after its creation.
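The set properties described above can be checked directly. The following is a small illustrative sketch with made-up values, not part of the original program.

```python
numbers = set([1, 3, 7, 1, 3])
print(len(numbers))      # 3: the repeated 1 and 3 are stored only once

numbers.add(9)           # a set can be modified after its creation
print(9 in numbers)      # True

try:
    numbers.add([1, 2])  # a list is mutable and therefore unhashable
except TypeError as error:
    print('TypeError:', error)
```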

# Python program to find duplicate items in list

# take list
my_list = [1, 3, 7, 1, 2, 7, 5, 3, 8, 1]

# printing original list
print('List:', my_list)

# find duplicate items using set()
seen = set()
duplicate_item = [x for x in my_list if x in seen or (seen.add(x) or False)]

# printing duplicate elements
print('Duplicate Elements:', duplicate_item)

Output:

List: [1, 3, 7, 1, 2, 7, 5, 3, 8, 1]
Duplicate Elements: [1, 7, 3, 1]
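The comprehension above depends on set.add() returning None, so the expression (seen.add(x) or False) records x in the set and then evaluates to False. An explicit loop that does the same thing may be easier to read. The function name find_repeats is hypothetical and used only in this sketch.

```python
def find_repeats(items):
    """Return every occurrence of an element after its first, in order."""
    seen = set()
    repeats = []
    for item in items:
        if item in seen:
            repeats.append(item)   # already met: this occurrence is a repeat
        else:
            seen.add(item)         # first occurrence: remember it
    return repeats


my_list = [1, 3, 7, 1, 2, 7, 5, 3, 8, 1]
print(find_repeats(my_list))  # [1, 7, 3, 1]
```

The value 1 appears twice in the result because it occurs three times in the list: every occurrence after the first one counts as a repeat.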

To get each duplicate only once, you can use the set comprehension like this.

# Python program to find duplicate items in list

# take list
my_list = [1, 3, 7, 1, 2, 7, 5, 3, 8, 1]

# printing original list
print('List:', my_list)

# find duplicate items using set()
seen = set()
duplicate_item = {x for x in my_list if x in seen or (seen.add(x) or False)}

# printing duplicate elements (sorted, because a set has no guaranteed order)
print('Duplicate Elements:', sorted(duplicate_item))

Output:

List: [1, 3, 7, 1, 2, 7, 5, 3, 8, 1]
Duplicate Elements: [1, 3, 7]

Count Duplicates in List in Python

count() is a built-in list method in Python that returns how many times a given object occurs in a list. Syntax: list_name.count(object)

# Python program to find duplicate items in list

# take list
my_list = [1, 3, 7, 1, 2, 7, 5, 3, 8, 1]

# printing original list
print('List:', my_list)

# find duplicate items using count()
duplicate_item = {x for x in my_list if my_list.count(x) > 1}

# printing duplicate elements
print('Duplicate Elements:', duplicate_item)
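As a small illustration of list.count(), the sketch below counts one value and then collects each duplicated value once, in order of first appearance. The function name duplicates_in_order is hypothetical. Because count() scans the whole list on every call, this approach takes quadratic time and is probably best kept for short lists.

```python
def duplicates_in_order(items):
    """Return each duplicated element once, in order of first appearance."""
    result = []
    for item in items:
        # count() scans the entire list each time it is called.
        if items.count(item) > 1 and item not in result:
            result.append(item)
    return result


my_list = [1, 3, 7, 1, 2, 7, 5, 3, 8, 1]
print(my_list.count(1))              # 3
print(duplicates_in_order(my_list))  # [1, 3, 7]
```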