Python string find all index

Python: Find an Index (or all) of a Substring in a String


In this post, you’ll learn how to find an index of a substring in a string, whether it’s the first substring or the last substring. You’ll also learn how to find every index of a substring in a string.

Knowing how to work with strings is an important skill in your Python journey. By the end of this post, you’ll be able to build a list of every index position at which a substring occurs.


How to Use Python to Find the First Index of a Substring in a String

If all you want to do is find the first index of a substring in a Python string, you can do this easily with the str.index() method. This method comes built into Python, so there’s no need to import any packages.


Let’s take a look at how you can use Python to find the first index of a substring in a string:

    a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"

    # Find the first index of 'the'
    index = a_string.index('the')
    print(index)

    # Returns: 0

We can see here that the .index() method takes the substring that we’re looking for as its argument. When we call the method on our string a_string, it returns 0. This means that the substring begins at index position 0 of our string (i.e., it’s the first word).
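One caveat worth knowing: str.index() raises a ValueError when the substring is not present, whereas the closely related str.find() returns -1 instead. The following is a minimal sketch of both behaviours; the shorter example string and the variable names are chosen here for illustration only.

```python
a_string = "the quick brown fox jumps over the lazy dog"

# str.index() raises ValueError if the substring is not present
try:
    missing_index = a_string.index("cat")
except ValueError:
    missing_index = None

# str.find() searches the same way but returns -1 instead of raising
found = a_string.find("fox")
not_found = a_string.find("cat")

print(missing_index)  # None
print(found)          # 16
print(not_found)      # -1
```

Which one to prefer depends on whether a missing substring is an error in your program or an expected outcome.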

Let’s take a look at how you can find the last index of a substring in a Python string.

How to Use Python to Find the Last Index of a Substring in a String

There may be many times when you want to find the last index of a substring in a Python string. To accomplish this, we cannot use the .index() string method. However, Python comes built in with a string method that searches right to left, meaning it’ll return the furthest right index. This is the .rindex() method.

Let’s see how we can use the str.rindex() method to find the last index of a substring in Python:

    a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"

    # Find the last index of 'the'
    index = a_string.rindex('the')
    print(index)

    # Returns: 76

In the example above, we applied the .rindex() method to the string to return the last index’s position of our substring.
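Both .index() and .rindex() also accept optional start and end positions, which limit the search to part of the string. The sketch below uses them to find the second 'the' and the last 'the' within the first sentence. It also shows str.rfind(), which behaves like .rindex() but returns -1 when nothing is found. The variable names and the end position of 44 (the length of the first sentence) are specific to this illustration.

```python
a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"

first = a_string.index("the")
# Start searching just past the first match to find the next occurrence
second = a_string.index("the", first + 1)

# Restrict rindex() to the first sentence by passing an end position
last_in_first_sentence = a_string.rindex("the", 0, 44)

# rfind() mirrors rindex() but returns -1 instead of raising ValueError
missing = a_string.rfind("cat")

print(first, second, last_in_first_sentence, missing)  # 0 31 31 -1
```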

How to Use Regular Expression (Regex) finditer to Find All Indices of a Substring in a Python String

The above examples both returned different indices, but both only returned a single index. There may be other times when you may want to return all indices of a substring in a Python string.

For this, we’ll use Python’s built-in regular expression module, re. In particular, we’ll use the re.finditer() function, which returns an iterator that yields a match object for each non-overlapping match of the pattern in the string.

Let’s see how we can use regular expressions to find all indices of a substring in a Python string:

    import re

    a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"

    # Find all indices of 'the'
    indices_object = re.finditer(pattern='the', string=a_string)
    indices = [index.start() for index in indices_object]
    print(indices)

    # Returns: [0, 31, 45, 76]

This example has a few more moving parts. Let’s break down what we’ve done step by step:

  1. We imported re and set up our variable a_string just as before
  2. We then used re.finditer to create an iterator that yields all the matches
  3. We then used a list comprehension to collect the .start() value of each match, meaning the index position at which that match begins
  4. Finally, we printed our list of index start positions
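Two details of the regex approach are easy to miss. First, the search term is treated as a pattern, so a substring containing characters such as "." needs to be escaped with re.escape() to be matched literally. Second, re.finditer() reports only non-overlapping matches; a zero-width lookahead is one common way to report overlapping occurrences too. The sketch below illustrates both points with example strings that are not part of the tutorial above.

```python
import re

# A substring containing regex metacharacters should be escaped,
# otherwise "." matches any character
text = "a.b axb a.b"
unescaped = [m.start() for m in re.finditer("a.b", text)]
escaped = [m.start() for m in re.finditer(re.escape("a.b"), text)]
print(unescaped)  # [0, 4, 8]
print(escaped)    # [0, 8]

# finditer() skips past each match, so overlapping hits are missed;
# a lookahead matches at a position without consuming any characters
overlapping_text = "bobobobob"
plain = [m.start() for m in re.finditer("bob", overlapping_text)]
lookahead = [m.start() for m in re.finditer("(?=bob)", overlapping_text)]
print(plain)      # [0, 4]
print(lookahead)  # [0, 2, 4, 6]
```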

In the next section, you’ll learn how to use a list comprehension in Python to find all indices of a substring in a string.

How to Use a Python List Comprehension to Find All Indices of a Substring in a String

Let’s take a look at how you can find all indices of a substring in a string in Python without using the regular expression library. We’ll accomplish this by using a list comprehension.


Let’s see how we can accomplish this using a list comprehension:

    a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"

    # Find all indices of 'the'
    indices = [index for index in range(len(a_string)) if a_string.startswith('the', index)]
    print(indices)

    # Returns: [0, 31, 45, 76]

Let’s take a look at how this list comprehension works:

  1. We iterate over every index position in the string, from 0 up to (but not including) the length of the string
  2. We keep an index position if the string, read from that position onwards, begins with our substring
  3. We get back a list of every position at which the substring occurs in our string
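To make the second step more concrete: str.startswith(sub, index) tests for a match beginning at position index, which gives the same result as comparing a slice of the string against the substring. The sketch below shows the two forms side by side on a short illustrative string. Because every position is tested independently, overlapping occurrences are included.

```python
text = "aaaa"
sub = "aa"

# startswith(sub, index) checks for a match beginning at position index
with_startswith = [i for i in range(len(text)) if text.startswith(sub, i)]

# The same check written as an explicit slice comparison
with_slices = [i for i in range(len(text)) if text[i:i + len(sub)] == sub]

print(with_startswith)  # [0, 1, 2]
print(with_slices)      # [0, 1, 2]
```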

In the final section of this tutorial, you’ll learn how to build a custom function to return the indices of all occurrences of a substring in a Python string.

How to Build a Custom Function to Find All Indices of a Substring in a String in Python

Now that you’ve learned two different methods to return all indices of a substring in a Python string, let’s learn how we can turn this into a custom Python function.

Why would we want to do this? Neither of the methods demonstrated above makes it immediately clear to a reader what it accomplishes. This is where a function comes in handy, since its name lets a future reader (who may, very well, be you!) know what your code is doing.

    # Create a custom function to return the indices of all substrings in a Python string
    a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"

    def find_indices_of_substring(full_string, sub_string):
        return [index for index in range(len(full_string)) if full_string.startswith(sub_string, index)]

    indices = find_indices_of_substring(a_string, 'the')
    print(indices)

    # Returns: [0, 31, 45, 76]

In this sample custom function, we used our list comprehension method of finding the indices of all occurrences of a substring. The reason for this is that it does not require any imports.
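If you plan to reuse a function like this, a docstring and a guard against an empty substring can make it more robust. The variant below is only a sketch: the name find_indices_of_substring_checked is hypothetical, and raising ValueError for an empty substring is an assumption made here (startswith('') is true at every position, so without the guard every index would be returned).

```python
def find_indices_of_substring_checked(full_string, sub_string):
    """Return the start index of every occurrence of sub_string (overlaps included)."""
    if not sub_string:
        # Assumption for this sketch: treat an empty substring as an error,
        # since startswith('') is true at every position
        raise ValueError("sub_string must not be empty")
    return [
        index
        for index in range(len(full_string))
        if full_string.startswith(sub_string, index)
    ]


print(find_indices_of_substring_checked("the cat and the hat", "the"))  # [0, 12]
```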

Conclusion

In this post, you learned how to use Python to find the first index, the last index, and all indices of a substring in a string. You learned how to do this with built-in string methods, regular expressions, list comprehensions, and a custom function.

To learn more about re.finditer(), check out the official Python documentation for the re module.


Last updated: Feb 22, 2023


# Find all indexes of a substring using startswith()

To find all indexes of a substring in a string:

  1. Use a list comprehension to iterate over a range object of the string’s length.
  2. Check if the string starts with the given substring at each index and keep the indexes where it does.

    string = 'bobby hadz bobbyhadz.com'

    indexes = [
        index for index in range(len(string))
        if string.startswith('bobby', index)
    ]
    print(indexes)  # 👉️ [0, 11]

We used a range object to iterate over the string.

    string = 'bobby hadz bobbyhadz.com'

    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
    #  12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
    print(list(range(len(string))))


The range class is commonly used for looping a specific number of times in for loops and takes the following arguments:

| Name | Description |
| --- | --- |
| start | An integer representing the start of the range (defaults to 0) |
| stop | Go up to, but not including the provided integer |
| step | Range will consist of every N numbers from start to stop (defaults to 1) |

If you only pass a single argument to the range() constructor, it is considered to be the value for the stop parameter.

    for n in range(5):
        print(n)  # 👉️ 0 1 2 3 4

    result = list(range(5))
    # 👇️ [0, 1, 2, 3, 4]
    print(result)
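For completeness, here is a brief sketch of the other two calling forms, range(start, stop) and range(start, stop, step). The specific numbers are arbitrary illustrations.

```python
# range(start, stop): stop itself is never included
two_args = list(range(2, 10))
print(two_args)  # [2, 3, 4, 5, 6, 7, 8, 9]

# range(start, stop, step): take every third number
with_step = list(range(0, 10, 3))
print(with_step)  # [0, 3, 6, 9]

# A negative step counts downwards
countdown = list(range(5, 0, -1))
print(countdown)  # [5, 4, 3, 2, 1]
```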

On each iteration, we check if the slice of the string that starts at the current character starts with the given substring.

    string = 'bobby hadz bobbyhadz.com'

    indexes = [
        index for index in range(len(string))
        if string.startswith('bobby', index)
    ]
    print(indexes)  # 👉️ [0, 11]

If the condition is met, the corresponding index is added to the new list.

The new list contains all of the indexes of the substring in the string.

# Find all indexes of a substring in a String using re.finditer()

This is a three-step process:

  1. Use re.finditer() to get an iterator over the matches.
  2. Use a list comprehension to iterate over the iterator.
  3. Use the match.start() method to get the index at which each match begins.

    import re

    string = 'bobby hadz bobbyhadz.com'

    indexes = [
        match.start() for match in re.finditer(r'bob', string)
    ]
    print(indexes)  # 👉️ [0, 11]


The re.finditer() method takes a regular expression and a string and returns an iterator object containing the matches for the pattern in the string.

    import re

    string = 'bobby hadz bobbyhadz.com'

    # 👇️ [<re.Match object; span=(0, 3), match='bob'>,
    #     <re.Match object; span=(11, 14), match='bob'>]
    print(list(
        re.finditer(r'bob', string)
    ))

The match.start() method returns the index of the first character of the match.

    import re

    string = 'bobby hadz bobbyhadz.com'

    print(
        list(re.finditer(r'bob', string))[0].start()  # 👉️ 0
    )
    print(
        list(re.finditer(r'bob', string))[1].start()  # 👉️ 11
    )

The new list contains the index of all occurrences of the substring in the string.

    import re

    string = 'bobby hadz bobbyhadz.com'

    indexes = [
        match.start() for match in re.finditer(r'bob', string)
    ]
    print(indexes)  # 👉️ [0, 11]
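Match objects expose more than start(). As a small sketch, match.span() returns the (start, end) pair for each match, which can be used to slice the matched text back out of the string. The variable names spans and pieces are illustrative.

```python
import re

string = 'bobby hadz bobbyhadz.com'

# span() gives (start, end), where end is one past the last matched character
spans = [match.span() for match in re.finditer(r'bob', string)]
print(spans)  # [(0, 3), (11, 14)]

# Slicing with each span recovers the matched text
pieces = [string[start:end] for start, end in spans]
print(pieces)  # ['bob', 'bob']
```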

Alternatively, you can use a for loop.

# Find all indexes of a substring in a String using a for loop

This is a four-step process:

  1. Declare a new variable that stores an empty list.
  2. Use re.finditer() to get an iterator over the matches.
  3. Use a for loop to iterate over the iterator.
  4. Append the index of each match to the list.

    import re

    string = 'bobby hadz bobbyhadz.com'

    indexes = []

    for match in re.finditer(r'bob', string):
        indexes.append(match.start())

    print(indexes)  # 👉️ [0, 11]

We used a for loop to iterate over the iterator object.

On each iteration, we use the match.start() method to get the index of the current match and append the result to the indexes list.

The list.append() method adds an item to the end of the list.

# Find all indexes of a substring using a while loop

You can also use a while loop to find all indexes of a substring in a string.

    def find_indexes(a_string, substring):
        start = 0
        indexes = []

        while start < len(a_string):
            start = a_string.find(substring, start)
            if start == -1:
                return indexes
            indexes.append(start)
            start += 1

        return indexes

    string = 'bobby hadz bobbyhadz.com'
    print(find_indexes(string, 'bob'))  # 👉️ [0, 11]

    string = 'bobobobob'
    print(find_indexes(string, 'bob'))  # 👉️ [0, 2, 4, 6]

The find_indexes function takes a string and a substring and returns a list containing all of the indexes of the substring in the string.

We used a while loop to iterate for as long as the start variable is less than the string’s length.

On each iteration, we use the str.find() method to find the next index of the substring in the string.

The str.find() method returns the index of the first occurrence of the provided substring at or after the given start position, or -1 if the substring is not found.

When -1 is returned, there are no more occurrences, so we return the indexes list.

Otherwise, we add the index of the occurrence to the list and increment the start variable by 1 .

Notice that the function in the example finds indexes of overlapping substrings as well.

    def find_indexes(a_string, substring):
        start = 0
        indexes = []

        while start < len(a_string):
            start = a_string.find(substring, start)
            if start == -1:
                return indexes
            indexes.append(start)
            start += 1

        return indexes

    string = 'bobobobob'
    print(find_indexes(string, 'bob'))  # 👉️ [0, 2, 4, 6]
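This matters because some built-in tools count only non-overlapping occurrences. As a sketch of the difference, str.count() reports 2 for the string above, while a search that advances by one character after each hit finds 4 positions. The loop below is a compact restatement of the same idea, not a replacement for find_indexes.

```python
string = 'bobobobob'

# str.count() counts non-overlapping occurrences only
count = string.count('bob')

# Advancing by one character after each hit also picks up overlaps
overlapping = []
start = string.find('bob')
while start != -1:
    overlapping.append(start)
    start = string.find('bob', start + 1)

print(count)        # 2
print(overlapping)  # [0, 2, 4, 6]
```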

# Finding only non-overlapping results

If you need to only find the indexes of the non-overlapping substrings, add the length of the substring to the start variable.

    def find_indexes(a_string, substring):
        start = 0
        indexes = []

        while start < len(a_string):
            start = a_string.find(substring, start)
            if start == -1:
                return indexes
            indexes.append(start)
            start += len(substring)  # 👈️ only non-overlapping

        return indexes

    string = 'bobobobob'
    print(find_indexes(string, 'bob'))  # 👉️ [0, 4]

Instead of adding 1 to the start variable when iterating, we added the length of the substring to only get the indexes of the non-overlapping matches.
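The two variants differ only in how far start advances, so they can be folded into a single function with a flag. The following is one possible sketch written as a generator; the name iter_indexes, the overlapping parameter, and the choice to yield nothing for an empty substring are all assumptions made for this illustration.

```python
def iter_indexes(a_string, substring, overlapping=True):
    """Yield the start index of each occurrence of substring in a_string."""
    if not substring:
        return  # assumption: an empty substring yields nothing
    # Step by 1 to allow overlaps, or by the substring length to skip them
    step = 1 if overlapping else len(substring)
    start = a_string.find(substring)
    while start != -1:
        yield start
        start = a_string.find(substring, start + step)


print(list(iter_indexes('bobobobob', 'bob')))                     # [0, 2, 4, 6]
print(list(iter_indexes('bobobobob', 'bob', overlapping=False)))  # [0, 4]
```

A generator produces the indexes lazily, which can be convenient when only the first few matches in a long string are needed.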
