Python string contains regex

Python: Check if String Contains a Substring

In this Python Solutions post, you will learn the various ways to check if a string contains a substring. Checking for a substring is a common task in Python, often used in conditional statements. We will explore the in operator and the .__contains__(), .index(), and .find() methods. Then we will look at using regular expressions (regex) with re.search() to search strings.

The in Operator

The most straightforward way to check if a Python string contains a substring is to use the in operator.

The in operator checks data structures for membership and returns either True or False. The substring goes on the left of in, and the string being searched (the superstring) goes on the right.

full_string = "Research"
sub_string = "search"

if sub_string in full_string:
    print("Found substring!")
else:
    print("Not found!")

You can also use the in operator inside a loop to check each string in a list for a substring.

strings = ['this string has gluons', 'this string has neutrinos', 'this string has muons']

for s in strings:
    if 'muons' in s:
        print('Muons found in string')
    else:
        print('Muons not found in string')

The in operator is case sensitive, so if the word “muons” is capitalised in the string, the check in the above code would evaluate to False:

strings = ['this string has gluons', 'this string has neutrinos', 'this string has Muons']

for s in strings:
    if 'muons' in s:
        print('Muons found in string')
    else:
        print('Muons not found in string')

Muons not found in string
Muons not found in string
Muons not found in string

Hence, when case should not matter, it is good practice to convert the string to lowercase with the .lower() method before using the in operator:

strings = ['this string has gluons'.lower(), 'this string has neutrinos'.lower(), 'this string has Muons'.lower()]

for s in strings:
    if 'muons' in s:
        print('Muons found in string')
    else:
        print('Muons not found in string')

Muons not found in string
Muons not found in string
Muons found in string
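Lowercasing only the strings being searched works as long as the search term is itself lowercase. As a minimal sketch, the helper below (contains_ignore_case is a hypothetical name, not part of the examples above) normalises both sides with str.casefold(), a stricter variant of lower() intended for caseless comparison:

```python
def contains_ignore_case(full_string, sub_string):
    """Return True if sub_string occurs in full_string, ignoring case.

    Hypothetical helper for illustration only.
    """
    # Normalise case on both sides before the membership test
    return sub_string.casefold() in full_string.casefold()


strings = ['this string has gluons', 'this string has neutrinos', 'this string has Muons']
results = [contains_ignore_case(s, 'MUONS') for s in strings]
print(results)  # [False, False, True]
```

Normalising inside the helper means callers do not need to remember to lowercase either argument.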

The in operator is shorthand for calling the __contains__ method of an object.

string = "This string contains photons"
target = "photons"

if string.__contains__(target):
    print("String contains photons!")
else:
    print("String does not contain photons")
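To illustrate how in maps onto __contains__, here is a small sketch using a hypothetical ParticleList class (invented for this illustration) that defines its own membership rule. Python calls the method whenever in is evaluated against an instance:

```python
class ParticleList:
    """Hypothetical container used only to illustrate __contains__."""

    def __init__(self, names):
        # Store lowercase names so membership checks ignore case
        self.names = [name.lower() for name in names]

    def __contains__(self, item):
        # Invoked by Python when evaluating `item in particle_list`
        return item.lower() in self.names


particles = ParticleList(["Photon", "Gluon"])
print("photon" in particles)  # True
print("muon" in particles)    # False
```

In everyday code the in operator is the more readable choice; calling __contains__ directly is rarely needed.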

Bear in mind that the in operator is not null-safe, so if your string variable is None, it raises a TypeError exception:

TypeError: argument of type 'NoneType' is not iterable 

To avoid this you can check if the string points to None or not:

full_string = None
sub_string = "search"

# Compare against None with "is not" rather than "!="
if full_string is not None and sub_string in full_string:
    print("Found!")
else:
    print("Not found!")
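If the None check is needed in several places, it can be wrapped in a function. The sketch below uses a hypothetical helper named safe_contains, and it assumes that a non-string argument should count as "not found" rather than raise an error:

```python
def safe_contains(full_string, sub_string):
    """Return True only if both arguments are strings and sub_string is found.

    Hypothetical helper; treating None as "not found" is an assumption.
    """
    if not isinstance(full_string, str) or not isinstance(sub_string, str):
        return False
    return sub_string in full_string


print(safe_contains(None, "search"))        # False
print(safe_contains("Research", "search"))  # True
```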

The String.index() Method

In Python, string objects have a method called index(), which you can use to find the starting index of the first occurrence of a substring within a string. This method is particularly useful if you need to know the position of the substring as opposed to whether or not the substring exists within the full string. If the substring is not found, it raises a ValueError exception. To handle this exception you can write your code as a try-except-else block. The method also takes two optional parameters, start and end. These take index values that restrict the search to a specific index range.

full_string = "Research"
sub_string = "search"

try:
    full_string.index(sub_string)
except ValueError:
    print("Not found!")
else:
    print("Found substring!")

As with the in operator, index() is case sensitive, so use the .lower() method when case should not matter to avoid bugs in your code.

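# Example string for illustration
string = "Learning Python strings"

try:
    # Lowercase once and keep the index for reuse
    position = string.lower().index("python")
except ValueError:
    print("String not found")
else:
    print("Found string at index:", position)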
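The optional start and end parameters of index() can be sketched as follows. The example string is made up for illustration; the search window is half-open, so start is included and end is excluded:

```python
full_string = "search and research"

# First occurrence in the whole string
first = full_string.index("search")

# Start looking from index 1, which skips the occurrence at index 0
second = full_string.index("search", 1)

# Restrict the search to the window [0, 10); "research" lies outside it
try:
    full_string.index("research", 0, 10)
except ValueError:
    in_window = False
else:
    in_window = True

print(first, second, in_window)  # 0 13 False
```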

The String.find() Method

The find() method takes the substring we want to locate as its argument and returns the index at which the first occurrence starts. If the substring is not found, the method returns -1. Returning -1 may be preferable to the ValueError exception raised by the index() method, because we can apply find() directly in an if-else statement.

The find() method is also case-sensitive.

full_string = "Research"
sub_string = "search"

# find() returns -1 when the substring is absent
if full_string.find(sub_string) != -1:
    print("Found substring!")
else:
    print("Not found!")

We can apply the find() method to the earlier muons example as follows:

strings = ['this string has gluons'.lower(), 'this string has neutrinos'.lower(), 'this string has Muons'.lower()]

for s in strings:
    # Call find() on each string, not on the list
    muons_index = s.find('muons')
    if muons_index < 0:
        print('Muons not found in string')
    else:
        print(f'Muons found in string starting at index {muons_index}')

Muons not found in string
Muons not found in string
Muons found in string starting at index 16
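Because find() also accepts an optional start index, it can be called repeatedly to locate every occurrence of a substring rather than just the first. The following is a minimal sketch; find_all is a hypothetical helper name, and returning an empty list for an empty substring is an assumption made to keep the loop finite:

```python
def find_all(full_string, sub_string):
    """Return the start index of every non-overlapping occurrence.

    Hypothetical helper for illustration only.
    """
    if not sub_string:
        # Assumption: an empty substring yields no positions
        return []
    positions = []
    start = 0
    while True:
        index = full_string.find(sub_string, start)
        if index == -1:
            break
        positions.append(index)
        # Continue searching just past the end of this match
        start = index + len(sub_string)
    return positions


positions = find_all("the blue house and the red house", "house")
print(positions)  # [9, 27]
```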

Regular Expressions (RegEx)

A Regular Expression (RegEx) is a sequence of characters that forms a search pattern. RegEx is useful for extracting information from text.

You can use RegEx in Python by importing the re module. With re.search() we can determine whether a pattern matches anywhere in a string. The re.search() function returns a Match object if the pattern makes a match, and None otherwise. Find an example below:

import re

string = "This string has photons"
print(re.search("photons", string))

<re.Match object; span=(16, 23), match='photons'>

The Match object gives you the span, which is the start and end index for “photons”. Slicing the string between 16 and 23 will return the substring “photons”.

The match field shows us the part of the string that was a match, which is helpful when searching through a range of possible substrings that match the search conditions. You can access the span and match attributes using the span() and group() methods as shown below:

print(re.search("photons", "This string has photons").span())   # (16, 23)
print(re.search("photons", "This string has photons").group())  # photons
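Note that re.search() returns None when nothing matches, so chaining .span() or .group() directly onto the call raises an AttributeError in that case. A minimal sketch of a guarded lookup, using a hypothetical helper named find_span:

```python
import re

text = "This string has photons"


def find_span(pattern, text):
    """Return the (start, end) span of the first match, or None.

    Hypothetical helper for illustration only.
    """
    match = re.search(pattern, text)
    # Check for None before calling methods on the result
    return match.span() if match is not None else None


print(find_span("photons", text))    # (16, 23)
print(find_span("gravitons", text))  # None
```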

Here is another example of using re.search() to find a substring within a string.

from re import search

full_string = "Research"
sub_string = "search"

if search(sub_string, full_string):
    print("Found substring!")
else:
    print("Not found!")

Regex can also use the alternation operator |, which acts as a logical OR, to search for multiple substrings. Find an example of this below:

import re

strings = ['this string has gluons'.lower(), 'this string has neutrinos'.lower(), 'this string has Muons'.lower()]

for s in strings:
    # The pattern matches if either word appears in the string
    if re.search('gluons|muons', s):
        print('Gluons or muons in string')
    else:
        print('Neither particle is in string')

Gluons or muons in string
Neither particle is in string
Gluons or muons in string

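The regex method is best if you need more complex pattern matching, or case-insensitive matching through the re.IGNORECASE flag (re.search() is case sensitive by default). Otherwise the simpler substring matching methods are preferable, as regex is generally slower for plain substring checks.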
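As a sketch of case-insensitive matching with regex, the example below passes the re.IGNORECASE flag so the strings do not need to be lowercased first. It also shows re.escape(), which is worth using when the substring may contain characters that have a special meaning in a pattern. The sample strings are illustrative:

```python
import re

strings = ['this string has gluons', 'this string has neutrinos', 'this string has Muons']

# re.IGNORECASE makes the pattern match regardless of letter case
pattern = re.compile('gluons|muons', re.IGNORECASE)
matches = [bool(pattern.search(s)) for s in strings]
print(matches)  # [True, False, True]

# re.escape() makes special characters in a literal substring match as plain text
escaped = bool(re.search(re.escape('1+1'), 'what is 1+1?'))
# Without escaping, '+' means "one or more of the preceding character"
unescaped = bool(re.search('1+1', 'what is 1+1?'))
print(escaped, unescaped)  # True False
```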

The .count() Method

The count() method searches for a specific substring in the target string. It returns the number of non-overlapping occurrences of the substring in the full string. The method has start and end as two optional arguments after the substring. In the following example, you will find count() used to retrieve how many times the word research appears in a phrase.

sentence = "How many research scientists who have published research in the world?"
sentence.count("research")

We can restrict the count to a window of the string by passing start and end indices. Here, only the characters from index 0 up to, but not including, index 24 are searched:

sentence.count("research", 0, 24)

Remember that the starting position is inclusive, but the ending position is not.
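The inclusive start and exclusive end can be checked with a short sketch. It reuses the sentence from the example above; the variable names are illustrative:

```python
sentence = "How many research scientists who have published research in the world?"

total = sentence.count("research")     # occurrences in the whole string
start = sentence.index("research")     # index where the first occurrence begins
end = start + len("research")          # index just past its last character

# The window [start, end) covers the whole word, so it is counted
whole_word = sentence.count("research", start, end)

# Ending one character earlier cuts off the final "h", so nothing is counted
cut_short = sentence.count("research", start, end - 1)

print(total, start, end, whole_word, cut_short)  # 2 9 17 1 0
```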

The .replace() Method

In some cases, you may want to replace a particular substring with a new substring. In this case, you can use the replace() method. The method has the following syntax:

string.replace(old, new, count)

Here, count is an optional argument that specifies the maximum number of occurrences of the old substring to replace with the new substring. If it is omitted, all occurrences are replaced. In the example below, the substring fox is replaced with panda.

sentence = "the quick brown fox jumps over the lazy dog"
print(sentence.replace("fox", "panda"))

the quick brown panda jumps over the lazy dog

We can specify the number of replacements to perform as shown in this example, where we only want two:

string = "the yellow house is between the blue house and the red house"
print(string.replace("house", "boat", 2))

the yellow boat is between the blue boat and the red house
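Python strings are immutable, so replace() returns a new string and leaves the original untouched. A short sketch of this behaviour, including the case where the old substring is absent:

```python
sentence = "the quick brown fox jumps over the lazy dog"

# replace() builds a new string; assign the result to keep it
new_sentence = sentence.replace("fox", "panda")
print(sentence)      # the quick brown fox jumps over the lazy dog
print(new_sentence)  # the quick brown panda jumps over the lazy dog

# If the old substring is not present, the result equals the original
unchanged = sentence.replace("cat", "panda")
print(unchanged == sentence)  # True
```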

Summary

Searching for a substring is one of the most common operations that programmers perform. Python gives you several ways to search and handle strings. The easiest way to see if a string contains a substring is to use the in operator in an if statement, which evaluates to True if the substring is found. You can also use the find() method to retrieve the index at which a substring starts, or -1 if the substring does not exist within the string. RegEx is a more complicated option, but it supports case-insensitive matching through the re.IGNORECASE flag and is useful for searching for multiple substrings using the | operator. Remember to use the lower() method with the case-sensitive methods and to place the index() method inside a try-except block. Now you are ready to search for substrings like a pro!

Thank you for reading to the end of this article as part of the Python Solution series. You can also find solutions to other common problems on this site, for example, using the Python square root function. If you want practical advice on learning Python for data science and machine learning, you can go to the Online Courses on Python.

If you want to learn about strings in another language, like C++, you can start by going to the article: How to Find the Length of a String in C++.

Have fun and happy researching!
