Python in string wildcard

Last updated: Feb 23, 2023

# Filter a list of strings using a Wildcard in Python

To filter a list of strings using a wildcard:

1. Pass the list and the pattern with the wildcard to the fnmatch.filter() method.
2. The fnmatch.filter() method will return a new list containing only the elements that match the pattern.

```python
import fnmatch

a_list = ['abc_bobby.csv', 'hadz', '!@#', 'abc_employees.csv']

pattern = 'abc_*.csv'

filtered_list = fnmatch.filter(a_list, pattern)
print(filtered_list)  # 👉️ ['abc_bobby.csv', 'abc_employees.csv']
```

The fnmatch.filter method takes an iterable and a pattern and returns a new list containing only the elements of the iterable that match the provided pattern.

The pattern in the example matches strings that start with abc_ and end with .csv.

Note that the asterisk * matches any run of characters, including an empty one (zero or more characters).

If you want to match exactly one character, replace the asterisk * with a question mark ?.

| Pattern | Meaning |
| --- | --- |
| * | Matches everything (zero or more characters) |
| ? | Matches any single character |
| [sequence] | Matches any character in sequence |
| [!sequence] | Matches any character not in sequence |
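
The bracket forms in the table are not used elsewhere in this article, so here is a minimal sketch of how they behave. The file names are made up for illustration. The first pattern also shows that * accepts an empty run of characters.

```python
import fnmatch

# Hypothetical file names, chosen only to exercise each pattern.
names = ['abc.csv', 'abc_.csv', 'file1.txt', 'file2.txt', 'fileA.txt']

# '*' also matches an empty run of characters, so 'abc.csv' qualifies.
star_matches = fnmatch.filter(names, 'abc*.csv')

# '[12]' matches exactly one character that is either '1' or '2'.
seq_matches = fnmatch.filter(names, 'file[12].txt')

# '[!12]' matches exactly one character that is neither '1' nor '2'.
not_seq_matches = fnmatch.filter(names, 'file[!12].txt')

print(star_matches)     # ['abc.csv', 'abc_.csv']
print(seq_matches)      # ['file1.txt', 'file2.txt']
print(not_seq_matches)  # ['fileA.txt']
```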

Here is an example of using the question mark to match any single character.

```python
import fnmatch

a_list = ['abc', 'abz', 'abxyz']

pattern = 'ab?'

filtered_list = fnmatch.filter(a_list, pattern)
print(filtered_list)  # 👉️ ['abc', 'abz']
```

The pattern matches a string that starts with ab followed by any single character.

The pattern in the example contains only one wildcard character, but you can use as many wildcard characters as necessary.

Here is an example of a pattern that uses two wildcard characters.

```python
import fnmatch

a_list = ['abc_bobby.csv', 'hadz', '!@#', 'abc_employees.txt']

pattern = 'abc_*.*'

filtered_list = fnmatch.filter(a_list, pattern)
print(filtered_list)  # 👉️ ['abc_bobby.csv', 'abc_employees.txt']
```

The pattern matches strings that start with abc_, followed by any characters, then a dot . and then any characters (for example, any file extension).

You can also use the fnmatch.fnmatch() method instead of the fnmatch.filter() method.

```python
import fnmatch

a_list = ['abc_bobby.csv', 'hadz', '!@#', 'abc_employees.csv']

pattern = 'abc_*.csv'

# keep only the items for which fnmatch.fnmatch() returns True
filtered_list = [
    item for item in a_list
    if fnmatch.fnmatch(item, pattern)
]

print(filtered_list)  # 👉️ ['abc_bobby.csv', 'abc_employees.csv']
```

The fnmatch.fnmatch method takes a string and a pattern as arguments.

The method returns True if the string matches the pattern and False otherwise.

We used a list comprehension to iterate over the list of strings and called the fnmatch.fnmatch() method on each string in the list.

List comprehensions are used to perform some operation for every element or select a subset of elements that meet a condition.

The new list only contains the strings that match the pattern.

# Check if a string matches a pattern using a wildcard

If you want to check if a string matches a pattern using a wildcard, use the fnmatch.fnmatch() method.

```python
import fnmatch

a_string = '2023_bobby.txt'
pattern = '2023*.txt'

matches_pattern = fnmatch.fnmatch(a_string, pattern)
print(matches_pattern)  # 👉️ True

if matches_pattern:
    # 👇️ this runs
    print('The string matches the pattern')
else:
    print('The string does NOT match the pattern')
```

The pattern starts with 2023, followed by zero or more characters, and ends with .txt.

Replace the asterisk * with a question mark ? if you want to match exactly one character instead of zero or more characters.
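
To make the difference between the two wildcards concrete, here is a small sketch that runs both patterns against a few made-up file names. Note that * accepts an empty run of characters, while ? requires exactly one character.

```python
import fnmatch

pattern_star = '2023*.txt'
pattern_question = '2023?.txt'

# '*' accepts zero characters between '2023' and '.txt'.
star_empty = fnmatch.fnmatch('2023.txt', pattern_star)

# '?' needs exactly one character, so zero characters is not enough...
question_empty = fnmatch.fnmatch('2023.txt', pattern_question)

# ...one character is fine...
question_one = fnmatch.fnmatch('2023a.txt', pattern_question)

# ...and several characters are too many.
question_many = fnmatch.fnmatch('2023_bobby.txt', pattern_question)

print(star_empty)      # True
print(question_empty)  # False
print(question_one)    # True
print(question_many)   # False
```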

Alternatively, you can use a regular expression.
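
One way to move from a wildcard pattern to a regular expression, shown here only as a sketch, is the standard library's fnmatch.translate() function. It returns a regex string equivalent to the shell-style pattern. The exact string it produces can differ between Python versions, so the example does not print it.

```python
import fnmatch
import re

a_list = ['abc_bobby.csv', 'hadz', '!@#', 'abc_employees.csv']

# fnmatch.translate() converts the wildcard pattern into a regex string,
# which is then compiled once and reused for every item.
regex = re.compile(fnmatch.translate('abc_*.csv'))

filtered_list = [item for item in a_list if regex.match(item)]

print(filtered_list)  # ['abc_bobby.csv', 'abc_employees.csv']
```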

# Filter a list of strings using a Wildcard with a regex

This is a three-step process:

1. Use a list comprehension to iterate over the list.
2. Use the re.match() method to check if each string matches the pattern.
3. The new list will only contain the strings that match the pattern.

```python
import re

a_list = ['abc_bobby.csv', 'hadz', '!@#', 'abc_employees.csv']

regex = re.compile(r'abc_.*\.csv')

# call match() on the compiled pattern object for each item
filtered_list = [
    item for item in a_list
    if regex.match(item)
]

print(filtered_list)  # 👉️ ['abc_bobby.csv', 'abc_employees.csv']
```

The re.compile method compiles a regular expression pattern into an object, which can be used for matching using its match() or search() methods.

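This can be slightly more efficient than calling re.match or re.search with a pattern string inside a loop, because the compiled object is reused. The re module also caches recently compiled patterns, so the difference is usually small.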

The regular expression in the example starts with abc_.

```python
regex = re.compile(r'abc_.*\.csv')
```

The dot . matches any character except a newline character.

The asterisk * matches the preceding regular expression (the dot . ) zero or more times.
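
One detail worth checking, sketched below with made-up file names, is that match() only anchors the pattern at the start of the string, so text after .csv is still accepted. If the whole string has to fit the pattern, as it does with fnmatch, fullmatch() is one option.

```python
import re

regex = re.compile(r'abc_.*\.csv')

# Hypothetical names: the third has a trailing '.bak',
# the fourth has no literal dot before 'csv'.
names = ['abc_.csv', 'abc_report.csv', 'abc_report.csv.bak', 'abc_reportXcsv']

# match() anchors only at the start, so trailing text is allowed.
prefix_matches = [name for name in names if regex.match(name)]

# fullmatch() requires the entire string to fit the pattern.
full_matches = [name for name in names if regex.fullmatch(name)]

print(prefix_matches)  # ['abc_.csv', 'abc_report.csv', 'abc_report.csv.bak']
print(full_matches)    # ['abc_.csv', 'abc_report.csv']
```

The first name matches because .* may match zero characters. The last one is rejected in both cases because the escaped dot \. only matches a literal dot.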

# Wildcard Search in a String in Python

The wildcard name comes from a card game, where a single card can represent any other card. The wildcard metacharacter is similar. It is represented by a dot (.) and matches any character, except for a new line character (\n).

For example, if we have a RegEx: s.n

It matches: son, sun, but not soon, seen.

It will also match characters, such as space or dot: s n, s.n.

This metacharacter represents only a single character inside a string.

This is what the Python implementation looks like:

If you run the code, you will get this result:
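
As a minimal sketch, assuming the pattern s.n and the sample strings mentioned above, the check could be written like this. The variable names are illustrative, and fullmatch() is used so that the whole word has to fit the pattern.

```python
import re

# The dot stands for exactly one character (anything except a newline).
pattern = re.compile(r's.n')

words = ['son', 'sun', 'soon', 'seen', 's n', 's.n']

matches = [word for word in words if pattern.fullmatch(word)]

print(matches)  # ['son', 'sun', 's n', 's.n']
```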

Most common mistake

There is a common mistake that people make using the wildcard character.

If you work with decimal fractions, you may be tempted to write the following RegEx: 5.40

It will match 5.40, but also 5 40, 5_40, 5-40, 5740, etc.

A good regular expression matches the type of text you want to match, and only that type of text, nothing more.

If you want to escape a metacharacter, you have to use another metacharacter, the backslash (\).

When you escape a metacharacter, you tell the RegEx engine that the character that follows should be treated as a literal character. For the decimal fraction, the escaped RegEx is 5\.40.

This time, the RegEx engine matches only 5.40.
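
The difference between the unescaped and the escaped dot can be checked with a short sketch. The candidate strings are the ones listed above. re.escape() is a standard-library helper that adds the backslashes for you.

```python
import re

candidates = ['5.40', '5 40', '5_40', '5-40', '5740']

unescaped = re.compile(r'5.40')   # the dot matches any single character
escaped = re.compile(r'5\.40')    # the dot matches only a literal '.'

loose = [c for c in candidates if unescaped.fullmatch(c)]
strict = [c for c in candidates if escaped.fullmatch(c)]

print(loose)   # ['5.40', '5 40', '5_40', '5-40', '5740']
print(strict)  # ['5.40']

# re.escape() escapes the metacharacters in a literal string.
print(re.escape('5.40'))  # 5\.40
```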


Wildcard Matching in Python

Example (Python)

Input