Like function in Python

Python string find, like and contains examples

Python offers several ways to check whether a string contains a substring. This article covers the following:

  • test_string in other_string — return True/False
  • test_word.startswith(word) — return True/False
  • word.find(test_word) — return index

Like operator in Python — in

A contains or like check in Python can be done with the in operator:

test_string in other_string 

This will return True or False depending on the result of the execution.

As we can see from the examples below, it is case sensitive. You can make it case insensitive by using mystr.lower():

print('lemon' in 'lemon pie')        # True
print('lemon' in 'Lemon')            # False
print('LeMoN'.lower() in 'lemon')    # True
print('lemon' in 'lemon')            # True
print('lemon' in 'Hot lemon pie')    # True
print('lemon' in 'orange juice')     # False
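The lower() approach can be wrapped in a small helper. This is a minimal sketch: contains_ci is a hypothetical name that does not appear in the original article, and it uses str.casefold(), which behaves like lower() but also folds characters such as the German ß.

```python
def contains_ci(needle, haystack):
    """Case-insensitive substring test (hypothetical helper, not from the article)."""
    # casefold() is a more aggressive lower(): for example 'ß' becomes 'ss'
    return needle.casefold() in haystack.casefold()


print(contains_ci('LeMoN', 'Hot lemon pie'))   # True
print(contains_ci('lemon', 'orange juice'))    # False
```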

Python string — find()

Another way to check whether a string contains a substring is the .find() method, which returns the index of the first match, or -1 if there is none:

sentence = "I want a cup of lemon juice"
test_word = "lemon"
test_word_Up = "LEMON"

print(sentence.find(test_word))       # 16
print(sentence.find(test_word_Up))    # -1, find() is case sensitive
print(sentence.find(test_word, 5))    # 16, the search starts at index 5
print(sentence.find(test_word, 20))   # -1, the search starts after the match
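Because find() accepts a start index, it can be called repeatedly to collect every occurrence. The sketch below assumes a hypothetical helper named find_all and a slightly longer example sentence; neither is part of the original article.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        return []  # an empty substring would otherwise loop forever
    positions = []
    start = text.find(sub)
    while start != -1:  # find() signals "not found" with -1
        positions.append(start)
        start = text.find(sub, start + len(sub))
    return positions


sentence = "I want a cup of lemon juice with lemon zest"
print(find_all(sentence, "lemon"))   # [16, 33]
print(find_all(sentence, "LEMON"))   # []
```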

Check list of strings — exact match

If you have a list of strings or sentences you can check them by:

forbidden_list = ['apple juice', 'banana pie', 'orange juice', 'lemon pie', 'lemon']
test_word = 'lemon'

if test_word in forbidden_list:
    print(test_word)

The check above tests whether any item in the list is exactly the word lemon, so it prints lemon.

Testing with the word ‘apple’ the result is empty, because the list contains ‘apple juice’ but no item equal to ‘apple’.

This check is useful when you have a predefined list of values and want to verify that a tested value is not in it.

Python find string in list

In this section we will test whether a string is a substring of any of the list items. To do so we loop over each string and test it with in, .find() or .startswith():

forbidden_list = ['apple juice', 'banana pie', 'orange juice', 'lemon pie', 'lemon']
test_word = 'lemon'

for word in forbidden_list:
    if word.startswith(test_word):
        print(word)

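This checks whether the word is a prefix of any of the strings in the list, and prints lemon pie and lemon. To match the word anywhere in a string, use test_word in word instead of .startswith().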
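The three kinds of test give different results on the same data. The sketch below puts them side by side; the list named items and its extra entry 'hot lemon tea' are illustrative assumptions added to show the difference.

```python
items = ['apple juice', 'banana pie', 'lemon pie', 'lemon', 'hot lemon tea']
test_word = 'lemon'

# exact match: the whole item must equal the word
exact = [s for s in items if s == test_word]
# prefix match: the item must start with the word
prefix = [s for s in items if s.startswith(test_word)]
# substring match: the word may appear anywhere in the item
substring = [s for s in items if test_word in s]

print(exact)      # ['lemon']
print(prefix)     # ['lemon pie', 'lemon']
print(substring)  # ['lemon pie', 'lemon', 'hot lemon tea']
```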

Compare two lists of strings

Often there is a need to filter many strings against a list of forbidden values. This can be done by:

  • iterate over the list of search words
  • go through all forbidden words
  • check whether the given word is the start of a forbidden word:

forbidden_list = ['apple', 'banana', 'orange', 'lemon', 'kiwi', 'mango']
search_words = ['apple', 'orange', 'lemon']

for test_word in search_words:
    if any(word.startswith(test_word) for word in forbidden_list):
        print(test_word)

This way you are free to choose which test to apply: exact match, prefix match or substring match.

For an exact match you can also use:

diff_list = list(set(forbidden_list) & set(search_words)) 
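A set intersection does not keep the original order and drops duplicates. If the order of the search words matters, a list comprehension over a set lookup is one alternative. This is a sketch; the search word 'grape' is an added illustrative value.

```python
forbidden_list = ['apple', 'banana', 'orange', 'lemon', 'kiwi', 'mango']
search_words = ['lemon', 'apple', 'grape']

# A set gives fast membership tests; the comprehension keeps search order
forbidden = set(forbidden_list)
matches = [word for word in search_words if word in forbidden]

print(matches)  # ['lemon', 'apple']
```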

Python string contains or like operator

Below you can find two methods which simulate string contains or like behavior using Python:

Testing string against list of string (substring)

To check whether the given words are part of a list of strings, you can use:

forbidden_list = ['apple', 'banana', 'orange', 'lemon', 'kiwi', 'mango']
forbidden_like_list = ['apple juice', 'banana pie', 'orange juice', 'lemon pie']
search_words = ['apple', 'banana', 'orange', 'lemon']

for test_word in search_words:
    if any(word.startswith(test_word) for word in forbidden_like_list):
        print(test_word)

print('--------------------------------')

test_word = 'lemon'
if any(test_word == word for word in forbidden_list):
    # print test_word: the generator variable word is not visible here
    print(test_word)

Python like function

This method checks whether the items of one list appear in another list. This can be used as a filter for messages.

forbidden_list = ['apple', 'banana', 'orange', 'lemon', 'kiwi', 'mango']
search_words = ['apple', 'banana', 'orange', 'lemon']

def string_like(search_words, forbidden_list):
    for line in forbidden_list:
        if any(word in line for word in search_words):
            print(line)

string_like(search_words, forbidden_list)

Output:

apple
banana
orange
lemon

Python like/contains operator

The implementation of this method is similar to the previous one, except that the check verifies that a string contains another string:

forbidden_like_list = ['apple juice', 'banana pie', 'orange juice', 'lemon pie']
search_words = ['apple', 'banana', 'orange', 'lemon']

def string_in(search_words, forbidden_like_list):
    for line in forbidden_like_list:
        if any(word in line for word in search_words):
            print(line)

string_in(search_words, forbidden_like_list)

Output:

apple juice
banana pie
orange juice
lemon pie
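For readers who want the SQL wildcards themselves, a LIKE pattern can be translated into a regular expression. The sketch below is one possible approach and is not taken from the article: sql_like is a hypothetical helper, it supports only % and _, and it does not handle an ESCAPE clause.

```python
import re


def sql_like(value, pattern, case_sensitive=True):
    """Hypothetical helper: test value against a SQL LIKE style pattern."""
    parts = []
    for ch in pattern:
        if ch == '%':
            parts.append('.*')           # % matches any run of characters
        elif ch == '_':
            parts.append('.')            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    flags = re.DOTALL
    if not case_sensitive:
        flags |= re.IGNORECASE
    # fullmatch: like SQL LIKE, the pattern must cover the whole value
    return re.fullmatch(''.join(parts), value, flags) is not None


print(sql_like('lemon pie', 'lemon%'))        # True
print(sql_like('hot lemon pie', '%lemon%'))   # True
print(sql_like('lemon pie', 'lemon'))         # False
```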

Working toward a package that will implement a ‘like’ function that compares two strings to determine similarity. See more in the README.

Ipgnosis/like

README.md

This is a work in progress.

A string compare function that analyzes to what extent string A is similar to string B.

This idea came from working on a different project where I kept misspelling the country name ‘Kazakhstan’ as ‘Khazakstan’. I searched for a ‘like’ function in Python and found none. This is strange, because this has been implemented in other languages before: there are even ‘sounds like’ functions (e.g. ‘soundex’). So, just for fun, I thought I would give it a shot. I expect that this will be a lot easier than implementing a spelling checker (or learning how to spell).

An extension of the Python string object that adds a ‘like’ operator to test equivalence. This will enable ‘Khazakstan’ to match to ‘Kazakhstan’.

Note that this is not a spelling checker.

The initial implementation will evaluate some kind of probability function. As a comparison, I will test against the work of Damerau-Levenshtein.

This will be useful for handling ‘typos’ resulting from keyboarding errors (aka transpositions, e.g. ‘typo’ vs. ‘tyop’) that have found their way into data, thereby making it difficult to search upon.
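For reference, the transposition case can be illustrated with the optimal string alignment distance, a restricted variant of Damerau-Levenshtein. This is a sketch for comparison only and is not code from this repository; osa_distance is a hypothetical name.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition, e.g. 'po' vs 'op'
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance('typo', 'tyop'))              # 1
print(osa_distance('Khazakstan', 'Kazakhstan'))  # 2
```

A plain Levenshtein distance would count the ‘typo’ / ‘tyop’ swap as two edits; treating the transposition as a single edit is what makes this family of measures a natural baseline for keyboarding errors.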

This is particularly useful when a string contains international characters that aren’t available on all keyboards, for example:

  • ñ: the Spanish letter ‘eñe’
  • ü: the German (etc.) letter u with an umlaut
  • ç: the French c-cedilla
  • ß: the German letter ‘eszett’

(I suspect that the Cyrillic character set is overly ambitious.)

Note that the eszett (and others also) is a complication of the general problem in that two letters (i.e. ‘ss’) can be substituted for the eszett when the keyboard/character set in use doesn’t contain the eszett. For example: the German word for ‘street’ is ‘straße’, which can also be written ‘strasse’.
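One way such characters could be normalized before comparison is with the standard library. This is a sketch under assumptions, not the project's implementation: fold is a hypothetical helper that relies on str.casefold() (which maps ‘ß’ to ‘ss’) and on Unicode NFKD decomposition to strip combining accents. Note that it maps ‘ü’ to ‘u’, not to the German transliteration ‘ue’.

```python
import unicodedata


def fold(text):
    """Hypothetical normalizer: case-fold, then strip combining accents."""
    folded = text.casefold()  # 'ß' becomes 'ss'
    decomposed = unicodedata.normalize('NFKD', folded)
    # drop combining marks such as the tilde, umlaut dots and cedilla
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold('Straße') == fold('strasse'))           # True
print(fold('señor'), fold('für'), fold('façade'))  # senor fur facade
```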

def like(strA, strB, args1 = float)

  • If args[0] then return True (if similarity >= args[0]) or False
  • If not args[0] then return float 0:1 — 0 = completely dissimilar; 1 = exact match.
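The interface described above could be prototyped as follows. This is only a sketch: it assumes the optional argument is a single similarity threshold, and it uses difflib.SequenceMatcher from the standard library as a stand-in scorer, which is not the probability function the project intends to build.

```python
from difflib import SequenceMatcher


def like(str_a, str_b, threshold=None):
    """Sketch of the planned interface, scored with difflib as a stand-in."""
    score = SequenceMatcher(None, str_a.casefold(), str_b.casefold()).ratio()
    if threshold is None:
        return score           # float in [0, 1]; 1.0 means exact match
    return score >= threshold  # True or False when a threshold is given


print(like('Kazakhstan', 'Kazakhstan'))       # 1.0
print(like('Khazakstan', 'Kazakhstan', 0.8))  # True
print(like('lemon', 'orange', 0.8))           # False
```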

The test data (data/test_data.json) is extracted from a Wikipedia article on the most misspelled words: Commonly misspelled English words.

How to Use Like Operator in Pandas DataFrame

Let’s check several examples of Pandas text matching that simulate the SQL Like operator.

To start, here is a sample DataFrame which will be used in the next examples:

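import pandas as pd

# values reconstructed from the table below; the class of crow is missing (None)
data = {
    'num_legs': [4, 2, 0, 4, 2, 2],
    'num_wings': [0, 2, 0, 0, 2, 2],
    'class': ['mammal', 'bird', 'fish', 'mammal', None, 'mammal'],
}
df = pd.DataFrame(data, index=['dog', 'hawk', 'shark', 'cat', 'crow', 'human'])

       num_legs  num_wings   class
dog           4          0  mammal
hawk          2          2    bird
shark         0          0    fish
cat           4          0  mammal
crow          2          2    None
human         2          2  mammal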

Example 1: Pandas find rows which contain string

The first example filters the rows of a DataFrame based on cell content: if the cell contains a given pattern the row is kept, otherwise it is skipped. Let’s get all rows for which column class contains the letter i:

df['class'].str.contains('i', na=False) 

This will result in a Series of True and False:

dog      False
hawk      True
shark     True
cat      False
crow     False
human    False

If you would like to get the whole row, you can use: df[df['class'].str.contains('i', na=False)]

       num_legs  num_wings class
hawk          2          2  bird
shark         0          0  fish

Note: na=False treats rows with None values as non-matching, so they are skipped. If you need them, use na=True. If the na parameter is not specified and the column contains missing values, using the result as a filter raises an error:

ValueError: Cannot mask with non-boolean array containing NA / NaN values

Example 2: Pandas simulate Like operator and regex

The second example demonstrates Pandas contains plus regex. Regex matching is activated by regex=True (the default). The pipe in 'sh|rd' works as or:

df[df['class'].str.contains('sh|rd', regex=True, na=True)] 

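The code above returns all rows whose class contains sh or rd. Because of na=True, rows with a missing class (crow) are returned as well:

       num_legs  num_wings class
hawk          2          2  bird
shark         0          0  fish
crow          2          2  None

Other regex examples:

  • match rows which contain digits: df['class'].str.contains(r'\d', regex=True)
  • match rows case insensitively: df['class'].str.contains('bird', flags=re.IGNORECASE, regex=True) (this requires import re)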

Note: Using regular expressions can slow the operation down considerably on larger DataFrames.
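When a regular expression is not needed, str.contains also accepts case=False for case-insensitive matching and regex=False for a literal match. A minimal sketch, assuming a reduced version of the sample DataFrame with only the class column:

```python
import pandas as pd

# Reduced copy of the sample data: only the 'class' column is needed here
df = pd.DataFrame(
    {'class': ['mammal', 'bird', 'fish', 'mammal', None, 'mammal']},
    index=['dog', 'hawk', 'shark', 'cat', 'crow', 'human'],
)

# Case-insensitive match without importing re
birds = df[df['class'].str.contains('BIRD', case=False, na=False)]
print(birds)

# Literal match: the pattern is treated as plain text, not as a regex
literal = df['class'].str.contains('i', regex=False, na=False)
print(literal)
```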

Example 3: Pandas match rows starting with text

Let’s find all rows whose index starts with the letter h by using the function str.startswith:

df[df.index.str.startswith('h', na=False)] 
       num_legs  num_wings   class
hawk          2          2    bird
human         2          2  mammal

Example 4: Pandas match rows ending with text

The same logic can be applied with the function .str.endswith in order to find rows whose values end with a given string:

df[df.index.str.endswith('k', na=False)] 
       num_legs  num_wings class
hawk          2          2  bird
shark         0          0  fish

Example 5: Pandas Like operator with Query

Pandas queries can simulate the Like operator as well. Before the example, two points are worth noting:

  • naming columns with reserved words like class is dangerous and might cause errors
  • the other common cause of errors is None values.

So in order to use query plus str.contains we need to rename column class to classd and fill the None values.
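The preparation step is not shown in the article. One way to do it, as a sketch that assumes the sample data from above and an empty string as the fill value:

```python
import pandas as pd

data = {
    'num_legs': [4, 2, 0, 4, 2, 2],
    'num_wings': [0, 2, 0, 0, 2, 2],
    'class': ['mammal', 'bird', 'fish', 'mammal', None, 'mammal'],
}
df = pd.DataFrame(data, index=['dog', 'hawk', 'shark', 'cat', 'crow', 'human'])

# 'class' is a reserved word in Python, so give the column a safe name
df = df.rename(columns={'class': 'classd'})
# Replace missing values; the empty string is an assumed fill value
df['classd'] = df['classd'].fillna('')

print(df['classd'].tolist())
```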

df.query('classd.str.contains("i")', engine='python') 
       num_legs  num_wings classd
hawk          2          2   bird
shark         0          0   fish

or combination with other conditions:

df.query('classd.str.contains("i") and classd.str.endswith("d") ', engine='python') 
num_legs num_wings classd
hawk 2 2 bird

Example 6: Pandas Like operator match numbers only

For this example we are going to use a Series of strings, most of which hold numbers:

s = pd.Series(['20.03', '11', '23.0', '65', '60', 'a', None]) 

Return all rows with numbers:

[True, True, True, True, True, False, None] 
[True, False, True, False, False, False, None] 

How do we filter for decimal numbers which have 0 after the point, like 20.03 and 23.0? Is the pattern .0 good enough?

No, because 60 is matched too:

[True, False, True, False, True, False, None] 

The reason is that in a regex the pattern .0 matches any character followed by a 0. Searching for floating numbers with a dot followed by 0 requires escaping the dot, as in \.0:

[True, False, True, False, False, False, None] 
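The calls that produced these results are not shown in the article. The sketch below uses patterns that are assumptions chosen to match the described behavior, and passes na=False so that the missing value becomes False where the results above show None.

```python
import pandas as pd

s = pd.Series(['20.03', '11', '23.0', '65', '60', 'a', None])

# Any digit anywhere in the string (assumed pattern)
has_digit = s.str.contains(r'\d', regex=True, na=False)
# A digit, a literal dot, then a digit: decimal numbers (assumed pattern)
is_decimal = s.str.contains(r'\d\.\d', regex=True, na=False)
# Unescaped dot: any character followed by 0, so '60' matches too
loose = s.str.contains('.0', regex=True, na=False)
# Escaped dot: a literal '.' followed by 0
strict = s.str.contains(r'\.0', regex=True, na=False)

print(has_digit.tolist())
print(is_decimal.tolist())
print(loose.tolist())
print(strict.tolist())
```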

Example 7: Pandas SQL Like operator

There is a Python module, pandasql, which allows SQL syntax for Pandas. It can be installed by:

pip install pandasql

and then used like this:

from pandasql import sqldf
pysqldf = lambda q: sqldf(q, globals())
sqldf("select * from df where classd like 'h%';", locals())
