
Replace strings in Python (replace, translate, re.sub, re.subn)

In Python, there are several ways to replace strings. You can use the replace() method for simple replacements, translate() for character-to-character replacements, and the regular expression methods re.sub() and re.subn() for more complex pattern-based replacements. Additionally, slicing can be used to replace substrings at specified positions.

  • Replace substrings: replace()
    • Basic usage
    • Specify the maximum count of replacements: count
    • Replace multiple different substrings
    • Swap strings
    • Replace newline character
  • Replace multiple different characters: translate()
    • Basic usage
    • Swap characters
  • Replace by regex: re.sub(), re.subn()
    • Basic usage
    • Replace multiple substrings with the same string
    • Replace using the matched part
    • Get the count of replaced parts
  • Replace by position: slice

    You can also remove a substring by replacing it with an empty string ''.

    For extracting substrings or finding their positions, see the following articles.

    There are also methods to convert between uppercase and lowercase letters.

    If you want to replace the contents of a text file, read the file as a string, process it, and save it again.
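    The read, process, and save cycle can be sketched as follows. This is a minimal illustration that assumes the file is UTF-8 encoded and small enough to read into memory at once; replace_in_file and sample.txt are hypothetical names chosen for this example only.

```python
import tempfile
from pathlib import Path


def replace_in_file(path, old, new, encoding='utf-8'):
    """Read the whole file, replace old with new, and write it back."""
    p = Path(path)
    text = p.read_text(encoding=encoding)
    p.write_text(text.replace(old, new), encoding=encoding)


# Demonstration on a throwaway file inside a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / 'sample.txt'  # hypothetical file name
    sample.write_text('one two one', encoding='utf-8')
    replace_in_file(sample, 'one', 'XXX')
    result = sample.read_text(encoding='utf-8')

print(result)
# XXX two XXX
```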

    Replace substrings: replace()

    Basic usage

    Use the replace() method to replace substrings.

    Specify the old string old for the first argument and the new string new for the second argument.

    s = 'one two one two one'
    print(s.replace(' ', '-'))  # one-two-one-two-one

    You can remove old by specifying new as the empty string ''.

    print(s.replace(' ', '')) # onetwoonetwoone 

    Specify the maximum count of replacements: count

    You can set the maximum number of replacements with the third parameter, count. If count is given, only the first count occurrences are replaced.

    s = 'one two one two one'
    print(s.replace('one', 'XXX'))  # XXX two XXX two XXX
    print(s.replace('one', 'XXX', 2))  # XXX two XXX two one

    Replace multiple different substrings

    To replace multiple different strings with the same string, use regular expressions as described below.

    There is no method to replace multiple different strings with different ones directly, but you can apply replace() repeatedly.

    s = 'one two one two one'
    print(s.replace('one', 'XXX').replace('two', 'YYY'))  # XXX YYY XXX YYY XXX

    The replace() calls are applied sequentially, so if the first new contains the second old, that part of the first replacement is replaced as well. The order of the calls therefore matters.

    print(s.replace('one', 'XtwoX').replace('two', 'YYY'))  # XYYYX YYY XYYYX YYY XYYYX
    print(s.replace('two', 'YYY').replace('one', 'XtwoX'))  # XtwoX YYY XtwoX YYY XtwoX

    To replace multiple individual characters (strings of length 1), you can use the translate() method, which is explained later in this article.
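    One way to avoid the ordering problem is to do all replacements in a single pass with re.sub() and a lookup dictionary. The following is a sketch rather than a built-in feature; replace_multiple is a hypothetical helper name, and it treats every key as a literal string.

```python
import re


def replace_multiple(s, mapping):
    """Replace every key of mapping with its value in a single pass."""
    if not mapping:
        return s
    # Try longer keys first so a key is not shadowed by one of its prefixes
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    # Each match is looked up in the original mapping, so replaced text
    # is never scanned again
    return pattern.sub(lambda m: mapping[m.group(0)], s)


s = 'one two one two one'
print(replace_multiple(s, {'one': 'XtwoX', 'two': 'YYY'}))
# XtwoX YYY XtwoX YYY XtwoX
```

    Because the string is scanned only once, the 'two' inside 'XtwoX' is left alone regardless of the order of the dictionary entries.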

    Swap strings

    If you want to swap two strings, replacing them sequentially, as described above, may not work.

    s = 'one two one two one'
    print(s.replace('one', 'two').replace('two', 'one'))  # one one one one one

    First, you should replace the target string with a temporary string.

    print(s.replace('one', 'X').replace('two', 'one').replace('X', 'two')) # two one two one two 

    You can define a function for this swapping operation as follows:

    def swap_str(s_org, s1, s2, temp='*q@w-e~r^'):
        return s_org.replace(s1, temp).replace(s2, s1).replace(temp, s2)

    print(swap_str(s, 'one', 'two'))  # two one two one two

    Note that this function does not work if the temporary string temp is included in the original string. To make it more reliable, check whether the original string contains temp; if it does, use a different string for temp. In the above example, temp is simply assigned an arbitrary string.
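    A more defensive variant could pick the placeholder automatically. The sketch below assumes that a single character absent from all three strings is a safe placeholder and searches the Unicode private use area for one; find_unused_char and swap_str_safe are hypothetical names for this illustration.

```python
def find_unused_char(*strings):
    """Return a character that occurs in none of the given strings."""
    used = set().union(*strings)
    # Assumption: private-use code points rarely appear in ordinary text
    for code in range(0xE000, 0xF900):
        if chr(code) not in used:
            return chr(code)
    raise ValueError('no unused placeholder character available')


def swap_str_safe(s_org, s1, s2):
    """Swap s1 and s2 using a placeholder that is absent from the inputs."""
    temp = find_unused_char(s_org, s1, s2)
    return s_org.replace(s1, temp).replace(s2, s1).replace(temp, s2)


s = 'one two *q@w-e~r^ one'
print(swap_str_safe(s, 'one', 'two'))
# two one *q@w-e~r^ two
```

    Here the input deliberately contains the arbitrary default placeholder from the earlier function, and it is preserved unchanged.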

    To swap multiple individual characters (strings of length 1), you can use the translate() method, which is explained later in this article.

    Replace newline character

    If the string contains only one type of newline character, you can specify it as the first argument of replace().

    s_lines = 'one\ntwo\nthree'
    print(s_lines)
    # one
    # two
    # three
    print(s_lines.replace('\n', '-'))  # one-two-three

    Be careful if \n (LF, used in Unix-like OSes including macOS) and \r\n (CR + LF, used in Windows) are mixed. Since \n is contained in \r\n, the result depends on the order of replacement: replacing \n first leaves a stray \r behind.
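    The effect of the order can be seen in a short example. The string s_mixed is a made-up sample that contains both kinds of newline; repr() is used so that the leftover \r is visible.

```python
s_mixed = 'one\ntwo\r\nthree'  # sample string mixing LF and CR + LF

# Replacing '\n' first leaves the '\r' of '\r\n' behind
lf_first = s_mixed.replace('\n', '-').replace('\r\n', '-')
print(repr(lf_first))
# 'one-two\r-three'

# Replacing the longer '\r\n' first handles both kinds of newline
crlf_first = s_mixed.replace('\r\n', '-').replace('\n', '-')
print(repr(crlf_first))
# 'one-two-three'
```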

    You can use splitlines(), which returns a list split at various newline characters, and join(), which combines a list of strings. This method is safer and recommended, especially when the types of newline characters included are unknown.

    s_lines_multi = 'one\ntwo\r\nthree'  # example string mixing \n and \r\n
    print(s_lines_multi.splitlines())  # ['one', 'two', 'three']
    print('-'.join(s_lines_multi.splitlines()))  # one-two-three

    For more information on handling line breaks in strings, see the following article:

    Replace multiple different characters: translate()

    Basic usage

    Use the translate() method to replace multiple different characters. You can create the translation table passed to translate() with str.maketrans().

    Specify a dictionary with the old character as the key and the new string as the value in str.maketrans().

    The old character must be a single character (a string of length 1). The new value is a string or None, where None removes the old character.

    s = 'one two one two one'
    print(s.translate(str.maketrans({'o': 'O', 't': 'T'})))  # One TwO One TwO One
    print(s.translate(str.maketrans({'o': 'XXX', 't': None})))  # XXXne wXXX XXXne wXXX XXXne

    You can also pass strings to str.maketrans() instead of a dictionary. The first argument is a string of the old characters, the second is a string of the new characters, and the third is a string of the characters to be deleted. The third argument is optional.

    print(s.translate(str.maketrans('ot', 'OT', 'n'))) # Oe TwO Oe TwO Oe 

    In this case, the lengths of the first and second arguments must be the same; otherwise a ValueError is raised.

    # print(s.translate(str.maketrans('ow', 'OTX', 'n')))
    # ValueError: the first two maketrans arguments must have equal length
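    It may help to see what str.maketrans() actually produces: a dictionary that maps Unicode code points (integers) to replacements. The short sketch below compares the generated table with an equivalent hand-written one; the variable names are chosen for illustration only.

```python
s = 'one two one two one'

table = str.maketrans('ot', 'OT', 'n')

# The table maps code points to code points, or to None for deletion
expected = {ord('o'): ord('O'), ord('t'): ord('T'), ord('n'): None}
print(table == expected)
# True

# translate() also accepts such a dictionary built by hand
manual = s.translate({ord('o'): 'O', ord('t'): 'T', ord('n'): None})
print(manual)
# Oe TwO Oe TwO Oe
```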

    Swap characters

    You can swap characters with translate(). Unlike chained replace() calls, all characters are mapped in a single pass.

    s = 'one two one two one'
    print(s.replace('o', 't').replace('t', 'o'))  # one owo one owo one
    print(s.translate(str.maketrans({'o': 't', 't': 'o'})))  # tne owt tne owt tne
    print(s.translate(str.maketrans('ot', 'to')))  # tne owt tne owt tne

    Replace by regex: re.sub(), re.subn()

    If you want to replace a string that matches a regular expression (regex) instead of an exact match, use sub() of the re module.

    Basic usage

    In re.sub(), specify a regex pattern in the first argument, a new string in the second, and the string to be processed in the third.

    import re

    s = 'aaa@xxx.com bbb@yyy.net ccc@zzz.org'
    print(re.sub('[a-z]+@', 'ABC@', s))  # ABC@xxx.com ABC@yyy.net ABC@zzz.org

    As with replace(), you can specify the maximum count of replacements in the fourth argument, count.

    print(re.sub('[a-z]+@', 'ABC@', s, 2)) # ABC@xxx.com ABC@yyy.net ccc@zzz.org 
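    Passing count by keyword is arguably the safer habit. The fifth parameter of re.sub() is flags, and a flag passed positionally in the fourth slot is silently interpreted as a count. To my knowledge, Python 3.13 and later also deprecate passing count and flags positionally. A small sketch with keyword arguments:

```python
import re

s = 'aaa@xxx.com bbb@yyy.net ccc@zzz.org'

# count as a keyword argument: only the first two matches are replaced
limited = re.sub('[a-z]+@', 'ABC@', s, count=2)
print(limited)
# ABC@xxx.com ABC@yyy.net ccc@zzz.org

# flags as a keyword argument: the uppercase pattern matches lowercase text
ignore_case = re.sub('[A-Z]+@', 'ABC@', s, flags=re.IGNORECASE)
print(ignore_case)
# ABC@xxx.com ABC@yyy.net ABC@zzz.org
```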

    You can also create a regular expression pattern object using re.compile() and call the sub() method. This approach is more efficient when the same regular expression pattern needs to be used repeatedly.

    p = re.compile('[a-z]+@')
    print(p.sub('ABC@', s))  # ABC@xxx.com ABC@yyy.net ABC@zzz.org

    For more information on the re module, see the following article.

    Replace multiple substrings with the same string

    The following two patterns are useful to remember even if you are unfamiliar with regex.

    Enclose characters in [] to match any single one of them. This lets you replace multiple different characters with the same string.

    s = 'aaa@xxx.com bbb@yyy.net ccc@zzz.org'
    print(re.sub('[xyz]', '1', s))  # aaa@111.com bbb@111.net ccc@111.org

    Patterns delimited by | match any one of the alternatives. Each alternative may use regex special characters, but plain strings also work as they are. This lets you replace multiple different strings with the same string.

    print(re.sub('com|net|org', 'biz', s)) # aaa@xxx.biz bbb@yyy.biz ccc@zzz.biz 
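    Plain strings work as alternatives only as long as they contain no regex special characters such as . or +. If the targets come from arbitrary text, one option is to escape them with re.escape() before joining them with |. The sample string and the targets list below are made up for illustration.

```python
import re

s = 'a.b a+b axb'
targets = ['a.b', 'a+b']  # literal strings containing special characters

# Escape each target so that '.' and '+' are matched literally
pattern = '|'.join(re.escape(t) for t in targets)
escaped = re.sub(pattern, 'X', s)
print(escaped)
# X X axb

# Without escaping, '.' matches any character, so 'a+b' and 'axb' match too
unescaped = re.sub('a.b', 'X', s)
print(unescaped)
# X X X
```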

    Replace using the matched part

    If part of the pattern is enclosed in (), you can refer to the string matched by that part in the new string as \1, \2, and so on.

    s = 'aaa@xxx.com bbb@yyy.net ccc@zzz.org'
    print(re.sub('([a-z]+)@([a-z]+)', '\\2@\\1', s))  # xxx@aaa.com yyy@bbb.net zzz@ccc.org
    print(re.sub('([a-z]+)@([a-z]+)', r'\2@\1', s))  # xxx@aaa.com yyy@bbb.net zzz@ccc.org

    While backslashes must be escaped as \\1 in a regular string ('' or ""), in a raw string (r'' or r"") you can simply write \1.
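    Besides numbered references, the re module also supports the \g<name> and \g<number> forms in the replacement string. The sketch below uses named groups; the group names user and domain are arbitrary labels chosen for this example.

```python
import re

s = 'aaa@xxx.com bbb@yyy.net ccc@zzz.org'

# Named groups are referenced as \g<name> in the replacement string
pattern = r'(?P<user>[a-z]+)@(?P<domain>[a-z]+)'
swapped = re.sub(pattern, r'\g<domain>@\g<user>', s)
print(swapped)
# xxx@aaa.com yyy@bbb.net zzz@ccc.org

# \g<1> stays unambiguous when a digit follows the reference;
# r'\10@' would be read as a reference to group 10
suffixed = re.sub(r'([a-z]+)@', r'\g<1>0@', s)
print(suffixed)
# aaa0@xxx.com bbb0@yyy.net ccc0@zzz.org
```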

    You can specify a function, which takes a match object as its argument, as the second argument of sub(). This allows for more complex operations.

    def func(matchobj):
        return matchobj.group(2).upper() + '@' + matchobj.group(1)

    print(re.sub('([a-z]+)@([a-z]+)', func, s))  # XXX@aaa.com YYY@bbb.net ZZZ@ccc.org

    You can also use a lambda expression.

    print(re.sub('([a-z]+)@([a-z]+)', lambda m: m.group(2).upper() + '@' + m.group(1), s)) # XXX@aaa.com YYY@bbb.net ZZZ@ccc.org 

    For more information on regular expression match objects, see the following article.

    Get the count of replaced parts

    re.subn() returns a tuple of the replaced string and the number of parts replaced.

    s = 'aaa@xxx.com bbb@yyy.net ccc@zzz.org'
    t = re.subn('[a-z]*@', 'ABC@', s)
    print(t)  # ('ABC@xxx.com ABC@yyy.net ABC@zzz.org', 3)
    print(type(t))  # <class 'tuple'>
    print(t[0])  # ABC@xxx.com ABC@yyy.net ABC@zzz.org
    print(t[1])  # 3

    The usage of subn() is the same as sub(). You can use the part grouped by () or specify the maximum number of replacements.

    print(re.subn('([a-z]+)@([a-z]+)', r'\2@\1', s, 2)) # ('xxx@aaa.com yyy@bbb.net ccc@zzz.org', 2) 
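    In practice the returned tuple is often unpacked directly, for example to report whether anything was replaced at all. A minimal sketch, with variable names chosen for illustration:

```python
import re

s = 'aaa@xxx.com bbb@yyy.net ccc@zzz.org'

# Unpack the new string and the number of replacements in one step
result, n = re.subn('[a-z]+@', 'ABC@', s)

if n:
    message = f'{n} replacement(s): {result}'
else:
    message = 'no match'
print(message)
# 3 replacement(s): ABC@xxx.com ABC@yyy.net ABC@zzz.org
```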

    Replace by position: slice

    There is no built-in method to replace the string at the specified position. However, you can achieve this by splitting the string using slicing and concatenating the resulting parts with another string.

    s = 'abcdefghij' print(s[:4] + 'XXX' + s[7:]) # abcdXXXhij 

    The length of the string (number of characters) can be obtained with len(), so it can be written as follows:

    s_replace = 'XXX'
    i = 4
    print(s[:i] + s_replace + s[i + len(s_replace):])  # abcdXXXhij

    The number of characters in the original and replacement strings does not have to be the same, as this method merely concatenates a different string between the sliced parts.

    Additionally, you can create a new string by inserting a different string at any position within the original string.

    print(s[:4] + '+++++' + s[4:]) # abcd+++++efghij 
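    The slicing patterns above can be wrapped in a small helper. This is only a sketch; replace_range is a hypothetical function name, and start and end follow the usual slice convention (end is exclusive).

```python
def replace_range(s, start, end, new):
    """Return s with the characters s[start:end] replaced by new."""
    return s[:start] + new + s[end:]


s = 'abcdefghij'
print(replace_range(s, 4, 7, 'XXX'))    # abcdXXXhij
print(replace_range(s, 4, 7, '-'))      # abcd-hij
print(replace_range(s, 4, 4, '+++++'))  # abcd+++++efghij
```

    With start equal to end, nothing is removed and the helper simply inserts the new string at that position.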

    See the following article for more details on slicing.

    • String comparison in Python (exact/partial match, etc.)
    • Remove a part of a string (substring) in Python
    • Split strings in Python (delimiter, line break, regex, etc.)
    • Regular expressions with the re module in Python
    • Count characters and strings in Python
    • How to use regex match objects in Python
    • Search for a string in Python (Check if a substring is included/Get a substring position)
    • Extract a substring from a string in Python (position, regex)
    • Extract and replace elements that meet the conditions of a list of strings in Python
    • Sort a list of numeric strings in Python
    • Concatenate strings in Python (+ operator, join, etc.)
    • How to slice a list, string, tuple in Python
    • How to use f-strings in Python
    • Write a long string on multiple lines in Python
    • Right-justify, center, left-justify strings and numbers in Python
