Python match object methods

Python Regex Match Object

Some regex functions and methods return a “MatchObject”. Its methods and attributes are described below.

Match Object Methods

expand( template_string ) Return a transformed version of template_string , with backreferences replaced by the text of the corresponding captured groups. This is similar to using re.sub(…, template_string , … ) .

Backreferences include the numeric forms \1 , \2 , … and \g<1> , \g<2> , …, and the named form \g<name> . Note that a named form must refer to a group defined in the regex pattern with (?P<name>pattern) [see Regex Syntax]

# python 3
# example of regex match object .expand()
import re
xx = re.compile(r"(\d\d\d\d)")
yy = xx.search("in the year 1999")
print(yy.expand(r"Year: \1"))  # Year: 1999
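As a small sketch of the numeric and named forms described above (written for Python 3; the pattern, the group names `year` and `month`, and the sample text are made up for illustration):

```python
import re

# Hypothetical pattern with two named groups: year and month.
date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")
match = date_pattern.search("released on 1999-12")

# Numeric backreferences: \1 is the first group, \g<2> the second.
by_number = match.expand(r"\g<2>/\1")

# Named backreferences use \g<name>.
by_name = match.expand(r"month \g<month> of \g<year>")

print(by_number)  # 12/1999
print(by_name)    # month 12 of 1999
```

The \g<2> form is handy when a digit follows the reference, since \20 would be read as group 20 rather than group 2 followed by "0".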

groups(), group(), groupdict()

groups() Return a tuple containing all the captured groups of the match.
group( n1 , n2 ,…) Return a string or tuple containing one or more captured groups, in the order given. The arguments should be integers (or group names, see below). If there is a single argument, it returns a string, otherwise a tuple.
groupdict() Return a dictionary containing all the named groups of the match. Each key is the name of a named group.

groups()

Return a tuple containing all the subgroups of the match.

# python 3
# example using match object's .groups() method
import re
myText = 'some a1 a2 a3 list'
patObj = re.compile(r'.+(\w\d+) (\w\d+) (\w\d+).+')
matchObj = patObj.search(myText)
print(matchObj.groups())  # ('a1', 'a2', 'a3')

group(…)

Return a string or tuple containing one or more captured groups, in the order given. The arguments should be integers: 0 is the whole matched string, 1 is the 1st captured group, etc. If there is just 1 argument, it returns a string, otherwise a tuple. Calling group() with no argument is equivalent to group(0).

# python 3
# example of regex match object's methods groups() and group(…)
import re
myText = 'some a1 a2 a3 list'
patObj = re.compile(r'.+(\w\d+) (\w\d+) (\w\d+).+')
matchObj = patObj.search(myText)
print(matchObj.groups())      # ('a1', 'a2', 'a3')
print(matchObj.group())       # some a1 a2 a3 list
print(matchObj.group(0))      # some a1 a2 a3 list
print(matchObj.group(1))      # a1
print(matchObj.group(2))      # a2
print(matchObj.group(1,2))    # ('a1', 'a2')
print(matchObj.group(2,1,1))  # ('a2', 'a1', 'a1')
print(matchObj.group(0,1))    # ('some a1 a2 a3 list', 'a1')

If an argument is negative or larger than the number of groups defined in the pattern, an IndexError exception is raised.

If a group is contained in a part of the pattern that did not match, the corresponding result is None . If a group is contained in a part of the pattern that matched multiple times, the last match is returned.
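The two cases above can be sketched as follows (Python 3; the patterns and input strings are made up for illustration):

```python
import re

# Optional group that does not take part in the match: its value is None.
m1 = re.search(r"(a)(b)?", "ac")
print(m1.groups())   # ('a', None)
print(m1.group(2))   # None

# Group inside a repetition: only the last iteration is kept.
m2 = re.search(r"(\w\d)+", "a1b2c3")
print(m2.group(0))   # a1b2c3
print(m2.group(1))   # c3
```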

If the regex pattern uses the (?P<name>…) syntax, the arguments may also be strings identifying groups by name.

# python 3
import re
myText = 'some a1 a2 a3 list'
# the middle group is named "second"
patObj = re.compile(r'.+(\w\d+) (?P<second>\w\d+) (\w\d+).+')
matchObj = patObj.search(myText)
print(matchObj.group(1, "second", 3))  # ('a1', 'a2', 'a3')

If a string argument is not the name of a group in the pattern, an IndexError exception is raised.
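A minimal sketch of both IndexError cases, an out-of-range index and an unknown name (Python 3; the pattern, the group name `word`, and the name `missing` are hypothetical):

```python
import re

m = re.search(r"(?P<word>\w+) (\d+)", "item 42")

# Valid lookups: by number and by name.
print(m.group(1, 2))     # ('item', '42')
print(m.group("word"))   # item

# An index beyond the number of groups raises IndexError.
raised_for_index = False
try:
    m.group(3)
except IndexError as err:
    raised_for_index = True
    print("IndexError:", err)

# So does a name that is not defined in the pattern.
raised_for_name = False
try:
    m.group("missing")
except IndexError as err:
    raised_for_name = True
    print("IndexError:", err)
```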

groupdict(…)

Return a dictionary containing all the named subgroups of the match, keyed by the subgroup name. The default argument is used for groups that did not participate in the match; it defaults to None .

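# python 3
import re
myText = 'some a1 a2 a3 list'
# the group names first, second, third are illustrative
patObj = re.compile(r'.+(?P<first>\w\d+) (?P<second>\w\d+) (?P<third>\w\d+).+')
matchObj = patObj.search(myText)
print(matchObj.groupdict())  # prints {'first': 'a1', 'second': 'a2', 'third': 'a3'}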
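To illustrate the default argument, here is a short sketch (Python 3) in which one named group is optional; the pattern, the group names `value` and `unit`, and the placeholder "n/a" are assumptions made for the example:

```python
import re

# Hypothetical pattern: the "unit" group is optional.
pattern = re.compile(r"(?P<value>\d+)(?P<unit>[a-z]+)?")
m = pattern.search("width 120")

# Without an argument, groups that did not participate map to None.
print(m.groupdict())        # {'value': '120', 'unit': None}

# With an argument, that value is used instead of None.
print(m.groupdict("n/a"))   # {'value': '120', 'unit': 'n/a'}
```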

start(…), end(…), span(…)

start( n ) Return the index where the n th captured group begins.
end( n ) Return the index where the n th captured group ends.
span( n ) Return a tuple with the start and end positions of the n th captured group.

start(…), end(…)

Return the index of the start (respectively, the end) of the substring matched by the n th captured group. start() is equivalent to start(0), and similarly for end(). (0 represents the string matched by the whole regex pattern.)

# python 3
import re
myText = 'some a1 a2 a3 list'
# the group names first, second, third are illustrative
patObj = re.compile(r'.+(?P<first>\w\d+) (?P<second>\w\d+) (?P<third>\w\d+).+')
matchObj = patObj.search(myText)
print(matchObj.start(1))  # prints 5
print(matchObj.end(1))    # prints 7

Return -1 if the group exists but did not contribute to the match. For a match object m, and a group g that did contribute to the match, the substring matched by group g (equivalent to m.group(g) ) is m.string[m.start(g):m.end(g)] .

Note that m.start(myGroup) will equal m.end(myGroup) if myGroup matched an empty string. For example, after m = re.search('b(c?)', 'cba') , m.start(0) is 1, m.end(0) is 2, m.start(1) and m.end(1) are both 2, and m.start(2) raises an IndexError exception.
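The slicing equivalence can be checked with a short sketch (Python 3; the pattern and the sample text are made up for illustration):

```python
import re

m = re.search(r"(\w+)@(\w+)", "mail bob@example now")

# For each group that took part in the match, slicing m.string with
# start() and end() gives the same text as group().
rows = []
for g in (0, 1, 2):
    sliced = m.string[m.start(g):m.end(g)]
    rows.append((g, m.start(g), m.end(g), sliced, sliced == m.group(g)))

for row in rows:
    print(row)
# (0, 5, 16, 'bob@example', True)
# (1, 5, 8, 'bob', True)
# (2, 9, 16, 'example', True)
```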

span(…)

For a MatchObject m, return the 2-tuple (m.start(n), m.end(n)) . Note that if the given captured group did not contribute to the match, this is (-1, -1) . span() is equivalent to span(0) .
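A small sketch of the (-1, -1) case (Python 3; the alternation pattern and the input are chosen only so that one group is left out of the match):

```python
import re

# Only the second alternative can match "xxb", so group 1 is unused.
m = re.search(r"(a)|(b)", "xxb")

print(m.span())              # (2, 3)
print(m.span(1))             # (-1, -1)
print(m.span(2))             # (2, 3)
print(m.start(1), m.end(1))  # -1 -1
```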

Match Object Attributes

The following are various attributes of the MatchObject.

string

The string passed to match() or search().

# python 3
import re
mm = re.compile(r'some.+').search('some text')
print(mm.string)  # prints: some text

re

The regular expression object whose match() or search() method produced this MatchObject instance.

pos

The value of pos which was passed to the search() or match() method of the RegexObject. This is the index into the string at which the RE engine started looking for a match.

endpos

The value of endpos which was passed to the search() or match() method of the RegexObject. This is the index into the string beyond which the RE engine will not go.
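As a rough illustration of pos and endpos (Python 3): the optional start and end arguments are accepted by the search() and match() methods of a compiled pattern, not by the module-level re.search(). The pattern and text below are made up for the example.

```python
import re

pattern = re.compile(r"\d+")
text = "10 20 30 40"

# Start scanning at index 3 and stop before index 8.
m = pattern.search(text, 3, 8)

print(m.group())   # 20
print(m.pos)       # 3
print(m.endpos)    # 8

# Without explicit bounds, pos is 0 and endpos is len(text).
m_default = pattern.search(text)
print(m_default.pos, m_default.endpos)   # 0 11
```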

lastindex

The integer index of the last matched capturing group, or None if no group was matched at all. For example, the expressions (a)b , ((a)(b)) , and ((ab)) will have lastindex == 1 if applied to the string 'ab' , while the expression (a)(b) will have lastindex == 2 if applied to the same string.
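The four expressions above can be checked directly (a Python 3 sketch; the final pattern without groups is an extra case added for illustration):

```python
import re

# lastindex for each pattern from the paragraph above, matched against "ab".
results = {}
for pattern in ["(a)b", "((a)(b))", "((ab))", "(a)(b)"]:
    results[pattern] = re.match(pattern, "ab").lastindex

for pattern, last in results.items():
    print(pattern, "->", last)
# (a)b -> 1
# ((a)(b)) -> 1
# ((ab)) -> 1
# (a)(b) -> 2

# A pattern without capturing groups gives None.
no_groups = re.match("ab", "ab").lastindex
print(no_groups)  # None
```

For nested groups the outermost group closes last, which is why ((a)(b)) reports 1 rather than 3.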

lastgroup

The name of the last matched capturing group, or None if the group didn’t have a name, or if no group was matched at all.
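One place lastgroup can be useful is telling which named alternative matched. A minimal sketch (Python 3), assuming a hypothetical pattern with the group names `number` and `word`:

```python
import re

# Hypothetical pattern: alternation of two named groups.
token = re.compile(r"(?P<number>\d+)|(?P<word>[a-z]+)")

number_kind = token.match("42").lastgroup
word_kind = token.match("abc").lastgroup
print(number_kind)   # number
print(word_kind)     # word

# Unnamed group: lastgroup is None even though lastindex is 1.
m = re.match(r"(\d+)", "42")
print(m.lastindex, m.lastgroup)   # 1 None
```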

Python RegEx Match Object

A Match Object is an object containing information about the search and the result.

Example

Do a search that will return a Match Object:

import re

txt = "The rain in Spain"
x = re.search("ai", txt)
print(x) #this will print an object

Note: If there is no match, the value None will be returned, instead of the Match Object.
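Because of this, calling a method such as .group() on the result without a check raises AttributeError when nothing matched. A minimal sketch of the check (the search term "Portugal" is just an example of a string that does not occur in the text):

```python
import re

txt = "The rain in Spain"

# "Portugal" does not occur in txt, so search() returns None.
x = re.search("Portugal", txt)

if x is None:
    result = "no match"
else:
    result = x.group()

print(result)  # no match
```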

The Match object has properties and methods used to retrieve information about the search, and the result:

.span() returns a tuple containing the start and end positions of the match.
.string returns the string passed into the function.
.group() returns the part of the string where there was a match.

Example

Print the position (start- and end-position) of the first match occurrence.

The regular expression looks for any word that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())

Example

Print the string passed into the function:

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)

Example

Print the part of the string where there was a match.

The regular expression looks for any word that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())
