-
match and search - Note that match may differ from search when the regular expression begins with "^": "^" matches only at the start of the string or, in MULTILINE mode, also immediately following a newline. The match operation succeeds only if the pattern matches at the start of the string, regardless of mode, or at the starting position given by the optional pos argument, regardless of whether a newline precedes it.
re.compile("a").match("ba", 1) # succeeds
re.compile("^a").search("ba", 1) # fails; 'a' not at start
re.compile("^a").search("\na", 1) # fails; without MULTILINE, "^" matches only at the start of the string
re.compile("^a", re.M).search("\na", 1) # succeeds, M is multiline flag
re.compile("^a", re.M).search("ba", 1) # fails; no preceding \n
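The difference can also be seen on a two-line string. The following is a minimal sketch; the sample text and the variable names are illustrative assumptions, not part of the reference.

```python
import re

text = "first\nsecond"
pat = re.compile("^second", re.M)

# search() scans forward; in MULTILINE mode "^" can match right after the newline.
found = pat.search(text)

# match() is anchored at the start position (0 by default) and does not scan ahead,
# so it fails here even though MULTILINE is set.
anchored = pat.match(text)

# The optional pos argument moves the anchor point for match().
at_pos = re.compile("second").match(text, text.index("\n") + 1)

print(found is not None, anchored is None, at_pos is not None)
```

If the pattern should be allowed to match anywhere in the string, search is usually the better fit; match is meant for the case where the position is already known.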
-
split( pattern, string[, maxsplit = 0]) - Split string by the occurrences of pattern. If capturing parentheses are used in pattern, then the text of all groups in the pattern is also returned as part of the resulting list. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list.
re.split(r'\W+', 'Words, words, words.') # ['Words', 'words', 'words', '']
re.split(r'(\W+)', 'Words, words, words.') # ['Words', ', ', 'words', ', ', 'words', '.', '']
re.split(r'\W+', 'Words, words, words.', 1) # ['Words', 'words, words.']
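As a further illustration, maxsplit is handy for splitting a line into a key and the rest, and a capturing group keeps the delimiters in the result. This is a small sketch; the input strings are made up for the example.

```python
import re

# Split only at the first "=", so the value may itself contain "=".
line = "name = Ada = Lovelace"
key, value = re.split(r"\s*=\s*", line, maxsplit=1)
print(key)    # name
print(value)  # Ada = Lovelace

# A capturing group around the separator returns the separators too.
tokens = re.split(r"([+-])", "10+20-3")
print(tokens)  # ['10', '+', '20', '-', '3']
```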
-
findall( pattern, string[, flags]) - Return a list of all non-overlapping matches of pattern in string. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match. New in version 1.5.2. Changed in version 2.4: Added the optional flags argument.
re.findall('a', 'abacus') # ['a', 'a']
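The effect of groups on the return value is easiest to see side by side. The sketch below uses an invented sample string and runs the same search with zero, one, and two groups.

```python
import re

text = "x=1, y=22, z=333"

# No groups: each item is the whole match.
whole = re.findall(r"\w=\d+", text)

# One group: each item is the text of that group.
values = re.findall(r"\w=(\d+)", text)

# Two groups: each item is a tuple with one entry per group.
pairs = re.findall(r"(\w)=(\d+)", text)

print(whole)   # ['x=1', 'y=22', 'z=333']
print(values)  # ['1', '22', '333']
print(pairs)   # [('x', '1'), ('y', '22'), ('z', '333')]
```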
-
finditer( pattern, string[, flags]) - Return an iterator over all non-overlapping matches for the RE pattern in string. For each match, the iterator returns a match object. Empty matches are included in the result unless they touch the beginning of another match. New in version 2.2. Changed in version 2.4: Added the optional flags argument.
for x in re.finditer('a', 'abracadabra'):
    if x is not None: print('Found match')
Found match
Found match
Found match
Found match
Found match
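Because each item is a match object, the iterator gives access to positions as well as matched text, which findall does not. A minimal sketch, with sample strings chosen for illustration:

```python
import re

# Start offsets of every "a" in the string.
positions = [m.start() for m in re.finditer("a", "abracadabra")]
print(positions)  # [0, 3, 5, 7, 10]

# Matched text together with its (start, end) span.
spans = [(m.group(0), m.span()) for m in re.finditer(r"\d+", "a1bb22ccc333")]
print(spans)  # [('1', (1, 2)), ('22', (4, 6)), ('333', (9, 12))]
```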
-
sub( pattern, repl, string[, count]) - Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn't found, string is returned unchanged. repl can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, "\n" is converted to a single newline character, "\r" is converted to a carriage return, and so forth. Unknown escapes such as "\j" are left alone. If repl is a function, it is called for every non-overlapping occurrence of pattern; it takes a single match object argument and returns the replacement string. subn() performs the same operation as sub(), but returns a tuple (new_string, number_of_subs_made).
Backreferences, such as "\6", are replaced with the substring matched by group 6 in the pattern. For example:
re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
       r'static PyObject*\npy_\1(void)\n{',
       'def myfunc():')
# 'static PyObject*\npy_myfunc(void)\n{'
def dashrepl(matchobj):
    if matchobj.group(0) == '-': return ' '
    else: return '-'
re.sub('-{1,2}', dashrepl, 'pro----gram-files')
# 'pro--gram files'
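The count argument and subn() can be sketched as follows. The sample strings are assumptions made for the example; count is passed by keyword here for readability.

```python
import re

messy = "too   many    spaces"

# subn() returns the new string and the number of substitutions made.
result, n = re.subn(r"\s+", " ", messy)
print(result, n)  # too many spaces 2

# A nonzero count limits how many occurrences are replaced.
first_only = re.sub(r"\s+", " ", messy, count=1)
print(first_only)  # too many    spaces

# Backreferences in a string repl refer to groups of the pattern.
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", "user@host")
print(swapped)  # host@user
```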
-
escape( string) - Return string with all non-alphanumerics backslashed; this is useful if you want to match an arbitrary literal string that may have regular expression metacharacters in it.
re.escape('Ana conda\n') # 'Ana\\ conda\\\n'
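A typical use is searching for text that contains metacharacters. The following sketch uses an invented literal and compares the escaped and unescaped searches.

```python
import re

literal = "1+1=2 (really?)"
haystack = "Is 1+1=2 (really?) true"

# Escaped: every metacharacter in the literal is matched as itself.
escaped_hit = re.search(re.escape(literal), haystack)

# Unescaped: "+", "(", ")" and "?" are interpreted as regex syntax,
# so the pattern no longer describes the literal text.
raw_hit = re.search(literal, haystack)

print(escaped_hit is not None, raw_hit is None)
```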
