Common Methods of the re Module

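A regular expression (often abbreviated in code as regex, regexp, or RE) is a computer science concept: a pattern that describes a set of strings. Regular expressions are typically used to search for, and replace, text that matches a given pattern (rule).
Given a regular expression and a string, we can do the following:
(1) Test whether the string satisfies the filtering logic of the regular expression (this is called a "match").
(2) Use the regular expression to extract the specific parts of the string that we want.
Regular expressions have the following characteristics:
(1) They are very flexible, logical, and powerful.
(2) They make complex string manipulation possible quickly and in a very concise form.
(3) They can be obscure and hard to read for people who are new to them.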
Working with the re module
In Python, regular expression operations are performed with the re module.

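match method
Pattern.match(string[, pos[, endpos]]): string is the string to match against. pos and endpos are optional arguments that give the start and end positions within the string; they default to 0 and len(string) respectively. These two arguments are available only on a compiled Pattern object, not on the module-level re.match function.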

import re

# match: tries to match at the start of the string and returns at most one match
# Signature: re.match(pattern, string, flags=0)
result = re.match("hello", "hellolzt world")
print(result, result.group(), type(result))

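re.match tries to match pattern at the beginning of the string. If it succeeds (an empty match counts as success), it returns the corresponding Match object; otherwise it returns None.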
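The anchoring behaviour described above can be sketched as follows. This is a minimal illustration; the sample strings and the variable names (at_start, not_at_start, shifted) are made up for this example. It also shows the optional pos argument, which is accepted by the compiled Pattern.match method.

```python
import re

# re.match succeeds only when the pattern matches at the very start.
at_start = re.match("hello", "hello world")
not_at_start = re.match("hello", "say hello")

print(at_start.group())   # hello
print(not_at_start)       # None

# pos/endpos are accepted by compiled Pattern objects.
pat = re.compile("hello")
shifted = pat.match("say hello", 4)   # start matching at index 4
print(shifted.span())     # (4, 9)
```

Because a failed match returns None, calling .group() on the result without checking it first raises an AttributeError.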

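search method

search scans the whole string and returns the first location where the pattern matches (the match may be an empty string). It matches only once and stops as soon as one result is found. On success it returns a Match object; if no position matches, it returns None.
The compiled-pattern form is Pattern.search(string[, pos[, endpos]]), where string is the string to search and the optional pos and endpos arguments give the start and end positions within the string.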

# Signature: re.search(pattern, string, flags=0)
result = re.search("hello", "2018hellolzt world")
print(result.group())
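To make the difference from match concrete, here is a small sketch that runs both functions on the same input. The variable names and the "python" pattern are chosen only for illustration.

```python
import re

text = "2018hellolzt world"

by_match = re.match("hello", text)    # anchored at index 0, so it fails here
by_search = re.search("hello", text)  # scans forward to the first match

print(by_match)            # None
print(by_search.group())   # hello
print(by_search.span())    # (4, 9)

# search can return None, so guard before calling .group()
missing = re.search("python", text)
found = missing.group() if missing else "no match"
print(found)               # no match
```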

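fullmatch method

re.fullmatch(pattern, string, flags=0) is the whole-string version of match: the pattern must match from the beginning of the string all the way to its end.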

# Signature: re.fullmatch(pattern, string, flags=0)
result = re.fullmatch("hello", "hello1")
print(result)  # None, because the trailing "1" is not part of the match

If the entire string matches pattern, the corresponding Match object is returned; otherwise None is returned.
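A typical use of fullmatch is input validation, where a matching prefix is not enough. The sketch below compares it with match; the helper is_all_digits is a hypothetical name introduced only for this example.

```python
import re

prefix_only = re.match("hello", "hello1")      # the prefix matches
whole = re.fullmatch("hello", "hello1")        # the trailing "1" breaks it
exact = re.fullmatch("hello", "hello")         # the whole string matches

print(prefix_only is not None)   # True
print(whole)                     # None
print(exact.group())             # hello


def is_all_digits(value):
    """Hypothetical helper: True only if value consists solely of 0-9."""
    return re.fullmatch(r"[0-9]+", value) is not None


print(is_all_digits("2018"))     # True
print(is_all_digits("2018a"))    # False
```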

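findall method

findall returns every matching substring as a list; if nothing matches, it returns an empty list. The compiled-pattern form is Pattern.findall(string[, pos[, endpos]]), where string is the string to search and the optional pos and endpos arguments give the start and end positions within the string.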

# Signature: re.findall(pattern, string, flags=0)
result = re.findall("hello", "lzt hello china hello world")
print(result, type(result))
# Returns a list: ['hello', 'hello']
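One detail worth knowing is that the shape of the list returned by findall depends on the capturing groups in the pattern. The sketch below uses made-up patterns on the same sample text to show the three cases.

```python
import re

text = "lzt hello china hello world"

# No groups: each item is the whole match.
plain = re.findall("hello", text)
print(plain)        # ['hello', 'hello']

# One group: each item is the text captured by that group.
one_group = re.findall(r"hello (\w+)", text)
print(one_group)    # ['china', 'world']

# Several groups: each item is a tuple with one entry per group.
two_groups = re.findall(r"(hello) (\w+)", text)
print(two_groups)   # [('hello', 'china'), ('hello', 'world')]

# No match at all: an empty list, never None.
nothing = re.findall("python", text)
print(nothing)      # []
```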

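split method

split cuts the string at every substring that matches the pattern and returns the pieces as a list. The compiled-pattern form is Pattern.split(string[, maxsplit]). maxsplit sets the maximum number of splits; if it is omitted (or 0), the string is split at every match.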

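# Signature: re.split(pattern, string, maxsplit=0, flags=0)
# maxsplit is passed by keyword; passing it positionally is deprecated in recent Python versions
result = re.split("hello", "hello china hello world", maxsplit=2)
print(result, type(result))
# Returns the list of pieces: ['', ' china ', ' world']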
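The effect of maxsplit, and of putting a capturing group in the pattern, can be sketched like this. The input strings are invented for illustration.

```python
import re

text = "hello china hello world"

# maxsplit=1: only the first match is used as a separator.
once = re.split("hello", text, maxsplit=1)
print(once)       # ['', ' china hello world']

# A character class lets one call split on several separators.
fields = re.split(r"[,;]\s*", "a, b;c,  d")
print(fields)     # ['a', 'b', 'c', 'd']

# With a capturing group, the separators are kept in the result.
kept = re.split(r"(,)", "a,b,c")
print(kept)       # ['a', ',', 'b', ',', 'c']
```

The leading empty string in the first result appears because the input begins with a separator.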

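sub method

sub performs replacement. The compiled-pattern form is Pattern.sub(repl, string[, count]). repl can be either a string or a function:
(1) If repl is a string, every matched substring is replaced with repl.
(2) If repl is a function, it is called with a single argument (the Match object) and must return the string to use as the replacement.
(3) count sets the maximum number of replacements; if it is omitted, all matches are replaced.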

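# Signature: re.sub(pattern, repl, string, count=0, flags=0)
# count is passed by keyword; passing it positionally is deprecated in recent Python versions
result = re.sub("hello", "hi", "hello china hello world", count=2)
print(result, type(result))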

sub replaces the text matched by pattern with repl, performing at most count replacements.
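The case where repl is a function is easier to follow with an example. In this sketch the function name shout and the sample strings are illustrative only; the last call also shows a backreference in a string repl.

```python
import re


def shout(match):
    """Receive a Match object and return the replacement text."""
    return match.group().upper()


text = "hello china hello world"

all_replaced = re.sub("hello", shout, text)
print(all_replaced)    # HELLO china HELLO world

first_only = re.sub("hello", shout, text, count=1)
print(first_only)      # HELLO china hello world

# In a string repl, \1 and \2 refer to the captured groups.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")
print(swapped)         # world hello
```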

finditer method

# Signature: re.finditer(pattern, string, flags=0)
result = re.finditer("hello", "hello world hello china")
print(result, type(result))
# Returns an iterator that yields one Match object per non-overlapping match
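Printing the iterator itself shows only its type, so in practice it is consumed in a loop. A minimal sketch, reusing the same sample text:

```python
import re

text = "hello world hello china"

# Each item is a Match object, so positions are available too.
for m in re.finditer("hello", text):
    print(m.group(), m.span())

spans = [m.span() for m in re.finditer("hello", text)]
print(spans)   # [(0, 5), (12, 17)]
```

Compared with findall, this gives access to the position of every match and avoids building the full list up front.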

compile method

The compile function compiles a regular expression and produces a Pattern object.

# Signature: re.compile(pattern, flags=0)
pat = re.compile("hello")
print(pat, type(pat))
result = pat.search("helloworld")
print(result, type(result))
# compile produces a reusable Pattern object
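A compiled Pattern exposes the same operations as the module-level functions, which is convenient when one pattern is applied many times. The following sketch assumes a small made-up list of lines.

```python
import re

pat = re.compile("hello")
lines = ["helloworld", "say hello", "goodbye"]

# Reuse the same compiled pattern for every line.
hits = [line for line in lines if pat.search(line)]
print(hits)        # ['helloworld', 'say hello']

# The Pattern methods mirror re.findall, re.sub, and so on.
found = pat.findall("hello hello")
replaced = pat.sub("hi", "hello world")
print(found)       # ['hello', 'hello']
print(replaced)    # hi world
print(pat.pattern) # hello
```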

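flags

Several functions in the re module accept flags as an optional argument. The most commonly used flags are listed below. Each flag corresponds to a binary value, so flags can be combined with the bitwise OR operator (|). Flags change how the regular expression behaves:
re.I, re.IGNORECASE: matching is case-insensitive.
re.M, re.MULTILINE: "^" matches at the start of the string and immediately after each "\n"; "$" matches at the end of the string and immediately before each "\n". This is usually called multi-line mode.
re.S, re.DOTALL: "." matches any character, including a newline. This is usually called single-line mode.
To use single-line mode and multi-line mode together, set the optional flags argument to re.S | re.M.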

result = re.match("hello", "HeLlo", flags=re.I)
print(result)
result = re.findall("^abc", "abcde\nabcd", flags=re.M)
print(result)
result = re.findall("e$", "abcde\nabcd", flags=re.M)
print(result)
result = re.findall(".", "hello \n china", flags=re.S)
# With re.S, "." also matches the newline
print(result)
result = re.findall(".", "hello \n china", flags=re.M)
# With re.M alone, "." does not match the newline
print(result)
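Combining flags with the bitwise OR operator can be sketched as follows. The sample strings are invented; the second pair of calls contrasts re.M alone with re.S | re.M on the same pattern.

```python
import re

# re.I | re.M: case-insensitive, and "^" matches at each line start.
starts = re.findall("^start", "Start one\nstart two", flags=re.I | re.M)
print(starts)       # ['Start', 'start']

text = "abc\nabcd"

# re.M alone: "." cannot consume the newline after the first "abc".
multi_only = re.findall(r"^abc.", text, flags=re.M)
print(multi_only)   # ['abcd']

# re.S | re.M: "." may also match the newline.
both = re.findall(r"^abc.", text, flags=re.S | re.M)
print(both)         # ['abc\n', 'abcd']
```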

