I use a regular expression to collect all the names from a long, multi-line text file:
regex = r'Name:\s*(.*)$'
names = re.findall(regex, file_content, re.MULTILINE)
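The file contains several sections, and I only need to collect the names that appear before a specific substring (for example, "Computers:"). This is easy to do in plain Python (for example, by cutting file_content at the substring), but for some reason I must use only a regular expression.
How can I do this?
Sample text file: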
Name: Jon
address: 1st
phone: 01321231231231
Name: Mon
address: 1st
phone: 01321231231231
Name: Gon
address: 1st
phone: 01321231231231
Computers:
Name: Jason
address: 1st
phone: 01321231231231
Name: Bason
address: 1st
phone: 01321231231231
Expected output: Jon, Mon, Gon
Answer from a uj5u.com community member:
You can use
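regex = r'Name:\s*(.*)(?=[\s\S]*Computers:)'
Here,
Name: - a literal string
\s* - zero or more whitespace characters
(.*) - Group 1: zero or more characters other than line breaks, as many as possible
(?=[\s\S]*Computers:) - a positive lookahead: immediately to the right there must be zero or more characters of any kind, followed by the string Computers:
The marker is written as Computers: to match the sample text. The pattern is case-sensitive unless re.IGNORECASE is passed to re.findall.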
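As a quick check, the lookahead pattern can be run against the sample from the question. This is only a minimal sketch: the variable names are illustrative, and the marker is spelled Computers: as it appears in the sample (passing re.IGNORECASE would be an alternative). Keep in mind that the lookahead rescans the rest of the text for every candidate match, so it may be slow on very large files.

```python
import re

file_content = """Name: Jon
address: 1st
phone: 01321231231231
Name: Mon
address: 1st
phone: 01321231231231
Name: Gon
address: 1st
phone: 01321231231231
Computers:
Name: Jason
address: 1st
phone: 01321231231231
Name: Bason
address: 1st
phone: 01321231231231
"""

# A name is captured only if "Computers:" still occurs somewhere after it.
regex = r'Name:\s*(.*)(?=[\s\S]*Computers:)'
names = re.findall(regex, file_content)
print(names)  # ['Jon', 'Mon', 'Gon']
```

Because the pattern has exactly one capturing group, re.findall returns the group contents rather than the whole match.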
Answer from a uj5u.com community member:
import re
file_content = """
Name: Jon
address: 1st
phone: 01321231231231
Name: Mon
address: 1st
phone: 01321231231231
Name: Gon
address: 1st
phone: 01321231231231
Computers:
Name: Jason
address: 1st
phone: 01321231231231
Name: Bason
address: 1st
phone: 01321231231231
"""
# names = re.findall(r'Name:.*\n', file_content)  # this would match every name
# To match only up to a specific string, slice the text
# and search just the portion of interest.
names = re.findall(r'Name:.*\n', file_content[:file_content.index("Computers")])
final_name_list = []
for name in names:
    # Strip the "Name: " prefix and the trailing newline.
    final_name_list.append(name.replace("Name: ", "").replace("\n", ""))
print(final_name_list)
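One detail of the pattern used in the script above: because it ends with \n, a final Name: line that has no trailing newline is not matched. The short sketch below illustrates this on a made-up two-line string (not the question's file) and shows one possible alternative, anchoring with $ under re.MULTILINE.

```python
import re

text = "Name: Jon\nName: Mon"  # the last line has no trailing newline

# Requires a literal newline after each match, so the last line is skipped.
with_newline = re.findall(r'Name:.*\n', text)

# With re.MULTILINE, $ matches at the end of every line and at the end of the text.
with_anchor = re.findall(r'Name:.*$', text, re.MULTILINE)

print(with_newline)  # ['Name: Jon\n']
print(with_anchor)   # ['Name: Jon', 'Name: Mon']
```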
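In re.findall, the pattern matches lines that start with "Name:" and end with a newline.
names = re.findall(r'Name:.*\n', file_content) # this matches every name
names = re.findall(r'Name:.*\n', file_content[:file_content.index("Computers")]) # this matches only up to "Computers"
The list names then holds the full lines containing the strings you need; iterate over each list element and replace the unwanted parts with an empty string.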
final_name_list.append(name.replace("Name: ", "").replace("\n", ""))
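The slicing approach can be shortened a little. The sketch below is one possible variant, not the answer's own code: it uses a capturing group so that re.findall returns the names directly, and str.partition instead of str.index so that a missing marker does not raise ValueError. The shortened sample string is made up for illustration.

```python
import re

file_content = "Name: Jon\nphone: 1\nComputers:\nName: Jason\nphone: 2\n"

# partition() returns the whole text as `head` when the marker is absent,
# whereas index() would raise ValueError.
head, _, _ = file_content.partition("Computers:")

# The capturing group returns only the text after "Name:".
names = re.findall(r'Name:\s*(.*)', head)
print(names)  # ['Jon']
```

If the marker never appears, this variant falls back to collecting every name in the file, which may or may not be the behavior you want.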
