I have a text file like this:
## COL
{ "Id": 1, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueC", ... "keyN": "valueN"}
{ "Id": 2, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 3, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 3, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
.
.
.
{ "Id": n, "key1": "value1", "key2": "valueZ", ... "keyN": "valueN"}
## USA
{ "Id": 1, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueC", ... "keyN": "valueN"}
{ "Id": 2, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 3, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 3, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
.
.
.
{ "Id": n, "key1": "value1", "key2": "valueZ", ... "keyN": "valueN"}
## ESP
{ "Id": 1, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueC", ... "keyN": "valueN"}
{ "Id": 2, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
.
.
.
{ "Id": n, "key1": "value1", "key2": "valueZ", ... "keyN": "valueN"}
I need to use a regex in Python to extract only the lines for one specific country, for example:
## COL
{ "Id": 1, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueC", ... "keyN": "valueN"}
{ "Id": 2, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 3, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 3, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
.
.
.
{ "Id": n, "key1": "value1", "key2": "valueZ", ... "keyN": "valueN"}
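Note: no key or value identifies the country; only the text marker lines shown in the example above do.
I tried this regular expression without success:
(?<=## COL).*[\w\s]*(?=##})
Thanks in advance!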
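Answer from a uj5u.com user:
Using a regular expression:
import re
# text is assumed to hold the contents of the file.
# Tempered dot: consume any character (newlines included, because of re.S)
# for as long as the next two characters are not "##".
# ^ only anchors at the start of the string here; add re.M to find a section
# that is not the first one in the file.
m = re.search(r'^## COL\n(?:(?!##).)+', text, flags=re.S)
if m:
    print(m.group())
A more efficient alternative:
# Consume whole lines that do not start with "##"; re.S is not needed.
m = re.search(r'^## COL\n(?:(?:(?!##).*)\n)+', text)
if m:
    print(m.group())
Output: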
## COL
{ "Id": 1, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
{ "Id": 1, "key1": "value1", "key2": "valueC", ... "keyN": "valueN"}
{ "Id": 2, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 3, "key1": "value1", "key2": "valueA", ... "keyN": "valueN"}
{ "Id": 3, "key1": "value1", "key2": "valueB", ... "keyN": "valueN"}
.
.
.
{ "Id": n, "key1": "value1", "key2": "valueZ", ... "keyN": "valueN"}
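The line-based pattern above can be wrapped in a small helper so that the country code becomes a parameter. This is only a minimal sketch: the function name extract_section and the shortened sample data are hypothetical, and re.M is added so that ^ also matches sections that are not first in the file.

```python
import re

# Hypothetical sample data, shortened from the file layout in the question.
text = """## COL
{ "Id": 1, "key2": "valueA"}
{ "Id": 2, "key2": "valueB"}
## USA
{ "Id": 1, "key2": "valueA"}
## ESP
{ "Id": 1, "key2": "valueC"}
"""

def extract_section(text, country):
    """Return the '## <country>' header plus its lines, or None if absent."""
    # re.escape guards against regex metacharacters in the country code.
    # re.M lets ^ match at the start of any line, not only the start of the string.
    pattern = r'^## ' + re.escape(country) + r'\n(?:(?!##).*\n)+'
    m = re.search(pattern, text, flags=re.M)
    return m.group() if m else None

print(extract_section(text, "USA"))
```

Because the pattern requires a newline after every line, the last line of the file is only captured when the file ends with a newline.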
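Answer from a uj5u.com user:
How about the pattern below? It should be enough to match the requested block, with no lookahead or lookbehind needed, as long as the data lines themselves never contain a # character:
## COL[^#]*
See https://regex101.com/r/pc0iaV/1 for how it works.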
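A short sketch of how the negated character class behaves may help when deciding whether it is safe for a given file. The helper name extract_simple and both sample strings are hypothetical; the second sample shows the match stopping early when a value contains #.

```python
import re

def extract_simple(text, country):
    """Match the header and everything after it up to the next '#' character."""
    m = re.search(r'## ' + re.escape(country) + r'[^#]*', text)
    return m.group() if m else None

# No '#' inside the data lines: the whole section is returned.
clean = (
    '## COL\n'
    '{ "Id": 1, "key1": "a"}\n'
    '{ "Id": 2, "key1": "b"}\n'
    '## USA\n'
    '{ "Id": 1, "key1": "c"}\n'
)

# A '#' inside a value: the match ends right before it.
dirty = (
    '## COL\n'
    '{ "Id": 1, "key1": "item #7"}\n'
    '{ "Id": 2, "key1": "b"}\n'
    '## USA\n'
)

print(repr(extract_simple(clean, "COL")))
print(repr(extract_simple(dirty, "COL")))
```

If the values can contain #, a pattern that checks for the marker only at the start of a line is the safer choice.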
Answer from a uj5u.com user:
Without the re.S flag, you can write the pattern as:
^## COL(?:\n(?!## ).*)*
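Explanation
^ Start of the string
## COL Match literally
(?: Start a non-capturing group
\n(?!## ).* Match a newline, then the whole line if it does not start with "## "
)* Close the non-capturing group and optionally repeat it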
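The same pattern shape can also split the whole file into sections in one pass. The following is a sketch under a few assumptions: the country code is captured with \w+, re.M is added so that ^ matches at every marker line, and the names SECTION, split_sections and the sample data are hypothetical.

```python
import re

# Group 1: the country code. Group 2: the lines that follow, each preceded by
# its newline, up to (but not including) the next line starting with "## ".
SECTION = re.compile(r'^## (\w+)((?:\n(?!## ).*)*)', flags=re.M)

def split_sections(text):
    """Map each country code to the list of lines under its marker."""
    sections = {}
    for m in SECTION.finditer(text):
        # Group 2 starts with a newline, so the first split item is empty.
        sections[m.group(1)] = m.group(2).splitlines()[1:]
    return sections

# Hypothetical sample data, shortened from the file layout in the question.
text = """## COL
{ "Id": 1, "key2": "valueA"}
{ "Id": 2, "key2": "valueB"}
## USA
{ "Id": 1, "key2": "valueA"}
## ESP
{ "Id": 1, "key2": "valueC"}
"""

sections = split_sections(text)
print(sections["COL"])
```

Looking up sections["COL"] then gives only the lines for that country, without the marker line itself.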
