Normalizing and repairing a string in Python

I have a log file containing unstructured entries like these:

[roomID=19, description=ZZZZ, requesterCode=20, result=-1, errors=[[code=1, text=XXXXXXrequestID=/1540], flow=10:1] [remote=0.0.0.0, host=xxx]
[roomID=19, description=ZZZZ, requesterCode=20, result=-1, errors=[[code=2, text=XXXXXX., [code=2, text=XXXXXXrequestID=/1551], flow=12:3]

As you can see, the "errors" value starts with "[" but is never closed properly, which makes parsing harder.

What I want to do is clean up only the "errors" part and repair it like this: replace '[]' with '{}' and remove duplicate keys from errors, so that I can read it into a Python dict.

[roomID=19, description=ZZZZ, requesterCode=20, result=-1, errors={code=1, text=XXXXXX, requestID=/1540}, flow=10:1] [remote=0.0.0.0, host=xxx]
[roomID=19, description=ZZZZ, requesterCode=20, result=-1, errors={code=2, text=XXXXXX., requestID=/1551}, flow=12:3]

I'm not good at Python, but I tried this rough code. I need your help to do this efficiently.

def fix(str):
    str = str.replace('errors=[[', 'errors={')
    ..
    return str

Thank you very much.

A uj5u.com user replied:

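With re from the standard library you can perform more complex text manipulation.

The important part is identifying your patterns correctly.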

=[[  -->  ={
whitespace followed by [  -->  a single space
]  -->  }

Of course, these patterns can be made more robust.
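As one possible way to make the rules more robust, the sketch below confines the replacements to the errors span, so brackets elsewhere in the line (for example a trailing `] [remote=...` group) are left alone. The helper name `fix_errors_span` is hypothetical, and the sketch assumes the errors value ends at the first `]` after `errors=[[`, which holds for the sample lines but may not hold for every log.

```python
import re

def fix_errors_span(line):
    """Rewrite only the errors=[[...] part of one log line."""
    def repair(found):
        # Drop the stray '[' that opens a repeated record, then wrap in braces.
        inner = found.group(1).replace('[', '')
        return 'errors={' + inner + '}'
    # Assumption: the errors value runs up to the first ']' after 'errors=[['.
    return re.sub(r'errors=\[\[(.*?)\]', repair, line)

line = ("[roomID=19, description=ZZZZ, requesterCode=20, result=-1, "
        "errors=[[code=1, text=XXXXXXrequestID=/1540], flow=10:1] "
        "[remote=0.0.0.0, host=xxx]")
print(fix_errors_span(line))
```

The outer brackets survive because only the text matched by the `errors=\[\[(.*?)\]` pattern is rewritten. The answer's original, simpler version follows.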

import re

log = """roomID=19, description=ZZZZ, requesterCode=20, result=-1, errors=[[code=1, text=XXXXXXrequestID=/1540], flow=10:1
roomID=19, description=ZZZZ, requesterCode=20, result=-1, errors=[[code=2, text=XXXXXX., [code=2, text=XXXXXXrequestID=/1551], flow=12:3"""

mapping = {1: '={', 2: ' ', 3: '}'}
regex = r'(=\[\[)|(\s\[)|(\])' # r stands for raw string not for regex!

log_new = re.sub(regex, lambda match: mapping[match.lastindex], log)

print(log_new)
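To see how the `mapping[match.lastindex]` lookup picks a replacement, here is a small sketch on a shortened sample line (the `sample` string is made up for illustration). Each alternative has its own capturing group, so only one group takes part in any match, and `lastindex` is the number of that group.

```python
import re

# One capturing group per rule: group 1, group 2, group 3.
pattern = re.compile(r'(=\[\[)|(\s\[)|(\])')
mapping = {1: '={', 2: ' ', 3: '}'}

# Shortened, made-up sample in the same shape as the log lines.
sample = "errors=[[code=2, text=A., [code=2, text=B], flow=12:3"

# List what each match captured and which group number matched.
hits = [(m.group(0), m.lastindex) for m in pattern.finditer(sample)]
print(hits)

# Apply the same dispatch that the answer uses.
fixed = pattern.sub(lambda m: mapping[m.lastindex], sample)
print(fixed)
```

Adding a fourth rule would mean adding a fourth group to the pattern and a key 4 to `mapping`; the two must stay in the same order.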

Answer to the edited question

log = """[roomID=19, description=ZZZZ, requesterCode=20, result=-1, errors=[[code=1, text=XXXXXXrequestID=/1540], flow=10:1] [remote=0.0.0.0, host=xxx]
[roomID=19, description=ZZZZ, requesterCode=20, result=-1, errors=[[code=2, text=XXXXXX., [code=2, text=XXXXXXrequestID=/1551], flow=12:3]"""

import re

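# group 1: a repeated "[code=N, " record opener, which is dropped
# group 2: the opening "[[code", which becomes "{code"
# group 3: the closing "], ", which becomes "}, "
mapping = {1: '', 2: '{code', 3: '}, '}
regex = r'(\[code=\d+,\s)|(\[+code)|(\],\s)'

log_new = re.sub(regex, lambda match: mapping[match.lastindex], log)

# insert the missing ", " between the text value and requestID
log_new = re.sub(r'(text=.+?)(requestID=)', r'\1, \2', log_new)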

print(log_new)
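Since the stated goal was to read the errors part into a Python dict without duplicate keys, here is a minimal sketch that goes from a raw log line straight to a dict. The function name `parse_errors` is hypothetical. It assumes that the errors value ends at the first `]` after `errors=[[`, that pairs are separated by ", ", and that no value contains ", " itself. When a key repeats, the first occurrence is kept, which matches the desired output in the question.

```python
import re

def parse_errors(line):
    """Return the errors=[[...] part of one raw log line as a dict."""
    # Assumption: the errors value runs up to the first ']' after 'errors=[['.
    found = re.search(r'errors=\[\[(.*?)\]', line)
    if found is None:
        return {}
    body = found.group(1)
    # requestID is glued onto the text value in the samples; split it off.
    body = re.sub(r'(?<!, )requestID=', ', requestID=', body)
    # Drop the stray '[' that opens a repeated error record.
    body = body.replace('[', '')
    errors = {}
    for pair in body.split(', '):
        key, _, value = pair.partition('=')
        errors.setdefault(key, value)  # keep the first value of a repeated key
    return errors

line1 = ("[roomID=19, description=ZZZZ, requesterCode=20, result=-1, "
         "errors=[[code=1, text=XXXXXXrequestID=/1540], flow=10:1] "
         "[remote=0.0.0.0, host=xxx]")
line2 = ("[roomID=19, description=ZZZZ, requesterCode=20, result=-1, "
         "errors=[[code=2, text=XXXXXX., [code=2, text=XXXXXXrequestID=/1551], "
         "flow=12:3]")
print(parse_errors(line1))
print(parse_errors(line2))
```

All values come back as strings; converting `code` to an int, or keeping every repeated record in a list instead of collapsing them, would be a small change if the real logs need it.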


Tags: Python, string, parsing
