I am trying to load the following txt file into a dictionary:
['A'] (4)
['B'] (4)
['E'] (4)
['C'] (4)
['A', 'B'] (3)
['A', 'E'] (3)
['A', 'C'] (3)
['B', 'E'] (4)
['B', 'C'] (3)
['C', 'E'] (3)
['A','B', 'E'] (3)
['B', 'C', 'E'] (3)
I want the dictionary to look like this:
itemsets = {"A": {"support_count": 4}, "B": {"support_count": 4}, "E": {"support_count": 4}, "C": {"support_count": 4},
            "AB": {"support_count": 3}, "AE": {"support_count": 3}, "AC": {"support_count": 3}, "BE": {"support_count": 4},
            "BC": {"support_count": 3}, "CE": {"support_count": 3}, "ABE": {"support_count": 3}, "BCE": {"support_count": 3}}
This is what I have so far:
keys = []
values = []
with open(filename, 'r') as f:
    lines = f.readlines()
keys = [line[:line.find(']')] for line in lines]
keys = [k.replace('[', '').replace(']', '').replace(',', '').replace("'", '').replace(' ', '') for k in keys]
values = [line[line.find('('):] for line in lines]
values = [v.replace('(', '').replace(')', '').replace("'", '').replace("\n", '') for v in values]
itemsets = dict.fromkeys(keys)
for v in values:
    for item in itemsets.keys():
        d[item] = {"support_count": v}
return itemsets
This is what I get when I run it:
{'A': {'support_count': '3'}, 'B': {'support_count': '3'}, 'E': {'support_count': '3'}, 'C': {'support_count': '3'}, 'AB': {'support_count': '3'}, 'AE': {'support_count': '3'}, 'AC': {'support_count': '3'}, 'BE': {'support_count': '3'}, 'BC': {'support_count': '3'}, 'CE': {'support_count': '3'}, 'ABE': {'support_count': '3'}, 'BCE': {'support_count': '3'}}
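Answer:
Because the loop over values wraps a loop over every key, each pass overwrites all the dict values with the current value. After the final pass every key holds the last value, which is a 3. You need to iterate over both lists at the same time, using zip: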
for k, v in zip(keys, values):
    d[k] = {"support_count": int(v)}
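To make the difference concrete, here is a minimal sketch that runs both loop shapes side by side. The short `keys` and `values` lists are hypothetical sample data standing in for the parsed file, and the names `overwritten` and `paired` are only for illustration.

```python
# Hypothetical sample data standing in for the parsed keys and values.
keys = ["A", "B", "BE", "BCE"]
values = ["4", "4", "4", "3"]

# Nested loops: each pass over `values` rewrites every key,
# so only the final value ("3") survives.
overwritten = {}
for v in values:
    for k in keys:
        overwritten[k] = {"support_count": v}

# zip pairs the i-th key with the i-th value: one assignment per key.
paired = {k: {"support_count": int(v)} for k, v in zip(keys, values)}

print(overwritten["A"])  # {'support_count': '3'}
print(paired["A"])       # {'support_count': 4}
```

Note that zip stops at the shorter of the two lists, so it relies on keys and values having been built from the same lines in the same order.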
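To parse the data, I suggest a regular expression approach:
\[(.*)] \((\d+) parses each line into a key part and a value
[^A-Z] removes the non-letter characters from the key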
import re

lines = ["['A'] (4)", "['B'] (4)", "['E'] (4)", "['C'] (4)", "['A', 'B'] (3)",
         "['A', 'E'] (3)", "['A', 'C'] (3)", "['B','E'] (4)", "['B', 'C'] (3)",
         "['C', 'E'] (3)", "['A','B', 'E'] (3)", "['B', 'C', 'E'] (3)"]

d = {}
# Group 1: everything between the brackets; group 2: the digits after "(".
ptn_all = re.compile(r"\[(.*)] \((\d+)")
# Matches anything that is not an uppercase letter (quotes, commas, spaces).
ptn_key = re.compile(r"[^A-Z]")
for line in lines:
    keys, value = ptn_all.search(line).groups()
    d[ptn_key.sub("", keys)] = {"support_count": int(value)}
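The same regex idea can be wrapped in a function that reads straight from the file. This is only a sketch under a few assumptions: the function names `parse_itemsets` and `load_itemsets` are hypothetical, item names are single uppercase letters as in the sample data, and lines that do not match the pattern are silently skipped rather than raising an error. The pattern is slightly loosened (`\s*` instead of a single space) to tolerate stray whitespace.

```python
import re

# Group 1: text between the brackets; group 2: the count in parentheses.
LINE_PATTERN = re.compile(r"\[(.*)\]\s*\((\d+)\)")
# Anything that is not an uppercase letter is stripped from the key.
NON_ITEM = re.compile(r"[^A-Z]")


def parse_itemsets(lines):
    """Build {key: {"support_count": n}} from an iterable of text lines."""
    itemsets = {}
    for line in lines:
        match = LINE_PATTERN.search(line)
        if match is None:
            continue  # assumption: ignore blank or malformed lines
        raw_key, count = match.groups()
        itemsets[NON_ITEM.sub("", raw_key)] = {"support_count": int(count)}
    return itemsets


def load_itemsets(filename):
    """Read the file line by line and parse it (hypothetical helper)."""
    with open(filename, "r") as f:
        return parse_itemsets(f)


sample = ["['A'] (4)\n", "['B','E'] (4)\n", "\n", "['B', 'C', 'E'] (3)\n"]
print(parse_itemsets(sample))
# {'A': {'support_count': 4}, 'BE': {'support_count': 4}, 'BCE': {'support_count': 3}}
```

Keeping the parsing separate from the file handling makes it easy to try the function on a list of strings before pointing it at the real file.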
