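I have a jsonlines file in which each item has a node as its key and a list of other nodes as its value. To add edges to a networkx graph, I think tuples of the form (u, v) are needed. I wrote a naive solution for this, but I suspect it may be slow for a sufficiently large jsonl file. Does anyone have a better, more Pythonic solution to suggest?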
dol = [{0: [1,2,3,4,5,6]},{1: [0,2,3,4,5,6]}]
for node in dol:
    #print(node)
    tpls = []
    key = list(node.keys())[0]
    tpls = [(key,v) for v in node[key]]
    print(tpls)
    # <iterate through each one in the list to add them to the graph>
[(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)]
[(1, 0), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
Answer from a uj5u.com user:
dol = [{0: [1,2,3,4,5,6]},{1: [0,2,3,4,5,6]}]
def process(item: dict):
    for key, values in item.items():
        for i in values:
            yield (key, i)
results = map(process, dol)
print([list(r) for r in results])
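I think you should use yield wherever you can.
I don't know how big your dataset is, but when you use yield and get a generator you can iterate over, it is more memory efficient, because the tuples are produced one at a time instead of all being held in a list.
Happy coding.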
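The generator idea can be carried all the way back to the jsonlines file, so that no intermediate list of dicts is built. The following is only a sketch under some assumptions: each line is taken to be a JSON object of the form {"node": [neighbours]}, node ids are assumed to be integers (JSON object keys are always strings, hence the int() call), and the helper names edges_from_line and iter_edges are made up for illustration. io.StringIO stands in for an open file handle.

```python
import io
import json
from itertools import chain


def edges_from_line(line):
    """Yield (u, v) tuples for one JSON line of the form {"node": [neighbours]}."""
    record = json.loads(line)
    for key, neighbours in record.items():
        # JSON object keys are always strings; int() assumes integer node ids.
        u = int(key)
        for v in neighbours:
            yield (u, v)


def iter_edges(lines):
    """Lazily flatten the edges of every non-blank line into one stream."""
    return chain.from_iterable(
        edges_from_line(line) for line in lines if line.strip()
    )


# io.StringIO stands in for an open jsonlines file (assumption).
sample = io.StringIO('{"0": [1, 2, 3]}\n{"1": [0, 2]}\n')
print(list(iter_edges(sample)))
# [(0, 1), (0, 2), (0, 3), (1, 0), (1, 2)]
```

networkx's Graph.add_edges_from accepts any iterable of edge tuples, so a stream like this could probably be passed to it directly instead of being turned into a list first.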
Answer from a uj5u.com user:
Only one key
If the dict never has more than one item, you can do this:
dol = [{0: [1, 2, 3, 4, 5, 6]}, {1: [0, 2, 3, 4, 5, 6]}]
for node in dol:
    local_node = node.copy()  # only if dict shouldn't be modified in any way
    k, values = local_node.popitem()
    print([(k, value) for value in values])
# [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)]
# [(1, 0), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
Multiple keys
But if a dict may contain more than one key, you can use a while loop and test that the dict is not empty:
for node in dol:
    local_node = node.copy()  # only if dict shouldn't be modified in any way
    while local_node:
        k, values = local_node.popitem()
        print([(k, value) for value in values])
# Output shown for an input whose first dict also holds a key 2
# (that input is not shown here):
# [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)]
# [(2, 0), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)]
# [(1, 0), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
Of course, if you need to store the generated lists, append them to a list instead of just printing them.
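As a rough sketch of that last point, the per-key lists can be appended to a result list. This variant swaps popitem() for a plain .items() loop, which only reads each dict, so the copy() is no longer needed. The shortened sample data and the name edge_lists are illustrative assumptions.

```python
# Shortened sample data; the second dict has two keys on purpose.
dol = [{0: [1, 2, 3]}, {1: [0, 2], 2: [0, 1]}]

edge_lists = []
for node in dol:
    # .items() does not modify the dict, so no copy is required
    for k, values in node.items():
        edge_lists.append([(k, value) for value in values])

print(edge_lists)
# [[(0, 1), (0, 2), (0, 3)], [(1, 0), (1, 2)], [(2, 0), (2, 1)]]
```

Unlike popitem(), which removes the most recently inserted item first, .items() walks the keys in insertion order.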
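A single big dictionary
If your dol object can be a single dictionary, it gets even simpler. And if, as Yves Daoust said, you need an adjacency list or matrix, here are two examples:
Adjacency list in pure Python
An adjacency list: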
dol = {0: [1, 2, 3, 4, 5, 6],
       1: [0, 2, 3, 4, 5, 6]}
adjacency_list = [(key, value) for key, values in dol.items() for value in values]
print(adjacency_list)
# [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 0), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
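If the data starts out as a list of small dicts rather than one dictionary, one possible way to get to the single-dictionary form is to merge them first. This is a sketch, not part of the original answer: the name merged and the sample data (where node 0 appears twice) are assumptions chosen to show that repeated keys are combined rather than overwritten.

```python
from collections import defaultdict

# Sample data in the list-of-dicts form; node 0 appears in two entries.
dol = [{0: [1, 2, 3]}, {1: [0, 2]}, {0: [4]}]

# Merge into one dict, extending the neighbour list when a key repeats.
merged = defaultdict(list)
for node in dol:
    for key, values in node.items():
        merged[key].extend(values)

adjacency_list = [(key, value) for key, values in merged.items() for value in values]
print(adjacency_list)
# [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 2)]
```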
Adjacency matrix with pandas
An adjacency_matrix:
import pandas
dol = {0: [1, 2, 3, 4, 5, 6],
       1: [0, 2, 3, 4, 5, 6]}
adjacency_list = [(key, value) for key, values in dol.items() for value in values]
adjacency_df = pandas.DataFrame(adjacency_list)
adjacency_matrix = pandas.crosstab(adjacency_df[0], adjacency_df[1],
                                   rownames=['keys'], colnames=['values'])
print(adjacency_matrix)
# values  0  1  2  3  4  5  6
# keys
# 0       0  1  1  1  1  1  1
# 1       1  0  1  1  1  1  1
Answer from a uj5u.com user:
You can use a list comprehension:
dol = [{0: [1,2,3,4,5,6]},{1: [0,2,3,4,5,6]}]
tuples = [ (n1,n2) for d in dol for n1,ns in d.items() for n2 in ns ]
print(tuples)
[(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 0), (1, 2),
(1, 3), (1, 4), (1, 5), (1, 6)]
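Note that the flat list contains both (0, 1) and (1, 0). In an undirected networkx Graph these are the same edge, so adding both should not create a duplicate. If a smaller list is wanted before building the graph, the pairs can be normalised first. This is a sketch only: it assumes an undirected graph and node ids that can be compared with each other, and the name undirected is made up for illustration.

```python
# Shortened sample data in the same shape as above.
dol = [{0: [1, 2]}, {1: [0, 2]}]

tuples = [(n1, n2) for d in dol for n1, ns in d.items() for n2 in ns]

# Sort each pair so (1, 0) and (0, 1) collapse to the same tuple,
# then use a set to drop the repeats.
undirected = sorted({tuple(sorted(edge)) for edge in tuples})
print(undirected)
# [(0, 1), (0, 2), (1, 2)]
```

For a directed graph this step would be wrong, since (0, 1) and (1, 0) are then distinct edges.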
