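I want to convert a Pandas DataFrame into a multi-key dictionary, using two or more columns as the dictionary key, and I want lookups to be independent of the order of those keys.
Here is an example of converting a Pandas DataFrame to an ordinary multi-key dictionary, where order matters.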
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(5, 3)), columns=list('ABC'))
df_dict = df.set_index(['B', 'C']).to_dict()['A']
print(df_dict)
{(33, 21): 85, (61, 46): 88, (78, 12): 48, (89, 18): 65, (91, 19): 41}
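So df_dict[(33, 21)] returns 85, but df_dict[(21, 33)] raises a KeyError.
Potential solutions
There is an SO question that covers ways to make dictionary key order irrelevant using sorted, tuple, Counter, and/or frozenset:
Multiple-key dictionary where key order doesn't matter
However, I don't see an obvious way to use those data types and functions together with the Pandas conversion methods.
My next idea was to transform the dictionary keys after converting the DataFrame.
I tried this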
new_d = {frozenset(key): value for key, value in df_dict}
but got this error
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-49-6a3244440ac2> in <module>()
----> 1 new_d = {frozenset(key): value for key, value in df_dict}
2 new_d
<ipython-input-49-6a3244440ac2> in <dictcomp>(.0)
----> 1 new_d = {frozenset(key): value for key, value in df_dict}
2 new_d
TypeError: 'int' object is not iterable
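A minimal sketch of why that comprehension fails, assuming a small hand-written dictionary with the same shape as df_dict (the values are illustrative, not taken from a real run). Iterating over a dict yields only its keys, so each 2-tuple key is unpacked into key and value, and frozenset() ends up being called on a bare int.

```python
# Hypothetical dictionary with the same shape as df_dict in the question.
df_dict = {(33, 21): 85, (61, 46): 88}

# Iterating over a dict yields its keys only. Each key here is a 2-tuple,
# so "for key, value in df_dict" unpacks the tuple itself.
first = next(iter(df_dict))
key, value = first  # key is 33, value is 21 (not the stored value 85)

# frozenset() then receives an int, which is not iterable.
try:
    frozenset(key)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Iterating over .items() yields (key, value) pairs, which is what the
# comprehension needs.
new_d = {frozenset(k): v for k, v in df_dict.items()}
```

With .items(), new_d[frozenset((21, 33))] and new_d[frozenset((33, 21))] both resolve to the same entry.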
Answer:
Why not create it directly from the df?
d = dict(zip(df[['B', 'C']].apply(frozenset, axis=1), df['A']))
d
{frozenset({72, 12}): 34, frozenset({98, 76}): 82, frozenset({67, 7}): 35, frozenset({60, 70}): 18, frozenset({8, 53}): 81}
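One caveat worth checking before relying on frozenset keys: a frozenset drops repeated elements, so a row where both key columns hold the same value collapses to a one-element key. The sketch below uses hypothetical (B, C, A) rows in place of the DataFrame and compares frozenset keys with sorted-tuple keys, one of the alternatives named in the question. The lookup helper is an illustrative name, not part of any library.

```python
# Hypothetical rows standing in for the (B, C, A) columns of the DataFrame.
rows = [(33, 21, 85), (61, 46, 88), (7, 7, 48)]

# frozenset keys: order-independent, but (7, 7) collapses to frozenset({7}).
fs_keys = {frozenset((b, c)): a for b, c, a in rows}

# Sorted-tuple keys: order-independent, and repeated values inside a key survive.
sorted_keys = {tuple(sorted((b, c))): a for b, c, a in rows}


def lookup(d, *key):
    """Normalize the lookup key the same way the stored keys were normalized."""
    return d[tuple(sorted(key))]
```

Sorted tuples require the key values to be mutually comparable, while frozensets only require them to be hashable, so which one fits depends on the data.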
Answer:
You forgot to loop over df_dict.items() instead of just df_dict ;)
>>> new_d = {frozenset(key): value for key, value in df_dict.items()}
>>> new_d
{frozenset({10, 99}): 92,
frozenset({60, 76}): 54,
frozenset({6, 20}): 31,
frozenset({36, 46}): 31,
frozenset({3, 68}): 59}
>>> new_d[frozenset({99, 10})]
92
Bonus: since having to write frozenset({...}) for every access is awful, I wrote a small wrapper class to make it easier:
>>> class Test:
...     def __init__(self, fs):
...         self.fs = fs
...     def __getitem__(self, key):
...         return self.fs[frozenset(key)]
...     def __setitem__(self, key, val):
...         self.fs[frozenset(key)] = val
...     def __repr__(self):
...         import re
...         # Show frozenset({a, b}) keys as (a, b) for readability.
...         return re.sub(r'frozenset\(\{(.+?)\}\)', r'(\1)', self.fs.__repr__())
...     __str__ = __repr__
...
>>> new_d = Test(new_d)
>>> new_d
{(10, 99): 92, (76, 60): 54, (20, 6): 31, (36, 46): 31, (3, 68): 59}
# Internally still just a dict of frozensets:
>>> new_d.fs
{frozenset({10, 99}): 92,
frozenset({60, 76}): 54,
frozenset({6, 20}): 31,
frozenset({36, 46}): 31,
frozenset({3, 68}): 59}
>>> new_d[10, 99]
92
>>> new_d[99, 10]
92
>>> new_d[99, 10] = 123456789
>>> new_d[10, 99]
123456789
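The wrapper above only supports item access and assignment. If membership tests, .get(), and len() are also needed, one possible variation is to subclass dict and normalize keys in each method. This is a sketch under that assumption, not the answerer's code; UnorderedKeyDict is a hypothetical name, and methods such as update, pop, and setdefault are left unnormalized here.

```python
class UnorderedKeyDict(dict):
    """Hypothetical dict subclass that stores every key as a frozenset."""

    def __init__(self, data=()):
        super().__init__()
        # Route every initial entry through __setitem__ so keys get normalized.
        for key, value in dict(data).items():
            self[key] = value

    def __getitem__(self, key):
        return super().__getitem__(frozenset(key))

    def __setitem__(self, key, value):
        super().__setitem__(frozenset(key), value)

    def __contains__(self, key):
        return super().__contains__(frozenset(key))

    def get(self, key, default=None):
        return super().get(frozenset(key), default)


# Illustrative values only.
d = UnorderedKeyDict({(10, 99): 92, (60, 76): 54})
d[99, 10] = 123456789  # overwrites the entry stored under (10, 99)
```

After the assignment, d[10, 99] and d[99, 10] both return the new value and len(d) is still 2, because both spellings map to the same frozenset key.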
