I have a DataFrame like this:
import pandas as pd
data = [['Ma', 1, 'too'], ['Ma', 1, 'taa'], ['Ma', 1, 'tuu'], ['Ga', 2, 'too'], ['Ga', 2, 'taa'], ['Ga', 2, 'tuu']]
df = pd.DataFrame(data, columns=['NAME', 'ID', 'SUBTYPE'])
NAME ID SUBTYPE
Ma 1 too
Ma 1 taa
Ma 1 tuu
Ga 2 too
Ga 2 taa
Ga 2 tuu
The NAME and ID values repeat, while the SUBTYPE values differ.
I would like a list like this:
Ma-1-[too,taa,tuu],Ga-2-[too,taa,tuu]
Edit: NAME and ID should always be the same.
A uj5u.com user replied:
The usual way to do this in Python is with a dictionary, because its keys cannot be duplicated.
# Combine the NAME and ID columns so they can be used together as a single key.
df["NAMEID"] = df["NAME"] + "-" + df["ID"].astype(str)
# Convert the fields we need to plain Python lists.
name_id_list = df["NAMEID"].tolist()
subtype_list = df["SUBTYPE"].tolist()
# Loop through both lists in parallel by zipping them together.
results_dict = {}
for name_id, subtype in zip(name_id_list, subtype_list):
    if name_id in results_dict:
        # The key already exists, so append to its list.
        results_dict[name_id].append(subtype)
    else:
        # The key does not exist yet, so start a new list for it.
        results_dict[name_id] = [subtype]
The resulting dictionary ends up looking like:
{'Ma-1': ['too', 'taa', 'tuu'], 'Ga-2': ['too', 'taa', 'tuu']}
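As an alternative to the manual loop, the same grouping could probably be done with pandas' own groupby, and the result formatted into the strings the question asks for. This is only a sketch and not part of the original answer: it assumes the column is named ID (as in the printed table), and the names grouped and result are made up for illustration.

```python
import pandas as pd

data = [['Ma', 1, 'too'], ['Ma', 1, 'taa'], ['Ma', 1, 'tuu'],
        ['Ga', 2, 'too'], ['Ga', 2, 'taa'], ['Ga', 2, 'tuu']]
df = pd.DataFrame(data, columns=['NAME', 'ID', 'SUBTYPE'])

# Collect the SUBTYPE values of each (NAME, ID) pair into a list.
# sort=False keeps the groups in order of first appearance (Ma before Ga).
grouped = df.groupby(['NAME', 'ID'], sort=False)['SUBTYPE'].agg(list)

# Format each group as NAME-ID-[subtype,subtype,...].
result = [
    f"{name}-{id_}-[{','.join(subtypes)}]"
    for (name, id_), subtypes in grouped.items()
]

print(result)
# ['Ma-1-[too,taa,tuu]', 'Ga-2-[too,taa,tuu]']
```

If the dictionary form from the answer above is preferred, grouped.to_dict() gives a similar mapping, keyed by (NAME, ID) tuples rather than by the joined string.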
