I have a data frame that looks like this:
id name last attribute_1_name attribute_1_rating attribute_2_name attribute_2_rating
1 Linda Smith Age 23 Hair Brown
3 Brian Lin Hair Black Job Barista
Basically, I want to reshape the table into this:
id name last attribute_name attribute_rating
1 Linda Smith Age 23
1 Linda Smith Hair Brown
3 Brian Lin Hair Black
3 Brian Lin Job Barista
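What is the most elegant and efficient way to perform this transformation in Python? Assume there are many more rows and that the attribute numbers go up to 13.
Answer from a uj5u.com user:
Assuming the attribute columns are named consistently, you can do this: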
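import pandas as pd

frames = []
for i in [1, 2]:  # use range(1, 14) when the attributes run up to 13
    attribute_name_col = f'attribute_{i}_name'
    attribute_rating_col = f'attribute_{i}_rating'
    # Keep the id columns and this attribute's name; melt its rating column.
    melted = pd.melt(
        df,
        id_vars=['id', 'name', 'last', attribute_name_col],
        value_vars=[attribute_rating_col]
    )
    # Give every slice the same column names so the slices can be stacked.
    melted = melted.rename(
        columns={attribute_name_col: 'attribute_name',
                 'value': 'attribute_rating'}
    )
    melted = melted.drop('variable', axis=1)
    frames.append(melted)
# Concatenate once at the end instead of growing a data frame in the loop.
result = pd.concat(frames)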
Here df is your original data frame. Printing result then gives:
id name last attribute_name attribute_rating
1 Linda Smith Age 23
3 Brian Lin Hair Black
1 Linda Smith Hair Brown
3 Brian Lin Job Barista
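Note that the output above is grouped by attribute number rather than by id, so it does not quite match the row order requested in the question. A minimal sketch of one way to get that order, assuming a stable sort on id is acceptable and using plain column selection instead of pd.melt (the sample df and the N_ATTRIBUTES constant below are illustrative, built from the question's data):

```python
import pandas as pd

# Sample data copied from the question; ratings are kept as strings
# because the rating columns mix numbers and text.
df = pd.DataFrame({
    "id": [1, 3],
    "name": ["Linda", "Brian"],
    "last": ["Smith", "Lin"],
    "attribute_1_name": ["Age", "Hair"],
    "attribute_1_rating": ["23", "Black"],
    "attribute_2_name": ["Hair", "Job"],
    "attribute_2_rating": ["Brown", "Barista"],
})

N_ATTRIBUTES = 2  # illustrative constant; would be 13 for the full data

parts = []
for i in range(1, N_ATTRIBUTES + 1):
    name_col = f"attribute_{i}_name"
    rating_col = f"attribute_{i}_rating"
    # Take the id columns plus the i-th name/rating pair and unify the names.
    part = df[["id", "name", "last", name_col, rating_col]].rename(
        columns={name_col: "attribute_name", rating_col: "attribute_rating"}
    )
    parts.append(part)

# mergesort is stable, so rows with the same id stay in attribute order.
result = (
    pd.concat(parts, ignore_index=True)
    .sort_values("id", kind="mergesort")
    .reset_index(drop=True)
)
print(result)
```

Sorting after the concatenation keeps the loop itself unchanged; only the final ordering step is added.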
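Answer from a uj5u.com user:
This isolates each attribute/rating pair in turn, renames the columns, and finally concatenates the resulting list of data frames: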
import pandas as pd

N_ATTRIBUTES = 2  # set to the number of attributes to use (13 in the question)

# df_name is the original data frame. "id" is selected as well so that the
# result has the columns shown in the question.
df1 = pd.concat(
    [
        df_name.loc[:, ["id", "name", "last",
                        f"attribute_{i}_name", f"attribute_{i}_rating"]]
        .rename(columns={f"attribute_{i}_name": "attribute_name",
                         f"attribute_{i}_rating": "attribute_rating"})
        for i in range(1, N_ATTRIBUTES + 1)
    ]
)
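Neither answer uses it, but pandas also ships pd.wide_to_long, which can do this reshape without an explicit loop. The sketch below is an alternative, not either poster's method. It assumes the column names are first rewritten so the number comes last (attribute_1_name becomes attribute_name_1), since wide_to_long expects a stub followed by a suffix, and it assumes that id, name and last together identify each row uniquely. The sample df, the attribute_no column and the long_df name are illustrative.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 3],
    "name": ["Linda", "Brian"],
    "last": ["Smith", "Lin"],
    "attribute_1_name": ["Age", "Hair"],
    "attribute_1_rating": ["23", "Black"],
    "attribute_2_name": ["Hair", "Job"],
    "attribute_2_rating": ["Brown", "Barista"],
})

# Move the number to the end: attribute_1_name -> attribute_name_1.
# Columns that do not match the pattern (id, name, last) are left unchanged.
renamed = df.rename(
    columns=lambda c: re.sub(r"^attribute_(\d+)_(\w+)$", r"attribute_\2_\1", c)
)

long_df = (
    pd.wide_to_long(
        renamed,
        stubnames=["attribute_name", "attribute_rating"],
        i=["id", "name", "last"],  # must uniquely identify each row
        j="attribute_no",          # illustrative name for the numeric suffix
        sep="_",
    )
    .reset_index()
    # Sort explicitly rather than rely on the order wide_to_long returns.
    .sort_values(["id", "attribute_no"])
    .reset_index(drop=True)
    [["id", "name", "last", "attribute_name", "attribute_rating"]]
)
print(long_df)
```

Because the suffix pattern matches any run of digits, the same call should cover attributes 1 through 13 without changes, as long as every attribute has both a name and a rating column.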
