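I want to split two columns at the commas and put the results back into the original Pandas DataFrame. I tried explode(), but it raised ValueError: cannot handle a non-unique multi-index! How can I get around this error?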
import pandas as pd
data = {'fruit_tag': {0: 'apple, organge', 1: 'watermelon', 2: 'banana', 3: 'banana', 4: 'apple, banana'}, 'location': {0: 'Hong Kong , London', 1: 'New York, Tokyo', 2: 'Singapore', 3: 'Singapore, Hong Kong', 4: 'Tokyo'}, 'rating': {0: 'bad', 1: 'good', 2: 'good', 3: 'bad', 4: 'good'}, 'measure_score': {0: 0.9529434442520142, 1: 0.952498733997345, 2: 0.9080725312232971, 3: 0.8847543001174927, 4: 0.8679852485656738}}
dt = pd.DataFrame.from_dict(data)
dt.\
set_index(['rating', 'measure_score']).\
apply(lambda x: x.str.split(',').explode())
Answer:
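When you explode a column, every new row keeps the index of the old row it came from, so the index is no longer unique. Pandas does not know how to align these duplicated indexes (and refuses to guess), because what the user wants varies from case to case, for example aligning the items in order or cross-merging them. In your case, what would you expect from the first row (index 0), where each column has 2 entries? And what about the second row (index 1), where the two columns have different numbers of entries?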
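To make the two points above concrete, here is a minimal sketch on a hypothetical toy frame (the `toy` data is an assumption, not the question's data). It shows that exploding one column repeats the old row label, and what "aligning in order" looks like when both columns are exploded together. Passing a list of columns to `explode` assumes pandas 1.3 or newer.

```python
import pandas as pd

# Hypothetical toy frame: each row has the same number of items in both columns.
toy = pd.DataFrame({
    'fruit_tag': ['apple, orange', 'banana'],
    'location': ['Hong Kong, London', 'Tokyo'],
})

# Turn the comma-separated strings into lists, column by column.
lists = toy.apply(lambda s: s.str.split(', '))

# Exploding a single column repeats the old row label for every item,
# which is why the resulting index is not unique.
single = lists['fruit_tag'].explode()
print(single.index.is_unique)  # False

# In-order alignment: explode both columns together (pandas 1.3 or newer),
# pairing the first fruit with the first location, and so on.
aligned = lists.explode(['fruit_tag', 'location'])
print(aligned)
```

This only works when the item counts match within every row. In the question's data, the row at index 1 has one fruit but two locations, so exploding both columns together should raise a ValueError there, which is exactly the ambiguity described above.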
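If you want a cross merge (every fruit paired with every location within a row), you need to explode the columns one at a time: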
def explode(x, col):
    # Split on commas (tolerating spaces around them), then give each item its own row.
    return x.assign(**{col: x[col].str.split(r'\s*,\s*')}).explode(col)
explode(explode(dt, 'fruit_tag'), 'location')
Output:
    fruit_tag   location  rating  measure_score
0       apple  Hong Kong     bad       0.952943
0       apple     London     bad       0.952943
0     organge  Hong Kong     bad       0.952943
0     organge     London     bad       0.952943
1  watermelon   New York    good       0.952499
1  watermelon      Tokyo    good       0.952499
2      banana  Singapore    good       0.908073
3      banana  Singapore     bad       0.884754
3      banana  Hong Kong     bad       0.884754
4       apple      Tokyo    good       0.867985
4      banana      Tokyo    good       0.867985
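If more than two columns need the same treatment, the nested calls can be folded into a loop. The following is only a sketch: `explode_cross` and its `sep` parameter are hypothetical names introduced here, and `ignore_index=True` assumes pandas 1.1 or newer. It also strips stray spaces, such as the one before the comma in 'Hong Kong , London', and returns a fresh unique index instead of the repeated row labels shown above.

```python
import pandas as pd

def explode_cross(df, cols, sep=','):
    """Explode each column in turn, yielding every combination per original row."""
    out = df.copy()
    for col in cols:
        # Split on the separator and give each item its own row.
        out[col] = out[col].str.split(sep)
        # ignore_index=True (pandas 1.1 or newer) renumbers rows 0..n-1.
        out = out.explode(col, ignore_index=True)
        # Trim spaces left around the separator.
        out[col] = out[col].str.strip()
    return out

# The question's data, without the measure_score column for brevity.
dt = pd.DataFrame({
    'fruit_tag': ['apple, organge', 'watermelon', 'banana', 'banana', 'apple, banana'],
    'location': ['Hong Kong , London', 'New York, Tokyo', 'Singapore',
                 'Singapore, Hong Kong', 'Tokyo'],
    'rating': ['bad', 'good', 'good', 'bad', 'good'],
})

result = explode_cross(dt, ['fruit_tag', 'location'])
print(len(result))  # 11
```

Each original row contributes (number of fruits) x (number of locations) rows, which gives 4 + 2 + 1 + 2 + 2 = 11 rows, the same count as in the output above.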
