I have this data:
import pandas as pd
df = pd.DataFrame(data={'var1':['A','B','C', 'D'], 'var2':['something','something#else','something#else','something']})
var1 var2
0 A something
1 B something#else
2 C something#else
3 D something
I need to split each cell that contains "#" into two rows. I also need to update the var1 column so that the DataFrame ends up like this:
var1 var2
0 A something
1 B.1 something
1 B.2 else
2 C.1 something
2 C.2 else
3 D something
So far I have tried the str.split() method. I have split the strings correctly, but I don't know how to proceed to achieve the rest.
A uj5u.com user replied:
IIUC, you can do it like this:
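(df
 # split var2 on '#' into lists, then give each piece its own row
 .assign(var2=df['var2'].str.split('#'))
 .explode('var2')
 # for var1 values that now repeat, append '.1', '.2', ...
 .assign(var1=lambda d: d['var1'].mask(d['var1'].duplicated(keep=False),
                                       d['var1'] + '.' + d.groupby('var1').cumcount().add(1).astype(str)))
)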
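To see what the two helper expressions in that chain evaluate to, here is a small sketch on the same data. The names exploded, is_split and position are only illustrative and are not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({'var1': ['A', 'B', 'C', 'D'],
                   'var2': ['something', 'something#else', 'something#else', 'something']})

# Split on '#' and give every piece its own row; explode repeats the original index.
exploded = df.assign(var2=df['var2'].str.split('#')).explode('var2')

# True for every var1 value that now appears more than once (B and C here).
is_split = exploded['var1'].duplicated(keep=False)

# 1-based position of each row within its var1 group.
position = exploded.groupby('var1').cumcount().add(1)

print(is_split.tolist())   # [False, True, True, True, True, False]
print(position.tolist())   # [1, 1, 2, 1, 2, 1]
```

mask() then swaps in the suffixed label only where is_split is True, so A and D are left untouched.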
Alternative syntax:
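# split on '#' and give each piece its own row
df['var2'] = df['var2'].str.split('#')
df = df.explode('var2')
g = df.groupby('var1')['var1']
# '.1', '.2', ... within each var1 group
suffix = '.' + g.cumcount().add(1).astype(str)
# append the suffix only where the group has more than one row
df['var1'] += suffix.where(g.transform('size').gt(1), '')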
Output:
var1 var2
0 A something
1 B.1 something
1 B.2 else
2 C.1 something
2 C.2 else
3 D something
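If this is needed in more than one place, the same idea could be wrapped in a small function. This is only a sketch: split_rows and its parameters are hypothetical names, not something from the answer. It also makes one different choice, grouping by the original row index (which explode repeats) instead of by var1, on the assumption that two source rows sharing a label should not be numbered together.

```python
import pandas as pd

def split_rows(df, label_col='var1', text_col='var2', sep='#'):
    """Hypothetical helper: one row per sep-separated piece, numbering split labels."""
    out = df.assign(**{text_col: df[text_col].str.split(sep)}).explode(text_col)
    # Group by the original row index (level 0), which explode repeats per piece.
    by_row = out.groupby(level=0)
    suffix = '.' + by_row.cumcount().add(1).astype(str)
    # Only rows that came from a cell with more than one piece get a suffix.
    was_split = by_row[label_col].transform('size').gt(1)
    out[label_col] = out[label_col] + suffix.where(was_split, '')
    return out

df = pd.DataFrame({'var1': ['A', 'B', 'C', 'D'],
                   'var2': ['something', 'something#else', 'something#else', 'something']})
result = split_rows(df)
print(result)
```

The repeated index (0, 1, 1, 2, 2, 3) is kept to match the output above; call result.reset_index(drop=True) if a fresh 0..n-1 index is preferred.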
Please credit the source when reposting. Article link: https://www.uj5u.com/caozuo/443615.html
