Here is a dummy DataFrame of my data. It has category rows (indicated by a NaN value in 'Price') and data rows (indicated by a non-NaN value in 'Price').
import pandas as pd

gear = [('Baseball', None), ('Bat', 1), ('Glove', 2), ('Soccer', None), ('Shoes', 3), ('Ball', 4), ('Football', None), ('Helmet', 6)]
dummy_df = pd.DataFrame(gear, columns=['Name', 'Price'])
       Name  Price
0  Baseball    NaN
1       Bat    1.0
2     Glove    2.0
3    Soccer    NaN
4     Shoes    3.0
5      Ball    4.0
6  Football    NaN
7    Helmet    6.0
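I want to create a new column 'Sport' that applies to every row under a category row, until the next sport is reached. With the category rows dropped, the resulting DataFrame would look like this: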
     Name  Price     Sport
1     Bat    1.0  Baseball
2   Glove    2.0  Baseball
3   Shoes    3.0    Soccer
4    Ball    4.0    Soccer
5  Helmet    6.0  Football
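I am thinking of creating a new column 'Sport' that holds the value of Name if Price is NaN, else NaN, then using ffill or something similar, and then dropping the rows where Price is NaN?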
A uj5u.com user replied:
Try mask with notna, then ffill to get the correct Sport:
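# True for data rows (Price present), False for category rows
s = dummy_df['Price'].notna()
# Keep Name only on category rows, forward-fill it, then select the data rows
dummy_df.assign(Sport=dummy_df['Name'].mask(s).ffill()).loc[s]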
Output:
     Name  Price     Sport
1     Bat    1.0  Baseball
2   Glove    2.0  Baseball
4   Shoes    3.0    Soccer
5    Ball    4.0    Soccer
7  Helmet    6.0  Football
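To make the one-liner above easier to follow, here is a rough step-by-step sketch of the same mask/ffill idea. The intermediate names (is_data, categories, sport, result) are only illustrative and are not part of the original answer.

```python
import pandas as pd

gear = [('Baseball', None), ('Bat', 1), ('Glove', 2), ('Soccer', None),
        ('Shoes', 3), ('Ball', 4), ('Football', None), ('Helmet', 6)]
dummy_df = pd.DataFrame(gear, columns=['Name', 'Price'])

# True for data rows (Price present), False for category rows.
is_data = dummy_df['Price'].notna()

# Step 1: replace the Name of every data row with NaN,
# so only the category names (Baseball, Soccer, Football) remain.
categories = dummy_df['Name'].mask(is_data)

# Step 2: forward-fill, so each data row inherits the most recent category name.
sport = categories.ffill()

# Step 3: attach the new column and keep only the data rows.
result = dummy_df.assign(Sport=sport).loc[is_data]
print(result)
```

Note that assign returns a new DataFrame, so dummy_df itself is left unchanged here.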
A uj5u.com user replied:
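# Each category row (NaN Price) starts a new group; take the first Name in each group
dummy_df["Sport"] = dummy_df.groupby(dummy_df.Price.isna().cumsum()).Name.transform("first")
# Keep only the data rows
dummy_df[dummy_df.Price.notna()]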
#      Name  Price     Sport
# 1     Bat    1.0  Baseball
# 2   Glove    2.0  Baseball
# 4   Shoes    3.0    Soccer
# 5    Ball    4.0    Soccer
# 7  Helmet    6.0  Football
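The groupby version relies on the cumulative sum of the NaN flags acting as a group id. The sketch below spells that out. It assumes, as in the sample data, that the first row is a category row and that every category row comes before its data rows. The names group_id, sport and result are illustrative only.

```python
import pandas as pd

gear = [('Baseball', None), ('Bat', 1), ('Glove', 2), ('Soccer', None),
        ('Shoes', 3), ('Ball', 4), ('Football', None), ('Helmet', 6)]
dummy_df = pd.DataFrame(gear, columns=['Name', 'Price'])

# isna() is True on category rows; cumsum() increments at each of them,
# so every category row and the data rows below it share one group id.
group_id = dummy_df['Price'].isna().cumsum()
print(group_id.tolist())

# Within each group the first Name is the category name;
# transform broadcasts it back to every row of that group.
sport = dummy_df.groupby(group_id)['Name'].transform('first')

# Attach the column and drop the category rows.
result = dummy_df.assign(Sport=sport)[dummy_df['Price'].notna()]
print(result)
```

Unlike the answer above, this sketch uses assign rather than writing the 'Sport' column into dummy_df in place. That is a stylistic choice, not a requirement.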
