I have a dataset similar to the one below. Ranking takes values from 1 to 10 (1 = worst, 10 = best).

|Budget|Profit |Ranking|
|------|-------|-------|
|10000 | NaN | 8 |
|NaN | 4000 | 3 |
|5000 | 7000 | 9 |
|12000 | NaN | 8 |
|2000 | 4500 | 3 |
|5000 | 10000 | 8 |
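I want to fill the missing values in Budget and Profit with the median of the corresponding Ranking group, but I can't seem to get it right. My attempt: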
for score in range(1, 11):
    median_score = df[df['score'] == score]['budget'].median()
    index_nr = df[df['score'] == score]
    for i in index_nr:
        df.loc[i, :]['budget'].fillna(median_score)
How can I improve my code so that each missing value is filled with the median of its Ranking group?
Answer from a community member:
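First use groupby with transform to compute the median of each column within every Ranking group, then pass the result to fillna. Because transform returns a frame with the same index as df, fillna can align it row by row: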
groups_median = df.groupby('Ranking').transform('median')
df = df.fillna(groups_median)
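
To check the approach end to end, here is a minimal sketch that rebuilds the sample table from the question and applies the same groupby/transform/fillna idea. The names `value_cols`, `group_medians`, and `filled` are my own, and selecting the two columns explicitly is an assumption made so that only Budget and Profit are touched.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Budget": [10000, np.nan, 5000, 12000, 2000, 5000],
    "Profit": [np.nan, 4000, 7000, np.nan, 4500, 10000],
    "Ranking": [8, 3, 9, 8, 3, 8],
})

# Columns to fill (an illustrative choice, not part of the original answer).
value_cols = ["Budget", "Profit"]

# Median of each column within each Ranking group, broadcast back to
# the original row index so it lines up with df.
group_medians = df.groupby("Ranking")[value_cols].transform("median")

# Fill only the missing cells; existing values are left unchanged.
filled = df.copy()
filled[value_cols] = df[value_cols].fillna(group_medians)

print(filled)
```

With this data, the missing Profit values in the two Ranking 8 rows become 10000 and the missing Budget in the Ranking 3 row becomes 2000. If every value of a column is missing within a group, that group's median is itself NaN, so those cells stay unfilled.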
