My DataFrame looks like this:
county_name state year rank county_population city_population
31 Fairfax County Virginia 2010.0 0.0 1086730.0 60300
32 Fairfax County Virginia 2011.0 0.0 1099603.0 60300
33 Fairfax County Virginia 2013.0 0.0 1130364.0 60300
34 Fairfax County Virginia 2014.0 0.0 1138123.0 60300
35 Fairfax County Virginia 2015.0 0.0 1142245.0 60300
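I want to insert the missing row for 2012 and set its rank to 7. For the county and city population values, I want to take the average of the previous and next rows (2011 and 2013) and fill the new row with those values.
Any pointers would be greatly appreciated.
Edit 1: The expected DataFrame should be: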
county_name state year rank county_population city_population
31 Fairfax County Virginia 2010.0 0.0 1086730.0 60300
32 Fairfax County Virginia 2011.0 0.0 1099603.0 60300
33 Fairfax County Virginia 2012.0 7.0 1114984.0 60300
34 Fairfax County Virginia 2013.0 0.0 1130364.0 60300
35 Fairfax County Virginia 2014.0 0.0 1138123.0 60300
36 Fairfax County Virginia 2015.0 0.0 1142245.0 60300
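Reply from a uj5u.com user:
Create a new DataFrame holding the missing row, combine it with the original, sort by year, and interpolate the missing values: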
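import numpy as np
import pandas as pd
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
data = [['Fairfax County', 'Virginia', 2012, 7, np.nan, np.nan]]
out = pd.concat([df, pd.DataFrame(data, columns=df.columns)]).sort_values('year')
pop_cols = ['county_population', 'city_population']
out[pop_cols] = out[pop_cols].interpolate()  # linear fill of the NaN populations
print(out)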
Output:
>>> out
county_name state year rank county_population city_population
31 Fairfax County Virginia 2010 0.0 1086730.0 60300.0
32 Fairfax County Virginia 2011 0.0 1099603.0 60300.0
0 Fairfax County Virginia 2012 7.0 1114983.5 60300.0
33 Fairfax County Virginia 2013 0.0 1130364.0 60300.0
34 Fairfax County Virginia 2014 0.0 1138123.0 60300.0
35 Fairfax County Virginia 2015 0.0 1142245.0 60300.0
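Note that the interpolated value is 1114983.5, while the expected frame in the question shows 1114984.0, and the new row keeps index 0 instead of continuing the original numbering. If that matters, one possible variation is to reindex on the full range of years rather than building the row by hand. The following is only a sketch under a few assumptions: `fill_missing_years` is a hypothetical helper name, the sample frame is rebuilt from the question, and the text columns are assumed constant so they can be forward-filled.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (index 31..35, year 2012 missing).
df = pd.DataFrame(
    {
        "county_name": ["Fairfax County"] * 5,
        "state": ["Virginia"] * 5,
        "year": [2010.0, 2011.0, 2013.0, 2014.0, 2015.0],
        "rank": [0.0] * 5,
        "county_population": [1086730.0, 1099603.0, 1130364.0, 1138123.0, 1142245.0],
        "city_population": [60300] * 5,
    },
    index=range(31, 36),
)


def fill_missing_years(frame, rank_for_new_rows=7.0):
    """Hypothetical helper: add a row for each missing year and fill it in."""
    years = pd.Index(
        np.arange(frame["year"].min(), frame["year"].max() + 1), name="year"
    )
    # Reindexing on the full year range creates all-NaN rows for missing years.
    out = frame.set_index("year").reindex(years)
    # Rows whose year was absent from the input get the requested rank.
    new_rows = ~out.index.isin(frame["year"])
    out.loc[new_rows, "rank"] = rank_for_new_rows
    # Assumption: text columns are constant, so copy them from the row above.
    text_cols = ["county_name", "state"]
    out[text_cols] = out[text_cols].ffill()
    # Linear interpolation; for a single gap this is the mean of both neighbours.
    pop_cols = ["county_population", "city_population"]
    out[pop_cols] = out[pop_cols].interpolate()
    out = out.reset_index()[list(frame.columns)]
    # Continue the original integer index (31, 32, ...) as in the expected frame.
    start = int(frame.index[0])
    out.index = range(start, start + len(out))
    return out


result = fill_missing_years(df)
print(result)
```

With this approach the 2012 row lands at index 33, and any other gap in the year range would be filled the same way. Calling `.round()` on `county_population` afterwards should give the 1114984.0 shown in the expected frame.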
