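I know it is easy to check how many missing values a pandas Series has. What if I want to check whether a pandas Series has more than 6 consecutive missing-value entries?
Answer from a uj5u.com user: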
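# Boolean mask marking the missing entries in column i of temp_df
mask = temp_df.loc[:, i].isna()
# (~mask).cumsum() is constant within each run of NaNs, so it labels the runs;
# grouping the NaN rows by that label and taking the size gives each run's length
max_missing_val = temp_df.loc[:, i][mask].groupby((~mask).cumsum()[mask]).agg(['size'])
if len(max_missing_val) == 0:
    # no missing values at all
    max_missing_val = 0
else:
    # length of the longest run of consecutive NaNs
    max_missing_val = max_missing_val.max().iloc[0]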
Reference: Counting consecutive NaN values in a pandas time series
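The snippet above can be wrapped in a small helper that answers the original question directly. This is only a minimal sketch: the function name `longest_nan_run` and the sample Series are hypothetical and not part of the answer, and it uses `value_counts` in place of the `groupby(...).agg(['size'])` step.

```python
import numpy as np
import pandas as pd


def longest_nan_run(s: pd.Series) -> int:
    """Return the length of the longest run of consecutive missing values."""
    mask = s.isna()
    if not mask.any():
        # no missing values at all
        return 0
    # (~mask).cumsum() only increases on non-missing rows, so every NaN
    # in the same run carries the same label
    run_ids = (~mask).cumsum()[mask]
    # the most frequent label belongs to the longest run
    return int(run_ids.value_counts().max())


# hypothetical sample data: one run of 7 NaNs and one run of 2 NaNs
s = pd.Series([1.0] + [np.nan] * 7 + [2.0, np.nan, np.nan])
print(longest_nan_run(s) > 6)
```

Comparing the result with `> 6` gives the yes/no answer the question asks for; use `>= 6` instead if a run of exactly 6 should also count.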
Answer from a uj5u.com user:
You can use cumsum to create groups of consecutive NaN values:
import numpy as np
import pandas as pd

s = pd.Series(
    [np.nan, 1, 2, np.nan, np.nan, np.nan, 3, 4, np.nan, np.nan]*2
)
# create groups of consecutive NaN / non-NaN values
group = s.isna().ne(s.shift().isna()).cumsum()
# set threshold for minimum group size, here 3 instead of 6
threshold = 3
group_size = s.groupby(group).transform('size')
# select rows in runs of at least `threshold` consecutive NaN values;
# NaN runs always get even group ids because s.shift().isna() is True at position 0
print(s[(group % 2 == 0) & (group_size.ge(threshold))])
# output
3    NaN
4    NaN
5    NaN
8    NaN
9    NaN
10   NaN
13   NaN
14   NaN
15   NaN
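To turn the grouping idea into a single True/False answer, the same steps could be reduced with `.any()`. The following is a rough sketch under stated assumptions: the helper name `has_nan_run_longer_than` is hypothetical, and it tests `s.isna()` directly rather than relying on the parity of the group id.

```python
import numpy as np
import pandas as pd


def has_nan_run_longer_than(s: pd.Series, n: int) -> bool:
    """Return True if s contains more than n consecutive missing values."""
    is_na = s.isna()
    # a new group starts whenever the NaN / non-NaN status changes
    group = is_na.ne(is_na.shift(fill_value=False)).cumsum()
    # size of the run each row belongs to
    group_size = is_na.groupby(group).transform('size')
    # keep only rows that are NaN and sit in a run longer than n
    return bool((is_na & group_size.gt(n)).any())


# same sample data as in the answer: the longest NaN run has length 3
s = pd.Series(
    [np.nan, 1, 2, np.nan, np.nan, np.nan, 3, 4, np.nan, np.nan] * 2
)
print(has_nan_run_longer_than(s, 2))
print(has_nan_run_longer_than(s, 6))
```

For the question as asked, `has_nan_run_longer_than(series, 6)` would be the call to make, since "more than 6" means a run of at least 7.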
Please credit the source when reposting. Article link: https://www.uj5u.com/qita/347967.html
Tags: pandas
