Given the following data sample:
          date  value1    value2    value3
0   2021-10-12   1.015  1.115668  1.015000
1   2021-10-13     NaN  1.104622  1.030225
2   2021-10-14     NaN  1.093685       NaN
3   2021-10-15   1.015  1.082857       NaN
4   2021-10-16   1.015  1.072135  1.077284
5   2021-10-29   1.015  1.061520  1.093443
6   2021-10-30   1.015  1.051010  1.109845
7   2021-10-31   1.015       NaN  1.126493
8    2021-11-1   1.015       NaN       NaN
9    2021-11-2   1.015  1.020100       NaN
10   2021-11-3     NaN  1.010000       NaN
11  2021-11-30   1.015  1.000000       NaN
Suppose I want to drop every column whose values are all NaN in November 2021, that is, in the range 2021-11-01 to 2021-11-30 (start and end dates both included).
Under this requirement, value3 would be dropped because all of its values in 2021-11 are NaN. The other columns have some NaNs in 2021-11, but not all, so they would be kept.
How can I do this in pandas? Thanks.
Edit:
import pandas as pd
df['date'] = pd.to_datetime(df['date'])
# Boolean mask for rows dated 2021-11-01 through 2021-11-30 (inclusive)
mask = (df['date'] >= '2021-11-01') & (df['date'] <= '2021-11-30')
df.loc[mask]
Out:
         date  value1  value2  value3
8  2021-11-01   1.015     NaN     NaN
9  2021-11-02   1.015  1.0201     NaN
10 2021-11-03     NaN  1.0100     NaN
11 2021-11-30   1.015  1.0000     NaN
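Answer from a uj5u.com user:
You can filter the rows for November 2021 and then test whether each column is NaN in all of those rows:
df['date'] = pd.to_datetime(df['date'])
# Keep only the columns that are not all-NaN within November 2021
df = df.loc[:, ~df[df['date'].dt.to_period('M') == pd.Period('2021-11')].isna().all()]
Or:
df['date'] = pd.to_datetime(df['date'])
# Equivalent: keep the columns with at least one non-NaN value in November 2021
df = df.loc[:, df[df['date'].dt.to_period('M') == pd.Period('2021-11')].notna().any()]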
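For anyone who wants to try this end to end, here is a minimal, self-contained sketch that rebuilds the sample data and applies the same period-based filter. The names `in_nov`, `all_nan_in_nov`, and `result` are only illustrative and do not come from the answer, and the dates are written zero-padded for simplicity.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    "date": ["2021-10-12", "2021-10-13", "2021-10-14", "2021-10-15",
             "2021-10-16", "2021-10-29", "2021-10-30", "2021-10-31",
             "2021-11-01", "2021-11-02", "2021-11-03", "2021-11-30"],
    "value1": [1.015, nan, nan, 1.015, 1.015, 1.015,
               1.015, 1.015, 1.015, 1.015, nan, 1.015],
    "value2": [1.115668, 1.104622, 1.093685, 1.082857, 1.072135, 1.061520,
               1.051010, nan, nan, 1.020100, 1.010000, 1.000000],
    "value3": [1.015000, 1.030225, nan, nan, 1.077284, 1.093443,
               1.109845, 1.126493, nan, nan, nan, nan],
})
df["date"] = pd.to_datetime(df["date"])

# Row mask: True for rows whose date falls in November 2021
in_nov = df["date"].dt.to_period("M") == pd.Period("2021-11")

# Column mask: True for columns that are NaN in every November row
all_nan_in_nov = df.loc[in_nov].isna().all()

# Keep all rows, but only the columns that are not flagged
result = df.loc[:, ~all_nan_in_nov]
print(result.columns.tolist())
```

Splitting the one-liner into a row mask and a column mask makes it easier to inspect each step, for example by printing `all_nan_in_nov` before dropping anything.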
Edit: the same idea with the explicit date-range mask from the question:
# Boolean mask for rows dated 2021-11-01 through 2021-11-30 (inclusive)
mask = (df['date'] >= '2021-11-01') & (df['date'] <= '2021-11-30')
# Keep the columns that have at least one non-NaN value in the masked rows
df = df.loc[:, df.loc[mask].notna().any()]
Out:
         date  value1    value2
0  2021-10-12   1.015  1.115668
1  2021-10-13     NaN  1.104622
2  2021-10-14     NaN  1.093685
3  2021-10-15   1.015  1.082857
4  2021-10-16   1.015  1.072135
5  2021-10-29   1.015  1.061520
6  2021-10-30   1.015  1.051010
7  2021-10-31   1.015       NaN
8  2021-11-01   1.015       NaN
9  2021-11-02   1.015  1.020100
10 2021-11-03     NaN  1.010000
11 2021-11-30   1.015  1.000000
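If the date range changes often, the mask-based variant could be wrapped in a small helper. This is only a sketch: the function name `drop_all_nan_columns_in_range` and its parameters are hypothetical, and it uses `Series.between`, which includes both endpoints by default, in place of the two comparisons.

```python
import numpy as np
import pandas as pd


def drop_all_nan_columns_in_range(df, date_col, start, end):
    """Return df without the columns that are all-NaN between start and end (inclusive)."""
    dates = pd.to_datetime(df[date_col])
    # True for rows whose date lies in [start, end]
    in_range = dates.between(pd.Timestamp(start), pd.Timestamp(end))
    # True for columns with at least one non-NaN value in those rows
    has_data = df.loc[in_range].notna().any()
    return df.loc[:, has_data]


# Small illustrative frame (not the data from the question)
demo = pd.DataFrame({
    "date": ["2021-10-31", "2021-11-01", "2021-11-30"],
    "a": [1.0, np.nan, np.nan],   # all-NaN in November, so it is dropped
    "b": [np.nan, 2.0, np.nan],   # has one November value, so it is kept
})
out = drop_all_nan_columns_in_range(demo, "date", "2021-11-01", "2021-11-30")
print(out.columns.tolist())
```

One caveat with this approach: if no rows fall inside the range, `any()` is False for every column, so every column would be dropped. A guard for that case may be worth adding, depending on your data.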
Edit: if you need to manually exclude some columns from this check (here an extra all-NaN column, value4), use:
import numpy as np
df = df.assign(value4=np.nan)
print(df)
          date  value1    value2    value3  value4
0   2021-10-12   1.015  1.115668  1.015000     NaN
1   2021-10-13     NaN  1.104622  1.030225     NaN
2   2021-10-14     NaN  1.093685       NaN     NaN
3   2021-10-15   1.015  1.082857       NaN     NaN
4   2021-10-16   1.015  1.072135  1.077284     NaN
5   2021-10-29   1.015  1.061520  1.093443     NaN
6   2021-10-30   1.015  1.051010  1.109845     NaN
7   2021-10-31   1.015       NaN  1.126493     NaN
8    2021-11-1   1.015       NaN       NaN     NaN
9    2021-11-2   1.015  1.020100       NaN     NaN
10   2021-11-3     NaN  1.010000       NaN     NaN
11  2021-11-30   1.015  1.000000       NaN     NaN
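df['date'] = pd.to_datetime(df['date'])
# True for columns that are all-NaN within November 2021
m = df[df['date'].dt.to_period('M') == pd.Period('2021-11')].isna().all()
# Manually exempt value4 so it is kept even though it is all-NaN
m.loc['value4'] = False
print(m)
date      False
value1    False
value2    False
value3     True
value4    False
dtype: bool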
# Drop only the columns still flagged in m
df = df.loc[:, ~m]
print(df)
         date  value1    value2  value4
0  2021-10-12   1.015  1.115668     NaN
1  2021-10-13     NaN  1.104622     NaN
2  2021-10-14     NaN  1.093685     NaN
3  2021-10-15   1.015  1.082857     NaN
4  2021-10-16   1.015  1.072135     NaN
5  2021-10-29   1.015  1.061520     NaN
6  2021-10-30   1.015  1.051010     NaN
7  2021-10-31   1.015       NaN     NaN
8  2021-11-01   1.015       NaN     NaN
9  2021-11-02   1.015  1.020100     NaN
10 2021-11-03     NaN  1.010000     NaN
11 2021-11-30   1.015  1.000000     NaN
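When several columns need to be exempted, setting them one at a time gets tedious. One possible variation, sketched below on a small made-up frame, clears the flag for a whole list at once with `Index.isin`. The `protected` list and the other variable names are assumptions for illustration and are not part of the answer.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (a cut-down stand-in for the data above)
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-10-31", "2021-11-01", "2021-11-02"]),
    "value2": [np.nan, np.nan, 1.0201],
    "value3": [1.126493, np.nan, np.nan],
    "value4": [np.nan, np.nan, np.nan],
})

protected = ["value4"]  # hypothetical list of columns that must never be dropped

in_nov = df["date"].dt.to_period("M") == pd.Period("2021-11")
# True for columns that are all-NaN within November 2021
drop = df.loc[in_nov].isna().all()
# Clear the flag for every protected column in one step
drop = drop & ~drop.index.isin(protected)

result = df.loc[:, ~drop]
print(result.columns.tolist())
```

Unlike `m.loc['value4'] = False`, the `isin` form silently ignores names in `protected` that are not columns of `df`, which may or may not be what you want.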
