I need to replace the values in the first three columns with NaN if they are >= fence_high or <= fence_low.
I have a DataFrame like this:
col1 col2 col3 fence_high fence_low
0 1 3 9 9 1.5
1 2 4 6 7 1
2 4 7 -1 6.5 0
This is what I want to achieve:
col1 col2 col3 fence_high fence_low
0 NaN 3 NaN 9 1.5
1 2 4 6 7 1
2 4 NaN NaN 6.5 0
So far I have tried df_new = df[(df < df["fence_high"]) & (df > df["fence_low"])], but that gives me all NaN.
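Answer from a uj5u.com user:
We can simply keep the values that lie strictly between fence_low and fence_high, using gt and lt with axis=0 so that the comparison stays aligned on the row index: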
df.loc[:, 'col1':'col3'] = df.loc[:, 'col1':'col3'].where(
    lambda x: x.gt(df['fence_low'], axis=0) & x.lt(df['fence_high'], axis=0)
)
df
col1 col2 col3 fence_high fence_low
0 NaN 3.0 NaN 9.0 1.5
1 2.0 4.0 6.0 7.0 1.0
2 4.0 NaN NaN 6.5 0.0
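To see why this works, it can help to look at the boolean mask that where receives. The following is a minimal sketch that rebuilds the frame from the question; the names cols, mask and result are only illustrative and do not appear in the answer above.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    [[1, 3, 9, 9, 1.5], [2, 4, 6, 7, 1], [4, 7, -1, 6.5, 0]],
    columns=["col1", "col2", "col3", "fence_high", "fence_low"],
)
cols = ["col1", "col2", "col3"]  # illustrative name

# axis=0 matches each fence Series to the rows by index label,
# so every row is compared against its own fence values.
mask = df[cols].gt(df["fence_low"], axis=0) & df[cols].lt(df["fence_high"], axis=0)
print(mask)

# where() keeps values where the mask is True and writes NaN elsewhere.
result = df[cols].where(mask)
print(result)
```

Only the cells where the mask is True survive; everything on or outside a fence becomes NaN.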
If a new DataFrame is needed instead, we can join the columns that were left out back on after the where:
new_df = df.loc[:, 'col1':'col3'].where(
    lambda x: x.gt(df['fence_low'], axis=0) & x.lt(df['fence_high'], axis=0)
).join(df[['fence_high', 'fence_low']])
new_df:
col1 col2 col3 fence_high fence_low
0 NaN 3.0 NaN 9.0 1.5
1 2.0 4.0 6.0 7.0 1.0
2 4.0 NaN NaN 6.5 0.0
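A possible variation, not taken from the answer above, is to phrase the condition the way the question states it (>= fence_high or <= fence_low) and use mask, which is the inverse of where, on a copy. This is only a sketch; the names cols and outside are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 3, 9, 9, 1.5], [2, 4, 6, 7, 1], [4, 7, -1, 6.5, 0]],
    columns=["col1", "col2", "col3", "fence_high", "fence_low"],
)
cols = ["col1", "col2", "col3"]  # illustrative name

# True where a value sits on or outside its row's fences.
outside = df[cols].ge(df["fence_high"], axis=0) | df[cols].le(df["fence_low"], axis=0)

# mask() writes NaN where the condition is True; working on a copy
# leaves the original frame and its column order untouched.
new_df = df.copy()
new_df[cols] = df[cols].mask(outside)
print(new_df)
```

Because the fence columns are never dropped here, no join is needed to restore them.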
Answer from a uj5u.com user:
One approach is to use apply.
See if this helps:
import pandas as pd
import numpy as np

cols_list = ["col1", "col2", "col3"]

def compare_val(val, high, low):
    # Values on or outside the fences become NaN.
    if val >= high or val <= low:
        return np.nan
    return val

def compare(row):
    # Check each target column against this row's own fences.
    result = []
    for i in cols_list:
        result.append(
            compare_val(val=row[i], high=row["fence_high"], low=row["fence_low"])
        )
    return pd.Series(result)

data = [[1, 3, 9, 9, 1.5], [2, 4, 6, 7, 1], [4, 7, -1, 6.5, 0]]
df = pd.DataFrame(data, columns=[*cols_list, "fence_high", "fence_low"])
print("Original:\n", df.head())
df[cols_list] = df.apply(compare, axis=1)
print("Transformed:\n", df.head())
Output:
Original:
col1 col2 col3 fence_high fence_low
0 1 3 9 9.0 1.5
1 2 4 6 7.0 1.0
2 4 7 -1 6.5 0.0
Transformed:
col1 col2 col3 fence_high fence_low
0 NaN 3.0 NaN 9.0 1.5
1 2.0 4.0 6.0 7.0 1.0
2 4.0 NaN NaN 6.5 0.0
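Since apply with axis=1 calls a Python function once per row, it may become slow on large frames. As a rough sketch of a vectorized alternative (not part of the answer above, and with illustrative names vals, low, high and inside), the same rule can be expressed with NumPy broadcasting:

```python
import numpy as np
import pandas as pd

cols_list = ["col1", "col2", "col3"]
data = [[1, 3, 9, 9, 1.5], [2, 4, 6, 7, 1], [4, 7, -1, 6.5, 0]]
df = pd.DataFrame(data, columns=[*cols_list, "fence_high", "fence_low"])

vals = df[cols_list].to_numpy(dtype=float)
# Reshape the fences to column vectors so they broadcast across each row.
low = df["fence_low"].to_numpy()[:, None]
high = df["fence_high"].to_numpy()[:, None]

# Keep values strictly between the fences, NaN everywhere else.
inside = (vals > low) & (vals < high)
df[cols_list] = pd.DataFrame(
    np.where(inside, vals, np.nan), index=df.index, columns=cols_list
)
print(df)
```

This should give the same NaN pattern as the apply version for the example data.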
