My DataFrame looks like this:
category value1 value2
A. 20. 30.
B. 40. 50.
A. 60. 70.
B. 80. 90.
C. 10. 10.
D. 20. 20.
I want to create a new column whose value is either value1 or value2, depending on category. For example, if category is A, store value1; if category is B, store value2; otherwise store NaN. I expect output like this:
category value1 value2 new_col
A. 20. 30. 20.
B. 40. 50. 50.
A. 60. 70. 60.
B. 80. 90. 90.
C. 10. 10. nan
D. 20. 20. nan
How can I do this?
Answer from a uj5u.com user:
import numpy as np

# np.select takes a list of conditions and a matching list of values
df['value3'] = np.select(
    [df['category'].eq('A.'),   # condition 1
     df['category'].eq('B.')],  # condition 2
    [df['value1'],              # value when condition 1 is true
     df['value2']],             # value when condition 2 is true
    np.nan)                     # default value when no condition matches
df
category value1 value2 value3
0 A. 20.0 30.0 20.0
1 B. 40.0 50.0 50.0
2 A. 60.0 70.0 60.0
3 B. 80.0 90.0 90.0
4 C. 10.0 10.0 NaN
5 D. 20.0 20.0 NaN
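For anyone who wants to run the np.select approach end to end, here is a minimal self-contained sketch. The sample data is reconstructed from the question, on the assumption that the category labels really include the trailing dot ('A.', 'B.') as the answer's code does. The names conditions and choices are illustrative only, and the result is written to new_col, the column name the question asked for.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (labels assumed to be 'A.', 'B.', ...).
df = pd.DataFrame({
    "category": ["A.", "B.", "A.", "B.", "C.", "D."],
    "value1": [20.0, 40.0, 60.0, 80.0, 10.0, 20.0],
    "value2": [30.0, 50.0, 70.0, 90.0, 10.0, 20.0],
})

# One boolean mask per rule, paired position by position with the value to use.
conditions = [df["category"].eq("A."), df["category"].eq("B.")]
choices = [df["value1"], df["value2"]]

# Rows that match none of the conditions receive the default, NaN.
df["new_col"] = np.select(conditions, choices, default=np.nan)
print(df)
```

If the real labels have no trailing dot, the comparisons would need to use 'A' and 'B' instead.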
Answer from a uj5u.com user:
Another possible solution, based on numpy.where:
df['new_col'] = np.where(
    df.category.eq('A.'), df.value1,
    np.where(df.category.eq('B.'), df.value2, np.nan))
Output:
category value1 value2 new_col
0 A. 20.0 30.0 20.0
1 B. 40.0 50.0 50.0
2 A. 60.0 70.0 60.0
3 B. 80.0 90.0 90.0
4 C. 10.0 10.0 NaN
5 D. 20.0 20.0 NaN
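The nested np.where call can be easier to follow when it is split into two steps. The sketch below does this on sample data reconstructed from the question (labels assumed to include the trailing dot). The names is_a, is_b and fallback are hypothetical helpers introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (labels assumed to be 'A.', 'B.', ...).
df = pd.DataFrame({
    "category": ["A.", "B.", "A.", "B.", "C.", "D."],
    "value1": [20.0, 40.0, 60.0, 80.0, 10.0, 20.0],
    "value2": [30.0, 50.0, 70.0, 90.0, 10.0, 20.0],
})

is_a = df["category"].eq("A.")
is_b = df["category"].eq("B.")

# Inner step: value2 on the B rows, NaN on every other row.
fallback = np.where(is_b, df["value2"], np.nan)

# Outer step: value1 on the A rows, otherwise the result of the inner step.
df["new_col"] = np.where(is_a, df["value1"], fallback)
print(df)
```

With only two rules the one-line nested form is fine. Each additional rule adds another level of nesting, which is where np.select tends to read better.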
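Another possible solution, based on pandas.DataFrame.update:
# Start with value1 on the A rows; every other row becomes NaN through index alignment
df['new_col'] = df.value1.loc[df.category.eq('A.')]
# Fill the B rows in place; the Series name selects which column is updated.
# (Calling update on df['new_col'] itself may not write back to df under Copy-on-Write.)
df.update(df.value2.loc[df.category.eq('B.')].rename('new_col'))
Output: same as above.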
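A related way to get the same result is plain boolean-mask assignment with .loc. This is a different technique from update, shown only as a sketch: the sample data is reconstructed from the question (labels assumed to include the trailing dot), and is_a and is_b are hypothetical names.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (labels assumed to be 'A.', 'B.', ...).
df = pd.DataFrame({
    "category": ["A.", "B.", "A.", "B.", "C.", "D."],
    "value1": [20.0, 40.0, 60.0, 80.0, 10.0, 20.0],
    "value2": [30.0, 50.0, 70.0, 90.0, 10.0, 20.0],
})

is_a = df["category"].eq("A.")
is_b = df["category"].eq("B.")

# Start with NaN everywhere, then overwrite only the rows each rule covers.
df["new_col"] = np.nan
df.loc[is_a, "new_col"] = df.loc[is_a, "value1"]
df.loc[is_b, "new_col"] = df.loc[is_b, "value2"]
print(df)
```

Rows in categories other than A and B are never touched, so they keep the initial NaN.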
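Answer from a uj5u.com user:
One option is case_when from pyjanitor:
# pip install pyjanitor
import numpy as np
import pandas as pd
import janitor  # importing janitor registers case_when as a DataFrame method

(df
 .case_when(
     df.category.eq('A.'), df.value1,  # condition, result
     df.category.eq('B.'), df.value2,
     np.nan,  # default; newer pyjanitor releases may expect default=np.nan instead
     column_name='new_col')
)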
category value1 value2 new_col
0 A. 20.0 30.0 20.0
1 B. 40.0 50.0 50.0
2 A. 60.0 70.0 60.0
3 B. 80.0 90.0 90.0
4 C. 10.0 10.0 NaN
5 D. 20.0 20.0 NaN
