I have a pandas DataFrame like the one below (sample):
ColA MODEL_SCORE
B 300
A 400
L 500
K 600
C 400
...
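I am using np.select to get my expected output. As you can see, I have to write the conditions out by hand, even though the threshold values are already available in a list. How can I use this list to avoid writing the conditions manually? Thanks.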
l = [443.42128478674164,
488.37523204592253,
518.0823073999817,
541.0359169945577,
555.8687207507057,
567.4177820456491,
579.8827874601552,
589.7055254683078,
599.4173064672602,
606.7602443130553,
614.6608818995334,
624.0346335587483,
632.7952850129415,
641.7055745252072,
650.3578400196975,
660.2332325374314,
670.7207392073833,
685.3945990076263,
705.084106536755,
788.1550777011911]
conditions = [recent['MODEL_SCORE'] <= 443.421285,
              recent['MODEL_SCORE'] <= 488.375232,
              recent['MODEL_SCORE'] <= 518.082307,
              recent['MODEL_SCORE'] <= 541.035917,
              recent['MODEL_SCORE'] <= 555.868721,
              recent['MODEL_SCORE'] <= 567.417782,
              recent['MODEL_SCORE'] <= 579.882787,
              recent['MODEL_SCORE'] <= 589.705525,
              recent['MODEL_SCORE'] <= 599.417306,
              recent['MODEL_SCORE'] <= 606.760244,
              recent['MODEL_SCORE'] <= 614.660882,
              recent['MODEL_SCORE'] <= 624.034634,
              recent['MODEL_SCORE'] <= 632.795285,
              recent['MODEL_SCORE'] <= 641.705575,
              recent['MODEL_SCORE'] <= 650.357840,
              recent['MODEL_SCORE'] <= 660.233233,
              recent['MODEL_SCORE'] <= 670.720739,
              recent['MODEL_SCORE'] <= 685.394599,
              recent['MODEL_SCORE'] <= 705.084107,
              recent['MODEL_SCORE'] <= 788.155078]
choices = list(range(0,20))
recent['ranks'] = np.select(conditions,choices,default=99)
Expected output:
ColA MODEL_SCORE ranks
B 300 0
A 400 0
L 500 2
K 600 9
C 400 0
...
A uj5u.com user replied:
Use pd.cut with labels=False, and replace the missing values with 99:
# prepend 0 as the lower edge so the first group starts at 0
l = [0] + l
# right=False makes each bin left-closed: [low, high)
df['ranks'] = (pd.cut(df['MODEL_SCORE'], bins=l, labels=False, right=False)
                 .fillna(99)
                 .astype(int))
print(df)
ColA MODEL_SCORE ranks
0 B 300 0
1 A 400 0
2 L 500 2
3 K 600 9
4 C 40000 99
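
As a complement to the pd.cut answer, here is a minimal sketch of two other ways to reuse the list. It assumes the thresholds in l are sorted in ascending order, as they are in the question. The first option builds the np.select conditions with a list comprehension. The second uses np.searchsorted with side='left', which finds the index of the first threshold that is greater than or equal to the score, so it should reproduce the original <= conditions, including scores that sit exactly on a threshold (pd.cut with right=False would put those in the next bin). The names ranks_select, ranks_sorted and idx are illustrative only, and the sample frame includes the 40000 row from the answer to exercise the default value.

```python
import numpy as np
import pandas as pd

# Thresholds copied from the question; assumed to be sorted ascending.
l = [443.42128478674164, 488.37523204592253, 518.0823073999817,
     541.0359169945577, 555.8687207507057, 567.4177820456491,
     579.8827874601552, 589.7055254683078, 599.4173064672602,
     606.7602443130553, 614.6608818995334, 624.0346335587483,
     632.7952850129415, 641.7055745252072, 650.3578400196975,
     660.2332325374314, 670.7207392073833, 685.3945990076263,
     705.084106536755, 788.1550777011911]

# Sample frame; the 40000 row falls above every threshold.
recent = pd.DataFrame({'ColA': list('BALKC'),
                       'MODEL_SCORE': [300, 400, 500, 600, 40000]})

# Option 1: generate the np.select conditions from the list.
conditions = [recent['MODEL_SCORE'] <= t for t in l]
choices = list(range(len(l)))
ranks_select = np.select(conditions, choices, default=99)

# Option 2: side='left' gives the index of the first threshold >= score,
# which is the position of the first condition that would be True.
idx = np.searchsorted(l, recent['MODEL_SCORE'].to_numpy(), side='left')
# An index equal to len(l) means the score exceeds every threshold.
ranks_sorted = np.where(idx == len(l), 99, idx)

recent['ranks'] = ranks_sorted
print(recent)
```

Option 1 stays closest to the code in the question. Option 2 avoids building one boolean Series per threshold, which may matter for long threshold lists or large frames.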
