How to filter rows by cell content in a row-wise expression

I read some data from a file. Because of the 'xxx' in the first data row, the first column was assigned the "object" dtype:

import pandas as pd

tips = pd.read_csv("tips.csv")
print(tips.head())
print(tips.info())

total_bill   tip     sex smoker  day    time  size    
0        xxx  1.01  Female     No  Sun  Dinner     2    
1      10.34  1.66    Male     No  Sun  Dinner     3    
2      21.01  3.50    Male     No  Sun  Dinner     3    
3      23.68  3.31    Male     No  Sun  Dinner     2    
4      24.59  3.61  Female     No  Sun  Dinner     4    
<class 'pandas.core.frame.DataFrame'>    
RangeIndex: 244 entries, 0 to 243    
Data columns (total 7 columns):    
 #   Column      Non-Null Count  Dtype      
---  ------      --------------  -----      
 0   total_bill  244 non-null    object     
 1   tip         244 non-null    float64    
 2   sex         244 non-null    object     
 3   smoker      244 non-null    object     
 4   day         244 non-null    object     
 5   time        244 non-null    object     
 6   size        244 non-null    int64

As a result, the following line fails, because the first data row holds 'xxx' where a number should be:

tips['tip_pct'] = tips['tip'] / (tips['total_bill'] - tips['tip'])

How can I rewrite the line above so that it filters out the bad rows without actually changing the contents of the DataFrame?
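For readers who want to reproduce the failure without the original file, here is a minimal sketch. The in-memory CSV text is a hypothetical stand-in for tips.csv, trimmed to three rows; only the 'xxx' cell matters.

```python
import io
import pandas as pd

# Hypothetical stand-in for tips.csv: same columns, first total_bill is the bad value.
csv_text = """total_bill,tip,sex,smoker,day,time,size
xxx,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.50,Male,No,Sun,Dinner,3
"""
tips = pd.read_csv(io.StringIO(csv_text))

# One non-numeric cell keeps the whole column as strings
# (object dtype here; newer pandas versions may report a string dtype).
print(tips['total_bill'].dtype)

try:
    tips['total_bill'] - tips['tip']
    failed = False
except TypeError as exc:
    # Subtracting a float from a string is not defined.
    failed = True
    print("TypeError:", exc)
```

The arithmetic fails for every row, not only the first one, because all values in the column are stored as strings once a single cell cannot be parsed.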

Answer from a uj5u.com user:

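You can apply pd.to_numeric with errors='coerce' to the column that contains 'xxx'. This turns any value that cannot be parsed as a number into NaN in the returned Series, so the calculation goes through while the DataFrame's own column stays unchanged: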

tips['tip_pct'] = tips['tip'] / (pd.to_numeric(tips['total_bill'],errors='coerce') - tips['tip'])

   total_bill   tip     sex smoker  day    time  size   tip_pct
0        xxx  1.01  Female     No  Sun  Dinner     2       NaN
1      10.34  1.66    Male     No  Sun  Dinner     3  0.191244
2      21.01  3.50    Male     No  Sun  Dinner     3  0.199886
3      23.68  3.31    Male     No  Sun  Dinner     2  0.162494
4      24.59  3.61  Female     No  Sun  Dinner     4  0.172069
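A self-contained sketch of the same idea, assuming a small hand-built DataFrame in place of tips.csv (the sample values are copied from the question; the intermediate name `bill` is only for illustration). It also checks that the original column still holds the strings afterwards.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of the question's data.
tips = pd.DataFrame({
    'total_bill': ['xxx', '10.34', '21.01'],
    'tip': [1.01, 1.66, 3.50],
})

# Non-numeric strings become NaN in this temporary Series only.
bill = pd.to_numeric(tips['total_bill'], errors='coerce')
tips['tip_pct'] = tips['tip'] / (bill - tips['tip'])

print(tips)
# The source column still holds the original strings.
print(tips['total_bill'].tolist())
```

Because NaN propagates through arithmetic, the bad row simply ends up with NaN in tip_pct instead of raising an error.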

Answer from a uj5u.com user:

Another way: build a boolean mask that excludes the bad rows, cast total_bill to float on the remaining rows, and compute:

m = tips['total_bill'] != 'xxx'
tips['tip_pct'] = tips.loc[m, 'tip'] / (tips.loc[m, 'total_bill'].astype(float) - tips.loc[m, 'tip'])

   total_bill   tip     sex smoker  day    time  size   tip_pct
0        xxx  1.01  Female     No  Sun  Dinner     2       NaN
1      10.34  1.66    Male     No  Sun  Dinner     3  0.191244
2      21.01  3.50    Male     No  Sun  Dinner     3  0.199886
3      23.68  3.31    Male     No  Sun  Dinner     2  0.162494
4      24.59  3.61  Female     No  Sun  Dinner     4  0.172069
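The mask above only recognizes the literal string 'xxx'. As a hedged variation, the sketch below derives the mask from pd.to_numeric instead, so any unparseable value is excluded. The sample DataFrame and the second bad value 'n/a' are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical sample with two different kinds of bad cells.
tips = pd.DataFrame({
    'total_bill': ['xxx', '10.34', 'n/a', '23.68'],
    'tip': [1.01, 1.66, 3.50, 3.31],
})

# Convert once; the mask is True where total_bill parsed as a number.
bill = pd.to_numeric(tips['total_bill'], errors='coerce')
m = bill.notna()

# Compute only on the good rows; assignment aligns on the index,
# so rows outside the mask are left as NaN.
tips['tip_pct'] = tips.loc[m, 'tip'] / (bill[m] - tips.loc[m, 'tip'])

# A filtered copy of the good rows; tips itself keeps every row.
good = tips[m]
print(good)
```

The same mask can be reused to select only the valid rows, as `good` does here, which is the closest match to "filtering out" the bad rows without touching the stored data.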

Answer from a uj5u.com user:

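Handle it in read_csv. Note that forcing dtype=np.float64 on total_bill by itself raises a ValueError at the 'xxx' value, so the placeholder also has to be declared as a missing-value marker with na_values. Unlike the approaches above, this loads 'xxx' as NaN, so the DataFrame no longer contains the original string:

import numpy as np
import pandas as pd

# Treat 'xxx' as missing so total_bill can be parsed as float64.
tips = pd.read_csv('tips.csv',
    na_values=['xxx'],
    dtype={'total_bill': np.float64})

tips['tip_pct'] = tips['tip'] / (tips['total_bill'] - tips['tip'])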
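A minimal sketch for checking how read_csv behaves on such a column, assuming an in-memory CSV in place of tips.csv. It compares forcing a float dtype alone with declaring 'xxx' as a missing-value marker through the na_values parameter.

```python
import io
import pandas as pd

# Hypothetical stand-in for tips.csv, reduced to the two relevant columns.
csv_text = """total_bill,tip
xxx,1.01
10.34,1.66
21.01,3.50
"""

# Forcing a float dtype alone fails on the non-numeric cell.
try:
    pd.read_csv(io.StringIO(csv_text), dtype={'total_bill': 'float64'})
    dtype_alone_ok = True
except ValueError as exc:
    dtype_alone_ok = False
    print("ValueError:", exc)

# Declaring 'xxx' as a missing-value marker lets the column parse as float64.
tips = pd.read_csv(io.StringIO(csv_text), na_values=['xxx'])
tips['tip_pct'] = tips['tip'] / (tips['total_bill'] - tips['tip'])
print(tips.dtypes)
```

The trade-off is that the loaded DataFrame stores NaN rather than 'xxx', so this route fits only when keeping the original string is not required.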

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qukuanlian/397246.html

Tags: Python, python-3.x, pandas
