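I am using pandas and trying to interpolate a missing value after removing a non-numeric entry. However, isna().sum() still reports one NA value afterwards. A fuller explanation follows.
The input .csv file can be found here.
Here is what I did: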
#Import modules
import pandas as pd
import numpy as np
#Import data
df = pd.read_csv('example.csv')
df.isna().sum() #Shows no NA values, but I know that one of them is not numeric.
pd.to_numeric(df['example'])
This produces the following error, which indicates that the entry at position 949 needs to be removed:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
File ~libs\lib.pyx:2315, in pandas._libs.lib.maybe_convert_numeric()
ValueError: Unable to parse string "asdf"
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
Input In [111], in <cell line: 3>()
1 df1 = pd.read_csv('example.csv')
2 df1.isna().sum()
----> 3 pd.to_numeric(df1['example'])
File ~numeric.py:184, in to_numeric(arg, errors, downcast)
182 coerce_numeric = errors not in ("ignore", "raise")
183 try:
--> 184 values, _ = lib.maybe_convert_numeric(
185 values, set(), coerce_numeric=coerce_numeric
186 )
187 except (ValueError, TypeError):
188 if errors == "raise":
File ~libs\lib.pyx:2357, in pandas._libs.lib.maybe_convert_numeric()
ValueError: Unable to parse string "asdf" at position 949
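To see which entries fail to parse without triggering the exception, one option is to coerce the column and compare it with the original. This is only a minimal sketch: the small Series below is a hypothetical stand-in for the 'example' column, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for the 'example' column read from example.csv
s = pd.Series(["1.0", "2.5", "asdf", "4.0"], name="example")

# errors="coerce" turns unparseable strings into NaN instead of raising
converted = pd.to_numeric(s, errors="coerce")

# Entries that were present originally but could not be parsed as numbers
bad = s[converted.isna() & s.notna()]
print(bad)
```

The index of `bad` gives the positions of the offending rows, so they can be inspected before deciding how to handle them.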
Here is my attempt to remove this value and interpolate a new one in its place:
idx_missing = df== 'asdf'
df[idx_missing] = np.nan
df['example'].isnull().sum() #This line confirms that there is one value missing
#Perform interpolation with a linear method
df1.iloc[:, -1] = df.iloc[:, -1].interpolate(method='linear') #Specifying the last column in the dataframe with the 'iloc' command
df1.isna().sum()
Evidently, there is still one missing value, and it was not interpolated:
example 1
dtype: int64
How can I interpolate this value correctly?
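A likely cause, judging from the code above, is that the column is still stored as text after 'asdf' is replaced with NaN, and linear interpolation only fills numeric data. The attempt also sets the NaN in df but checks df1, which was read separately. The sketch below uses a small hypothetical frame (not the real file) and converts the column before interpolating:

```python
import numpy as np
import pandas as pd

# Hypothetical data; dtype=object mimics a CSV column that contains a stray string
df = pd.DataFrame(
    {"example": pd.Series(["1.0", "2.0", "asdf", "4.0"], dtype=object)}
)

# Mark the bad entry as missing (equivalent in effect to the masking step above)
df.loc[df["example"] == "asdf", "example"] = np.nan
print(df["example"].dtype)  # the column is not numeric yet at this point

# Convert to a numeric dtype first, then interpolate on the same frame
df["example"] = pd.to_numeric(df["example"])
df["example"] = df["example"].interpolate(method="linear")
print(df)
```

Working on a single frame throughout avoids checking a different object from the one that was modified.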
A uj5u.com user replied:
If you first find and replace any values that are not numeric, that should solve your problem.
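#Import modules
import pandas as pd
import numpy as np
#Import data
df = pd.read_csv('example.csv')
#Replace every entry that contains a non-digit character with NaN
#(note: the pattern also matches '.' and '-', so decimals and negative numbers stored as text become NaN too)
df['example'] = df.example.replace(r'[^\d]',np.nan,regex=True)
#Convert to a numeric dtype and keep the result
df['example'] = pd.to_numeric(df.example)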
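An alternative sketch that avoids the regular expression is to let pd.to_numeric coerce unparseable entries to NaN and then interpolate the resulting float column. The CSV text below is a hypothetical stand-in for example.csv, so the values are illustrative only.

```python
import io
import pandas as pd

# Hypothetical stand-in for example.csv
csv_text = "example\n1.5\n2.5\nasdf\n4.5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Coerce unparseable entries to NaN and get a float column in one step
df["example"] = pd.to_numeric(df["example"], errors="coerce")
n_missing = int(df["example"].isna().sum())

# Fill the gap linearly between its neighbours
df["example"] = df["example"].interpolate(method="linear")
print(n_missing, df["example"].tolist())
```

Because errors="coerce" keeps decimals and negative numbers intact, only entries that truly cannot be parsed end up being interpolated.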
