How can I convert string data to integer data to prepare for linear regression?

I am trying to prepare my data for regression, so I tried to convert a string column to integers with the following code:

train["comment"] = train["comment"].astype(int)

But I get this error:

runfile('C:/Users/hayyi/.spyder-py3/temp.py', wdir='C:/Users/hayyi/.spyder-py3')
Traceback (most recent call last):

  File "C:\Users\hayyi\.spyder-py3\temp.py", line 57, in <module>
    train["comment"] = train["comment"].astype(int)

  File "D:\SpyderUI\MiniConda\envs\spyder-env\lib\site-packages\pandas\core\generic.py", line 5815, in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)

  File "D:\SpyderUI\MiniConda\envs\spyder-env\lib\site-packages\pandas\core\internals\managers.py", line 418, in astype
    return self.apply("astype", dtype=dtype, copy=copy, errors=errors)

  File "D:\SpyderUI\MiniConda\envs\spyder-env\lib\site-packages\pandas\core\internals\managers.py", line 327, in apply
    applied = getattr(b, f)(**kwargs)

  File "D:\SpyderUI\MiniConda\envs\spyder-env\lib\site-packages\pandas\core\internals\blocks.py", line 591, in astype
    new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)

  File "D:\SpyderUI\MiniConda\envs\spyder-env\lib\site-packages\pandas\core\dtypes\cast.py", line 1309, in astype_array_safe
    new_values = astype_array(values, dtype, copy=copy)

  File "D:\SpyderUI\MiniConda\envs\spyder-env\lib\site-packages\pandas\core\dtypes\cast.py", line 1257, in astype_array
    values = astype_nansafe(values, dtype, copy=copy)

  File "D:\SpyderUI\MiniConda\envs\spyder-env\lib\site-packages\pandas\core\dtypes\cast.py", line 1174, in astype_nansafe
    return lib.astype_intsafe(arr, dtype)

  File "pandas\_libs\lib.pyx", line 679, in pandas._libs.lib.astype_intsafe

ValueError: invalid literal for int() with base 10: "He got his money... now he is waiting for the election in 2 years... dirty politicians need to be afraid of tar and feathers again... but they aren't, so the people get screwed."

By the way, I also tried this, but I got the same error:

train["comment"] = train["comment"].str.replace(',', '').astype(int)

Another question: is this kind of conversion the right way to prepare string data for regression?

Answer from a uj5u.com user:

Assuming the string values are numbers that happen to be stored with a string dtype, try:

import pandas as pd
train['comment'] = pd.to_numeric(train['comment'], errors='coerce')
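To see what errors='coerce' does before relying on it, a minimal sketch along these lines may help. The sample DataFrame is hypothetical and only stands in for the real `train` data from the question; the variable names `converted` and `unparsed` are likewise just illustrative.

```python
import pandas as pd

# Hypothetical sample data; the real `train` DataFrame comes from the question.
train = pd.DataFrame({"comment": ["12", "7", "1,234", "He got his money", None]})

# Values that cannot be parsed as numbers become NaN instead of raising ValueError.
converted = pd.to_numeric(train["comment"], errors="coerce")

# Rows that held a value but could not be parsed as a number.
unparsed = train.loc[converted.isna() & train["comment"].notna(), "comment"]

print(converted.tolist())
print(unparsed.tolist())
```

Inspecting `unparsed` first shows whether the column really holds numbers with a few stray entries, or whether it is free text, in which case coercion would silently discard almost everything.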

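If the column contains any NaN values (including values that could not be parsed, which errors='coerce' turns into NaN), fill them before casting to an integer type: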

import numpy as np
train['comment'] = pd.to_numeric(train['comment'], errors='coerce').fillna(0).astype(np.int64)
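Note that the value shown in the ValueError is a free-text comment, not a number. For a column like that, the coercion above would turn every row into NaN and then into 0, which leaves nothing useful for a regression. A rough sketch of deriving numeric features from the text instead is shown here, using only pandas. The sample comments and the names `words`, `n_chars` and `n_words` are assumptions made for illustration; in practice a dedicated tool such as scikit-learn's TfidfVectorizer is a more common choice.

```python
import pandas as pd

# Hypothetical sample comments; the real data comes from the question's `train`.
train = pd.DataFrame({
    "comment": ["He got his money", "Dirty politicians", "He got nothing"],
})

# One 0/1 column per word: 1 if the word occurs in the comment, else 0.
words = train["comment"].str.lower().str.get_dummies(sep=" ")

# Two simple numeric summaries of each comment.
n_chars = train["comment"].str.len()
n_words = train["comment"].str.split().str.len()

features = words.assign(n_chars=n_chars, n_words=n_words)
print(features)
```

The resulting `features` DataFrame is entirely numeric, so it could be passed to a regression model, although whether such simple features carry any signal depends on the data.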

Tags: Python, pandas, DataFrame
