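I am trying to remove NaNs and blank values from two columns and replace them with the mean of the respective column using column.fillna(column.mean), but when I run the code below it tells me "columns is not defined".
How do I refer to the columns I passed to the DataFrame so that I can apply the columns.fillna(column.mean) approach?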
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
points = data = pd.read_csv (r'brain_diseases.csv', index_col='id')
df = pd.DataFrame(data, columns= ['cancer','prions'])
columns.fillna(cancer.mean())
columns.fillna(pryons.mean())
kpoints = KMeans(n_clusters=3, init='random').fit(data)
center = kpoints.cluster_centers_
print(center)
plt.scatter(data['trestbps'], data['chol'], c=kpoints.labels_.astype(float), s=50, alpha=0.5)
plt.scatter(center[:, 0], center[:, 1], c='black', s=50)
plt.show()
Any help is greatly appreciated.
Reply from a uj5u.com user:
columns is not defined anywhere in your code.
You can call fillna on the DataFrame itself:
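import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
points = data = pd.read_csv(r'brain_diseases.csv', index_col='id')
df = pd.DataFrame(data, columns=['cancer', 'prions'])
# Select each column through df; bare names such as cancer or columns do not exist.
# fillna returns a new object, so assign the result back.
df['cancer'] = df['cancer'].fillna(df['cancer'].mean())
df['prions'] = df['prions'].fillna(df['prions'].mean())  # fill on df instead
...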
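As a rough, self-contained sketch of the same idea, the snippet below fills every column with its own mean in one call. The small inline DataFrame is a made-up stand-in for brain_diseases.csv (the values are hypothetical), and it assumes that "blanks" means empty or whitespace-only cells, which pd.to_numeric with errors="coerce" turns into NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for brain_diseases.csv; these values are invented.
df = pd.DataFrame(
    {"cancer": [1.0, np.nan, 3.0, 5.0], "prions": ["2", " ", "4", None]},
    index=pd.Index([1, 2, 3, 4], name="id"),
)

# Coerce every column to numeric; blank or unparseable cells become NaN.
df = df.apply(pd.to_numeric, errors="coerce")

# df.mean() yields one mean per column; fillna aligns them by column name.
df = df.fillna(df.mean())

print(df)
```

If the clustering step follows, it is probably the filled df (not the original data) that should go to KMeans(...).fit, since KMeans rejects input that still contains NaN.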
