我有以下代碼,它從 csv 檔案中提取 2 個特征(tempo 和 slotID),并根據這 2 個特征繪制 kmeans 聚類。
df = pd.read_csv("prova.csv", encoding = "ISO-8859-1", sep = ';')
dfSlotMean = df.groupby('slotID', as_index=False)['tempo'].mean()
df = dfSlotMean[['tempo','slotID']]
###############################################
Sum_of_squared_distances = []
K = range(1,15)
for k in K:
km = KMeans(n_clusters=k)
km = km.fit(df)
Sum_of_squared_distances.append(km.inertia_)
plt.plot(K, Sum_of_squared_distances, 'bx-')
plt.xlabel('k')
plt.ylabel('Sum_of_squared_distances')
plt.title('Elbow Method For Optimal k')
plt.show()
################################################
kmeans = KMeans(n_clusters=5).fit(df)
labels = kmeans.labels_
centroids = kmeans.cluster_centers_
print(centroids)
# plt.scatter(df['tempo'], df['slotID'], c= kmeans.labels_.astype(float), s=25, alpha=0.5)
# plt.scatter(centroids[:, 0], centroids[:, 1], c='red', s=50)
# plt.show()
plt.scatter(df['slotID'], df['tempo'], c= kmeans.labels_.astype(float), s=25, alpha=0.5)
plt.title('Martedì')
plt.scatter(centroids[:, 1], centroids[:, 0], c='red', s=25)
plt.show()
print(pd.Series(labels).value_counts())
我現在要做的是在每個集群中分配值。我怎樣才能做到這一點?這是代碼的輸出:

簡而言之,我想要,例如,屬于簇號 1 的點是:131,98;135,76 立方...
uj5u.com熱心網友回復:
使用資料框索引來獲取所需的資料。例如,如果您想要來自集群 1 的點,您可以使用
df[labels == 1]
如果您想全部獲得它們:
for i in np.unique(labels):
print(df[labels == i])
轉載請註明出處,本文鏈接:https://www.uj5u.com/caozuo/431727.html
