My code:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
df = pd.read_csv('orderlist.csv', skiprows=1, delimiter=';', encoding="utf8")
df.columns = ["date", "customer_number", "item_code", "quantity"]
df['customer_item'] = df.customer_number + ', ' + df.item_code
df['date'] = pd.to_datetime(df['date'])
df["quantity"] = df["quantity"].astype(int, errors='ignore')
df["week"] = df.date.dt.isocalendar().week  # Series.dt.week was removed in pandas 2.0
df_grup = df.groupby(by=['week',"customer_item"]).quantity.sum().reset_index()
df_dum = pd.get_dummies(df_grup)
X, y = df_dum, df_dum["quantity"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
dtree = DecisionTreeClassifier().fit(X_train, y_train)
predict = dtree.fit(X_train, y_train)
y_pred = dtree.predict(X_test)
pred_quantity = dtree.predict(df_dum)
print("predict quantity:")
print(pred_quantity)
Result:
predict quantity:
[100 5 450 ... 295 22 639]
I need to print the customer number next to each of my results.
Answer from a uj5u.com user:
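The n-th item of pred_quantity corresponds to the n-th row of the data passed to predict.
So, as long as those rows line up with df['customer_number'], you can add pred_quantity to df as a column: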
df['pred_quantity'] = pred_quantity
print(df[['customer_number', 'pred_quantity']])
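One caveat about the code in the question: pred_quantity is computed from df_dum, which has one row per row of df_grup, and the groupby may leave fewer rows than df has. If so, assigning it to df raises a length-mismatch error. Here is a minimal sketch of attaching the predictions to the grouped frame instead, assuming customer_item keeps the 'customer, item' layout built in the question. The toy data and the hard-coded pred_quantity list are made up for illustration and stand in for the real order list and for dtree.predict(df_dum).

```python
import pandas as pd

# Toy stand-in for the order list; every value here is made up.
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-11"]),
    "customer_number": ["C1", "C1", "C2", "C1"],
    "item_code": ["A", "A", "B", "A"],
    "quantity": [2, 3, 5, 7],
})
df["customer_item"] = df["customer_number"] + ", " + df["item_code"]
df["week"] = df["date"].dt.isocalendar().week

# Grouping merges the two week-1 orders of "C1, A" into a single row,
# so df_grup has 3 rows while df has 4.
df_grup = df.groupby(by=["week", "customer_item"]).quantity.sum().reset_index()

# Hypothetical predictions, one per row of df_grup
# (a stand-in for dtree.predict(df_dum)).
pred_quantity = [5, 5, 7]

# Attach the predictions to the grouped frame, then recover the
# customer number from the combined "customer, item" key.
result = df_grup.copy()
result["pred_quantity"] = pred_quantity
result["customer_number"] = result["customer_item"].str.split(", ").str[0]

print(result[["customer_number", "pred_quantity"]])
```

Splitting customer_item only works if customer numbers never contain ', ' themselves. Grouping by customer_number and item_code as separate keys would avoid the split altogether.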
Or use zip to print them side by side:
for number, quantity in zip(df['customer_number'], pred_quantity):
    print(number, quantity)
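Note that zip stops at the shorter of its inputs, so a length mismatch between the customer numbers and the predictions would pass silently. A small sketch of the same loop with an explicit length check follows. The two lists are hypothetical values standing in for df['customer_number'] and pred_quantity.

```python
# Hypothetical values standing in for df['customer_number'] and pred_quantity.
customer_numbers = ["C1", "C2", "C1"]
pred_quantity = [5, 5, 7]

# zip truncates to the shorter input, so verify the lengths first.
if len(customer_numbers) != len(pred_quantity):
    raise ValueError("customer numbers and predictions differ in length")

# Build one tab-separated line per (customer number, prediction) pair.
lines = [f"{number}\t{quantity}"
         for number, quantity in zip(customer_numbers, pred_quantity)]
print("\n".join(lines))
```

On Python 3.10 or later, zip(customer_numbers, pred_quantity, strict=True) raises the error itself, so the manual check can be dropped.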
Please credit the source when reposting. Link to this article: https://www.uj5u.com/net/462317.html
