I am using sklearn's LinearRegression() estimator with 5 variables
['feat1', 'feat2', 'feat3', 'feat4', 'feat5']
to predict a continuous value.
The estimator returns the list of coefficient values and the bias:
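from sklearn.linear_model import LinearRegression
linear = LinearRegression()
linear.fit(X_train, y_train)  # coef_ and intercept_ only exist after fitting
print(linear.coef_)
print(linear.intercept_)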
[ 0.18799409 -0.05406106 -0.01327966 -0.13348129 -0.00614054]
-0.011064865422734674
Then, given that I have each feature as a variable, I can hard-code the coefficients into a linear formula and estimate my value, like this:
val = ((0.18799409*feat1) - (0.05406106*feat2) - (0.01327966*feat3) - (0.13348129*feat4) - (0.00614054*feat5)) -0.011064865422734674
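The hard-coded formula above is just a dot product plus the intercept, so it can be written with arrays instead. A minimal sketch, using the coefficients printed above; the values in `feats` are made up for illustration and are not from the original post:

```python
import numpy as np

# Coefficients and intercept copied from the printed output above.
coef = np.array([0.18799409, -0.05406106, -0.01327966, -0.13348129, -0.00614054])
intercept = -0.011064865422734674

# Hypothetical values for feat1..feat5, for illustration only.
feats = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# The dot product replaces the hand-written sum of coefficient * feature terms.
val = feats @ coef + intercept

# The same value computed term by term, as in the hard-coded formula.
val_hardcoded = sum(c * f for c, f in zip(coef, feats)) + intercept
```

With a fitted estimator, `coef` and `intercept` would come from `linear.coef_` and `linear.intercept_` rather than being typed in.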
Now suppose I use a degree-2 polynomial regression, built with a pipeline, and print the results:
model = Pipeline(steps=[
('scaler',StandardScaler()),
('polynomial_features', PolynomialFeatures(degree=degree, include_bias=False)),
('linear_regression', LinearRegression())])
#fit model
model.fit(X_train, y_train)
print(model['linear_regression'].coef_)
print(model['linear_regression'].intercept_)
I get:
[ 7.06524186e-01 -2.98605001e-02 -4.67175212e-02 -4.86890790e-01
-1.06320101e-02 -2.77958604e-03 -3.38253025e-04 -7.80563090e-03
4.51356888e-03 8.32036733e-03 3.57638244e-02 -2.16446849e-02
-7.92169287e-02 3.36809467e-02 -6.60531497e-03 2.16613331e-02
2.10097993e-02 3.49970303e-02 -3.02970698e-02 -7.81462599e-03]
0.011042927069084668
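How can I convert the formula above so that val is computed from the regression, using the values from .coef_ and .intercept_ with array indexing instead of hard-coded values, for any degree "n"?
Is there a scipy or numpy method suited to this?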
Answer from a uj5u.com user:
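It is important to note that polynomial regression is just an extended case of linear regression, so all we need to do is transform our input data consistently. For any N, we can use PolynomialFeatures from sklearn.preprocessing. Using dummy data, we can see how it works: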
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
#set parameters
X = np.stack([np.arange(i, i + 10) for i in range(5)]).T  # shape (10, 5)
Y = np.random.randn(10) * 10 + 3
N = 2
poly_reg=PolynomialFeatures(degree=N,include_bias=False)
X_poly=poly_reg.fit_transform(X)
#print(X[0],X_poly[0]) #to check parameters; with include_bias=False no constant 1 column is added, LinearRegression fits the intercept itself
poly = LinearRegression().fit(X_poly, Y)
So we can get coef_ as before, and a simple matrix multiplication gives the regression value.
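new_dat = poly_reg.transform(np.arange(2, 2 + 10, 2)[None])  # one new data point with 5 features
# allclose rather than exact equality, since floating-point rounding may differ
np.testing.assert_allclose(poly.predict(new_dat), new_dat @ poly.coef_ + poly.intercept_)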
-- Edit --
If you cannot use the PolynomialFeatures transform, it is just an iterated combinations loop that generates the data from your feature list.
new_feats = np.array([feat1,feat2,feat3,feat4,feat5])
from itertools import combinations_with_replacement
def gen_poly_feats(x, N):
    # Returns the products over all unique index groupings (with replacement) into the array x,
    # for degrees 1..N, for use in polynomial regression.
    return np.concatenate([[np.prod(x[np.array(i)]) for i in combinations_with_replacement(range(len(x)), n)] for n in range(1, N + 1)])[None]
new_feats_poly = gen_poly_feats(new_feats,N)
# just to be sure that this matches (allclose, to tolerate floating-point rounding)...
np.testing.assert_allclose(new_feats_poly, poly_reg.transform(new_feats[None]))
#then we can use the above linear regression model to predict the new data
val = new_feats_poly @ poly.coef_ + poly.intercept_
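The pipeline in the question also standardizes the features before expanding them, so a hand-written formula has to apply the same scaling first. A sketch under stated assumptions: the training data below is synthetic, and `manual_predict` is a hypothetical helper name, not part of scikit-learn:

```python
import numpy as np
from itertools import combinations_with_replacement
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 5))  # synthetic data, illustration only
y_train = rng.normal(size=50)
degree = 2

model = Pipeline(steps=[
    ('scaler', StandardScaler()),
    ('polynomial_features', PolynomialFeatures(degree=degree, include_bias=False)),
    ('linear_regression', LinearRegression())])
model.fit(X_train, y_train)

def manual_predict(x, model, degree):
    """Hypothetical helper: reproduce model.predict for one 1-D sample x."""
    scaler = model['scaler']
    reg = model['linear_regression']
    # Same standardization the pipeline applies before the expansion.
    z = (x - scaler.mean_) / scaler.scale_
    # Products over index combinations, in the order PolynomialFeatures uses.
    terms = [np.prod(z[list(idx)])
             for n in range(1, degree + 1)
             for idx in combinations_with_replacement(range(len(z)), n)]
    return np.dot(terms, reg.coef_) + reg.intercept_

x_new = rng.normal(size=5)
val = manual_predict(x_new, model, degree)
```

If the scaling step is skipped, the coefficients are applied to unscaled inputs and the result will generally not match `model.predict`.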
