Machine learning with vector-valued features and targets

How do I train a model when some of the features are vectors/arrays? I keep getting errors whenever I try...

My feature matrix looks like this:

     A    B    C    Profile
0    1    4    4    [1,2,3,4]
1    2    4    5    [2,2,4,1]

And my target vector looks like this:

0    [0,4,5,0]
1    [1,5,6,0]

and so on. However, I run into a problem with fit(x, y) when using linear regression from sklearn. Here is the output of print(x) and print(y):

X:

Beams/Beam[0]/Parameters/Energy     Beams/Beam[0]/Parameters/BunchPopulation    Beams/Beam[0]/BunchShape/Parameters/LongitudinalSigmaLabFrame   Simulation/NumberOfParticles    initialXHist
0   25.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
1   25.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
2   25.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
3   25.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
4   25.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
...     ...     ...     ...     ...     ...
995     26.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
996     26.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
997     26.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
998     26.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
999     26.0    1.300000e+11    1.05    5000    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...

1000 rows × 5 columns

y:

0      [8, 4, 6, 13, 5, 5, 10, 11, 15, 9, 19, 18, 16,...
1      [6, 5, 8, 8, 9, 12, 6, 20, 9, 20, 18, 12, 24, ...
2      [6, 6, 7, 8, 13, 10, 12, 7, 14, 14, 18, 24, 16...
3      [2, 5, 10, 3, 6, 8, 13, 12, 7, 18, 12, 20, 22,...
4      [5, 3, 5, 9, 8, 8, 8, 9, 14, 13, 10, 15, 21, 1...
                             ...                        
995    [2, 9, 4, 5, 10, 5, 10, 15, 16, 13, 12, 13, 21...
996    [2, 3, 5, 5, 11, 15, 18, 15, 14, 13, 16, 17, 1...
997    [4, 5, 6, 8, 5, 7, 7, 26, 13, 16, 17, 16, 17, ...
998    [1, 3, 5, 7, 5, 6, 16, 10, 17, 12, 12, 18, 24,...
999    [3, 4, 8, 9, 8, 4, 14, 17, 11, 16, 7, 20, 14, ...
Name: finalXHist, Length: 1000, dtype: object

Can anyone advise? The error I get is:

    ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_826/1502489859.py in <module>
      3 
      4 # Train the model using the training sets
----> 5 regr.fit(X_train, y_train)
      6 
      7 # Make predictions using the testing set

/cvmfs/sft.cern.ch/lcg/views/LCG_101swan/x86_64-centos7-gcc8-opt/lib/python3.9/site-packages/sklearn/linear_model/_base.py in fit(self, X, y, sample_weight)
    516         accept_sparse = False if self.positive else ['csr', 'csc', 'coo']
    517 
--> 518         X, y = self._validate_data(X, y, accept_sparse=accept_sparse,
    519                                    y_numeric=True, multi_output=True)
    520 

/cvmfs/sft.cern.ch/lcg/views/LCG_101swan/x86_64-centos7-gcc8-opt/lib/python3.9/site-packages/sklearn/base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
    431                 y = check_array(y, **check_y_params)
    432             else:
--> 433                 X, y = check_X_y(X, y, **check_params)
    434             out = X, y
    435 

/cvmfs/sft.cern.ch/lcg/views/LCG_101swan/x86_64-centos7-gcc8-opt/lib/python3.9/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

/cvmfs/sft.cern.ch/lcg/views/LCG_101swan/x86_64-centos7-gcc8-opt/lib/python3.9/site-packages/sklearn/utils/validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, estimator)
    869         raise ValueError("y cannot be None")
    870 
--> 871     X = check_array(X, accept_sparse=accept_sparse,
    872                     accept_large_sparse=accept_large_sparse,
    873                     dtype=dtype, order=order, copy=copy,

/cvmfs/sft.cern.ch/lcg/views/LCG_101swan/x86_64-centos7-gcc8-opt/lib/python3.9/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

/cvmfs/sft.cern.ch/lcg/views/LCG_101swan/x86_64-centos7-gcc8-opt/lib/python3.9/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    671                     array = array.astype(dtype, casting="unsafe", copy=False)
    672                 else:
--> 673                     array = np.asarray(array, order=order, dtype=dtype)
    674             except ComplexWarning as complex_warning:
    675                 raise ValueError("Complex data not supported\n"

ValueError: setting an array element with a sequence.

I have tried googling it, but no luck so far. I am guessing there is something wrong with how these two objects are set up.

Answer:

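The error is being raised for X (see the third-to-last section of the traceback): you cannot have array-valued features. You need to do some feature engineering to produce a flat data table to train on. Whether that means flattening each array into individual features, extracting some statistics from the arrays, or something else depends on what those arrays mean (which is a better question for datascience.SE or stats.SE).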
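
As a rough illustration of the "flatten into individual features" option, here is a minimal sketch. It assumes every array in the column has the same length, and it uses a made-up toy frame shaped like the small example in the question; the column names Profile_0, Profile_1, ... are invented for illustration. Whether flattening is the right choice for the real histograms is a separate question.

```python
import numpy as np
import pandas as pd

# Toy frame shaped like the small example in the question (values are illustrative).
X = pd.DataFrame({
    "A": [1, 2],
    "B": [4, 4],
    "C": [4, 5],
    "Profile": [[1, 2, 3, 4], [2, 2, 4, 1]],
})

# Turn the column of lists into a 2-D numeric array.
# This only works cleanly if all lists have the same length.
profile = np.array(X["Profile"].tolist())

# One numeric column per array element (hypothetical names Profile_0, Profile_1, ...).
profile_cols = pd.DataFrame(
    profile,
    columns=[f"Profile_{i}" for i in range(profile.shape[1])],
    index=X.index,
)

# Swap the object-dtype column for the expanded numeric columns.
X_flat = pd.concat([X.drop(columns="Profile"), profile_cols], axis=1)
print(X_flat.shape)
```

The resulting X_flat contains only scalar columns, so np.asarray can convert it to a regular 2-D array, which is the step that failed in the traceback.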

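An array-valued y may cause a similar problem, but if treating the array entries as separate outputs is what you are after, the task becomes "multi-output" regression or "multi-label" classification, which a subset of sklearn estimators can handle.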
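
A minimal sketch of the multi-output case follows, assuming the target arrays all have the same length and that each position should be treated as its own output. The data here is made up, and the variable names X_flat and Y are illustrative. LinearRegression accepts a 2-D target of shape (n_samples, n_targets); other estimators may or may not.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up targets: one equal-length list per sample, like finalXHist in the question.
y = pd.Series([[0, 4, 5, 0], [1, 5, 6, 0], [2, 6, 7, 0]], name="finalXHist")

# Made-up, already-flat numeric features (one row per sample).
X_flat = np.array([[1.0, 4.0], [2.0, 4.0], [3.0, 5.0]])

# Convert the Series of lists into a 2-D array of shape (n_samples, n_targets).
Y = np.array(y.tolist(), dtype=float)

# Fit one linear model per target column in a single call.
model = LinearRegression().fit(X_flat, Y)
pred = model.predict(X_flat)
print(pred.shape)
```

Each row of pred is a full predicted array for one sample, and model.coef_ holds one row of coefficients per output position.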
