I'm trying to figure out why the row order of my DataFrame changes after I convert it to an array. Here is my code:
import pandas as pd

header_list = ["output", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20",
               "21", "22", "23", "24", "25", "26", "27", "28", "29", "30"]
df = pd.read_csv('data.csv', names=header_list)
# Splitting data 70/30 for training and testing sets (this draws the 70% training sample)
trainingdata = df.sample(frac=0.7)
# Assigning Y to be the first column, and X as the rest
X = trainingdata.iloc[:, 1:].to_numpy()
Y = trainingdata.iloc[:, 0].to_numpy().reshape(-1, 1)
print(trainingdata)
Output:
output 1 2 ... 28 29 30
12 0 0.267358 0.373690 ... 0.379725 0.130298 0.195592
27 1 0.313739 0.506595 ... 0.456701 0.375517 0.157156
450 0 0.181693 0.490362 ... 0.112165 0.294500 0.139184
440 0 0.033603 0.531958 ... 0.171821 0.241474 0.338187
54 0 0.197312 0.113967 ... 0.189210 0.255076 0.083169
.. ... ... ... ... ... ... ...
20 1 0.519144 0.348326 ... 0.407216 0.653854 0.039814
231 1 0.428274 0.196145 ... 0.680756 0.286615 0.237439
55 0 0.291968 0.190396 ... 0.334089 0.450227 0.205234
159 1 0.410762 0.456206 ... 0.846048 0.337473 0.307359
117 0 0.232335 0.292188 ... 0.391065 0.361128 0.187656
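You can see that my index column is in random order, whereas my original DataFrame was in numerical order. Did I use some incorrect syntax here that caused this?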
Answer from a uj5u.com user:
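This comes from the pandas sample method, not from the conversion to an array. By default, sample selects rows at random (or columns, with axis=1) and returns them in that random order; to_numpy() simply preserves whatever order the DataFrame already has.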
See the pandas documentation for DataFrame.sample for details.
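A minimal sketch of this behaviour, using a small made-up DataFrame in place of data.csv (the column values are assumptions chosen only for illustration). It shows that to_numpy() keeps the sampled order, and that sort_index() can restore the original row order if that is what you need:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv: 10 rows, a label column and two features.
df = pd.DataFrame({
    "output": [0, 1] * 5,
    "1": np.arange(10) / 10,
    "2": np.arange(10, 20) / 10,
})

# sample() draws rows without replacement and returns them in random order.
trainingdata = df.sample(frac=0.7)

# to_numpy() keeps whatever row order the DataFrame already has.
X = trainingdata.iloc[:, 1:].to_numpy()

# sort_index() puts the sampled rows back into their original index order.
ordered = trainingdata.sort_index()
X_ordered = ordered.iloc[:, 1:].to_numpy()

print(trainingdata.index.tolist())  # random order
print(ordered.index.tolist())       # ascending order
```

For training a model, the shuffled order is usually harmless; sorting is only needed if later code relies on the original row positions.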
If you want the same selection every time you run the code (reproducibility), you can pass the random_state option.
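As a rough illustration, again with a made-up DataFrame standing in for data.csv and an arbitrary seed of 42 (both are assumptions, not part of the original question), a fixed random_state gives the same sample on every call. The sketch also shows one possible way to get the remaining 30% as a test set by dropping the sampled rows:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv.
df = pd.DataFrame({"output": [0, 1] * 5, "1": np.arange(10) / 10})

# The same random_state yields the same rows in the same order on every run.
train_a = df.sample(frac=0.7, random_state=42)
train_b = df.sample(frac=0.7, random_state=42)
print(train_a.equals(train_b))

# The rows that were not sampled can serve as the 30% test set.
testdata = df.drop(train_a.index)
print(len(train_a), len(testdata))
```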
