Appending elements to a 2D numpy array

I have a numpy array with shape (500, 151296). Below is the array format.

array:

array([[-0.18510018,  0.13180602,  0.32903048, ...,  0.39744213,
        -0.01461623,  0.06420607],
       [-0.14988784,  0.12030973,  0.34801325, ...,  0.36962894,
         0.04133283,  0.04434045],
       [-0.3080041 ,  0.18728344,  0.36068922, ...,  0.09335024,
        -0.11459247,  0.10187756],
       ...,
       [-0.17399777, -0.02492459, -0.07236133, ...,  0.08901921,
        -0.17250113,  0.22222663],
       [-0.17399777, -0.02492459, -0.07236133, ...,  0.08901921,
        -0.17250113,  0.22222663],
       [-0.17399777, -0.02492459, -0.07236133, ...,  0.08901921,
        -0.17250113,  0.22222663]], dtype=float32)

array[0]:

array([-0.18510018,  0.13180602,  0.32903048, ...,  0.39744213,
       -0.01461623,  0.06420607], dtype=float32)

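I have another list containing stop words, with as many entries as the numpy array has rows.

stopwords = ['no', 'not', 'in' .........]

I want to append each stop word to the corresponding row of the 500-row numpy array. Below is the code I used to do the appending.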

for i in range(len(stopwords)):
  array = np.append(array[i], str(stopwords[i]))

I get the following error:

IndexError                                Traceback (most recent call last)
<ipython-input-45-361e2cf6519b> in <module>
      1 for i in range(len(stopwords)):
----> 2   array = np.append(array[i], str(stopwords[i]))

IndexError: index 2 is out of bounds for axis 0 with size 2

Expected output:

array[0]:

array([-0.18510018,  0.13180602,  0.32903048, ...,  0.39744213,
       -0.01461623,  0.06420607, 'no'], dtype=float32)

Can anyone tell me where I am going wrong?

uj5u.com user reply:

What you are doing wrong is overwriting the array variable inside the for loop:

for i in range(len(stopwords)):
    array = np.append(array[i], str(stopwords[i]))
#   ^^^^^             ^^^^^

But you are also going wrong by using np.append inside a for loop, which is almost always a bad idea.
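
As a rough illustration of why np.append fits poorly in a loop, the sketch below shows two of its behaviours: it returns a new array instead of modifying its argument, and without an axis argument it flattens its input. The names row, longer, matrix and flattened are made up for this example and do not come from the question.

```python
import numpy as np

row = np.zeros(3)
longer = np.append(row, 1.0)  # returns a new array; row itself is unchanged
print(row.shape, longer.shape)

matrix = np.zeros((2, 3))
flattened = np.append(matrix, [1.0])  # no axis given, so matrix is flattened first
print(flattened.shape)
```

Because every call allocates and copies a fresh array, calling it once per row repeats that copy on each iteration.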

You would rather do something like this:

from string import ascii_letters
from random import choices

import numpy as np

N, M = 50, 7
arr = np.random.randn(N, M)
stopwords = np.array(["".join(choices(ascii_letters, k=10)) for _ in range(N)])
result = np.concatenate([arr, stopwords[:, None]], axis=-1)

assert result.shape == (N, M + 1)
print(result[0])  # ['0.1' '-1.2' '-0.1' '1.6' '-1.4' '-0.2' '1.7' 'ybWyFlqhcS']

But this is also wrong: it mixes data types for no reason.
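
To make the cost of mixing types concrete, here is a small sketch using a toy 2x3 array (an assumption for illustration, not the asker's data). It checks that the concatenated result has a string dtype, so the numbers must be converted back before any arithmetic.

```python
import numpy as np

arr = np.arange(6, dtype=float).reshape(2, 3)
stopwords = np.array(["no", "not"])

# Concatenating floats with strings promotes the whole result to a string dtype.
mixed = np.concatenate([arr, stopwords[:, None]], axis=-1)
print(mixed.dtype.kind)  # 'U' means unicode string

# The numeric columns are now text and need an explicit conversion back.
recovered = mixed[:, :-1].astype(float)
print(recovered)
```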

IMHO, you are better off just keeping two arrays.

Depending on what you are doing, you can iterate over them as follows:

for vector, stopword in zip(arr, stopwords):
    print(f"{stopword = }")
    print(f"{vector   = }")

# stopword = 'RgfTVGzPOl'
# vector   = array([-0.9,  1.1,  0.7 , -0.3 , -0.7 , -0.7, -0.6])
# 
# stopword = 'XlJqKdsvCC'
# vector   = array([-0.5,  0.1, -0.7 , -0.6, -1.1, -0.6, -0.6])
# 
#...
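
Keeping two parallel arrays also allows lookups and filtering without touching the numeric dtype. The following is only a sketch under assumed toy data; vectors_by_word and selected are hypothetical names introduced here.

```python
import numpy as np

arr = np.arange(12, dtype=np.float32).reshape(3, 4)
stopwords = np.array(["no", "not", "in"])

# Map each stop word to its row; the rows remain float32.
vectors_by_word = dict(zip(stopwords.tolist(), arr))
print(vectors_by_word["not"])

# A boolean mask built from the word array selects the matching rows.
selected = arr[stopwords != "no"]
print(selected.shape)
```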

uj5u.com user reply:

Let's try some debugging.

Start with a smaller float array:

In [76]: arr = np.arange(12).reshape(3,4).astype(float)    
In [77]: arr
Out[77]: 
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])

In [78]: words = ['no','not','in']

In [79]: for i in range(3):
    ...:     arr = np.append(arr[i], str(words[i]))
    ...:     
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [79], in <cell line: 1>()
      1 for i in range(3):
----> 2     arr = np.append(arr[i], str(words[i]))

IndexError: index 2 is out of bounds for axis 0 with size 2

Look at arr and i at the point where you get the error:

In [80]: arr
Out[80]: array(['1.0', 'not'], dtype='<U3')    
In [81]: i
Out[81]: 2

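arr looks nothing like the original arr, does it? It is a 1-D array with 2 string elements. That is why arr[2] raises the error. Do you see why?

Recreate arr, then perform just one step: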

In [82]: arr = np.arange(12).reshape(3,4).astype(float)
In [83]: np.append(arr[0], words[0])
Out[83]: array(['0.0', '1.0', '2.0', '3.0', 'no'], dtype='<U32')

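That looks a bit like the first row you want, except that it has a string dtype. But you don't want to replace the original arr with this 1-D array, do you?

Performing the i=1 step on this result produces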

In [84]: np.append(Out[83][1], words[1])
Out[84]: array(['1.0', 'not'], dtype='<U3')

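which is the array that the i=2 step has a problem with (a shape (2,) array).

Don't just throw up your hands in despair when you hit an error. Debug by looking at the variables and testing the code step by step.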
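
The interactive session above can also be reproduced as a short script. This is a sketch of the same debugging idea: it records the shape of arr after each step and reports the state when the IndexError occurs. The names shapes and failed_at are introduced here for illustration.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4).astype(float)
words = ["no", "not", "in"]

shapes = []       # shape of arr after each successful step
failed_at = None  # loop index at which the error was raised

try:
    for i in range(3):
        arr = np.append(arr[i], str(words[i]))  # overwrites arr on every pass
        shapes.append(arr.shape)
        print(i, arr.shape, arr.dtype)
except IndexError as exc:
    failed_at = i
    print("failed at i =", i, "with arr =", arr, "error:", exc)
```

After the first pass arr is already a 1-D string array, so the later indexing no longer refers to rows of the original 2-D array.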

The kind of iteration you attempted does work with lists:

In [85]: alist = arr.tolist()  
In [86]: alist
Out[86]: [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [8.0, 9.0, 10.0, 11.0]]

In [87]: for i in range(3):
    ...:     alist[i].append(words[i])
    ...:     

In [88]: alist
Out[88]: 
[[0.0, 1.0, 2.0, 3.0, 'no'],
 [4.0, 5.0, 6.0, 7.0, 'not'],
 [8.0, 9.0, 10.0, 11.0, 'in']]

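The elements of a list can differ in length; list append works in place; and a list can contain both numbers and strings. None of these is true for numpy arrays.

As a general rule, trying to replicate list methods with numpy arrays does not work well.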
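
If a single container pairing each row with its word is really needed, one possible approach (not suggested in either answer above, so treat it as an assumption) is a numpy structured array, which keeps the vector as float32 and the word as a string field. The field names vector and word, and the string width of 10, are arbitrary choices for this sketch.

```python
import numpy as np

arr = np.arange(12, dtype=np.float32).reshape(3, 4)
words = ["no", "not", "in"]

# One record per row: a float32 sub-array plus a fixed-width unicode string.
record_dtype = np.dtype([("vector", np.float32, (arr.shape[1],)), ("word", "U10")])
records = np.empty(len(words), dtype=record_dtype)
records["vector"] = arr
records["word"] = words

print(records[0]["word"], records[0]["vector"])
```

The numeric field can still be used as an ordinary float32 array through records["vector"].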

