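I am getting the error below from the code snippet that follows it.
Any ideas on how to resolve this?
I am almost entirely new to NumPy. I have mostly used Pandas, but I am trying to move away from it because of a number of performance-related issues.
The end goal is to run a LEFT JOIN on two structured arrays.
The error appears to be triggered by the expression ret[i] = tuple(row1[f1]) + tuple(row2[f1]), but honestly I am not sure why I am getting it.
I checked the number of fields in row1 and row2 against the number of fields in f1, which holds the dtype keys, and it all seems consistent as far as I can tell.
Any ideas would be greatly appreciated!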
Error
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_43960/3997384146.py in <module>
66 # dtype=[('name', 'U10'), ('age', 'i4')])
67
---> 68 join_by_left(key='name', r1=struct_arr1, r2=struct_arr2, mask=True)
~\AppData\Local\Temp/ipykernel_43960/3997384146.py in join_by_left(key, r1, r2, mask)
43 print(row1[f1])
44 print(row2[f1])
---> 45 ret[i] = tuple(row1[f1]) + tuple(row2[f1])
46
47 i += 1
~\AppData\Roaming\Python\Python37\site-packages\numpy\ma\core.py in __setitem__(self, indx, value)
3379 elif not self._hardmask:
3380 # Set the data, then the mask
-> 3381 _data[indx] = dval
3382 _mask[indx] = mval
3383 elif hasattr(indx, 'dtype') and (indx.dtype == MaskType):
ValueError: could not assign tuple of length 6 to structure with 3 fields.
Full code
import numpy as np
def join_by_left(key, r1, r2, mask=True):
    # figure out the dtype of the result array
    descr1 = r1.dtype.descr
    descr2 = [d for d in r2.dtype.descr if d[0] not in r1.dtype.names]
    descrm = descr1 + descr2
    # figure out the fields we'll need from each array
    f1 = [d[0] for d in descr1]
    f2 = [d[0] for d in descr2]
    # cache the number of columns in f1
    ncol1 = len(f1)
    print(f1)
    # get a dict of the rows of r2 grouped by key
    rows2 = {}
    for row2 in r2:
        rows2.setdefault(row2[key], []).append(row2)
    # figure out how many rows will be in the result
    nrowm = 0
    for k1 in r1[key]:
        if k1 in rows2:
            nrowm += len(rows2[k1])
        else:
            nrowm += 1
    # allocate the return array
    _ret = np.recarray(nrowm, dtype=descrm)
    if mask:
        ret = np.ma.array(_ret, mask=True)
    else:
        ret = _ret
    # merge the data into the return array
    i = 0
    for row1 in r1:
        if row1[key] in rows2:
            for row2 in rows2[row1[key]]:
                print(row1[f1])
                print(row2[f1])
                ret[i] = tuple(row1[f1]) + tuple(row2[f1])
                i += 1
        else:
            for j in range(ncol1):
                ret[i][j] = row1[j]
            i += 1
    return ret
struct_arr1 = np.array([('jason', 28, '[email protected]'), ('jared', 31, '[email protected]')],
                       dtype=[('name', 'U10'), ('age', 'i4'), ('email', 'U10')])
struct_arr2 = np.array([('jason', 22, '[email protected]'), ('jason', 27, '[email protected]'), ('george', 28, '[email protected]'), ('jared', 22, '[email protected]')],
                       dtype=[('name', 'U10'), ('age', 'i4'), ('email', 'U10')])
join_by_left(key='name', r1=struct_arr1, r2=struct_arr2, mask=True)
Answer from a uj5u.com community member:
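On the line where you get the error:
ret[i] = tuple(row1[f1]) + tuple(row2[f1])
the + operator concatenates the two tuples, so the result is a tuple with 6 elements, not a 3-element tuple whose items were added pairwise (if that is what you expected).
A simple example:
tuple('abc') + tuple('def')
The result is:
('a', 'b', 'c', 'd', 'e', 'f')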
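As for the join itself: in the question both arrays have the same field names, so descr2 ends up empty and the result dtype has only 3 fields, while the concatenated tuple has 6 values. One possible way around this is to take from row2 only the non-key fields and rename the ones that clash with row1, so that the tuple length matches the result dtype. The sketch below is not the asker's code and not a NumPy API: the function name left_join, the suffix parameter, and the sample arrays are all hypothetical, chosen only for illustration.

```python
import numpy as np

def left_join(key, r1, r2, suffix="_r2"):
    """Left join of two structured arrays on `key` (hypothetical helper)."""
    names1 = list(r1.dtype.names)
    # Non-key fields of r2; rename those that clash with a field of r1.
    names2 = [n for n in r2.dtype.names if n != key]
    out_names2 = [n + suffix if n in names1 else n for n in names2]
    descr = [(n, r1.dtype[n]) for n in names1]
    descr += [(o, r2.dtype[n]) for o, n in zip(out_names2, names2)]

    # Group the rows of r2 by key value.
    rows2 = {}
    for row2 in r2:
        rows2.setdefault(row2[key], []).append(row2)

    # One output row per match, or a single row when there is no match.
    nrows = sum(max(len(rows2.get(k, ())), 1) for k in r1[key])

    # Build data and mask as plain structured arrays, then wrap them.
    data = np.zeros(nrows, dtype=descr)
    mask = np.zeros(nrows, dtype=[(n, bool) for n, _ in descr])
    i = 0
    for row1 in r1:
        left_vals = tuple(row1[n] for n in names1)
        matches = rows2.get(row1[key], ())
        if matches:
            for row2 in matches:
                # Tuple length equals the number of fields in descr.
                data[i] = left_vals + tuple(row2[n] for n in names2)
                i += 1
        else:
            # No match: copy the r1 fields, mask the r2 fields.
            for n, v in zip(names1, left_vals):
                data[n][i] = v
            for o in out_names2:
                mask[o][i] = True
            i += 1
    return np.ma.array(data, mask=mask)

# Hypothetical sample data (not the arrays from the question).
arr1 = np.array([('jason', 28), ('jared', 31), ('sam', 40)],
                dtype=[('name', 'U10'), ('age', 'i4')])
arr2 = np.array([('jason', 22), ('jason', 27), ('george', 28), ('jared', 22)],
                dtype=[('name', 'U10'), ('age', 'i4')])

result = left_join('name', arr1, arr2)
print(result.dtype.names)
print(result.tolist())
```

Here 'sam' has no match in arr2, so its age_r2 entry is masked rather than filled with a real value. Building the data and mask as ordinary structured arrays first keeps every assignment a plain NumPy one, which seemed simpler than writing into a masked array row by row.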
