How do I get the minimum value from a pandas column of nested lists? Why does numpy.min() fail where numpy.mean() works?

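I have a small piece of code that I need to modify, but I can't work out why np.min() fails in the specific case where a pandas column consists of nested lists, while np.mean() works. Maybe someone here can clarify?

This snippet works perfectly: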

import pandas as pd
import numpy as np


def transformation(custom_df):
    dic = dict(zip(custom_df['customers'], custom_df['values']))
    custom_df['values'] = np.where(custom_df['values'].isna() & (custom_df['valid_neighbors'] >= 1),
                                   custom_df['neighbors'].apply(
                                       lambda row: np.mean([dic[v] for v in row if dic.get(v)])),
                                   custom_df['values'])
    return custom_df


customers = [1, 2, 3, 4, 5, 6]
values = [np.nan, np.nan, 10, np.nan, 11, 12]
neighbors = [[6], [3], [], [3, 5], [6], [5]]
vn = [1, 1, 0, 2, 1, 1]
df2 = pd.DataFrame({'customers': customers, 'values': values, 'neighbors': neighbors, 'valid_neighbors': vn})


   customers  values neighbors  valid_neighbors
0          1     NaN       [6]                1
1          2     NaN       [3]                1
2          3    10.0        []                0
3          4     NaN    [3, 5]                2
4          5    11.0       [6]                1
5          6    12.0       [5]                1

df2 = transformation(df2)

Result:

   customers  values neighbors  valid_neighbors
0          1    12.0       [6]                1
1          2    10.0       [3]                1
2          3    10.0        []                0
3          4    10.5    [3, 5]                2
4          5    11.0       [6]                1
5          6    12.0       [5]                1

However, if I change np.mean() to np.min() in the transformation() function, it raises a ValueError, which makes me wonder why no error was raised when I called np.mean():

ValueError: zero-size array to reduction operation minimum which has no identity

I would like to know which condition I am not meeting, and what I can do to get the expected result, which would be:

   customers  values neighbors  valid_neighbors
0          1    12.0       [6]                1
1          2    10.0       [3]                1
2          3    10.0        []                0
3          4    10.0    [3, 5]                2
4          5    11.0       [6]                1
5          6    12.0       [5]                1

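Reply from a uj5u.com user:

Your neighbors column contains an empty list (the row for customer 3). np.min raises an error on empty input, whereas np.mean accepts it and returns nan (with a RuntimeWarning). Note that .apply() runs on every row before np.where selects anything, so the valid_neighbors >= 1 condition does not protect the empty row; with np.mean the resulting nan was simply discarded by np.where.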

import numpy as np

print(np.mean([])) 
# Output
# nan

print(np.min([])) 
# Throws error
# ValueError: zero-size array to reduction operation minimum which has no identity
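As a rough sketch of the difference shown above: the mean of nothing is 0/0, so NumPy can return nan, but a minimum over nothing has no neutral starting value unless one is supplied. np.min accepts an `initial` argument for exactly that purpose. The helper name `safe_min` and its `empty` parameter are made up for illustration and are not part of NumPy.

```python
import math
import warnings

import numpy as np


def safe_min(values, empty=np.nan):
    """Hypothetical helper: minimum of values, or `empty` if there are none."""
    values = list(values)
    return float(np.min(values)) if values else empty


# np.mean of an empty input is 0/0: NumPy emits RuntimeWarnings and returns nan.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    empty_mean = np.mean([])

# `initial` gives the reduction a starting value, so empty input no longer raises.
empty_min_with_initial = np.min([], initial=np.inf)

# Without `initial`, the reduction has no identity and raises ValueError.
try:
    np.min([])
    raised = False
except ValueError:
    raised = True
```

Whether inf, nan, or some other placeholder is the right result for "no neighbors" depends on what the downstream code expects; nan keeps the row recognizable as missing.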

Reply from a uj5u.com user:

It is best to update the transformation function so that it handles empty lists in the neighbors column. Here is a workaround that may work.

def transformation(custom_df):
    dic = dict(zip(custom_df['customers'], custom_df['values']))
    custom_df['values'] = np.where(custom_df['values'].isna() & (custom_df['valid_neighbors'] >= 1),
                                   custom_df['neighbors'].apply(
                                       lambda row: np.min([dic[v] for v in row if dic.get(v)]) if len(row) else 0),
                                   custom_df['values'])
    return custom_df
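The workaround above guards on len(row). A slightly more defensive variant, sketched below, guards on the filtered list instead, so it also covers a non-empty row whose neighbors all lack a usable value. This is an assumption-laden sketch rather than the answerer's code: the names `transformation_min`, `lookup`, and `neighbor_min` are invented here, it returns nan instead of 0 for empty rows, it skips NaN neighbor values explicitly (the original `dic.get(v)` test drops 0 but keeps NaN), and it returns a copy instead of modifying the input.

```python
import numpy as np
import pandas as pd


def transformation_min(custom_df):
    # Map each customer id to its current value (may be NaN).
    lookup = dict(zip(custom_df['customers'], custom_df['values']))

    def neighbor_min(row):
        # Keep only neighbors that exist and have a non-missing value.
        known = [lookup[v] for v in row if v in lookup and not pd.isna(lookup[v])]
        # Guard on the filtered list: min() of an empty list would raise.
        return min(known) if known else np.nan

    # .apply() runs on every row; np.where only picks which result to keep.
    candidates = custom_df['neighbors'].apply(neighbor_min)
    fill_mask = custom_df['values'].isna() & (custom_df['valid_neighbors'] >= 1)

    result = custom_df.copy()
    result['values'] = np.where(fill_mask, candidates, custom_df['values'])
    return result


df2 = pd.DataFrame({
    'customers': [1, 2, 3, 4, 5, 6],
    'values': [np.nan, np.nan, 10, np.nan, 11, 12],
    'neighbors': [[6], [3], [], [3, 5], [6], [5]],
    'valid_neighbors': [1, 1, 0, 2, 1, 1],
})
result = transformation_min(df2)
```

On the sample data from the question, `result['values']` should match the expected output listed there.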

Reply from a uj5u.com user:

Use the following code to get the result:

df3 = df2.set_index('customers')
df2['values'].fillna(df2['neighbors'].apply(lambda x: df3.loc[x, 'values'].mean()))

Output (mean):

0   12.00
1   10.00
2   10.00
3   10.50
4   11.00
5   12.00
Name: values, dtype: float64

You can change mean to min:

df2['values'].fillna(df2['neighbors'].apply(lambda x: df3.loc[x, 'values'].min()))

Output (min):

0   12.00
1   10.00
2   10.00
3   10.00
4   11.00
5   12.00
Name: values, dtype: float64

Assign this result back to the values column to get the desired output.
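Put together, the fillna approach might look like the sketch below, run against the sample data from the question. The variable names `by_customer` and `neighbor_min` are illustrative only. Unlike the original function, this version does not consult valid_neighbors: it fills every NaN for which a neighbor minimum exists. pandas' Series.min() returns nan for an empty selection instead of raising, which is why the empty list needs no special handling here.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'customers': [1, 2, 3, 4, 5, 6],
    'values': [np.nan, np.nan, 10, np.nan, 11, 12],
    'neighbors': [[6], [3], [], [3, 5], [6], [5]],
    'valid_neighbors': [1, 1, 0, 2, 1, 1],
})

# Values indexed by customer id, so a list of neighbor ids can be looked up with .loc.
by_customer = df2.set_index('customers')['values']

# Minimum over each row's neighbors; an empty list selects nothing and yields nan.
neighbor_min = df2['neighbors'].apply(lambda ids: by_customer.loc[ids].min())

# fillna only touches rows that are currently NaN; assign back to keep the result.
df2['values'] = df2['values'].fillna(neighbor_min)
```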

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qukuanlian/528666.html

Tags: pandas, numpy, numpy-ndarray, nested-lists
