Using .apply() on a DataFrame of DataFrames

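Below I create 3 dataframes. df2 and df3 are both nested inside df1. I then try to use .apply() on all of the nested dataframes and, in the end, add a new column to the outer dataframe that is essentially a revised version of each nested dataframe.

I want to apply the function below to every element (dataframe) found in the 'df_name' column of df1. I also need to pass other column values from the same row of df1 into the function, i.e. the function needs to know the value 'sp' when .apply() runs on df2.

Given the .apply() attempt below, I would appreciate some insight on the following:
- How can the function access the nested dataframe and also reference other column values from df1?
- Is there a way to solve this with vectorization?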

import pandas as pd

cols = ['sales', 'sku']
names = [
    [100, 'asdf'],
    [200, 'qwer'],
    [250, 'zxcv'],
    [175, 'yuop']
]
df2 = pd.DataFrame(names, columns = cols)


cols = ['sales', 'sku']
names = [
    [80, 'nyer'],
    [60, 'cawe']
]
df3 = pd.DataFrame(names, columns = cols)


cols = ['name', 'cmpgn_type', 'df_name']
names = [
    ['dustin', 'sp', df2],
    ['jenny', 'sb', df3]
]
df1 = pd.DataFrame(names, columns = cols)


sp_cols_order = ['sales', 'sku', 'Record_Type']
sb_cols_order = ['Record_Type', 'sku', 'sales']


def cmpngs(df, type):
    df_shape = df.shape[0]
    for x in range(df_shape):
        df['Record_Type'] = 'hello'
        if type == 'sp':
            df = df[sp_cols_order]
        elif type == 'sb':
            df = df[sb_cols_order]
    return df


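# Attempt: this passes the whole df1['cmpgn_type'] column as `type`,
# not the single value from the same row as each nested dataframe
df1['ul_cmpgn'] = df1['df_name'].apply(cmpngs, args=(df1['cmpgn_type'],))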

print(df1['ul_cmpgn'].iloc[0])
print(df1['ul_cmpgn'].iloc[1])

Expected output for df1:

     name cmpgn_type df_name ul_cmpgn
0  dustin         sp     df2     df2a
1   jenny         sb     df3     df3a

Expected output for df2:

   sales   sku Record_Type
0    100  asdf       hello
1    200  qwer       hello
2    250  zxcv       hello
3    175  yuop       hello

Expected output for df3:

  Record_Type   sku  sales
0       hello  nyer     80
1       hello  cawe     60

Answer:

Try changing your cmpngs function to take a single argument, row, and call apply on the whole dataframe instead of just the df_name column, with axis=1:

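# Column orders from the question, with the new column spelled
# 'Record_Type' in both lists
sp_cols_order = ['sales', 'sku', 'Record_Type']
sb_cols_order = ['Record_Type', 'sku', 'sales']

def cmpngs(row):
    # Copy so the nested dataframe stored in df1['df_name'] is not modified
    df = row['df_name'].copy()
    cmpgn_type = row['cmpgn_type']
    # A scalar is broadcast to every row, so no loop over rows is needed
    df['Record_Type'] = 'hello'
    if cmpgn_type == 'sp':
        df = df[sp_cols_order]
    elif cmpgn_type == 'sb':
        df = df[sb_cols_order]
    return df

df1['ul_cmpgn'] = df1.apply(cmpngs, axis=1)

print(df1['ul_cmpgn'].iloc[0])
print(df1['ul_cmpgn'].iloc[1])

Output:

   sales   sku Record_Type
0    100  asdf       hello
1    200  qwer       hello
2    250  zxcv       hello
3    175  yuop       hello

  Record_Type   sku  sales
0       hello  nyer     80
1       hello  cawe     60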
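
If you would rather not build a row object for every call, a possible alternative to the row-wise apply is to zip the two columns together and loop over the pairs directly. This is only a sketch: the helper name revise and the COLS_ORDER lookup table are my own illustrative names, not something from the question, and the sample data is repeated so the snippet runs on its own.

```python
import pandas as pd

# Same sample data as in the question
df2 = pd.DataFrame([[100, 'asdf'], [200, 'qwer'], [250, 'zxcv'], [175, 'yuop']],
                   columns=['sales', 'sku'])
df3 = pd.DataFrame([[80, 'nyer'], [60, 'cawe']], columns=['sales', 'sku'])
df1 = pd.DataFrame([['dustin', 'sp', df2], ['jenny', 'sb', df3]],
                   columns=['name', 'cmpgn_type', 'df_name'])

# Hypothetical lookup table: campaign type -> desired column order
COLS_ORDER = {
    'sp': ['sales', 'sku', 'Record_Type'],
    'sb': ['Record_Type', 'sku', 'sales'],
}

def revise(df, cmpgn_type):
    # assign() returns a new dataframe, so the nested original is left untouched
    return df.assign(Record_Type='hello')[COLS_ORDER[cmpgn_type]]

# Pair each nested dataframe with the campaign type from the same row
df1['ul_cmpgn'] = [revise(df, t)
                   for df, t in zip(df1['df_name'], df1['cmpgn_type'])]

print(df1['ul_cmpgn'].iloc[0])
print(df1['ul_cmpgn'].iloc[1])
```

The dictionary lookup replaces the if/elif chain, so an unknown campaign type raises a KeyError instead of silently returning the dataframe with its columns unordered. Whether that is what you want depends on your data.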

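You can't really vectorize operations over nested dataframes: each cell of df_name holds an arbitrary Python object, so pandas has to call your function once per row in Python.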
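
If vectorization matters, one option is to stop nesting and stack the inner dataframes into a single long dataframe, so that column operations run once over all rows. The sketch below assumes the name column uniquely identifies each outer row. The variable name flat and the index level names 'name' and 'row' are my own choices, not part of the question.

```python
import pandas as pd

# Same sample data as in the question
df2 = pd.DataFrame([[100, 'asdf'], [200, 'qwer'], [250, 'zxcv'], [175, 'yuop']],
                   columns=['sales', 'sku'])
df3 = pd.DataFrame([[80, 'nyer'], [60, 'cawe']], columns=['sales', 'sku'])
df1 = pd.DataFrame([['dustin', 'sp', df2], ['jenny', 'sb', df3]],
                   columns=['name', 'cmpgn_type', 'df_name'])

# Stack the nested dataframes into one long dataframe, keyed by the outer name
flat = pd.concat(dict(zip(df1['name'], df1['df_name'])), names=['name', 'row'])
flat = flat.reset_index(level='name')

# Bring in the campaign type of each outer row with an ordinary merge
flat = flat.merge(df1[['name', 'cmpgn_type']], on='name')

# Column operations now apply to every row at once
flat['Record_Type'] = 'hello'

print(flat)
```

A single flat dataframe has only one column order, so the per-type ordering from sp_cols_order and sb_cols_order would have to be applied when you split it back up, for example per group after flat.groupby('name').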
