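Below I am creating 3 dataframes. df2 and df3 are both nested inside df1. I am then trying to use .apply() on all of the nested dataframes and ultimately add a new column to the outer dataframe that is essentially a revised version of each nested dataframe.
I would like to apply the function below to every element (dataframe) in the 'df_name' column of df1. I also need to pass other column values from the same row of df1 into the function. For example, the function needs to know the value 'sp' when .apply() runs on df2.
For the attempt below, I would appreciate some insight on the following:
- How can I use .apply() to access a nested dataframe and also reference other values from df1?
- Is there a way to solve this using vectorization?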
import pandas as pd

cols = ['sales', 'sku']
names = [
    [100, 'asdf'],
    [200, 'qwer'],
    [250, 'zxcv'],
    [175, 'yuop']
]
df2 = pd.DataFrame(names, columns=cols)

cols = ['sales', 'sku']
names = [
    [80, 'nyer'],
    [60, 'cawe']
]
df3 = pd.DataFrame(names, columns=cols)

cols = ['name', 'cmpgn_type', 'df_name']
names = [
    ['dustin', 'sp', df2],
    ['jenny', 'sb', df3]
]
df1 = pd.DataFrame(names, columns=cols)
# Desired column order for each campaign type
sp_cols_order = ['sales', 'sku', 'Record Type']
sb_cols_order = ['Record Type', 'sku', 'sales']

def cmpngs(df, type):
    df_shape = df.shape[0]
    for x in range(df_shape):
        df['Record Type'] = 'hello'
    if type == 'sp':
        df = df[sp_cols_order]
    elif type == 'sb':
        df = df[sb_cols_order]
    return df

# Attempt: run cmpngs on each nested dataframe, passing along the campaign type
df1['ul_cmpgn'] = df1['df_name'].apply(cmpngs, args=(df1['cmpgn_type'],))
print(df1['ul_cmpgn'].iloc[0])
print(df1['ul_cmpgn'].iloc[1])
Expected output for df1:
     name  cmpgn_type  df_name  ul_cmpgn
0  dustin          sp      df2      df2a
1   jenny          sb      df3      df3a
Expected output for df2:
   sales   sku  Record Type
0    100  asdf        hello
1    200  qwer        hello
2    250  zxcv        hello
3    175  yuop        hello
Expected output for df3:
   Record Type  sales   sku
0        hello     80  nyer
1        hello     60  cawe
A uj5u.com user replied:
Try changing your cmpngs function to take a single argument, row, and call apply on the whole dataframe rather than just the df_name column, with axis=1:
def cmpngs(row):
    # Each row of df1 carries both the nested dataframe and its campaign type
    df = row['df_name']
    type = row['cmpgn_type']
    df_shape = df.shape[0]
    for x in range(df_shape):
        df['Record Type'] = 'hello'
    if type == 'sp':
        df = df[sp_cols_order]
    elif type == 'sb':
        df = df[sb_cols_order]
    return df

# axis=1 passes one row of df1 at a time to cmpngs
df1['ul_cmpgn'] = df1.apply(cmpngs, axis=1)
print(df1['ul_cmpgn'].iloc[0])
print(df1['ul_cmpgn'].iloc[1])
Output:
   sales   sku  Record Type
0    100  asdf        hello
1    200  qwer        hello
2    250  zxcv        hello
3    175  yuop        hello
   Record Type   sku  sales
0        hello  nyer     80
1        hello  cawe     60
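For context, the attempt in the question breaks because Series.apply passes the same extra args to every call, so the function receives the entire cmpgn_type column instead of the single value for that row. If you would rather not go through axis=1, one possible alternative is to pair the two columns with zip. This is only a sketch and not part of the reply above: the helper name add_record_type and the COLUMN_ORDER dict are made up for illustration, and it assumes the new column is called 'Record Type' in both orderings. It uses assign, which returns a new dataframe, so df2 and df3 are left unmodified.

```python
import pandas as pd

df2 = pd.DataFrame([[100, 'asdf'], [200, 'qwer'], [250, 'zxcv'], [175, 'yuop']],
                   columns=['sales', 'sku'])
df3 = pd.DataFrame([[80, 'nyer'], [60, 'cawe']], columns=['sales', 'sku'])
df1 = pd.DataFrame([['dustin', 'sp', df2], ['jenny', 'sb', df3]],
                   columns=['name', 'cmpgn_type', 'df_name'])

# Hypothetical lookup: campaign type -> column order (column names assumed)
COLUMN_ORDER = {
    'sp': ['sales', 'sku', 'Record Type'],
    'sb': ['Record Type', 'sku', 'sales'],
}

def add_record_type(df, cmpgn_type):
    """Return a reordered copy of df with a constant 'Record Type' column."""
    out = df.assign(**{'Record Type': 'hello'})  # assign returns a new frame
    return out[COLUMN_ORDER[cmpgn_type]]

# Walk the nested frames and their campaign types side by side
results = [add_record_type(df, kind)
           for df, kind in zip(df1['df_name'], df1['cmpgn_type'])]

# Build an object Series explicitly so each cell holds one dataframe
df1['ul_cmpgn'] = pd.Series(results, index=df1.index, dtype=object)

print(df1['ul_cmpgn'].iloc[0])
print(df1['ul_cmpgn'].iloc[1])
```

This is still a Python-level loop over rows, just like apply with axis=1, so it is not a vectorized solution.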
You can't really vectorize operations over nested dataframes.
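If vectorization matters, one option may be to avoid the nesting altogether: stack the per-campaign dataframes into one long dataframe and carry the campaign type along as an ordinary column. The sketch below shows that idea under the assumption that such a layout is acceptable for your use case. It is not something the reply above proposes, and the names parent_row, item and flat are made up for illustration.

```python
import pandas as pd

df2 = pd.DataFrame([[100, 'asdf'], [200, 'qwer'], [250, 'zxcv'], [175, 'yuop']],
                   columns=['sales', 'sku'])
df3 = pd.DataFrame([[80, 'nyer'], [60, 'cawe']], columns=['sales', 'sku'])
df1 = pd.DataFrame([['dustin', 'sp', df2], ['jenny', 'sb', df3]],
                   columns=['name', 'cmpgn_type', 'df_name'])

# Stack the nested frames; the outer index level records which df1 row
# each block came from, then both levels become regular columns.
flat = pd.concat(df1['df_name'].tolist(),
                 keys=df1.index.tolist(),
                 names=['parent_row', 'item']).reset_index()

# Bring values from the parent row down to every child row
flat['name'] = flat['parent_row'].map(df1['name'])
flat['cmpgn_type'] = flat['parent_row'].map(df1['cmpgn_type'])

# Column-wise operations now cover all campaigns in one step
flat['Record Type'] = 'hello'

print(flat)
```

Column order belongs to a whole dataframe rather than to individual rows, so the sp/sb reordering would still need to happen if and when the long dataframe is split back up, for example by grouping on parent_row.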
