I have a dataframe in which several columns hold the same kind of value:
['total_tracks', 't_dur0', 't_dur1', 't_dur2', 't_dance0', 't_dance1', 't_dance2',
't_energy0', 't_energy1', 't_energy2', 't_key0', 't_key1', 't_key2', 't_mode0',
't_mode1', 't_mode2', 't_speech0', 't_speech1', 't_speech2', 't_acous0', 't_acous1',
't_acous2', 't_ins0', 't_ins1', 't_ins2', 't_live0', 't_live1', 't_live2', 't_val0',
't_val1', 't_val2', 't_tempo0', 't_tempo1', 't_tempo2', 't_sig0', 't_sig1', 't_sig2',
'popularity', 'release_year', 'release_month']
I am trying to combine the columns of the same kind, like this:
# Takes in a dataframe with three columns and returns a dataframe with one column of their means
def average_column(dataframe):
    dataframe["mean"] = dataframe.mean(axis=1)  # Add column to the dataframe (axis=1 means the mean() is applied row-wise)
    mean_df = dataframe.iloc[:, -1:]  # Isolate the column of means by selecting all rows (:) for the last column (-1:)
    print("Original: {}\tWith mean:\n{}".format(dataframe, mean_df))
    return mean_df
This was inspired by this question and this one. I tried running the following code:
t_name_df = df[["t_dur0", "t_dur1", "t_dur2"]]
print(t_name_df.columns.tolist())
average_column(t_name_df)
which gave me this output:
['t_dur0', 't_dur1', 't_dur2']
Original:
t_dur0 t_dur1 t_dur2 mean
0 2315 2310 2293 2306.000000
1 1558 886 1870 1438.000000
2 803 316 504 541.000000
3 498 815 677 663.333333
4 1508 1677 1386 1523.666667
... ... ... ... ...
[2833 rows x 4 columns]
With mean:
mean
0 2306.000000
1 1438.000000
2 541.000000
3 663.333333
4 1523.666667
... ...
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
To get rid of the warning, I tried rewriting it:
t_name_df = df.loc['t_dur0', 't_dur0']
print(t_name_df.column.tolist())
average_column(t_name_df)
which gave me this error:
KeyError: 't_dur0'
How do I get rid of this warning properly?
A uj5u.com user replied:
Change your average_column function to:
def average_column(dataframe):
    # ADD THIS LINE:
    dataframe = dataframe.copy()
    dataframe["mean"] = dataframe.mean(axis=1)  # Add column to the dataframe (axis=1 means the mean() is applied row-wise)
    mean_df = dataframe.iloc[:, -1:]  # Isolate the column of means by selecting all rows (:) for the last column (-1:)
    print("Original: {}\tWith mean:\n{}".format(dataframe, mean_df))
    return mean_df
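The warning appears because t_name_df = df[["t_dur0", "t_dur1", "t_dur2"]] gives you a subset of df, and when you then assign a new column to it pandas cannot tell whether you expected that change to reach df. Selecting with a list of columns creates a copy of those columns, so the change you make to it (t_name_df) will not be reflected in the original dataframe (df), and that is what pandas is telling you. By adding .copy(), you explicitly let pandas know that you are working on an independent object and are fine with that.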
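As a complement to the .copy() fix, here is a minimal sketch of a variant that never assigns to the selection at all, so there is nothing for pandas to warn about. It also shows why the .loc attempt in the question raised KeyError: .loc takes the row indexer first and the column indexer second, so df.loc['t_dur0', 't_dur0'] looks for a row labelled 't_dur0'. The small df below is a hypothetical stand-in built from the first three rows printed in the question; the popularity values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the asker's df: the t_dur values are the first
# three rows shown in the question, the popularity values are placeholders.
df = pd.DataFrame({
    "t_dur0": [2315, 1558, 803],
    "t_dur1": [2310, 886, 316],
    "t_dur2": [2293, 1870, 504],
    "popularity": [50, 60, 70],
})


def average_column(dataframe):
    """Return a one-column DataFrame of row-wise means.

    Nothing is assigned to `dataframe`, so the input is left untouched.
    """
    return dataframe.mean(axis=1).to_frame(name="mean")


# .loc expects rows first, then columns: ":" selects every row,
# and the list selects the three duration columns.
t_name_df = df.loc[:, ["t_dur0", "t_dur1", "t_dur2"]]

mean_df = average_column(t_name_df)
print(mean_df)
```

Compared with the answer above, this version skips the intermediate "mean" column on the input and builds the result directly from the Series that mean(axis=1) returns. If you do want to add columns to the selection later, taking the copy at selection time (for example df[["t_dur0", "t_dur1", "t_dur2"]].copy()) is the other common option.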
