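I have two Excel files, each with 10 worksheets. I want to read every worksheet, compare the two files sheet by sheet, and write the result to a third Excel file, even if that means writing to multiple worksheets.
The following program works for a single worksheet: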
import pandas as pd

# Read the first sheet of each workbook.
df1 = pd.read_excel('zyx_5661.xlsx')
df2 = pd.read_excel('zyx_5662.xlsx')
# Tag each column with the file it came from.
df1.rename(columns=lambda x: x + '_file1', inplace=True)
df2.rename(columns=lambda x: x + '_file2', inplace=True)
# Outer-join on every column so unmatched rows from either file are kept.
df_join = df1.merge(right=df2, left_on=df1.columns.to_list(), right_on=df2.columns.to_list(), how='outer')
with pd.ExcelWriter('xl_join_diff.xlsx') as writer:
    df_join.to_excel(writer, sheet_name='testing', index=False)
How can I adapt it to work with multiple worksheets? Please advise.
Thank you.
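Answer from a uj5u.com user:
I think this should do what you need. Loop over each worksheet name (this assumes the names are the same in both Excel files; if they are not, you can use the sheet's numeric index instead). Write each merged result to a new worksheet, then save the Excel file.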
import pandas as pd

writer = pd.ExcelWriter('xl_join_diff.xlsx')
for sheet in ['sheet1', 'sheet2', 'sheet3']:  # list of sheet names
    # Pull in data for each sheet, and merge together.
    df1 = pd.read_excel('zyx_5661.xlsx', sheet_name=sheet)
    df2 = pd.read_excel('zyx_5662.xlsx', sheet_name=sheet)
    df1.rename(columns=lambda x: x + '_file1', inplace=True)
    df2.rename(columns=lambda x: x + '_file2', inplace=True)
    df_join = df1.merge(right=df2, left_on=df1.columns.to_list(),
                        right_on=df2.columns.to_list(), how='outer')
    df_join.to_excel(writer, sheet_name=sheet, index=False)  # write to Excel as a new sheet
# Save the workbook once all sheets are done (writer.save() was removed in pandas 2.0).
writer.close()
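If the sheet names are not known in advance, one option is to pass `sheet_name=None` to `pd.read_excel`, which returns a dict mapping every sheet name to its DataFrame. The sketch below is one possible variation on the loop above, not the answerer's code: the function names `compare_workbooks` and `compare_files` are hypothetical, and `indicator=True` is added to the merge so each row is tagged as coming from the first file, the second file, or both. It assumes matching sheets have the same number of columns and that column names are strings.

```python
import pandas as pd


def compare_workbooks(sheets1, sheets2):
    """Outer-join every sheet that exists in both dicts of DataFrames.

    Hypothetical helper: keys are sheet names, values are DataFrames.
    """
    result = {}
    for name, df1 in sheets1.items():
        if name not in sheets2:
            continue  # skip sheets that exist in only one workbook
        left = df1.add_suffix('_file1')
        right = sheets2[name].add_suffix('_file2')
        joined = left.merge(
            right,
            left_on=left.columns.to_list(),
            right_on=right.columns.to_list(),  # must match left_on in length
            how='outer',
            indicator=True,  # adds a '_merge' column: left_only / right_only / both
        )
        # Store the indicator as plain text so it is written like any other column.
        joined['_merge'] = joined['_merge'].astype(str)
        result[name] = joined
    return result


def compare_files(path1, path2, out_path):
    """Read all sheets of both workbooks and write one comparison sheet per name."""
    sheets1 = pd.read_excel(path1, sheet_name=None)  # dict: sheet name -> DataFrame
    sheets2 = pd.read_excel(path2, sheet_name=None)
    with pd.ExcelWriter(out_path) as writer:  # closes (saves) the file on exit
        for name, df in compare_workbooks(sheets1, sheets2).items():
            df.to_excel(writer, sheet_name=name, index=False)
```

With the file names from the question, this would be called as `compare_files('zyx_5661.xlsx', 'zyx_5662.xlsx', 'xl_join_diff.xlsx')`. Keeping the comparison separate from the file I/O makes it possible to check the merge logic on small in-memory DataFrames first.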
Answer from a uj5u.com user:
You can use a loop to read the files and their worksheets:
import pandas as pd

# create a writer for writing all sheets to one file
writer = pd.ExcelWriter('multiple.xlsx', engine='xlsxwriter')
list_files = ['zyx_5661.xlsx', 'zyx_5662.xlsx']
count_sheets = 0
for file_name in list_files:
    file = pd.ExcelFile(file_name)
    for sheet_name in file.sheet_names:
        df = pd.read_excel(file, sheet_name=sheet_name)
        # ... you can do your process
        count_sheets = count_sheets + 1
        df.to_excel(writer, sheet_name='Sheet-' + str(count_sheets))
# Save the workbook (writer.save() was removed in pandas 2.0).
writer.close()
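Numbered names such as `Sheet-1` lose track of which file and sheet each output came from. As a rough sketch, a name could instead be built from the source file and sheet, keeping in mind that Excel limits worksheet names to 31 characters and rejects the characters `[ ] : * ? / \`. The helper `make_sheet_name` below is hypothetical and is not part of pandas or of the answer above; it does not cover every Excel naming rule.

```python
from pathlib import Path

MAX_LEN = 31             # Excel's limit on worksheet name length
INVALID = set('[]:*?/\\')  # characters Excel does not allow in sheet names


def make_sheet_name(file_name, sheet_name, used):
    """Build a unique, Excel-safe name such as 'zyx_5661-Sheet1'.

    `used` is a set of names already handed out; it is updated in place.
    """
    raw = f"{Path(file_name).stem}-{sheet_name}"
    # Replace forbidden characters, then truncate to the length limit.
    clean = ''.join('_' if ch in INVALID else ch for ch in raw)[:MAX_LEN]
    candidate, n = clean, 1
    while candidate in used:
        # On a clash, append ~1, ~2, ... while staying within the limit.
        suffix = f"~{n}"
        candidate = clean[:MAX_LEN - len(suffix)] + suffix
        n += 1
    used.add(candidate)
    return candidate
```

Inside the inner loop, `df.to_excel(writer, sheet_name=make_sheet_name(file_name, sheet_name, used))` could then replace the counter-based name, with `used = set()` created once before the loops.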
