我的資料如下所示,它位于擴展名為 xlsx 的 excel 檔案中,因此我使用openpyxl庫在 python 中讀取檔案并從第 6 行開始提取 2 列的資料,并將提取的資料附加到帶有鍵的字典中“Column_name”和“Column_Type”。這樣做的原因是,我想在提取的資料之間添加額外的字串。我要添加的字串是 CREATE TABLE TRANSIENT TABLE STG_EMPLOYEE();
到目前為止,我的代碼和輸出如下所示。我在 txt 檔案中的預期輸出也在下面提到。如果您注意到,我想同時從兩個鍵中列舉值,我不確定如何實作這一點,python所以感謝您的幫助?
提前感謝您的時間和精力!
資料
File Name: Employee
Sheet Name: Employee
File Type: csv
Field Name Type
Name String
Salary Numeric
Date Date
Phone Int
到目前為止的代碼
from openpyxl import load_workbook
data_file='\\test.xlsx'
# Load the entire workbook.
wb = load_workbook(data_file)
ws = wb['Employee'] #Access Sheet
outputFile = open('/output.txt', 'w') # Text file Output
outputFile.write("CREATE TABLE TRANSIENT TABLE STG_EMPLOYEE({});".format(theString) "\n")
mylines={"Column_name":[],"Column_Type":[]} #Getting 2 columns data from row 6
for i in range(6, ws.max_row 1):
name = ws.cell(row=i, column=1).value
name1=ws.cell(row=i, column=2).value
mylines["Column_name"].append(name) #Appending dictionary key "Column_name"
mylines["Column_Type"].append(name1) #Appending dictionay key "Column_type"
theString = " "
# This relies on the lists 'Column_name' and 'Column_Type' always being the same length
# I.e., there should always be a value for each key
for i in range(len(mylines['Column_name'])):
theString = mylines['Column_name'][i] " " mylines['Column_Type'][i]
if i < ws.max_row:
theString = ", "
outputFile.close()
使用上面的代碼在 txt 檔案中輸出:
CREATE TABLE TRANSIENT TABLE STG_EMPLOYEE([['Name', 'Salary', 'Date', 'Phone'], ['String', 'Numeric', 'Date', 'Int']]);
Txt 檔案中的預期輸出
CREATE TABLE TRANSIENT TABLE STG_EMPLOYEE( Name String, Salary Numeric, Date Date, Phone Int);
uj5u.com熱心網友回復:
您可以利用"Column_name"和"Column_Type"串列中應始終存在相同數量的專案這一事實,以使用單個變數將問題減少到迭代:
# Simulated Excel Data
ws = {
'max_row': 3,
'cell': [
['Name', 'String'],
['Salary', 'Numeric'],
['Date', 'Date'],
['Phone', 'Int']
]
}
mylines={"Column_name":[],"Column_Type":[]}
for i in range(0, ws['max_row'] 1):
name = ws['cell'][i][0]
name1=ws['cell'][i][1]
mylines["Column_name"].append(name)
mylines["Column_Type"].append(name1)
theString = " "
# This relies on the lists 'Column_name' and 'Column_Type' always being the same length
# I.e., there should always be a value for each key
for i in range(len(mylines['Column_name'])):
theString = mylines['Column_name'][i] " " mylines['Column_Type'][i]
if i < ws['max_row']:
theString = ", "
# OLD
print("CREATE TABLE TRANSIENT TABLE STG_EMPLOYEE({});".format([(mylines[k]) for k,v in mylines.items()]) "\n")
# NEW
print("CREATE TABLE TRANSIENT TABLE STG_EMPLOYEE({});".format(theString) "\n")
當我運行它時,我得到以下輸出(舊版本的第一行和新版本的第二行):
CREATE TABLE TRANSIENT TABLE STG_EMPLOYEE([['Name', 'Salary', 'Date', 'Phone'], ['String', 'Numeric', 'Date', 'Int']]);
CREATE TABLE TRANSIENT TABLE STG_EMPLOYEE( Name String, Salary Numeric, Date Date, Phone Int);
當然,如果您的電子表格中有實際的數字和其他非字串資料,您還需要在將變數值str()附加到theString.
與您的代碼的唯一區別(除了我為使代碼在沒有實際電子表格的情況下作為獨立代碼作業所做的更改)是將format最后一行中的引數替換為通過將兩個串列中的值附加到mylines.
轉載請註明出處,本文鏈接:https://www.uj5u.com/qiye/465152.html
標籤:Python python-3.x 擅长 字典 文本文件
上一篇:如何獲取dict中鍵的最大值?
