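I'm trying to read a text file that uses ":" as a delimiter, find specific search terms in the first column, and then output the second column to a .csv file.
The file I'm pulling from has many sections that look like the following (2 shown here, there are many more):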
Object : Info1
Type : Info2
LastChange : INFO3
DeviceId : INFO4
EndObject
Object : Info5
Type : Info6
LastChange : INFO7
DeviceId : INFO8
EndObject
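This repeats with the same first column (Object, Type, etc.) but a different Info# each time.
I want to search on the first column (Object, Type, LastChange, DeviceId) and pull each "Info#" into a csv file that reads: Info1, Info2, Info3, Info4
So far I've gotten it to output Object and Type, but I only get one iteration of the for loop. My code so far: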
import csv
import string
import pandas as pd
filename1 = 'test.txt' #EDIT THIS TO MATCH EXACTLY THE .DMP FILE YOU WISH TO READ!!
infile = open(filename1, 'r', errors = 'ignore') #this names the read file variable, !!DO NOT TOUCH!!
lines = infile.readlines()
filename2 = 'test.csv'
outfile = open(filename2,'w')
headerList ="Type:Device:Name:Change\n".split(':')
headerString = ','.join(headerList)
outfile.write(headerString)
for line in lines[1:]:
    sline = line.split(":")
    if 'Type' in sline[0]:
        dataList = sline[1:]
        dataString = ','.join(dataList)
        typestring1 = ','.join([x.strip() for x in dataString.split(",")])
    if ' Object' in sline[0]:
        objectList = sline[1:]
        objectstring = ','.join(objectList)
        namestring1 = ','.join([x.strip()for x in objectstring.split(",")])
writeString = (typestring1 + "," + namestring1 + "," + "\n")
outfile.write(writeString)
outfile.close()
infile.close()
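I'm new to Python; any help would be greatly appreciated.
Answer from a uj5u.com user:
I looked for a parser for this format but couldn't identify the format of the input you're showing. I'd take some time to make sure there isn't already a "Python parser for microcontroller memory DMP files". You know the context better than I do, so your search may be more fruitful.
In the meantime, given your sample, input.txt: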
Object : Info1
Type : Info2
LastChange : INFO3
DeviceId : INFO4
EndObject
Object : Info5
Type : Info6
LastChange : INFO7
DeviceId : INFO8
EndObject
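Here is an end-to-end solution that reads that sample and converts each "block" of object data into a CSV row.
The main point is to break this kind of problem down into as many discrete steps as possible, like so:
- Filter the DMP file down to the block markers plus the lines that have at least one colon (:) to parse as a value (or, more specifically, just the Type : lines)
- Parse the filtered lines and prove you've found all the blocks
- Convert each block's lines into a row (that is, something you can pass to the csv module's writer class)
- Write your rows as CSV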
import csv
import pprint
filtered_lines = []
with open('input.txt') as f:
    for line in f:
        line = line.strip()
        if line.startswith('Object') or line == 'EndObject':
            filtered_lines.append(line)
            continue
        # Keep only Type
        if line.startswith('Type :'):
            filtered_lines.append(line)
            continue
        # or, keep any line with a colon
        # if ':' in line:
        #     filtered_lines.append(line)
        #     continue
        # at this point, no predicate has been satisfied, drop line
        pass  # redundant, but poignant and satisfying :)
all_blocks = []
this_block = None
in_block = False
for line in filtered_lines:
    # Find the start of a "block" of data
    if line.startswith('Object'):
        in_block = True
        this_block = []
    # Find the end of block...
    if line == 'EndObject':
        # save it
        all_blocks.append(this_block)
        # reset for next block
        this_block = None
        in_block = False
    if in_block:
        this_block.append(line)
print('Blocks:')
pprint.pprint(all_blocks)
# Convert a list of blocks to a list of rows
all_rows = []
for block in all_blocks:
    row = []
    # Convert a list of lines (key : value) to a "row", a list of single-value strings
    for line in block:
        # Split on the first colon only, so a value that itself contains ':' stays intact
        _, value = line.split(':', 1)
        row.append(value.strip())
    all_rows.append(row)
print('Rows:')
pprint.pprint(all_rows)
# Finally, save as CSV
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(all_rows)
When I run that against the sample input, I get:
Blocks:
[['Object : Info1', 'Type : Info2'], ['Object : Info5', 'Type : Info6']]
Rows:
[['Info1', 'Info2'], ['Info5', 'Info6']]
and finally, output.csv:
Info1,Info2
Info5,Info6
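The script above keeps only Object and Type. If all four fields from the question are wanted in each row, one possible variation is sketched below. It is not part of the original answer: the helper names parse_blocks and write_csv, the FIELDS list, and the choice to use the field names as the CSV header are all assumptions made for illustration. It collects each block into a dict and writes it with csv.DictWriter, and it splits each line on the first colon only, in case a value (a timestamp, say) contains colons of its own.

```python
import csv
import io

# Assumed column order, taken from the order the question lists the fields in.
FIELDS = ["Object", "Type", "LastChange", "DeviceId"]


def parse_blocks(lines):
    """Yield one dict per Object ... EndObject block (hypothetical helper)."""
    block = None
    for raw in lines:
        line = raw.strip()
        if line == "EndObject":
            if block is not None:
                yield block
            block = None
            continue
        # partition() splits on the first colon only
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon, so not a "key : value" line
        key, value = key.strip(), value.strip()
        if key == "Object":
            block = {}  # an Object line starts a new block
        if block is not None and key in FIELDS:
            block[key] = value


def write_csv(blocks, out):
    """Write the block dicts to an open text file object (hypothetical helper)."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(blocks)


sample = """\
Object : Info1
Type : Info2
LastChange : INFO3
DeviceId : INFO4
EndObject
Object : Info5
Type : Info6
LastChange : INFO7
DeviceId : INFO8
EndObject
"""

rows = list(parse_blocks(sample.splitlines()))
buf = io.StringIO()  # stands in for a real file so the sketch is self-contained
write_csv(rows, buf)
print(buf.getvalue())
```

To write a real file instead of the in-memory buffer, pass write_csv a file opened with open('output.csv', 'w', newline=''), as in the script above. A field missing from a block is written as an empty cell because of restval="".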
