I have multiple txt files that look like this:
[Level1]
Location = "London"
Type= "GTHY66"
Date = "16-11-2021"
Energy level = "Critical zero"
[Level2]
0.000 26.788
0.027 26.807
0.053 26.860
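So, from each file I read/process, I want to create two dataframes (which I will eventually push to a database).
The dataframe built from Level1 needs to be df_level1: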
Location  Type    Date        Energy
London    GTHY66  16-11-2021  Critical zero
The dataframe built from the data under Level2 needs to be df_level2:
Speed Energylevel
0.000 26.788
0.027 26.807
0.053 26.860
Here is what I have tried, but I am stuck:
import os
import re

import pandas as pd

energy_root = r'c:\data\Desktop\Studio\Energyfiles'

# create list of file paths
def read_txt_file(path):
    list_file_path = []
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith('.txt'):
                file_name = os.path.basename(file)
                file_path = os.path.join(root, file_name)
                list_file_path.append(file_path)
    return list_file_path

def create_df():
    for file in read_txt_file(energy_root):
        file_name = os.path.basename(file)
        file_path = os.path.join(energy_root, file_name)
        datetime = re.findall(r'_(\d{8}_\d{6})\.', file_name)[0]
        with open(file_path, 'r') as output:
            reader = output.readlines()
            for row in reader:
                d = row.split('=')
                if len(d) > 1:
                    # builds a separate one-column frame for every key = value line
                    df_level1 = pd.DataFrame([d[1]], columns=[d[0]])
                    print(df_level1)
        # then create df_level2 .....

create_df()
Reply from a uj5u.com user:
Try this:
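import re

import pandas as pd

def read_txt_file(path):
    n = 0  # lines consumed so far, up to and including '[Level2]'
    # lazy (.+?) keeps the trailing space out of the key
    pattern = re.compile(r'(.+?)\s*=\s*"(.+)"')
    level1 = {}
    with open(path) as fp:
        for line in fp:
            line = line.strip()
            n += 1
            if line == '[Level2]':
                break
            m = pattern.match(line)
            if m is not None:
                key = m.group(1)
                value = m.group(2)
                level1[key] = value
    level1 = pd.DataFrame(level1, index=[0])
    # skip everything up to '[Level2]' and read the rest as whitespace-separated columns
    level2 = pd.read_csv(path, sep=r'\s+', skiprows=n, header=None, names=['Speed', 'EnergyLevel'])
    return level1, level2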
Usage:
level1, level2 = read_txt_file('data.txt')
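Since the question involves many files that will eventually go to a database, the per-file parsing can be wrapped in a loop over the folder. The following is only a sketch under assumptions: `parse_energy_file` and `load_folder` are hypothetical names, `parse_energy_file` is a stand-in for the function above (re-implemented so the snippet runs on its own), and the extra `source` column is an assumption about how the two tables might be linked later.

```python
import re
from pathlib import Path

import pandas as pd

# Matches lines such as: Location = "London"  (hypothetical module-level name)
KEY_VALUE = re.compile(r'(.+?)\s*=\s*"(.*)"')


def parse_energy_file(path):
    """Return (level1, level2) DataFrames for a single file."""
    level1, rows, in_level2 = {}, [], False
    with open(path) as fp:
        for line in fp:
            line = line.strip()
            if line == '[Level2]':
                in_level2 = True
            elif in_level2:
                if line:  # ignore blank lines in the numeric section
                    speed, energy = line.split()
                    rows.append((float(speed), float(energy)))
            else:
                m = KEY_VALUE.match(line)
                if m:
                    level1[m.group(1)] = m.group(2)
    return (pd.DataFrame([level1]),
            pd.DataFrame(rows, columns=['Speed', 'EnergyLevel']))


def load_folder(root):
    """Parse every .txt file under root and stack the results.

    Assumes at least one .txt file exists; pd.concat raises on an empty list.
    """
    level1_frames, level2_frames = [], []
    for path in sorted(Path(root).rglob('*.txt')):
        level1, level2 = parse_energy_file(path)
        # Tag each row with its file name (an assumed linking key).
        level1_frames.append(level1.assign(source=path.name))
        level2_frames.append(level2.assign(source=path.name))
    return (pd.concat(level1_frames, ignore_index=True),
            pd.concat(level2_frames, ignore_index=True))
```

Calling `load_folder(energy_root)` would then give one combined Level1 table and one combined Level2 table, which could be written out with `DataFrame.to_sql` once a database connection is available.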
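Reply from a uj5u.com user:
You can use pd.read_csv with the correct separator, but you have to do two things:
- Before: split the file text into its Level1 and Level2 parts
- After: transpose Level1 and set its columns
Here is the code, placed directly inside your with open [...] block: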
from io import StringIO  # lets read_csv read a string as if it were a file

reader = output.read()  # simply the entire file text, not split into lines
parts = reader.split('[Level2]\n')
lvl1_lines = parts[0].split('[Level1]\n')[1].replace('"', '')
lvl2_lines = "Speed Energylevel\n" + parts[1]
# header=None keeps the first key/value line as data; the regex separator drops spaces around '='
df_level1 = pd.read_csv(StringIO(lvl1_lines), sep=r'\s*=\s*', header=None, engine='python').transpose()
df_level1.columns = df_level1.iloc[0]  # set the correct column names
df_level1 = df_level1[1:]  # remove the row that held the column names
df_level2 = pd.read_csv(StringIO(lvl2_lines), sep=r'\s+')
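For anyone who wants to try the split-then-read_csv idea without a file on disk, here is a minimal self-contained sketch. It assumes the file uses `\n` line endings and contains exactly one `[Level1]` and one `[Level2]` marker; `split_levels` and `SAMPLE` are hypothetical names, and the sample text is copied from the question.

```python
from io import StringIO

import pandas as pd

# Sample text from the question; in practice this would come from output.read().
SAMPLE = '''[Level1]
Location = "London"
Type= "GTHY66"
Date = "16-11-2021"
Energy level = "Critical zero"
[Level2]
0.000 26.788
0.027 26.807
0.053 26.860
'''


def split_levels(text):
    """Build both DataFrames from the full text of one file."""
    head, body = text.split('[Level2]\n')
    lvl1_text = head.split('[Level1]\n')[1].replace('"', '')
    # Read the key = value lines as a two-column table (no header row in the data).
    pairs = pd.read_csv(StringIO(lvl1_text), sep=r'\s*=\s*', header=None,
                        engine='python', names=['key', 'value'])
    # Transpose so that each key becomes a column and the values form one row.
    df_level1 = pairs.set_index('key').T.reset_index(drop=True)
    df_level1.columns.name = None
    df_level2 = pd.read_csv(StringIO(body), sep=r'\s+', header=None,
                            names=['Speed', 'Energylevel'])
    return df_level1, df_level2


df_level1, df_level2 = split_levels(SAMPLE)
print(df_level1)
print(df_level2)
```

Using `set_index('key').T` instead of assigning `iloc[0]` to the columns is just a design choice here: it avoids the extra step of dropping the row that held the column names.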
