I have a book as a txt file, and every chapter ends with the marker "-THE END-". I want to split each chapter into its own file, for example:
Chapter001.txt
Chapter002.txt
Chapter003.txt
.
.
.
ChapterNNN.txt
I wrote the following code in Python 3.
groups = open('input.txt').read()
groups_divided = groups.split('-THE END-\n')
temp = groups.split('\n')  # was 'group', which is undefined; 'groups' is presumably what was meant
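Now I want it to write the pieces to separate files named "Chapter" followed by a number. I also don't know how to do the splitting and file creation so that every chapter is covered.
Also, please tell me if there is a simpler way to do this with existing software.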
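Answer from a helpful uj5u.com user:
You can do this by processing the input text file one line at a time instead of first reading the whole thing into memory: iterate over the lines and accumulate them until you reach an "end of chapter" line: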
with open('input.txt') as inp:
    n = 1
    chapter = []  # lines of the chapter currently being read
    for line in inp:
        if line != '-THE END-\n':
            chapter.append(line)
        else:
            # End of chapter: write the accumulated lines, then reset.
            filename = f'Chapter{n:03d}.txt'
            with open(filename, 'w') as outp:
                outp.writelines(chapter)
            n += 1
            chapter = []
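The loop above writes a chapter only when it reaches a "-THE END-" line, so any text after the last marker (or a last marker with no trailing newline) would be skipped. Below is a minimal sketch of one way to cover those cases, assuming the marker sits alone on its line. The names `split_chapters`, `MARKER`, and `out_dir` are hypothetical and introduced only for illustration; they are not from the original answer.

```python
from pathlib import Path

MARKER = "-THE END-"  # assumed to appear alone on its own line


def split_chapters(input_path, out_dir="."):
    """Stream input_path and write ChapterNNN.txt files into out_dir.

    Returns the list of paths written. Hypothetical helper for illustration.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    chapter = []

    def flush():
        # Write the accumulated lines as the next chapter and reset the buffer.
        path = out_dir / f"Chapter{len(written) + 1:03d}.txt"
        path.write_text("".join(chapter), encoding="utf-8")
        written.append(path)
        chapter.clear()

    with open(input_path, encoding="utf-8") as inp:
        for line in inp:
            # strip() also matches a marker that lacks a trailing newline.
            if line.strip() == MARKER:
                flush()
            else:
                chapter.append(line)

    # Keep trailing text that has no closing marker, unless it is only whitespace.
    if "".join(chapter).strip():
        flush()
    return written
```

Calling `split_chapters('input.txt')` would then write the files next to the script, numbered from `Chapter001.txt`.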
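Answer from a helpful uj5u.com user:
Another approach is to find the index of every line containing "-THE END-" and then use those indices to build the chapters. See the following code: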
with open("my_poem.txt") as f:
    lines = f.readlines()

# Start at -1 so that the first chapter begins at line 0.
indices = [-1]
for idx, line in enumerate(lines):
    if "-THE END-" in line:
        indices.append(idx)  # index of each line where -THE END- occurs

chapter_counter = 2
while chapter_counter <= len(indices):
    # "w" rather than "a", so re-running the script does not append duplicate text
    with open(f"Chapter_{chapter_counter-1}.txt", "w") as w:
        # From the line after the previous -THE END- through this -THE END- (inclusive)
        lines_chapter = lines[indices[chapter_counter-2] + 1:indices[chapter_counter-1] + 1]
        for line_chapter in lines_chapter:
            w.write(f"{line_chapter}")
    chapter_counter += 1
Answer from a helpful uj5u.com user:
# groups_divided is the list produced by the split() call in the question
n = 0
for chapter in groups_divided:
    with open(f'chapter{n}.txt', 'w') as file:
        file.write(chapter)
    n += 1
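Building on the same `split` idea, the pieces can be given the zero-padded names from the question, and the empty piece left after the final marker can be skipped. This is a rough sketch, assuming the whole book fits comfortably in memory. `chapters_from_text` and `write_chapters` are hypothetical names used only for illustration.

```python
def chapters_from_text(text, marker="-THE END-"):
    """Return the chapters in text, split on marker, dropping blank pieces."""
    pieces = text.split(marker)
    # Strip leading newlines (including the one left after each marker) and
    # skip pieces that are only whitespace, such as the one after the last marker.
    return [p.lstrip("\n") for p in pieces if p.strip()]


def write_chapters(chapters, prefix="Chapter"):
    """Write each chapter to prefix + 3-digit number + '.txt', counting from 1."""
    names = []
    for n, chapter in enumerate(chapters, start=1):
        name = f"{prefix}{n:03d}.txt"
        with open(name, "w", encoding="utf-8") as out:
            out.write(chapter)
        names.append(name)
    return names
```

With the question's input this could be used as `write_chapters(chapters_from_text(open('input.txt').read()))`.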
