Python: for loop does not loop over all files

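I am trying to loop over some compressed files (extension ".gz"), but I have run into a problem. I want to perform a specific action when the first file ending in 'aa' is encountered; it can be any such file, not necessarily the first one in the list. Only then should Python check whether the folder contains other 'aa' files and, if so, apply a second rule to them. (There may be one or many 'aa' files.) Finally, a third rule must be applied to all other files that do not end in 'aa'.

However, when I run the code below, not all of the files are processed.

What am I doing wrong?

Thanks!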

import os
import pandas as pd

inputPath = "write your path"
fileExt = r".gz"
flag = False  # becomes True once the first 'aa' file has been processed

for item in os.listdir(inputPath):  # loop through items in dir
    if item.endswith(fileExt):  # check for ".gz" extension
        full_path = os.path.join(inputPath, item)  # get full path of files

        if item.endswith('aa' + fileExt) and flag == False:
            df = pd.read_csv(full_path, compression='gzip', header=0, sep='|', encoding="ISO-8859-1")  # from gzip to pandas df
            # do something
            flag = True
            print('1 rule:', "The item processed is ", item)

        elif item.endswith('aa' + fileExt) and flag == True:
            df = pd.read_csv(full_path, compression='gzip', header=0, sep='|', encoding="ISO-8859-1")  # from gzip to pandas df
            # do something else
            print('2 rule:', "The item processed is ", item)

        elif not (item.endswith('aa' + fileExt)) and flag == True:
            df = pd.read_csv(full_path, compression='gzip', header=0, sep='|', encoding="ISO-8859-1")  # from gzip to pandas df
            # do something else
            print('3 rule:', "The item processed is ", item)

I believe this happens because Python iterates over the file list in alphabetical order and then ignores the other files. How can I fix this?

LIST OF FILES:

File_202112311aa.gz
File_20211231ab.gz
File_20211231.gz
File_20211231aa.gz

OUTPUT
1 rule The item processed is  File_202112311aa.gz
3 rule The item processed is  File_20211231ab.gz
2 rule The item processed is  File_20211231aa.gz

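A uj5u.com user replied:

Largely untested, but the following should work.

This code first processes one file that ends with "aa.gz" (note: it does not process all files ending in "aa.gz" first, since the question did not ask for that), then processes the remaining files. The remaining files are handled in no particular order: the order depends on how Python was built on the system and on what the (file)system does by default, and it is not guaranteed at all.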

import glob
import pandas as pd

# Obtain an unordered list of compressed files
filenames = glob.glob("*.gz")

# Now find a filename ending with 'aa.gz'
for i, filename in enumerate(filenames):
    if filename.endswith('aa.gz'):
        firstfile = filenames.pop(i)
        # We immediately break out of the loop, 
        # so we're safe to have altered `filenames`
        break
else:  
    # the sometimes useful and sometimes confusing else part 
    # of a for-loop: what happens if `break` was not called:
    raise ValueError("no file ending in 'aa.gz' found!")

# Ignoring the `full_path` part
df = pd.read_csv(firstfile, compression='gzip', header=0, sep='|', encoding="ISO-8859-1")
# do something
print(f"1 rule: The file processed is {firstfile}")
          
# Process the remaining files
for filename in filenames:
    df = pd.read_csv(filename, compression='gzip', header=0, sep='|', encoding="ISO-8859-1")
    if filename.endswith('aa.gz'):
        # do something
        print(f"2 rule: The file processed is {filename}")
    else:
        # do something else
        print(f"3 rule: The file processed is {filename}")
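If a reproducible processing order matters, one option is to sort the names before looping. The sketch below is only an illustration: the helper name `sorted_gz_files` and the temporary folder with empty placeholder files are assumptions made for the demo, not part of the answer above.

```python
import glob
import os
import tempfile


def sorted_gz_files(folder):
    """Return the .gz paths in `folder`, sorted by file name (hypothetical helper)."""
    # glob.escape protects against special characters in the folder path itself
    pattern = os.path.join(glob.escape(folder), "*.gz")
    return sorted(glob.glob(pattern), key=os.path.basename)


# Demo on a throwaway folder containing empty placeholder files
with tempfile.TemporaryDirectory() as folder:
    for name in ["File_20211231ab.gz", "File_20211231.gz",
                 "File_20211231aa.gz", "notes.txt"]:
        open(os.path.join(folder, name), "w").close()
    ordered = [os.path.basename(path) for path in sorted_gz_files(folder)]

print(ordered)
```

Sorting only makes the order predictable. It does not by itself put an 'aa.gz' file first, so the search for the first 'aa.gz' file is still needed.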

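A uj5u.com user replied:

Others here have given you more optimal solutions, but this answers your original question of why not all files are processed.

In your code, you have three conditions under which a file is processed:

It is an *aa.gz file, and it is the first one found.
It is an *aa.gz file, and it is the second or later *aa.gz file found.
It is not an *aa.gz file, and an earlier *aa.gz file has already been found.

No branch handles a non-*aa.gz file while flag is still False, so the loop skips every non-*aa.gz file until it has encountered the first *aa.gz file.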
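The three conditions can be checked without reading any data. The following is a minimal sketch, assuming a hypothetical helper `assign_rules` that works on plain file names: it picks the first 'aa.gz' name up front, so a file listed before it is no longer skipped. The order of `names` is an assumed listing order chosen to trigger the problem.

```python
def assign_rules(filenames, suffix="aa.gz"):
    """Return (rule, filename) pairs so that every name gets exactly one rule."""
    # Rule 1: the first name ending in the suffix, wherever it sits in the list
    first_aa = next((name for name in filenames if name.endswith(suffix)), None)
    if first_aa is None:
        raise ValueError(f"no file ending in {suffix!r} found")

    assigned = [(1, first_aa)]
    for name in filenames:
        if name == first_aa:
            continue  # already handled by rule 1
        # Rule 2: any further 'aa' file; rule 3: everything else
        rule = 2 if name.endswith(suffix) else 3
        assigned.append((rule, name))
    return assigned


# File names from the question; File_20211231.gz comes before any 'aa' file here
names = ["File_20211231.gz", "File_202112311aa.gz",
         "File_20211231ab.gz", "File_20211231aa.gz"]

for rule, name in assign_rules(names):
    print(f"{rule} rule: The item processed is {name}")
```

With this listing order the original loop would never print File_20211231.gz, whereas here it is assigned rule 3. In real use, the `pd.read_csv` call and the per-rule processing would go where the `print` is.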
