Extract variable data from all Outlook emails with a specific subject line, then get the date from the body

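Every day I receive an email containing the quantities of fruit sold that day. I have now written some code that logs the relevant data going forward, but I have not been able to do the same for past emails.

The data is stored in the body of the email, like this: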

Date of report:,01-Jan-2020
Apples,8
Pears,5
Lemons,7
Oranges,9
Tomatoes,6
Melons,3
Bananas,0
Grapes,4
Grapefruit,8
Cucumber,2
Satsuma,1

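What I would like the code to do is first search my emails and find those matching a specific subject, iterate over each one line by line to find the variables I am searching for, and then record them in a DataFrame, with the "Date of report" value recorded in a date column and converted to the format "%m-%d-%Y".

I think I can achieve this by making some modifications to the code I wrote to handle tracking it going forward: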
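For the date conversion step on its own, here is a minimal standard-library sketch. It assumes the report date always arrives as day, abbreviated English month name, and four-digit year, as in the sample above. `reformat_report_date` is a hypothetical helper name, not something from the original code.

```python
from datetime import datetime


def reformat_report_date(raw: str) -> str:
    """Convert a date such as '01-Jan-2020' into '%m-%d-%Y' form."""
    # %b matches an abbreviated month name; matching depends on the active locale
    parsed = datetime.strptime(raw.strip(), "%d-%b-%Y")
    return parsed.strftime("%m-%d-%Y")


print(reformat_report_date("01-Jan-2020"))
```

With the default locale this should print `01-01-2020`. Dates in any other layout would raise a `ValueError`, which may be preferable to silently storing a wrong date.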

import pandas as pd
import win32com.client

# change for the fruit you're looking for
Fruit_1 = "Apples"
Fruit_2 = "Pears"

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6) 
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)

# find data email
for message in messages:
    if message.subject == 'FRUIT QUANTITIES':
        if Fruit_1 and Fruit_2 in message.body: 
            data = str(message.body)
            break
        else:
            print('No data for', Fruit_1, 'or', Fruit_2, 'was found')
            break

fruitd = open("fruitd.txt", "w") # copy the contents of the latest email into a .txt file
fruitd.write(data)
fruitd.close()

def get_vals(filename: str, searches: list) -> dict:
    #Searches file for search terms and returns the values
    dct = {}
    with open(filename) as file:
        for line in file:
            term, *value = line.strip().split(',')
            if term in searches:
                dct[term] = float(value[0]) # Unpack value 
    # if terms are not found update the dictionary w remaining and set value to None
    if len(dct.keys()) != len(searches):
        dct.update({x: None for x in search_terms if x not in dct})
    return dct


searchf = [
    Fruit_1, 
    Fruit_2
] # the list of search terms the function searches for

result = get_vals("fruitd.txt", searchf) # search for terms 
print(result)

# create new dataframe with the values from the dictionary
d = {**{'date':today}, **result}
fruit_vals = pd.DataFrame([d]).rename(columns=lambda z: z.upper())
fruit_vals['DATE'] = pd.to_datetime(fruit_vals['DATE'], format='%d-%m-%Y')
print(fruit_vals)

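I create a .txt file called "fruitd" because I was not sure how else to iterate over the body of an email. Unfortunately, I do not think creating a .txt file for every past email is really feasible, and I was wondering whether there is a better way to do this?

Any suggestions or pointers would be most welcome.

**EDIT: Ideally I would like to get all the variables in the search list, so that Fruit_1 and Fruit_2 leave room to extend it to Fruit_3, Fruit_4 (etc.) when necessary.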
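One possible way to avoid the temporary file is to split the body string in memory. This is only a sketch, assuming every line of the body follows the `name,value` layout shown in the sample. `parse_report_body` is a hypothetical name, and `sample_body` stands in for `message.body`.

```python
def parse_report_body(body: str, searches: list) -> dict:
    """Return {term: value} for each search term; None when a term is missing."""
    found = {term: None for term in searches}
    # splitlines() copes with both '\n' and '\r\n' line endings
    for line in body.splitlines():
        term, _, value = line.strip().partition(",")
        if term in found and value:
            found[term] = float(value)
    return found


sample_body = "Date of report:,01-Jan-2020\r\nApples,8\r\nPears,5\r\nLemons,7\r\n"
print(parse_report_body(sample_body, ["Apples", "Pears", "Kiwis"]))
```

Because the function takes a plain string, it can be called once per email inside the loop without writing anything to disk.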
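A search list of any length can drive the same logic, so adding a fruit only means adding a name to the list. The sketch below builds one row per email, with the report date kept as the raw string. `report_row` and the two sample bodies are made up for illustration, and the lines are assumed to follow the `name,value` layout from the sample.

```python
SEARCHF = ["Apples", "Pears", "Grapes"]  # extend with as many names as needed


def report_row(body: str, searches: list) -> dict:
    """Build one row: the raw report date plus a value (or None) per search term."""
    row = {"date": None, **{term: None for term in searches}}
    for line in body.splitlines():
        term, _, value = line.strip().partition(",")
        if term.casefold().startswith("date") and value:
            # Assumes only the 'Date of report:' line starts with 'date'
            row["date"] = value
        elif term in searches and value:
            row[term] = float(value)
    return row


# Two made-up bodies standing in for message.body from two emails
bodies = [
    "Date of report:,01-Jan-2020\nApples,8\nPears,5\nGrapes,4",
    "Date of report:,02-Jan-2020\nApples,3\nGrapes,6",
]
rows = [report_row(body, SEARCHF) for body in bodies]
for row in rows:
    print(row)
```

A list of dicts like `rows` can be passed straight to `pd.DataFrame`, where the date column can then be converted with `pd.to_datetime`.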

Reply from a uj5u.com user:

import pandas as pd
import win32com.client

#PREP THE STUFF
Fruit_1 = "Apples"
Fruit_2 = "Pears"
SEARCHF = [
    Fruit_1, 
    Fruit_2
]

#DEF THE STUFF
# modified to take a list of list of strs as `report` arg
# apparently IDK how to type-hint; type-hinting removed
def get_report_vals(report, searches):
    dct = {}
    for line in report:
        term, *value = line
        # `str.casefold` is similar to `str.lower`, arguably better form
        # if there might ever be a possibility of dealing with non-Latin chars
        if term.casefold().startswith('date'):
            #FIXED (now takes `date` str out of list)
            dct['date'] = pd.to_datetime(value[0])
        elif term in searches:
            dct[term] = float(value[0])
    if len(dct.keys()) != len(searches):
        # corrected (?) `search_terms` to `searches`
        dct.update({x: None for x in searches if x not in dct})
    return dct


#DO THE STUFF
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6) 
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)

results = []

for message in messages:
    if message.subject == 'FRUIT QUANTITIES':
        # are you looking for:
        #  Fruit_1 /and/ Fruit_2
        # or:
        #  Fruit_1 /or/  Fruit_2
        if Fruit_1 in message.body and Fruit_2 in message.body:
            # FIXED
            data = [line.strip().split(",") for line in message.body.split('\n')]
            results.append(get_report_vals(data, SEARCHF))
        else:
            pass

fruit_vals = pd.DataFrame(results)
fruit_vals.columns = map(str.upper, fruit_vals.columns)
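If the inbox is large, it may be worth letting Outlook filter by subject instead of testing every message in Python. The sketch below relies on the `Restrict` method of the Outlook `Items` collection. `matching_bodies` is a hypothetical helper, and it assumes the subject contains no quote characters, so it does no escaping. It has not been run against a real mailbox here.

```python
def matching_bodies(items, subject: str):
    """Yield the body text of every item whose subject matches `subject`."""
    # Assumption: `subject` contains no quote characters, so nothing is escaped
    query = f"[Subject] = '{subject}'"
    for item in items.Restrict(query):
        yield item.Body


# Usage on a machine with Outlook and pywin32 installed (not run here):
#   import win32com.client
#   outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
#   messages = outlook.GetDefaultFolder(6).Items
#   for body in matching_bodies(messages, "FRUIT QUANTITIES"):
#       data = [line.strip().split(",") for line in body.splitlines()]
```

Each yielded body can be split and passed to `get_report_vals` in the same way as in the loop above.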

Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/349958.html

Tags: Python, pandas, numpy, outlook
