Correcting Python code to work with functions

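A big disclaimer first: I'm very new to Python and to programming, and this is my first time using functions. I'd be glad for any help, but the end goal here is not to have the prettiest or most efficient code. I just want to get it working somehow.

I have written the program below, but I can't get it to work properly.

Because this is a graded exercise for uni, there are certain restrictions I'm supposed to follow:

Work without any string slicing, and do not use the string methods split(), startswith(), endswith() or replace() for this task. Instead, use the Python re module (re.match, re.search, re.findall, re.sub) and do everything with regular expressions.

Here is my code: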
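To illustrate what that restriction amounts to in practice, here is a minimal sketch of regex-based stand-ins for split() and startswith(). The helper names (split_line, is_student) and the sample line are made up for illustration, and the tab between name and domain is an assumption about the input file.

```python
import re


def split_line(line):
    """Split 'name<TAB>domain' into its two parts without str.split()."""
    match = re.match(r'^(.*?)\t(\S+)\s*$', line)
    if match is None:
        return None  # the line does not have the assumed layout
    return match.group(1), match.group(2)


def is_student(domain):
    """Stand-in for domain.startswith('stu'), using re.match."""
    return re.match(r'stu', domain) is not None


sample = "Regula Aegerter\tstu\n"
name, domain = split_line(sample)

# re.findall with \S+ returns the whitespace-separated words of the name.
words = re.findall(r'\S+', name)

print(name, domain, words, is_student(domain))
```

re.match only matches at the start of the string, which is why it can take the place of startswith() here.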


import re
import sys
# get the filename
filename = sys.argv[1]

# open a file for reading
infile = open(filename, 'r')


########################################################################################
# Normalising function to normalize ü, ä, ö etc. and append lines to list
def normalize(infile):
    for line in infile:
        line = line.lower()
        line = re.sub('ä', 'ae', line)
        line = re.sub('ü', 'ue', line)
        line = re.sub('ö', 'oe', line)
        line = re.sub('é|ë', 'e', line)
        line_list = []
        line_list.append(line)
    
        return(line_list)


########################################################################################
# function to return just the consonants in a given name
def get_consonants(y):
    only_consonants_sep = re.findall('[^\W_aeiou]', y)
    only_consonants = ''.join(only_consonants_sep)
    return only_consonants


########################################################################################
# function to return first and last name according to all rules
def parse_name(input_y):
        
    # Parse the names to first (meaning first middle, if it exists) and last names
    for element in input_y:
    
        # define last name for later use
        last_namegroup = re.search('(\w*)\t', element)
        last_name = last_namegroup.group(1)
        
        # define what the middle name is for later use
        middle_namegroup = re.search('\s(\w)', element)
        middle_name = middle_namegroup.group(1)
        
        # define what first name is for later use
        first_namegroup = re.search('^(\S*)', element)
        first_name1 = first_namegroup.group(1) 
            
        if len(re.findall(' ', element)) > 1:
            
            # remove spaces between first and middle name 
            # put period between first middle  and last name
            first_middle_more = re.sub(" ", "", element, count = 1)
            
            #isolate first middle name to determine length afterwards
            first_middlegroup = re.search('^(\S*)', first_middle_more)
            first_name = first_middlegroup.group(1) 

            # if the length of first middle is larger than 8 
            # give first_middle to get_consonant and store output in consonant_letters
            if len(first_name) > 8:
                consonant_first_middle = get_consonants(first_name)
                
                # if length of consonant_first_middle is still larger than 8
                # take the consonants of only the first name
                # and add the first letter of the middle name defined above 
                if len(consonant_first_middle) > 8:
                    consonant_first_name = get_consonants(first_name1)
                    first_name = (f'{consonant_first_name}{middle_name}')
        
        # if there is no middle name, i.e. only ones white space in the line
        # then just take the word until the space and store it as first_name
        else:
            first_namegroup = re.search('^(\S*)', element)
            first_name = first_namegroup.group(1)   
        
                    
    return(first_name, last_name)
    

########################################################################################
# creating the email addresses
def create_email_address(first_name, last_name):
    
    for lines in infile:
        if re.search('\tstu', lines):
            domain = (f'{first_name}.{last_name}@uzh.ch')
        else:
            domain_part = re.search('\t(.*)$',lines)
            domain = (f'{first_name}.{last_name}@{domain_part.group(1)}.uzh.ch')
    return domain

########################################################################################
# function as the "top level" of the program that calls all other functions as needed.

def main():
  
    ####### Receive a file name from the command line
    # get the filename
    filename = sys.argv[1]
    # open a file for reading
    infile = open(filename, 'r')
    
    ####### Normalise the input
    input_normalized = normalize(infile)
    
    ####### Parse the names as needed
    input_first, input_last = parse_name(input_normalized)
    
    ####### Create email addresses
    email_output = create_email_address(input_first, input_last)
    
    ####### Print the result
    for line in infile:
        print(f'{line} --> {email_output}')

main_function = main()
print(main_function)

This is the input file I have to use:

Raphael Fritz Bernasconi    stu
Frédéric Piero  cl
Sören Thaddäus Favre    stu
Regula Aegerter stu
Noël ?biger cl
Inés Desirée Muff   rom
Sébastien Merian    stu
Liam Cereghetti stu
Björn Michael Crivelli  ds
Joëlle Fürrer   stu

This is the output I get:

Frédéric Piero  cl
 --> [email protected]
Sören Thaddäus Favre    stu
 --> [email protected]
Regula Aegerter stu
 --> [email protected]
Noël ?biger cl
 --> [email protected]
Inés Desirée Muff   rom
 --> [email protected]
Sébastien Merian    stu
 --> [email protected]
Liam Cereghetti stu
 --> [email protected]
Björn Michael Crivelli  ds
 --> [email protected]
Joëlle Fürrer   stu --> [email protected]
None

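As you can see, I think I'm doing something wrong in how my code iterates over the file, but I just can't figure out what.

I'd appreciate anything you can point out to me!

A uj5u.com user replied:

In my first answer I tried not to change too much of the code, but I think it is easier to rethink the logic and rewrite it. Here is what I came up with. I think this does what you need?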

import sys, re


def normalize(name):
    name = name.lower()
    name = re.sub('ä', 'ae', name)
    name = re.sub('ü', 'ue', name)
    name = re.sub('ö', 'oe', name)
    name = re.sub('é|ë', 'e', name)
    return name


def remove_vowels(name):
    return re.sub('a|e|i|o|u', '', name)
    

def main(filename):
    # a list to store our output
    people = []

    # let's loop through each line, looking at
    # just one person at a time
    for line in open(filename, "r"):
        # remove trailing \n from each line
        line = re.sub("\n", "", line)

        # separate the name and domain
        name, domain = re.split("\t", line)

        # remove the accents and capitals
        normalized_name = normalize(name)

        # separate the name
        # we use a "*" so that the first and last names
        # are selected correctly, and then any remaining
        # characters in between are put into the middle name.
        first_name, *middle_name, last_name  = re.split(" ", normalized_name)
        middle_name = "".join(middle_name)

        full_first_name = first_name + middle_name

        # this is where we check for how
        # long the first and middle names are
        if len(full_first_name) > 8:
            full_first_name = remove_vowels(full_first_name)

            # now we check if this is still too long
            if len(full_first_name) > 8:
                full_first_name = remove_vowels(first_name)

                # check if they actually have a middle name
                if middle_name:
                    full_first_name += middle_name[0]
        
        # now let's grab the domain
        subdomain = ""
        if domain != "stu":
            subdomain = domain + "."
        
        host = subdomain + "uzh.ch"

        
        # now let's format this into an address
        address = f"{full_first_name}.{last_name}@{host}"

        people.append((name, address))

    return people



filename = sys.argv[1]
people = main(filename)

for name, address in people:
    print(name, "-->", address)

Output:

Raphael Fritz Bernasconi --> [email protected]
Frédéric Piero --> [email protected]
Sören Thaddäus Favre --> [email protected]
Regula Aegerter --> [email protected]
Noël ?biger --> [email protected]
Inés Desirée Muff --> [email protected]
Sébastien Merian --> [email protected]
Liam Cereghetti --> [email protected]
Björn Michael Crivelli --> [email protected]
Joëlle Fürrer --> [email protected]
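To make the shortening rule easier to check on its own, here is a rough sketch of just that step, pulled out into a separate function. The name shorten_first_name is hypothetical, the input is assumed to be already normalized, and 'christoph friedrich' is an invented name chosen only because it reaches the last branch. It uses re.match to pick the middle initial, so it avoids indexing into the string.

```python
import re


def shorten_first_name(first_name, middle_names):
    """Apply the 8-character rule to an already normalized first name.

    middle_names is a list holding zero or more middle names.
    """
    middle = ''.join(middle_names)
    combined = first_name + middle
    if len(combined) <= 8:
        return combined

    # Step 1: drop the vowels from first + middle name.
    no_vowels = re.sub(r'[aeiou]', '', combined)
    if len(no_vowels) <= 8:
        return no_vowels

    # Step 2: still too long, so keep the consonants of the first name
    # and add only the initial of the middle name, if there is one.
    shortened = re.sub(r'[aeiou]', '', first_name)
    initial = re.match(r'\w', middle)
    if initial:
        shortened += initial.group(0)
    return shortened


print(shorten_first_name('regula', []))              # regula
print(shorten_first_name('raphael', ['fritz']))      # rphlfrtz
print(shorten_first_name('christoph', ['friedrich']))  # chrstphf
```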

Hopefully the comments in the code explain what is going on well enough.
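As for why the original program skipped the first name and printed one address for everybody, as far as I can tell two things play together. normalize() returns from inside its for loop, so only the first line is ever processed. And a file object is an iterator, so the line consumed by that loop is not seen again by a later loop over the same object. The sketch below reproduces both effects, with io.StringIO standing in for the file. The two sample lines and the function names normalize_first_only and normalize_all are invented for the demonstration.

```python
import io
import re


def normalize_first_only(infile):
    # Same shape as the original: the return sits inside the loop,
    # so the function exits after the first line.
    for line in infile:
        line_list = []
        line_list.append(line.lower())
        return line_list


def normalize_all(infile):
    # Build the list once, before the loop, and return after the loop.
    line_list = []
    for line in infile:
        line_list.append(re.sub(r'\n', '', line.lower()))
    return line_list


data = "Regula Aegerter\tstu\nLiam Cereghetti\tstu\n"

infile = io.StringIO(data)
first_only = normalize_first_only(infile)
remaining = [line for line in infile]  # continues where the loop stopped

print(first_only)   # ['regula aegerter\tstu\n']
print(remaining)    # ['Liam Cereghetti\tstu\n']
print(normalize_all(io.StringIO(data)))
```

In the original, parse_name() and create_email_address() likewise each return a single value computed once, so main() ends up printing that one shared address next to every remaining line.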
