How do I run a (nested) loop over multiple elements of an array at the same time?

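I am a Python beginner and am currently running this nested for-loop web-scraping program, which downloads several Excel files for each of the thousands of observations in my dataset. However, my code runs so slowly that I need to speed it up so that I can process 5-20 observations at a time. People have suggested threading or asyncio, but I don't know how to use them or what code to write, because the online documentation is very dense and doesn't really explain what Python 3.9 (Spyder) is doing during my trial and error.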

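My code is long, but the gist is that I need to iterate over multiple elements i at once (in the first line of code), and I don't know how to do that. I'm looking for a simple fix. I realize this code is very clunky, but processing power/speed isn't an issue. Please just help me with the concurrency problem!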

Here is my code. The first line is the one where I need to iterate over multiple (10-20) elements of the array at once.

for i in range(0,33000):
        
        #Say what iteration this is
        print('Beginning iteration')
        print(i)
    
    
        #Calling to use Chrome to webscrape
        driver = webdriver.Chrome(ChromeDriverManager().install())
        
        
        #Create WebDriverWait times of 5, 10, 15 and 30 seconds
        wait5 = WebDriverWait(driver, 5)
        wait10 = WebDriverWait(driver, 10)
        wait15 = WebDriverWait(driver, 15)
        wait30 = WebDriverWait(driver, 30)
        
        #Open FEC webpage
        driver.get("https://www.fec.gov/")
        
        
    
        #Find the searchbar and search the PCC ID
        searchbar = driver.find_element_by_xpath('/html/body/header[2]/div/ul/li[3]/form/div/span/input')
        searchbar.send_keys(commid[i])
        #searchbar.send_keys(comm5)
        searchbar.send_keys(Keys.RETURN)
            
        
        #Click on PCC Homepage
        pcc = wait5.until(
          EC.element_to_be_clickable((By.XPATH, '/html/body/main/main/div[2]/div[2]/section/ul/li/h3/a'))
          )
        pcc.click()
            
        
        try:
            
            #Get Two-Year election cycle Period drop down menu in PCC Homepage
            select = driver.find_element_by_xpath( "//select[@id='summary-cycle']")  #get the select element            
            options = select.find_elements_by_tag_name("option") #get all the options into a list
            
        except:
            pass
        
        else:
            
            #Create array that will hold all election cycle options for PCC 'i'
            optionsList = []
            
            for option in options: #iterate over the options, place attribute value in the options array
                optionsList.append(option.get_attribute("value"))
            
            #Now, for each PCC in the dataset, loop over all available election cycles each PCC was registered for
            for oppy in optionsList:
                
              
                #Select the election cycle of interest
                dropdown = Select(driver.find_element_by_id('summary-cycle'))
                dropdown.select_by_value(oppy)
                sleep(randint(5,7))
                
        
            
                try: 
                    
                #Clicks on "Browse receipts" button on PCC i's homepage
                    receipts = wait10.until(EC.presence_of_element_located((By.XPATH, '//*[@id="total-raised"]/div[1]/a')))
                    driver.execute_script("arguments[0].click();",receipts)
                       
                    sleep(randint(10,15))
                
                    
                except:
                    
                    if NoSuchElementException:
                       
                        try:
                            driver.find_element(By.XPATH, '/html/body/main/div[2]/header/div/span[3]')
                            print('For PCC ID {},'.format(''.join(commid[i])))
                            #print('For PCC ID {},'.format(''.join(comm5[i])))
                            print('Receipts do not exist for election year {}.'.format(''.join(oppy)))    
                            pass
                        
                        except:
                            print('For PCC ID {},'.format(''.join(commid[i])))
                            print('Webpage does not exist for election year {}.'.format(''.join(oppy)))
                            driver.back()
              
                        
                else:
                    
                    try:
                        #Clicks on "Export" button for receipts from succeeding webpage of receipt data
                        receiptsexport = wait15.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="main"]/section/div[2]/div[1]/div[1]/div/div[2]/button')))
                        receiptsexport.click()
                        sleep(randint(5,7))
                    
                    except:
                        print('For PCC ID {},'.format(''.join(commid[i])))
                        #print('For PCC ID {},'.format(''.join(comm5[i])))
                        print('There is no Receipt Data to export for election year {}.'.format(''.join(oppy)))    
                        sleep(randint(5,7)) 
                        pass
                         
                    else: 
                        
                        try:
                            
                            #Clicks on "Download" button under "Your downloads" to download receipts as .csv file
                            receiptsdownload = wait10.until(EC.element_to_be_clickable((By.XPATH, '/html/body/div[4]/div/ul/li/div/a')))
                            sleep(randint(5,7))
                            receiptsdownload.click()
                            sleep(randint(5,7))
                            driver.back()
                            sleep(randint(5,7)) 
                        
                        except:
                            print('For PCC ID {},'.format(''.join(commid[i])))
                            print('I cannot download Receipt Data, since there is none to export for election year {}.'.format(''.join(oppy)))
                            driver.back() #Go back to PCC homepage
                            sleep(randint(5,7))
                            pass
                        
                
                try:
                  
                    #Search for "Browse Disbursements" on PCC i's homepage and click link
                    disburse = wait10.until(EC.presence_of_element_located((By.LINK_TEXT, "Browse disbursements")))
                    driver.execute_script("arguments[0].click();",disburse)
                    
                    sleep(randint(10,15))
                
                except:
                    print('For PCC ID {},'.format(''.join(commid[i])))
                    #print('For PCC ID {},'.format(''.join(comm5)))
                    print('Disbursements do not exist for election year {}.'.format(''.join(oppy)))    
                    sleep(randint(5,7)) 
                    pass
                
                else:
                    
                    try:
                        
                        #Clicks on "Export" button for disbursements from succeeding webpage of disbursement data
                        disbursexport = wait15.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="main"]/section/div[2]/div[1]/div[1]/div/div[2]/button')))
                        disbursexport.click()
                        
                        sleep(randint(5,7))
                    
                    except:
                        print('For PCC ID {},'.format(''.join(commid[i])))
                        #print('For PCC ID {},'.format(''.join(comm5)))
                        print('There is no Disbursement Data to export for election year {}.'.format(''.join(oppy)))    
                        sleep(randint(5,7)) 
                        pass
                         
                    else: 
                        
                        try:
                            #Clicks on "Download" button under "Your downloads" to download disbursements as .csv file
                            disbursedownload = wait15.until(EC.element_to_be_clickable((By.XPATH, '/html/body/div[4]/div/ul/li/div/a')))
                            sleep(randint(5,7))
                            disbursedownload.click()
                            driver.back()
                            sleep(randint(5,7)) 
                            
                            
                        except:
                            print('For PCC ID {},'.format(''.join(commid[i])))
                            print('I cannot download Disbursement Data, since there is none to export for election year {}.'.format(''.join(oppy)))
                            driver.back() #Go back to PCC homepage
                            sleep(randint(5,7))
                            pass

Answer from a uj5u.com user:

You could try concurrent.futures. Define the code you want to run as a function that takes one argument, and pass it to the executor like this:

import concurrent.futures

def my_func(i):
    # Put the body of the original for loop here; i is one entry of my_list
    pass

my_list = list(range(0, 33000))

with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(my_func, my_list)

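Each entry of my_list is passed to my_func. If you want to pass more arguments to my_func(), see "How to use multiprocessing pool.map with multiple arguments", though it looks like you don't need to. You can set max_workers on ThreadPoolExecutor to control how many calls run at the same time.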
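To make the two points above concrete, here is a minimal sketch of capping the pool with max_workers and passing one extra argument with functools.partial. The names scrape_one, run_all, and the sample IDs are hypothetical and not from the question; scrape_one only sleeps to stand in for the Selenium work, on the assumption that the real version would create its own driver inside the function and close it with driver.quit() when done.

```python
import concurrent.futures
import functools
import time


def scrape_one(i, ids, delay=0.01):
    """Hypothetical stand-in for the body of the original for loop.

    Assumption: the real version would create its own webdriver here,
    do the scraping for ids[i], and call driver.quit() in a finally block.
    """
    time.sleep(delay)  # simulates waiting on the browser or network
    return i, ids[i]


def run_all(ids, max_workers=10):
    """Run scrape_one for every index, at most max_workers at a time."""
    # partial fixes the extra argument, so map() only has to supply i
    worker = functools.partial(scrape_one, ids=ids)
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in input order; iterating over it also
        # re-raises any exception that was raised inside a worker
        for i, value in executor.map(worker, range(len(ids))):
            results.append((i, value))
    return results


if __name__ == "__main__":
    sample_ids = ["C001", "C002", "C003", "C004", "C005"]  # made-up IDs
    print(run_all(sample_ids, max_workers=3))
```

Looping over the iterator returned by executor.map matters here: if the results are never consumed, an exception raised inside a worker can go unnoticed. Starting with a small max_workers (say 5) and raising it gradually is probably safer than jumping to 20, since every worker would be driving its own Chrome window.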

