Code problems when running a loop with Selenium in Python

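I need to scrape the website https://portalbnmp.cnj.jus.br/#/pesquisa-peca.

My goal is to:
Select "Rio de Janeiro" in the "Estado" field
Send the keys "" to the "Nome" field
Search
Click each row of the table that appears
On the next page, click "Emitir"
Go back to the previous page and repeat the procedure for the next row of the table, and so on.

When I run it line by line, the code below runs without errors, but inside the loop I get a variety of errors: stale, not clickable, not executable, and so on. Any ideas why this happens?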

for i in range(1, 11):
   
    element = driver.find_element_by_tag_name('p-dropdown')
    element.find_element_by_xpath("//*[contains(text(), 'Estado')]").click()
    element.find_element_by_xpath("//*[contains(text(), 'Rio de Janeiro')]").click()
        
    search = driver.find_element_by_name("nomePessoa")
    search.send_keys("")
    
    search.send_keys(Keys.RETURN)
         
    # row click 
    table = driver.find_element_by_xpath("//div[@class='ui-datatable-tablewrapper ng-star-inserted']/table/tbody")
    rows = table.find_element_by_tag_name('tr')
    
    rows.find_element_by_xpath("//tr[" + str(i) + "]/td[1]").click()
    
    # click 'Emitir'
    buttons = driver.find_element_by_tag_name("button")
    buttons.find_element_by_xpath("//*[contains(text(), 'Emitir')]").click()
    
    # return page
    driver.back()

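Reply from a uj5u.com user:

If you copy the cookie from your browser and paste it into the code below, you can avoid Selenium altogether and speed this process up considerably. The code searches for Rio de Janeiro (idEstado = 19) and returns 100 results (you can edit this), then loops through the results and saves the required PDF files.

Note that the site you are scraping is unstable and frequently returns HTTP 500 responses; I retry the request after waiting a few seconds: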

import requests
import json
import re
import time

#NB get cookie header from Developer Tools - Network - fetch/xhr - Request Headers once you've passed the captcha test
cookie_value = 'portalbnmp=eyJhbGciOiJIUzUxMiJ9.eyJzdWIiOiJndWVzdF9wb3J0YWxibm1wIiwiYXV0aCI6IlJPTEVfQU5PTllNT1VTIiwiZXhwIjoxNjQzMzY1MjgzfQ.niaw12WlnO3okuY33medP7d3u6j1Y-xGPJ6mShgClfZPrs8br7HQm8XZ5k2k5Wz8J59epbUyE5KAGtSFPpEmrA'

headers =   {
    'accept':'application/json, text/plain, */*',
    'accept-encoding':'gzip, deflate, br',
    'accept-language':'en-ZA,en;q=0.9',
    'origin':'https://portalbnmp.cnj.jus.br',
    'referer':'https://portalbnmp.cnj.jus.br/',
    'content-type':'application/json;charset=UTF-8',
    'cookie': cookie_value,
    'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36'
    }

results = 100
url = f'https://portalbnmp.cnj.jus.br/bnmpportal/api/pesquisa-pecas/filter?page=0&size={str(results)}&sort=' #edited to get 100 results, you can edit this size variable

payload = {"buscaOrgaoRecursivo":False,"orgaoExpeditor":{},"idEstado":19} #19 = Rio de Janeiro

retries = 1
success = False
while not success:
    try:
        resp = requests.post(url,headers=headers,data=json.dumps(payload))
        print(resp)
        resp.raise_for_status()  # raise on 4xx/5xx so the except branch waits and retries
        data = resp.json()
        success = True
    except Exception as e:
        print(url)
        wait = retries
        print(f'Error! Waiting {wait} secs and re-trying...')
        time.sleep(wait)
        retries += 1  # wait one second longer after each failed attempt

print(len(data['content']))

ids = {str(x['id']):x['nomeMae'] + '-' + x['nomeOrgao'] for x in data['content']} #get all filenames and IDs

for id_,name in ids.items():
    url = f'https://portalbnmp.cnj.jus.br/bnmpportal/api/certidaos/relatorio/{id_}/10'

    retries = 1
    success = False
    while not success:
        try:
            pdf_data = requests.post(url,headers=headers)
            pdf_data.raise_for_status()  # raise on 4xx/5xx so the except branch waits and retries
            success = True
        except Exception as e:
            wait = retries
            print(f'Error! Waiting {wait} secs and re-trying...')
            time.sleep(wait)
            retries += 1  # wait one second longer after each failed attempt

    filename = re.sub(r'[^\w\-_ ]', '_',name) + '.pdf' #remove bad characters for filename
    print(f'Saving {name}')
    with open(filename,'wb') as file:
        file.write(pdf_data.content)
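
The wait-and-retry pattern appears twice in the code above, so it could be factored into a small helper. The following is only a sketch under assumptions: the name call_with_retries and its parameters (max_attempts, base_delay, sleep) are hypothetical and not part of the original answer, and unlike the loops above it gives up after a fixed number of attempts instead of retrying forever.

```python
import time


def call_with_retries(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    The delay grows linearly (base_delay * attempt number), mirroring the
    'wait = retries' logic in the loops above. If the final attempt also
    fails, its exception is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            wait = base_delay * attempt
            print(f'Error ({exc})! Waiting {wait} secs and re-trying...')
            sleep(wait)
```

A request could then be wrapped in a function that calls requests.post, calls raise_for_status() on the response and returns it, and that function passed to call_with_retries. The sleep parameter exists only so the waiting can be replaced in tests.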

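Reply from a uj5u.com user:

When using Selenium, try adding checks to make sure the element you are interacting with has loaded. In some cases you can add an explicit wait. (Try not to use methods like sleep(), since the documentation strongly discourages them.)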

# import webdriver 
from selenium import webdriver 
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# get element  after explicitly waiting up to 10 seconds
element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "p-dropdown"))
    )  # I would consider looking up by ID or class
element.find_element(By.XPATH, "//*[contains(text(), 'Estado')]").click()
# ... etc

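This ensures you never click an element before it has loaded. Another thing to keep in mind with Selenium is that an element must be visible before you can interact with it. You can scroll to an element to make sure it is visible, as follows: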

# example that scrolls to bottom of page
driver.execute_script("window.scrollTo(0,document.body.scrollHeight);")
# example that scrolls to a specific element
from selenium.webdriver.common.action_chains import ActionChains
actions = ActionChains(driver)
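element = driver.find_element(By.TAG_NAME, 'p-dropdown')  # just an example; By is imported in the snippet above
actions.move_to_element(element).perform()  # perform() is required to actually execute the queued action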
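
To make the explicit wait from earlier in this answer less of a black box: conceptually it polls a condition until the condition returns something truthy or a timeout expires. The sketch below illustrates that idea in plain Python. It is not Selenium's implementation, and wait_until with its parameters is a hypothetical name chosen for this example. The real WebDriverWait also ignores certain exceptions, such as NoSuchElementException, while polling.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll condition() until it returns a truthy value, then return that value.

    Raises TimeoutError if no truthy value appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} seconds')
        time.sleep(poll_interval)
```

Unlike a fixed sleep(), this returns as soon as the condition holds and only waits the full timeout in the failure case, which is why explicit waits are usually preferred.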
