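I am using Python to scrape the descriptions of listed companies from a website.
I want this code to fetch the information for every company, but it works only once and then raises an AttributeError.
Here is my code: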
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from openpyxl import load_workbook
import time
from bs4 import BeautifulSoup
import requests

wb = load_workbook("listedcorp.xlsx")
ws = wb.active
col_B = ws["B"]
# print(col_B)
# for cell in col_B:
#     print(cell.value)

browser = webdriver.Chrome()
# browser.maximize_window()

for cell in col_B:
    url = "https://finance.naver.com/item/main.nhn?code={}".format(cell.value)
    browser.get(url)
    soup = BeautifulSoup(browser.page_source, "lxml")
    ov = soup.find("div", attrs={"class":"summary_info"}).get_text()
    print(str.strip(ov) + '\n\n')
    time.sleep(5)
This is the result (posted as a screenshot, not reproduced here).
Please let me know what is causing this problem.
Answer from a uj5u.com user:
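The page takes a moment to generate its content. If the element is not there yet, soup.find() returns None, and calling .get_text() on None is the most likely source of the AttributeError. Wait until your element is present before creating the BeautifulSoup object from driver.page_source: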
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#summary_info')))
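To make the idea behind such an explicit wait concrete, here is a simplified, browser-free sketch of polling until a condition holds. It is not Selenium's implementation; `wait_until` and `FakePage` are hypothetical names invented for this illustration, and the fake page simply "renders" its element on the third check.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose element appears late."""

    def __init__(self, ready_after):
        self.checks = 0
        self.ready_after = ready_after

    def find_summary(self):
        # Returns None until the page has been checked ready_after times.
        self.checks += 1
        return "summary text" if self.checks >= self.ready_after else None


page = FakePage(ready_after=3)
text = wait_until(page.find_summary, timeout=2.0, poll_interval=0.01)
print(text)
```

Reading the page once, as the question's loop does, corresponds to calling `find_summary()` a single time and getting None back; polling keeps checking until the element exists or the timeout expires.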
Example:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import time

codes = ['102280', '002900']
url = 'https://finance.naver.com/item/main.naver?code='

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
wait = WebDriverWait(driver, 5)

for code in codes:
    driver.get(url + code)
    # Block until the summary element is in the DOM (up to 5 seconds)
    wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#summary_info')))
    soup = BeautifulSoup(driver.page_source, "lxml")
    # find() returns None when nothing matches, so check before get_text()
    summary = soup.find("div", attrs={"id": "summary_info"})
    if summary:
        ov = summary.get_text()
    else:
        ov = 'no text found'
    print(ov.strip() + '\n\n')
    time.sleep(5)

driver.quit()
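
The example hard-codes two codes, while the question reads them from column B of listedcorp.xlsx. A spreadsheet column can also hold a header, empty cells, or codes stored as numbers that have lost their leading zeros, and any of these would produce a URL whose page has no summary block. Whether that applies to the asker's file is not known. The following is only a sketch, assuming the codes are six-digit numeric strings like the two in the example; `clean_codes` is a hypothetical helper name.

```python
def clean_codes(values, width=6):
    """Return zero-padded code strings, skipping cells that are not numeric codes."""
    codes = []
    for value in values:
        if value is None:
            continue  # empty cell
        text = str(value).strip()
        if not text.isdigit():
            continue  # header text or other non-code content
        # Restore leading zeros lost when a code was stored as a number
        codes.append(text.zfill(width))
    return codes


# Assumed sample of raw cell values: a header, an empty cell,
# a code stored as an integer, and two codes stored as text.
raw_values = ["code", None, 2900, "102280", " 005930 "]
print(clean_codes(raw_values))
```

With openpyxl, the input could be built as `[cell.value for cell in ws["B"]]` and the cleaned list passed to the loop in place of the hard-coded one.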
