您好,我正在嘗試學習如何進行網路抓取,所以我首先嘗試通過網路抓取我的學校選單。
我遇到了一個問題,我無法在跨度類下獲取選單項,而是在跨度類“show”的同一行中獲取單詞。

from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome(executable_path=chromedriver.exe')#changed this
driver.get('https://housing.ucdavis.edu/dining/menus/dining-commons/tercero/')
results = []
content = driver.page_source
soups = BeautifulSoup(content, 'html.parser')
element=soups.findAll('span',class_ = 'collapsible-heading-status')
for span in element:
print(span.text)
我試圖把它變成 span.span.text 但這不會給我任何回報,所以有人可以給我一些關于如何在 collapsible-heading-status 類下提取資訊的指示。
uj5u.com熱心網友回復:
美味的華夫餅 - 如前所述,它們已經消失了,但為了實作您的目標,一種方法是通過css selectors使用來選擇名稱adjacent sibling combinator:
for e in soup.select('.collapsible-heading-status span'):
print(e.text)
或與find_next_sibling():
for e in soup.find_all('span',class_ = 'collapsible-heading-status'):
print(e.find_next_sibling('span').text)
例子
要以結構化方式獲取每個資訊的全部資訊,您可以使用:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get("https://housing.ucdavis.edu/dining/menus/dining-commons/tercero/")
soup = BeautifulSoup(driver.page_source, 'html.parser')
data = []
for e in soup.select('.nutrition'):
d = {
'meal':e.find_previous('h4').text,
'title':e.find_previous('h5').text,
'name':e.find_previous('span').text,
'description': e.p.text
}
d.update({n.text:n.find_next().text.strip(': ') for n in e.select('h6')})
data.append(d)
data
輸出
[{'meal': 'Breakfast',
'title': 'Fresh Inspirations',
'name': 'Vanilla Chia Seed Pudding with Blueberrries',
'description': 'Vanilla chia seed pudding with blueberries, shredded coconut, and toasted almonds',
'Serving Size': '1 serving',
'Calories': '392.93',
'Fat (g)': '36.34',
'Carbohydrates (g)': '17.91',
'Protein (g)': '4.59',
'Allergens': 'Tree Nuts/Coconut',
'Ingredients': 'Coconut milk, chia seeds, beet sugar, imitation vanilla (water, vanillin, caramel color, propylene glycol, ethyl vanillin, potassium sorbate), blueberries, shredded sweetened coconut (desiccated coconut processed with sugar, water, propylene glycol, salt, sodium metabisulfite), blanched slivered almonds'},
{'meal': 'Breakfast',
'title': 'Fresh Inspirations',
'name': 'Housemade Granola',
'description': 'Crunchy and sweet granola made with mixed nuts and old fashioned rolled oats',
'Serving Size': '1/2 cup',
'Calories': '360.18',
'Fat (g)': '17.33',
'Carbohydrates (g)': '47.13',
'Protein (g)': '8.03',
'Allergens': 'Gluten/Wheat/Dairy/Peanuts/Tree Nuts',
'Ingredients': 'Old fashioned rolled oats (per manufacturer, may contain wheat/gluten), sunflower seeds, seedless raisins, unsalted butter, pure clover honey, peanut-free mixed nuts (cashews, almonds, sunflower oil and/or cottonseed oil, pecans, hazelnuts, dried Brazil nuts, salt), light brown beet sugar, molasses'},...]
轉載請註明出處,本文鏈接:https://www.uj5u.com/caozuo/483249.html
