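I want to extract the Specifications and "Complete the Look" sections from a Myntra product page. They are only visible after I click "See More". I wrote the code below for this: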
url = 'https://www.myntra.com/kurtas/jompers/jompers-men-yellow-printed-straight-kurta/11226756/buy'
df = pd.DataFrame(columns=['name', 'title', 'price', 'description', 'size & fit', 'Material & care', 'Complete the look'])
metadata = dict.fromkeys(['name', 'title', 'price', 'description', 'size & fit', 'Material & care', 'Complete the look'])
from selenium.common.exceptions import NoSuchElementException
driver = webdriver.Chrome('chromedriver')
specs = dict()
for i in range(1):  # len(links)
    driver.get(url)
    try:
        metadata['title'] = driver.find_element_by_class_name('pdp-title').get_attribute("innerHTML")
        metadata['name'] = driver.find_element_by_class_name('pdp-name').get_attribute("innerHTML")
        metadata['price'] = driver.find_element_by_class_name('pdp-price').find_element_by_xpath('./strong').get_attribute("innerHTML")
        metadata['description'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[8]/div/div[1]/p').text
        #metadata['Specifications'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[1]').text
        if driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[7]/div[4]/div[2]'):
            print('yes')
            element = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]')
            element.click()
        for i in range(1, 20):
            try:
                specs[driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[{}]/div[1]'.format(i)).text] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[{}]/div[2]'.format(i)).text
            except:
                break
        metadata['Complete the look'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[8]/div/div[4]/div/p').text
    except NoSuchElementException:
        pass
    df = df.append(metadata, ignore_index=True)
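I get "yes" in the output, which I assume means the "See More" option was clicked, but the "Complete the look" column of the DataFrame contains None. How can I get the details hidden behind "See More"? The section has the following markup: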
<div class="index-sizeFitDesc">
<h4 class="index-sizeFitDescTitle index-product-description-title" style="padding-bottom: 12px;">Specifications</h4>
<div class="index-tableContainer">
<div class="index-row">
<div class="index-rowKey">Sleeve Length</div>
<div class="index-rowValue">Long Sleeves</div>
</div><div class="index-row">
<div class="index-rowKey">Shape</div>
<div class="index-rowValue">Straight</div>
</div><div class="index-row">
<div class="index-rowKey">Neck</div>
<div class="index-rowValue">Mandarin Collar</div>
</div><div class="index-row">
<div class="index-rowKey">Print or Pattern Type</div>
<div class="index-rowValue">Geometric</div>
</div><div class="index-row">
<div class="index-rowKey">Design Styling</div>
<div class="index-rowValue">Regular</div></div>
<div class="index-row">
<div class="index-rowKey">Slit Detail</div>
<div class="index-rowValue">Side Slits</div>
</div><div class="index-row">
<div class="index-rowKey">Length</div>
<div class="index-rowValue">Above Knee</div>
</div><div class="index-row">
<div class="index-rowKey">Hemline</div>
<div class="index-rowValue">Curved</div></div></div>
<div class="index-showMoreText">See More</div></div>
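As an offline illustration of the structure above, the key/value rows can be paired up by class name rather than by position. The sketch below uses only the standard library's `html.parser`; the names `SpecRowParser` and `parse_specs` are hypothetical helpers, and the sample string is a shortened copy of the markup, not the live page.

```python
from html.parser import HTMLParser


class SpecRowParser(HTMLParser):
    """Collect index-rowKey / index-rowValue pairs into a dict (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.specs = {}
        self._field = None        # "key" or "value" while inside such a div
        self._pending_key = None  # last key seen that still lacks a value

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "index-rowKey" in classes:
            self._field = "key"
        elif tag == "div" and "index-rowValue" in classes:
            self._field = "value"

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "key":
            self._pending_key = text
        elif self._pending_key is not None:
            self.specs[self._pending_key] = text
            self._pending_key = None

    def handle_endtag(self, tag):
        if tag == "div":
            self._field = None


def parse_specs(html):
    """Return {key: value} for every key/value row found in the HTML string."""
    parser = SpecRowParser()
    parser.feed(html)
    return parser.specs


sample = """
<div class="index-tableContainer">
  <div class="index-row">
    <div class="index-rowKey"> Sleeve Length</div>
    <div class="index-rowValue"> Long Sleeves</div>
  </div>
  <div class="index-row">
    <div class="index-rowKey"> Neck</div>
    <div class="index-rowValue"> Mandarin Collar</div>
  </div>
</div>
"""
specs = parse_specs(sample)
print(specs)
```

With Selenium, the same function could presumably be fed `driver.page_source` once the section has been expanded, although whether the hidden rows are already present in the page source before the click is not something this sketch verifies.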
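Answer from a uj5u.com user:
I did not read through all of your code, but I tried the code below to click "See More"; you can probably merge it into your existing code.
We have to scroll to that particular element so that Selenium knows exactly where it is. I used a JavaScript .click() to click "See More".
Sample code: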
# driver_path is the path to your chromedriver executable
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
#driver.implicitly_wait(50)
wait = WebDriverWait(driver, 20)
driver.get("https://www.myntra.com/kurtas/jompers/jompers-men-yellow-printed-straight-kurta/11226756/buy")
# Wait until the "See More" element is present in the DOM
ele = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.index-showMoreText")))
# Scroll it into view, move to it, then click it through JavaScript
driver.execute_script("arguments[0].scrollIntoView(true);", ele)
ActionChains(driver).move_to_element(ele).perform()
driver.execute_script("arguments[0].click();", ele)
Complete_The_Look = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.index-product-description-content"))).text
print(Complete_The_Look)
Imports:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
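Output:
This season's classic kurta comes from Jompers. Achieve a comfortable and chic look for your next dinner or family outing when you pair this yellow piece with slim trousers and minimal bling.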
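Answer from a uj5u.com user:
The Specifications block in the product details is a combination of several sections.
It is better to extract the details from those sections one at a time.
It is also better to find relative XPaths for the elements instead of absolute ones.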
url = 'https://www.myntra.com/kurtas/jompers/jompers-men-yellow-printed-straight-kurta/11226756/buy'
# df = pd.DataFrame(columns=['name', 'title', 'price', 'description', 'Size & fit', 'Material & care', 'Complete the look'])
metadata = dict.fromkeys(['name', 'title', 'price', 'description', 'Size & fit', 'Material & care', 'Specifications', 'Complete the look'])
from selenium.common.exceptions import NoSuchElementException
driver = webdriver.Chrome('chromedriver')
specs = dict()
specification = []
for i in range(1):  # len(links)
    driver.get(url)
    try:
        metadata['title'] = driver.find_element_by_class_name('pdp-title').get_attribute("innerHTML")
        metadata['name'] = driver.find_element_by_class_name('pdp-name').get_attribute("innerHTML")
        metadata['price'] = driver.find_element_by_class_name('pdp-price').find_element_by_xpath('./strong').get_attribute("innerHTML")
        # The details can be extracted even without scrolling, but it is better to scroll down.
        driver.execute_script("arguments[0].scrollIntoView(true);", driver.find_element_by_xpath("//div[@class='pdp-productDescriptorsContainer']"))
        metadata['description'] = driver.find_element_by_xpath("//p[@class='pdp-product-description-content']").text
        #metadata['Specifications'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[1]').text
        if driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[7]/div[4]/div[2]'):
            print('yes')
            element = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]')
            element.click()
        metadata['Size & fit'] = driver.find_element_by_xpath("//h4[contains(text(), 'Size')]/following-sibling::p").text
        metadata['Material & care'] = driver.find_element_by_xpath("//h4[contains(text(),'Material')]/following-sibling::p").text
        # From Sleeve Length to Hemline.
        specn1 = driver.find_elements_by_xpath("//div[@class='index-sizeFitDesc']/div[1]/div")
        for spec in specn1:
            key = spec.find_element_by_xpath("./div[@class='index-rowKey']").text
            value = spec.find_element_by_xpath("./div[@class='index-rowValue']").text
            specification.append([key, value])
        # From Colour Family to Occasion.
        specn2 = driver.find_elements_by_xpath("//div[@class='index-sizeFitDesc']/div[2]/div[1]/div")
        for spec in specn2:
            key = spec.find_element_by_xpath("./div[@class='index-rowKey']").text
            value = spec.find_element_by_xpath("./div[@class='index-rowValue']").text
            specification.append([key, value])
        metadata['Specifications'] = specification
        metadata['Complete the look'] = driver.find_element_by_xpath("//h4[contains(text(),'Complete')]/following-sibling::p").text
        # metadata['Complete the look'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[8]/div/div[4]/div[2]/p/p').text
    except Exception as e:
        print(e)
        pass
for key, value in metadata.items():
    print(f"{key} : {value}")
# df = df.append(metadata, ignore_index=True)
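Output:
yes
name : Men Yellow Printed Straight Kurta
title : Jompers
price : Rs. 892
description : Yellow printed straight kurta, has a mandarin collar, long sleeves, straight hem, and side slits
Size & fit : The model (height 6') is wearing a size M
Material & care : Material: Cotton
Hand Wash
Specifications : [['Sleeve Length', 'Long Sleeves'], ['Shape', 'Straight'], ['Neck', 'Mandarin Collar'], ['Print or Pattern Type', 'Solid'], ['Design Styling', 'Regular'], ['Slit Detail', 'Side Slits'], ['Length', 'Knee Length'], ['Hemline', 'Straight'], ['Colour Family', 'Bright'], ['Weave Pattern', 'Regular'], ['Weave Type', 'Machine Weave'], ['Occasion', 'Daily']]
Complete the look : Sport this season's classic kurta from Jompers. Achieve a comfortable and chic look for your next dinner or family outing when you team this yellow piece with slim trousers and minimal bling.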
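Since the specifications come back as a list of [key, value] pairs, one possible way to get them into DataFrame columns is to flatten them into the same record as the other fields. The sketch below is only an illustration: `flatten_record` and the `"spec: "` column prefix are hypothetical names, and the sample values are copied from the output above.

```python
def flatten_record(metadata, spec_pairs, prefix="spec: "):
    """Merge [key, value] specification pairs into one flat dict (hypothetical helper)."""
    # Copy every scalar field, leaving out the nested Specifications entry
    record = {k: v for k, v in metadata.items() if k != "Specifications"}
    # Give each specification its own prefixed column
    for key, value in spec_pairs:
        record[prefix + key.strip()] = value.strip()
    return record


metadata = {
    "name": "Men Yellow Printed Straight Kurta",
    "title": "Jompers",
    "Specifications": None,
}
pairs = [["Sleeve Length", "Long Sleeves"], ["Shape", "Straight"]]
record = flatten_record(metadata, pairs)
print(record)
```

A list of such records, one per product URL, can be passed to `pd.DataFrame(records)` in a single call, which also avoids `DataFrame.append` (removed in pandas 2.0).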
