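I'm trying to use Python Selenium to scrape the usernames from the list that opens behind the "Followers" button on this profile. Two things are stopping me:
- I can't scroll the list with
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
because the list has two scrollbars (I don't know why there are two). When I try to scroll, the profile page scrolls instead of the list itself.
- Even if I manage to scroll the list, how should I store the usernames? The users are loaded dynamically, and for some reason the class names look like this:
class='st--c-PJLV st--c-dhzjXW st--c-edagZx'
I've tried several approaches but couldn't get the result I want, so any help is appreciated. Here are some of the snippets I tried, which raised errors: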
import time
from selenium.webdriver.common.by import By

# `driver` is assumed to be a WebDriver already showing the followers list
scrollElem = driver.find_elements(By.XPATH, "//div[@class='st--c-PJLV st--c-dhzjXW st--c-edagZx']/a")
followernumber = 2000
scrollElem[len(scrollElem)-1].location_once_scrolled_into_view
for i in range(0, followernumber):
    new = len(scrollElem) + i
    newname = driver.find_element(By.XPATH, "(//div[@class='st--c-PJLV st--c-dhzjXW st--c-edagZx']/a)[%i]" % new)
    print(newname.text, i)
    newname.location_once_scrolled_into_view
    time.sleep(1)
This raises: selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"(//div[@class='st--c-PJLV st--c-dhzjXW st--c-edagZx']/a)[47]"}
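One possible reading of this error, offered as a guess rather than a diagnosis: XPath positions are 1-based, so as soon as `new` goes past the number of links currently rendered, the lookup fails until the list has loaded more rows. Below is a minimal sketch of polling until enough items exist. `wait_for_count` and its `find` argument are hypothetical names, not part of Selenium; `find` stands for any callable that returns the current matches. Selenium's own `WebDriverWait(driver, timeout).until(...)` covers the same need.

```python
import time


def wait_for_count(find, minimum, timeout=10.0, poll=0.5):
    """Call find() repeatedly until it returns at least `minimum` items.

    Returns the last list that was found. The caller should still check
    its length, because the list may be too short when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    items = find()
    while len(items) < minimum and time.monotonic() < deadline:
        time.sleep(poll)  # give the page time to render more rows
        items = find()
    return items
```

With a real driver this might be called as `links = wait_for_count(lambda: driver.find_elements(By.XPATH, xpath), new)`, followed by a check that `len(links) >= new` before reading `links[new - 1]`.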
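I also tried the following algorithm to scroll to the bottom of the list and store the elements as they load, but it doesn't work either: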
# `driver` and SCROLL_PAUSE_TIME are assumed to be defined elsewhere
def scrollDown():
    last_height = driver.execute_script("return document.body.scrollHeight")
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(SCROLL_PAUSE_TIME)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height
This algorithm scrolls the profile page rather than the followers list.
Since I'm new to web scraping, I would really appreciate any help!
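On the scrolling problem: `window.scrollTo` moves the page itself, so a list with its own scrollbar is usually scrolled by setting `scrollTop` on the element that owns that scrollbar. The sketch below assumes that element has already been located (which one it is has to be checked in the browser's developer tools) and gathers names into a dict so that duplicates are dropped as rows load. `collect_from_scrolling_list` and `read_names` are hypothetical names used for illustration only.

```python
import time


def collect_from_scrolling_list(driver, container, read_names,
                                pause=1.0, max_rounds=200):
    """Scroll `container` (not the window) and gather names as they load.

    driver     -- a Selenium WebDriver (only execute_script is used)
    container  -- the WebElement that owns the list's scrollbar (assumed found)
    read_names -- hypothetical callable returning the names rendered right now
    """
    seen = {}  # a dict keeps insertion order and drops duplicates
    for _ in range(max_rounds):
        before = len(seen)
        for name in read_names():
            seen.setdefault(name, None)
        if len(seen) == before:
            break  # nothing new since the last scroll, assume the end
        # Scroll the element itself, so the profile page does not move.
        driver.execute_script(
            "arguments[0].scrollTop = arguments[0].scrollHeight;", container
        )
        time.sleep(pause)  # give the next batch time to load
    return list(seen)
```

Here `read_names` could be something like `lambda: [a.text for a in container.find_elements(By.TAG_NAME, "a")]`. If the site loads rows more slowly than `pause`, the loop may stop early, so the pause is a value to tune rather than a known constant.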
Answer from a uj5u.com user:
Try using the requests module to fetch the usernames of all of that profile's followers:
import requests

link = 'https://hasura2.foundation.app/v1/graphql'
payload = {"query":"query userFollowersQuery($publicKey: String!, $currentUserPublicKey: String!, $offset: Int!, $limit: Int!) {\n follows: follow(\n where: {followedUser: {_eq: $publicKey}, isFollowing: {_eq: true}}\n offset: $offset\n limit: $limit\n ) {\n id\n user: userByFollowingUser {\n name\n username\n profileImageUrl\n userIndex\n publicKey\n follows(where: {user: {_eq: $currentUserPublicKey}, isFollowing: {_eq: true}}) {\n createdAt\n isFollowing\n }\n }\n }\n}\n","variables":{"currentUserPublicKey":"","publicKey":"0xF74d1224931AFa9cf12D06092c1eb1818D1E255C","offset":0,"limit":48},"operationName":"userFollowersQuery"}

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    while True:
        resp = s.post(link, json=payload)
        follows = resp.json()['data']['follows']
        # An empty page means every follower has been listed
        if not follows:
            break
        for item in follows:
            print(item['user']['username'])
        # Move on to the next page of 48 results
        payload['variables']['offset'] += 48
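If the usernames need to be stored rather than printed, the offset-based loop above can be factored into a generator. This is only a sketch: `iter_usernames` and `fetch_page` are hypothetical names, and `fetch_page` is assumed to return the `data.follows` list for a given offset and limit.

```python
def iter_usernames(fetch_page, limit=48):
    """Yield usernames page by page until an empty page comes back.

    fetch_page -- hypothetical callable (offset, limit) -> list of items,
                  each shaped like the entries of data.follows
    """
    offset = 0
    while True:
        items = fetch_page(offset, limit)
        if not items:
            return  # an empty page marks the end of the followers
        for item in items:
            yield item["user"]["username"]
        offset += limit  # advance by one page


# Possible wiring to the session, link and payload from the answer:
# def fetch_page(offset, limit):
#     payload["variables"].update(offset=offset, limit=limit)
#     return s.post(link, json=payload).json()["data"]["follows"]
#
# usernames = list(iter_usernames(fetch_page))
```

Keeping the HTTP call behind `fetch_page` means the paging logic can be checked without touching the network.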
