How do I slowly scroll down a web page using Selenium with Python?

I want to scroll down a web page using Selenium. I found this question:
How can I scroll a web page using selenium webdriver in python?

I took the following code from it:

import time

# `driver` is an already created Selenium WebDriver with the page loaded
SCROLL_PAUSE_TIME = 0.5

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

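It works fine. But because of the code above, I ran into a problem in my main code. I want to parse Twitter. If a Twitter account is long, the page's HTML contains only some of the tweets, not all of the account's tweets. For example: as I scroll down the page, the HTML contains only the tweets that are currently visible to me. Because of this, I can't capture all the tweets. The code above scrolls the page quickly. How can I slow the scrolling down?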

I tried to solve it and wrote this silly code:

    last_height = driver.execute_script("return document.body.scrollHeight")
    print(last_height)

    # Scroll down to bottom
    y = 600
    finished = False
    while True:
        for timer in range(0, 100):
            driver.execute_script("window.scrollTo(0, " + str(y) + ")")
            y += 600
            sleep(1)
            new_height = driver.execute_script("return document.body.scrollHeight")
            print(new_height, last_height)

            if new_height == last_height: #on the first iteration new_height equals last_height
                print('stop')
                finished = True
                break
            last_height = new_height
        if finished:
            break

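This code doesn't work. On the first iteration, new_height equals last_height. Please help me.
If you can fix my code, please fix it. If you can write another, more elegant solution, please write it.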

Update:

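The scrolling must be infinite. For example, I scroll down a Facebook account until I have scrolled through all of it. That is why I have the last_height and new_height variables. In my code, when last_height equals new_height, it means the page has been scrolled to the end and we can stop scrolling (we can exit). But I'm missing something. My code doesn't work.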
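
A plausible reason the loop above stops on its first iteration: scrolling to y = 600 usually does not reach the bottom of the page, so the site has not yet been prompted to load more content and document.body.scrollHeight is unchanged. The sketch below is one possible way around that, not a definitive fix. It scrolls in small steps and stops only after the viewport has stayed at the bottom for several consecutive checks. The function name and the `step`, `pause`, and `max_idle_rounds` parameters are assumptions made for illustration; `driver` is any object with Selenium's `execute_script` method.

```python
import time


def slow_scroll_to_end(driver, step=600, pause=1.0, max_idle_rounds=3):
    """Scroll down in fixed steps until the page stops growing.

    All parameter names and defaults are illustrative, not from the question.
    """
    idle_rounds = 0
    while idle_rounds < max_idle_rounds:
        # Scroll relative to the current position, so no y counter is needed.
        driver.execute_script("window.scrollBy(0, %d);" % step)
        time.sleep(pause)  # give lazily loaded content time to appear

        # True when the viewport touches the bottom (2 px rounding tolerance).
        at_bottom = driver.execute_script(
            "return window.innerHeight + window.pageYOffset"
            " >= document.body.scrollHeight - 2;"
        )
        # If new content was appended, the viewport is no longer at the
        # bottom and the idle counter resets; otherwise it counts up.
        idle_rounds = idle_rounds + 1 if at_bottom else 0
```

With a real browser this would be called as `slow_scroll_to_end(driver)` after `driver.get(...)`. A larger `pause` slows the scrolling down, and a larger `max_idle_rounds` tolerates slower loading before giving up.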

Answer from a uj5u.com user:

I once worked on a Twitter bot. As you scroll down, Twitter updates the page's HTML and removes some of the tweets above. The algorithm I used is:

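1. Create an empty list for tweet URLs.
2. Collect the available tweets, then check for each tweet whether its URL is in the list. If it is not, add it and do whatever processing you want on the tweet's content; otherwise ignore the tweet.
3. Get the page height: current_height = DriverWrapper.cd.execute_script("return document.body.scrollHeight")
4. Scroll the page down. If new_height == current_height, stop; otherwise repeat from step 2.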
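
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the answerer's actual bot: `collect_while_scrolling`, `get_visible_items`, and `handle_item` are hypothetical names, and `get_visible_items(driver)` stands for whatever code extracts `(url, item)` pairs from the tweets currently in the DOM (for example with `driver.find_elements`, using a selector that depends on the page's markup). A set is used in place of the list from step 1 because only membership tests are needed.

```python
import time


def collect_while_scrolling(driver, get_visible_items, handle_item, pause=1.0):
    """Scroll to the end of a page, handling each item once, keyed by URL.

    `get_visible_items` and `handle_item` are hypothetical callbacks.
    Returns the set of URLs that were handled.
    """
    seen_urls = set()  # step 1: URLs handled so far

    def collect():
        # Step 2: handle only the items whose URL has not been seen yet.
        for url, item in get_visible_items(driver):
            if url not in seen_urls:
                seen_urls.add(url)
                handle_item(url, item)

    while True:
        collect()
        # Step 3: remember the page height before scrolling.
        current_height = driver.execute_script(
            "return document.body.scrollHeight")
        # Step 4: scroll down and wait for new content to load.
        driver.execute_script(
            "window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)
        new_height = driver.execute_script(
            "return document.body.scrollHeight")
        if new_height == current_height:
            collect()  # final pass over whatever the last scroll revealed
            break
    return seen_urls
```

Because items are handled as soon as they are seen, it does not matter that the page later removes older tweets from the HTML.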
