Writing to a Pandas column inside a for loop in Python

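This is surely a gap in my knowledge, since I am still a beginner. What I am trying to do with this code is scrape all the data from the web page I am working on. The problem is that, before the loop continues, I want pandas to write the current position_text variable to the ["Position"] column. I confirmed with a print statement that position_text holds exactly the value I want to write on each pass, but only the last value, "C", ends up in the ["Position"] column.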

Link: https://www.fantasypros.com/daily-fantasy/nba/fanduel-defense-vs-position.php

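import pandas as pd
from time import sleep
from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By

# `driver` is assumed to be a Selenium WebDriver that has already loaded the page.
df_results = pd.DataFrame()

follow_loop = list(range(1, 7))
for i in follow_loop:
    # Build the XPath of the i-th position tab and click it.
    xpath = '//*[@id="main-container"]/div/div/div/div[4]/div[1]/ul/li['
    xpath += str(i)
    xpath += "]"
    driver.find_element(By.XPATH, xpath).click()

    sleep(2)

    driver.execute_script("window.scrollTo(1,1200)")

    sleep(2)
    driver.execute_script("window.scrollTo(1,-1200)")

    html = driver.page_source

    soup = BeautifulSoup(html, 'html.parser')

    stats_table = soup.find(id="data-table")

    # Read the label of the same tab (ALL, PG, SG, SF, PF, C).
    position = '//*[@id="main-container"]/div/div/div/div[4]/div[1]/ul/li['
    position += str(i)
    position += "]"
    position_text = driver.find_element(By.XPATH, position).text

    # Note: DataFrame.append was removed in pandas 2.0; pd.concat replaces it.
    df_results = df_results.append(pd.read_html(str(stats_table)))
    df_results["Position"] = position_text
    print(position_text)
    sleep(2)
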
ALL
PG
SG
SF
PF
C
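
The behavior described above can be reproduced without a browser. The sketch below is only an illustration: the two small frames and their values are made up and stand in for the tables scraped on each pass. It shows that assigning a single value to a column fills every row of the accumulated frame, so the rows from earlier passes are relabeled each time.

```python
import pandas as pd

# Hypothetical stand-ins for the tables scraped on each loop iteration.
tables = {
    "PG": pd.DataFrame({"Team": ["HOU", "OKC"], "PTS": [23.5, 22.2]}),
    "C": pd.DataFrame({"Team": ["DEN", "PHI"], "PTS": [23.0, 21.9]}),
}

combined = pd.DataFrame()
for position_text, table in tables.items():
    combined = pd.concat([combined, table], ignore_index=True)
    # A scalar assigned to a column is broadcast to every row,
    # including the rows added in earlier iterations.
    combined["Position"] = position_text

print(combined["Position"].tolist())  # ['C', 'C', 'C', 'C']
```

After the last pass every row carries "C", which matches the symptom in the question.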

Answer:

Here is one way to collect the data from all the tables into one large DataFrame:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time as t
import pandas as pd 

pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
big_df = pd.DataFrame()
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('disable-notifications')
chrome_options.add_argument("window-size=1280,720")

webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
driver = webdriver.Chrome(service=webdriver_service, options=chrome_options)
wait = WebDriverWait(driver, 20)
url = "https://www.fantasypros.com/daily-fantasy/nba/fanduel-defense-vs-position.php"
driver.get(url)

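# The attribute predicate of the original XPath was lost; the tab list path from the question is used here instead.
tables_list = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//*[@id="main-container"]/div/div/div/div[4]/div[1]/ul/li')))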

for x in tables_list:
    x.click()
    print('selected', x.text)
    t.sleep(2)
    table = wait.until(EC.element_to_be_clickable((By.XPATH, '//table[@id="data-table"]')))
    df = pd.read_html(table.get_attribute('outerHTML'))[0]
    df['Category'] = x.text.strip()
    big_df = pd.concat([big_df, df], axis=0, ignore_index=True)
    print('done, moving to next table')
print(big_df)
big_df.to_csv('fanduel.csv')

This saves the data to a CSV file and prints the following in the terminal:

     Team                         PTS    REB   AST   3PM   STL   BLK    TO  FD PTS  Category
  0  HOUHouston Rockets         23.54   9.10  5.10  2.54  1.88  1.15  2.65   48.55  ALL
  1  OKCOklahoma City Thunder   22.22   9.61  5.19  2.70  1.67  1.18  2.52   47.57  ALL
  2  PORPortland Trail Blazers  22.96   8.92  5.31  2.74  1.63  0.99  2.65   46.84  ALL
  3  SACSacramento Kings        23.00   9.10  5.03  2.58  1.61  0.95  2.50   46.65  ALL
  4  ORLOrlando Magic           22.35   9.39  4.94  2.62  1.57  1.04  2.50   46.36  ALL
...  ...                          ...    ...   ...   ...   ...   ...   ...     ...  ...
175  DENDenver Nuggets          22.96  12.91  3.68  0.96  1.21  1.76  2.62   50.26  C
176  PHIPhiladelphia 76ers      21.95  13.35  3.01  1.15  1.14  1.94  2.07   49.66  C
177  BOSBoston Celtics          19.52  14.46  3.58  0.61  1.40  1.82  2.80   49.10  C
178  NYKNew York Knicks         19.31  14.48  3.02  1.07  1.02  1.98  2.26   47.96  C
179  MIAMiami Heat              19.00  14.44  2.95  0.64  1.24  1.55  2.71   46.41  C
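
The key point of the answer is that each table gets its label before it is merged, so later passes cannot overwrite earlier rows. A minimal browser-free sketch of that pattern follows. The helper name combine_tagged, the column name "Position", and the sample frames are all hypothetical and chosen only for illustration. It also gathers the frames in a list and concatenates once, which is a variation on the answer's per-iteration concat rather than the answer's exact code.

```python
import pandas as pd


def combine_tagged(tables):
    """Label each frame with its key, then concatenate all of them once."""
    frames = []
    for label, table in tables.items():
        # assign() returns a new frame, so the input tables stay untouched.
        frames.append(table.assign(Position=label))
    return pd.concat(frames, ignore_index=True)


# Hypothetical stand-ins for the tables scraped for each position tab.
tables = {
    "PG": pd.DataFrame({"Team": ["HOU", "OKC"], "PTS": [23.5, 22.2]}),
    "C": pd.DataFrame({"Team": ["DEN", "PHI"], "PTS": [23.0, 21.9]}),
}

result = combine_tagged(tables)
print(result["Position"].tolist())  # ['PG', 'PG', 'C', 'C']
```

In the scraping loop, the same idea would mean setting the label on the freshly read table and only then adding it to the running result.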
