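I'm trying to write R code to scrape the dates in the date column of a web page, for example: Mar 23, Sat. I looked at the page source, but the dates aren't there.
Here is what I've tried so far, with no luck (forgive me if this code looks silly, I'm new to web scraping):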
library(rvest)
webpage <- read_html("https://www.cricbuzz.com/cricket-series/2810/indian-premier-league-2019/matches")
webpage %>% html_nodes(xpath = "//*[@id='series-matches']/div[4]/div[1]") %>% html_text()
#> [1] ""
webpage %>% html_nodes(xpath = "//html/body/div/div[2]/div[4]/div/div[6]/div[2]/span") %>% html_text()
#> [1] ""
webpage %>% html_nodes(xpath = "//html/body/div/div[2]/div[4]/div/div[6]/div[2]/span/ng-binding") %>% html_text()
#> character(0)
webpage %>% html_nodes(".ng-binding") %>% html_text()
#> character(0)
A uj5u.com user replied:
The information is stored in the ng-bind attribute of a child span element whose direct parent has the class "schedule-date".
<div class="cb-col-25 cb-col pad10 schedule-date ng-isolate-scope" ng-show="!filter_set"><span ng-bind=" 1553351400000| date:'MMM dd, EEE' : '+05:30'" class="ng-binding">Mar 23, Sat</span></div>
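You can target these elements with the CSS selector .schedule-date > span and then extract the ng-bind attribute value. That gives you a Unix epoch timestamp (in milliseconds), plus a UTC offset (India Standard Time) and a date format specification.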
1553351400000| date:'MMM dd, EEE' : '+05:30'
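To see why html_text() comes back empty while the attribute is still available, here is a small offline sketch. The static_html string is a hypothetical approximation of what the server sends before Angular runs in the browser (I'm assuming the ng-binding class and the visible text are only added client-side, which would match the empty results in the question).

```r
library(rvest)

# Hypothetical approximation of the markup as served, before Angular fills it in.
static_html <- '<div class="schedule-date"><span ng-bind=" 1553351400000| date:\'MMM dd, EEE\' : \'+05:30\'"></span></div>'

page  <- read_html(static_html)
spans <- html_elements(page, ".schedule-date > span")

# The span has no text yet, so there is nothing for html_text() to return.
span_text <- html_text(spans)

# The raw binding expression, including the timestamp, is still in the attribute.
span_attr <- html_attr(spans, "ng-bind")
```

So the date has to be rebuilt from span_attr rather than read from the element text.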
You can extract the timestamp part and apply the relevant conversion (guided by the information after the pipe separator).
library(rvest)
library(tidyverse)
library(stringi)
# Read the static HTML and collect the ng-bind attribute of each date span
strings_with_dates <- read_html("https://www.cricbuzz.com/cricket-series/2810/indian-premier-league-2019/matches") %>%
  html_elements(".schedule-date > span") %>%
  html_attr("ng-bind")
# Capture the leading digits (epoch milliseconds), convert to seconds,
# then to a date in India Standard Time, and format like the page does
dates <- str_match(strings_with_dates, "(\\d+).*") %>%
  .[, 2] %>%
  as.numeric() %>%
  map(function(x) x / 1000) %>%
  unlist() %>%
  as.POSIXct(origin = "1970-01-01", tz = "Asia/Kolkata") %>%
  as.Date() %>%
  stri_datetime_format(format = "MMM dd, EEE")
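If you just want to check the conversion step without hitting the site, here is a minimal base R sketch on a single sample value. The ng_bind string is modeled on the attribute shown above, and the variable names are my own. Note that %b and %a depend on the session locale, so I'm assuming an English locale for the final label.

```r
# Sample attribute value modeled on the markup shown above.
ng_bind <- " 1553351400000| date:'MMM dd, EEE' : '+05:30'"

# Pull out the first run of digits: the epoch time in milliseconds.
ms <- as.numeric(regmatches(ng_bind, regexpr("[0-9]+", ng_bind)))

# Milliseconds -> seconds, then interpret as a date-time in India Standard Time.
when <- as.POSIXct(ms / 1000, origin = "1970-01-01", tz = "Asia/Kolkata")

# Locale-independent check of the instant, then the page-style label.
stamp <- format(when, "%Y-%m-%d %H:%M")
label <- format(when, "%b %d, %a")
```

Here stamp should come out as 2019-03-23 20:00, i.e. the evening of the first match date listed on the page.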

A uj5u.com user replied:
An RSelenium solution to get the dates:
url = 'https://www.cricbuzz.com/cricket-series/2810/indian-premier-league-2019/matches'
#Launch Browser
library(RSelenium)
library(rvest)
library(dplyr)
driver = rsDriver(browser = c("firefox"))
remDr <- driver[["client"]]
remDr$navigate(url)
remDr$getPageSource()[[1]] %>%
read_html() %>% html_nodes('.schedule-date') %>%
html_nodes('.ng-binding') %>%
html_text()
[1] "Mar 23, Sat" "Mar 24, Sun" "Mar 25, Mon" "Mar 26, Tue" "Mar 27,
