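I am trying to extract data from 800+ links and put it in a table. I tried using the Chrome SelectorGadget but could not figure out how to make it loop. I must have spent 40 hours on this and keep getting error codes. I also need to extract the same kind of information from another info box entry: li:nth-child(8), li:nth-child(8) strong. I tried following YouTube videos, changing only the names and links and keeping everything else the same, but it does not work.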
library(tidyverse)
library(rvest)
library(htmltools)
library(xml2)
library(dplyr)

# Read the deal directory and collect each deal's name and link
results <- read_html("https://www.artemis.bm/deal-directory/")
issuers <- results %>% html_nodes("#table-deal a") %>% html_text()
url <- results %>% html_nodes("#table-deal a") %>% html_attr("href")

# Return the text of the 4th item in a deal page's info box
get_modelling = function(url_link) {
  issuer_page = read_html(url_link)
  modelling = issuer_page %>% html_nodes("#info-box li:nth-child(4)") %>%
    html_text()
  return(modelling)
}

# Apply the function to every deal link
issuer_modelling = sapply(url, FUN = get_modelling)
I get these errors:
Warning message:
In for (i in seq_along(specs)) { :
closing unused connection 4 (https://www.artemis.bm/deal-directory/bellemeade-re-2022-1-ltd/)
Called from: open.connection(x, "rb")
Browse[1]> data.table::data.table(placement = unlist(issue_placement))[,.N, placement]
Error during wrapup: object 'issue_placement' not found
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Browse[1]> c
> data.table::data.table(placement = unlist(issue_placement))[,.N, placement]
Error in unlist(issue_placement) : object 'issue_placement' not found
A uj5u.com user replied:
We can use a simple for loop:
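# Start with an empty vector
df = c()
# head(url) limits the run to the first six links; loop over url for all of them
for (i in head(url)) {
  dd = i %>% read_html() %>% html_nodes("#info-box li:nth-child(4)") %>%
    html_text()
  # New results are prepended, so df ends up in reverse order of the links
  df = c(dd, df)
}
df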
[1] "Risk modelling / calculation agents etc: AIR Worldwide" "Risk modelling / calculation agents etc: AIR Worldwide"
[3] "Risk modelling / calculation agents etc: RMS" "Risk modelling / calculation agents etc: AIR Worldwide"
[5] "Risk modelling / calculation agents etc: AIR Worldwide" "Risk modelling / calculation agents etc: AIR Worldwide"
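
Building on the loop above, here is a rough sketch of one way to collect more than one info box item per deal into a table, which is what the question asks for. Several things are assumptions made for illustration and are not from the original post: the helper names extract_fields and scrape_one, the column names modelling and item_8, and the made-up demo page. It also assumes rvest 1.0 or later for html_element(). The real pages may be laid out differently, so the selectors may need adjusting.

```r
library(rvest)

# Hypothetical helper: pull selected #info-box items out of one parsed page.
# A missing item yields NA, so every page produces exactly one row.
extract_fields <- function(page) {
  pick <- function(css) {
    page %>% html_element(css) %>% html_text(trim = TRUE)
  }
  data.frame(
    modelling = pick("#info-box li:nth-child(4)"),
    item_8    = pick("#info-box li:nth-child(8)"),
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: fetch one link; if the request or parse fails,
# return an all-NA row instead of stopping the whole run.
scrape_one <- function(url_link) {
  tryCatch(
    extract_fields(read_html(url_link)),
    error = function(e) {
      data.frame(modelling = NA_character_, item_8 = NA_character_,
                 stringsAsFactors = FALSE)
    }
  )
}

# Offline demo on a made-up page with eight list items (not real site data)
items <- paste0("<li>Field ", 1:8, ": value ", 1:8, "</li>", collapse = "")
demo_page <- read_html(paste0('<ul id="info-box">', items, "</ul>"))
demo_row <- extract_fields(demo_page)
print(demo_row)

# With the real `url` and `issuers` vectors from the question, something like:
# rows  <- lapply(url, scrape_one)
# deals <- cbind(issuer = issuers, do.call(rbind, rows))
```

Returning NA instead of raising an error means one bad link among 800 does not discard the rest of the results. Adding a short Sys.sleep() between requests may also reduce the load on the server.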
