I want to save an html page in a tibble so that I can later use mutate on the page contents.
My first thought was to read the html directly into a tibble:
library(tidyverse)
library(rvest)
#does not work
tibble(html=read_html("https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm?event=overview.process&ApplNo=040445"))
#> Error: All columns in a tibble must be vectors.
#> x Column `html` is a `xml_document/xml_node` object.
Reading it in as a list works:
#works
works <- tibble(html=list(read_html("https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm?event=overview.process&ApplNo=040445")))
works
#> # A tibble: 1 x 1
#>   html
#>   <list>
#> 1 <xml_dcmn>
However, I can't use mutate on it:
# does not work
works %>%
  mutate(table=html_nodes(unlist(page),"#exampleApplSuppl"))
#> Error: Problem with `mutate()` column `table`.
#> i `table = html_nodes(unlist(page), "#exampleApplSuppl")`.
Created on 2021-11-02 by the reprex package (v2.0.1)
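Answer from a uj5u.com user:
Since the 'html' column is a list, loop over the list with map() and return the output as a list, so that the new column is also a list column: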
library(purrr)
library(dplyr)
works %>%
  mutate(table = map(html, ~ html_nodes(.x, "#exampleApplSuppl")))
Output:
# A tibble: 1 × 2
  html       table
  <list>     <list>
1 <xml_dcmn> <xml_ndst>
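If the goal is to end up with the table contents rather than just the matched nodes, the same map() pattern can be chained. The following is only a minimal sketch under a few assumptions: the inline HTML string is a made-up stand-in for the FDA page so that it runs offline, the column names node, table and n_rows are chosen for illustration, and html_element() assumes rvest 1.0.0 or later.

```r
library(dplyr)
library(purrr)
library(rvest)

# Hypothetical inline page standing in for the FDA URL (assumption, not the real markup)
page <- read_html('<html><body>
  <table id="exampleApplSuppl">
    <tr><th>Action Date</th><th>Submission</th></tr>
    <tr><td>01/02/2003</td><td>SUPPL-1</td></tr>
  </table>
</body></html>')

# Store the parsed document in a list column, as in the question
pages <- tibble(html = list(page))

result <- pages %>%
  mutate(
    # one node per page: html_element() returns the first match for the selector
    node = map(html, ~ html_element(.x, "#exampleApplSuppl")),
    # convert each matched <table> node into a data frame
    table = map(node, html_table),
    # map_int() returns a plain integer column instead of a list column
    n_rows = map_int(table, nrow)
  )

result
```

Every step that produces a non-vector object (nodes, data frames) stays inside map() so the result is a list column. A typed variant such as map_int() is only used once each element is a single value.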
