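I am new to R and am having trouble converting a character column to a date/time format. I am trying to mutate the started_at and ended_at columns from character to date/time, but whatever I try, I get either the warning "NAs introduced by coercion" or the error "character string is not in a standard unambiguous format". The goal is to create a new column, ride_length, holding the difference between ended_at and started_at in minutes.
My data frame is named sep_2021.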
str(sep_2021)
spec_tbl_df [804,352 × 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ ride_id : chr [1:804352] "9DC7B962304CBFD8" "F930E2C6872D6B32" "6EF72137900BB910" "78D1DE133B3DBF55" ...
$ rideable_type : chr [1:804352] "electric_bike" "electric_bike" "electric_bike" "electric_bike" ...
$ started_at : chr [1:804352] "9/28/21 16:07" "9/28/21 14:24" "9/28/21 00:20" "9/28/21 14:51" ...
$ ended_at : chr [1:804352] "9/28/21 16:09" "9/28/21 14:40" "9/28/21 00:23" "9/28/21 15:00" ...
$ day_of_week : num [1:804352] 3 3 3 3 3 3 3 3 2 3 ...
$ start_station_name: chr [1:804352] NA NA NA NA ...
$ start_station_id : chr [1:804352] NA NA NA NA ...
$ end_station_name : chr [1:804352] NA NA NA NA ...
$ end_station_id : chr [1:804352] NA NA NA NA ...
$ start_lat : num [1:804352] 41.9 41.9 41.8 41.8 41.9 ...
$ start_lng : num [1:804352] -87.7 -87.6 -87.7 -87.7 -87.7 ...
$ end_lat : num [1:804352] 41.9 42 41.8 41.8 41.9 ...
$ end_lng : num [1:804352] -87.7 -87.7 -87.7 -87.7 -87.7 ...
$ member_casual : chr [1:804352] "casual" "casual" "casual" "casual" ...
- attr(*, "spec")=
.. cols(
.. ride_id = col_character(),
.. rideable_type = col_character(),
.. started_at = col_character(),
.. ended_at = col_character(),
.. day_of_week = col_double(),
.. start_station_name = col_character(),
.. start_station_id = col_character(),
.. end_station_name = col_character(),
.. end_station_id = col_character(),
.. start_lat = col_double(),
.. start_lng = col_double(),
.. end_lat = col_double(),
.. end_lng = col_double(),
.. member_casual = col_character()
.. )
- attr(*, "problems")=<externalptr>
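I tried the following:
sep_2021 <- mutate(sep_2021, started_at = as.Date(started_at)) Result: character string is not in a standard unambiguous format
sep_2021 <- mutate(sep_2021, started_at = as.Date.POSIXct(started_at, tz = "", tryFormats = c("%Y-%m-%d %H:%M:%OS","%Y/%m/%d %H:%M:%OS"))) Result: character string is not in a standard unambiguous format
sep_2021 <- mutate(sep_2021, started_at = lubridate::as_datetime(started_at)) Result: All formats failed to parse. No formats found.
sep_2021 <- mutate(sep_2021, started_at = as.Date(started_at, "%m-%d-%y %H:%M:%OS")) Result: NAs introduced by coercion
Any and all advice or suggestions are greatly appreciated!
Reply from a uj5u.com user:
We could use parse_date from the parsedate package: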
library(dplyr)
library(parsedate)
sep_2021 <- sep_2021 %>%
mutate(across(c(started_at, ended_at), parse_date))
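The format passed to format did not match the format of the columns: the values use slashes and have no seconds, so it should be %m/%d/%y %H:%M
# A lambda is used because recent dplyr versions deprecate passing extra arguments through across()
sep_2021 <- sep_2021 %>%
  mutate(across(c(started_at, ended_at),
                ~ as.POSIXct(.x, format = "%m/%d/%y %H:%M")))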
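To see why the attempts in the question came back as NA, it may help to compare a mismatched format string with a matching one on a couple of the sample values. This is only a minimal base R sketch. The tz = "UTC" setting is an assumption made so the result is reproducible; the question does not say which time zone the data is in.

```r
# Sample values copied from the question's started_at column
x <- c("9/28/21 16:07", "9/28/21 14:24")

# Format from the failed attempt: dashes and seconds, which the strings do not contain
bad <- as.POSIXct(x, format = "%m-%d-%y %H:%M:%OS", tz = "UTC")  # tz is an assumption

# Format matching the strings: slashes, two-digit year, hours and minutes only
good <- as.POSIXct(x, format = "%m/%d/%y %H:%M", tz = "UTC")     # tz is an assumption

print(bad)   # every element is NA because the format does not match
print(good)  # parsed date-times

# Count values that failed to parse before overwriting the real column
n_failed <- sum(is.na(good))
print(n_failed)
```

Checking the NA count on the full column before assigning it back is a cheap way to catch rows that use a different layout.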
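Reply from a uj5u.com user:
You can use mdy_hm to change the class from character to POSIXct. To calculate the difference, use difftime and pass units to it.
For example, to get the difference in seconds (use units = 'mins' for minutes, as the question asks), you can do: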
library(dplyr)
library(lubridate)
sep_2021 <- sep_2021 %>%
mutate(across(c(started_at, ended_at), mdy_hm),
diff = difftime(ended_at, started_at, units = 'secs'))
sep_2021
# started_at ended_at diff
#1 2021-09-28 16:07:00 2021-09-28 16:09:00 120 secs
#2 2021-09-28 14:24:00 2021-09-28 14:40:00 960 secs
#3 2021-09-28 00:20:00 2021-09-28 00:23:00 180 secs
#4 2021-09-28 14:51:00 2021-09-28 15:00:00 540 secs
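Since the stated goal is a ride_length column in minutes, the same approach might be adapted along these lines. This is a sketch on a small hypothetical data frame named rides, built from the first two sample rows. Wrapping difftime in as.numeric is an optional choice that gives a plain number rather than a difftime object.

```r
library(dplyr)
library(lubridate)

# Hypothetical two-row sample taken from the values shown in the question
rides <- data.frame(
  started_at = c("9/28/21 16:07", "9/28/21 14:24"),
  ended_at   = c("9/28/21 16:09", "9/28/21 14:40")
)

rides <- rides %>%
  # Parse both columns as month/day/year hour:minute (mdy_hm uses UTC by default)
  mutate(across(c(started_at, ended_at), mdy_hm),
         # Difference in minutes, stored as a plain numeric column
         ride_length = as.numeric(difftime(ended_at, started_at, units = "mins")))

print(rides$ride_length)
```

Keeping ride_length numeric tends to make later summaries such as mean or max simpler, though leaving it as a difftime would also work.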
Data
It is easier to get help if you provide the data in a reproducible format.
sep_2021 <- data.frame(started_at = c("9/28/21 16:07", "9/28/21 14:24" ,"9/28/21 00:20" ,"9/28/21 14:51"),
ended_at = c("9/28/21 16:09" ,"9/28/21 14:40", "9/28/21 00:23", "9/28/21 15:00"))
