I am writing a simple R function that consistently writes files to disk in the same way, but in different folders:
library(magrittr)
main_path <- here::here()
write_to_disk <- function(data, folder, name){
  data %>%
    vroom::vroom_write(
      file.path(main_path, folder, paste0(name, ".tsv"))
    )
}
I know I don't necessarily need to return anything from an R function, but if I were to, what would be the appropriate return() statement here?
Many thanks.
Answer:
Honestly, this is subject to a lot of opinion and context, but here are some thoughts:
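Return the original data. This ethos is conveyed in most tidyverse verb functions and many other packages (and in some of base R). If you are using either the %>% or |> pipe, doing this allows further processing of the data after your function, which can be very convenient.

write_to_disk <- function(data, folder, name){
  vroom::vroom_write(
    data,
    file.path(folder, paste0(name, ".tsv"))
  )
  data
}

(Your function is already doing this implicitly, since the call to vroom::vroom_write, which returns the data, is the last expression in the function body.)

Return the output of the file-writing call. Frankly, I'm not a big fan of this, because if you change the function your wrapper uses, your function's return value may well change. I don't know the expected lifetime of your wrapper, but imagine you opt to switch from vroom::vroom_write to another function; vroom returns a subset of the data based on col_select, and perhaps the newer function returns the whole data, which might break the assumptions of downstream processing.

write_to_disk <- function(data, folder, name){
  out <- vroom::vroom_write(
    data,
    file.path(folder, paste0(name, ".tsv"))
  )
  out
}

Note: I explicitly chose to capture the result in out and return it, in case you ever add code between vroom::vroom_write and the subsequent out. Your original function, unchanged, effectively does the same thing, but capturing out keeps the return value intact if you choose to add code after vroom_write. Otherwise, your function already does this implicitly, since vroom::vroom_write returns the data.

Return the file name. This is only useful when the file name is not necessarily known a priori. For instance, if your wrapper takes care not to overwrite same-named files, it might add a counter (before the extension) so that an overwrite never happens. In that case the calling environment does not know what the chosen file name is, so returning it has value (sometimes).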
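
write_to_disk <- function(data, folder, name){
  # file numbering: look for existing name.tsv, name_1.tsv, name_2.tsv, ...
  re <- paste0("^", name, "_?([0-9]+)?\\.tsv$")
  existfiles <- list.files(folder, pattern = re, full.names = TRUE)
  # highest counter already in use (an unnumbered file counts as 0)
  nextnum <- max(0L, suppressWarnings(as.integer(gsub(re, "\\1", basename(existfiles)))), na.rm = TRUE)
  # add a counter as soon as any matching file exists, so nothing is overwritten
  if (length(existfiles) > 0) {
    name <- sprintf("%s_%i", name, nextnum + 1L)
  }
  filename <- file.path(folder, paste0(name, ".tsv"))
  vroom::vroom_write(
    data,
    filename
  )
  filename
}

(The "file numbering" code is only an example of where I think returning the file name might make sense.)
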
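Return whether the write succeeded. This may require using try or tryCatch (or any tidyverse equivalent), catching the error, and reacting accordingly.

write_to_disk <- function(data, folder, name){
  res <- tryCatch(
    vroom::vroom_write(
      data,
      file.path(folder, paste0(name, ".tsv"))
    ),
    error = function(e) e
  )
  # TRUE on success; FALSE with the error message attached on failure
  out <- !inherits(res, "error")
  if (!out) {
    attr(out, "error") <- conditionMessage(res)
  }
  out
}

Return nothing. This is the simplest, of course. You need to do this explicitly so that you don't inadvertently pass along the return value of the file-writing function.
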
write_to_disk <- function(data, folder, name){
  vroom::vroom_write(
    data,
    file.path(folder, paste0(name, ".tsv"))
  )
  NULL
}
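
To illustrate the "return the original data" choice from the caller's side, here is a minimal sketch. It assumes base R's write.table() as a stand-in for vroom::vroom_write() (only so that it runs without extra packages), the built-in mtcars data, and a temporary folder; out_dir and the file name cars_head are hypothetical. It also assumes R 4.1 or later for the |> pipe.

```r
# Stand-in writer: write.table() replaces vroom::vroom_write() here
# purely so the sketch has no package dependencies (an assumption).
write_to_disk <- function(data, folder, name) {
  write.table(
    data,
    file.path(folder, paste0(name, ".tsv")),
    sep = "\t", row.names = FALSE, quote = FALSE
  )
  data  # hand the original data back to the caller
}

# Hypothetical scratch folder for the demonstration.
out_dir <- tempfile("write_demo_")
dir.create(out_dir)

# Because the data comes back, the pipeline can continue after the write.
n_rows <- head(mtcars, 5) |>
  write_to_disk(out_dir, "cars_head") |>
  nrow()

n_rows
file.exists(file.path(out_dir, "cars_head.tsv"))
```

Had the function returned NULL, the file name, or a success flag instead, the nrow() step would receive that value rather than the data, which is the trade-off between the choices listed above.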
Notes:

Your use of main_path runs counter to functional programming, since the function behaves differently given identical inputs, depending on something outside its immediate scope. I argue it is better to call write_to_disk(x, file.path(main_path, folder), "somename"), since main_path is defined in that environment (not within the function), and your function is then general enough not to require that the variable be defined correctly.

I've updated all of the code above to reflect this good practice. If you feel strongly otherwise, feel free to add main_path back in wherever you prefer.

It might be useful for any of the above to be returned invisibly, so that (for instance) saving a large data.frame without capturing its return value does not flood the console with the data. This is easy enough to do with invisible(data), and it changes nothing about the return value (other than that it is not printed on the console by default).

FYI: Konrad and I have gone back and forth in the comments about whether return(.) is a good idea or not. I don't disagree with most of the claims, and argue regardless that it can be as much about style and opinion as anything else. Since most of my arguments for return are moot in all of the above code, I removed it for succinctness.
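
The two notes about main_path and invisible() can be sketched together as follows. As an assumption for illustration, base R's write.table() again stands in for vroom::vroom_write(), a temporary directory stands in for here::here(), and the folder name results and the file names are hypothetical.

```r
# Stand-in writer (assumption): write.table() instead of vroom::vroom_write().
write_to_disk <- function(data, folder, name) {
  write.table(
    data,
    file.path(folder, paste0(name, ".tsv")),
    sep = "\t", row.names = FALSE, quote = FALSE
  )
  invisible(data)  # same return value, just not auto-printed
}

# Hypothetical project root; in the question this would be here::here().
main_path <- tempfile("project_")
dir.create(file.path(main_path, "results"), recursive = TRUE)

# The caller composes the full folder, so the function itself never
# depends on a variable defined outside its own arguments.
write_to_disk(iris, file.path(main_path, "results"), "iris")

# The value is still available when it is captured.
saved <- write_to_disk(iris, file.path(main_path, "results"), "iris_copy")
identical(saved, iris)

# withVisible() reports whether a result would print at the console.
vis <- withVisible(write_to_disk(iris, file.path(main_path, "results"), "iris_vis"))
vis$visible
```

The first call writes the file without printing the 150 rows of iris, while the captured result is unchanged, which is the behavior the note on invisible() describes.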
