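I have a text vector in a data frame (df1$text), and I am trying to create a new vector (df1$last.ten) holding the last 10 words of each text. I tried the following, without success: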
df1$last.ten = mapply(function(x,y) paste(word(x,y), collapse=" "), df1$text, -1:-10)
But I only get a single word instead of a string of ten words:
> df1$last.ten[1]
[1] "end."
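It works fine when I give it a single string, so it seems I am using mapply incorrectly.
I also tried to use gsub for this, but could not figure out the syntax. A word() or gsub() solution would be much appreciated. Thanks!
uj5u.com helpful user replied:
Here is a base R option: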
#example data
df1 <- data.frame(text = c('This is a long text which consists of words more than 10',
'This is another one which is similar to first one but even longer'))
#split string on space for every word and paste the last 10 words in one string
df1$last.ten <- sapply(strsplit(df1$text, '\\s+'), function(x)
  paste0(tail(x, 10), collapse = ' '))
df1
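Since the question also asks for a gsub()-style solution, here is a minimal regex sketch using sub() in base R. The pattern is an assumption of mine, not part of the answer above: it lazily skips a prefix, then captures up to nine "word plus whitespace" pairs followed by a final word anchored at the end of the string. It treats any run of non-whitespace characters as a word.

```r
texts <- c(
  "This is a long text which consists of words more than 10",
  "This is another one which is similar to first one but even longer",
  "short text"
)

# ^.*?                  lazily skip as little of the prefix as possible
# ((?:\\S+\\s+){0,9}\\S+)  capture at most ten whitespace-separated words
# \\s*$                 allow trailing whitespace, then end of string
last_ten <- sub("^.*?((?:\\S+\\s+){0,9}\\S+)\\s*$", "\\1", texts, perl = TRUE)

print(last_ten)
```

Texts with fewer than ten words are returned unchanged. For very long texts the strsplit() and tail() approach is likely easier to read and reason about.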
uj5u.com helpful user replied:
I made some sample data. You may not need an apply function at all.
library(stringr)  # provides word() and str_count()
df1 <- data.frame(text = c("one two three four five six seven eight nine ten eleven","one two three four five six seven eight nine ten eleven twelve"))
# '\\w+' counts the words in each text; take words (n - 9) through n
df1$last.ten <- word(df1[[1]], str_count(df1[[1]], '\\w+') - 9, str_count(df1[[1]], '\\w+'))
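As for why the original mapply() call returns one word: mapply() walks its arguments in parallel, so each text is paired with a single position from -1:-10 rather than with all ten. The sketch below keeps word() but applies every position to each text. It assumes stringr is installed and that every text has at least ten words; it uses -10:-1 instead of -1:-10 so the words stay in their original order.

```r
library(stringr)  # assumed to be installed; provides word()

df1 <- data.frame(text = c(
  "one two three four five six seven eight nine ten eleven",
  "one two three four five six seven eight nine ten eleven twelve"
), stringsAsFactors = FALSE)

# For each text, look up the words at positions -10 to -1
# (negative positions count from the end) and join them.
df1$last.ten <- vapply(
  df1$text,
  function(x) paste(word(x, -10:-1), collapse = " "),
  character(1),
  USE.NAMES = FALSE
)

print(df1$last.ten)
```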

uj5u.com helpful user replied:
If this is your data frame (toy data)
df1
text
1 one two three four five six seven eight nine ten eleven twelve
2 one two three four five six seven eight nine ten eleven twelve
3 one two three four five six seven eight nine ten eleven twelve
then extract the last 10 words like this
rnge <- 10:1
df1$last.ten <- apply( t(apply( as.data.frame(df1$text), 1, function(x)
rev( unlist( strsplit(x, " ") ) ) )[rnge,]), 1, paste, collapse=" " )
df1
text
1 one two three four five six seven eight nine ten eleven twelve
2 one two three four five six seven eight nine ten eleven twelve
3 one two three four five six seven eight nine ten eleven twelve
last.ten
1 three four five six seven eight nine ten eleven twelve
2 three four five six seven eight nine ten eleven twelve
3 three four five six seven eight nine ten eleven twelve
If you adjust the range rnge, this extracts words from any position (counted from the end of the text)
rnge <- 5:3
df1$mid <- apply( t(apply( as.data.frame(df1$text), 1, function(x)
rev( unlist( strsplit(x, " ") ) ) )[rnge,]), 1, paste, collapse=" " )
df1
text
1 one two three four five six seven eight nine ten eleven twelve
2 one two three four five six seven eight nine ten eleven twelve
3 one two three four five six seven eight nine ten eleven twelve
last.ten mid
1 three four five six seven eight nine ten eleven twelve eight nine ten
2 three four five six seven eight nine ten eleven twelve eight nine ten
3 three four five six seven eight nine ten eleven twelve eight nine ten
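One caveat with the apply() approach above: it relies on every row having the same number of words, because the inner apply() only returns a matrix in that case. Below is a minimal base R sketch of the same idea that also tolerates rows of different lengths. The helper name words_from_end is hypothetical, introduced here for illustration only.

```r
# Hypothetical helper: select words by position counted from the end.
# rnge = 10:1 yields the last ten words in their original order.
words_from_end <- function(text, rnge) {
  vapply(strsplit(text, "\\s+"), function(w) {
    picked <- rev(w)[rnge]                 # index into the reversed words
    paste(picked[!is.na(picked)], collapse = " ")  # drop positions that do not exist
  }, character(1))
}

df1 <- data.frame(text = c(
  "one two three four five six seven eight nine ten eleven twelve",
  "alpha beta gamma"
), stringsAsFactors = FALSE)

df1$last.ten <- words_from_end(df1$text, 10:1)
df1$mid <- words_from_end(df1$text, 5:3)
print(df1)
```

Positions beyond the start of a short text are silently dropped here; returning NA instead would be an equally reasonable design choice.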
