I am using R to extract the sentences that contain specific words ("everything", "along", "wrote") from the lyrics of a song. This is the song: Yellow - Coldplay
Look at the stars Look how they shine for you And everything you do Yeah, they were all yellow I came along I wrote a song for you [...]
I tried creating a vector with the lyrics, but the code does not run.
Reply from a uj5u.com user:
Here is a solution using the tidyverse. It is not perfect, and you will have to adapt it to your specific use case:
lyrics <- data.frame(yellow = "Look at the stars Look how they shine for you And everything you do Yeah, they were all yellow I came along I wrote a song for you And all the things you do And it was called Yellow So, then I took my turn What a thing to've done And it was all yellow Your skin Oh yeah, your skin and bones Turn in to something beautiful Do you know You know I love you so You know I love you so I swam across I jumped across for you What a thing to do 'Cause you were all yellow I drew a line")
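library(tidyverse)
lyrics %>%
  # put a marker in front of every uppercase letter (treated as a sentence start)
  mutate(yellow = gsub('([[:upper:]])', '<>\\1', yellow)) %>%
  # split on the marker so that each "sentence" gets its own row
  separate_rows(yellow, sep = "<>") %>%
  # flag the rows that contain any of the target words
  mutate(flag = str_detect(yellow, "everything|along|wrote")) %>%
  filter(flag)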
This gives us:
# A tibble: 3 x 2
  yellow                    flag
  <chr>                     <lgl>
1 "And everything you do "  TRUE
2 "I came along "           TRUE
3 "I wrote a song for you " TRUE
You will have to decide what counts as a sentence. Here, I treated every uppercase letter as the start of a new sentence.
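For anyone who prefers not to load the tidyverse, the same uppercase-letter heuristic can be sketched in base R. This is only an illustration, not part of the original answer: `extract_sentences()` is a hypothetical helper, the sample text is made up, and the whole-word matching with `\\b` plus the whitespace trimming are additions that the tidyverse version does not do.

```r
# Hypothetical helper, not part of the original answer.
extract_sentences <- function(text, words) {
  # Same heuristic as the answer: put a marker in front of every uppercase letter
  marked <- gsub("([[:upper:]])", "<>\\1", text)
  # Split on the marker, then strip the surrounding whitespace
  pieces <- trimws(strsplit(marked, "<>", fixed = TRUE)[[1]])
  # Drop the empty piece produced when the text starts with an uppercase letter
  pieces <- pieces[nzchar(pieces)]
  # \\b restricts matches to whole words, so "along" does not match "alongside"
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  pieces[grepl(pattern, pieces, perl = TRUE)]
}

# Made-up sample text, used only for illustration
sample_text <- "We walked along the river She wrote a letter Nothing happened Then everything changed"
result <- extract_sentences(sample_text, c("everything", "along", "wrote"))
print(result)
```

Keep in mind that the heuristic also splits on capitals in the middle of a sentence: with the lyrics, "And it was called Yellow" ends up as two pieces because of the capital "Y". If the real text has punctuation or line breaks, splitting on those is probably a better definition of a sentence.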
Tags: r
