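I have a data frame with two columns. cnn_handle contains Twitter handles, and tweet contains tweets that mention the Twitter handle in the corresponding row. However, most tweets also mention at least one other user/handle, denoted by @. I want to remove all rows in which the tweet contains more than one @.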
df
cnn_handle tweet
1 @DanaBashCNN @JohnKingCNN @DanaBashCNN @kaitlancollins @eliehonig @thelauracoates @KristenhCNN CNN you are still FAKE NEWS !!!
2 @DanaBashCNN @DanaBashCNN He could have made the same calls here, from SC.
3 @DanaBashCNN @DanaBashCNN GRAMMER ALERT: THAT'S FORMER PRESIDENT TRUMP Please don't forget this important point. Also please refrain from showing a pic of him till you have one in his casket. thank you
4 @brianstelter @eliehonig @brianstelter My apologies to you sir. Just seems like that story disappeared. Imo the nursing home scandal is just as bad.
5 @brianstelter @DrAndrewBaer1 @JGreenblattADL @brianstelter @CNN @TuckerCarlson @FoxNews Anti-Semite are you, Herr Doktor? How very Mengele of you.
6 @brianstelter @ma_makosh @Shortguy1 @brianstelter @ChrisCuomo Liberals, their feelings before facts and their crucifixion of people before due process. Never a presumption of innocence when it concerns the rival party. So un-American.
7 @andersoncooper @BrendonLeslie And Biden was a staunch opponent of "forced busing". He also said that integrating schools will cause a "racial jungle". But u won't hear this on @ChrisCuomo @jaketapper @Acosta @andersoncooper bc they continue to cover up the truth about Biden & his family.
8 @andersoncooper Anderson Cooper revealed that he "wanted a change" when reflecting on his break from news as #TheMole arrives on Netflix.
9 @andersoncooper @johnnydollar01 @newsbusters @drsanjaygupta @andersoncooper He was terrible as a host
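I suspect this needs some kind of regular expression, but I am not sure how to combine it with a greater-than comparison.
Desired result, i.e. only tweets that mention nothing but the corresponding cnn_handle: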
cnn_handle tweet
2 @DanaBashCNN @DanaBashCNN He could have made the same calls here, from SC.
3 @DanaBashCNN @DanaBashCNN GRAMMER ALERT: THAT'S FORMER PRESIDENT TRUMP Please don't forget this important point. Also please refrain from showing a pic of him till you have one in his casket. thank you
8 @andersoncooper Anderson Cooper revealed that he "wanted a change" when reflecting on his break from news as #TheMole arrives on Netflix.
Answer from a uj5u.com user:
A straightforward solution using str_count from stringr, on the premise that @ occurs only in Twitter handles:
base R:
library(stringr)
# Keep only rows whose tweet contains at most one "@"
df[str_count(df$tweet, "@") <= 1, ]
dplyr:
library(dplyr)
library(stringr)
df %>%
  filter(str_count(tweet, "@") <= 1)
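If adding stringr is not an option, the same count can probably be obtained in base R alone. The following is a minimal sketch, not part of the original answer; the small data frame is made up from abbreviated versions of the question's tweets, and the names at_count and result are only illustrative.

```r
# Hypothetical sample data modelled on the question (tweets abbreviated).
df <- data.frame(
  cnn_handle = c("@DanaBashCNN", "@DanaBashCNN", "@andersoncooper"),
  tweet = c(
    "@JohnKingCNN @DanaBashCNN CNN you are still FAKE NEWS !!!",
    "@DanaBashCNN He could have made the same calls here, from SC.",
    "Anderson Cooper revealed that he wanted a change."
  ),
  stringsAsFactors = FALSE
)

# Count literal "@" characters per tweet. regmatches() yields character(0)
# for a tweet without any match, so lengths() reports 0 for it.
at_count <- lengths(regmatches(df$tweet, gregexpr("@", df$tweet, fixed = TRUE)))

# Keep rows whose tweet contains at most one "@".
result <- df[at_count <= 1, ]
print(result)
```

With this sample, the first row is dropped because its tweet contains two @ characters, while the tweet without any @ is kept.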
Answer from a uj5u.com user:
Assuming your data frame is called tweets, just check whether there is more than one match for @ followed by text:
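# "@" followed by a letter, a dot, or a space
pattern <- "@[a-zA-Z. ]"
# gregexpr() returns a single -1 when nothing matches, so a length above 1
# means the tweet contains at least two matches
multiple_ats <- unlist(lapply(tweets$tweet, function(x) length(gregexpr(pattern, x)[[1]]) > 1))
# Keep the rows without multiple matches
tweets[!multiple_ats, ]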
Output:
# A tibble: 3 x 2
cnn_handle tweet
<chr> <chr>
1 @DanaBashCNN "@DanaBashCNN He could have made the same calls here, from SC."
2 @DanaBashCNN "@DanaBashCNN GRAMMER ALERT: THAT'S FORMER PRESIDENT TRUMP Please don't forget this important point.,Also please refrain from showing a pic of him till you have one in his casket.,thank you"
3 @andersoncooper "Anderson Cooper revealed that he \"wanted a change\" when reflecting on his break from news as #TheMole arrives on Netflix."
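Edit: If Twitter usernames are allowed to start with a digit or a special character, you will have to change the pattern. I don't know what the rules are.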
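As far as I know, Twitter handles may consist of letters, digits, and underscores, so a pattern along those lines may be safer. Below is a minimal sketch under that assumption; the sample data frame, the handle @2cool4school, and the variable names are made up for illustration and are not from the original answer.

```r
# Hypothetical sample data; @2cool4school is an invented handle that
# starts with a digit.
tweets <- data.frame(
  cnn_handle = c("@DanaBashCNN", "@DanaBashCNN", "@andersoncooper"),
  tweet = c(
    "@2cool4school @DanaBashCNN CNN you are still FAKE NEWS !!!",
    "@DanaBashCNN He could have made the same calls here, from SC.",
    "Anderson Cooper revealed that he wanted a change."
  ),
  stringsAsFactors = FALSE
)

# Assumed handle rule: "@" followed by one or more letters, digits, or "_".
handle_pattern <- "@[A-Za-z0-9_]+"

# Number of handle-like matches per tweet (0 when there is none).
n_mentions <- lengths(regmatches(tweets$tweet,
                                 gregexpr(handle_pattern, tweets$tweet)))

# For comparison: the original pattern skips the handle starting with a digit.
old_n <- lengths(regmatches(tweets$tweet,
                            gregexpr("@[a-zA-Z. ]", tweets$tweet)))

# Keep tweets with at most one mention.
kept <- tweets[n_mentions <= 1, ]
print(kept)
```

In this sample the original pattern finds only one match in the first tweet, so that row would be kept, whereas the broader pattern finds two and drops it.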
