I want to convert all the words in the 'Split Tweets' column to lowercase.
Here is my code:
def word_splitter(df):
    df['Split Tweets'] = df['Tweets'].str.split()
    df['Split Tweets'] = df['Split Tweets'].str.lower()
    df = df[['Tweets', 'Date', 'Split Tweets']]
    return df

word_splitter(twitter_df.copy())
This is the output I get:
Tweets Date Split Tweets
0 @BongaDlulane Please send an email to mediades... 2019-11-29 12:50:54 NaN
1 @saucy_mamiie Pls log a call on 0860037566 2019-11-29 12:46:53 NaN
2 @BongaDlulane Query escalated to media desk. 2019-11-29 12:46:10 NaN
3 Before leaving the office this afternoon, head... 2019-11-29 12:33:36 NaN
4 #ESKOMFREESTATE #MEDIASTATEMENT : ESKOM SUSPEN... 2019-11-29 12:17:43 NaN
... ... ... ...
195 Eskom's Visitors Centres’ facilities include i... 2019-11-20 10:29:07 NaN
196 #Eskom connected 400 houses and in the process... 2019-11-20 10:25:20 NaN
197 @ArthurGodbeer Is the power restored as yet? 2019-11-20 10:07:59 NaN
198 @MuthambiPaulina @SABCNewsOnline @IOL @eNCA @e... 2019-11-20 10:07:41 NaN
199 RT @GP_DHS: The @GautengProvince made a commit... 2019-11-20 10:00:09 NaN
This is the expected output:
word_splitter(twitter_df.copy())
Tweets Date Split Tweets
0 @BongaDlulane Please send an email to mediades... 2019-11-29 12:50:54 [@bongadlulane, please, send, an, email, to, m...
1 @saucy_mamiie Pls log a call on 0860037566 2019-11-29 12:46:53 [@saucy_mamiie, pls, log, a, call, on, 0860037...
2 @BongaDlulane Query escalated to media desk. 2019-11-29 12:46:10 [@bongadlulane, query, escalated, to, media, d...
3 Before leaving the office this afternoon, head... 2019-11-29 12:33:36 [before, leaving, the, office, this, afternoon...
4 #ESKOMFREESTATE #MEDIASTATEMENT : ESKOM SUSPEN... 2019-11-29 12:17:43 [#eskomfreestate, #mediastatement, :, eskom, s...
... ... ... ...
195 Eskom's Visitors Centres’ facilities include i... 2019-11-20 10:29:07 [eskom's, visitors, centres’, facilities, incl...
196 #Eskom connected 400 houses and in the process... 2019-11-20 10:25:20 [#eskom, connected, 400, houses, and, in, the,...
197 @ArthurGodbeer Is the power restored as yet? 2019-11-20 10:07:59 [@arthurgodbeer, is, the, power, restored, as,...
198 @MuthambiPaulina @SABCNewsOnline @IOL @eNCA @e... 2019-11-20 10:07:41 [@muthambipaulina, @sabcnewsonline, @iol, @enc...
199 RT @GP_DHS: The @GautengProvince made a commit... 2019-11-20 10:00:09 [rt, @gp_dhs:, the, @gautengprovince, made, a,...
What should I do?
A uj5u.com user replied:
You need to convert the Tweets strings to lowercase before splitting them. Use this instead:
df['Split Tweets'] = df['Tweets'].str.lower().str.split()
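To see the reordered call in context, here is a minimal sketch of the whole function. The two-row twitter_df below is made-up sample data standing in for the asker's real frame; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; the real twitter_df has 200 rows
twitter_df = pd.DataFrame({
    "Tweets": ["@BongaDlulane Query escalated to media desk.",
               "#Eskom connected 400 houses"],
    "Date": ["2019-11-29 12:46:10", "2019-11-20 10:25:20"],
})

def word_splitter(df):
    # Lowercase the raw strings first, then split on whitespace
    df["Split Tweets"] = df["Tweets"].str.lower().str.split()
    return df[["Tweets", "Date", "Split Tweets"]]

result = word_splitter(twitter_df.copy())
print(result["Split Tweets"].tolist())
```

Both steps go through the .str accessor while the column still holds plain strings, which is why the order matters.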
A uj5u.com user replied:
Please try this:
df['Split Tweets'] = df['Tweets'].apply(lambda x:x.lower().split())
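One caveat with the apply version: the lambda calls x.lower() directly, so it raises an AttributeError if the column contains a missing value. The question does not say whether twitter_df has any, so the sketch below uses a made-up Series with one missing entry and adds an explicit guard.

```python
import pandas as pd

# Hypothetical data with one missing tweet
tweets = pd.Series(["Pls LOG a call", None])

# The .str accessor passes missing values through untouched
via_str = tweets.str.lower().str.split()

# A plain lambda needs a guard so non-strings are returned as-is
via_apply = tweets.apply(
    lambda x: x.lower().split() if isinstance(x, str) else x
)

print(via_str.tolist())
print(via_apply.tolist())
```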
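A uj5u.com user replied:
After str.split(), each entry in your df['Split Tweets'] column is a list rather than a single string, so str.lower() cannot operate on it and returns NaN.
You can change the order of the two calls, as the other answers/comments here suggest, or you can keep the split and apply str.lower to every element of each list, using map inside a lambda function: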
df['Split Tweets'] = df['Split Tweets'].map(lambda x: list(map(str.lower, x)))
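The behaviour described above can be reproduced on a single made-up tweet. This is only a sketch; the variable names are illustrative and not from the question.

```python
import pandas as pd

# Hypothetical one-row example
tweets = pd.Series(["Query ESCALATED to media desk."])
split = tweets.str.split()

# Each element is now a list, so .str.lower() yields NaN for it
broken = split.str.lower()

# Lowercasing each word inside each list gives the expected result
fixed = split.map(lambda words: [w.lower() for w in words])

print(broken.tolist())
print(fixed.tolist())
```

The list comprehension does the same work as list(map(str.lower, x)) in the answer; which one to use is a matter of taste.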
