Replacing strings in a pandas DataFrame column

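I have a pandas DataFrame with a column named "content" that contains text. I want to remove certain words from each text in that column. I tried replacing each of these words with an empty string, but when I print the result of my function, the words have not been removed. My code is below: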

def replace_words(t):
  words = ['Livre', 'Chapitre', 'Titre', 'Chapter', 'Article' ]
  for i in t:
    if i in words:
      t.replace (i, '')
    else:
      continue
  print(t)


st = 'this is Livre and Chapitre and Titre and Chapter and Article'

replace_words(st)

An example of the expected result: 'this is and and and and'

With the code below, I want to apply the function above to every text in the "content" column:

df['content'].apply(lambda x: replace_words(x))

Can someone help me write a function that removes all the words I need, and then apply that function to every text in my df column?

A uj5u.com user replied:

You can use str.replace with a regular expression built from the word list.
Input:

import re

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID' : np.arange(4),
    'words' : ['this is Livre and Chapitre and Titre and Chapter and Article',
               'this is car and Chapitre and bus and Chapter and Article',
               'this is Livre and Chapitre',
               'nothing to replace']
})

words = ['Livre', 'Chapitre', 'Titre', 'Chapter', 'Article']
pat = '|'.join(map(re.escape, words))
print(pat)
# Livre|Chapitre|Titre|Chapter|Article

df['words'] = df['words'].str.replace(pat, '', regex=True)
print(df)

   ID                               words
0   0        this is  and  and  and  and 
1   1  this is car and  and bus and  and 
2   2                       this is  and 
3   3                  nothing to replace
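
The output above still contains doubled and trailing spaces, and the plain alternation would also match inside longer words. As a possible refinement, here is a minimal sketch that wraps the pattern in \b word boundaries and then normalizes whitespace. The sample Series and the name cleaned are made up for illustration and are not part of the original answer.

```python
import re

import pandas as pd

words = ['Livre', 'Chapitre', 'Titre', 'Chapter', 'Article']

# \b anchors each alternative to whole words, so 'Chapters' is left alone.
pat = r'\b(?:' + '|'.join(map(re.escape, words)) + r')\b'

# Hypothetical sample data for illustration.
s = pd.Series(['this is Livre and Chapitre and Titre',
               'Chapters stay, Chapter goes',
               'nothing to replace'])

cleaned = (
    s.str.replace(pat, '', regex=True)      # drop the listed words
     .str.replace(r'\s+', ' ', regex=True)  # collapse runs of whitespace
     .str.strip()                           # trim leading/trailing spaces
)
print(cleaned.tolist())
# ['this is and and', 'Chapters stay, goes', 'nothing to replace']
```

Whether the whitespace cleanup is wanted depends on the data; if the original spacing matters, the first str.replace call alone is enough.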

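A uj5u.com user replied:

Two problems:

Looping with for i in t: makes each i a single character, not a word.
t.replace does not work in place; it returns a new string, so the result has to be assigned back to t.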
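
Both problems can be reproduced in plain Python without pandas. The snippet below is only an illustration; the variable names chars, tokens and cleaned are made up for this example.

```python
st = 'this is Livre and Chapitre'

# Problem 1: iterating over a string yields single characters, not words.
chars = [c for c in st][:4]
print(chars)   # ['t', 'h', 'i', 's']

# Splitting on whitespace yields the words instead.
tokens = st.split()
print(tokens)  # ['this', 'is', 'Livre', 'and', 'Chapitre']

# Problem 2: strings are immutable, so str.replace returns a new string.
# Calling it without using the return value changes nothing.
st.replace('Livre', '')
print(st)      # this is Livre and Chapitre (unchanged)

# Assigning the return value keeps the change.
cleaned = st.replace('Livre', '')
print(cleaned) # this is  and Chapitre (note the doubled space)
```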

Use this:

def replace_words(t):
    words = ['Livre', 'Chapitre', 'Titre', 'Chapter', 'Article']
    for i in t.split(' '):
        # print(i)  # uncomment to see problem 1
        if i in words:
            # str.replace returns a new string, so assign it back
            t = t.replace(i, '')
    return t

Edit: you can call df['col'].apply(replace_words) directly; the lambda wrapper is not needed.
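
Note that apply returns a new Series rather than modifying the column, so the result has to be assigned back. Below is a small sketch of the full round trip. The sample DataFrame is invented for illustration, and replace_words here is a compact variant that filters tokens instead of calling str.replace, which also avoids leftover double spaces; it is not the function from the answer above.

```python
import pandas as pd

def replace_words(t):
    words = ['Livre', 'Chapitre', 'Titre', 'Chapter', 'Article']
    # Keep only the whitespace-separated tokens that are not in the list.
    return ' '.join(w for w in t.split() if w not in words)

# Hypothetical sample data for illustration.
df = pd.DataFrame({'content': ['this is Livre and Chapitre',
                               'nothing to replace']})

# apply returns a new Series; assign it back to keep the result.
df['content'] = df['content'].apply(replace_words)
print(df['content'].tolist())
# ['this is and', 'nothing to replace']
```

For large columns, the vectorized str.replace approach is usually the better fit; apply with a Python function runs once per row.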
