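I am trying to replace a large number of strings (only a few examples are shown here, but in practice I have thousands) with other strings defined in "replaceWord".

The entries in "replaceWord" follow no pattern.

However, the code I wrote does not work as I expected.

After running the script, the output is as follows: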

     before     after
0  test1234  test1234
1  test1234  test1234
2  test1234      1349
3  test1234  test1234
4  test1234  test1234

I need the following output:

  before    after
1 test1234  1349
2 test9012  te1210st
3 test5678  8579
4 april     I was born August
5 mcdonalds i like chicken

Script:

import os.path, time, re
import pandas as pd
import csv


body01_before="test1234"
body02_before="test9012"
body03_before="test5678"
body04_before="i like mcdonalds"
body05_before="I was born april"

replaceWord = [
                ["test9012","te1210st"],
                ["test5678","8579"],
                ["test1234","1349"],
                ["april","August"],
                ["mcdonalds","chicken"],

]

cols = ['before','after']
df = pd.DataFrame(index=[], columns=cols)

for word in replaceWord:
    
    body01_after = re.sub(word[0], word[1], body01_before)
    body02_after = re.sub(word[0], word[1], body02_before)
    body03_after = re.sub(word[0], word[1], body03_before)
    body04_after = re.sub(word[0], word[1], body04_before)
    body05_after = re.sub(word[0], word[1], body05_before)

    df=df.append({'before':body01_before,'after':body01_after}, ignore_index=True)
    
#df.head()
print(df)

df.to_csv('test_replace.csv')

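A uj5u.com user replied:

Use a regular expression that captures the non-digits (\D+) as the first group and the digits (\d+) as the second group. In the replacement text, put the second group \2 first, followed by the first group \1.

df['after'] = df['before'].str.replace(r'(\D+)(\d+)', r'\2\1', regex = True)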

df
     before     after
1  test1234  1234test
2  test9012  9012test
3  test5678  5678test
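
As a quick check of the pattern outside pandas, the same group swap can be sketched with the standard library's re.sub. The words list is just sample input taken from the question.

```python
import re

# Sample inputs from the question (illustrative only)
words = ["test1234", "test9012", "test5678"]

# (\D+) captures the leading non-digits, (\d+) the digits that follow;
# the replacement emits group 2 first, then group 1.
swapped = [re.sub(r"(\D+)(\d+)", r"\2\1", w) for w in words]

print(swapped)
```

Note that this only reorders the two parts of each string. It does not apply the arbitrary pairs listed in replaceWord.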

Edit

It seems you do not have a dataset. You have variables:

body01_before="test1234"
body02_before="test9012"
body03_before="test5678"
body04_before="i like mcdonalds"
body05_before="I was born april"

replaceWord = [
                ["test9012","te1210st"],
                ["test5678","8579"],
                ["test1234","1349"],
                ["april","August"],
                ["mcdonalds","chicken"],

]

# Gather the names of the body0N_... variables in a list
var_names = re.findall('body0\\d[^,]+', ','.join(globals().keys()))
df = pd.DataFrame(var_names, columns = ['before_1'])
# Obtain the value of each variable from its name
df['before'] = df['before_1'].apply(lambda x:eval(x))

# Replacement function: return the mapped word, or the matched word itself if it has no mapping
repl = lambda x: x[0] if (rp:=dict(replaceWord).get(x[0])) is None else rp

# Do the replacement on every word (\w+) in each string
df['after'] = df['before'].str.replace('(\\w+)',repl, regex= True)

df
        before_1            before              after
0  body01_before          test1234               1349
1  body02_before          test9012           te1210st
2  body03_before          test5678               8579
3  body04_before  i like mcdonalds     i like chicken
4  body05_before  I was born april  I was born August
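
The same lookup idea can be sketched without globals(), eval, or pandas. This is a minimal version that assumes the strings are already in a list; bodies, mapping, and replace_known_words are names introduced here for illustration. The dict is built once rather than on every match, which may matter with thousands of entries.

```python
import re

replaceWord = [
    ["test9012", "te1210st"],
    ["test5678", "8579"],
    ["test1234", "1349"],
    ["april", "August"],
    ["mcdonalds", "chicken"],
]

# Hypothetical container: the question's body0N_before values gathered in a list
bodies = ["test1234", "test9012", "test5678", "i like mcdonalds", "I was born april"]

# Build the lookup table once
mapping = dict(replaceWord)

def replace_known_words(text):
    """Replace each whole word found in mapping; leave other words unchanged."""
    return re.sub(r"\w+", lambda m: mapping.get(m.group(0), m.group(0)), text)

results = [replace_known_words(b) for b in bodies]
for before, after in zip(bodies, results):
    print(f"{before} -> {after}")
```

Because the pattern matches whole \w+ tokens, a key is replaced only when it appears as a complete word, so a string such as test1234x would be left unchanged.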

A uj5u.com user replied:

Does this serve your purpose?

words = ["test9012", "test5678", "test1234"]
updated = []

for word in words:
    for i, char in enumerate(word):
        if 47 < ord(char) < 58: # the character codes for digits 0-9
            updated.append(f"{word[i:]}{word[:i]}")
            break

print(updated)

The code prints: ['9012test', '5678test', '1234test']

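A uj5u.com user replied:

As I understand it, you have a list of strings and a mapping dictionary of the form {oldString1: newString1, oldString2: newString2, ...} that you want to use to replace the strings in the original list. The fastest (and perhaps most Pythonic) approach I can think of is simply to store the mapping as a Python dict. For example: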

mapping = {
   "test9012": "9012test",
   "test5678": "5678test",
   "test1234": "1234test",
}

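If your strings are stored in a Python list, you can get the replaced list with the following code:

new_list = [mapping.get(old_string, old_string) for old_string in old_list]

Note: we pass old_string as the second (default) argument of mapping.get() so that the call returns old_string itself when it is not in the mapping dictionary. dict.get() accepts positional arguments only, so the arguments cannot be written as key=... and default=....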
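
To make the fallback behaviour concrete, here is a small sketch using pairs from the question. The contents of old_list are sample data assumed for illustration.

```python
mapping = {
    "test9012": "te1210st",
    "test5678": "8579",
    "test1234": "1349",
}

# Sample input (assumed); the last two entries are not keys in mapping
old_list = ["test1234", "test9012", "unknown", "i like mcdonalds"]

# dict.get(key, default) returns the default when the key is missing
new_list = [mapping.get(old_string, old_string) for old_string in old_list]

print(new_list)
```

This approach replaces whole strings only. An entry such as "i like mcdonalds" stays unchanged unless the full string is itself a key, so replacing words inside longer strings needs a regex or str.replace based approach instead.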

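If your strings are stored in a pandas Series (or a column of a pandas DataFrame), you can quickly replace them with:

new_list = old_list.map(mapping).fillna(old_list)

Note: Series.map yields NaN for any string that is not a key in the mapping dictionary, so fillna(old_list) puts the original old_string back in those positions. Setting na_action='ignore' would not do this; it only skips values that are already NaN in the input.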

A uj5u.com user replied:

You can use a regular expression to match the pattern.

import re
import pandas as pd

words = ["test9012", "test5678", "test1234"]

rows = []
for word in words:
  # Match the leading run of letters
  textOnlyMatch = re.match("[A-Za-z]*", word)
  textOnly = textOnlyMatch.group(0)  # take the entire match
  numberPart = word[len(textOnly):]  # take everything after the letters
  result = numberPart + textOnly
  rows.append({'before': word, 'after': result})

# DataFrame.append was removed in pandas 2.0, so build the frame from the collected rows
df = pd.DataFrame(rows, columns=['before', 'after'])

#df.head()
print(df)

df.to_csv('test_replace.csv')

So, by matching with a regular expression, you can separate the letters-only part from the digits-only part.

Please credit the source when reposting. Link to this article: https://www.uj5u.com/gongcheng/464519.html
