python替換沒有空格的正則運算式匹配-有解無憂

我基本上想“加入”應該清楚地在一起的數字。我想用它自己替換正則運算式匹配，但沒有任何空格。

我有：

df
               a
'Fraxiparine 9 500 IU (anti-Xa)/1 ml'
'Colobreathe 1 662 500 IU inhala?ny prá?ok v tvrdej kapsule'

我希望有：

df
               a
'Fraxiparine 9500 IU (anti-Xa)/1 ml'
'Colobreathe 1662500 IU inhala?ny prá?ok v tvrdej kapsule'

我r'\d \s \d \s*\d '用來匹配數字，我創建了以下函式來洗掉字串中的空格：

def spaces(x):
    match = re.findall(r'\d \s \d \s*\d ', x)
    return match.replace(" ","")

現在我無法將該函式應用于完整的資料幀，但我也不知道如何用沒有任何空格的字串替換原始匹配。

uj5u.com熱心網友回復：

嘗試使用以下代碼：

def spaces(s):
    return re.sub('(?<=\d) (?=\d)', '', s)

df['a'] = df['a'].apply(spaces)

正則運算式將匹配：

任何空間
前面有一個數字(?<=\d)
然后是一個數字(?=\d)。

然后，pandas.Series.apply函式會將您的函式應用于資料框的所有行。

輸出：

0   Fraxiparine 9500 IU (anti-Xa)/1 ml
1   Colobreathe 1662500 IU inhala?ny prá?ok v tvrd...

uj5u.com熱心網友回復：

我相信您的問題可以通過稍微調整您的函式來解決，以便應用于整個字串“匹配”，如下所示：

import pandas as pd
import re

df = pd.DataFrame({'a' : ['Fraxiparine 9 500 IU (anti-Xa)/1 ml','Colobreathe 1 662 500 IU inhala?ny prá?ok v tvrdej kapsule']})

# your function
def spaces(x):
    match = re.findall(r'\d \s \d \s*\d ', x)
    replace_with = match[0].replace(" ","")
    return x.replace(match[0], replace_with)

# now apply it on the whole dataframe, row per row
df['a'] = df['a'].apply(lambda x: spaces(x))

uj5u.com熱心網友回復：

利用

df['a'] = df['a'].str.replace(r'(?<=\d)\s (?=\d)', '', regex=True)

解釋

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
    \d                       digits (0-9)
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and &quot; &quot;) (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \d                       digits (0-9)
--------------------------------------------------------------------------------
  )                        end of look-ahead

如果您的計劃是僅在以下位置洗掉空格\d \s \d \s*\d ：

df['a'] = df['a'].str.replace(r'\d \s \d \s*\d ', lambda m: re.sub(r'\s ', '', m.group()), regex=True)

見str.replace：

repl : str 或可呼叫
替換字串或可呼叫。可呼叫物件傳遞正則運算式匹配物件，并且必須回傳要使用的替換字串。參見 re.sub()。

轉載請註明出處，本文鏈接：https://www.uj5u.com/shujuku/485695.html

標籤：Python 正则表达式细绳

上一篇：正則運算式拆分文本。括號中帶空格的特殊情況

下一篇：設定alertmanager以按作業名稱將警報分發到不同的通道