用字串模式映射熊貓系列-有解無憂

嗨，我想使用字串模式映射熊貓系列

s=pd.DataFrame([['AMcU8', 10], ['AM8v', 15], ['ASw9', 14],['ASw7', 14]], columns = ['Code', 'Quantity'])

s["newcode"]=s["Code"].map({"AM.*8.*" : "AM8", "AS.*9.*" : "AS9"})

但我明白了：

   Code  Quantity newcode
0  AMcU8       10     NaN
1  AM8v        15     NaN
2  ASw9        14     NaN
3  ASw7        14     NaN

代替：

   Code  Quantity newcode
0  AMcU8       10     AM8
1  AM8v        15     AM8
2  ASw9        14     AS9
3  ASw7        14     NaN

任何的想法？當它沒有找到匹配時，得到一個 NaN 很好

uj5u.com熱心網友回復：

您可以Series.replace將引數regex設定為映射字典（檔案）：

s["newcode"] = s["Code"].replace(regex={"AM.*8.*":"AM8", "AS.*9.*": "AS9"})

它產生：

    Code    Quantity    newcode
0   AMcU8   10          AM8
1   AM8v    15          AM8
2   ASw9    14          AS9
3   ASw7    14          ASw7

請注意，不匹配的模式保持不變。

uj5u.com熱心網友回復：

據我所知，沒有執行此操作的直接功能。

您可以使用apply()and執行此操作，re并按如下方式遍歷您的映射字典：

mapping = {"AM.*8" : "AM8", "AS.*9" : "AS9"}
import re

def regex_mapping(x):
    for k, v in mapping.items():
        if re.match(k, x):
            return re.sub(k, v, x)
    return x

s['Code'].apply(regex_mapping)

輸出：

0     AM8
1     AM8
2     AS9
3    ASw7
Name: Code, dtype: object

uj5u.com熱心網友回復：

據我所知，您不能向Series.map().

但是，這可以滿足您的需求：

import re
import pandas as pd

s = pd.DataFrame([['AMcU8', 10], ['AM8', 15], ['ASw9', 14], ['ASw7', 14]], columns=['Code', 'Quantity'])


def regex_replace(x, map: dict = None):
    for regex, replacement in map.items():
        if re.match(regex, x):
            return replacement
    else:
        return x


s["newcode"] = s["Code"].apply(regex_replace, map={"AM.*8": "AM8", "AS.*9": "AS9"})

或者，如果您經常將其應用于大型 DataFrame 并希望它在這種情況下更快更高效：

import re
import pandas as pd
from functools import partial

s = pd.DataFrame([['AMcU8', 10], ['AM8', 15], ['ASw9', 14], ['ASw7', 14]], columns=['Code', 'Quantity'])


def regex_replace(map: dict = None, x=None):
    for regex, replacement in map.items():
        if regex.match(x):
            return replacement
    else:
        return x

mapping = partial(regex_replace, {re.compile("AM.*8"): "AM8", re.compile("AS.*9"): "AS9"})
s["newcode"] = s["Code"].apply(mapping)

轉載請註明出處，本文鏈接：https://www.uj5u.com/yidong/331597.html

標籤：Python 数据框映射

上一篇：使用pandas在python中建立索引后更改資料框的列名

下一篇：如何使用df.column.str.contains()獲得與下面代碼相同的結果？