How do I convert a string to a dictionary with a condition?

I have a DataFrame (very large, millions of rows). This is what it looks like:

id     value
a1     0:0,1:10,2:0,3:0,4:7
b4     0:5,1:0,2:0,3:0,4:1
c5     0:0,1:3,2:2,3:0,4:0
k2     0:0,1:2,2:0,3:4,4:0

I want to turn these strings into dictionaries, keeping only the key-value pairs that contain no 0. So the desired result is:

id          value
a1       {1:10, 4:7}
b4       {4:1}
c5       {1:3, 2:2}
k2       {1:2}

How can I do this? When I tried the dict() function, it raised KeyError: 0:

df["value"] = dict(df["value"])

So I already have a problem just turning it into a dictionary.

I also tried this:

df["value"] = json.loads(df["value"])

But it raised the same error.

uj5u.com user replied:

This solves the problem using only a list comprehension:

import pandas as pd

dt = pd.DataFrame({"id":["a1", "b4", "c5", "k2"], 
                   "value":["0:0,1:10,2:0,3:0,4:7","0:5,1:0,2:0,3:0,4:1","0:0,1:3,2:2,3:0,4:0","0:0,1:2,2:0,3:4,4:0"]})


def to_dict1(s):
    return [dict([map(int, y.split(":")) for y in x.split(",") if "0" not in y.split(":")]) for x in s]

            
dt["dict"] = to_dict1(dt["value"])

Another way to get the same result is a regular expression (the pattern (?!0{1})(\d) matches any digit except a single 0):

import re

def to_dict2(s):
    return [dict([map(int, y) for y in re.findall(r"(?!0{1})(\d):(?!0{1})(\d+)", x)]) for x in s]

According to my tests, to_dict1 is almost 20% faster.
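The speed comparison can be reproduced with a small timing sketch like the one below. It is only an illustration: the workload size (the four sample strings repeated 1,000 times) and the number of runs are arbitrary assumptions, and the regex is written as `(?!0)(\d):(?!0)(\d+)`, an assumed reading of the pattern described in this answer. The measured ratio will vary by machine and Python version.

```python
import re
import timeit

# Sample strings from the question, repeated to get a measurable workload.
# The repetition factor is an arbitrary choice for illustration.
values = [
    "0:0,1:10,2:0,3:0,4:7",
    "0:5,1:0,2:0,3:0,4:1",
    "0:0,1:3,2:2,3:0,4:0",
    "0:0,1:2,2:0,3:4,4:0",
] * 1_000


def to_dict1(s):
    # Split-based version: drop any pair whose key or value is the string "0".
    return [dict([map(int, y.split(":")) for y in x.split(",") if "0" not in y.split(":")]) for x in s]


# Assumed pattern: a nonzero single-digit key, then a value that does not start with 0.
PATTERN = re.compile(r"(?!0)(\d):(?!0)(\d+)")


def to_dict2(s):
    # Regex-based version: findall returns (key, value) string tuples.
    return [dict([map(int, y) for y in PATTERN.findall(x)]) for x in s]


# Both versions should agree before their speed is compared.
same_result = to_dict1(values) == to_dict2(values)

t1 = timeit.timeit(lambda: to_dict1(values), number=5)
t2 = timeit.timeit(lambda: to_dict2(values), number=5)
print(f"to_dict1: {t1:.3f}s, to_dict2: {t2:.3f}s, same result: {same_result}")
```

Note that the regex version only handles single-digit keys, which is enough for the sample data but would need adjusting for keys of 10 or more.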

uj5u.com user replied:

This code produces the result you want. I used the sample input you provided and printed the expected result at the end.

import pandas as pd

df = pd.DataFrame(
    {
        'id': ['a1', 'b4', 'c5', 'k2'],
        'value': ['0:0,1:10,2:0,3:0,4:7', '0:5,1:0,2:0,3:0,4:1', '0:0,1:3,2:2,3:0,4:0', '0:0,1:2,2:0,3:4,4:0']
    }
)

value = []  # temporary list holding one dict per row, with only the nonzero key/value pairs
for i, row in df.iterrows():
    pairs = row['value'].split(',')
    d = dict()
    for pair in pairs:
        k, v = pair.split(':')
        k = int(k)
        v = int(v)
        if (k != 0) and (v != 0):
            d[k] = v
    value.append(d)

df['value'] = pd.Series(value)

print(df)

#   id          value
#0  a1  {1: 10, 4: 7}
#1  b4         {4: 1}
#2  c5   {1: 3, 2: 2}
#3  k2   {1: 2, 3: 4}

uj5u.com user replied:

def make_dict(row):
   """ Requires a list of "key:value" strings, e.g.
       ["0:0", "1:10", ...]"""
   return {key: val for key, val 
           in map(lambda x: map(int, x.split(":")), row) 
           if key != 0 and val != 0}

df["value"] = df.value.str.split(",").apply(make_dict)
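To make the two steps in this answer concrete, here is a small sketch on two rows of the sample data from the question. The names `parts`, `first_parts` and `result` are introduced only for illustration; `make_dict` is the same function as above.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a1", "b4"],
    "value": ["0:0,1:10,2:0,3:0,4:7", "0:5,1:0,2:0,3:0,4:1"],
})


def make_dict(row):
    # row is a list of "key:value" strings, e.g. ["0:0", "1:10", ...]
    return {key: val for key, val
            in map(lambda x: map(int, x.split(":")), row)
            if key != 0 and val != 0}


# Step 1: .str.split(",") turns each string into a list of "key:value" strings.
parts = df["value"].str.split(",")
first_parts = list(parts.iloc[0])

# Step 2: .apply(make_dict) converts each list into a filtered dictionary.
result = parts.apply(make_dict).tolist()

print(first_parts)
print(result)
```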

uj5u.com user replied:

This is how I would do it:

def string_to_dict(s):
    d = {}
    pairs = s.split(',') # get each key:value pair
    for pair in pairs:
        key, value = map(int, pair.split(':')) # split key from value, as integers
        if key and value: # skip pairs whose key or value is zero
            d[key] = value
    return d

df['value'] = df['value'].apply(string_to_dict)

uj5u.com user replied:

Use a dictionary comprehension to exclude items whose key or value equals zero:

import pandas as pd

df = pd.DataFrame({"id":["a1", "b4", "c5", "k2"], 
               "value":["0:0,1:10,2:0,3:0,4:7","0:5,1:0,2:0,3:0,4:1","0:0,1:3,2:2,3:0,4:0","0:0,1:2,2:0,3:4,4:0"]})

results = []
for _, row in df.iterrows():
    pairs = (x.split(':') for x in row['value'].split(','))
    # keep a pair only if neither its key nor its value is 0
    results.append({int(k): int(v) for k, v in pairs if int(k) != 0 and int(v) != 0})
df['value'] = results  # replace the whole column at once

print(df)

Output:

   id          value
0  a1  {1: 10, 4: 7}
1  b4         {4: 1}
2  c5   {1: 3, 2: 2}
3  k2   {1: 2, 3: 4}

