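Creating a dictionary from a list of dictionaries by selecting specific values

I have a list of dictionaries, shown below, and I want to create a dictionary that stores specific data from the list.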

test_list = [
    {
        'id':1,
        'colour':'Red',
        'name':'Apple',
        'edible': True,
        'price':100
    },
    {
        'id':2,
        'colour':'Blue',
        'name':'Blueberry',
        'edible': True,
        'price':200
    },
    {
        'id':3,
        'colour':'Yellow',
        'name':'Crayon',
        'edible': False,
        'price':300
    }
]

For example, a new dictionary that stores only {id, name, price} for each item.

I created a few lists:

id_list = []
name_list = []
price_list = []

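Then I appended the data I want to each list:

for n in test_list:
    id_list.append(n['id'])
    name_list.append(n['name'])
    price_list.append(n['price'])

But I don't know how to create a dictionary (or a more suitable structure?) that stores the data in the {id, name, price} format I want. Thanks for the help!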

Answer from a uj5u.com user:

If you don't have too much data, you can use this nested list/dictionary comprehension:

keys = ['id', 'name', 'price']
result = {k: [x[k] for x in test_list] for k in keys}

This gives you:


{
  'id': [1, 2, 3],
  'name': ['Apple', 'Blueberry', 'Crayon'],
  'price': [100, 200, 300]
}
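As a small sketch connecting this back to the question: if the three lists (`id_list`, `name_list`, `price_list`) have already been built with the loop from the question, the same structure can be assembled directly from them. The name `from_lists` is only illustrative and does not appear in the original thread.

```python
test_list = [
    {'id': 1, 'colour': 'Red', 'name': 'Apple', 'edible': True, 'price': 100},
    {'id': 2, 'colour': 'Blue', 'name': 'Blueberry', 'edible': True, 'price': 200},
    {'id': 3, 'colour': 'Yellow', 'name': 'Crayon', 'edible': False, 'price': 300},
]

# The three lists from the question, filled by the same loop.
id_list, name_list, price_list = [], [], []
for n in test_list:
    id_list.append(n['id'])
    name_list.append(n['name'])
    price_list.append(n['price'])

# Assemble the dictionary of lists by hand (illustrative name).
from_lists = {'id': id_list, 'name': name_list, 'price': price_list}

# The comprehension from the answer builds the same structure in one step.
keys = ['id', 'name', 'price']
from_comprehension = {k: [x[k] for x in test_list] for k in keys}

print(from_lists == from_comprehension)  # True
```

The comprehension mainly saves the bookkeeping: adding another key means editing `keys` rather than adding a list and an `append` call.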

Answer from a uj5u.com user:

I think a list of dictionaries is still the right data format, so:

test_list = [
    {
        'id':1,
        'colour':'Red',
        'name':'Apple',
        'edible': True,
        'price':100
    },
    {
        'id':2,
        'colour':'Blue',
        'name':'Blueberry',
        'edible': True,
        'price':200
    },
    {
        'id':3,
        'colour':'Yellow',
        'name':'Crayon',
        'edible': False,
        'price':300
    }
]

keys = ['id', 'name', 'price']
limited = [{k: v for k, v in d.items() if k in keys} for d in test_list]

print(limited)

Result:

[{'id': 1, 'name': 'Apple', 'price': 100}, {'id': 2, 'name': 'Blueberry', 'price': 200}, {'id': 3, 'name': 'Crayon', 'price': 300}]

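This is nice because you can access parts of it, for example limited[1]['price'].

However, if you don't mind using a third-party library, your use case is a great fit for pandas: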
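One detail of the comprehension above may be worth spelling out: because it filters `d.items()`, a record that lacks one of the wanted keys is silently shortened. The following is a minimal sketch of a stricter variant that looks each key up directly and so raises `KeyError` instead. The names `filtered`, `strict`, and `incomplete` are made up for this illustration.

```python
test_list = [
    {'id': 1, 'colour': 'Red', 'name': 'Apple', 'edible': True, 'price': 100},
    {'id': 2, 'colour': 'Blue', 'name': 'Blueberry', 'edible': True, 'price': 200},
    {'id': 3, 'colour': 'Yellow', 'name': 'Crayon', 'edible': False, 'price': 300},
]
keys = ['id', 'name', 'price']

# Filtering variant (as in the answer): keeps whichever wanted keys exist.
filtered = [{k: v for k, v in d.items() if k in keys} for d in test_list]

# Strict variant: looks up every wanted key, so a missing key raises KeyError.
strict = [{k: d[k] for k in keys} for d in test_list]

print(filtered == strict)  # True, since every record has all three keys

# Hypothetical record without a price, to show where the two variants differ.
incomplete = {'id': 4, 'name': 'Mystery'}
shortened = {k: v for k, v in incomplete.items() if k in keys}
print(shortened)  # {'id': 4, 'name': 'Mystery'}

try:
    {k: incomplete[k] for k in keys}
    missing = None
except KeyError as exc:
    missing = exc.args[0]
print(missing)  # price
```

Which behaviour is preferable depends on whether incomplete records should be tolerated or flagged.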

import pandas as pd

test_list = [
    {
        'id':1,
        'colour':'Red',
        'name':'Apple',
        'edible': True,
        'price':100
    },
    {
        'id':2,
        'colour':'Blue',
        'name':'Blueberry',
        'edible': True,
        'price':200
    },
    {
        'id':3,
        'colour':'Yellow',
        'name':'Crayon',
        'edible': False,
        'price':300
    }
]

df = pd.DataFrame(test_list)

print(df['price'][1])
print(df)

A DataFrame is well suited to this kind of thing, and selecting only the columns you need is straightforward:

keys = ['id', 'name', 'price']
df_limited = df[keys]
print(df_limited)
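If plain Python structures are needed again after working in pandas, `DataFrame.to_dict` can convert the selected columns back. The sketch below assumes pandas is installed and shows both shapes discussed in this thread. The names `records` and `columns` are illustrative only.

```python
import pandas as pd

test_list = [
    {'id': 1, 'colour': 'Red', 'name': 'Apple', 'edible': True, 'price': 100},
    {'id': 2, 'colour': 'Blue', 'name': 'Blueberry', 'edible': True, 'price': 200},
    {'id': 3, 'colour': 'Yellow', 'name': 'Crayon', 'edible': False, 'price': 300},
]
keys = ['id', 'name', 'price']
df_limited = pd.DataFrame(test_list)[keys]

# One dictionary per row: the list-of-dictionaries shape.
records = df_limited.to_dict(orient='records')

# One list per column: the dictionary-of-lists shape.
columns = df_limited.to_dict(orient='list')

print(records)
print(columns)
```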

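The reason I prefer this over a dictionary of lists is that manipulating a dictionary of lists gets complicated and error-prone, and accessing a single record means reading from three separate lists. That approach has few advantages, except perhaps that some operations on the lists will be faster if you mostly access a single attribute. But in that case, pandas wins handily.

In the comments you asked: "Let's say I have item_names = ['Apple', 'Teddy', 'Crayon'] and I want to check if one of those item names is in the df_limited variable, or I guess df_limited['name'] - is there a way to do that, and if so, print, say, the price, or manipulate the price?"

There are many ways, of course. I recommend looking into some online pandas tutorials, because it's a very popular library and there's excellent documentation and teaching material online.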
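To make the trade-off concrete, here is a small sketch that reads the same record from both structures. The values are copied from the results shown earlier, and the variable names `record_from_lists` and `record_from_dicts` are illustrative.

```python
# Dictionary of lists (shape produced by the first answer).
result = {
    'id': [1, 2, 3],
    'name': ['Apple', 'Blueberry', 'Crayon'],
    'price': [100, 200, 300],
}
# List of dictionaries (shape produced by this answer).
limited = [
    {'id': 1, 'name': 'Apple', 'price': 100},
    {'id': 2, 'name': 'Blueberry', 'price': 200},
    {'id': 3, 'name': 'Crayon', 'price': 300},
]

i = 1  # position of the record of interest

# Dictionary of lists: the record has to be reassembled from every list.
record_from_lists = {k: values[i] for k, values in result.items()}

# List of dictionaries: the record is already a single object.
record_from_dicts = limited[i]

print(record_from_lists == record_from_dicts)  # True

# Reading one attribute for all records is where the dictionary of lists is direct.
print(result['price'])                # [100, 200, 300]
print([d['price'] for d in limited])  # [100, 200, 300]
```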

However, just to show how easy it would be in both cases, retrieving the matching objects or just the prices for them:

item_names = ['Apple', 'Teddy', 'Crayon']

items = [d for d in test_list if d['name'] in item_names]
print(items)
item_prices = [d['price'] for d in test_list if d['name'] in item_names]
print(item_prices)

items = df[df['name'].isin(item_names)]
print(items)
item_prices = df[df['name'].isin(item_names)]['price']
print(item_prices)

Results:

[{'id': 1, 'colour': 'Red', 'name': 'Apple', 'edible': True, 'price': 100}, {'id': 3, 'colour': 'Yellow', 'name': 'Crayon', 'edible': False, 'price': 300}]
[100, 300]

   id    name  price
0   1   Apple    100
2   3  Crayon    300
0    100
2    300

In the example with the dataframe there are a few things to note. It uses .isin() because the plain in operator won't work with the boolean selection syntax that dataframes offer, df[<some condition on df using df>], but pandas has fast and easy-to-use alternatives for all standard operations. More importantly, you can do the work on the original df: it already has everything you need.

And let's say you wanted to double the prices for these products:

df.loc[df['name'].isin(item_names), 'price'] *= 2

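The .loc is needed for technical reasons (you can't modify just any view of a dataframe), but that is too much to get into in this answer; you'll learn about it as you look into pandas. It's pretty clean and simple though, I'm sure you agree. (You could use .loc for the previous example as well.)

In this simple example both approaches run instantly, but you'll find that pandas performs better on very large datasets. Also, try writing the same example with the structure you asked for (as provided in the accepted answer) and you'll find it's not as elegant, unless you zip everything back together:

item_prices = [p for i, n, p in zip(*result.values()) if n in item_names]

Getting a result with the same structure as result would be trickier still, involving more zipping and unpacking, or requiring you to go over the lists twice.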
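As a rough sketch of what that extra work could look like, the following filters the dictionary of lists by name and rebuilds a dictionary of lists from the surviving rows. It assumes every list in `result` has the same length; the names `rows`, `name_pos`, and `filtered` are illustrative.

```python
result = {
    'id': [1, 2, 3],
    'name': ['Apple', 'Blueberry', 'Crayon'],
    'price': [100, 200, 300],
}
item_names = ['Apple', 'Teddy', 'Crayon']

keys = list(result)            # ['id', 'name', 'price']
name_pos = keys.index('name')  # column position of the name within each row

# Zip the columns into row tuples and keep only the matching rows.
rows = [row for row in zip(*result.values()) if row[name_pos] in item_names]

# Unpack the rows back into one list per key (also works when rows is empty).
filtered = {k: [row[i] for row in rows] for i, k in enumerate(keys)}

print(filtered)
# {'id': [1, 3], 'name': ['Apple', 'Crayon'], 'price': [100, 300]}
```

Compared with the one-line filters on the list of dictionaries or the DataFrame, this needs a pass to zip the columns into rows and another to split them apart again.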

