Python Pandas: changing rows and headers

I have the following array: [['fic10', {'bulle_naif': '55'}, {'bulle_bool': '52'}, {'bulle_opt': '39'}, {'selection': '45'}, {'insertion': '20'}, {'rapide': '60'}], ['fic100', {'bulle_naif': '5050'}, {'bulle_bool': '5050'}, {'bulle_opt': '4816'}, {'selection': '4950'}, {'insertion': '2221'}, {'rapide': '6697'}], ['fic1000', {'bulle_naif': '2623195'}, {'bulle_bool': '1789209'}, {'bulle_opt': '2618499'}, {'selection': '2620905'}, {'insertion': '1535788'}, {'rapide': '1323294'}], ['fic10000', {'bulle_naif': '4764881010'}, {'bulle_bool': '926117379'}, {'bulle_opt': '4764749559'}, {'selection': '4764783390'}, {'insertion': '900955079'}, {'rapide': '506697139'}]]

I converted it into a DataFrame:

[Image: Python Pandas: changing rows and headers]

Answer from a uj5u.com user:

Let's present your data in a form that is easier for pandas.DataFrame to understand.

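First method: a dict of `key: list` with one entry per column

The goal of this method is to rearrange your data so that it looks like a single dict, with one entry per column, each holding the list of that column's values: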

# {'bulle_naif': ['55', '5050', '2623195', '4764881010'],
#  'bulle_bool': ['52', '5050', '1789209', '926117379'],
#  'bulle_opt': ['39', '4816', '2618499', '4764749559'],
#  'selection': ['45', '4950', '2620905', '4764783390'],
#  'insertion': ['20', '2221', '1535788', '900955079'],
#  'rapide': ['60', '6697', '1323294', '506697139'],
#  'name': ['fic10', 'fic100', 'fic1000', 'fic10000']}

Here is the code that performs this transformation:

import pandas as pd

raw_data = [['fic10', {'bulle_naif': '55'}, {'bulle_bool': '52'}, {'bulle_opt': '39'}, {'selection': '45'}, {'insertion': '20'}, {'rapide': '60'}], ['fic100', {'bulle_naif': '5050'}, {'bulle_bool': '5050'}, {'bulle_opt': '4816'}, {'selection': '4950'}, {'insertion': '2221'}, {'rapide': '6697'}], ['fic1000', {'bulle_naif': '2623195'}, {'bulle_bool': '1789209'}, {'bulle_opt': '2618499'}, {'selection': '2620905'}, {'insertion': '1535788'}, {'rapide': '1323294'}], ['fic10000', {'bulle_naif': '4764881010'}, {'bulle_bool': '926117379'}, {'bulle_opt': '4764749559'}, {'selection': '4764783390'}, {'insertion': '900955079'}, {'rapide': '506697139'}]]

cleaned_data = { k: [] for d in raw_data[0][1:] for k in d.keys() }
cleaned_data['name'] = []
for row in raw_data:
    cleaned_data['name'].append(row[0])
    for d in row[1:]:
        for k,v in d.items():
            cleaned_data[k].append(v)

print(cleaned_data)
# {'bulle_naif': ['55', '5050', '2623195', '4764881010'],
#  'bulle_bool': ['52', '5050', '1789209', '926117379'],
#  'bulle_opt': ['39', '4816', '2618499', '4764749559'],
#  'selection': ['45', '4950', '2620905', '4764783390'],
#  'insertion': ['20', '2221', '1535788', '900955079'],
#  'rapide': ['60', '6697', '1323294', '506697139'],
#  'name': ['fic10', 'fic100', 'fic1000', 'fic10000']}


# IMPORTANT NOTE
# This cleaning-up is a bit careless
# If a key is missing in one of the lists, the resulting data will be misaligned

# Making sure data is not misaligned:
assert all(len(values) == len(cleaned_data['name']) for values in cleaned_data.values())

good_dataframe = pd.DataFrame(cleaned_data).set_index('name')
print(good_dataframe)

#           bulle_naif bulle_bool   bulle_opt   selection  insertion     rapide
# name                                                                         
# fic10             55         52          39          45         20         60
# fic100          5050       5050        4816        4950       2221       6697
# fic1000      2623195    1789209     2618499     2620905    1535788    1323294
# fic10000  4764881010  926117379  4764749559  4764783390  900955079  506697139
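
The misalignment risk mentioned in the comments above can be avoided by merging each row's single-key dicts into one record per row and letting pandas align the columns by key. The following is only a minimal sketch under that assumption: `rows_to_records` is a hypothetical helper name, and the sample data (with a deliberately missing 'rapide' entry) is made up for illustration.

```python
import pandas as pd

def rows_to_records(raw_data):
    """Merge each row's single-key dicts into one dict per row."""
    records = []
    for name, *dicts in raw_data:
        record = {'name': name}
        for d in dicts:
            record.update(d)
        records.append(record)
    return records

# Illustrative data only: the second row deliberately lacks 'rapide'
sample = [
    ['fic10', {'bulle_naif': '55'}, {'rapide': '60'}],
    ['fic100', {'bulle_naif': '5050'}],
]

# pandas matches values to columns by key, so a missing key
# becomes NaN in that row instead of shifting the other values
safe_dataframe = pd.DataFrame(rows_to_records(sample)).set_index('name')
print(safe_dataframe)
```

With this layout the order of the dicts inside a row no longer matters either, since every value is placed by its key.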

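Second method: a 2D array without keys, but in a consistent order

If your data is already sorted, so that bulle_naif, bulle_opt, etc. appear in the same order in every row, then you can get rid of all the dicts and give pandas.DataFrame a 2D array directly: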

# assumes the rows of raw_data are all in the same order already
array_data = [[row[0]] + [v for d in row[1:] for v in d.values()] for row in raw_data]

print(array_data)
# [['fic10', '55', '52', '39', '45', '20', '60'],
#  ['fic100', '5050', '5050', '4816', '4950', '2221', '6697'],
#  ['fic1000', '2623195', '1789209', '2618499', '2620905', '1535788', '1323294'],
#  ['fic10000', '4764881010', '926117379', '4764749559', '4764783390', '900955079', '506697139']]

keys = ['name'] + [k for d in raw_data[0][1:] for k in d.keys()]
dataframe = pd.DataFrame(array_data, columns = keys).set_index('name')

print(dataframe)
#           bulle_naif bulle_bool   bulle_opt   selection  insertion     rapide
# name                                                                         
# fic10             55         52          39          45         20         60
# fic100          5050       5050        4816        4950       2221       6697
# fic1000      2623195    1789209     2618499     2620905    1535788    1323294
# fic10000  4764881010  926117379  4764749559  4764783390  900955079  506697139
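
The second method relies on every row listing its keys in the same order, and that assumption can be checked before building the array. A minimal sketch, where `same_key_order` is a hypothetical helper name and the two small samples are shortened versions of the data above:

```python
def same_key_order(raw_data):
    """Return True if every row lists its keys in the same order as the first row."""
    def keys_of(row):
        return [k for d in row[1:] for k in d.keys()]
    reference = keys_of(raw_data[0])
    return all(keys_of(row) == reference for row in raw_data)

# Shortened samples in the same layout as raw_data
ordered = [
    ['fic10', {'bulle_naif': '55'}, {'bulle_bool': '52'}],
    ['fic100', {'bulle_naif': '5050'}, {'bulle_bool': '5050'}],
]
shuffled = [
    ['fic10', {'bulle_naif': '55'}, {'bulle_bool': '52'}],
    ['fic100', {'bulle_bool': '5050'}, {'bulle_naif': '5050'}],
]

print(same_key_order(ordered))   # True
print(same_key_order(shuffled))  # False
```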

If you don't know whether all rows are already in the same order, then you have to sort them explicitly to make sure:

# I shuffled the entries in raw_data
raw_data = [
 ['fic10', {'bulle_bool': '52'}, {'bulle_naif': '55'}, {'selection': '45'}, {'insertion': '20'}, {'rapide': '60'}, {'bulle_opt': '39'}],
 ['fic100', {'bulle_opt': '4816'}, {'bulle_naif': '5050'}, {'insertion': '2221'}, {'selection': '4950'}, {'rapide': '6697'}, {'bulle_bool': '5050'}],
 ['fic1000', {'bulle_opt': '2618499'}, {'selection': '2620905'}, {'insertion': '1535788'}, {'bulle_bool': '1789209'}, {'rapide': '1323294'}, {'bulle_naif': '2623195'}],
 ['fic10000', {'selection': '4764783390'}, {'bulle_opt': '4764749559'}, {'bulle_bool': '926117379'}, {'bulle_naif': '4764881010'}, {'insertion': '900955079'}, {'rapide': '506697139'}]]

array_data = [[row[0]] + [v for d in sorted(row[1:], key=lambda d: next(iter(d.keys()))) for v in d.values()] for row in raw_data]

print(array_data)
# [['fic10', '52', '55', '39', '20', '60', '45'],
#  ['fic100', '5050', '5050', '4816', '2221', '6697', '4950'],
#  ['fic1000', '1789209', '2623195', '2618499', '1535788', '1323294', '2620905'],
#  ['fic10000', '926117379', '4764881010', '4764749559', '900955079', '506697139', '4764783390']]
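
Once the values are sorted by key, the column names have to be sorted the same way before the DataFrame is built. The sketch below shows one way this could look; it uses a shortened, shuffled sample and a hypothetical helper `first_key`, so it is an illustration under those assumptions rather than the original answer's code.

```python
import pandas as pd

# Shortened, shuffled sample in the same layout as raw_data
shuffled = [
    ['fic10', {'bulle_bool': '52'}, {'selection': '45'}, {'bulle_naif': '55'}],
    ['fic100', {'selection': '4950'}, {'bulle_naif': '5050'}, {'bulle_bool': '5050'}],
]

def first_key(d):
    """Return the only key of a single-entry dict."""
    return next(iter(d))

# Sort each row's dicts by key, then keep only the values
array_data = [[row[0]] + [v for d in sorted(row[1:], key=first_key) for v in d.values()]
              for row in shuffled]

# The column names must follow the same sorted order as the values
keys = ['name'] + sorted(first_key(d) for d in shuffled[0][1:])

dataframe = pd.DataFrame(array_data, columns=keys).set_index('name')
print(dataframe)
```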

Please credit the source when reposting. Link to this article: https://www.uj5u.com/net/370123.html

Tags: Python, arrays, pandas, dataframe, sorting
