Suppose I have a dictionary of lists, for example:
favourite_icecreams = {
    'Josh': ['vanilla', 'banana'],
    'Greg': ['chocolate'],
    'Sarah': ['mint', 'vanilla', 'mango']
}
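I want to convert it into a pandas DataFrame with the columns "Flavour" and "Person". It should look like this:
| Flavour | Person |
|---|---|
| vanilla | Josh |
| banana | Josh |
| chocolate | Greg |
| mint | Sarah |
| vanilla | Sarah |
| mango | Sarah |
What is the most efficient way to do this?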
uj5u.com user reply:
You can use a (generator) comprehension and then pass it to pd.DataFrame:
import pandas as pd
favourite_icecreams = {
    'Josh': ['vanilla', 'banana'],
    'Greg': ['chocolate'],
    'Sarah': ['mint', 'vanilla', 'mango']
}
data = ((flavour, person)
        for person, flavours in favourite_icecreams.items()
        for flavour in flavours)
df = pd.DataFrame(data, columns=('Flavour', 'Person'))
print(df)
# Flavour Person
# 0 vanilla Josh
# 1 banana Josh
# 2 chocolate Greg
# 3 mint Sarah
# 4 vanilla Sarah
# 5 mango Sarah
uj5u.com user reply:
Another solution, using .explode():
df = pd.DataFrame(
    {
        "Person": favourite_icecreams.keys(),
        "Flavour": favourite_icecreams.values(),
    }
).explode("Flavour")
print(df)
Prints:
Person Flavour
0 Josh vanilla
0 Josh banana
1 Greg chocolate
2 Sarah mint
2 Sarah vanilla
2 Sarah mango
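Note that .explode() repeats the original row label for every flavour, which is why the index above reads 0, 0, 1, 2, 2, 2. Here is a minimal sketch, assuming you want a fresh 0 to 5 index and the column order from the question. The name `tidy` is only for illustration, and the dict views are wrapped in list() as a precaution rather than a requirement:

```python
import pandas as pd

favourite_icecreams = {
    'Josh': ['vanilla', 'banana'],
    'Greg': ['chocolate'],
    'Sarah': ['mint', 'vanilla', 'mango'],
}

# Build one row per person, then expand each list into one row per flavour.
df = pd.DataFrame(
    {
        "Person": list(favourite_icecreams.keys()),
        "Flavour": list(favourite_icecreams.values()),
    }
).explode("Flavour")

# explode() keeps the original row labels, so renumber the rows
# and reorder the columns to match the layout asked for in the question.
tidy = df.reset_index(drop=True)[["Flavour", "Person"]]
print(tidy)
```

If the repeated index is useful (for example to map rows back to the original person), skip the reset_index step.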
uj5u.com user reply:
You can do this entirely in pandas, as shown below, using DataFrame.from_dict and df.stack:
In [453]: df = pd.DataFrame.from_dict(favourite_icecreams, orient='index').stack().reset_index().drop(columns='level_1')
In [455]: df.columns = ['Person', 'Flavour']
In [456]: df
Out[456]:
Person Flavour
0 Josh vanilla
1 Josh banana
2 Greg chocolate
3 Sarah mint
4 Sarah vanilla
5 Sarah mango
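To see why the stacking step is needed, it can help to look at the intermediate wide frame that from_dict builds. The sketch below only inspects that intermediate result; the names `wide` and `counts` are illustrative and not part of the answer above:

```python
import pandas as pd

favourite_icecreams = {
    'Josh': ['vanilla', 'banana'],
    'Greg': ['chocolate'],
    'Sarah': ['mint', 'vanilla', 'mango'],
}

# orient='index' turns each key into a row label and each list
# into that row's values; shorter lists are padded with missing values.
wide = pd.DataFrame.from_dict(favourite_icecreams, orient='index')
print(wide)

# Count the real (non-missing) flavours per person.
counts = wide.notna().sum(axis=1)
print(counts)
```

The wide frame has one column per list position, so reshaping it to long form and discarding the padding is what produces one row per (person, flavour) pair.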
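uj5u.com user reply:
One option is to extract the people and flavours into separate sequences, use numpy's repeat on the person sequence, and finally create the DataFrame:
import numpy as np
from itertools import chain
# Split the dict into parallel tuples of names and flavour lists.
person, flavour = zip(*favourite_icecreams.items())
lengths = list(map(len, flavour))
# Repeat each name once per flavour, then flatten the flavour lists.
person = np.array(person).repeat(lengths)
flavour = list(chain.from_iterable(flavour))
pd.DataFrame({'person': person, 'flavour': flavour})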
person flavour
0 Josh vanilla
1 Josh banana
2 Greg chocolate
3 Sarah mint
4 Sarah vanilla
5 Sarah mango
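Since the question asks about efficiency, one way to compare the approaches is to wrap them in functions, check that they agree, and time them with timeit. This is only a sketch: the function names are made up for illustration, the stack variant is left out, and a three-entry dictionary is far too small to give meaningful timings, so substitute data of a realistic size before drawing conclusions.

```python
import timeit
from itertools import chain

import numpy as np
import pandas as pd

favourite_icecreams = {
    'Josh': ['vanilla', 'banana'],
    'Greg': ['chocolate'],
    'Sarah': ['mint', 'vanilla', 'mango'],
}

def with_comprehension(d):
    # One (flavour, person) tuple per list element.
    rows = [(flavour, person) for person, flavours in d.items() for flavour in flavours]
    return pd.DataFrame(rows, columns=['Flavour', 'Person'])

def with_explode(d):
    # One row per person, then expand the list column.
    df = pd.DataFrame({'Person': list(d.keys()), 'Flavour': list(d.values())})
    return df.explode('Flavour').reset_index(drop=True)[['Flavour', 'Person']]

def with_repeat(d):
    # Repeat each name by the length of its list and flatten the lists.
    lengths = [len(flavours) for flavours in d.values()]
    person = np.array(list(d.keys())).repeat(lengths)
    flavour = list(chain.from_iterable(d.values()))
    return pd.DataFrame({'Flavour': flavour, 'Person': person})

candidates = (with_comprehension, with_explode, with_repeat)

# All three should yield the same rows in the same order.
results = [f(favourite_icecreams).to_numpy().tolist() for f in candidates]
assert all(r == results[0] for r in results)

# Timings depend on the machine, pandas version, and data size.
for f in candidates:
    seconds = timeit.timeit(lambda: f(favourite_icecreams), number=200)
    print(f"{f.__name__}: {seconds:.4f}s for 200 runs")
```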
Please credit the source when reposting. Article link: https://www.uj5u.com/qiye/474761.html
