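I need to convert CSV data with a fairly large number of records (about 3000) into a list of objects/dicts. I started with Pandas, but now I am not sure it is a good choice. The file contains 5 columns. The CSV file is structured like this: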
readTimestamp   school_subject     graduate   full_name   term
1611658200000   mathematics        3          Edd Ston    2
1611658200000   physics            5          Edd Ston    2
1611658200000   foreign language   5          Edd Ston    2
1611658200000   geography          4          Edd Ston    2
1611658200000   history            3          Edd Ston    2
1611658200000   Informatics        4          Kate Slow   1
1611658200000   chemistry          5          Kate Slow   1
1611658200000   mathematics        5          Kate Slow   1
1611658200000   foreign language   5          Kate Slow   1
I need to get a structure like this:
[
    {
        "readTimestamp": 123123123,
        "full_name": "Edd Ston",
        "term": 2,
        "school_subject": [
            {
                "mathematics": 3,
                "physics": 5,
                "foreign language": 5,
                "geography": 4,
                "history": 3
            }
        ]
    },
    {
        "readTimestamp": 345345345,
        "full_name": "Kate Slow",
        "term": 1,
        "school_subject": [
            {
                "Informatics": 4,
                "chemistry": 3,
                "mathematics": 5,
                "foreign language": 5
            }
        ]
    }
]
So far I have this, followed by the output it produces:
df = df.groupby(['readTimestamp', 'full_name', 'term']).apply(lambda x: x[['school_subject', 'graduate']].to_dict(orient='records')).to_dict()
{(1611658200000, 'Edd Ston', 2): [{'school_subject': 'mathematics', 'graduate': 3}, {'school_subject': 'physics', 'graduate': 5}, {'school_subject': 'foreign language', 'graduate': 5}, {'school_subject': 'geography', 'graduate': 4}, {'school_subject': 'history', 'graduate': 3}], (1611658200000, 'Kate Slow', 1): [{'school_subject': 'Informatics', 'graduate': 4}, {'school_subject': 'chemistry', 'graduate': 5}, {'school_subject': 'mathematics', 'graduate': 5}, {'school_subject': 'foreign language', 'graduate': 5}]}
I would appreciate your help, and an explanation of where I went wrong.
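Reply from a uj5u.com user:
I think your solution needs only a small change: build one dictionary per group (subject mapped to grade), then move the group keys back into columns and convert the result with orient='records':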
d = (df.groupby(['readTimestamp', 'full_name', 'term'])
       .apply(lambda x: x.set_index('school_subject')['graduate'].to_dict())
       .reset_index(name='school_subject')
       .to_dict(orient='records'))
print(d)
[{
    'readTimestamp': 1611658200000,
    'full_name': 'Edd Ston',
    'term': 2,
    'school_subject': {
        'mathematics': 3,
        'physics': 5,
        'foreign language': 5,
        'geography': 4,
        'history': 3
    }
}, {
    'readTimestamp': 1611658200000,
    'full_name': 'Kate Slow',
    'term': 1,
    'school_subject': {
        'Informatics': 4,
        'chemistry': 5,
        'mathematics': 5,
        'foreign language': 5
    }
}]
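As for where the original attempt went wrong: calling to_dict(orient='records') inside each group produces one dict per row, and the final .to_dict() on the grouped result uses the (readTimestamp, full_name, term) tuples as keys. The following is a minimal, self-contained sketch of the approach above, assuming the sample rows from the question are typed in by hand rather than read from the CSV file. The names rows, keys, and records are illustrative only. It also wraps each subject dict in a one-element list, since the structure requested in the question shows "school_subject" as a list.

```python
import pandas as pd

# Sample rows copied from the question; in practice they would come from the CSV file.
rows = [
    (1611658200000, "mathematics", 3, "Edd Ston", 2),
    (1611658200000, "physics", 5, "Edd Ston", 2),
    (1611658200000, "foreign language", 5, "Edd Ston", 2),
    (1611658200000, "geography", 4, "Edd Ston", 2),
    (1611658200000, "history", 3, "Edd Ston", 2),
    (1611658200000, "Informatics", 4, "Kate Slow", 1),
    (1611658200000, "chemistry", 5, "Kate Slow", 1),
    (1611658200000, "mathematics", 5, "Kate Slow", 1),
    (1611658200000, "foreign language", 5, "Kate Slow", 1),
]
df = pd.DataFrame(
    rows,
    columns=["readTimestamp", "school_subject", "graduate", "full_name", "term"],
)

keys = ["readTimestamp", "full_name", "term"]

records = (
    # Selecting the two value columns explicitly keeps the grouping columns
    # out of the function passed to apply.
    df.groupby(keys)[["school_subject", "graduate"]]
      # One {subject: grade} dict per group.
      .apply(lambda g: g.set_index("school_subject")["graduate"].to_dict())
      # Turn the group keys back into columns and name the dict column.
      .reset_index(name="school_subject")
      # One dict per group instead of one dict per row.
      .to_dict(orient="records")
)

# The question's target structure wraps the subject dict in a list.
for rec in records:
    rec["school_subject"] = [rec["school_subject"]]

print(records)
```

If the plain dict under "school_subject" is enough, as in the answer's output, the final loop can simply be dropped.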
Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/327669.html
