I have an interesting situation: my DataFrame looks like this.
       marks_in_test_1               marks_in_test_2
rank1  {'english': 25, 'maths': 30}  {'english': 15, 'maths': 30, 'science': 45}
rank4  {'english': 34, 'maths': 39}  {'english': 35, 'maths': 31}
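I want to convert it into something like the following. It looks like a pivot table in which the old column names also act as an index level.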
                       english  maths  science
rank1 marks_in_test_1       25     30      NaN
rank1 marks_in_test_2       15     30       45
rank4 marks_in_test_1       34     39      NaN
rank4 marks_in_test_2       35     31      NaN
I tried going through the pandas pivot documentation, but nothing there helped.
Answer from a uj5u.com user:
Use DataFrame.stack together with the DataFrame constructor:
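import pandas as pd
# stack() gives a Series of dicts indexed by (rank, test name)
s = df.stack()
# one row per dict, one column per key; keys missing from a dict become NaN
out = pd.DataFrame(s.tolist(), index=s.index)
print(out)
                       english  maths  science
rank1 marks_in_test_1       25     30      NaN
      marks_in_test_2       15     30     45.0
rank4 marks_in_test_1       34     39      NaN
      marks_in_test_2       35     31      NaN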
# Same idea, starting from the original df, with the index levels named and turned into columns
s = df.stack()
out = pd.DataFrame(s.tolist(), index=s.index).rename_axis(['a','b']).reset_index()
print(out)
       a                b  english  maths  science
0  rank1  marks_in_test_1       25     30      NaN
1  rank1  marks_in_test_2       15     30     45.0
2  rank4  marks_in_test_1       34     39      NaN
3  rank4  marks_in_test_2       35     31      NaN
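To try the stack-plus-constructor approach end to end, here is a minimal sketch. It assumes the rank labels are the index of the frame, as the outputs above suggest. The sample frame is rebuilt from the question, and the helper name expand_dict_cells is made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; the rank labels are assumed to be the index.
df = pd.DataFrame(
    {
        "marks_in_test_1": [
            {"english": 25, "maths": 30},
            {"english": 34, "maths": 39},
        ],
        "marks_in_test_2": [
            {"english": 15, "maths": 30, "science": 45},
            {"english": 35, "maths": 31},
        ],
    },
    index=["rank1", "rank4"],
)


def expand_dict_cells(frame):
    """Hypothetical helper: one row per (index, column) cell, one column per dict key."""
    s = frame.stack()  # Series of dicts with a two-level MultiIndex
    return pd.DataFrame(s.tolist(), index=s.index)


out = expand_dict_cells(df)
print(out)
```

A single row can then be looked up by its (rank, test) pair, for example out.loc[("rank1", "marks_in_test_2")].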
Answer from a uj5u.com user:
You can convert the dictionaries by applying pd.Series to each one (even though this is not really optimized):
>>> df.stack().apply(pd.Series).rename_axis(index=['rank', 'marks']).reset_index()
    rank            marks  english  maths  science
0  rank1  marks_in_test_1     25.0   30.0      NaN
1  rank1  marks_in_test_2     15.0   30.0     45.0
2  rank4  marks_in_test_1     34.0   39.0      NaN
3  rank4  marks_in_test_2     35.0   31.0      NaN
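The marks in that output are floats (25.0 rather than 25) because the result is upcast to make room for NaN. If whole numbers matter, one option is to cast to pandas' nullable Int64 dtype afterwards. This is only a sketch: it assumes every mark is an integer and that rank is the index, and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the question; the rank labels are assumed to be the index.
df = pd.DataFrame(
    {
        "marks_in_test_1": [
            {"english": 25, "maths": 30},
            {"english": 34, "maths": 39},
        ],
        "marks_in_test_2": [
            {"english": 15, "maths": 30, "science": 45},
            {"english": 35, "maths": 31},
        ],
    },
    index=["rank1", "rank4"],
)

# Expand each dict into columns, as in the answer above.
wide = df.stack().apply(pd.Series)

# Cast to the nullable integer dtype so missing marks show as <NA> instead of forcing floats.
wide = wide.astype("Int64")

out = wide.rename_axis(index=["rank", "marks"]).reset_index()
print(out)
```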
Answer from a uj5u.com user:
# melt the test columns into rows; this assumes rank is a regular column named "rank", not the index
df = df.melt(id_vars=["rank"])
df.head()
    rank         variable                                        value
0  rank1  marks_in_test_1                 {'english': 25, 'maths': 30}
1  rank4  marks_in_test_1                 {'english': 34, 'maths': 39}
2  rank1  marks_in_test_2  {'english': 15, 'maths': 30, 'science': 45}
3  rank4  marks_in_test_2                 {'english': 35, 'maths': 31}
# turn each dict into a string, strip both braces, then split the "key: value" pairs into columns
df = pd.concat([df[['rank','variable']], df['value'].astype(str).str.strip("{}").str.split(', ', expand=True)], axis=1)
df.head()
    rank         variable              0            1              2
0  rank1  marks_in_test_1  'english': 25  'maths': 30           None
1  rank4  marks_in_test_1  'english': 34  'maths': 39           None
2  rank1  marks_in_test_2  'english': 15  'maths': 30  'science': 45
3  rank4  marks_in_test_2  'english': 35  'maths': 31           None
# rename the positional columns (this relies on every dict listing its keys in the same order)
df.rename(columns={0: "english", 1: "maths", 2: "science"},
          inplace=True)
# strip everything except the digits from the mark columns
df['english'] = df['english'].str.replace(r'[^0-9]+', '', regex=True)
df['science'] = df['science'].str.replace(r'[^0-9]+', '', regex=True)
df['maths'] = df['maths'].str.replace(r'[^0-9]+', '', regex=True)
df
    rank         variable  english  maths  science
0  rank1  marks_in_test_1       25     30     None
1  rank4  marks_in_test_1       34     39     None
2  rank1  marks_in_test_2       15     30       45
3  rank4  marks_in_test_2       35     31     None
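Parsing the string form of each dict depends on the keys always appearing in the same order, and it leaves the marks as strings. A variant that keeps the melt step but expands the dicts directly avoids both issues. This is a sketch under assumptions: rank is taken to be the index (named here with rename_axis), and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the question; the rank labels are assumed to be the index.
df = pd.DataFrame(
    {
        "marks_in_test_1": [
            {"english": 25, "maths": 30},
            {"english": 34, "maths": 39},
        ],
        "marks_in_test_2": [
            {"english": 15, "maths": 30, "science": 45},
            {"english": 35, "maths": 31},
        ],
    },
    index=["rank1", "rank4"],
)

# Turn the index into a "rank" column, then melt the test columns into rows.
long = df.rename_axis("rank").reset_index().melt(id_vars="rank")

# Build one column per dict key; columns are matched by key name, not by position.
marks = pd.DataFrame(long["value"].tolist(), index=long.index)

out = pd.concat([long[["rank", "variable"]], marks], axis=1)
print(out)
```

Because the columns come from the dict keys, a row whose dict lists science before english would still land in the right columns.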
