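I have a DataFrame that I want to filter using pd.CategoricalDtype() and then display as a bar chart with px.bar.
This worked fine before the latest pandas update, but with the newest version the chart crashes with the following error: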
Traceback (most recent call last):
  File "", line 1, in
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/plotly/express/_chart_types.py", line 373, in bar
    return make_figure(
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/plotly/express/_core.py", line 2003, in make_figure
    groups, orders = get_groups_and_orders(args, grouper)
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/plotly/express/_core.py", line 1978, in get_groups_and_orders
    groups = {
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/plotly/express/_core.py", line 1979, in
    sf: grouped.get_group(s if len(s) > 1 else s[0])
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 811, in get_group
    raise KeyError(name)
KeyError: 'C'
Code:
import pandas as pd
import plotly.express as px
# Code outside px.bar
old_df2 = pd.DataFrame({'name': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C'],
'id1': [18, 22, 19, 14, 14, 11, 20, 28],
'id2': [5, 7, 7, 9, 12, 9, 9, 4],
'id3': [11, 8, 10, 6, 6, 7, 9, 12]})
new_df = old_df2.groupby([pd.CategoricalDtype(old_df2.name),'id2'])['id3'].count().fillna(0)
# Transforms count from series to data frame
new_df = new_df.to_frame()
# rowname to index
new_df.reset_index(inplace=True)
new_df = new_df[new_df["level_0"].isin(["A","B"])]
new_df.rename(columns={'level_0': 'name'}, inplace=True)
# This call raises the error
fig_bar = px.bar(new_df.loc[::-1], x="id2", y="id3", color="name", barmode="group")
# Working version with identical data
new_df_list = new_df.to_dict("records")
unlinked_df = pd.DataFrame(new_df_list)
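How can I fix the error?
Answer from a uj5u.com user:
I think you can convert the column to Categorical if you need the default behavior, where the categories are inferred from the data and are unordered: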
new_df = old_df2.groupby([pd.Categorical(old_df2.name),'id2'])['id3'].count().fillna(0)
If you need CategoricalDtype, pass the unique values of old_df2.name as categories:
from pandas.api.types import CategoricalDtype
cat_type = CategoricalDtype(categories=old_df2.name.unique())
new_df = old_df2.groupby([old_df2.name.astype(cat_type),'id2'])['id3'].count().fillna(0)
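The two variants above differ mainly in how the categories are chosen. Here is a minimal sketch of that difference, using hypothetical sample values (not the data from the question) picked so that appearance order and sorted order are not the same:

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample values: appearance order differs from sorted order.
names = pd.Series(["B", "A", "C", "A"])

# pd.Categorical infers the categories from the data (sorted when possible)
# and leaves them unordered.
inferred = pd.Categorical(names)

# An explicit CategoricalDtype keeps the categories in the order passed in,
# here the order of first appearance returned by unique().
explicit = names.astype(CategoricalDtype(categories=names.unique()))

print(list(inferred.categories))               # ['A', 'B', 'C']
print(list(explicit.cat.categories))           # ['B', 'A', 'C']
print(inferred.ordered, explicit.cat.ordered)  # False False
```

With the data from the question both approaches should end up with the same categories, because 'A', 'B', 'C' already appear in sorted order.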
Also change loc to iloc:
fig_bar = px.bar(new_df.iloc[::-1], x="id2", y="id3", color = "name", barmode="group")
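For context, a small sketch of what the two indexers do with a reversed slice. The frame below is hypothetical and only mimics the gaps a row filter leaves in the index. As far as this sketch shows, both calls reverse the rows, and iloc simply makes it explicit that the reversal is positional:

```python
import pandas as pd

# Hypothetical frame with a non-default index, as left behind by a row filter.
df = pd.DataFrame({"id2": [5, 7, 9]}, index=[0, 2, 5])

by_label = df.loc[::-1]      # label-based indexer, full slice with step -1
by_position = df.iloc[::-1]  # position-based indexer, full slice with step -1

print(by_position["id2"].tolist())   # [9, 7, 5]
print(by_label.equals(by_position))  # True
```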
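EDIT: I did some research. The problem is that filtering on a categorical column does not remove the categories that no longer occur. You can use cat.remove_unused_categories after isin: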
old_df2 = pd.DataFrame({'name': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C'],
'id1': [18, 22, 19, 14, 14, 11, 20, 28],
'id2': [5, 7, 7, 9, 12, 9, 9, 4],
'id3': [11, 8, 10, 6, 6, 7, 9, 12]})
from pandas.api.types import CategoricalDtype
cat_type = CategoricalDtype(categories=old_df2.name.unique())
new_df = old_df2.groupby([old_df2.name.astype(cat_type),'id2'])['id3'].count().fillna(0)
# rowname to index
new_df = new_df.reset_index()
new_df = new_df[new_df["name"].isin(["A","B"])]
print (new_df['name'])
# 0 A
# 1 A
# 2 A
# 3 A
# 4 A
# 5 B
# 6 B
# 7 B
# 8 B
# 9 B
# Name: name, dtype: category
# Categories (3, object): ['A', 'B', 'C']
new_df['name'] = new_df['name'].cat.remove_unused_categories()
print (new_df['name'])
# 0 A
# 1 A
# 2 A
# 3 A
# 4 A
# 5 B
# 6 B
# 7 B
# 8 B
# 9 B
# Name: name, dtype: category
# Categories (2, object): ['A', 'B']
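To see why a leftover category can matter for grouping, here is a pandas-only sketch on a hypothetical miniature of the data. It assumes, based on the traceback, that px.bar groups by the color column and then requests a group that has no rows. The sketch does not call plotly, it only shows the empty 'C' group appearing and disappearing:

```python
import pandas as pd

# Hypothetical miniature of the data in the post.
df = pd.DataFrame({"name": ["A", "A", "B", "C"], "id3": [1, 2, 3, 4]})
df["name"] = df["name"].astype("category")

# Filtering rows does not touch the dtype: 'C' is still a declared category.
filtered = df[df["name"].isin(["A", "B"])].copy()
print(list(filtered["name"].cat.categories))  # ['A', 'B', 'C']

# With observed=False, groupby also reports the empty 'C' group.
sizes = filtered.groupby("name", observed=False).size()
print(sizes.to_dict())  # {'A': 2, 'B': 1, 'C': 0}

# Dropping unused categories leaves only the groups that have rows.
filtered["name"] = filtered["name"].cat.remove_unused_categories()
cleaned = filtered.groupby("name", observed=False).size()
print(cleaned.to_dict())  # {'A': 2, 'B': 1}
```

The .copy() after the filter is only there so that the later column assignment works on an independent frame rather than on a slice of df.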
