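I have a data frame that I want to filter using pd.CategoricalDtype() and then display the result in a bar chart with px.bar.

Before the latest pandas update this worked fine, but with the latest update the chart crashes with the following error: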

Traceback (most recent call last):
  File "", line 1, in
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/plotly/express/_chart_types.py", line 373, in bar
    return make_figure(
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/plotly/express/_core.py", line 2003, in make_figure
    groups, orders = get_groups_and_orders(args, grouper)
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/plotly/express/_core.py", line 1978, in get_groups_and_orders
    groups = {
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/plotly/express/_core.py", line 1979, in
    s: grouped.get_group(s if len(s) > 1 else s[0])
  File "/home/marco/python-wsl/project_folder/venv/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 811, in get_group
    raise KeyError(name)
KeyError: 'C'

Code:

import pandas as pd
import plotly.express as px

# Code outside px.bar
old_df2 = pd.DataFrame({'name': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C'],
                       'id1': [18, 22, 19, 14, 14, 11, 20, 28],
                       'id2': [5, 7, 7, 9, 12, 9, 9, 4],
                       'id3': [11, 8, 10, 6, 6, 7, 9, 12]})


new_df = old_df2.groupby([pd.CategoricalDtype(old_df2.name),'id2'])['id3'].count().fillna(0)
    
# Transforms count from series to data frame
new_df = new_df.to_frame()

# rowname to index 
new_df.reset_index(inplace=True)

new_df = new_df[new_df["level_0"].isin(["A","B"])]

new_df.rename(columns={'level_0': 'name'}, inplace=True)

# Not working: this call raises the error
fig_bar = px.bar(new_df.loc[::-1], x="id2", y="id3", color="name", barmode="group")

# Working version with identical data
new_df_list = new_df.to_dict("records")

unlinked_df = pd.DataFrame(new_df_list)

How can I fix the error?

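Reply from a uj5u.com user:

I think you can convert the column to Categorical if you need the default behavior, where the categories are inferred from the data and are unordered: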

new_df = old_df2.groupby([pd.Categorical(old_df2.name),'id2'])['id3'].count().fillna(0)
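
To illustrate what "inferred from the data" means here, the following is a minimal sketch on a made-up list of names (the variable names are only for illustration, not from the question):

```python
import pandas as pd

# Hypothetical sample values, deliberately not in alphabetical order.
names = ["B", "A", "B", "C"]
cat = pd.Categorical(names)

# The categories are taken from the values themselves (sorted when possible).
print(cat.categories.tolist())

# No ordering between the categories is declared.
print(bool(cat.ordered))
```

Nothing has to be listed by hand in this form, which is why it is the shortest option when no particular category order is needed.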

If you need CategoricalDtype, pass the unique values of old_df2.name as categories:

from pandas.api.types import CategoricalDtype

cat_type = CategoricalDtype(categories=old_df2.name.unique())
new_df = old_df2.groupby([old_df2.name.astype(cat_type),'id2'])['id3'].count().fillna(0)
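
As a rough sketch of why passing all unique values matters, the example below uses a small made-up Series (not the data from the question). It assumes only that CategoricalDtype keeps the categories in the order given, and that values missing from the categories become NaN on conversion:

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample data for illustration only.
names = pd.Series(["B", "A", "B", "C"])

# Categories keep the order of first appearance returned by unique().
cat_type = CategoricalDtype(categories=names.unique())
as_cat = names.astype(cat_type)
print(as_cat.cat.categories.tolist())

# A dtype that leaves out a value turns that value into NaN on conversion.
narrow_type = CategoricalDtype(categories=["A", "B"])
narrow = names.astype(narrow_type)
print(narrow.isna().tolist())
```

So building the dtype from unique() is a simple way to make sure no row is silently turned into a missing value.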

Also change loc to iloc:

fig_bar = px.bar(new_df.iloc[::-1], x="id2", y="id3", color = "name", barmode="group")

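Edit: I did some research, and the problem is that filtering a categorical column does not remove the categories that are no longer present. You can use cat.remove_unused_categories after isin: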

old_df2 = pd.DataFrame({'name': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C'],
                       'id1': [18, 22, 19, 14, 14, 11, 20, 28],
                       'id2': [5, 7, 7, 9, 12, 9, 9, 4],
                       'id3': [11, 8, 10, 6, 6, 7, 9, 12]})


from pandas.api.types import CategoricalDtype

cat_type = CategoricalDtype(categories=old_df2.name.unique())
new_df = old_df2.groupby([old_df2.name.astype(cat_type),'id2'])['id3'].count().fillna(0)
    
# rowname to index 
new_df = new_df.reset_index()

new_df = new_df[new_df["name"].isin(["A","B"])]

print(new_df['name'])
# 0    A
# 1    A
# 2    A
# 3    A
# 4    A
# 5    B
# 6    B
# 7    B
# 8    B
# 9    B
# Name: name, dtype: category
# Categories (3, object): ['A', 'B', 'C']

new_df['name'] = new_df['name'].cat.remove_unused_categories()

print(new_df['name'])
# 0    A
# 1    A
# 2    A
# 3    A
# 4    A
# 5    B
# 6    B
# 7    B
# 8    B
# 9    B
# Name: name, dtype: category
# Categories (2, object): ['A', 'B']
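
The effect on grouping can be sketched on a smaller, made-up frame. This is only an illustration of the idea above, not the plotly internals: it passes observed=False explicitly so that unused categories are kept as groups, which is assumed here to mirror the situation that leads to the KeyError for 'C'.

```python
import pandas as pd

# Hypothetical miniature of the situation: 'C' is a declared category.
df = pd.DataFrame({
    "name": pd.Categorical(["A", "A", "B", "C"]),
    "value": [1, 2, 3, 4],
})

# Filtering rows does not shrink the category list: 'C' is still declared.
filtered = df[df["name"].isin(["A", "B"])].copy()
categories_before = filtered["name"].cat.categories.tolist()
print(categories_before)

# With observed=False the grouping still reports an empty 'C' group.
sizes_before = filtered.groupby("name", observed=False).size()
print(sizes_before)

# Dropping unused categories leaves only groups that actually have rows.
filtered["name"] = filtered["name"].cat.remove_unused_categories()
sizes_after = filtered.groupby("name", observed=False).size()
print(sizes_after)
```

The .copy() after filtering is a choice made for this sketch so that the later column assignment acts on an independent frame.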

