How do I reorder the rows of a pandas DataFrame by factor level in Python?

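I created a small dataset that compares the price of coffee drinks for each cup size.

When I pivot the dataset, the output automatically reorders the index (the "Size" column) alphabetically.

Is there a way to assign a numeric level to each size (e.g. Small = 0, Medium = 1, Large = 2) and reorder the rows that way?

I know this can be done in R with the forcats library (e.g. using fct_relevel), but I don't know how to do it in Python. I would prefer a solution that sticks to numpy and Pandas.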

import numpy as np
import pandas as pd

data = {'Item': np.repeat(['Latte', 'Americano', 'Cappuccino'], 3),
        'Size': ['Small', 'Medium', 'Large']*3,
        'Price': [2.25, 2.60, 2.85, 1.95, 2.25, 2.45, 2.65, 2.95, 3.25]
       }

df = pd.DataFrame(data, columns = ['Item', 'Size', 'Price'])
df = pd.pivot_table(df, index = ['Size'], columns = 'Item')
df

#            Price
# Item   Americano Cappuccino Latte
# Size
# Large       2.45       3.25  2.85
# Medium      2.25       2.95  2.60
# Small       1.95       2.65  2.25

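A uj5u.com user replied:

You can convert the index to a Categorical with ordered=True, so that it sorts in the category order you list instead of alphabetically: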

df.index = pd.Categorical(df.index,
                          categories=['Small', 'Medium', 'Large'],
                          ordered=True)
df = df.sort_index()

Output:

           Price                 
Item   Americano Cappuccino Latte
Small       1.95       2.65  2.25
Medium      2.25       2.95  2.60
Large       2.45       3.25  2.85

You can access the integer codes with:

>>> df.index.codes
array([0, 1, 2], dtype=int8)

If this were a Series:

>>> series.cat.codes
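
To make the Series case concrete, here is a minimal sketch. The name `sizes` and its values are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical Series of cup sizes, deliberately out of order.
sizes = pd.Series(['Large', 'Small', 'Medium', 'Small'])

# Same ordered categories as used for the index above.
size_dtype = pd.CategoricalDtype(categories=['Small', 'Medium', 'Large'],
                                 ordered=True)
sizes = sizes.astype(size_dtype)

# Integer position of each value within the category list.
codes = sizes.cat.codes
print(codes.tolist())                # [2, 0, 1, 0]

# Sorting follows the category order rather than the alphabet.
print(sizes.sort_values().tolist())  # ['Small', 'Small', 'Medium', 'Large']
```

The codes are just positions in the `categories` list, so changing that list changes both the codes and the sort order.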

A uj5u.com user replied:

One option is to create the categorical before pivoting. For this case I use encode_categorical from pyjanitor, mainly for convenience:

# pip install pyjanitor
import pandas as pd
import janitor
(df
 .encode_categorical(Size = (None, 'appearance'))
 .pivot_table(index='Size', columns='Item')
)

           Price                 
Item   Americano Cappuccino Latte
Size                             
Small       1.95       2.65  2.25
Medium      2.25       2.95  2.60
Large       2.45       3.25  2.85

This way you do not have to bother with sorting, because the pivot does it implicitly. You can also skip pyjanitor and use only Pandas:

(df
 .astype({'Size': pd.CategoricalDtype(categories = ['Small', 'Medium', 'Large'], 
                                      ordered = True)})
 .pivot_table(index='Size', columns='Item')
)

           Price                 
Item   Americano Cappuccino Latte
Size                             
Small       1.95       2.65  2.25
Medium      2.25       2.95  2.60
Large       2.45       3.25  2.85
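
As a self-contained check of the Pandas-only variant, the sketch below rebuilds the question's data under the assumed name `long_df`, confirms that the pivot keeps the category order, and shows that the same dtype also orders the un-pivoted rows. Passing `observed=False` explicitly is my own choice to keep the behaviour stable across pandas versions.

```python
import numpy as np
import pandas as pd

data = {'Item': np.repeat(['Latte', 'Americano', 'Cappuccino'], 3),
        'Size': ['Small', 'Medium', 'Large'] * 3,
        'Price': [2.25, 2.60, 2.85, 1.95, 2.25, 2.45, 2.65, 2.95, 3.25]}
long_df = pd.DataFrame(data)

size_dtype = pd.CategoricalDtype(['Small', 'Medium', 'Large'], ordered=True)
long_df['Size'] = long_df['Size'].astype(size_dtype)

# The pivot groups on the categorical, so rows come out in category order.
wide = long_df.pivot_table(index='Size', columns='Item', values='Price',
                           observed=False)
print(wide.index.tolist())                 # ['Small', 'Medium', 'Large']

# The long-format rows sort the same way: by Item, then by Size level.
sorted_df = long_df.sort_values(['Item', 'Size']).reset_index(drop=True)
print(sorted_df['Size'].head(3).tolist())  # ['Small', 'Medium', 'Large']

# Ordered categoricals can also be compared against a category label.
at_least_medium = long_df[long_df['Size'] >= 'Medium']
print(len(at_least_medium))                # 6
```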

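A uj5u.com user replied:

First approach:

The pivot_table function sorts the rows by the index keys, so it is better to pass a lambda function as the index in pivot_table. That way you need neither an extra sorting step (which takes more time) nor any third-party library. Here df is the original, un-pivoted DataFrame.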

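# values is given explicitly so the text column 'Size' is not averaged (newer pandas versions may raise on that)
df = pd.pivot_table(df, values = ['Price'],
                    index = (lambda row: 0 if df.loc[row,'Size']=="Small" else 1 if df.loc[row,'Size']=="Medium" else 2), 
                    columns = 'Item')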

         Price                 
Item Americano Cappuccino Latte
0         1.95       2.65  2.25
1         2.25       2.95  2.60
2         2.45       3.25  2.85
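
The same idea can be sketched without a row-by-row lambda by mapping the sizes to ranks in one step and pivoting on the new column. This is an alternative to the code above, not the answerer's method; the names `raw`, `size_rank` and `SizeRank` are assumptions made for this example.

```python
import numpy as np
import pandas as pd

data = {'Item': np.repeat(['Latte', 'Americano', 'Cappuccino'], 3),
        'Size': ['Small', 'Medium', 'Large'] * 3,
        'Price': [2.25, 2.60, 2.85, 1.95, 2.25, 2.45, 2.65, 2.95, 3.25]}
raw = pd.DataFrame(data)

# Hypothetical lookup table: size label -> numeric level.
size_rank = {'Small': 0, 'Medium': 1, 'Large': 2}

pivoted = (raw
           .assign(SizeRank=raw['Size'].map(size_rank))  # vectorised lookup
           .pivot_table(index='SizeRank', columns='Item', values='Price'))

print(pivoted.index.tolist())   # [0, 1, 2]
print(pivoted.loc[0, 'Latte'])  # 2.25
```

Because `values='Price'` is a single name here, the columns are just the items, without the extra `Price` level shown in the output above.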

Second approach:

You can also keep your own code, and then rename and sort the newly created table:

df = pd.DataFrame(data, columns = ['Item', 'Size', 'Price'])
df = pd.pivot_table(df, index = ['Size'], columns = 'Item')

# rename:
df = df.rename(index= lambda x: 0 if x=="Small" else 1 if x=="Medium" else 2)

#sort:
df = df.sort_index(ascending = True)


         Price                 
Item Americano Cappuccino Latte
0         1.95       2.65  2.25
1         2.25       2.95  2.60
2         2.45       3.25  2.85
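
If you would rather keep the size labels than replace them with numbers, two Pandas-only variants of the second approach might look like this. It is a sketch under the assumption that every label in `wanted` is present in the index; `table`, `wanted` and `size_rank` are names invented for the example, and the `key` argument of `sort_index` needs pandas 1.1 or later.

```python
import numpy as np
import pandas as pd

data = {'Item': np.repeat(['Latte', 'Americano', 'Cappuccino'], 3),
        'Size': ['Small', 'Medium', 'Large'] * 3,
        'Price': [2.25, 2.60, 2.85, 1.95, 2.25, 2.45, 2.65, 2.95, 3.25]}
table = pd.pivot_table(pd.DataFrame(data), index='Size', columns='Item',
                       values='Price')

wanted = ['Small', 'Medium', 'Large']

# Option A: pick the rows in the order you want.
by_reindex = table.reindex(wanted)

# Option B: sort the index through a label -> rank lookup.
size_rank = {label: pos for pos, label in enumerate(wanted)}
by_key = table.sort_index(key=lambda idx: idx.map(size_rank))

print(by_reindex.index.tolist())  # ['Small', 'Medium', 'Large']
print(by_key.index.tolist())      # ['Small', 'Medium', 'Large']
```

Note that `reindex` inserts an all-NaN row for any label missing from the index, while the `sort_index` variant only reorders the rows that exist.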

Please credit the source when reposting. Link to this article: https://www.uj5u.com/gongcheng/370740.html
