pandas: sorting columns in a custom order

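I have a pandas DataFrame that is generated daily, and the list of columns present in the DataFrame may differ each time it is generated.

I am trying to see whether I can order the columns of the final output so that the DataFrame is stored in a specific format. If new columns are present, they should be placed at the end.

Below is how I am trying to build this final output: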

expected_columns = ['cust_id','cost_id','sale_id','prod_id']

Sample DataFrame columns:

['customer_name','cust_id','sale_id','sale_time']

I would like the DataFrame above to be structured as follows:

['cust_id','sale_id','customer_name','sale_time']

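Basically, the columns listed in expected_columns have the highest priority, followed by the DataFrame's other columns as the subsequent columns of the new DataFrame.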

A uj5u.com user replied:

The first idea is to use list comprehensions and concatenate the resulting lists:

import pandas as pd

expected_columns = ['cust_id','cost_id','sale_id','prod_id']

df = pd.DataFrame(columns=['customer_name','cust_id','sale_id','sale_time'])

# columns that appear in expected_columns, in the DataFrame's own order
new1 = [c for c in df.columns if c in expected_columns]
# all remaining columns, in the DataFrame's own order
new2 = [c for c in df.columns if c not in expected_columns]

new = new1 + new2
print (new)
['cust_id', 'sale_id', 'customer_name', 'sale_time']
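
Because the DataFrame is regenerated every day with a possibly different set of columns, the comprehension approach can be wrapped in a small helper. This is only a sketch: the function name `prioritize_columns`, the second frame and its `region` column are hypothetical and not part of the original answer. Within each group the DataFrame's own column order is kept, as in the comprehensions above.

```python
import pandas as pd

def prioritize_columns(df, expected_columns):
    """Hypothetical helper: expected columns first, every other column after."""
    expected = set(expected_columns)  # a set makes the membership tests cheap
    first = [c for c in df.columns if c in expected]
    rest = [c for c in df.columns if c not in expected]
    return df[first + rest]

expected_columns = ['cust_id', 'cost_id', 'sale_id', 'prod_id']

# two daily frames with different column sets (the second one is made up)
day1 = pd.DataFrame(columns=['customer_name', 'cust_id', 'sale_id', 'sale_time'])
day2 = pd.DataFrame(columns=['prod_id', 'region', 'cust_id'])

day1_cols = prioritize_columns(day1, expected_columns).columns.tolist()
day2_cols = prioritize_columns(day2, expected_columns).columns.tolist()
print(day1_cols)
print(day2_cols)
```

Expected names that are missing from a given day's frame (such as `cost_id`) are simply skipped, and unknown columns end up at the end.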

Alternatively, use Index.intersection together with Index.difference:

expected_columns = ['cust_id','cost_id','sale_id','prod_id']

new = (df.columns.intersection(expected_columns, sort=False).tolist() +
       df.columns.difference(expected_columns, sort=False).tolist())

If the order of expected_columns should also be respected in the output, use:

new = (pd.Index(expected_columns).intersection(df.columns, sort=False).tolist() +
       df.columns.difference(expected_columns, sort=False).tolist())

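The difference becomes visible once the sample data is changed so that expected_columns lists sale_id before cust_id: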

expected_columns = ['sale_id','cost_id','cust_id','prod_id']

df = pd.DataFrame(columns=['customer_name','cust_id','sale_id','sale_time'])

# matched columns follow the order of expected_columns
new = (pd.Index(expected_columns).intersection(df.columns, sort=False).tolist() +
       df.columns.difference(expected_columns, sort=False).tolist())
print (new)
['sale_id', 'cust_id', 'customer_name', 'sale_time']

# matched columns follow the order of df.columns
new = (df.columns.intersection(expected_columns, sort=False).tolist() +
       df.columns.difference(expected_columns, sort=False).tolist())
print (new)
['cust_id', 'sale_id', 'customer_name', 'sale_time']
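
The ordering that follows expected_columns can also be obtained with plain list comprehensions, by iterating over expected_columns instead of df.columns for the first group. A minimal sketch using the same sample data; the variable names `in_expected_order`, `remaining` and `new_order` are only illustrative:

```python
import pandas as pd

expected_columns = ['sale_id', 'cost_id', 'cust_id', 'prod_id']
df = pd.DataFrame(columns=['customer_name', 'cust_id', 'sale_id', 'sale_time'])

# walk expected_columns so the result follows *its* order,
# skipping names the DataFrame does not have (cost_id, prod_id)
in_expected_order = [c for c in expected_columns if c in df.columns]
# everything else keeps the DataFrame's own order
remaining = [c for c in df.columns if c not in expected_columns]

new_order = in_expected_order + remaining
print(new_order)
```

This avoids relying on how the Index set operations order their results, at the cost of two passes over the column names.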

Finally, to change the order of the columns, use:

df = df[new]
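
To see this last step on a DataFrame that actually holds rows, the sketch below reorders a small frame whose values are invented for illustration. Selecting with a list of names returns a new DataFrame with the columns in that order, and the row data moves with each column. `DataFrame.reindex(columns=...)` is shown as an alternative that gives the same column order here, since every name in the list exists in the frame.

```python
import pandas as pd

# made-up rows, only to show that values travel with their columns
df = pd.DataFrame({
    'customer_name': ['Ann', 'Bob'],
    'cust_id': [1, 2],
    'sale_id': [10, 20],
    'sale_time': ['09:00', '10:30'],
})
new = ['cust_id', 'sale_id', 'customer_name', 'sale_time']

reordered = df[new]                    # the step used in the answer
alternative = df.reindex(columns=new)  # alternative spelling of the same reordering

print(reordered.columns.tolist())
print(alternative.columns.tolist())
```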
