Renaming duplicate column names sequentially in Pandas

I have a DataFrame df in which I want to rename two duplicate columns in sequential order:

Data

DD  Nice Nice Hello
0   1    1    2

Desired

DD  Nice1 Nice2 Hello
0   1     1     2

What I am doing

df.rename(columns={"Name": "Name1", "Name": "Name2"})

I am calling the rename function, but because the two columns share the same name, the result is not what I want.

A uj5u.com user replied:

Here is one approach using groupby:

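import numpy as np

# Group the column labels by name so duplicates fall into the same group
s = df.columns.to_series().groupby(df.columns)

# Append a running number (1, 2, ...) only to labels that occur more than once
df.columns = np.where(s.transform('size') > 1,
                      df.columns + s.cumcount().add(1).astype(str),
                      df.columns)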

Output:

   DD  Nice1  Nice2  Hello
0   0      1      1      2
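
The same idea can be wrapped in a small helper that works on any sequence of labels. This is a minimal sketch, not part of the answer above: the function name number_duplicates is hypothetical, and it builds a Series with a default integer index rather than calling df.columns.to_series().

```python
import numpy as np
import pandas as pd


def number_duplicates(columns):
    """Return the labels with 1, 2, ... appended to every repeated label."""
    names = pd.Series(list(columns))
    grouped = names.groupby(names)
    # Position of each label within its group, starting at 1
    position = grouped.cumcount() + 1
    # Number of times each label occurs overall
    group_size = grouped.transform("size")
    # Only labels that occur more than once get a suffix
    return np.where(group_size > 1, names + position.astype(str), names).tolist()


df = pd.DataFrame([[0, 1, 1, 2]], columns=["DD", "Nice", "Nice", "Hello"])
df.columns = number_duplicates(df.columns)
```

Labels that occur only once, such as DD and Hello, are left unchanged.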

A uj5u.com user replied:

Here is how you can do it. For example:

df.rename(columns={ df.columns[1]: "Name1" }, inplace = True)
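
Note that rename matches columns by label, not by position, so a key taken from df.columns[1] applies to every column carrying that label. The sketch below uses an assumed copy of the question's frame. It shows both duplicates receiving the same new name, followed by a positional alternative that assigns a new list of labels.

```python
import pandas as pd

# Assumed reconstruction of the frame from the question
df = pd.DataFrame([[0, 1, 1, 2]], columns=["DD", "Nice", "Nice", "Hello"])

# Label-based rename: both "Nice" columns match the key
renamed = df.rename(columns={df.columns[1]: "Name1"})
print(list(renamed.columns))

# Positional alternative: edit a copy of the labels by index, then assign it back
cols = list(df.columns)
cols[1] = "Nice1"
cols[2] = "Nice2"
df.columns = cols
print(list(df.columns))
```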

A uj5u.com user replied:

You can use:

import pandas as pd

cols = pd.Series(df.columns)
dup_count = cols.value_counts()
# For every label that appears more than once, number its occurrences from 1
for dup in cols[cols.duplicated()].unique():
    cols[cols[cols == dup].index.values.tolist()] = [dup + str(i) for i in range(1, dup_count[dup] + 1)]

df.columns = cols

Input:

col_1  Nice  Nice  Nice  Hello  Hello  Hello
col_2     1     2     3      4      5      6

Output:

col_1  Nice1  Nice2  Nice3  Hello1  Hello2  Hello3
col_2      1      2      3       4       5       6

Setup to generate duplicate cols:

df = pd.DataFrame(data={'col_1':['Nice', 'Nice', 'Nice', 'Hello', 'Hello', 'Hello'], 'col_2':[1,2,3,4, 5, 6]})
df = df.set_index('col_1').T

A uj5u.com user replied:

You can use an itertools.count() counter and a list comprehension to build the new column headers, then assign them to the DataFrame.

For example:

>>> import itertools
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 3]], columns=["Nice", "Nice", "Hello"])
>>> df
   Nice  Nice  Hello
0     1     2      3
>>> count = itertools.count(1)
>>> new_cols = [f"Nice{next(count)}" if col == "Nice" else col for col in df.columns]
>>> df.columns = new_cols
>>> df
   Nice1  Nice2  Hello
0      1      2      3

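(f-strings require Python 3.6+)

Edit: Alternatively, per the comment below, the list comprehension can replace any label that contains "Nice", in case of unexpected whitespace or other characters: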

new_cols = [f"Nice{next(count)}" if "Nice" in col else col for col in df.columns]
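
To avoid hard-coding "Nice", the counter idea can be extended to one counter per label. This is a sketch under that assumption. The names number_duplicates, totals and counters are illustrative and do not come from the answer above.

```python
import itertools
from collections import Counter, defaultdict


def number_duplicates(labels):
    """Append 1, 2, ... to each label that occurs more than once."""
    labels = list(labels)
    totals = Counter(labels)
    # One independent counter per label, each starting at 1
    counters = defaultdict(lambda: itertools.count(1))
    return [
        f"{label}{next(counters[label])}" if totals[label] > 1 else label
        for label in labels
    ]


new_cols = number_duplicates(["Nice", "Nice", "Hello"])
print(new_cols)
```

The result can be assigned with df.columns = number_duplicates(df.columns), since the helper only needs an iterable of labels.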

Tags: Python, pandas, numpy