How do I combine columns whose names start with a specific substring?

I have a DataFrame like this:

import pandas as pd
import numpy as np

df = pd.DataFrame({"a":[10, 13, 15, 30],
                  "b:1":[np.nan, np.nan, 13, 14],
                  "b:2":[6, 7, np.nan, np.nan]})

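I want to combine the columns whose names start with "b:" into a single column "b". In this case I could simply use df["b"] = df["b:1"].combine_first(df["b:2"]), but this is only a sample of a much larger DataFrame. Sometimes it also has columns such as "b:3" and so on, and there may even be other columns such as "c:1", "c:2". Those last ones I do not want to merge.

Can anyone tell me how to do this, so that my final DataFrame looks like this: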

df
Out[23]: 
    a   b:1  b:2     b
0  10   NaN  6.0   6.0
1  13   NaN  7.0   7.0
2  15  13.0  NaN  13.0
3  30  14.0  NaN  14.0

A uj5u.com user replied:

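You can use str.startswith on df.columns to select the "b:" columns and then sum along axis=1. Matching on the "b:" prefix avoids picking up unrelated columns that merely contain the letter "b". Note that summing only gives the intended result when each row has at most one non-missing value among those columns:

# Select only the columns whose names start with "b:"
col_b = df.columns[df.columns.str.startswith('b:')]
# min_count=1 keeps a row that is entirely NaN as NaN instead of 0
df['b'] = df[col_b].sum(axis=1, min_count=1)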
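
To illustrate the caveat with summing, here is a minimal sketch that compares the sum with a "first non-missing value" pick. The value 99 in row 3 of "b:2" is made up for this illustration and is not part of the original data; the variable names are likewise only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [10, 13, 15, 30],
                   "b:1": [np.nan, np.nan, 13, 14],
                   "b:2": [6, 7, np.nan, 99]})  # 99 is a hypothetical overlapping value

# Columns whose names start with the "b:" prefix
b_cols = [c for c in df.columns if c.startswith("b:")]

# Summing adds overlapping values together (14 + 99 in row 3)
summed = df[b_cols].sum(axis=1, min_count=1)

# Back-filling across columns and taking the first column keeps
# the first non-missing value per row, like chained combine_first calls
first_valid = df[b_cols].bfill(axis=1).iloc[:, 0]

print(pd.DataFrame({"summed": summed, "first_valid": first_valid}))
```

If the "b:" columns never overlap, both approaches give the same result; if they can overlap, the choice depends on whether the values should be added or the first one should win.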

A uj5u.com user replied:

This might help you:

from functools import reduce

import pandas as pd
import numpy as np

df = ...  # define DataFrame

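exclude_cols = ['c', 'd']  # Base names whose columns should NOT be merged

included_cols = []
for col in df.columns:
    if ':' in col:
        base_col = col.split(':')[0]
        # Skip bases that were already merged or are explicitly excluded
        if base_col in included_cols or base_col in exclude_cols:
            continue
        # Collect every column that shares this base, e.g. "b:1", "b:2", "b:3"
        associated_cols = [c for c in df.columns if c.startswith(f"{base_col}:")]
        # Row by row, keep the first non-missing value across the associated columns
        df[base_col] = reduce(lambda x, y: x.combine_first(y), [df[c] for c in associated_cols])
        included_cols.append(base_col)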
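
The same idea can be wrapped in a small helper that merges only the prefixes you name, instead of listing the ones to exclude. This is a sketch under assumptions: the function name combine_prefixed, its sep parameter, and the "c:1"/"c:2" sample values are invented for illustration and do not come from the question.

```python
from functools import reduce

import numpy as np
import pandas as pd


def combine_prefixed(df, prefixes, sep=":"):
    """Return a copy of df with one combined column per prefix in `prefixes`.

    Hypothetical helper: for each prefix, columns named "<prefix><sep>..."
    are merged left to right with combine_first.
    """
    out = df.copy()
    for prefix in prefixes:
        cols = [c for c in df.columns if str(c).startswith(f"{prefix}{sep}")]
        if not cols:
            continue  # nothing to merge for this prefix
        out[prefix] = reduce(lambda x, y: x.combine_first(y), (df[c] for c in cols))
    return out


df = pd.DataFrame({"a": [10, 13, 15, 30],
                   "b:1": [np.nan, np.nan, 13, 14],
                   "b:2": [6, 7, np.nan, np.nan],
                   "c:1": [1, 2, 3, 4],   # made-up columns that must stay unmerged
                   "c:2": [5, 6, 7, 8]})

result = combine_prefixed(df, ["b"])
print(result)
```

Because only "b" is passed in, the "c:" columns are left alone and no "c" column is created.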

A uj5u.com user replied:

You can loop over the first letters of all column names and backfill:

df = pd.DataFrame({"a":[10, 13, 15, 30, 11],
                  "b:1":[np.nan, np.nan, 13, 14, np.nan],
                  "b:2":[6, 7, np.nan, np.nan, np.nan],
                  "b:3":[np.nan, np.nan, np.nan, np.nan, 11]})

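df_combined = pd.DataFrame()
# Group columns by their first character; every column sharing it is merged
for first_letter in {c[0] for c in df.columns}:
    cols = [c for c in df.columns if c[0] == first_letter]
    # Backfill across columns, then take the first column: first non-missing value per row
    # (DataFrame.bfill replaces the deprecated fillna(method='bfill'))
    df_combined[first_letter] = df[cols].bfill(axis=1).iloc[:, 0]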

    b   a
0   6  10
1   7  13
2  13  15
3  14  30
4  11  11
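
The loop above can also be written as a single grouped operation. The following is only a sketch: it groups by the text before the colon rather than by the first character (an assumption that suits names like "b:1"), and it relies on GroupBy.first skipping missing values by default. Like the loop, it merges every prefix, so columns that should stay separate would have to be filtered out beforehand.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [10, 13, 15, 30, 11],
                   "b:1": [np.nan, np.nan, 13, 14, np.nan],
                   "b:2": [6, 7, np.nan, np.nan, np.nan],
                   "b:3": [np.nan, np.nan, np.nan, np.nan, 11]})

# Transpose so column names become the index, group the rows by the part
# of the name before ":", keep the first non-missing value in each group,
# then transpose back to the original orientation.
combined = df.T.groupby(lambda name: name.split(":")[0]).first().T

print(combined)
```

Transposing converts everything to a common dtype, so column "a" comes back as floats here.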

A uj5u.com user replied:

Another possible solution:

df['b'] = df.T[lambda x: x.index.str.startswith('b:')].ffill().bfill().iloc[0]

Output:

    a   b:1  b:2     b
0  10   NaN  6.0   6.0
1  13   NaN  7.0   7.0
2  15  13.0  NaN  13.0
3  30  14.0  NaN  14.0

