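I am trying to update Col1 with the value from Col2, Col3, and so on, whenever one of those columns holds a value. Each row contains at most one real value, but a row can also contain "-", which should be treated as NaN.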
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        ['A', np.nan, np.nan, np.nan, np.nan, np.nan],
        [np.nan, np.nan, np.nan, 'C', np.nan, np.nan],
        [np.nan, np.nan, "-", np.nan, 'B', np.nan],
        [np.nan, np.nan, "-", np.nan, np.nan, np.nan]
    ],
    columns=['Col1', 'Col2', 'Col3', 'Col4', 'Col5', 'Col6']
)
print(df)
Col1 Col2 Col3 Col4 Col5 Col6
0 A NaN NaN NaN NaN NaN
1 NaN NaN NaN C NaN NaN
2 NaN NaN NaN NaN B NaN
3 NaN NaN NaN NaN NaN NaN
I would like the output to be:
Col1
0 A
1 C
2 B
3 NaN
I tried using the update function:
for col in df.columns[1:]:
    df['Col1'].update(df[col])
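It works for this small DataFrame, but when I run it on a larger DataFrame with more rows and columns, I lose a lot of values along the way. Is there a better function for this, preferably one that avoids a loop? I have tried many other approaches, including .loc, without success. Any help is appreciated.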
A helpful uj5u.com user replied:
Here is one way to do it:
# treat "-" as missing, then sort each row so that NaN moves to the end
df2 = df.replace('-', np.nan).apply(lambda x: x.sort_values(ignore_index=True), axis=1)
# give df2 the same column names as df
df2.columns = df.columns
# drop the columns in which every value is null
df2.dropna(axis=1, how='all', inplace=True)
print(df2)
Col1
0 A
1 C
2 B
3 NaN
A helpful uj5u.com user replied:
You can use combine_first:
from functools import reduce

# treat "-" as missing before coalescing the columns
clean = df.replace('-', np.nan)
reduce(
    lambda x, y: x.combine_first(clean[y]),
    clean.columns[1:],
    clean[clean.columns[0]]
).to_frame()
The code above produces the following DataFrame:
Col1
0 A
1 C
2 B
3 NaN
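The same left-to-right coalescing can probably be written without reduce. The sketch below is not part of the answer above: it swaps in a row-wise bfill, so that the first non-missing value of each row ends up in the first column. It assumes that "-" is the only placeholder that should count as missing; the names cleaned and result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        ["A", np.nan, np.nan, np.nan, np.nan, np.nan],
        [np.nan, np.nan, np.nan, "C", np.nan, np.nan],
        [np.nan, np.nan, "-", np.nan, "B", np.nan],
        [np.nan, np.nan, "-", np.nan, np.nan, np.nan],
    ],
    columns=["Col1", "Col2", "Col3", "Col4", "Col5", "Col6"],
)

# Assumption: "-" is the only placeholder that means "missing".
cleaned = df.replace("-", np.nan)

# Back-fill along each row (axis=1): every NaN takes the next valid value
# to its right, so the first column holds the first valid value of the row.
result = cleaned.bfill(axis=1).iloc[:, 0]

print(result.to_frame("Col1"))
```

On the sample data this should give A, C and B for the first three rows and leave the last row as NaN. Recent pandas versions may emit a FutureWarning about dtype downcasting here; the values are unaffected.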
A helpful uj5u.com user replied:
For this kind of use case, Python has a one-line generator idiom:
# next((x for x in list if condition), None)
df["Col1"] = df.apply(lambda row: next((x for x in row if not pd.isnull(x) and x != "-"), None), axis=1)
[Out]:
0 A
1 C
2 B
3 None
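To see what the next(...) idiom does on its own, it can be tried on plain lists first. This is only a minimal sketch: the helper name first_valid and its placeholder and default parameters are hypothetical and do not come from the answer.

```python
import math


def first_valid(values, placeholder="-", default=None):
    """Return the first item that is neither NaN nor the placeholder."""
    return next(
        (
            v
            for v in values
            # skip the placeholder string and float NaN, keep everything else
            if v != placeholder and not (isinstance(v, float) and math.isnan(v))
        ),
        default,  # returned when the generator yields nothing
    )


print(first_valid([float("nan"), "-", "B"]))  # B
print(first_valid([float("nan"), "-"]))       # None
```

Because the default is None, a row with no usable value ends up as None rather than NaN, which matches the last line of the output above. Passing default=float("nan") would be one way to keep NaN instead.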
When reposting, please credit the source. Link to this article: https://www.uj5u.com/caozuo/522467.html
