The DataFrame looks like this:
RS AS IS
F1 [F1, F2, F3, F4, F5] [F1] [F1]
F2 [F2, F3, F5] [F1, F2, F3, F5] [F5, F3, F2]
F3 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
F4 [F4] [F1, F3, F4, F5] [F4]
F5 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
The output I need:
RS AS IS Level
F1 [F1, F2, F3, F4, F5] [F1] [F1]
F2 [F2, F3, F5] [F1, F2, F3, F5] [F5, F3, F2] I
F3 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
F4 [F4] [F1, F3, F4, F5] [F4] I
F5 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
The logic is simple: if RS and IS in a row contain the same values, write I in the Level column. I am using the following code, but it does not seem to work.
if df['RS'].any() == df['IS'].any():
    df['Level'] = 'I'
After that, I also need to drop every row with Level "I" from the original DataFrame, like this:
RS AS IS
F1 [F1, F2, F3, F4, F5] [F1] [F1]
F3 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
F5 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
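Answer from a uj5u.com user:
Convert your lists to sets, compare them for equality to find the rows whose two columns contain the same elements, then assign the value. The example below ignores your middle column.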
import pandas as pd
df = pd.DataFrame({'RS':
[[1,2,3,4,5],
[2,3,5],
[2,3,4,5],
[4],
[2,3,4,5]],
'IS':
[[1],
[5,3,2],
[5,3,2],
[4],
[5,3,2]]})
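# True for rows where RS and IS contain the same elements, regardless of order
ix = df.RS.apply(set) == df.IS.apply(set)
# Start with an empty Level column, then mark only the matching rows
df['Level'] = ''
df.loc[ix, 'Level'] = 'I'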
df
# returns:
RS IS Level
[1, 2, 3, 4, 5] [1]
[2, 3, 5] [5, 3, 2] I
[2, 3, 4, 5] [5, 3, 2]
[4] [4] I
[2, 3, 4, 5] [5, 3, 2]
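As for why the attempt in the question does not work: `Series.any()` reduces a whole column to a single value, so the `if` is evaluated once for the entire frame, and `df['Level'] = 'I'` then assigns to every row. The comparison has to happen per row instead. Below is a minimal sketch of the same set-based approach applied to the question's own labels, including the AS column. It assumes the cells are real Python lists of strings; the frame is rebuilt by hand here, and the name `same_elements` is only illustrative.

```python
import pandas as pd

# The question's data, rebuilt by hand (assumption: cells are Python lists of strings).
df = pd.DataFrame(
    {
        'RS': [['F1', 'F2', 'F3', 'F4', 'F5'], ['F2', 'F3', 'F5'],
               ['F2', 'F3', 'F4', 'F5'], ['F4'], ['F2', 'F3', 'F4', 'F5']],
        'AS': [['F1'], ['F1', 'F2', 'F3', 'F5'], ['F1', 'F2', 'F3', 'F5'],
               ['F1', 'F3', 'F4', 'F5'], ['F1', 'F2', 'F3', 'F5']],
        'IS': [['F1'], ['F5', 'F3', 'F2'], ['F5', 'F3', 'F2'], ['F4'],
               ['F5', 'F3', 'F2']],
    },
    index=['F1', 'F2', 'F3', 'F4', 'F5'],
)

# One boolean per row: True where RS and IS hold the same elements in any order.
same_elements = df['RS'].apply(set) == df['IS'].apply(set)

# Empty Level everywhere, then 'I' only on the matching rows.
df['Level'] = ''
df.loc[same_elements, 'Level'] = 'I'
print(df['Level'].tolist())
```

Only rows F2 and F4 end up marked, which matches the output requested in the question. Note that converting to `set` ignores both order and duplicates within a list.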
If you need to drop the rows that would be assigned I, you do not actually need to assign I at all; just use:
ix = df.RS.apply(set) == df.IS.apply(set)
df.loc[~ix]
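One thing to keep in mind is that `df.loc[~ix]` returns a new DataFrame and leaves `df` itself unchanged, so the result has to be assigned to a name to be kept. A short sketch using the same integer data as the example above; the names `remaining` and `remaining_renumbered` are illustrative, and the `reset_index` step is optional.

```python
import pandas as pd

df = pd.DataFrame({
    'RS': [[1, 2, 3, 4, 5], [2, 3, 5], [2, 3, 4, 5], [4], [2, 3, 4, 5]],
    'IS': [[1], [5, 3, 2], [5, 3, 2], [4], [5, 3, 2]],
})

# True where RS and IS contain the same elements, regardless of order.
ix = df['RS'].apply(set) == df['IS'].apply(set)

# Keep only the rows that do NOT match; df itself is left untouched.
remaining = df.loc[~ix]

# Optional: renumber the surviving rows from 0 after dropping.
remaining_renumbered = remaining.reset_index(drop=True)
print(remaining_renumbered)
```

To overwrite the original frame instead, the same expression can be assigned back with `df = df.loc[~ix]`.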
