Example: I have a DataFrame like this:

I want to filter by negating multiple conditions: (firstname == "James" & lastname == "Smith") or (firstname == "Robert" & lastname == "Williams").
The output I need should be:
I am using something like this, but it does not work:
df = df.filter(~(df.firstname == "James") & (df.lastname == "Smith")|~(df.firstname == "Robert") & (df.lastname == "Williams"))
Reply from a uj5u.com user:
You have to apply the negation to the whole condition.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

data = [("James", "", "Smith", "36636", "M", 3000),
        ("Michael", "Rose", "jim", "40288", "M", 4000),
        ("Robert", "", "Williams", "42114", "M", 4000),
        ("Maria", "Anne", "Jones", "39192", "F", 4000),
        ("Jen", "Mary", "Brown", "60563", "F", -1)]
df = spark.createDataFrame(data, ("firstname", "middlename", "lastname", "id", "gender", "salary"))

# Negate the whole OR expression so that both name pairs are excluded.
(df.filter(~(((df.firstname == "James") & (df.lastname == "Smith")) |
             ((df.firstname == "Robert") & (df.lastname == "Williams"))))
   .show())
Output:
+---------+----------+--------+-----+------+------+
|firstname|middlename|lastname|   id|gender|salary|
+---------+----------+--------+-----+------+------+
|  Michael|      Rose|     jim|40288|     M|  4000|
|    Maria|      Anne|   Jones|39192|     F|  4000|
|      Jen|      Mary|   Brown|60563|     F|    -1|
+---------+----------+--------+-----+------+------+
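A rough illustration of why the attempt in the question behaves differently: in Python, `~` binds more tightly than `&`, which in turn binds more tightly than `|`, so each `~` in that attempt negates only the firstname comparison. The `Cond` class below is a hypothetical stand-in for a Spark column expression, not part of PySpark. It only records how Python groups the operators.

```python
class Cond:
    """Hypothetical stand-in for a column expression; records operator grouping as text."""

    def __init__(self, text):
        self.text = text

    def __invert__(self):
        return Cond(f"NOT({self.text})")

    def __and__(self, other):
        return Cond(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Cond(f"({self.text} OR {other.text})")


# A: firstname == "James", B: lastname == "Smith",
# C: firstname == "Robert", D: lastname == "Williams"
A, B, C, D = (Cond(name) for name in "ABCD")

attempt = ~(A) & (B) | ~(C) & (D)   # same shape as the filter in the question
fixed = ~((A & B) | (C & D))        # negation applied to the whole condition

print(attempt.text)  # ((NOT(A) AND B) OR (NOT(C) AND D))
print(fixed.text)    # NOT(((A AND B) OR (C AND D)))
```

Read this way, the attempt keeps rows such as "someone other than James whose last name is Smith", which is not the intended exclusion.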
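Reply from a uj5u.com user:
Joining the negated conditions with OR does not give the correct output here. Each name pair has to be negated, and the two negations joined with AND.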
from pyspark.sql import functions as F

# Keep a row only if it matches neither name pair.
df_new = (df
          .filter(~((F.col("firstname") == "James") & (F.col("lastname") == "Smith"))
                  & ~((F.col("firstname") == "Robert") & (F.col("lastname") == "Williams"))))
The result is as follows:

Reply from a uj5u.com user:
The negation of A|B is (~A)&(~B).
So try this:
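# Each comparison needs its own parentheses, because & binds more tightly than == in Python.
df = df.filter((~((df.firstname == "James") & (df.lastname == "Smith"))) & (~((df.firstname == "Robert") & (df.lastname == "Williams"))))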
Since the negation of C&D is (~C)|(~D), you can simplify the filter condition further to:
df = df.filter(((df.firstname != "James") | (df.lastname != "Smith")) & ((df.firstname != "Robert") | (df.lastname != "Williams")))
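As a quick sanity check of the De Morgan rewrites above, the three forms of the condition can be compared in plain Python on the sample rows from the first reply. This is only a sketch of the logic using ordinary `and`/`or`/`not` on tuples, not Spark code. The helper names (`negated_or`, `and_of_negations`, `fully_expanded`, `kept_names`) are made up for illustration.

```python
# Sample rows from the first reply:
# (firstname, middlename, lastname, id, gender, salary)
rows = [
    ("James", "", "Smith", "36636", "M", 3000),
    ("Michael", "Rose", "jim", "40288", "M", 4000),
    ("Robert", "", "Williams", "42114", "M", 4000),
    ("Maria", "Anne", "Jones", "39192", "F", 4000),
    ("Jen", "Mary", "Brown", "60563", "F", -1),
]


def negated_or(first, last):
    # ~((A & B) | (C & D))
    return not ((first == "James" and last == "Smith")
                or (first == "Robert" and last == "Williams"))


def and_of_negations(first, last):
    # ~(A & B) & ~(C & D)
    return (not (first == "James" and last == "Smith")
            and not (first == "Robert" and last == "Williams"))


def fully_expanded(first, last):
    # (~A | ~B) & (~C | ~D)
    return ((first != "James" or last != "Smith")
            and (first != "Robert" or last != "Williams"))


def kept_names(predicate):
    # First names of the rows that the predicate keeps.
    return [row[0] for row in rows if predicate(row[0], row[2])]


results = [kept_names(p) for p in (negated_or, and_of_negations, fully_expanded)]
print(results[0])                            # ['Michael', 'Maria', 'Jen']
print(results[0] == results[1] == results[2])  # True
```

All three predicates keep the same rows, so the choice between them is a matter of readability.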
