Consider this DataFrame, df:
+----+------+
|cond|chaine|
+----+------+
|   0|   TF1|
|   1|   TF1|
|   1|   TNT|
+----+------+
I want to apply the following withColumn calls, but only to rows where cond == 1:
df.withColumn("New", when($"chaine" === "TF1", "YES!"))
.withColumn("New2", when($"chaine" === "TF1", "YES2!"))
.withColumn("New3", when($"chaine" === "TF1", "YES3!"))
.withColumn("New4", when($"chaine" === "TF1", "YES4!"))
I can't use .filter, because I still want the rows with cond =!= 1 to appear in the output.
I can get this by adding my condition everywhere in the code:
df.withColumn("New", when($"chaine" === "TF1" && $"cond" === 1, "YES!"))
.withColumn("New2", when($"chaine" === "TF1" && $"cond" === 1, "YES2!"))
.withColumn("New3", when($"chaine" === "TF1" && $"cond" === 1, "YES3!"))
.withColumn("New4", when($"chaine" === "TF1" && $"cond" === 1, "YES4!"))
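The problem is that I have a lot of new columns, and I would like a cleaner solution (something like a global setting?).
Thank you.
A uj5u.com user replied:
A few simple syntactic-sugar ideas: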
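import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, when}
// Apply `value` only where `condition` holds and the cond column equals n.
def whenCondIs(n: Int)(condition: Column, value: Any): Column =
  when(condition && col("cond") === n, value)
// Shorthand for the cond == 1 case.
def whenOne(condition: Column, value: Any): Column =
  whenCondIs(1)(condition, value)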
Then:
df.withColumn("New", whenOne($"chaine" === "TF1", "YES!"))
.withColumn("New2", whenOne($"chaine" === "TF1", "YES2!"))
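One thing to keep in mind with this helper: a `when` without `otherwise` yields null for rows that do not match, so rows with cond = 0 stay in the output with null in the new columns. The following is only a minimal sketch of that behaviour. The object name `WhenOneDemo`, the helper `whenOneOr`, the function `addColumns`, and the `"NO"` default are hypothetical names introduced for illustration, and the local SparkSession is an assumption made so the example runs on its own.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object WhenOneDemo {
  // Same gating idea as above: the value applies only when cond == 1.
  def whenOne(condition: Column, value: Any): Column =
    when(condition && col("cond") === 1, value)

  // Hypothetical variant: use a default instead of null for non-matching rows.
  def whenOneOr(condition: Column, value: Any, default: Any): Column =
    whenOne(condition, value).otherwise(default)

  // New is null where the gate fails; New2 falls back to "NO".
  def addColumns(df: DataFrame): DataFrame =
    df.withColumn("New", whenOne(col("chaine") === "TF1", "YES!"))
      .withColumn("New2", whenOneOr(col("chaine") === "TF1", "YES2!", "NO"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("whenOne").getOrCreate()
    import spark.implicits._

    val df = Seq((0, "TF1"), (1, "TF1"), (1, "TNT")).toDF("cond", "chaine")

    // All three input rows are kept; no filtering takes place.
    addColumns(df).show()
    spark.stop()
  }
}
```

Whether null or an explicit default is preferable depends on how the new columns are consumed downstream.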
A uj5u.com user replied:
You can build a list that maps each new column to its condition and value, then use foldLeft to add them to your DataFrame. Something like this:
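import org.apache.spark.sql.functions.{col, expr, lit, when}
// (new column name, SQL condition, value to set when the condition holds)
val newCols = Seq(
  ("New", "chaine='TF1'", "YES!"),
  ("New2", "chaine='TF1'", "YES2!"),
  ("New3", "chaine='TF1'", "YES3!"),
  ("New4", "chaine='TF1'", "YES4!")
)
val df1 = newCols.foldLeft(df)((acc, x) =>
  acc.withColumn(x._1, when(expr(x._2) && col("cond") === 1, lit(x._3)))
)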
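If the list of new columns grows large, one possible variation is to build all of them in a single `select` rather than chaining `withColumn`, and to define the cond === 1 gate in one place. This is a minimal sketch under stated assumptions: the object name `GatedColumns`, the helper `addGated`, and the value `gate` are hypothetical names introduced for illustration, and the local SparkSession exists only so the example runs on its own.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, expr, lit, when}

object GatedColumns {
  // The shared condition lives in one place (hypothetical name).
  val gate: Column = col("cond") === 1

  // (new column name, SQL condition, value), in the same shape as the list above.
  val newCols: Seq[(String, String, String)] = Seq(
    ("New", "chaine = 'TF1'", "YES!"),
    ("New2", "chaine = 'TF1'", "YES2!")
  )

  // Keep every existing column and append the gated ones in a single projection.
  def addGated(df: DataFrame): DataFrame = {
    val added = newCols.map { case (name, cond, value) =>
      when(gate && expr(cond), lit(value)).as(name)
    }
    df.select(df.columns.map(col) ++ added: _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("gated").getOrCreate()
    import spark.implicits._

    val df = Seq((0, "TF1"), (1, "TF1"), (1, "TNT")).toDF("cond", "chaine")
    addGated(df).show()
    spark.stop()
  }
}
```

The Spark API documentation for `withColumn` notes that each call introduces a projection, and that adding many columns through repeated calls can produce large query plans. A single `select` avoids that, at the cost of a slightly less incremental style.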
