I have a very simple piece of code:
val win = Window.partitionBy("app").orderBy("date")
val appSpendChange = appSpend
.withColumn("prevSpend", lag(col("Spend")).over(win))
.withColumn("spendChange", when(isnull($"Spend" - "prevSpend"), 0)
.otherwise($"spend" - "prevSpend"))
display(appSpendChange)
This should work, since I took a PySpark example and converted it to Scala: Pyspark Column Transformation: Calculate Percentage Change for Each Group in a Column
However, I get this error:
error: overloaded method value lag with alternatives:
(e: org.apache.spark.sql.Column,offset: Int,defaultValue: Any,ignoreNulls: Boolean)org.apache.spark.sql.Column <and>
(e: org.apache.spark.sql.Column,offset: Int,defaultValue: Any)org.apache.spark.sql.Column <and>
(columnName: String,offset: Int,defaultValue: Any)org.apache.spark.sql.Column <and>
(columnName: String,offset: Int)org.apache.spark.sql.Column <and>
(e: org.apache.spark.sql.Column,offset: Int)org.apache.spark.sql.Column
cannot be applied to (org.apache.spark.sql.Column)
.withColumn("prevPctSpend", lag(col("pctCtvSpend")).over(win))
^
How should I read this? In particular, what does the e: annotation mean? Thanks in advance, any feedback is appreciated.
Answer from a uj5u.com user:
You should read this error as follows:
- There are 5 lag methods, each listed in the form (<parameters>)<return type>. Names such as e, columnName and offset are simply the parameter names:
  (e: org.apache.spark.sql.Column, offset: Int, defaultValue: Any, ignoreNulls: Boolean)org.apache.spark.sql.Column
  (e: org.apache.spark.sql.Column, offset: Int, defaultValue: Any)org.apache.spark.sql.Column
  (columnName: String, offset: Int, defaultValue: Any)org.apache.spark.sql.Column
  (columnName: String, offset: Int)org.apache.spark.sql.Column
  (e: org.apache.spark.sql.Column, offset: Int)org.apache.spark.sql.Column
- None of these alternatives can be applied to the argument types (org.apache.spark.sql.Column), which is what your code passes.
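In short, it means you called a method with a missing or invalid argument.
As @Dima said, you probably want to pass an offset to lag, for example lag(col("Spend"), 1).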
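To make the fix concrete, here is a minimal sketch of the question's snippet with an explicit offset passed to lag. The sample rows, the local SparkSession, and the use of show() instead of display() are assumptions added for illustration; in a Databricks notebook or spark-shell a spark session already exists and that line can be dropped. The sketch also writes the second operand as col("prevSpend"), because in $"Spend" - "prevSpend" the bare string is treated as a string literal rather than a column reference.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, isnull, lag, lit, when}

// Assumption: a local session for a standalone run.
val spark = SparkSession.builder().master("local[*]").appName("lag-offset-sketch").getOrCreate()

// Hypothetical sample data standing in for the question's appSpend DataFrame.
val appSpend = spark.createDataFrame(Seq(
  ("a", "2022-01-01", 10.0),
  ("a", "2022-01-02", 15.0),
  ("b", "2022-01-01", 7.0)
)).toDF("app", "date", "Spend")

val win = Window.partitionBy("app").orderBy("date")

// Difference between the current and the previous row's spend.
val diff = col("Spend") - col("prevSpend")

val appSpendChange = appSpend
  // The Scala API has no one-argument lag, so the offset (1 row back) is explicit.
  .withColumn("prevSpend", lag(col("Spend"), 1).over(win))
  // The first row of each partition has no previous row, so diff is null there; map it to 0.
  .withColumn("spendChange", when(isnull(diff), lit(0.0)).otherwise(diff))

appSpendChange.show()
```

Unlike PySpark, where the offset of lag defaults to 1, every Scala overload listed in the error message requires it, which is why a direct translation of the PySpark example fails to compile.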
