Spark version: 3.0.1. Amazon Deequ version: deequ-2.0.0-spark-3.1.jar
I am running the following code in my local spark shell:
import com.amazon.deequ.analyzers.runners.{AnalysisRunner, AnalyzerContext}
import com.amazon.deequ.analyzers.runners.AnalyzerContext.successMetricsAsDataFrame
import com.amazon.deequ.analyzers.{Compliance, Correlation, Size, Completeness, Mean,
ApproxCountDistinct, Maximum, Minimum, Entropy}
// datasourcedf is a DataFrame loaded earlier in the session (not shown)
val analysisResult: AnalyzerContext = {
  AnalysisRunner.onData(datasourcedf)
    .addAnalyzer(Size())
    .addAnalyzer(Completeness("customerNumber"))
    .addAnalyzer(ApproxCountDistinct("customerNumber"))
    .addAnalyzer(Minimum("creditLimit"))
    .addAnalyzer(Mean("creditLimit"))
    .addAnalyzer(Maximum("creditLimit"))
    .addAnalyzer(Entropy("creditLimit"))
    .run() // the error below is thrown here
}
Error:
java.lang.NoSuchMethodError: 'scala.Option org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction.toAggregateExpression$default$2()'
at org.apache.spark.sql.DeequFunctions$.withAggregateFunction(DeequFunctions.scala:31)
at org.apache.spark.sql.DeequFunctions$.stateful_approx_count_distinct(DeequFunctions.scala:60)
at com.amazon.deequ.analyzers.ApproxCountDistinct.aggregationFunctions(ApproxCountDistinct.scala:52)
at com.amazon.deequ.analyzers.runners.AnalysisRunner$.$anonfun$runScanningAnalyzers$3(AnalysisRunner.scala:319)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
at scala.collection.immutable.List.flatMap(List.scala:355)
at com.amazon.deequ.analyzers.runners.AnalysisRunner$.liftedTree1$1(AnalysisRunner.scala:319)
at com.amazon.deequ.analyzers.runners.AnalysisRunner$.runScanningAnalyzers(AnalysisRunner.scala:318)
at com.amazon.deequ.analyzers.runners.AnalysisRunner$.doAnalysisRun(AnalysisRunner.scala:167)
at com.amazon.deequ.analyzers.runners.AnalysisRunBuilder.run(AnalysisRunBuilder.scala:110)
... 63 elided
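Can someone let me know how to resolve this?
Answer:
You cannot use Deequ 2.0.0 with Spark 3.0. The deequ-2.0.0-spark-3.1 jar is built against Spark 3.1 and is binary incompatible with Spark 3.0 because of changes in Spark's internals, which is why the call fails with NoSuchMethodError at runtime. With Spark 3.0, you need to use version 1.2.2-spark-3.0.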
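As a rough illustration of the version rule in the answer, the sketch below compares the Spark version with the `spark-X.Y` suffix of a Deequ jar name. `DeequCompat` and its methods are hypothetical helpers written for this post, not part of Deequ or Spark, and the check assumes the jar follows the `deequ-<version>-spark-<major.minor>` naming seen in the question.

```scala
// Hypothetical helper, not part of Deequ or Spark: checks whether a Deequ jar's
// "spark-X.Y" suffix matches the major.minor version of the running Spark.
object DeequCompat {
  // "3.0.1" -> "3.0"
  def majorMinor(sparkVersion: String): String =
    sparkVersion.split("\\.").take(2).mkString(".")

  // "deequ-2.0.0-spark-3.1.jar" -> Some("3.1"); None if there is no such suffix
  def jarTargetsSpark(jarName: String): Option[String] =
    """spark-(\d+\.\d+)""".r.findFirstMatchIn(jarName).map(_.group(1))

  // true only when the jar's target Spark line equals the running Spark line
  def isCompatible(sparkVersion: String, jarName: String): Boolean =
    jarTargetsSpark(jarName).contains(majorMinor(sparkVersion))
}

// The combination from the question does not match; the one from the answer does.
println(DeequCompat.isCompatible("3.0.1", "deequ-2.0.0-spark-3.1.jar"))
println(DeequCompat.isCompatible("3.0.1", "deequ-1.2.2-spark-3.0.jar"))
```

In a spark shell, `spark.version` can be passed as the first argument instead of a hard-coded string. A matching suffix is only a naming check and does not by itself prove binary compatibility.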
Tags: amazon-web-services scala apache-spark apache-spark-sql amazon-deequ
