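I'm new to Spark. I ran into the following problem while running logistic regression with Spark. My question: my training set has 5 columns of data, and I pass 5 arguments when building user_log, so why does the error say the arguments don't match? Any guidance would be appreciated; the details are below.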
scala> val data = spark.sparkContext.
| textFile("file:////media/sf_shared_file/train.csv").
| map(_.split(",")).
| map(p=> user_log(Vectors.dense(p(0).toDouble,p(1).toDouble,p(2).toDouble,p(3).toDouble,p(4).toString))).toDF()
<console>:35: error: not enough arguments for method apply: (features: org.apache.spark.ml.linalg.Vector, lable: Double)user_log in object user_log.
Unspecified value parameter lable.
map(p=> user_log(Vectors.dense(p(0).toDouble,p(1).toDouble,p(2).toDouble,p(3).toDouble,p(4).toString))).toDF()
^
The user_log class:
case class user_log(features:org.apache.spark.ml.linalg.Vector, lable: Double)
train.csv:
user_id age_range gender merchant_id label
34176 6 0 944 -1
34176 6 0 412 -1
34176 6 0 1945 -1
34176 6 0 4752 -1
Reply from a uj5u.com user:
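In your call, all five values sit inside Vectors.dense(...), so user_log receives only one argument (the vector) and the compiler reports that lable is missing. Close the Vectors.dense(...) call after the fourth value so that the fifth column is passed as the second argument of user_log, and convert it with toDouble, since lable is declared as Double:
map(p => user_log(Vectors.dense(p(0).toDouble, p(1).toDouble, p(2).toDouble, p(3).toDouble), p(4).toDouble)).toDF()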
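The per-row conversion can also be checked outside Spark. The following is a minimal sketch, not the original poster's code: `ParseSketch`, `LabeledRow`, and `parseRow` are hypothetical names, and a plain `Array[Double]` stands in for `Vectors.dense(...)` so that the snippet runs without Spark on the classpath. It assumes the file is comma-separated (as the `split(",")` in the question implies) and that it may begin with the header line shown in the train.csv listing, which `toDouble` would otherwise reject with a `NumberFormatException`.

```scala
import scala.util.Try

object ParseSketch {
  // Hypothetical stand-in for the thread's user_log case class.
  // features is an Array[Double] here instead of an ml.linalg.Vector.
  final case class LabeledRow(features: Array[Double], label: Double)

  // Parse one comma-separated line into four features plus a label.
  // Returns None for a header row or any line that is not five numbers.
  def parseRow(line: String): Option[LabeledRow] = {
    val cols = line.split(",").map(_.trim)
    if (cols.length != 5) None
    else Try {
      // The feature array is built (and closed) first, then the label is
      // passed separately, so the case class receives exactly two arguments.
      LabeledRow(cols.take(4).map(_.toDouble), cols(4).toDouble)
    }.toOption
  }

  // Sample input taken from the train.csv listing in the question,
  // written with commas on the assumption stated above.
  val sample: Seq[String] = Seq(
    "user_id,age_range,gender,merchant_id,label",
    "34176,6,0,944,-1",
    "34176,6,0,412,-1"
  )

  // The header line is dropped because it does not parse as numbers.
  val rows: Seq[LabeledRow] = sample.flatMap(line => parseRow(line))
}
```

In spark-shell the same two-argument shape applies to the call in the reply, with `Vectors.dense(...)` taking the place of the array.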
Tags: Spark
