Spark SQL: when I run the following in spark-shell, the last statement throws an error. What is wrong with that last statement?
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext._
case class Person(name: String, age: Int)
val people = sc.textFile("file:///home/liuyang/lytest/testPerson.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
val peopleDataFrame = sqlContext.createDataFrame(people)
peopleDataFrame.registerTempTable("people")
val teenagers= sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
teenagers.count()
teenagers.map(t => "Name: " + t(0)).collect().foreach(println)
teenagers.map(a=>(if(a(1).toString.toInt>13) org.apache.spark.sql.Row(a(0),a(1).toString.toInt+1000) else org.apache.spark.sql.Row(a(0),a(1).toString.toInt+9000))).collect()
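A likely cause of the error in the last statement (hedged, since the post does not include the error message): the query selects only `name`, so each Row in `teenagers` has a single column and `a(1)` is out of range. Below is a minimal sketch for spark-shell that selects `age` as well. The sample rows built with `sc.parallelize` are hypothetical stand-ins for testPerson.txt, and going through `.rdd` is an assumption made so that the map returns an RDD[Row] on newer Spark versions too.

```scala
// Assumes a spark-shell session, where `sc` (SparkContext) is predefined.
import org.apache.spark.sql.{Row, SQLContext}

val sqlContext = new SQLContext(sc)
case class Person(name: String, age: Int)

// Hypothetical sample rows standing in for testPerson.txt.
val people = sc.parallelize(Seq(Person("Ann", 13), Person("Bob", 17), Person("Cid", 30)))
sqlContext.createDataFrame(people).registerTempTable("people")

// Select both columns so that index 1 (age) exists in every Row.
val teenagers = sqlContext.sql("SELECT name, age FROM people WHERE age >= 13 AND age <= 19")

// Read age as an Int directly instead of converting through a String.
val adjusted = teenagers.rdd.map { a =>
  val age = a.getInt(1)
  if (age > 13) Row(a(0), age + 1000) else Row(a(0), age + 9000)
}.collect()

adjusted.foreach(println)
```

If only the names are needed in the result, keeping `SELECT name` and dropping every use of `a(1)` is the alternative fix.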
A second question:
val data = Array(1,2,3,4,5)
val distData = sc.parallelize(data)
Creating an RDD this way works, but
val data = Array(“asd”,”www”,”qqq”,”weq”,”qwewq”)
val distData = sc.parallelize(data)
this throws an error. How should I change it to create an RDD of type String?
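One likely cause, assuming the snippet was pasted into the shell exactly as shown: the string literals are wrapped in typographic quotes (“ ”) rather than straight ASCII double quotes, and Scala does not accept those as string delimiters. A minimal sketch for spark-shell, where `sc` is predefined:

```scala
// Straight ASCII double quotes (") delimit Scala string literals.
val data = Array("asd", "www", "qqq", "weq", "qwewq")

// `sc` is the SparkContext that spark-shell predefines.
val distData = sc.parallelize(data) // an RDD[String]

distData.collect().foreach(println)
```

No other change should be needed: `sc.parallelize` infers the element type from the array, so an `Array[String]` gives an `RDD[String]` in the same way an `Array[Int]` gives an `RDD[Int]`.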
