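As the title says: the original timestamp is a STRING column in the format (%d-%b-%Y %H.%M.%S.%f %p), for example 03-DEC-15 12.00.00.000000 AM
For ordinary type conversions I can call cast("double") inside withColumn.
But how should I handle a string like this when I want to convert it to a DATETIME type, or to INT?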
Reply from a helpful uj5u.com user:
scala> spark.version
res11: String = 2.0.2
scala> val df = sc.parallelize(Seq("03-DEC-15 12.00.00.000000 AM")).toDF
df: org.apache.spark.sql.DataFrame = [value: string]
scala> df.show(false)
+----------------------------+
|value                       |
+----------------------------+
|03-DEC-15 12.00.00.000000 AM|
+----------------------------+
scala> val df2 = df.withColumn("dateType", unix_timestamp($"value", "dd-MMM-yy hh.mm.ss.SSSSSS a"))
df2: org.apache.spark.sql.DataFrame = [value: string, dateType: bigint]
scala> df2.show(false)
+----------------------------+----------+
|value                       |dateType  |
+----------------------------+----------+
|03-DEC-15 12.00.00.000000 AM|1449118800|
+----------------------------+----------+
scala> df2.withColumn("newFormat", from_unixtime($"dateType")).show(false)
+----------------------------+----------+-------------------+
|value                       |dateType  |newFormat          |
+----------------------------+----------+-------------------+
|03-DEC-15 12.00.00.000000 AM|1449118800|2015-12-03 00:00:00|
+----------------------------+----------+-------------------+
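The transcript above yields a bigint (epoch seconds) and, through from_unixtime, a formatted string. Since the question asks for a DATETIME or INT column, one possible follow-up is to cast the parsed value. This is only a sketch, assuming the Spark 2.x parsing behaviour shown in the transcript; the local SparkSession and the column names epochSeconds, ts and epochInt are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.unix_timestamp

// In spark-shell a session named `spark` already exists; this line is only
// needed when running the sketch as a standalone script (assumption).
val spark = SparkSession.builder().master("local[*]").appName("ts-cast-sketch").getOrCreate()
import spark.implicits._

val df = Seq("03-DEC-15 12.00.00.000000 AM").toDF("value")

// Same pattern as in the answer: parse the string into epoch seconds (bigint).
val parsed = unix_timestamp($"value", "dd-MMM-yy hh.mm.ss.SSSSSS a")

val typed = df
  .withColumn("epochSeconds", parsed)          // bigint, as in the transcript
  .withColumn("ts", parsed.cast("timestamp"))  // seconds since epoch -> timestamp type
  .withColumn("epochInt", parsed.cast("int"))  // narrowed to a 32-bit integer

typed.printSchema()
```

Casting the bigint to "timestamp" interprets it as seconds since the epoch, so the ts column carries a real timestamp type rather than a string. The sub-second part of the input is not preserved, because unix_timestamp only returns whole seconds.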
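See the Java SimpleDateFormat documentation for the meaning of the pattern letters,
and the Spark unix_timestamp function (a built-in SQL function, not a UDF).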
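To see what the pattern letters do outside of Spark, the same pattern can be tried directly against java.text.SimpleDateFormat. The following is a minimal sketch, not part of the original answer: the object name TimestampPatternSketch and its two helper methods are hypothetical, and the locale and time zone are pinned to English and UTC purely so that the result does not depend on the machine.

```scala
import java.text.SimpleDateFormat
import java.util.{Locale, TimeZone}

object TimestampPatternSketch {
  // Same pattern string as in the answer above.
  val Pattern = "dd-MMM-yy hh.mm.ss.SSSSSS a"

  // A fresh formatter per call, because SimpleDateFormat is not thread-safe.
  // Locale.ENGLISH and UTC are assumptions made for reproducibility.
  private def formatter(): SimpleDateFormat = {
    val fmt = new SimpleDateFormat(Pattern, Locale.ENGLISH)
    fmt.setTimeZone(TimeZone.getTimeZone("UTC"))
    fmt
  }

  // Milliseconds since the epoch for the parsed string.
  def epochMillis(s: String): Long = formatter().parse(s).getTime

  // Whole seconds since the epoch, comparable to what unix_timestamp returns.
  def epochSeconds(s: String): Long = epochMillis(s) / 1000L
}
```

With UTC pinned, TimestampPatternSketch.epochSeconds("03-DEC-15 12.00.00.000000 AM") gives 1449100800. The transcript shows 1449118800, which is 18000 seconds (5 hours) later and would be consistent with a session running at UTC-5, so the epoch value depends on the time zone in effect. Note also that S means milliseconds in SimpleDateFormat: a six-digit fraction such as 123456 is read as 123456 milliseconds (about 123 seconds), not as microseconds, so the pattern is only safe here because the fraction is all zeros.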
Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/67593.html
Tags: Spark
