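The Elasticsearch index holds roughly 1 billion documents in total. I take the most recent month of data (about 50 million documents), load it with Spark SQL, and then run the related business logic. The load is abnormally slow. I would appreciate it if anyone who has done similar optimization could share their experience. The loading code is attached below: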
val vehpassDataFrame = sparkSession.sqlContext.read.format("org.elasticsearch.spark.sql").options(options).load("alias_veh_pass/doc")
vehpassDataFrame.select("hphm","hpzl","jgsj","gctp1","gcbh","lhy_syxz").createTempView("alias_veh_pass")
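One direction that might be worth trying, sketched here as a configuration fragment and not as a verified fix: let Elasticsearch restrict the result to the last month and to the six columns that are actually used, so Spark does not have to pull anything else over the wire. The sketch assumes that `jgsj` is mapped as a date field in the index and that `options` can be rebuilt as below; the host, port, and scroll size are placeholder values, not taken from the original post. `es.query`, `es.read.field.include`, and `es.scroll.size` are elasticsearch-hadoop settings.

```scala
import org.apache.spark.sql.SparkSession

object LoadRecentVehPass {
  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder()
      .appName("load-veh-pass")
      .getOrCreate()

    // Assumption: jgsj is a date field, so a range query with date math
    // can select the most recent month on the Elasticsearch side.
    val lastMonthQuery = """{"query":{"range":{"jgsj":{"gte":"now-1M/d"}}}}"""

    val options = Map(
      "es.nodes" -> "localhost",      // placeholder host
      "es.port" -> "9200",            // placeholder port
      "es.query" -> lastMonthQuery,   // filter in Elasticsearch, not in Spark
      // Only fetch the fields that the later select() uses.
      "es.read.field.include" -> "hphm,hpzl,jgsj,gctp1,gcbh,lhy_syxz",
      "es.scroll.size" -> "5000"      // placeholder; documents per scroll request
    )

    val vehpassDataFrame = sparkSession.read
      .format("org.elasticsearch.spark.sql")
      .options(options)
      .load("alias_veh_pass/doc")

    vehpassDataFrame
      .select("hphm", "hpzl", "jgsj", "gctp1", "gcbh", "lhy_syxz")
      .createOrReplaceTempView("alias_veh_pass")
  }
}
```

Whether this helps depends on the index mapping, the shard count (the connector reads shards in parallel as Spark partitions), and the cluster itself, so the values above would need to be measured against the real data.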
Tags: Spark
