Error when writing data from Spark to Elasticsearch
The call EsSpark.saveToEs(result, "userprofile/users", Map("es.mapping.id" -> "uid")) fails with the following error:
org.elasticsearch.hadoop.EsHadoopException: Could not write all entries [3/1024] (maybe ES was overloaded?). Bailing out...
at org.elasticsearch.hadoop.rest.RestRepository.flush(RestRepository.java:250)
at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:201)
at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:163)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:49)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
The RDD that Spark writes contains 50 million rows, and the ES cluster has two nodes.
Reply from a uj5u.com user:
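import org.apache.spark.SparkConf
val conf = new SparkConf()
conf.set("es.nodes", elasticsearch_nodes)
conf.set("es.batch.write.retry.count", "10") // default is 3 retries; a negative value such as -1 retries indefinitely (use with caution)
conf.set("es.batch.write.retry.wait", "60s") // default wait between retries is 10s; increase it as appropriate (unit suffix given explicitly)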
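Putting the retry settings together with the original write call, a minimal sketch might look like the following. The node addresses, the sample records, the batch size of 500, and the coalesce(4) partition count are all assumptions made for illustration, not values from the question. Besides the retry settings, it also lowers es.batch.size.entries (default 1000) and disables es.batch.write.refresh (default true), two other elasticsearch-hadoop settings that may help when a small cluster rejects bulk requests. Whether they help in this case would need to be tested.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark.rdd.EsSpark

object SaveToEsSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical node list; replace with the real cluster addresses.
    val elasticsearchNodes = "es-node1:9200,es-node2:9200"

    val conf = new SparkConf()
      .setAppName("save-to-es-sketch")
      .set("es.nodes", elasticsearchNodes)
      // Retry rejected bulk entries more often and wait longer between attempts.
      .set("es.batch.write.retry.count", "10")
      .set("es.batch.write.retry.wait", "60s")
      // Assumed tuning: smaller bulk requests, no index refresh after each bulk.
      .set("es.batch.size.entries", "500")
      .set("es.batch.write.refresh", "false")

    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(conf)

    // Placeholder records standing in for the 50-million-row RDD.
    val result = sc.parallelize(Seq(
      Map("uid" -> "u1", "name" -> "alice"),
      Map("uid" -> "u2", "name" -> "bob")
    ))

    // Fewer partitions means fewer tasks sending bulk requests at the same time.
    EsSpark.saveToEs(result.coalesce(4), "userprofile/users", Map("es.mapping.id" -> "uid"))

    sc.stop()
  }
}
```

Each Spark task opens its own bulk writer, so reducing the partition count trades write throughput for lower pressure on the two ES nodes.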
