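When I try to read data from Elasticsearch through Spark, I get the error below. The error message does not say where the problem is.
The query in body2 works when I run it in the Elasticsearch Dev Tools console.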
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark import SparkConf
from pyspark.sql import SparkSession
body2 = {
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "@timestamp": {
                            "lte": "2022-05-03T09:25:15.000-03:00",
                            "gte": "2022-05-04T09:25:15.000-03:00"
                        }
                    }
                },
                {
                    "match": {
                        "type.keyword": "TABLA"
                    }
                }
            ]
        }
    },
    "size": 10
}
es_read_conf = {
    "es.nodes": "10.45.15.93",
    "es.port": "9200",
    "es.query": body2,
    "es.nodes.wan.only": "true",
    "es.resource": "indice1/TABLA",
    "es.net.http.auth.user": "usuario1",
    "es.net.http.auth.pass": "rsl242442j"
}
es_rdd = sc.newAPIHadoopRDD(
    inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf=es_read_conf)
This is the error; I cannot tell which part of my code causes it:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/spark/python/pyspark/context.py", line 859, in newAPIHadoopRDD
jrdd = self._jvm.PythonRDD.newAPIHadoopRDD(self._jsc, inputFormatClass, keyClass,
File "/opt/spark/python/lib/py4j-0.10.9.3-src.zip/py4j/java_gateway.py", line 1321, in __call__
File "/opt/spark/python/pyspark/sql/utils.py", line 111, in deco
return f(*a, **kw)
File "/opt/spark/python/lib/py4j-0.10.9.3-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: java.lang.ClassCastException: class java.util.HashMap cannot be cast to class java.lang.String (java.util.HashMap and java.lang.String are in module java.base of loader 'bootstrap')
at org.apache.spark.api.python.PythonHadoopUtil$.$anonfun$mapToConf$1(PythonHadoopUtil.scala:160)
at org.apache.spark.api.python.PythonHadoopUtil$.$anonfun$mapToConf$1$adapted(PythonHadoopUtil.scala:160)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
...
I have gone through the code but could not find a solution.
Thanks, everyone.
Answer from a uj5u.com community member:
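As the error message indicates, the value of es.query must be a string, not a dictionary. The traceback shows Spark failing in PythonHadoopUtil.mapToConf, where each entry of the conf dict is cast to a string; the nested dict reaches the JVM as a java.util.HashMap, which cannot be cast to java.lang.String. Pass the query as a JSON string instead: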
body2 = """{
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "@timestamp": {
                            "lte": "2022-05-03T09:25:15.000-03:00",
                            "gte": "2022-05-04T09:25:15.000-03:00"
                        }
                    }
                },
                {
                    "match": {
                        "type.keyword": "TABLA"
                    }
                }
            ]
        }
    },
    "size": 10
}"""
You can find a reference here
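If you would rather keep the query as a Python dict, one possible approach is to serialize it with json.dumps just before building the configuration. The following is a minimal sketch, not a tested Spark job: it reuses the values from the question, and the final check on value types is only an illustrative safeguard added here, not part of any Spark or elasticsearch-hadoop API.

```python
import json

# The query from the question, kept as a dict so it stays easy to edit.
body2 = {
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "@timestamp": {
                            "lte": "2022-05-03T09:25:15.000-03:00",
                            "gte": "2022-05-04T09:25:15.000-03:00",
                        }
                    }
                },
                {"match": {"type.keyword": "TABLA"}},
            ]
        }
    },
    "size": 10,
}

es_read_conf = {
    "es.nodes": "10.45.15.93",
    "es.port": "9200",
    # json.dumps converts the dict into the JSON string that es.query needs.
    "es.query": json.dumps(body2),
    "es.nodes.wan.only": "true",
    "es.resource": "indice1/TABLA",
    "es.net.http.auth.user": "usuario1",
    "es.net.http.auth.pass": "rsl242442j",
}

# Illustrative safeguard (an assumption of this sketch): report any
# non-string value before Spark tries to cast it on the JVM side.
non_string = {
    key: type(value).__name__
    for key, value in es_read_conf.items()
    if not isinstance(value, str)
}
if non_string:
    raise TypeError(f"conf values must be strings, got: {non_string}")

# es_read_conf can now be passed as conf= to sc.newAPIHadoopRDD(...),
# exactly as in the question.
```

Separately, the range as written asks for timestamps that are at or before 2022-05-03 and also at or after 2022-05-04, which no document can satisfy. If that is not intended, the two bounds probably need to be swapped.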
