我創建了一個包含一個主服務器和兩個從服務器的 Spark 集群,每個集群都在一個 Docker 容器上。我用命令啟動它start-all.sh。
我可以從我的本地機器訪問 UI localhost:8080,它顯示集群啟動良好:
Spark UI 的螢屏截圖
然后我嘗試使用以下命令 spark-submit 從我的主機(而不是 Docker 容器)提交一個簡單的 Python 腳本: spark-submit --master spark://spark-master:7077 test.py
測驗.py:
import pyspark
conf = pyspark.SparkConf().setAppName('MyApp').setMaster('spark://spark-master:7077')
sc = pyspark.SparkContext(conf=conf)
但是控制臺向我回傳了這個錯誤:
22/01/26 09:20:39 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://spark-master:7077...
22/01/26 09:20:40 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master spark-master:7077
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Failed to connect to spark-master:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
... 4 more
Caused by: java.net.UnknownHostException: spark-master
at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:797)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1509)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1368)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1302)
at java.base/java.net.InetAddress.getByName(InetAddress.java:1252)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)
at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:202)
at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:48)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:182)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:168)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:604)
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:985)
at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:505)
at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:416)
at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:475)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
我也嘗試使用簡單的 scala 腳本,只是為了嘗試訪問集群,但我遇到了同樣的錯誤。
您知道如何使用 python 腳本訪問我的集群嗎?
(編輯)
我忘了指定我已經在我的主人和我的奴隸之間創建了一個 docker 網路。因此,在 MrTshoot 和 Gaarv 的幫助下,我用我的主容器的 ip 替換了 spark-master(在 spark://spark-master:7077 中)(您可以使用命令獲取它docker network inspect my-network)。
這是作業!謝謝!
uj5u.com熱心網友回復:
當您指定.setMaster('spark://spark-master:7077')時,它意味著“在 DNS 地址“spark-master”和埠 7077 上到達本地機器無法決議的 spark 集群。
因此,為了讓您的主機訪問集群,您必須指定 Spark 集群的 Docker DNS / IP 地址,檢查本地計算機上的“docker0”介面并用它替換“spark-master”。
uj5u.com熱心網友回復:
您無法使用其服務名稱連接主機上的 Docker 服務。您應該設定 DNS 或 IP 服務或使用以下技巧:
公開您的 Spark 集群埠。
打開您的 /etc/hosts 并在其上添加以下內容
127.0.0.1 本地主機火花大師
::1 本地主機火花大師
轉載請註明出處,本文鏈接:https://www.uj5u.com/yidong/421887.html
標籤:
上一篇:TypeError:_()接受2個位置引數,但給定了4個Databricks
下一篇:更改pyspark中的日期格式
