I have built a Docker image for a program that calls findspark.init(). The program runs fine on my local machine. When I try to run the image with docker run -p 5000:5000 imgname:latest, I get the following error:
Traceback (most recent call last):
  File "app.py", line 37, in <module>
    findspark.init()
  File "/usr/local/lib/python3.8/site-packages/findspark.py", line 129, in init
    spark_home = find()
  File "/usr/local/lib/python3.8/site-packages/findspark.py", line 35, in find
    raise ValueError(
ValueError: Couldn't find Spark, make sure SPARK_HOME env is set or Spark is in an expected location (e.g. from homebrew installation).
Can anyone suggest a way to fix this? When I try to build the program without the findspark call, I run into other Spark-related errors. Here is my Dockerfile:
#Use python as base image
FROM python:3.8
#Use working dir app
WORKDIR /app
#Copy contents of current dir to /app
ADD . /app
#Install required packages
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
#Open port 5000
EXPOSE 5000
#Set environment variable
ENV NAME analytic
#Run python program
CMD python app.py
This is the part of the code where the image fails:
### multiple lines of importing libraries and then
# Spark imports
import findspark
findspark.init()
import pyspark
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql import functions as F
The requirements.txt file can be seen at this link.
Answer from a uj5u.com user:
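Even though you are running pyspark, Spark still requires Java, so you need to install Java in the image. In addition, if you still want to use findspark, you can set the SPARK_HOME directory explicitly: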
RUN apt-get update && apt-get install -y default-jre
ENV SPARK_HOME /usr/local/lib/python3.8/site-packages/pyspark
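To confirm that both requirements are actually met inside the container, a small preflight check can run before findspark.init(). The following is only a sketch: the function name check_spark_prerequisites and its messages are hypothetical, and it uses nothing but the Python standard library.

```python
import os
import shutil


def check_spark_prerequisites(env=None):
    """Return a list of problems found; an empty list means both checks passed.

    Hypothetical helper, not part of findspark or pyspark.
    """
    env = os.environ if env is None else env
    problems = []

    # Spark starts a JVM, so a `java` executable has to be on PATH.
    if shutil.which("java", path=env.get("PATH", "")) is None:
        problems.append("java not found on PATH")

    # The error message asks for SPARK_HOME; it should name a real directory.
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    elif not os.path.isdir(spark_home):
        problems.append("SPARK_HOME is not a directory: " + spark_home)

    return problems
```

Calling check_spark_prerequisites() at the top of app.py and printing the result would show which of the two conditions is missing in the image before the traceback appears.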
All in all, your Dockerfile should look like:
#Use python as base image
FROM python:3.8
#Install Java, which Spark needs at runtime
RUN apt-get update && apt-get install -y default-jre
#Use working dir app
WORKDIR /app
#Copy contents of current dir to /app
ADD . /app
#Install required packages
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
#Open port 5000
EXPOSE 5000
#Set environment variable
ENV NAME analytic
ENV SPARK_HOME /usr/local/lib/python3.8/site-packages/pyspark
#Run python program
CMD python app.py
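The SPARK_HOME value above hard-codes the python3.8 site-packages path, so it has to be updated if the base image changes. As an alternative, the directory could be derived at startup from wherever pip installed pyspark. This is a rough sketch under that assumption; locate_package_dir and ensure_spark_home are hypothetical helper names, not part of findspark.

```python
import importlib.util


def locate_package_dir(package="pyspark"):
    """Return the install directory of a package, or None if it is not found."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    # For a regular package this is the directory containing __init__.py.
    return list(spec.submodule_search_locations)[0]


def ensure_spark_home(env, package="pyspark"):
    """Set SPARK_HOME in `env` if it is missing and the package can be located."""
    if env.get("SPARK_HOME"):
        # Respect a value that was already configured, e.g. via ENV in the Dockerfile.
        return env["SPARK_HOME"]
    home = locate_package_dir(package)
    if home is not None:
        env["SPARK_HOME"] = home
    return home
```

In app.py this would be called as ensure_spark_home(os.environ) just before findspark.init(). Java still has to be installed in the image either way.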
