I want to read a file in Spark Scala whose name is: monthlyPurchaseFile{202205}-May.TXT
I am using the following code:
val df = spark.read.text("handel_special_ch/monthlyPurchaseFile{202205}-May.TXT")
But I get the following exception:
org.apache.spark.sql.AnalysisException: Path does not exist: file:/home/hdp_batch_datalake_dev/handel_special_ch/monthlyPurchaseFile{202205}-May.TXT
at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$3(DataSource.scala:792)
at org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:372)
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Please advise how I can read a file whose name contains the characters { and }.
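Answer from a uj5u.com user:
The path you pass to spark.read.text is treated as a glob pattern (a file-name wildcard pattern, not a full regular expression). Because { and } are special characters in glob syntax, Spark tries to match the path against that pattern instead of looking for literal braces, which is why the file is not found. The ? character matches any single character, so the following should work, although it would also match any other file that has some other character in those two positions:
val df = spark.read.text("handel_special_ch/monthlyPurchaseFile?202205?-May.TXT")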
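To illustrate how loosely the ? wildcard matches, here is a minimal sketch that uses the JDK's java.nio glob matcher as a stand-in. This is an assumption made for illustration: Spark resolves paths through Hadoop's own glob implementation, not java.nio, although ? means "exactly one character" in both. The names WildcardDemo and matchesPattern are hypothetical.

```scala
import java.nio.file.{FileSystems, Paths}

object WildcardDemo {
  // java.nio glob matcher, used only as a stand-in for Hadoop's glob matching.
  private val matcher =
    FileSystems.getDefault.getPathMatcher("glob:monthlyPurchaseFile?202205?-May.TXT")

  // Hypothetical helper: does this file name match the wildcard pattern?
  def matchesPattern(fileName: String): Boolean =
    matcher.matches(Paths.get(fileName))

  def main(args: Array[String]): Unit = {
    // The intended file matches, because each ? stands in for one brace.
    println(matchesPattern("monthlyPurchaseFile{202205}-May.TXT")) // true
    // Any other single character in those positions matches as well.
    println(matchesPattern("monthlyPurchaseFileX202205Y-May.TXT")) // true
    // Each ? must consume exactly one character, so this name does not match.
    println(matchesPattern("monthlyPurchaseFile202205-May.TXT"))   // false
  }
}
```

If the directory can contain similarly named files, the wildcard form may pick up more than the one intended file.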
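Answer from a uj5u.com user:
The backslash (\) acts as the escape character in the path pattern; inside a Scala string literal it has to be written as \\. Escaping both braces makes Spark treat them as literal characters, so the following code works as expected and solves the problem:
val df = spark.read.text("handel_special_ch/monthlyPurchaseFile\\{202205\\}-May.TXT")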
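When file names come from elsewhere and cannot be escaped by hand, the same idea can be wrapped in a small helper. The sketch below is not part of Spark's API: GlobEscape and escapeGlob are hypothetical names, and the set of characters to escape is an assumption based on the common glob metacharacters (\ { } [ ] * ?).

```scala
object GlobEscape {
  // Assumed set of glob metacharacters; adjust if your file system layer differs.
  private val globMeta: Set[Char] = Set('\\', '{', '}', '[', ']', '*', '?')

  // Hypothetical helper: prefix every glob metacharacter with a backslash
  // so the path is matched literally.
  def escapeGlob(path: String): String =
    path.map(c => if (globMeta(c)) "\\" + c else c.toString).mkString

  def main(args: Array[String]): Unit = {
    val raw = "handel_special_ch/monthlyPurchaseFile{202205}-May.TXT"
    // prints: handel_special_ch/monthlyPurchaseFile\{202205\}-May.TXT
    println(escapeGlob(raw))
    // With a SparkSession named `spark` in scope, as in the question:
    // val df = spark.read.text(escapeGlob(raw))
  }
}
```

The printed value contains single backslashes; the doubled \\ is only needed when the path is typed as a Scala string literal.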
