ocrmypdf: cannot find the source PDF?

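I want to use ocrmypdf to convert some image-only PDF files into readable (searchable) PDFs.

I tried the following simple code. (invoice.pdf is of course in the same folder as the Python script, and output.pdf should be generated.)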

import ocrmypdf
if __name__ == '__main__':
  fn = r"C:\Users\Polzi\Documents\DEV\Python-Diverses\PDFOCR\invoice.pdf"
  ocrmypdf.ocr(fn, 'output.pdf', deskew=True)

Unfortunately, I get this error message:

$ python exPDFOCR.py
[WinError 2] Das System kann die angegebene Datei nicht finden
Traceback (most recent call last):
  File "C:\Users\Polzi\Documents\DEV\Python-Diverses\PDFOCR\exPDFOCR.py", line 25, in <module>
    ocrmypdf.ocr('invoice.pdf', 'output.pdf', deskew=True)
  File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\api.py", line 336, in ocr
    check_options(options, plugin_manager)
  File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\_validation.py", line 271, in check_options
    ocr_engine_languages = plugin_manager.hook.get_ocr_engine().languages(options)
  File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\builtin_plugins\tesseract_ocr.py", line 155, in languages
    return tesseract.get_languages()
  File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\_exec\tesseract.py", line 143, in get_languages
    proc = run(
  File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\subprocess\__init__.py", line 53, in run
    proc = subprocess_run(args, env=env, **kwargs)
  File "c:\users\polzi\appdata\local\programs\python\python39\lib\subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "c:\users\polzi\appdata\local\programs\python\python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "c:\users\polzi\appdata\local\programs\python\python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] Das System kann die angegebene Datei nicht finden

Why can't it find the file, even though it is in the same folder as the .py file being executed?

Answer from a uj5u.com user:

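Sometimes the first error message is misleading because it does not state a clear cause. Here, the main message "The system cannot find the specified file" (in the German output above: "Das System kann die angegebene Datei nicht finden") leads the user to assume that the file name is wrong, which is what happened in this case.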

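What the error should have reported is that a file required by a dependency was not found. The traceback supports this: the exception is raised while ocrmypdf calls tesseract.get_languages(), which starts an external process through subprocess, so the file that cannot be found is the Tesseract program, not invoice.pdf. This can happen when Tesseract, or the related Leptonica or language data files, are not in the expected location because they were never installed or were installed incorrectly.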
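One way to tell the two cases apart is to check separately whether the input PDF exists and whether a tesseract executable can be found on PATH. The following is only a minimal sketch using the standard library; the helper name diagnose is hypothetical and is not part of ocrmypdf.

```python
import shutil
from pathlib import Path


def diagnose(input_pdf, executable="tesseract"):
    """Return a list of human-readable problems; empty if none were found.

    Hypothetical helper for illustration; not part of ocrmypdf.
    """
    problems = []
    # Case 1: the input file itself is missing.
    if not Path(input_pdf).is_file():
        problems.append(f"input file not found: {input_pdf}")
    # Case 2: the external program cannot be located on PATH.
    if shutil.which(executable) is None:
        problems.append(f"executable not found on PATH: {executable}")
    return problems


if __name__ == "__main__":
    for problem in diagnose("invoice.pdf"):
        print(problem)
```

If only the second message is printed, the PDF path is fine and the problem lies with the Tesseract installation.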

Reportedly, after installing Tesseract on Windows from https://github.com/UB-Mannheim/tesseract/wiki, "the script is now working".
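If Tesseract is installed but its folder is not on PATH, the same error can still appear. As a rough sketch, the install folder can be prepended to PATH for the current Python process before ocrmypdf is called. The folder C:\Program Files\Tesseract-OCR below is an assumption about where the installer put the files, and ensure_on_path is a hypothetical helper, not an ocrmypdf function.

```python
import os
import shutil


def ensure_on_path(directory, executable="tesseract"):
    """Prepend `directory` to PATH if `executable` cannot be found yet.

    Returns the resolved path of the executable, or None if still missing.
    Hypothetical helper for illustration; not part of ocrmypdf.
    """
    if shutil.which(executable) is None:
        os.environ["PATH"] = directory + os.pathsep + os.environ.get("PATH", "")
    return shutil.which(executable)


if __name__ == "__main__":
    # Assumed install location; adjust to wherever Tesseract was installed.
    found = ensure_on_path(r"C:\Program Files\Tesseract-OCR")
    print("tesseract found at:", found)
```

The change only affects the running process and anything it starts; adding the folder to the system PATH in the Windows settings is the permanent alternative.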

Note that a missing dependency was also the cause of a similar message in this question: "Import ocrmypdf in Visual Stdio Code in Python".

