Utility_Apps/Shell/AddOCR
2025-11-17 14:20:52 -06:00

📘 Add OCR Layer: Windows Context Menu Script

This script adds an OCR text layer to any PDF using OCRmyPDF, with smart handling for PDFs that already contain text.

It integrates directly into the Windows right-click menu, so you can right-click any PDF → Add OCR Layer.


Features

  • ✔ Right-click any PDF to run OCR

  • ✔ Detects:

    • Tagged PDFs

    • PDFs with pre-existing OCR

  • ✔ Prompts user when text already exists:

    • R → --redo-ocr (best for mixed raster/vector)

    • F → --force-ocr (overwrite all text)

    • S → Skip OCR

  • ✔ Produces a new file with _ocr appended to the name (for example, report.pdf becomes report_ocr.pdf)

  • ✔ Works even when OCRmyPDF returns ambiguous exit codes
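
The output naming rule above can be illustrated with a short Python sketch. This is not the batch script's own code: the function name ocr_output_path is hypothetical, and the check that the input ends in .pdf is an added assumption rather than documented behavior.

```python
from pathlib import PureWindowsPath


def ocr_output_path(pdf_path: str) -> str:
    """Return the sibling path with _ocr inserted before the extension."""
    p = PureWindowsPath(pdf_path)
    # Assumption: anything without a .pdf extension is rejected up front.
    if p.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF: {pdf_path}")
    # Keep the folder and the original extension, change only the stem.
    return str(p.with_name(f"{p.stem}_ocr{p.suffix}"))
```

Writing to a new name leaves the original PDF untouched, so a poor OCR result can simply be deleted.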


📦 Installation Guide

1. Install Python

Install Python 3.11 or later:
https://www.python.org/downloads/
Be sure to check:

Add python.exe to PATH


2. Install OCRmyPDF

Open Command Prompt (Win+R → cmd) and install:

pip install ocrmypdf

OCRmyPDF also depends on external tools, which the next steps install.


3. Install Ghostscript

Required for rasterizing pages:

choco install ghostscript

Or download manually:
https://ghostscript.com/releases/index.html


4. Install Tesseract

OCRmyPDF uses Tesseract as its OCR engine and does not bundle one, so install it separately:

choco install tesseract

Or install manually from the UB Mannheim builds.
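
After the installs, a quick check that each tool is reachable on PATH can save a failed first run. The sketch below is one possible way to do that in Python. The executable names are assumptions (Ghostscript's Windows console binary is usually gswin64c, or gswin32c on 32-bit installs), and find_tools is a hypothetical helper, not part of this script.

```python
import shutil
from typing import Callable, Dict, Optional, Sequence

# Candidate executable names per tool (assumed, adjust for your install).
TOOLS: Dict[str, Sequence[str]] = {
    "OCRmyPDF": ("ocrmypdf",),
    "Ghostscript": ("gswin64c", "gswin32c", "gs"),
    "Tesseract": ("tesseract",),
}


def find_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Return the resolved path of each tool, or None if it is not on PATH."""
    found: Dict[str, Optional[str]] = {}
    for tool, names in TOOLS.items():
        # Take the first candidate name that resolves to a path.
        found[tool] = next((p for p in map(which, names) if p), None)
    return found


if __name__ == "__main__":
    for tool, path in find_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

Open a new Command Prompt before running it, since PATH changes made by an installer do not reach windows that were already open.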


5. Copy the Script

Save the provided batch script as:

C:\Tools\add_ocr_layer.bat

(You may place it anywhere, but avoid locations that sync to the cloud.)


6. Add “Add OCR Layer” to Right-Click Menu

Create a .reg file:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\Add OCR Layer]
@="Add OCR Layer"

[HKEY_CLASSES_ROOT\*\shell\Add OCR Layer\command]
@="\"C:\\Tools\\add_ocr_layer.bat\" \"%1\""

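Double-click the .reg file to import it. Because the key sits under HKEY_CLASSES_ROOT\*, the menu entry appears for every file type, not only PDFs.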
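
If the script lives somewhere other than C:\Tools, the path inside the .reg file has to be re-escaped by hand: backslashes are doubled and the inner quotes get a backslash. The Python sketch below generates the same registry text for any script path. The function name context_menu_reg is hypothetical, and the key layout simply mirrors the registry entries in this step.

```python
def context_menu_reg(script_path: str, label: str = "Add OCR Layer") -> str:
    """Build .reg file text that runs script_path on the clicked file (%1)."""
    # In a .reg string value, each backslash is written as two backslashes.
    escaped = script_path.replace("\\", "\\\\")
    key = "HKEY_CLASSES_ROOT\\*\\shell\\" + label
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'@="{label}"',
        "",
        f"[{key}\\command]",
        # Inner double quotes are escaped as \" inside the value.
        f'@="\\"{escaped}\\" \\"%1\\""',
        "",
    ])


if __name__ == "__main__":
    print(context_menu_reg(r"C:\Tools\add_ocr_layer.bat"))
```

Saving the printed text with a .reg extension should give a file equivalent to the one in this step. Regedit exports such files as UTF-16, so writing with Python's utf-16 encoding is a reasonable choice.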

Manual (if needed)

Navigate to:

Computer\HKEY_CLASSES_ROOT\*\shell\

Create key: Add OCR Layer
Inside it, create key: command
Set default value to:

"C:\Tools\add_ocr_layer.bat" "%1"


▶️ Usage

Right-click any PDF → Add OCR Layer

The script will:

  1. Show the file path

  2. Run OCRmyPDF

  3. Detect if pages contain text

  4. Prompt if text is found:

Choose how to proceed:
R = Use --redo-ocr (raster areas only)
F = Use --force-ocr (overwrite all text)
S = Skip OCR

  5. Save the OCR'd file as:

original_filename_ocr.pdf
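
The steps above can be sketched in Python as a rough model of the control flow. It is not a port of the batch script. It assumes that OCRmyPDF names PriorOcrFoundError or TaggedPDFError in its error output when text already exists, and it matches on that text instead of relying on exit codes. The names add_ocr_layer, run_ocrmypdf, and ask are hypothetical.

```python
import subprocess
from typing import Callable, List, Tuple

# Assumption: these exception names appear in OCRmyPDF's stderr output.
TEXT_MARKERS = ("PriorOcrFoundError", "TaggedPDFError")

Runner = Callable[[List[str]], Tuple[int, str]]


def run_ocrmypdf(args: List[str]) -> Tuple[int, str]:
    """Run ocrmypdf and return (exit code, stderr text)."""
    proc = subprocess.run(["ocrmypdf", *args], capture_output=True, text=True)
    return proc.returncode, proc.stderr


def add_ocr_layer(src: str, dst: str, ask: Callable[[], str],
                  run: Runner = run_ocrmypdf) -> str:
    """Return 'done', 'skipped', or 'failed' for one PDF."""
    code, err = run([src, dst])
    if code == 0:
        return "done"
    if not any(marker in err for marker in TEXT_MARKERS):
        return "failed"  # some other error, nothing to prompt about
    # Existing text was reported: let the user pick R, F, or S.
    choice = ask().strip().upper()
    flag = {"R": "--redo-ocr", "F": "--force-ocr"}.get(choice)
    if flag is None:
        return "skipped"  # S, or any unrecognized answer
    code, _ = run([flag, src, dst])
    return "done" if code == 0 else "failed"
```

Passing the runner and the prompt in as parameters keeps the decision logic testable without OCRmyPDF installed. In interactive use, ask could be something like lambda: input("R/F/S? ").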


⚠ Troubleshooting

Ghostscript not found (gs missing)

Install via Chocolatey:

choco install ghostscript

Or add Ghostscript's bin folder to PATH manually.


OCRmyPDF not found

Ensure Python's Scripts folder is in PATH:

C:\Users\<you>\AppData\Local\Programs\Python\Python312\Scripts\
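
The exact folder depends on the Python version and install type, so it can help to ask the interpreter where its Scripts folder is and whether PATH contains it. The Python sketch below does that with the standard sysconfig module. The helper name scripts_dir_on_path is hypothetical.

```python
import os
import sysconfig
from typing import Optional


def scripts_dir_on_path(path_value: Optional[str] = None,
                        scripts_dir: Optional[str] = None) -> bool:
    """Report whether the interpreter's Scripts folder is listed in PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    if scripts_dir is None:
        scripts_dir = sysconfig.get_path("scripts")
    # Normalize case and trailing separators before comparing.
    wanted = os.path.normcase(os.path.normpath(scripts_dir))
    entries = [e for e in path_value.split(os.pathsep) if e]
    return any(os.path.normcase(os.path.normpath(e)) == wanted for e in entries)


if __name__ == "__main__":
    print("Scripts folder:", sysconfig.get_path("scripts"))
    print("On PATH:", scripts_dir_on_path())
```

If OCRmyPDF was installed with pip's --user option, its launcher may sit in a separate per-user Scripts folder that this check does not cover.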


TaggedPDFError appears and OCR stops

The script catches this error automatically and offers the R/F/S choices described under Usage.


🧪 Tested On

  • Windows 10

  • Windows 11

  • Python 3.12

  • OCRmyPDF 15.x

  • Ghostscript 10.x

  • Tesseract 5.x