📘 Add OCR Layer – Windows Context Menu Script
This script adds an OCR text layer to any PDF using OCRmyPDF, with smart handling for PDFs that already contain text.
It integrates directly into the Windows right-click menu, so you can right-click any PDF → Add OCR Layer.
✅ Features
- ✔ Right-click any PDF to run OCR
- ✔ Detects:
  - Tagged PDFs
  - PDFs with pre-existing OCR
- ✔ Prompts user when text already exists:
  - R → --redo-ocr (best for mixed raster/vector)
  - F → --force-ocr (overwrite all text)
  - S → Skip OCR
- ✔ Produces a new file with _ocr.pdf appended
- ✔ Works even when OCRmyPDF returns ambiguous exit codes
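One way to cope with ambiguous exit codes is to judge a run by whether a non-empty output file exists, and to treat the exit code only as a hint. The sketch below illustrates that idea in Python; it is not taken from the batch script, and the helper names (ocr_output_path, output_looks_valid, summarize_run) are hypothetical.

```python
from pathlib import Path


def ocr_output_path(pdf: Path) -> Path:
    """Return the sibling path with _ocr.pdf appended to the base name."""
    return pdf.with_name(pdf.stem + "_ocr.pdf")


def output_looks_valid(output: Path) -> bool:
    """A run is only trusted if it left a non-empty output file behind."""
    return output.is_file() and output.stat().st_size > 0


def summarize_run(exit_code: int, output: Path) -> str:
    """Classify a run using the output file first and the exit code second.

    Assumption for this sketch: a non-zero exit code with a usable output
    file is reported as a warning rather than a failure.
    """
    if output_looks_valid(output):
        return "ok" if exit_code == 0 else "ok-with-warnings"
    return "failed"
```

With this approach, a run that exits non-zero but still writes scan_ocr.pdf is reported as usable rather than discarded.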
📦 Installation Guide
1. Install Python
Install Python 3.11 or later:
https://www.python.org/downloads/
Be sure to check:
☑ Add python.exe to PATH
2. Install OCRmyPDF
Open Command Prompt (Win+R → cmd) and install:
pip install ocrmypdf
OCRmyPDF relies on external tools (Ghostscript and Tesseract), covered in the next steps.
3. Install Ghostscript
Required for rasterizing pages:
choco install ghostscript
Or download manually:
https://ghostscript.com/releases/index.html
4. Install Tesseract
OCRmyPDF uses Tesseract as its OCR engine and does not bundle one, so it must be installed separately:
choco install tesseract
Or install manually from the UB Mannheim builds.
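After installing the tools, it can help to confirm that each one is actually reachable on PATH before wiring up the context menu. The following is a minimal Python sketch of such a check. The TOOLS table and find_tools are hypothetical names, and the Ghostscript candidates assume the usual Windows console binaries (gswin64c or gswin32c) in addition to gs.

```python
import shutil
from typing import Callable, Dict, Optional, Sequence

# Candidate executable names per tool (assumption: on Windows, Ghostscript's
# console binary is usually gswin64c or gswin32c rather than gs).
TOOLS: Dict[str, Sequence[str]] = {
    "ocrmypdf": ("ocrmypdf",),
    "ghostscript": ("gswin64c", "gswin32c", "gs"),
    "tesseract": ("tesseract",),
}


def find_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Map each tool to the first matching executable on PATH, or None."""
    found: Dict[str, Optional[str]] = {}
    for tool, names in TOOLS.items():
        found[tool] = next((p for p in map(which, names) if p), None)
    return found


if __name__ == "__main__":
    for tool, path in find_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

The which parameter is injectable only so the lookup can be exercised without the tools installed.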
5. Copy the Script
Save the provided batch script as:
C:\Tools\add_ocr_layer.bat
(You may place it anywhere, but avoid locations that sync to the cloud.)
6. Add “Add OCR Layer” to Right-Click Menu
Automated (recommended)
Create a .reg file:
Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\Add OCR Layer]
@="Add OCR Layer"

[HKEY_CLASSES_ROOT\*\shell\Add OCR Layer\command]
@="\"C:\\Tools\\add_ocr_layer.bat\" \"%1\""
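Double-click the .reg file to install. Because the entry is registered under the * key, it appears in the right-click menu for every file type, not only PDFs.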
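If the script lives somewhere other than C:\Tools, the backslashes and quotes in the .reg file are easy to get wrong by hand. The sketch below generates the same registry text for an arbitrary script path. It is an illustration only: build_reg and reg_escape are hypothetical helpers, not part of the provided script.

```python
def reg_escape(value: str) -> str:
    """Escape backslashes, then double quotes, for a .reg string value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_reg(script_path: str, label: str = "Add OCR Layer") -> str:
    """Return .reg text that registers script_path under the * shell key."""
    key = "HKEY_CLASSES_ROOT\\*\\shell\\" + label
    # The command quotes both the script path and the %1 placeholder so
    # paths containing spaces survive.
    command = f'"{script_path}" "%1"'
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'@="{reg_escape(label)}"',
        "",
        f"[{key}\\command]",
        f'@="{reg_escape(command)}"',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_reg(r"C:\Tools\add_ocr_layer.bat"))
```

Running it with the default path prints the same entries as the .reg file above; save the output to a .reg file to use it.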
Manual (if needed)
Navigate to:
Computer\HKEY_CLASSES_ROOT\*\shell\
Create key: Add OCR Layer
Inside it, create key: command
Set default value to:
"C:\Tools\add_ocr_layer.bat" "%1"
▶️ Usage
Right-click any PDF → Add OCR Layer
The script will:
- Show the file path
- Run OCRmyPDF
- Detect if pages contain text
- If text is found, prompt:
  Choose how to proceed:
  R = Use --redo-ocr (raster areas only)
  F = Use --force-ocr (overwrite all text)
  S = Skip OCR
Your OCR’d file will be saved as:
original_filename_ocr.pdf
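The flow above boils down to mapping the user's choice to an OCRmyPDF flag and deriving the output name. Here is a rough Python sketch of that mapping, assuming the usual ocrmypdf [options] input output command form. build_command and FLAGS are hypothetical names, and the batch script itself may be structured differently.

```python
from pathlib import Path
from typing import List, Optional

# Prompt letters mapped to the OCRmyPDF flags named in the prompt text.
FLAGS = {"R": "--redo-ocr", "F": "--force-ocr"}


def build_command(pdf: Path, choice: Optional[str] = None) -> Optional[List[str]]:
    """Build the ocrmypdf argument list for one run.

    choice=None is the first attempt with no override flag.
    "S" means skip, signalled here by returning None.
    """
    output = pdf.with_name(pdf.stem + "_ocr.pdf")
    cmd = ["ocrmypdf"]
    if choice is not None:
        key = choice.strip().upper()
        if key == "S":
            return None
        if key not in FLAGS:
            raise ValueError(f"unknown choice: {choice!r}")
        cmd.append(FLAGS[key])
    cmd += [str(pdf), str(output)]
    return cmd
```

For example, build_command(Path("scan.pdf"), "r") yields the arguments for ocrmypdf --redo-ocr scan.pdf scan_ocr.pdf, which could then be passed to subprocess.run.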
⚠ Troubleshooting
Ghostscript not found (‘gs’ missing)
Install via Chocolatey:
choco install ghostscript
Or add Ghostscript’s bin folder to PATH manually.
OCRmyPDF not found
Ensure Python's Scripts folder is on PATH, for example:
C:\Users\<you>\AppData\Local\Programs\Python\Python312\Scripts\
TaggedPDFError appears and OCR stops
The script handles this error automatically and offers the R / F / S choices described under Usage.
🧪 Tested On
- Windows 10
- Windows 11
- Python 3.12
- OCRmyPDF 15.x
- Ghostscript 10.x
- Tesseract 5.x