Ryan Schultz 60dfbf82b2 Add OCR layer to PDFs		2025-11-17 14:20:52 -06:00
AddOCR.bat	Add OCR layer to PDFs	2025-11-17 14:20:52 -06:00
readme.md	Add OCR layer to PDFs	2025-11-17 14:20:52 -06:00

readme.md

📘 Add OCR Layer – Windows Context Menu Script

This script adds an OCR text layer to any PDF using OCRmyPDF, with smart handling for PDFs that already contain text.

It integrates directly into the Windows right-click menu, so you can right-click any PDF → Add OCR Layer.

✅ Features

✔ Right-click any PDF to run OCR
✔ Detects:
- Tagged PDFs
- PDFs with pre-existing OCR
✔ Prompts user when text already exists:
- R → --redo-ocr (best for mixed raster/vector)
- F → --force-ocr (overwrite all text)
- S → Skip OCR
✔ Produces a new file with _ocr.pdf appended
✔ Works even when OCRmyPDF returns ambiguous exit codes
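
The batch script's own handling of exit codes is not shown in this readme, so the following is only a sketch of one way to tolerate an unexpected exit code: check whether a non-empty output file exists instead of relying on the code alone. The function name `classify_run` and the three result labels are hypothetical. Exit code 6 is the value OCRmyPDF documents for "input already has OCR text"; verify it against your installed version.

```python
from pathlib import Path

# Exit code OCRmyPDF documents for a PDF that already contains text
# (assumption: unchanged in your installed version).
EXIT_ALREADY_DONE_OCR = 6


def classify_run(exit_code: int, output_pdf: Path) -> str:
    """Return 'has_text', 'done', or 'failed' for one OCRmyPDF run."""
    if exit_code == EXIT_ALREADY_DONE_OCR:
        return "has_text"
    # Trust the file on disk over the exit code: a non-empty output file
    # means a result was written even if the code was unexpected.
    if output_pdf.is_file() and output_pdf.stat().st_size > 0:
        return "done"
    return "failed"
```

A caller would run OCRmyPDF, pass its return code and the expected `_ocr.pdf` path to `classify_run`, and show the R/F/S prompt only for `'has_text'`.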

📦 Installation Guide

1. Install Python

Install Python 3.11 or later:
https://www.python.org/downloads/
Be sure to check:

☑ Add python.exe to PATH

2. Install OCRmyPDF

Open Command Prompt (Win+R → cmd) and install:

pip install ocrmypdf

OCRmyPDF relies on external tools, which the next steps install.

3. Install Ghostscript

Required for rasterizing pages. With Chocolatey available, run:

choco install ghostscript

Or download manually:
https://ghostscript.com/releases/index.html

4. Install Tesseract

OCRmyPDF does not bundle an OCR engine. It calls Tesseract for text recognition, so Tesseract must be installed and on PATH:

choco install tesseract

Or install manually from UB Mannheim builds.
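
After steps 2 to 4, it can help to confirm that each tool is actually reachable from PATH. The following is a minimal sketch using only the Python standard library. The `TOOLS` table and the `find_missing` function are illustrative names, and the Ghostscript executable names (`gswin64c`, `gswin32c`) are the usual Windows console binaries, which may differ on your install.

```python
import shutil

# Candidate executable names per tool. Ghostscript's Windows console
# binaries are usually gswin64c / gswin32c rather than gs.
TOOLS = {
    "OCRmyPDF": ["ocrmypdf"],
    "Tesseract": ["tesseract"],
    "Ghostscript": ["gswin64c", "gswin32c", "gs"],
}


def find_missing(tools=TOOLS, which=shutil.which):
    """Return names of tools for which no candidate executable is on PATH."""
    return [
        name
        for name, exes in tools.items()
        if not any(which(exe) for exe in exes)
    ]
```

Calling `find_missing()` in a fresh Command Prompt returns an empty list when everything is installed. Any name it returns points to the installation step to revisit.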

5. Copy the Script

Save the provided batch script (AddOCR.bat in this repository) as:

C:\Tools\add_ocr_layer.bat

(You may place it anywhere, but avoid locations that sync to the cloud.)

6. Add “Add OCR Layer” to Right-Click Menu

Automated (recommended)

Create a .reg file:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\Add OCR Layer]
@="Add OCR Layer"

[HKEY_CLASSES_ROOT\*\shell\Add OCR Layer\command]
@="\"C:\\Tools\\add_ocr_layer.bat\" \"%1\""

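Double-click the .reg file to import it. Because the key sits under HKEY_CLASSES_ROOT\*, the menu entry appears for every file type, not only PDFs.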
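
If the script lives somewhere other than C:\Tools, the escaping in the .reg file is easy to get wrong by hand: inside a .reg string value, each backslash is doubled and each double quote is preceded by a backslash. The following sketch generates the text for an arbitrary script path. The function name `reg_file_text` is hypothetical and not part of this repository.

```python
def reg_file_text(script_path: str, label: str = "Add OCR Layer") -> str:
    """Build .reg text that adds a right-click entry running script_path."""
    # In a .reg string value, backslashes are doubled and quotes escaped.
    escaped = script_path.replace("\\", "\\\\")
    command = f'\\"{escaped}\\" \\"%1\\"'
    key = "HKEY_CLASSES_ROOT\\*\\shell\\" + label
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{key}]\n"
        f'@="{label}"\n\n'
        f"[{key}\\command]\n"
        f'@="{command}"\n'
    )
```

Calling `reg_file_text(r"C:\Tools\add_ocr_layer.bat")` reproduces the file contents shown above. Save the returned text to a .reg file and import it the same way.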

Manual (if needed)

Navigate to:

Computer\HKEY_CLASSES_ROOT\*\shell\

Create key: Add OCR Layer
Inside it, create key: command
Set default value to:

"C:\Tools\add_ocr_layer.bat" "%1"

▶️ Usage

Right-click any PDF → Add OCR Layer

The script will:

- Show the file path
- Run OCRmyPDF
- Detect whether the pages already contain text

If text is found, it will prompt:

Choose how to proceed:
R = Use --redo-ocr (raster areas only)
F = Use --force-ocr (overwrite all text)
S = Skip OCR
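
The prompt maps each letter to an OCRmyPDF flag. As an illustration only (the batch script's own logic is not reproduced here), that mapping could be sketched as follows. `CHOICE_FLAGS` and `build_command` are hypothetical names. `--redo-ocr` and `--force-ocr` are the flags named in the prompt.

```python
# Mapping from the prompt's letters to OCRmyPDF command-line flags.
CHOICE_FLAGS = {"R": "--redo-ocr", "F": "--force-ocr"}


def build_command(choice: str, input_pdf: str, output_pdf: str):
    """Return the ocrmypdf argument list for a choice, or None for Skip."""
    letter = choice.strip().upper()
    if letter == "S":
        return None  # Skip: nothing to run
    if letter not in CHOICE_FLAGS:
        raise ValueError(f"expected R, F, or S, got {choice!r}")
    return ["ocrmypdf", CHOICE_FLAGS[letter], input_pdf, output_pdf]
```

The returned list can be passed directly to `subprocess.run`, which avoids quoting problems with paths that contain spaces.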

Your OCR’d file will be saved as:

original_filename_ocr.pdf
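
The naming rule can be expressed in a few lines. This sketch assumes the suffix is inserted before the extension, as the pattern above suggests. `ocr_output_path` is a hypothetical helper, and `PureWindowsPath` is used so the example behaves the same on any platform.

```python
from pathlib import PureWindowsPath


def ocr_output_path(input_pdf: str) -> str:
    """Return the input path with _ocr inserted before the .pdf extension."""
    p = PureWindowsPath(input_pdf)
    # with_name keeps the parent folder and swaps only the file name.
    return str(p.with_name(p.stem + "_ocr.pdf"))
```

For example, `ocr_output_path(r"C:\Scans\report.pdf")` yields `C:\Scans\report_ocr.pdf`.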

⚠ Troubleshooting

Ghostscript not found (‘gs’ missing)

Install via Chocolatey:

choco install ghostscript

Or add Ghostscript’s bin folder to PATH manually.

OCRmyPDF not found

Ensure the Python Scripts folder is on PATH. For a per-user install of Python 3.12, that folder is:

C:\Users\<you>\AppData\Local\Programs\Python\Python312\Scripts\
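
To check whether a folder is already on PATH, compare it against each entry while ignoring letter case, surrounding quotes, and trailing slashes. The following is a minimal sketch. `dir_on_path` is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def dir_on_path(directory: str, path_value: str) -> bool:
    """Return True if directory is one of the ;-separated PATH entries."""
    def norm(entry: str) -> str:
        # Windows paths are case-insensitive; PureWindowsPath also drops
        # trailing separators and converts / to \.
        return str(PureWindowsPath(entry.strip().strip('"'))).lower()

    target = norm(directory)
    return any(norm(e) == target for e in path_value.split(";") if e.strip())
```

On a real machine, pass `os.environ["PATH"]` as `path_value`. If the function returns False for the Scripts folder, add the folder to PATH and open a new Command Prompt.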

TaggedPDFError appears and OCR stops

The script catches this error automatically and offers the R / F / S choices described under Usage.

🧪 Tested On

Windows 10
Windows 11
Python 3.12
OCRmyPDF 15.x
Ghostscript 10.x
Tesseract 5.x

readme.md

📘 Add OCR Layer – Windows Context Menu Script

✅ Features

📦 Installation Guide

1. Install Python

2. Install OCRmyPDF

3. Install Ghostscript

4. Optional: Install Tesseract

5. Copy the Script

6. Add “Add OCR Layer” to Right-Click Menu

Automated (recommended)

Manual (if needed)

▶️ Usage

Right-click any PDF → Add OCR Layer

⚠ Troubleshooting

Ghostscript not found (‘gs’ missing)

OCRmyPDF not found

TaggedPDFError appears and OCR stops

🧪 Tested On
