OCR: Optical Character Recognition

From Parallel Library Services

OCR (Optical Character Recognition) is a technology that recognises typographic characters in bitmap images and renders them as machine-encoded text. This is particularly useful when digitising printed matter, as running OCR on a file produces a selectable text layer that can be searched with a computer.

Tesseract and OCRmyPDF are two free software tools that can be used to run OCR on a document.
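
As a rough illustration of how these tools might be driven from a script, the sketch below builds the command lines for each and runs them with Python's subprocess module. It assumes both programs are installed and on the PATH, and that the English language data ("eng") is available. The helper names and the file names (scan.pdf, page.png) are hypothetical and chosen only for this example.

```python
import shutil
import subprocess


def ocrmypdf_command(input_pdf, output_pdf, language="eng"):
    """Build an OCRmyPDF command that adds a text layer to a scanned PDF."""
    # -l selects the OCR language; "eng" is an assumption for this sketch.
    return ["ocrmypdf", "-l", language, input_pdf, output_pdf]


def tesseract_command(image, output_base, language="eng"):
    """Build a Tesseract command that turns one image into a searchable PDF."""
    # Tesseract appends the extension itself, so output_base has no ".pdf".
    return ["tesseract", image, output_base, "-l", language, "pdf"]


def run(command):
    """Run a command built above, failing early if the tool is not installed."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on the PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(command, check=True)
```

With these helpers, run(ocrmypdf_command("scan.pdf", "scan_ocr.pdf")) would write a searchable copy of a scanned PDF, while run(tesseract_command("page.png", "page")) would produce page.pdf from a single image. Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.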