OCR: Optical Character Recognition

From Parallel Library Services
Revision as of 11:30, 25 November 2021 by Simon (talk | contribs)

OCR (Optical Character Recognition) is a technology that recognises typographic characters in bitmap images and renders them as machine-encoded text. This is particularly useful when digitising printed matter, as running OCR on a file produces a selectable text layer that can be searched with a computer.

Tesseract and OCRmyPDF are two free software tools that can be used to do OCR on a document.
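
As a rough illustration of how the two tools might be called, here is a minimal Python sketch that builds the command lines and runs them only if the program is installed. The file names (scan.pdf, scan_ocr.pdf) and the helper function names are hypothetical, chosen for this example; the language code "eng" assumes the English language data is installed.

```python
import shutil
import subprocess
from pathlib import Path


def ocrmypdf_command(src, dst, language="eng"):
    """Build an OCRmyPDF call that adds a text layer to an existing PDF."""
    return ["ocrmypdf", "-l", language, str(src), str(dst)]


def tesseract_command(image, output_base, language="eng"):
    """Build a Tesseract call that turns one image into a searchable PDF.

    Tesseract appends the file extension itself, so output_base has no suffix.
    """
    return ["tesseract", str(image), str(output_base), "-l", language, "pdf"]


def run_if_available(command):
    """Run the command only if its executable is on PATH.

    Returns True if the command was run, False if the tool is missing.
    """
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical file names, used for illustration only.
    ran = run_if_available(ocrmypdf_command(Path("scan.pdf"), Path("scan_ocr.pdf")))
    if not ran:
        print("ocrmypdf is not installed or not on PATH")
```

In this sketch OCRmyPDF is used when the scan is already a PDF, while the Tesseract command is used for a single page image; which tool suits a given document depends on the source material.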