Tesseract

From Parallel Library Services
Revision as of 10:42, 6 October 2021 by Simon (talk | contribs)

Source code: https://github.com/tesseract-ocr/tesseract

Documentation: https://tesseract-ocr.github.io/

See also: Flatbed scanning a book and processing it into a PDF

Tesseract is an open-source text recognition engine. It performs OCR (optical character recognition) on images and can add a text layer to the output, so that the recognized text can be selected, copied, and pasted.
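As a rough illustration, the sketch below calls the tesseract command-line tool from Python to turn a single scanned image into a PDF with a text layer. It assumes tesseract is installed and on the PATH and that the English language data (eng) is available. The function names and the file names in the usage note are made up for this example and are not part of Tesseract.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="eng"):
    """Return the argument list for a PDF run of the tesseract CLI."""
    # CLI form: tesseract <image> <output base> [-l <lang>] [configfile...]
    # The trailing "pdf" config requests PDF output that includes a text layer.
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def ocr_to_pdf(image, output_base, lang="eng"):
    """Run tesseract on one image and return the path of the PDF it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(build_tesseract_command(image, output_base, lang), check=True)
    # Tesseract appends the ".pdf" extension to the output base itself.
    return Path(f"{output_base}.pdf")
```

With a hypothetical scan saved as scan.png, calling ocr_to_pdf("scan.png", "scan") should leave a file named scan.pdf next to it. Keeping the command construction in its own function makes it easy to inspect or log the exact call before running it.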