What is Optical Character Recognition (OCR)?


Optical Character Recognition, or OCR, is a process by which software reads a page image and translates it into a text file by recognising the shapes of the letters (The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials).

OCR enables searching of large quantities of full-text data, but it is never 100% accurate. The level of accuracy depends on the print quality of the original issue, its condition at the time of microfilming, the level of detail captured by the microfilm scanner, and the quality of the OCR software. Issues printed on poor-quality paper, or with small print, mixed fonts, multi-column layouts, or damaged pages, are likely to have poor OCR accuracy.
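To make "accuracy" concrete, the sketch below shows one common way it can be measured: character accuracy, computed from the edit distance between a hand-checked transcription and the OCR output. This is only an illustration, not how accuracy is measured for this collection. The function names and the sample strings are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Fraction of reference characters recovered correctly (0.0 to 1.0)."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical example: three misread characters in a 19-character line.
reference = "The quick brown fox"
ocr_output = "Tbe quick hrown f0x"
print(f"{character_accuracy(reference, ocr_output):.1%}")
```

Even a few misread characters per line can stop an exact-word search from matching, which is why a search term may fail to find text that is clearly visible on the page image.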

The searchable text and titles in this collection have been automatically generated using OCR software. They have not been manually reviewed or corrected.
