Optical Character Recognition, or OCR, is a process by which software reads a page image and translates it into a text file by recognizing the shapes of the letters. OCR enables searching of large quantities of full-text data, but it is never 100% accurate. The level of accuracy depends on the print quality of the original newspaper issue, its condition at the time of microfilming, the level of detail captured by the microfilm scanner, and the quality of the OCR software. Poor-quality paper, small print, mixed fonts, multiple-column layouts, or damaged pages may all contribute to poor OCR accuracy. The effectiveness of OCR software has improved dramatically over the years; however, many pages within the CHNC were added more than 10 years ago, and errors in the OCR-created text for those pages can only be corrected manually.
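To make "never 100% accurate" concrete, here is a minimal Python sketch of one common way to estimate OCR accuracy: compare the OCR output against a manually corrected transcription and count the character edits needed to turn one into the other. The function names `edit_distance` and `character_accuracy` and the sample strings are hypothetical and chosen for illustration only; this is not a description of how the CHNC measures or reports accuracy.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or keep a match
            ))
        prev = curr
    return prev[-1]


def character_accuracy(ocr_text: str, corrected_text: str) -> float:
    """Fraction of characters judged correct, using the manually
    corrected text as the reference (assumed here to be error-free)."""
    if not corrected_text:
        return 1.0 if not ocr_text else 0.0
    errors = edit_distance(ocr_text, corrected_text)
    return max(0.0, 1.0 - errors / len(corrected_text))


if __name__ == "__main__":
    # Hypothetical sample: two misread letters in a 14-character line.
    ocr_line = "Tbe Daily Ncws"
    corrected_line = "The Daily News"
    print(edit_distance(ocr_line, corrected_line))
    print(round(character_accuracy(ocr_line, corrected_line), 3))
```

A score computed this way only reflects the passages that have been manually transcribed, so it is best read as a rough estimate for a sample of pages rather than a guarantee for a whole collection.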