Optical Character Recognition (OCR)

Max can convert typewritten text using optical character recognition (“OCR”). We convert paper-based records, microfilm or existing digital images into searchable PDF files. Our specialised methods give exceptional accuracy, turning your offline material into a searchable online resource.

We are able to handle jobs of all sizes and can work with all kinds of original materials, including bound volumes and broadsheet newspapers. We can output to a variety of formats, including PDF/A, text, MS-Word, XML and HTML.

Our sophisticated OCR system uses pattern recognition algorithms to identify individual characters. A dictionary-based analysis then enables the system to deduce the content word by word, even where individual characters have not been recognised correctly. The OCR process recognises and retains content layout such as columns, tables and illustrations. This means that the document can be displayed in its original layout in the PDF whilst still being fully searchable.
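
As a rough illustration of the dictionary-based step described above, the sketch below corrects misread words by picking the closest entry in a word list. It is a minimal example and not Max's actual system: the word list, the function names correct_word and correct_text, and the 0.75 similarity cutoff are all assumptions made for illustration, and the matching uses Python's standard difflib module.

```python
import difflib

# Hypothetical word list; a real system would use a far larger dictionary.
DICTIONARY = ["optical", "character", "recognition",
              "searchable", "archive", "document"]


def correct_word(word, dictionary=DICTIONARY, cutoff=0.75):
    """Return the closest dictionary word, or the input unchanged
    if it is already known or nothing is similar enough."""
    lowered = word.lower()
    if lowered in dictionary:
        return word
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0)
    # and drops any that score below the cutoff.
    matches = difflib.get_close_matches(lowered, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text):
    """Apply word-by-word correction to whitespace-separated text."""
    return " ".join(correct_word(w) for w in text.split())


if __name__ == "__main__":
    # "0" misread for "o", "c" misread for "e"
    print(correct_text("0ptical charactcr recogniti0n"))
```

Words with no sufficiently close dictionary entry are left unchanged, so names and unusual terms are not forced into a wrong match.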

Clients for whom we have undertaken large-scale OCR projects include:

  • London School of Economics
  • British Universities Film & Video Council
  • Anti-Slavery International Library
  • Greenwich University

To find out more about our services, please call us on 020 8309 5445.

Testimonials

Max have been a trusted digitisation and solutions partner with King’s College London Archives for more than a decade. They have always undertaken work to a high standard, and on time, and are a friendly team who are ready to help at short notice.

--Dr Geoff Browell | Head of Archives and Research Collections | King’s College London