OCR Service

Max can convert all typewritten text using optical character recognition (“OCR”). We convert paper-based records, microfilm or existing digital images into searchable PDF format. Our specialised methods give exceptional accuracy, turning your off-line material into a searchable on-line resource.

We are able to handle jobs of all sizes and can work with all kinds of original materials, including bound volumes and broadsheet newspapers. We can output to a variety of formats, including PDF/A, text, MS-Word, XML and HTML.

Our sophisticated OCR system uses pattern recognition algorithms to identify individual characters. A dictionary-based analysis then enables the system to deduce the content on a word-by-word basis, even where individual characters have not been picked up correctly. The OCR process recognises and retains content layout such as columns, tables and illustrations. This means that the document can be displayed in its original layout in the PDF whilst still being fully searchable.
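As a rough illustration of the dictionary-based step described above, the short Python sketch below replaces a misread word with its closest match from a word list. It is not Max's actual system: the word list, the function names `correct_word` and `correct_line`, and the similarity cutoff of 0.75 are all assumptions made for the example, and it relies only on `difflib` from the Python standard library.

```python
from difflib import get_close_matches

# Hypothetical word list for illustration; a real system would use a full dictionary.
DICTIONARY = ["student", "newspaper", "archive", "library", "record", "searchable"]


def correct_word(word, dictionary=DICTIONARY, cutoff=0.75):
    """Return the closest dictionary word, or the word unchanged if none is close enough."""
    lower = word.lower()
    if lower in dictionary:
        return word  # already a known word, leave it alone
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0)
    matches = get_close_matches(lower, dictionary, n=1, cutoff=cutoff)
    # Note: a corrected word is returned in lower case in this sketch.
    return matches[0] if matches else word


def correct_line(line):
    """Apply the word-by-word correction to a whitespace-separated line of OCR text."""
    return " ".join(correct_word(w) for w in line.split())


if __name__ == "__main__":
    # "stndent", "newspapcr" and "archivc" each contain one misread character.
    print(correct_line("stndent newspapcr archivc"))
```

A word with no sufficiently close match is passed through unchanged, so names and unusual terms are not forced into the dictionary.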

Clients for whom we have undertaken large-scale OCR projects include:

London School of Economics
British Universities Film & Video Council
Anti-Slavery International Library
Greenwich University

Talk to Max about your OCR needs now! Call 020 8309 5445 or fill out our enquiry form.

LSE Student 'Beaver' newspaper, founded 1949: over 15,000 documents captured and OCR'd
Image courtesy of LSE