You might think that optical character recognition (OCR) software has little left to improve. As long as it recognizes documents accurately and reasonably quickly, where’s the room for improvement? In fact, Nuance claims several notable improvements for OmniPage Professional 16, probably the best-known OCR application on the market.
First, Nuance says the new version is between 16 percent and 27 percent more accurate than before, while also being up to 46 percent faster. In addition, it is said to be able to compensate for lens distortions in images of pages captured with a camera, to automatically redact words in sensitive documents and to process electronic and paper forms. It can create documents in Office 2007’s XPS format and includes copies of PaperPort 11 (Nuance’s document management application) and PDF Converter 4 which, as you can imagine, converts documents to PDF format.
The program is also claimed to do a better job of creating accurate representations of pages without packing everything into separate text and graphics frames. This has long been a nuisance, as it’s one thing to get the page looking right and another to easily edit the text within that layout. Most OCR programs struggle with the easy-editing-within-the-layout part.
After installing and activating OmniPage Professional 16, you need to set up a scanner to work with it. The scanner setup wizard should run automatically, but it didn’t in our case. When run, the wizard downloaded the latest scanner database from Nuance, which didn’t include our HP OfficeJet 7210, a recent and popular all-in-one. We had to run the program’s diagnostics to get the scanner recognized, which involved scanning text, grayscale, and color documents and took about five minutes.
The main processing screen offers four task tabs at the top with three windows below: one for thumbnails, one for a graphical image of the page, and one for the OCRed text. At the bottom is a full-width area for document statistics, most of which OmniPage calculates for itself.
The tabs are for workflow, load or scan type, page layout, and export. Despite what Nuance seems to think, they’re not that intuitive to use. As if to admit as much, a series of how-to guides walks you through many of the tasks that should be obvious, but aren’t. Unexpectedly, the standard 1-2-3 workflow, designed to automatically process the most common OCR tasks, is set by default to load images from a file. Is that really how most customers want to get their input documents? You must change this setting before the program looks for a scanner instead.
After scanning a document and recognizing its characters, the program checked it and claimed it was 100 percent correct, even though there were two instances of the same typo in the text. Just because reading “tor” as “for” still makes a legitimate word doesn’t mean it’s correct.
OmniPage completed recognition in just over two seconds, which is fast, and even a more complex page with graphics and boxed text took less than 10 seconds. This page needed more preparation before we could get an editable document with a reasonable resemblance to the original. We needed to outline the areas of the page that we wanted treated as text instead of leaving OmniPage on Automatic.
Even then, there are clear discrepancies. Some are understandable, such as the misreading of colored text, while others, such as differences in font and text style, are less acceptable. Some of the text was placed in boxes on the Word 2003 page we created, while the rest was converted to main body text. There is also a variety of indents and line spacing, even though all text in the original has the same left margin.
It’s easy enough to save OCRed documents to any of the supported file types, including Word 2007’s docx, Adobe’s PDF, WordPerfect X3, and WAV for audio playback. The text-to-speech conversion is particularly good and sounds comparatively natural and expressive despite the US accent.
If you don’t need PaperPort or PDF Converter and can do without some of OmniPage Professional 16’s more business-oriented features, such as form OCR, word redaction, and the batch processing manager, the standard OmniPage 16 costs around £60, a big saving compared to the Professional version.
Verdict
The improvements highlighted for OmniPage Professional 16 would all be useful, but according to our testing, the software still has some way to go to deliver on them. When batch processing long, standard text documents, the software can undoubtedly save a lot of time, but for more complex pages with significant graphic content, it can still be difficult to get close to what you’ve scanned.
Scores in detail
Features 7
Value 7
Ease of use 6