California State University, Long Beach

Omnipage Quick OCR (Optical Character Recognition) Guide

These instructions are for those who wish to scan documents and make changes to those documents. To do this you need OCR (optical character recognition) software such as Adobe Acrobat Professional or Omnipage.

For the majority of customers, Adobe Acrobat is the preferable software, as it offers more office-oriented features and is usually more readily available. The exception is departments that scan mostly graphics: Omnipage works better with combinations of graphics and text. There are separate instructions available for using Omnipage and Adobe Acrobat Professional.

If you need assistance determining which OCR program is best for you, you can either consult your department’s computer technician or contact the Copier Program at x.55329.

Open Omnipage.

Click the “Load Files” button. It's the second button in the second row near the top of your screen.

screenshot

Browse to the document on which you would like to perform OCR.

screenshot

Select the document and click OK.

Omnipage will open your document and display it in three panels, from left to right:

  • All of the pages in your document
  • A large view of the current page
  • An area for all of the text contained on the current page

When you first open your document, the far right panel will be empty, because that is where the text will show up when you run OCR.

screenshot

Within the Image Panel on the far left, select all pages of your document on which you would like to perform OCR.

Hint: You can select one page, then press Ctrl+A to select all pages at once.

Click the "Perform OCR" button. It's the third button on the second row of buttons near the top of your screen. When you hover over the button, its label reads “Perform OCR”. Clicking it begins the OCR process.

screenshot

There is a progress bar in the row below the "Perform OCR" button. It advances as your file is processed and shows the percentage completed.

The screenshot below shows the bar at 4%.

screenshot

When that process is completed, your document will have boxes around different areas of content and the right panel will be populated with information.

screenshot

Another example:

screenshot

Omnipage organizes your document into areas of text, tables, graphics, or forms, and marks each area with an icon. A red “t” means the area is text. A small graphic with a green hillside, sky and sun means the area was recognized as a graphic. A calculator marks a table, and a purple “f” marks a form. See example below:

screenshot

If Omnipage detected a field as the wrong type, right-click the field and change it to the correct type.

By default, the text auto-proofreader window pops up after the completion of an OCR process.

screenshot

This is your chance to change or modify the text as you see fit.

Recommended: Close the proofreader and let the save process make its changes automatically.

In the top box, the word is displayed as it appears on paper. The second box, under “suspect word”, displays how Omnipage read the word. The scan in this example was done at a low resolution, which causes more errors; increasing the scan resolution can fix a lot of problems.

When Omnipage does not recognize a word correctly, a person has to correct the result. You have three options: change the word in the second box under “suspect word” to how it should be, press the ignore button, or, if the third box under “suggestions” offers a suitable word, pick it and press the “change” button.

Pressing the close button stops the editor from asking you to make changes to the document. If you stop the editor, you are in effect accepting all the errors that Omnipage may have made from that point on. This is strongly discouraged with Omnipage, because it tends to change a word to something unreadable instead of leaving it as it is (which is usually readable to the human eye). As you proofread more and more, Omnipage will learn from your changes and make fewer errors in the future.

It is not necessary to fix every error the proofreader reports. When you save the document, Omnipage runs it through some extra code and puts a word back as a picture if it could not figure out what the word was. This is beneficial because, instead of changing the word to something illegible, it leaves the word as it was. The only way to know for sure what it will or will not fix is to press close when the proofreader pops up, then save the document and re-open it with Acrobat to see how it looks.

Now that you have completed the process, click the “Save to File” button.

screenshot

A dialog box pops up asking you what to name the file, where to save it, and what format you would like it saved in.

screenshot

Press OK to save the document and you are finished.

You can now re-open the document you just saved in another program and select, search, and even edit the text that was OCR’d.

It is VERY important when using Omnipage to set the “Files of Type” box correctly when saving your document. If your document has text only, you can save it as a regular PDF. If your document has even the tiniest amount of graphics, which includes bullets, lines, small images, or anything that could be construed as an image, you must save the document as a “PDF with image on text”. Otherwise the saved document will contain a large number of errors that you will constantly be fixing.
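The rule above can be written down as a small decision helper. This is only an illustrative Python sketch, not part of Omnipage: the names ZoneType and choose_files_of_type are hypothetical, the "PDF" label for a regular PDF is assumed, and the sketch assumes that any area Omnipage marks as something other than text (graphic, table, or form) counts as "graphics" for this purpose.

```python
from enum import Enum


class ZoneType(Enum):
    """Area types named in this guide (hypothetical names, not an Omnipage API)."""
    TEXT = "text"
    TABLE = "table"
    GRAPHIC = "graphic"
    FORM = "form"


# Labels as quoted in this guide; the exact wording in the
# "Files of Type" list may differ (assumption).
PLAIN_PDF = "PDF"
PDF_IMAGE_ON_TEXT = "PDF with image on text"


def choose_files_of_type(zones):
    """Return the save format suggested by the text-only rule.

    Assumption: any area that is not plain text is treated as graphics.
    """
    if all(zone is ZoneType.TEXT for zone in zones):
        return PLAIN_PDF
    return PDF_IMAGE_ON_TEXT


# A page with only text areas versus a page that also has a graphic.
print(choose_files_of_type([ZoneType.TEXT, ZoneType.TEXT]))
print(choose_files_of_type([ZoneType.TEXT, ZoneType.GRAPHIC]))
```

When in doubt, treating the document as containing graphics is the safer choice, since the guide only warns about errors when a document with graphics is saved as a regular PDF.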
