Nuance OMNIPAGE PRO 8 User Manual

OmniPage® Pro
Users Manual
CAERE CORPORATION
100 Cooper Court Los Gatos, California 95032
European Offices: Caere GmbH Innere Wiener Strasse 5 81667 Munich Germany
Please Note
In order to use this program, you should know how to work in the Microsoft Windows environment. Please refer to Windows documentation if you have questions about how to use menu commands, dialog boxes, scroll bars, edit boxes, and so on.
OmniPage Pro for Windows Version 8
Copyright© 1997 Caere Corporation. All rights reserved. CAERE®, OmniPage®, OmniPage Pro®, Language Analyst®, 3D OCR®, AutoOCR™, and True Page® are trademarks of Caere Corporation.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Such designations appearing in this manual have been printed in initial caps.
PD 802-0584-030A

Welcome

Welcome to OmniPage Pro, and thank you for buying our software! The following documentation has been provided to help you learn about OmniPage Pro.
This Users Manual
This manual introduces you to the basics of using OmniPage Pro. It includes an introduction to OmniPage Pro, insta llation and setup instructions, task-oriented instructions, ways to customize processing, settings guidelines, and technical information.
Online Help
OmniPage Pro’s online help contains detailed information on features, settings, and procedures. The online help conforms to Windows online help conventions and has been designed for quick and easy information retrieval. Please see “Getting Online Help” on page 12 for more information.
Release Notes
The Release Notes booklet contains last-minute information about OmniPage Pro. Please read this before installing the application.
Scanner Setup Notes
The Scanner Setup Notes booklet contains the latest information about supported scanners and scanner setup.

Using This Manual

This manual is written with the assumption that you know how to work in the Microsoft Windows environment. Please refer to your Windows user’s manual or online help if you have questions about how to use dialog boxes, menus, and so on.
The following conventions are used in this manual.
Convention: Purpose
Italicized text:
• Emphasizes menu commands, dialog box options, labeled buttons, and file names
For example: “Choose Open... in the File menu.”
• Emphasizes new terms the first time they are used
• Emphasizes important words in a sentence
Note symbol: Introduces a tip or an item of note
Warning symbol: Introduces cautionary text
Chapter 1
Introduction to OmniPage Pro
You probably use your computer for most business correspondence and other written projects. The problem is that certain sources of information cannot be immediately used on a computer.
For example, if you want to incorporate information from a magazine article into a document in your word processor, you somehow have to get the text from the article into your computer. Painstakingly retyping the article is not an appealing solution.
OmniPage Pro offers a smart solution to increase your work productivity. OmniPage Pro’s optical character recognition (OCR) technology accurately and easily converts scanned paper documents and image files into editable text for use in your favorite computer applications. OmniPage Pro does the retyping for you.
Please continue reading this chapter for information on these topics:
• What Is Optical Character Recognition (OCR)?
• The OmniPage Pro Desktop
• Getting Online Help
• Product Support

What Is Optical Character Recognition (OCR)?

Optical character recognition (OCR) is the process of turning an image into computer-editable text. An image is an electronic picture of text such as a scanned paper document or an electronic fax file. Images do not have editable text characters; they have many tiny dots (pixels) that together form a picture of text.
During OCR, OmniPage Pro analyzes an image and defines characters to produce editable text. After OCR, you can export the resulting text to a variety of word-processing, page layout, and spreadsheet applications.

OmniPage Pro OCR

In addition to text recognition, OmniPage Pro can retain the following elements of a document during OCR.
Graphics
Photos, logos, and drawings are examples of graphics.
Text formatting
Font types, font sizes, and font styles (such as bold or italic) are examples of text formatting.
Page formatting
Column structure, paragraph spacing, and placement of graphics are examples of page formatting.
The graphics, text formatting, and page formatting elements that OmniPage Pro retains are determined by the settings you select. See “Settings Guidelines” on page 54 for more information.
OmniPage Pro only recognizes machine-printed characters such as laser-printed or typewritten text. However, it can retain handwritten text, such as a signature, as a graphic.

Basic Steps of OmniPage Pro OCR

These are the basic steps of OmniPage Pro’s OCR process.
1 Bring a document image into OmniPage Pro.
You can scan a paper document, load an image file, or load a fax from Microsoft Exchange. The resulting image appears in OmniPage Pro’s image viewer. See “Bringing Document Images into OmniPage Pro” on page 23 for more information.
2 Create zones to identify areas you want to recognize as text or retain as graphics.
Zones are borders that enclose the parts of a document image that will get processed. You can create zones manually, automatically, or with a template. Any areas not enclosed by zones are ignored during OCR. See “Creating Zones for OCR” on page 26 for more information.
3 Perform OCR to convert text information into editable text characters.
During OCR, OmniPage Pro defines text characters in an image. After OCR, you can check and correct errors in the text. See “Performing OCR on a Document” on page 27 for more information.
4 Export the document to the desired location.
You can save your document to a specified file format, place it on the Clipboard, or send it as a mail attachment. See “Exporting Documents” on page 39 for more information.
There are different ways to start the OCR process in OmniPage Pro. See “Ways to Process Documents” on page 21 for more information.
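As a rough illustration only, the four steps above can be modeled in a few lines of Python. Nothing here is OmniPage Pro code: the Zone and Page classes, the function names, and the sample content are all hypothetical, and real recognition is replaced by plain strings. The sketch shows only the flow described above: zones are processed in order, graphic zones are kept as graphics, and anything left unzoned never reaches the output.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Zone:
    """Hypothetical zone: a bordered area that will be processed."""
    kind: str      # "text" or "graphic"
    content: str   # stand-in for the pixels inside the zone


@dataclass
class Page:
    """Hypothetical page image with its zones and leftover unzoned area."""
    zones: List[Zone] = field(default_factory=list)
    unzoned: str = ""


def bring_in_image() -> Page:
    """Step 1: stand-in for scanning or loading; nothing is zoned yet."""
    return Page(unzoned="headline | body text | signature | margin notes")


def create_zones(page: Page) -> Page:
    """Step 2: enclose the areas to process; the rest stays unzoned."""
    page.zones = [
        Zone("text", "headline"),
        Zone("text", "body text"),
        Zone("graphic", "signature"),
    ]
    page.unzoned = "margin notes"
    return page


def perform_ocr(page: Page) -> List[Tuple[str, str]]:
    """Step 3: text zones become editable text, graphic zones are retained."""
    results = []
    for zone in page.zones:  # page.unzoned is never read, so it is ignored
        label = "text" if zone.kind == "text" else "graphic"
        results.append((label, zone.content))
    return results


def export_document(results: List[Tuple[str, str]]) -> str:
    """Step 4: produce the output, marking retained graphics."""
    lines = []
    for label, content in results:
        lines.append(content if label == "text" else f"[graphic: {content}]")
    return "\n".join(lines)


page = create_zones(bring_in_image())
print(export_document(perform_ocr(page)))
```

In this toy model the handwritten signature survives only as a graphic placeholder, and the margin notes are dropped because no zone encloses them.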

The OmniPage Pro Desktop

OmniPage Pro’s desktop displays the pages of a document in its thumbnail viewer, image viewer, and text viewer. You can use buttons in the Standard, AutoOCR, and Zone toolbars to perform various tasks on the document.
[Figure: the OmniPage Pro desktop, showing the Standard toolbar, AutoOCR toolbar, Zone toolbar, thumbnail viewer, image viewer, and text viewer.]
• The thumbnail viewer displays a picture of each page in the document. The current page has a box around it.
• The image viewer displays the current page’s original image.
• The text viewer displays the current page’s recognized text and retained graphics.
• Drag the splitter between viewers to the left or right to resize a viewer.

AutoOCR Toolbar

The AutoOCR toolbar contains buttons that can activate each step of the OCR process.
[Figure: the AutoOCR toolbar, showing the AUTO, Image, Zone, OCR, and Export buttons. Click the down arrow to display the commands in a button’s drop-down list.]
Set commands in the AutoOCR toolbar buttons for the operations you want to perform. You can choose commands in a button’s drop-down list.
• The AUTO button allows you to activate automatic processing or use the OCR Wizard.
• The Image button allows you to bring in images by scanning or loading pages.
• The Zone button allows you to automatically create zones on images based on their original page layouts or predefined templates.
• The OCR button allows you to perform OCR or train characters for OCR.
• The Export button allows you to save, copy, or send your recognized document as a mail attachment.
Please see “Setting AutoOCR Toolbar Commands” on page 43 for more information on each toolbar button.

Standard Toolbar

The Standard toolbar contains buttons and drop-down lists for performing various tasks.
[Figure: the Standard toolbar. Labeled items: New, Open, Save, Print, Check Recognition, Cut, Copy, Paste, Undo, View, Image Editor, Options, Rotate Image, Straighten Image, Zoom, and Help.]

Zone Toolbar

The Zone toolbar contains buttons that allow you to draw and define zones on a page image.
[Figure: the Zone toolbar. Labeled buttons: Draw Rectangular Zones, Draw Irregular Zones, Add to Zone, Subtract from Zone, Reorder Zones, and Zone Properties.]
See “Customizing Zones” on page 65 for more information.

Options Dialog Box

You can select settings for OmniPage Pro in the Options dialog box. To open it, click the Options button or choose Options... in the Tools menu.
Click the tabs in the Options dialog box to view and select different settings.
See Chapter 4, OmniPage Pro Settings, for more information on settings.

Getting Online Help

After installing OmniPage Pro, you can use its online help system to get information on features and procedures.
Please refer to your Windows documentation to learn more about using Windows online help systems.

Help Menu

Use commands in the Help menu to open topics that provide information on features and procedures.
• Choose OmniPage Pro Help Topics to get contents and index listings for OmniPage Pro help topics.
• Choose Getting Started to get introductory topics to OmniPage Pro, including tutorial exercises.
• Choose Product Support to find out how to get product support services for OmniPage Pro.
• Choose Tip of the Day to get hints for using OmniPage Pro.

Context-Sensitive Help

You can get on-the-spot information about a particular OmniPage Pro command, toolbar button, or dialog box option in the following ways.
• Click the Help button in the Standard toolbar and then click any toolbar button, menu command, or area of the OmniPage Pro desktop to display information about that item.
• Click the question-mark button in the upper-right corner of a dialog box and then click an item in the dialog box to get an explanation of that item.
• Some dialog boxes have a Help button. Click Help to get information about that dialog box.

Product Support

For the fastest and easiest way to get help, please look for solutions in this manual or in the online help. For troubleshooting tips, see “General Troubleshooting Solutions” on page 86.
If you need additional help, product support and information are available to registered users through the services listed in this table.
Service: How to Contact
• World Wide Web home page (common Q&A, patches, updates, troubleshooting procedures, and product information): http://www.caere.com
• Download Service (BBS) (patches, updates): +1 408 395-1631 (8 bits, no parity, 1 stop bit)
• Automated Fax Response Service (common Q&A, updates): +1 408 354-8471 (US fax numbers only)
• Telephone Support in North America (fee-based troubleshooting): +1 408 395-8319
For international telephone numbers, please refer to the insert in your OmniPage Pro package.
Caere Product Support
Please have the following information ready for the best service if you call Caere Product Support:
• OmniPage Pro version and serial number
• The make and model of your computer system, scanner, and other peripheral devices (printer, monitor, and so on)
• The names and version numbers of any other scanning software you use
• The amount of memory (RAM) on your system
To get memory information, choose Start > Settings > Control Panel in the Windows taskbar. Double-click the System icon in the Control Panel to open the System Properties dialog box. On Windows 95, click the Performance tab to see memory information. On Windows NT, click the General tab to see memory information.
• The amount of free hard disk space on your system
To get disk space information, open Windows Explorer and select the drive letter for your hard disk. The status bar will report how much free hard disk space is available.
Chapter 2

Installation and Setup

This chapter provides installation and setup information for OmniPage Pro and the Scan Manager.
For technical and troubleshooting information, please read Chapter 6, Technical Information. For specific scanner information, please read the Scanner Setup Notes included in your OmniPage Pro package.
This chapter contains the following topics:
• Minimum System Requirements
• Installing OmniPage Pro
• Setting Up Your Scanner with OmniPage Pro
• Starting OmniPage Pro
• Registering OmniPage Pro

Minimum System Requirements

You need the following setup, at minimum, to install and run OmniPage Pro:
• Computer with a 486 or higher processor
• Microsoft Windows 95 or Windows NT 4.0
• 8MB of memory (RAM) for Windows 95
16MB of memory for Windows NT
• 30MB of free hard disk space to install application files, the Scan Manager, and one OCR language
40MB to install above files and all OCR languages
10MB of free hard disk space for temporary files during installation
• SVGA or VGA monitor
• Windows-compatible mouse
• A compatible scanner if you plan to scan documents
Please see the Scanner Setup Notes for a list of tested scanners.
Performance and speed will be enhanced if your system exceeds these minimum requirements.
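The memory and disk figures above can be checked with simple arithmetic. The following Python sketch is only an illustration and is not part of OmniPage Pro or its Setup program; the function names are hypothetical, and it assumes that the 10MB of temporary space is needed in addition to the 30MB or 40MB of installation space, which the list above does not state explicitly.

```python
# Figures taken from the minimum requirements list (all values in MB).
MIN_MEMORY_MB = {"Windows 95": 8, "Windows NT": 16}
INSTALL_ONE_LANGUAGE_MB = 30
INSTALL_ALL_LANGUAGES_MB = 40
TEMP_FILES_MB = 10


def disk_space_needed_mb(all_languages: bool) -> int:
    """Free disk space needed during installation.

    Assumption: temporary files are counted on top of the installed files.
    """
    install = INSTALL_ALL_LANGUAGES_MB if all_languages else INSTALL_ONE_LANGUAGE_MB
    return install + TEMP_FILES_MB


def meets_minimum(windows: str, ram_mb: int, free_disk_mb: int,
                  all_languages: bool = False) -> bool:
    """Return True if memory and free disk space meet the stated minimums."""
    enough_memory = ram_mb >= MIN_MEMORY_MB[windows]
    enough_disk = free_disk_mb >= disk_space_needed_mb(all_languages)
    return enough_memory and enough_disk


print(disk_space_needed_mb(all_languages=False))
print(disk_space_needed_mb(all_languages=True))
print(meets_minimum("Windows NT", ram_mb=8, free_disk_mb=100))
```

Under that assumption, a one-language installation needs 40MB free and a full installation needs 50MB free; the last call fails the check because Windows NT requires 16MB of memory.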

Installing OmniPage Pro

OmniPage Pro’s Setup program takes you through installation with onscreen instructions at every step. For best results, do not run any other programs — especially anti-virus programs — during installation.
Be sure your scanner is connected, turned on, compatible with your system, and runs with the software provided by the scanner manufacturer before you install OmniPage Pro.
To install OmniPage Pro:
1 Insert OmniPage Pro’s CD-ROM in the CD-ROM drive.
The Setup program should start automatically. If it does not start, locate your CD-ROM drive in Windows Explorer and double-click the Setup.exe program at the top level of the CD-ROM.
2 Click Next to continue with installation.
3 Follow the onscreen instructions to finish installation.
During installation, you are prompted to enter a serial number. You can find the serial number on the label of the CD-ROM.

Setting Up Your Scanner with OmniPage Pro

To use your scanner with OmniPage Pro, you must install the Scan Manager and select your scanner. You are prompted to do this during OmniPage Pro’s regular installation. However, you can also install the Scan Manager at a separate time.
The Scanner Setup Notes contain the most detailed information about scanner support and setup. You can also find more information in “Scanner Setup Issues” on page 91.
Use the following procedure to install the Scan Manager if you did not install it during OmniPage Pro installation.
To install the Scan Manager:
1 Make sure your scanner is turned on when you start your computer.
2 Close OmniPage Pro if it is open.
3 Insert OmniPage Pro’s CD-ROM in the CD-ROM drive.
4 Cancel the regular Setup program if it starts automatically.
5 Double-click the setup.exe program located in the Scanmgr\Disk 1 folder.
6 Select your scanner when you are prompted.
The Scan Manager finishes installing after you make your scanner selection.
Once your scanner is set up with OmniPage Pro, you can select scanner settings in OmniPage Pro’s Options dialog box. See “Scanner Settings” on page 49 for more information.

To change your scanner selection in the Scan Manager:
1 Make sure your scanner is turned on when you start your computer.
2 Close OmniPage Pro if it is open.
3 Click Start in the Windows taskbar and choose Settings > Control Panel.
4 Double-click the Caere Scan Manager 3.0 icon to open the Scan Manager.
5 Click the Select Scanner tab.
6 Select the name of the scanner you want to use in the Supported Scanners list box.
7 Click Set as Current Scanner and then click Apply.
You are prompted to select the directory containing the files that need to be installed.
8 Insert OmniPage Pro’s CD-ROM in the CD-ROM drive.
Cancel the regular Setup program if it starts automatically.
9 Select Scanmgr\Disk 1 as the installation directory and click OK.
10 Click Close in the Scan Manager after processing is complete.

Starting OmniPage Pro

If you plan to scan, make sure your scanner is attached to your computer and turned on before you start OmniPage Pro.
To start OmniPage Pro, click Start in the Windows taskbar and choose Programs > Caere Applications > OmniPage Pro 8.0. (Use the program group you selected during installation if it is different than Caere Applications.)
Or, double-click the OmniPage Pro icon located in the folder where you installed OmniPage Pro.
See “The OmniPage Pro Desktop” on page 8 for an introduction to OmniPage Pro’s user interface.

Registering OmniPage Pro

Registering your copy of OmniPage Pro entitles you to product support, notification of special offers, and the lowest price offered on the next OmniPage Pro upgrade.
You can use OmniPage Pro for 25 sessions without registering it. The Register dialog box appears the 26th time you launch OmniPage Pro, and the program exits if you do not register at that time.
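The session limit is easy to misread by one, so here is a small Python sketch of the rule as stated above. It is an illustration only, not OmniPage Pro’s actual logic; the function and constant names are hypothetical.

```python
FREE_SESSIONS = 25  # sessions allowed without registering, per this manual


def register_dialog_required(launch_number: int, registered: bool) -> bool:
    """Return True if this launch would require registration to continue.

    launch_number counts launches starting at 1.
    """
    return (not registered) and launch_number > FREE_SESSIONS


print(register_dialog_required(25, registered=False))  # last free session
print(register_dialog_required(26, registered=False))  # registration required
print(register_dialog_required(26, registered=True))   # already registered
```

The 25th launch is still a free session; the 26th is the first one that requires registration.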
If you purchased your product directly from Caere or if you are already a registered user, you should not be prompted to register again.
To register OmniPage Pro by telephone:
1 Click the Register menu to open the Register dialog box.
This dialog box appears automatically the first time you start OmniPage Pro.
[Figure: the Register dialog box. Callouts: You will be asked to provide your serial and key numbers. When you get your registration number, enter it here. One button closes the Register dialog box without registering. Another prints out your registration information.]
2 Click the Call drop-down list and locate the phone number for your country.
3 Call the phone number and ask for a registration number.
You will be asked to provide your serial and key numbers that are listed in the Register dialog box.
4 Enter the registration number in the Registration Number text box and click OK.
The Register menu disappears from the menu bar after you register.
To register OmniPage Pro at Caere’s Web site:
1 Click the Register menu to open the Register dialog box.
[Figure: the Register dialog box. Callouts: You will need to enter your serial and key numbers. When you get your registration number, enter it here. One button opens a help topic that provides instructions and a link to Caere’s Web site.]
2 Open your Web browser and go to the following address:
http://www.caere.com/registration
3 Enter the requested information in the fields provided.
You will need to enter your serial number and key numbers that are listed in the Register dialog box.
4 Click Submit Information when you are finished entering information.
You will be given a registration number.
5 Enter the registration number in the Registration Number text box and click OK.
The Register menu disappears from the menu bar after you register.
Chapter 3

Processing Documents

This chapter describes how to work with documents in OmniPage Pro, including each step of the OCR process.
There are different ways to accomplish the same tasks in OmniPage Pro. You can use toolbar buttons or menu commands to start procedures. OmniPage Pro can perform all OCR steps automatically, or you can start each step individually. You can even do different tasks at the same time.
Please continue reading this chapter for information on these topics:
• Ways to Process Documents
• Bringing Document Images into OmniPage Pro
• Creating Zones for OCR
• Performing OCR on a Document
• Checking OCR Results
• Using OCR in Other Applications
• Working with Documents
• Exporting Documents
For complete information on all OmniPage Pro commands, settings, and procedures, please use OmniPage Pro’s online help. See “Getting Online Help” on page 12 for more information.

Ways to Process Documents

Optical character recognition (OCR) is the process of turning an image into computer-editable text so you do not have to retype the text manually. Chapter 1 explains the basic steps of OmniPage Pro’s OCR process. The following is a summary of those steps.
1 Bring a document image into OmniPage Pro.
See page 23 for more information.
2 Create zones to identify areas you want to recognize as text or retain as graphics.
See page 26 for more information.
3 Perform OCR to convert text information into editable text characters.
See page 27 for more information.
4 Export the document to the desired location.
See page 39 for more information.

Using the OCR Wizard

The OCR Wizard guides you through the entire OCR process by asking you questions about your document and selecting the appropriate settings for you.
To process your document using the OCR Wizard:
1 Set OCR Wizard as the command in the AUTO button’s drop-down list.
2 Click AUTO or choose OCR Wizard in the Process menu.
The first wizard screen appears.
3 Answer the question in the first screen and click Next.
4 Continue answering questions in the screens that follow.

Automatic Processing

Use the AUTO button to process a new document from start to finish or finish processing an open document.
To process your document automatically:
1 Set AutoOCR as the command in the AUTO button’s drop-down list.
2 Set the desired Image, Zone, OCR, and Export commands.
See “Setting AutoOCR Toolbar Commands” on page 43 for more information.
3 Choose Options... in the Tools menu and check that settings are appropriate for your document.
See “Settings Guidelines” on page 54 for more information.
4 Place your document in your scanner if you are scanning.
5 Click AUTO or choose AutoOCR in the Process menu.
Each page of the document is processed and finished in order according to the selected commands. If page images in an open document already have zones, OmniPage Pro will skip zoning for those pages and continue with the selected OCR and export operations.

Performing Multiple Tasks at Once

OmniPage Pro takes advantage of your computer’s ability to handle more than one process at a time. You can simultaneously scan, create zones, recognize, and edit documents. You do not have to wait for any process to complete before moving on to the next task.
For example, if you scan a multiple-page document, you can draw zones on an image as soon as the first page is scanned and you can edit recognized text as soon as it appears in the text viewer. These tasks can be done at the same time other pages are being scanned and recognized.

Starting the OCR Process Outside OmniPage Pro

You can start the OCR process outside OmniPage Pro in a variety of ways. For example, you can use the OCR Aware feature to initiate OCR from another application and paste recognized text into an open document. See “Using OCR in Other Applications” on page 33 for more information.

Bringing Document Images into OmniPage Pro

You can bring document images into OmniPage Pro by:
• Scanning Pages
• Loading Image Files
• Loading Exchange Faxes

Scanning Pages

You can scan paper documents to convert them to electronic images in OmniPage Pro. If a document is already open, scanned pages are inserted as new pages.
To scan in OmniPage Pro, you must install the Scan Manager and select your default scanner. See “Setting Up Your Scanner with OmniPage Pro” on page 16 for more information.
If you use a Visioneer scanner or if your scanner is set up to work with Visioneer’s PaperPort software, see “Using Visioneer Scanners with OmniPage Pro” on page 89.
To scan pages into OmniPage Pro:
1 Place your page in your scanner.
You can scan a stack of pages if you have an automatic document feeder (ADF).
2 Set Scan Image as the command in the Image button’s drop-down list.
3 Choose Options... in the Tools menu and click the Scanner tab to make sure the appropriate settings are selected.
Select Scan Until Empty if you want to scan all pages in an ADF at once. Otherwise, you must click the Image button to scan each subsequent page.
4 Click the Image button or choose Scan Image in the Process menu.
Pages are scanned in order and combined into one working document.

Loading Image Files

An image file is an electronic picture of text, such as a scanned paper document or an electronic fax, that is saved in an image file format such as PCX or TIFF. You can load image files into OmniPage Pro. If a document is already open, loaded image files are inserted as new pages.
The following procedure is for loading image files only. To open an OmniPage Document (.met), use the Open... command in the File menu.
To load image files into OmniPage Pro:
1 Set Load Image as the command in the Image button’s drop-down list.
2 Click the Image button or choose Load Image in the Process menu.
The Load Image dialog box appears.
3 Select the folder location and file type of the file you want to load.
See “Supported File Formats” on page 89 for a complete list of supported file formats.
4 Select the files you want to load.
You can Shift-click or Ctrl-click to select multiple files in the same folder.
5 Click Advanced if you want to select files from more than one folder.
• Select a file and click Add to put it in the Selected Files list.
• Click Add All to add all files from the current folder.
6 Click Open when you have selected all the files you want to load.
Image files are loaded in the order selected and combined into one working document.
Loading Exchange Faxes

You can load fax images into OmniPage Pro from Microsoft Exchange or Outlook if you have the Microsoft Fax component installed with those applications. Please see Microsoft documentation for information on configuring these applications.
If a document is already open, loaded faxes are inserted as new pages.
For best results, ask senders to use Fine or Best mode when they send you faxes.
To load Exchange faxes into OmniPage Pro:
1 Set Load Exchange Fax as the command in the Image button’s drop-down list.
This command only appears in the drop-down list if you have the Microsoft Fax component installed with Microsoft Exchange or Outlook.
2 Click the Image button or choose Load Exchange Fax in the Process menu.
The Exchange dialog box appears.
3 Select the folder that contains the faxes you want to load.
4 Select the faxes you want to load.
You can Shift-click or Ctrl-click to select multiple faxes.
5 Click Open when you have selected all the faxes you want to load.
Exchange faxes are loaded in the order selected and combined into one working document.
Creating Zones for OCR

Page images are displayed in OmniPage Pro’s image viewer where zones are created before OCR. Zones are borders that identify areas of an image that will be recognized as text or retained as graphics. Any part of an image not enclosed by a zone is ignored during OCR.
[Figure: a page image with zones. Callouts: This is a text zone. It will be converted to text during OCR. This is a graphic zone. It will be kept as a graphic image during OCR. All unzoned areas of the page will be ignored during OCR.]
For information on drawing zones manually, modifying zones, deleting unwanted zones, and using zone templates, please see “Customizing Zones” on page 65.

Creating Zones Automatically

OmniPage Pro can analyze a page and create zones automatically for you. It uses the selected setting in the Zone button to determine the text flow on a page and breaks it into ordered zones.
To create zones automatically:
1 Choose a setting in the Zone button’s drop-down list that most closely matches the format of your document.
You can choose Single-Column Pages, Multiple-Column Pages, Tables, Mixed Pages, or a template of your own. See “Zone Button Commands” on page 45 for more information on these settings.
You can also choose HP AccuPage, an advanced Hewlett Packard scanning and zoning technology, as the zone setting if your scanner supports it and HP AccuPage is selected in the Scan Manager.
2 Click the Zone button or choose Auto Zones in the Process menu.
OmniPage Pro automatically draws zones on the current page in the image viewer. Each zone has a number indicating its order and a letter indicating its zone properties.
[Figure: a zoned page image. Zone #1: alphanumeric text. Zone #2: alphanumeric text. Zone #3: graphic.]
Make sure zones are identified correctly before performing OCR. For example, if you want to retain an area as a graphic, that area should be identified as a Graphic zone type. See “Changing Zone Properties” on page 71 for more information.

Performing OCR on a Document

Performing OCR converts an image to editable text. This is also referred to as recognizing text.
OmniPage Pro only recognizes printed characters such as laser-printed or typewritten text. However, it can retain handwritten text, such as a signature, as a graphic.
To perform OCR:
1 Choose Options... in the Tools menu and click the Page Format tab.
2 Select an Output Format setting for your document.
OmniPage Pro uses this setting to determine the output formatting of a document during OCR.
3 Set OCR and Check as the command in the OCR button’s drop-down list.
Or, set Perform OCR as the command if you do not want error checking to begin automatically after OCR.
4 Click the OCR button.
The page is recognized according to the current zones and settings. If there are no zones on the page, zones are created according to the current command in the Zone button.
To schedule a group of documents for OCR at a particular time, see “Scheduling OCR” on page 79.

Checking OCR Results

After performing OCR, recognized text appears in the text viewer where you can check for errors. Error checking starts automatically if you chose OCR and Check as the OCR process command.
OmniPage Pro marks suspected errors in green and inserts a red “reject” character for any character it cannot recognize. To turn off these color markers, choose Show Markers in the View menu.
To check and correct errors:
1 Click the Check Recognition button or choose
Recognition...
in the Tools menu.
The Check Recognition dialog box displays the first suspected
error and a picture of how it originally looked in the image.
Check
Click in this window to enlarge or reduce the picture.
Processing Documents - 28
Checking OCR Results
2 Select one of these options for the word:
• Click Ignore to allow the word to remain as is.
• Click Ignore All to ignore all instances of the word in the current document.
• Click Change to replace the word with the word in the Change to edit box.
• Click Change All to replace all instances of the word with the word in the Change to edit box.
• Click Add to add the word to the current user dictionary.
After you choose an option for the word, OmniPage Pro automatically continues to find the next possible error.
3 Click Done to stop checking recognition.
Color markers are removed from words that have been checked.

Verifying Text

After performing OCR, you can compare recognized text against the original image to verify that the text was recognized correctly.
To verify text against its original image:
1 Double-click any word in the text viewer or select a word and choose Verify Text in the Tools menu.
The Verify Text window opens and shows a picture of the original word and its surrounding area.
2 Click inside the window to enlarge or reduce the picture.
3 Continue double-clicking words that you want to verify.
The window display changes as you select new words.
4 Click the standard Close button to close the window.

Checking OCR Results in Microsoft Word

You can check for OCR errors directly in Microsoft Word 7 or Microsoft Word 97 if you have those versions installed on your computer.
To enable this feature, you must select settings in the Microsoft Word section of OmniPage Pro’s Options dialog box. See “Microsoft Word Settings” on page 53 for more information.
Make sure the *.doc file extension is associated with the version of Word you plan to use. Please refer to your Windows documentation for more information on associating file extensions with applications.
To check and correct errors in Microsoft Word:
1 Perform OCR on your document and then save it as the appropriate file type:
• Save as Word for Windows 7.0 if you are using that version.
• Save as Word 97 if you are using that version.
2 Open the document in Microsoft Word.
The document must be opened on a system that has OmniPage Pro installed.
An OmniPage menu appears in Microsoft Word’s menu bar along with a corresponding toolbar. They provide the Check Recognition, Verify Text, Close Image Viewer, and Remove Check Recognition Support commands.
3 Choose Check Recognition... in the OmniPage menu.
When the first suspected error is located, the Verify Text window appears displaying the original image of the text. Use the buttons in this window to zoom in or out on the image.
The Check Recognition dialog box also appears.
4 Select one of these options for the word:
• Click Ignore to allow the word to remain as is.
• Click Ignore All to ignore all instances of the word.
• Click Change to replace the word with the word in the Change to edit box.
• Click Change All to replace all instances of the word with the word in the Change to edit box.
• Click Add to add the word to the current user dictionary.
After you choose an option for the word, the next possible error is located.
5 Click Done to stop checking recognition.
Color markers are removed from words that have been checked.
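The effect of the five Check Recognition options can be illustrated with a short Python sketch. This is only a model of the behavior described above, not OmniPage Pro code: the function name, the list-of-words representation, and the `decide` callback are all hypothetical and exist purely for illustration.

```python
def check_recognition(words, suspects, decide, user_dictionary):
    """Model of the Check Recognition loop (hypothetical, for illustration).

    words           -- list of recognized words
    suspects        -- indexes of words marked as suspected errors
    decide          -- callable(word) -> (action, replacement or None)
    user_dictionary -- set standing in for the current user dictionary
    """
    result = list(words)
    ignored = set()      # words covered by an earlier "Ignore All"
    changed_all = set()  # words already replaced by "Change All"
    for i in sorted(suspects):
        word = words[i]
        if word in ignored or word in changed_all or word in user_dictionary:
            continue  # already handled, so move on to the next possible error
        action, replacement = decide(word)
        if action == "ignore":
            continue                      # leave this one word as is
        elif action == "ignore_all":
            ignored.add(word)             # skip every later instance too
        elif action == "change":
            result[i] = replacement       # replace this instance only
        elif action == "change_all":
            changed_all.add(word)
            # replace every instance that has not been edited yet
            result = [replacement if w == word else w for w in result]
        elif action == "add":
            user_dictionary.add(word)     # word is no longer a suspect
        elif action == "done":
            break                         # stop checking recognition
        else:
            raise ValueError(f"unknown action: {action}")
    return result
```

For example, answering Change All once for a repeated misreading such as "teh" corrects every instance without asking again, while Add leaves the word unchanged and stores it in the dictionary.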
To verify recognized text against its original image in Microsoft Word, you must process the document in OmniPage Pro and save it to the appropriate Word format. You cannot verify text against original images using the OCR Aware feature.
To verify text against its original image in Microsoft Word:
1 Follow steps 1 and 2 in the preceding instructions if your document is not already open in Microsoft Word.
2 Select a suspect word.
Suspect words are marked in the color that was selected in the Microsoft Word section of OmniPage Pro’s Options dialog box.
You can only verify words that are marked as suspected errors. However, once the Verify Text window is open, you can use its scroll bars and zoom buttons to see any part of the original image.
3 Choose Verify Text... in the OmniPage menu.
The Verify Text window opens and shows a picture of the original word and its surrounding area.
4 Repeat steps 2 and 3 to continue checking other suspect words.
The window display changes as you select new words.
5 Choose Close Image Viewer in the OmniPage menu to close the window when you are done.
Removing OmniPage Pro Data from the Word Document
After checking for OCR errors, you should remove OmniPage Pro data from your document to reduce its file size. You are automatically prompted to remove OmniPage data after all suspect words have been checked. You can also choose Remove Check Recognition Support in the OmniPage menu. The OmniPage menu, toolbar, color markers, and image data will all be removed from the document.

Using OCR in Other Applications

You can use OmniPage Pro’s OCR Aware feature to use OCR in other applications. For example, you can scan, recognize, and paste text directly into a word-processing document without ever leaving the application.
You can use OCR Aware with 32-bit (and some 16-bit) applications that have been registered with OmniPage Pro. An application must be installed on your computer in order to use it with OCR Aware. See page 51 for more information on registering applications with OCR Aware.
For information on other ways to start OCR outside OmniPage Pro, please see the “Starting OCR Outside OmniPage Pro” online help topic.
To use OCR Aware in an application:
1 Align your document in your scanner if you plan to scan.
2 Open the application in which you want to insert recognized text. The application must be registered to work with OCR Aware.
You do not need to open OmniPage Pro itself.
3 Place the cursor at the location in your document where you want to insert recognized text. If no document is open, recognized text will be pasted to the Clipboard.
4 Choose Acquire Text Settings... in the application’s File menu if you want to check the current settings.
5 Choose Acquire Text... in the application’s File menu when you are ready to start the OCR process. OCR processing occurs according to the selected settings.
Recognized text appears at the cursor location in your application. If no document is open, text is pasted to the Clipboard.
Text formatting, such as bold and italics, is retained if the application supports RTF information. Otherwise, only plain text will be pasted. Graphics are retained if the application supports bitmap images.

Working with Documents

OmniPage Pro’s thumbnail, image, and text viewers allow you to look at and work with pages in the current document.
This section describes the following procedures:
• Saving a Document as You Work
• Resizing a Page View
• Changing Pages
• Reordering Pages
• Deleting Pages
• Printing a Document
• Closing a Document
• Closing OmniPage Pro
Drag this splitter to the left or right to resize a view.

Saving a Document as You Work

Click the Save button in the Standard toolbar or choose Save in the File menu to save changes to the current document as you work. The first time a document is saved, the Save As dialog box appears. See “Saving a Document” on page 39 for more information.
If a document has been saved as an OmniPage Document (*.met), all the changes you make in the open document are saved. If a document has been saved as a text-based file type, only the text changes are saved out to that file.
For example, suppose you save the current document as a text file called Memo.txt but continue to work with the recognized text in OmniPage Pro. Whenever you click the Save button, changes in the recognized text will overwrite the Memo.txt file.

Resizing a Page View

You can resize a page displayed in the image viewer or text viewer to enlarge or reduce the view.
To resize a page view:
1 Click in the viewer you want to enlarge or reduce to make it active.
2 Choose a size option in the Zoom drop-down list in the Standard toolbar.
Or, choose Zoom in the View menu and select a size option in the drop-down list.
The page resizes as specified.
You can also click your right mouse button in the viewer you want to resize and select a size option in the shortcut menu.

Changing Pages

The thumbnail viewer, image viewer, and text viewer all display the same page in a document.
You can change pages in a document in the following ways:
• Click the thumbnail of the page you want to display.
The thumbnail of the currently displayed page has a box around it.
• Click the Next Page or Previous Page buttons at the lower-right corner of the OmniPage Pro desktop.
• Choose Next Page, Previous Page, or Go to Page... in the Edit menu.

Reordering Pages

You can reorder pages in a document by dragging their thumbnails to different positions in the thumbnail viewer.
Click the thumbnail of the page you want to move and drag it above the desired page number.
Hold down the Ctrl key while you click thumbnails if you want to select multiple thumbnails to move as a group.

Deleting Pages

If you delete a page from a document in OmniPage Pro, the thumbnail, original image, and recognized text for that page are all deleted.
To permanently delete pages:
• Choose Delete Current Page in the Edit menu to delete the currently displayed page.
• Select one or more thumbnails of pages you want to delete and press the Delete key.
Undoing Changes
You can click the Undo button or choose Undo in the Edit menu to cancel the very last change you made in the text viewer. You can also choose Undo to cancel zone deletions in the image viewer. However, page deletions cannot be undone.

Printing a Document

You can print the current document’s original page images or recognized text.
To print a document:
1 Choose Print... in the File menu and choose one of the following in the submenu:
• Choose Image... to print original page images.
• Choose Text... to print recognized text.
2 Select the desired print settings in the Print dialog box.
3 Click OK to start the print job.
As a shortcut, you can click either the text or image viewer to make it active and then click the Print button to print from that viewer.

Closing a Document

Choose Close in the File menu to close a document. You are prompted to save your document if you have not saved it or have modified it since the last save. Save a document as an OmniPage Document (*.met) if you want to reopen it in OmniPage Pro again.

Closing OmniPage Pro

Choose Exit in the File menu to close OmniPage Pro. You are prompted to save the current document if you have not saved it or have modified it since the last save.

Exporting Documents

You can export a document to other applications by:
• Saving a Document
• Copying a Document to the Clipboard
• Sending a Document as a Mail Attachment
After you export a document, a copy of the document remains open in OmniPage Pro. Save the document as an OmniPage Document (*.met) if you want to reopen it in OmniPage Pro again. OmniPage Documents retain all original images, zones, and recognized text.

Saving a Document

You can save recognized text and original images to disk in a variety of file types.
To save recognized text:
1 Choose Save As... in the File menu.
You can also click the Export button with Save As selected in the drop-down list.
The Save As dialog box appears.
2 Select a folder location and file type for your document.
See “Supported File Formats” on page 89 for a complete list of supported file types.
3 Type in a file name and select save options.
4 Click OK.
The document is saved to disk as specified. Graphics and
formatting are saved in the document only if the selected file type supports them.
To save original images:
1 Choose Save Image... in the File menu.
The Save Image dialog box appears.
2 Select a folder location and file type for your document.
See “Supported File Formats” on page 89 for a complete list of supported file types.
3 Type in a file name and select Save and Image options.
4 Click OK.
The image is saved to disk as specified (zones and recognized text are not saved with the file).

Copying a Document to the Clipboard

You can copy every page of a recognized document to the Clipboard and then paste the text directly into another application.
To copy a document to the Clipboard:
1 Set Copy to Clipboard as the command in the Export button’s drop-down list.
2 Click the Export button or choose Copy to Clipboard in the Process menu.
The document is copied to the Clipboard.
Text formatting, such as bold and italics, is retained when you paste into an application that supports RTF information. Otherwise, only plain text will be pasted. Graphics are retained if the application supports bitmap images.

Sending a Document as a Mail Attachment

You can send a recognized document as a file attached to a mail message if you have a MAPI-compliant mail application, such as Microsoft Exchange or Outlook, installed.
To send a document as a mail attachment:
1 Choose Send Mail... in the File menu.
You can also click the Export button with Send Mail selected in the drop-down list.
The Send Mail dialog box appears.
2 Specify a file type and attachment options for your document.
3 Click OK.
4 Log into your mail application if you are prompted to do so.
A new message appears ready for addressing.
5 Address your mail message as desired and click the Send button.
The document is sent as an attachment to the mail message.
Chapter 4

OmniPage Pro Settings

This chapter describes the settings in the AutoOCR toolbar and Options dialog box. Please look in OmniPage Pro’s online help for more detailed information on settings.
The settings you select for processing documents can greatly affect OCR results. You may have to experiment with different settings to get the results you want. Settings guidelines are provided at the end of the chapter to get you started.
Please continue reading this chapter for information on these topics:
• Setting AutoOCR Toolbar Commands
• Selecting OmniPage Pro Settings
• Accuracy Settings
• Scanner Settings
• Page Format Settings
• Language Settings
• OCR Aware Settings
• Process Settings
• Microsoft Word Settings
• Settings Guidelines

Setting AutoOCR Toolbar Commands

The AutoOCR toolbar buttons allow you to take a document through each step of the OCR process. Every toolbar button has different process commands that can be set for the operations you want to perform. OmniPage Pro can go through all steps automatically, or you can start each step individually.
The AutoOCR toolbar contains the AUTO, Image, Zone, OCR, and Export buttons.
You can set AutoOCR Toolbar commands in two locations:
• Click the down arrow next to each AutoOCR toolbar button and select a process command in the drop-down list.
• Choose Process Settings... in the Process menu or click the Options button and select process commands in the Options dialog box.
The pictures in the AutoOCR toolbar buttons change as you set different process commands. The commands can be activated by clicking the AutoOCR toolbar buttons or choosing commands in the Process menu.

AUTO Button Commands

Use the AUTO button to process a document from start to finish. The AUTO button’s drop-down list contains the AutoOCR and OCR Wizard commands.
AutoOCR
Select AutoOCR to finish processing a new or open document according to the selected process commands. See “Automatic Processing” on page 22 for more information.
OCR Wizard
Select OCR Wizard to have the OCR Wizard guide you through the entire OCR process. See “Using the OCR Wizard” on page 21 for more information.

Image Button Commands

Use the Image button to bring a document image into OmniPage Pro’s image viewer. The Image button’s drop-down list contains the Load Image, Load Exchange Fax, and Scan Image commands.
Load Image
Select Load Image to load existing image files such as TIFF or PCX files.
Load Exchange Fax
Select Load Exchange Fax to load faxes from Microsoft Exchange or Outlook. This command only appears in the drop-down list if you have the full Microsoft Fax application installed.
Scan Image
Select Scan Image to scan paper documents in your scanner. This command only appears in the drop-down list if you have installed the Caere Scan Manager and have selected your default scanner.
Please see “Bringing Document Images into OmniPage Pro” on page 23 for more information.

Zone Button Commands

Use the Zone button to automatically create zones on document images. Zones are boxes that specify what will be recognized as text or retained as graphics on an image. The Zone button’s drop-down list contains the Single-Column Pages, Multiple-Column Pages, Tables, Mixed Pages, and HP AccuPage commands and the names of any zone templates you have created. See “Creating Zones for OCR” on page 26 for more information.
Single-Column Pages
Select Single-Column Pages to have OmniPage Pro automatically draw and order zones on single-column document images such as letters or memos.
Multiple-Column Pages
Select Multiple-Column Pages to have OmniPage Pro automatically draw and order zones on multiple-column document images such as magazine or newspaper articles.
Tables
Select Tables to have OmniPage Pro automatically draw and order zones on table format document images such as spreadsheets, or any page that contains a table.
Mixed Pages
Select Mixed Pages if your document contains multiple pages with a variety of page layouts. OmniPage Pro will automatically draw and order zones on each page.
HP AccuPage
If you use a scanner that supports HP AccuPage®, you can select HP AccuPage as the auto zoning option for scanned pages.
Zone Templates
Select a zone template to create zones on document images using that template. See “Creating Zone Templates” on page 72 for more information.

OCR Button Commands

Use the OCR button to perform the selected OCR operation on document images. The OCR button’s drop-down list contains the Perform OCR, OCR and Check, Train OCR, and Defer OCR commands.
Perform OCR
Select Perform OCR to recognize text on document images. During OCR, OmniPage Pro analyzes the image and identifies characters to produce editable text. See “Performing OCR on a Document” on page 27 for more information.
OCR and Check
Select OCR and Check to recognize text on document images and automatically start checking for errors after OCR. See “Checking OCR Results” on page 28 for more information.
Train OCR
Select Train OCR to teach OmniPage Pro how to recognize special characters. These pre-recognized characters are saved in a training file, which OmniPage Pro can use to compare with the characters in document images during OCR. See “Training OCR for Special Characters” on page 74 for more information.
Defer OCR
Select Defer OCR to delay text recognition during automatic processing. OmniPage Pro will process your document up to the point of OCR and then ask if you want to schedule the document to be finished later. See “Scheduling OCR” on page 79 for more information.

Export Button Commands

Use the Export button to export recognized text and retained graphics to other applications. The Export button’s drop-down list contains the Save As, Send Mail, Copy to Clipboard, and Defer Export commands.
Save As
Select Save As to save a recognized document to disk in a specified file format. See “Saving a Document” on page 39 for more information.
Send Mail
Select Send Mail to send a recognized document as a file attached to a mail message if you have a MAPI-compliant mail application, such as Microsoft Exchange or Outlook, installed. See “Sending a Document as a Mail Attachment” on page 41 for more information.
Copy to Clipboard
Select Copy to Clipboard to place a copy of a recognized document on the Clipboard. See “Copying a Document to the Clipboard” on page 40 for more information.
Defer Export
Select Defer Export if you do not want to export your document right after automatic processing. OmniPage Pro will process your document up to the point of export and then stop.
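The way the four AutoOCR steps chain together, and how a Defer command cuts automatic processing short, can be pictured with a small Python sketch. It is an illustration only: OmniPage Pro exposes no such scripting interface in this manual, and the `plan_auto_processing` function and its dictionary argument are hypothetical. Only the step order and the command names come from the text above.

```python
# Steps of automatic processing, in the order the AutoOCR toolbar presents them.
STEPS = ("image", "zone", "ocr", "export")

# Commands that stop automatic processing at their step.
DEFER_COMMANDS = {"Defer OCR", "Defer Export"}


def plan_auto_processing(commands):
    """Return (steps that run, step where processing stops or None).

    commands maps each step name in STEPS to the process command
    selected in that button's drop-down list.
    """
    planned = []
    for step in STEPS:
        command = commands[step]
        if command in DEFER_COMMANDS:
            # Processing runs up to this point and then stops.
            return planned, step
        planned.append((step, command))
    return planned, None


planned, stopped_at = plan_auto_processing({
    "image": "Scan Image",
    "zone": "Mixed Pages",
    "ocr": "OCR and Check",
    "export": "Defer Export",
})
print(planned)     # the image, zone, and OCR steps run
print(stopped_at)  # "export": the document is processed but not exported
```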

Selecting OmniPage Pro Settings

Click the Options button or choose Options... in the Tools menu to open the Options dialog box. This is the central location for OmniPage Pro settings.
Click each tab to view and select different settings.
Click for a description of each setting.
Documents require different settings depending on their input attributes and your output goals. To get the best results, learn how to identify document attributes and make selections for them. You may have to experiment with different settings to get the results you want. Refer to the Settings Guidelines beginning on page 54 for more information.

Accuracy Settings

Click the Accuracy tab to select settings that affect OCR accuracy the most.
Language Analyst evaluates and replaces unknown words with words most likely to be correct during OCR.
Training files help recognize special characters during OCR.
Select a brightness setting to account for variations in paper and print quality when you scan.
Select Small text if you are processing a page containing text that is < 6 pt.

Scanner Settings

Click the Scanner tab to select settings for scanning pages.
This is recommended for black and white pages.
This is recommended for pages with colored backgrounds, colored text, or pages containing grayscale graphics.
This is recommended for highest accuracy with HP scanners that support HP AccuPage.
Use these settings if your scanner has an automatic document feeder.

Page Format Settings

Click the Page Format tab to select settings that determine how the formatting of a page is handled during OCR.
Select a setting that describes how your original page looks.
Select a setting to determine what your page will look like after OCR.
Click to select font options for recognized text.

Language Settings

Click the Language tab to select language settings for your document.
Select the document’s main language.
Select additional languages for a multi-language document.
This is the language that will be used in dialog boxes, windows, and menu commands.
This is the character used in place of unknown characters.

OCR Aware Settings

Click the OCR Aware tab to select settings for the OCR Aware feature. OCR Aware allows you to initiate OCR from another application. See “Using OCR in Other Applications” on page 33 for more information.
An application must be registered to work with OCR Aware.
Some applications may be pre-registered with OCR Aware during OmniPage Pro installation. These applications will display in the Registered list box.
If your application is not listed, click Browse... to locate the application file (*.exe) and add it to the Registered list box.
Click Register Office 97... to register Office 97 applications.
To register an application with OCR Aware:
1 Launch the application you want to register and open a document in it. This will ensure that the application name appears in the list box in step 5.
2 Choose Options… in OmniPage Pro’s Tools menu.
3 Click the OCR Aware tab in the Options dialog box.
4 Make sure that Enable OCR Aware is selected.
5 Select the name of the application you want to register in the Unregistered list box.
6 Click Add >> to add the selected application to the Registered list box and then click OK.
OmniPage adds the Acquire Text... and Acquire Text Settings... commands to the File menus of registered applications.

Process Settings

Click the Process tab to set commands and settings for each step of OCR.
The OCR Wizard will guide you through the OCR process when you click AUTO.
Specifies where newly loaded or scanned images are added to an open document.

Microsoft Word Settings

Click the Microsoft Word tab to select settings for performing check recognition directly in Microsoft Word. See “Checking OCR Results in Microsoft Word” on page 30 for more information.
Select this if you want to check for OCR errors in Microsoft Word.
Select the color in which you want suspected errors to appear in Microsoft Word.
Checking recognition in Microsoft Word is only supported in Microsoft Word versions 7 and 97. Make sure you associate the *.doc extension with the version you plan to use. Please refer to your Windows documentation for more information.

Settings Guidelines

The settings you select in OmniPage Pro can greatly affect OCR results. Make sure that settings are appropriate for your document before you begin processing. You may have to experiment with different settings to get the results you want.
Answer the following questions to get settings recommendations for your documents.
What type of document are you processing?
• Magazine and newspaper pages, page 55
• Memos and letters, page 55
• Spreadsheets and tables, page 55
• Legal documents, page 56
• Mixed formats or not sure, page 56
What is the quality of the original document?
• Poor or not sure, page 57
• Good, page 57
How much original formatting do you want to keep?
• Minimal, page 58
• Some, page 58
• As much as possible, page 59
Do you want to retain graphics in your document?
• Yes, page 60
• No, page 60
How many languages are in your document?
• One language, page 61
• More than one language, page 61
Are you processing a multi-page document?
• Yes, page 62
• No, page 62
What type of document are you processing?
Magazine and newspaper pages
Recommendations
• Select Multiple columns in the Page Format settings.
• Select the appropriate page size and orientation in the Scanner settings if you are scanning.
• Draw zones manually or modify automatically created zones if auto zoning does not successfully create zones around all page areas you want to process. See “Customizing Zones” on page 65 for more information. Keep associated sections of text, such as paragraphs, together in one zone. Omit unnecessary parts of the page such as separator lines between columns.
Memos and letters
Recommendations
• Select Single column in the Page Format settings.
• Select the appropriate page size and orientation in the Scanner settings if you are scanning.
• Identify graphics that you want to retain as Graphic zone types.
Spreadsheets and tables
Recommendations
• Select Table in the Page Format settings.
• Select the appropriate page size and orientation in the Scanner settings if you are scanning.
• Select Retain flowing columns in the Page Format settings.
• Identify the zone type as Graphic for zones that contain graphics you want to retain.
• Identify the zone content as Numeric for zones that only contain numbers.
Legal documents
Recommendations
• Select Multiple columns in the Page Format settings if text appears in two or more columns.
• Select Single column in the Page Format settings if the document has one, page-wide text column.
• Select the appropriate page size and orientation in the Scanner settings if you are scanning.
• Draw zones manually or modify automatically created zones to omit unnecessary parts of the page. For example, do not include line numbers in a zone if you plan to renumber lines in your word processor.
• Select Table in the Page Format settings and select Hard carriage return after every line in the Save As dialog box if you want to preserve line numbering.
Mixed formats or not sure
Recommendations
• Select Mixed pages in the Page Format settings.
• Select the appropriate page size and orientation in the Scanner settings if you are scanning.
• Draw zones manually or modify automatically created zones if auto zoning does not successfully create zones around all page areas you want to process. See “Customizing Zones” on page 65 for more information. Keep associated sections of text, such as paragraphs, together in one zone. Omit unnecessary parts of the page such as unwanted graphics.
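The document-type recommendations can be summarized as a simple lookup. The Python sketch below is only a memory aid under stated assumptions: the function name, its parameters, and the short document-type keys are hypothetical, and only the setting names (Multiple columns, Single column, Table, Mixed pages) are taken from the guidelines.

```python
# Page Format setting recommended for each document type (keys are
# illustrative labels, not OmniPage Pro identifiers).
PAGE_FORMAT_BY_DOCUMENT_TYPE = {
    "magazine or newspaper": "Multiple columns",
    "memo or letter": "Single column",
    "spreadsheet or table": "Table",
    "mixed or not sure": "Mixed pages",
}


def recommend_page_format(document_type, text_columns=1,
                          preserve_line_numbers=False):
    """Return the Page Format setting the guidelines suggest."""
    if document_type == "legal":
        if preserve_line_numbers:
            # Also select "Hard carriage return after every line"
            # in the Save As dialog box.
            return "Table"
        return "Multiple columns" if text_columns >= 2 else "Single column"
    return PAGE_FORMAT_BY_DOCUMENT_TYPE[document_type]


print(recommend_page_format("memo or letter"))
print(recommend_page_format("legal", text_columns=2))
```

Whatever the document type, the guidelines also recommend selecting the appropriate page size and orientation in the Scanner settings when scanning.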
What is the quality of the original document?
Poor or not sure
Degraded copies, colored or shaded backgrounds or text, run-together or broken text characters
Recommendations for scanning
• Select Grayscale with 3D OCR in the Accuracy settings if you have a grayscale scanner and your page contains grayscale graphics, colored background, or colored text.
• Select Grayscale with HP AccuPage in the Accuracy settings if you have an HP scanner that supports HP AccuPage, and you selected HP AccuPage in the Scan Manager.
• For best accuracy, use the Black and white setting if your pages are black and white. Lighten the setting for thick, run-together text characters or dark backgrounds. Darken the setting for thin, broken text characters.
• Try to scan original documents rather than photocopies.
Other recommendations
• Select Use Language Analyst in the Accuracy settings. OmniPage Pro will evaluate words and make logical replacements for hard-to-recognize characters.
• Draw zones manually to omit any smudges or scribbles on the page.
• Choose Check Recognition... in the Tools menu to locate possible errors after OCR.
• Ask senders to select Fine or Best mode when they send faxes that you plan to recognize.
Good
Clear, well-formed, black text characters on a clean, white background
Recommendations
• Select Black and white in the Accuracy settings for the fastest processing if you are scanning. Use a setting near the middle of the slider box.
• Deselect Use Language Analyst in the Accuracy settings for faster processing.
How much original formatting do you want to keep?
Minimal
Keep one font and one font size only
Recommendations
• Select Remove formatting in the Page Format settings.
• Click Font Mapping... in the Page Format settings and select one font and one font size to be used for all text.
• Select ANSI in the Save As dialog box if you want to be able to open the document in any application.
Some
Keep font characteristics and paragraph formatting
Recommendations
• Select Retain font and paragraph formatting in the Page Format settings.
• Click Font Mapping... in the Page Format settings and select the fonts you want mapped to various font types.
• Save to a file format, such as Rich Text Format (RTF), that supports the formatting. Text formatting, such as bold and italics, is retained if the application supports RTF information. Otherwise, only plain text will be retained. Graphics are retained if the application supports bitmap images.
As much as possible
Keep font characteristics, paragraph formatting, column formatting and graphic positioning
Recommendations
• Select True Page in the Page Format settings to retain the original appearance of a page using frames. The formatting will be more precise but will be more difficult to edit.
• Select Retain flowing columns in the Page Format settings if your page contains multiple columns and you want text to flow between paragraphs and columns in your target application. The formatting may be less precise than True Page but will be easier to edit.
Please note: The Retain flowing columns setting uses frames when necessary to maintain column formatting and graphic positioning. Although frames will appear in the text viewer, only required frames, such as frames around graphics, will be exported.
• Click Font Mapping... in the Page Format settings and select the fonts you want mapped to various font types.
• Make sure all parts of the page are included within zones. Any part not enclosed within a zone is ignored during OCR and will not appear in the recognized document.
• Save to a file format, such as Rich Text Format (RTF), that supports the formatting. Text formatting, such as bold and italics, is retained if the application supports RTF information. Otherwise, only plain text will be retained. Graphics are retained if the application supports bitmap images.
Do you want to retain graphics in your document?
Yes
Keep graphics such as logos and photos during OCR processing
Recommendations for scanning
• Select Grayscale with 3D OCR in the Scanner settings if you are scanning with a grayscale scanner or loading a grayscale image file and you want to retain grayscale graphics.
• Select Black and white in the Scanner settings if you are scanning line-art drawings.
Please note: The Grayscale with HP AccuPage setting does not support grayscale graphics.
Other recommendations
• Select Multiple columns or Mixed pages in the Page Format settings. The Single column setting will not automatically detect graphics.
• Manually draw zones around graphic areas if necessary.
• Make sure separate zones are drawn around graphic areas and text areas.
• Make sure graphic zones are identified as Graphic zone types. These are marked with a G in the upper-right corner.
• Select Retain graphics in the Save As dialog box when you save a document to another file format.
• To save graphics separately from text after OCR, choose Save Image... in the File menu and select Save each graphic zone to a file.
No
Ignore graphics such as logos and photos during OCR processing
Recommendations
• For best accuracy, select Black and white in the Accuracy settings if your page contains black text on a white background.
• Deselect Retain graphics in the Save As dialog box when you save a document to another file format.
How many languages are in your document?
One language
Recommendations
• If your document contains a language that is not installed in OmniPage Pro, you can add languages to OmniPage Pro by uninstalling and then reinstalling it.
• Select the document language in the Language settings.
• For faster processing and more accurate results, select only the language that appears in your document in the Language settings.
More than one language
Recommendations
• If your document contains languages that are not installed in OmniPage Pro, you can add languages to OmniPage Pro by uninstalling and then reinstalling it. You will be prompted during installation to select which languages you want installed. Select all languages that your document contains, as well as any other languages you commonly use.
• Select the main document language and any additional languages in the Language settings.
• For faster processing and more accurate results, select only the languages that appear in your document in the Language settings.
Are you processing a multi-page document?
Yes
Recommendations if you have an automatic document feeder (ADF)
• Select Scan until empty in the Scanner settings to scan a stack of pages at once. Otherwise, you must click the Image button to scan each subsequent page.
• Select Double-sided pages to scan pages with print on both sides. You will be prompted to turn the stack over.
• Insert blank pages to separate more than one job within a stack of pages. You can save pages between blank pages as separate files after OCR.
Other recommendations
• Set the desired process commands and click AUTO to automatically process each page of your document in order.
• Create and use a zone template if all pages have similar zoning requirements. See “Creating Zone Templates” on page 72 for more information.
• Choose Schedule OCR... in the Process menu to schedule processing for a specific time. Pick a time that you plan to be away from your computer.
• After OCR, choose Save As... in the File menu. You can select an option to save the recognized document as a single file, one file per page, or a new file after each blank page.
No
Recommendations
• Set the desired process commands and click AUTO to automatically process the page.
• Click the Image button to add more pages to the document by scanning or loading images.
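The option of saving a new file after each blank page, mentioned in the multi-page recommendations above, can be pictured with a short Python sketch. This is an illustration only, not OmniPage Pro's own logic: the function name split_jobs and the use of None to stand for a blank separator page are assumptions made for the example.

```python
from typing import List, Optional


def split_jobs(pages: List[Optional[str]]) -> List[List[str]]:
    """Group recognized pages into jobs, starting a new job after each blank page.

    A blank separator page is represented by None (an assumption of this sketch).
    """
    jobs: List[List[str]] = [[]]
    for page in pages:
        if page is None:
            # Blank page: close the current job, but never open two empty jobs in a row.
            if jobs[-1]:
                jobs.append([])
        else:
            jobs[-1].append(page)
    # Drop an empty job left behind by a leading or trailing blank page.
    return [job for job in jobs if job]


stack = ["invoice p1", "invoice p2", None, "letter p1", None, "memo p1", "memo p2"]
for number, job in enumerate(split_jobs(stack), start=1):
    print(number, job)
```

Each inner list corresponds to one output file. Saving as a single file or as one file per page would be the two simpler cases of the same grouping.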
Chapter 5

Customizing OCR

OmniPage Pro has many features that allow you to customize the way your documents are handled during OCR. This chapter describes how to use these features.
Please continue reading this chapter for information on these topics:
• Adjusting Page Images Before OCR
• Customizing Zones
• Specifying Fonts
• Training OCR for Special Characters
• Creating User Dictionaries
• Saving Settings Files
• Scheduling OCR

Adjusting Page Images Before OCR

You can rotate and straighten page images in OmniPage Pro’s image viewer before zoning and OCR take place. This is recommended to improve OCR accuracy on pages that are not oriented correctly.
If you need to rotate or straighten a page, be sure to do so before you create zones because all zones are deleted during these operations.
To rotate a page image:
1 Click on the page image to make the image viewer active.
2 Click the Rotate Image button to rotate the image 90 degrees (clockwise) at a time. Or, choose Rotate in the View menu and select 90, 180, or 270 degrees.
To straighten a page image:
1 Click on the page image to make the image viewer active.
2 Click the Straighten Image button. Or, choose Straighten Image in the View menu.
OmniPage Pro straightens the page image up to a maximum of 10 degrees. OmniPage Pro will not straighten a page if it determines that it is unnecessary.
You can also have OmniPage Pro automatically rotate or straighten pages as necessary during OCR by selecting those options in the Page Format section of the Options dialog box.
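To make the 90, 180, and 270 degree choices concrete, here is a minimal Python sketch of clockwise rotation in 90-degree steps. It assumes a page image held as a list of pixel rows. The names Bitmap, rotate_90_clockwise, and rotate are invented for the example and do not describe OmniPage Pro's internals.

```python
from typing import List

Bitmap = List[List[int]]  # rows of pixel values (assumed representation)


def rotate_90_clockwise(image: Bitmap) -> Bitmap:
    """Return a copy of the image rotated 90 degrees clockwise."""
    # Reversing the row order and then transposing moves the bottom-left
    # pixel to the top-left corner, which is a clockwise quarter turn.
    return [list(row) for row in zip(*image[::-1])]


def rotate(image: Bitmap, degrees: int) -> Bitmap:
    """Rotate clockwise by 90, 180, or 270 degrees using repeated quarter turns."""
    if degrees not in (90, 180, 270):
        raise ValueError("degrees must be 90, 180, or 270")
    for _ in range(degrees // 90):
        image = rotate_90_clockwise(image)
    return image


page = [[1, 2, 3],
        [4, 5, 6]]
print(rotate(page, 90))
```

Because the pixel grid changes shape and position under rotation, any zone coordinates drawn on the old image would no longer line up, which is consistent with the advice to rotate before creating zones.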

Customizing Zones

Zones are borders created around areas of a page image to identify what will be recognized as text or retained as a graphic during OCR. Zones play a big part in determining OCR results.
You can create zones automatically, manually, or with a template. Topics in this section describe how you can customize zones including:
• Drawing Zones Manually
• Modifying Zones
• Deleting Zones
• Changing Zone Properties
• Creating Zone Templates
For information on creating zones automatically, please see “Creating Zones for OCR” on page 26.

Zone toolbar

The Zone toolbar contains buttons for drawing and modifying zones:
• Draw Rectangular Zones
• Draw Irregular Zones
• Add to Zone
• Subtract from Zone
• Reorder Zones
• Zone Properties

Drawing Zones Manually

You can draw zones manually on a page image using buttons in the Zone toolbar. Rectangular zones are the most common, but you can also draw irregular-shaped zones.
To draw rectangular zones:
1 Click the Zone Properties button and select the zone type and content for the zone you are about to draw. See “Changing Zone Properties” on page 71 for more information.
2 Click the Draw Rectangular Zones button.
The mouse pointer in the image viewer becomes a drawing tool.
3 Enclose an area of the image you want as a zone by holding down the mouse button and dragging the drawing tool to form a rectangular box.
Try to keep areas of text, such as paragraphs or single columns, together in the same zone.
4 Release the mouse button when you are done.
A number appears within the zone indicating its processing order.
5 Repeat steps 3 and 4 until you have finished drawing zones around the desired areas of the page.
You cannot draw overlapping zones. If you attempt to draw a zone over an existing zone, the borders of the new zone will wrap the boundaries of the existing zone.
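The no-overlap rule for rectangular zones comes down to a simple geometric test. The Python sketch below shows only that test, under the assumption that a zone is stored as pixel edges with exclusive right and bottom values. Rect and overlaps are hypothetical names, and the sketch does not reproduce how OmniPage Pro reshapes a new zone around an existing one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int   # exclusive edge (assumption of this sketch)
    bottom: int  # exclusive edge (assumption of this sketch)


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if the two rectangles share at least one pixel."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


existing = Rect(0, 0, 100, 50)
print(overlaps(existing, Rect(80, 40, 200, 90)))   # the new zone cuts into the existing one
print(overlaps(existing, Rect(100, 0, 200, 50)))   # the zones only touch along an edge
```

Zones that merely share an edge are not treated as overlapping here, so two columns can sit side by side.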
To draw irregular-shaped zones:
1 Click the Zone Properties button and select the zone type and content for the zone you are about to draw. See “Changing Zone Properties” on page 71 for more information.
2 Click the Draw Irregular Zones button.
The mouse pointer in the image viewer becomes a drawing tool.
3 Position the drawing tool where you want to start drawing the first side of the zone.
4 Click the mouse button once.
5 Drag the drawing tool to form the first side of your zone.
6 Click the mouse button when you have drawn the desired line length.
7 Draw a perpendicular line in either direction to form the next side of the zone.
8 Repeat steps 6 and 7 to finish drawing each side of your zone.
You will not be allowed to draw a line if it constitutes a restricted shape. The following zone shapes are restricted:
• Indented along the bottom
• Indented along the top

Modifying Zones

You can modify zones by moving, resizing, reordering, extending, subtracting, connecting, or dividing them.
To move zones:
1 Deselect the buttons in the Zone toolbar. (If one of the first two drawing buttons is selected, you do not have to deselect it.)
2 Place the mouse pointer inside a zone.
3 Hold down the mouse button and drag the zone to the desired location.
To resize zones:
1 Deselect the buttons in the Zone toolbar. (If one of the first two drawing buttons is selected, you do not have to deselect it.)
2 Select the zone you want to resize by clicking inside it.
The selected zone is shaded and handles appear on its border.
3 Place the mouse pointer over a handle so that it changes to a two-way arrow.
4 Hold down the mouse button and drag the handle in the direction that you want to enlarge or reduce the zone.
5 Release the mouse button when you are done.
The zone border changes to display the modified zone area.
To reorder zones:
1 Click the Reorder Zones button.
The numbers in the zones disappear.
2 Click within the zone you want recognized first.
The number 1 appears in the zone.
3 Click within the zone you want recognized next.
The number 2 appears in the zone.
4 Repeat step 3 until all the zones are appropriately ordered.
If you do not number all the zones, they are automatically numbered for you when you start OCR.
The numbered order of zones determines the order in which text will be placed on a recognized page. However, if you select True Page or Retain flowing columns as the Output Option for a page, the order of the text will be based on the order of the original page.
To extend an area of a zone:
1 Click the Add to Zone button.
The mouse pointer in the image viewer becomes a drawing tool with a plus sign.
2 Position the drawing tool at the point where you want to start extending the zone.
3 Hold down the mouse button and drag the drawing tool in the direction that you want to extend the zone.
4 Release the mouse button when you are finished extending the zone.
The zone border changes to display the modified zone area.
[Figure: The left area of this zone has been extended downward.]
To subtract an area of a zone:
1 Click the Subtract from Zone button.
The mouse pointer in the image viewer becomes a drawing tool with a minus sign.
2 Position the drawing tool at the point where you want to start subtracting from the zone.
3 Hold down the mouse button and drag the drawing tool in the direction that you want to subtract from the zone.
4 Release the mouse button when you are finished subtracting from the zone.
The zone border changes to display the modified zone area.
To connect two or more zones:
1 Click the Add to Zone button.
The mouse pointer in the image viewer becomes a drawing tool with a plus sign.
2 Hold the mouse button down and drag the drawing tool over the area where you want the zones to be connected.
3 Release the mouse button when you are done.
The zone border changes to display the modified zone area.
To divide a zone:
1 Click the Subtract from Zone button.
The mouse pointer in the image viewer becomes a drawing tool with a minus sign.
2 Hold the mouse button down and drag the drawing tool over the area where you want to divide the zone.
3 Release the mouse button when you are done.
The zone border changes to display the modified zone area.

Deleting Zones

You can delete the current zones if you want to create new zones. You can also delete individual zones that you do not want to process during OCR. Any part of a page image not enclosed by a zone is ignored during OCR.
To delete and replace the current zones automatically, click the Zone button. You will be prompted to replace the current zones.
To delete zones:
1 Select the zone you want to delete by clicking inside the zone.
• Shift-click to select additional zones.
• Choose Select All in the Edit menu to select all zones on the current page.
Selected zones are shaded.
2 Press the Delete key or choose Clear in the Edit menu.
The selected zones disappear.

Changing Zone Properties

You can set certain properties for zones to customize how each zone will be treated during OCR. The Zone Properties dialog box contains settings for zone type and zone content.
Zone Type
Every zone on a page has a zone type setting. You can select the following zone types:
• Single-column zone for text zones that contain a single column
• Multiple-column zone for text zones that contain multiple columns
• Table zone for text zones that contain text in tabbed columns
• Mixed zone for text zones that contain a mixture of column layouts
• Graphic zone for photos, drawings, and areas of text that you want to retain as a graphic. The letter G appears within graphic zones. OCR is not performed on graphic zones.
Zone Content
All text zones on a page also have a zone content setting. This specifies the characters OmniPage Pro looks for within a zone during OCR. You can select Alphanumeric or Numeric as the zone content setting. The letter A appears within an alphanumeric zone and the letter N appears within a numeric zone.
For example, if a particular zone only contains numbers and mathematical signs, you can specify the contents of that zone to be Numeric. OmniPage Pro will only look for numeric characters in that zone during recognition.
OmniPage Pro assigns zone properties to each zone when it creates zones automatically. You do not need to change the zone properties unless you want to modify the way zones will be treated during OCR.
To change the properties of a zone:
1 Select the zone you want to modify by clicking it.
You can Shift-click to select multiple zones. Selected zones are shaded.
2 Click the Zone Properties button to open the Zone Properties dialog box.
The settings in this dialog box will be blank if multiple zones with different settings are selected at once.
3 Select a zone type for the selected zones.
4 Select a zone content for the selected zones.
You can only select a zone content setting for text zones.
5 Click the standard Close button when you are done.
You can also change a zone’s type and content settings individually by clicking your right mouse button over the zone and choosing a setting in the shortcut menu that appears.

Creating Zone Templates

You can use zone templates to create zones on a page image. A zone template contains zone attributes including size, shape, position, order, type, and content. Zone templates are useful if you frequently process documents that have the same layouts and similar content.
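The attributes a zone template carries (size, shape, position, order, type, and content) can be modeled as a small data structure. The Python sketch below is one hypothetical way to do that. The class and field names are invented, the alphanumeric default is an assumption, and the actual template file format is not described in this manual.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class ZoneType(Enum):
    SINGLE_COLUMN = "single-column"
    MULTIPLE_COLUMN = "multiple-column"
    TABLE = "table"
    MIXED = "mixed"
    GRAPHIC = "graphic"


class ZoneContent(Enum):
    ALPHANUMERIC = "A"
    NUMERIC = "N"


@dataclass
class Zone:
    order: int                         # processing order shown inside the zone
    outline: List[Tuple[int, int]]     # corner points: size, shape, and position
    zone_type: ZoneType
    content: Optional[ZoneContent] = None   # only text zones have a content setting

    def __post_init__(self) -> None:
        if self.zone_type is ZoneType.GRAPHIC and self.content is not None:
            raise ValueError("graphic zones have no zone content setting")
        if self.zone_type is not ZoneType.GRAPHIC and self.content is None:
            self.content = ZoneContent.ALPHANUMERIC  # assumed default for this sketch


@dataclass
class ZoneTemplate:
    name: str
    zones: List[Zone] = field(default_factory=list)

    def in_processing_order(self) -> List[Zone]:
        """Return the zones sorted by their processing order."""
        return sorted(self.zones, key=lambda zone: zone.order)


template = ZoneTemplate("invoice", [
    Zone(2, [(0, 0), (200, 0), (200, 80), (0, 80)], ZoneType.GRAPHIC),
    Zone(1, [(0, 100), (200, 100), (200, 300), (0, 300)], ZoneType.TABLE, ZoneContent.NUMERIC),
])
print([zone.zone_type.value for zone in template.in_processing_order()])
```

Storing the outline as corner points covers both rectangular and irregular zones with one field.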
To create a zone template:
1 Load a page image and create the desired zones.
2 Choose Save Zone Template... in the Tools menu.
The New Template dialog box appears.
3 Type a name for your file in the File name text box.
4 Click OK.
The zone template file is saved in the data folder in your installation folder. It can be selected in the Zone button drop-down list.
To create zones with a template:
1 Select the zone template that you want to use in the Zone button drop-down list.
2 Click the Zone button or choose Template in the Process menu.
OmniPage Pro creates zones on the page image using the zone template.

Specifying Fonts

You can retain the font characteristics in your document during OCR if you select an Output Format option other than Remove formatting in the Page Format section of the Options dialog box. OmniPage Pro automatically maps detected font types to specified fonts.
To map fonts, OmniPage Pro analyzes text and categorizes it as one of these font types:
• Proportional Serif
Character spacing varies depending on the character; short lines finish off the letter strokes. The body text in this manual is an example of this font type.
• Proportional Sans-Serif
Character spacing varies depending on the character; letter strokes do not have finishing lines. The headings in this manual are an example of this font type.
• Monospaced Serif
Character spacing is the same for each character; short lines finish off the letter strokes. Courier is an example of this font type.
• Monospaced Sans-Serif
Character spacing is the same for each character; letter strokes do not have finishing lines.
To customize the font mapping for font types:
1 Choose Options... in the Tools menu to open the Options dialog box.
2 Click the Page Format tab.
3 Click Font Mapping... to open the Font Mapping dialog box.
4 Select the font you want mapped to each font type.
The fonts available in the drop-down lists depend on the TrueType fonts installed on your system.
5 Click OK when you are done.
The selected fonts are applied to text when their corresponding font types are detected during OCR.

Training OCR for Special Characters

A training file is a set of pre-recognized text characters that OmniPage Pro compares with characters on a page image during OCR. You can create a training file for special characters that might normally be difficult to recognize such as the copyright symbol © or the registered trademark symbol ®.
To create a training file:
1 Open the image file or scan the page that includes characters you want to train.
2 Create zones around the text that you want to train.
3 Set Train OCR as the command in the OCR button’s drop-down list.
4 Click the OCR button or choose Train OCR in the Process menu.
OmniPage Pro analyzes the document and then opens the Train Characters dialog box. The dialog box shows the original character images and OmniPage Pro’s interpretation of the images.
5 Double-click a character you want to train. Or select it and click Specify.
Most characters do not need to be trained. Look for uncommon characters such as the copyright symbol ©.
The Specify Character dialog box shows how the selected character appeared in the original page image. You can click the character you want to associate with the selected character; the associated character then appears in the dialog box.
6 Specify how you want OmniPage Pro to interpret the character during OCR by entering a character in the Character edit box.
7 Click OK to return to the Train Characters dialog box.
8 Repeat steps 5–7 to continue specifying characters.
9 Click Save to save the specified characters to a training file. Or, click Append to add the specified characters to another training file.
After saving or appending to a file, you are asked if you want to make this the current training file. Click Yes to recognize the current page using the training file you just created. Click No to return to the image without recognizing it.
Training files are saved in the data folder in your installation folder. You can select them in the Accuracy section of the Options dialog box.
To edit a training file:
1 Choose Edit Training File... in the Tools menu.
A dialog box appears listing all your training files.
2 Double-click the training file you want to edit. Or, select it and click Edit.
The Train Character dialog box displays characters in the selected file, showing each original image with its associated characters.
3 Edit the characters as desired.
• Double-click a character that you want to edit.
• Click a character that you want to remove and click Delete.
4 Do one of the following after editing the training file:
• Click Save to save changes in the training file.
• Click Append to add all trained characters to another training file.
• Click Cancel to exit without saving the edits to the training file.

Creating User Dictionaries

A user dictionary is used when you perform OCR and check for errors afterward. You can select a user dictionary in the Language section of the Options dialog box.
To customize a user dictionary:
1 Choose Edit User Dictionary... in the Tools menu.
A dialog box lists all user dictionary files. In the example list, one entry is Microsoft Word’s user dictionary, which you can use with OmniPage Pro, and another is OmniPage Pro’s default user dictionary.
2 Do one of the following:
• Select a file and click Edit to edit an existing user dictionary.
• Click New to create a new user dictionary. Enter a name in the dialog box that appears and click OK.
The User Dictionary dialog box appears. Words in the user dictionary appear in its list box.
3 Add or delete words as desired:
• Type a word in the User word edit box and click Add to add it.
• Select a word in the list box and click Delete to delete it. Click Delete All to remove all words from the dictionary.
• Click Import... to add words from a text file.
4 Click Close when you are finished editing the user dictionary.
OmniPage Pro’s user dictionaries are saved in the data folder in your installation folder.
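Importing words from a text file, as in step 3 above, is at heart a merge of new words into an existing word list. The Python sketch below illustrates that idea. It assumes that words in the text file are separated by whitespace, which this manual does not specify, and import_words is a made-up name.

```python
from typing import Iterable, Set


def import_words(dictionary: Set[str], lines: Iterable[str]) -> int:
    """Add every whitespace-separated word in lines to dictionary.

    Returns the number of words that were not already present.
    """
    added = 0
    for line in lines:
        for word in line.split():
            if word not in dictionary:
                dictionary.add(word)
                added += 1
    return added


user_words = {"Caere"}
new_count = import_words(user_words, ["OmniPage Caere", "AutoOCR"])
print(new_count, sorted(user_words))
```

Words already in the dictionary are skipped, so importing the same file twice leaves the dictionary unchanged.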

Saving Settings Files

You can save OmniPage Pro settings to a file. A settings file is useful for quickly loading particular settings that you need for certain documents.
The settings you select in OmniPage Pro can greatly affect OCR results. For help in selecting settings for different kinds of documents, see “Settings Guidelines” on page 54.
To save settings to a file:
1 Choose Options... in the Tools menu.
2 Select the desired settings in the Options dialog box.
3 Click Save Settings... to open the Save Settings dialog box.
4 Select a folder location for the settings file.
5 Type in a file name for the settings file and click OK.
All the current settings in the Options dialog box are saved into a settings file with an .ini extension.
6 Click OK to close the Options dialog box.
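Because settings files use the .ini extension, the general idea of saving settings and loading them back can be sketched with Python's standard configparser module. The section and key names below (Language, PageFormat, main, output) are hypothetical. The actual layout of an OmniPage Pro settings file is not documented here and may differ.

```python
import configparser
import io
from typing import Dict

Settings = Dict[str, Dict[str, str]]


def save_settings(settings: Settings) -> str:
    """Serialize the settings to INI-formatted text."""
    parser = configparser.ConfigParser()
    parser.read_dict(settings)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def load_settings(text: str) -> Settings:
    """Parse INI-formatted text back into a nested dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


# Hypothetical section and key names, for illustration only.
current = {
    "Language": {"main": "English"},
    "PageFormat": {"output": "True Page"},
}
ini_text = save_settings(current)
print(ini_text)
```

Loading the saved text returns the same values, which is the round trip that the save and load procedures rely on.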
To load a settings file:
1 Choose Options... in the Tools menu to open the Options dialog box.
2 Click Load Settings... to open the Load Settings dialog box.
3 Select the folder location of the settings file you want to load.
4 Select the name of the settings file you want to load and click OK.
The settings change according to the selected file.
5 Click OK to close the Options dialog box.

Scheduling OCR

You can schedule OCR to take place on one or more OmniPage Documents, supported image files, and pages in your scanner. This processing can take place while you are away from your computer as long as OmniPage Pro is still running. Scheduled documents are opened at the specified time, unfinished pages are recognized, and the documents are saved in a preselected format and location.
Scheduled documents are deleted from the processing queue if you close OmniPage Pro. Therefore, you should keep OmniPage Pro running until the documents are processed.
Topics in this section include:
• Scheduling Individual Documents
• Scheduling Documents from an Input Folder
• Modifying Output Options for Documents

Scheduling Individual Documents

You can schedule individual documents from different folders. Scheduled documents are recognized at the specified time and then saved in the designated output folder.
To schedule individual documents:
1 Choose Schedule OCR... in the Process menu.
The Schedule OCR dialog box appears. In this dialog box:
• All scheduled documents are displayed in the processing queue.
• Click Add... to add documents to the processing queue.
• Click Remove to remove a selected document from the processing queue.
• Click Options... to modify default output options.
• OmniPage Pro starts processing scheduled documents, in order, at the specified time.
2 Click Add... to open the Add Jobs dialog box.
Click Advanced to select documents from more than one folder.
3 Locate and select the files you want to add to the schedule.
You can select OmniPage Documents and supported image files.
4 Click Open after selecting the desired files.
The Schedule OCR dialog box displays the newly added files.
5 Select the time that you want OmniPage Pro to process the scheduled documents.
Select Finish now if you want OmniPage Pro to process all scheduled documents as soon as you close the dialog box.
6 Click OK in the Schedule OCR dialog box to save your settings as specified.
All scheduled files are processed, in order, at the scheduled time.

Scheduling Documents from an Input Folder

You can set up OmniPage Pro to automatically schedule documents from a specified input folder. Scheduled documents are recognized at the specified time and then saved in the designated output folder.
To schedule documents from an input folder:
1 Choose Schedule OCR... in the Process menu.
The Schedule OCR dialog box appears. All scheduled documents are displayed in its processing queue, and OmniPage Pro starts processing documents in the queue at the specified time.
2 Click the Options... button to open the Schedule OCR Options dialog box.
This dialog box contains an option to schedule documents in your scanner’s ADF and an option to automatically schedule documents in the specified folder. The selected output options are used for all newly scheduled documents.
3 Select Auto add new jobs from folder and select the desired input folder.
If you use the auto-add feature to schedule documents and you do not select Delete original file after OCR, original files will be moved from the input folder to the output folder after processing.
4 Click OK in the Schedule OCR Options dialog box to accept the selected settings.
The Schedule OCR dialog box reappears and adds documents from the input folder to the processing queue.
5 Select the time that you want OmniPage Pro to process scheduled documents.
6 Click OK in the Schedule OCR dialog box to save the settings and close the dialog box.
Processing begins at the specified time. Right before processing begins, OmniPage Pro checks the input folder again and adds any new documents to the processing queue.
After scheduled jobs are processed, the Auto add new jobs from folder option will be deselected.
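The auto-add behavior described above, where the input folder is checked again just before processing and any new documents are queued, can be sketched as a function that is safe to call more than once. This Python sketch is an assumption-laden illustration: add_new_jobs is an invented name, and the extension list is taken from the file formats this manual says OmniPage Pro can open.

```python
from pathlib import Path
from typing import List

# Extensions of the file formats listed under "Supported File Formats".
SUPPORTED = {".bmp", ".dcx", ".jpg", ".met", ".pcx", ".tif"}


def add_new_jobs(queue: List[Path], input_folder: Path) -> List[Path]:
    """Append supported files from input_folder that are not already queued.

    Calling the function a second time only adds files that appeared since
    the first call, which mirrors the re-check right before processing.
    """
    queued = set(queue)
    for path in sorted(input_folder.iterdir()):
        if path.is_file() and path.suffix.lower() in SUPPORTED and path not in queued:
            queue.append(path)
            queued.add(path)
    return queue
```

A caller would run add_new_jobs once when the schedule is saved and once more at the scheduled time, then process the queue in order.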

Modifying Output Options for Documents

All newly scheduled documents have the same default output folder and file format assigned to them. The default output file name uses the original file name and the extension of the output file format. You can modify all of these output options for any scheduled document.
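The default naming rule (original file name plus the extension of the output file format) amounts to a small path computation. The Python sketch below shows one way to express it. The function name and the example folders are hypothetical.

```python
from pathlib import PureWindowsPath


def default_output_path(original: str, output_folder: str, extension: str) -> PureWindowsPath:
    """Combine the output folder, the original base name, and the output extension."""
    source = PureWindowsPath(original)
    # stem drops the original extension; lstrip accepts "rtf" as well as ".rtf".
    return PureWindowsPath(output_folder) / (source.stem + "." + extension.lstrip("."))


# Example folders are made up for illustration.
print(default_output_path(r"C:\Scans\report.tif", r"C:\Output", "rtf"))
```

Two scheduled documents with the same base name from different folders would map to the same output path under this rule, which is one reason to modify the output options for an individual document.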
Click the Options... button in the Schedule OCR dialog box to change the default options used for all newly scheduled documents.
To modify the output options for an individual document:
1 Choose Schedule OCR... in the Process menu.
The Schedule OCR dialog box appears.
2 Select a scheduled file and click Modify… to open the Modify Scheduled Job dialog box.
In this dialog box, you can select output options for this particular document. You can also select an option to have the original document deleted after processing.
3 Select the desired options for the document.
4 Click OK to accept the selected options.
The Schedule OCR dialog box reappears.
5 Click OK to close the Schedule OCR dialog box.
Chapter 6

Technical Information

This chapter provides troubleshooting and other technical information about using OmniPage Pro.
Please also read the Release Notes and Scanner Setup Notes that came in your OmniPage Pro package. These contain the latest information on OmniPage Pro and its supported scanners.
Please continue reading this chapter for information on these topics:
• General Troubleshooting Solutions
• Using Visioneer Scanners with OmniPage Pro
• Supported File Formats
• Scanner Setup Issues
• OCR Problems
• Uninstalling the Software

General Troubleshooting Solutions

Although OmniPage Pro is designed to be easy to use, problems sometimes occur. Many of the onscreen error messages contain self-explanatory descriptions of what to do, such as checking connections or closing other applications to free up memory. Sometimes that is all the troubleshooting help you need.
Please see your Windows documentation for information on optimizing your system and application performance.
Topics in this section include:
• Solutions to Try First
• Testing OmniPage Pro
• Low Memory Problems
• Low Disk Space Problems

Solutions to Try First

Try these possible solutions if you experience problems using OmniPage Pro:
• Make sure that your system meets all requirements listed under “Minimum System Requirements” on page 15.
• Restart your computer and make sure other applications are functioning properly.
• Make sure that your scanner is plugged in and that all cable connections are secure.
• Turn off your computer and your scanner, turn your scanner back on, and then restart your computer.
• Use the software that came with your scanner to verify that the scanner works properly before using it with OmniPage Pro.
• Make sure you have the correct drivers for your scanner, printer, and video card. See the Scanner Setup Notes for more information.
• Run ScanDisk for Windows 95 or Check Disk for Windows NT to check your hard disk for errors. See Windows online help for more information.
• Defragment your hard disk. See Windows online help for more information.
• Uninstall and reinstall OmniPage Pro and the Scan Manager.

Testing OmniPage Pro

Restarting Windows 95 in safe mode or Windows NT in VGA mode allows you to test OmniPage Pro on a simplified system. This is recommended when you cannot resolve crashing problems or if OmniPage Pro has stopped running altogether. See Windows online help for more information.
Your scanner will not run with OmniPage Pro in safe mode or VGA mode, so do not test scanner problems in this configuration.
To test OmniPage Pro in safe mode (Windows 95):
1 Restart your computer in safe mode by pressing F8 immediately after you see the “Starting Windows 95” message.
2 Launch OmniPage Pro and try performing OCR on an image.
Use an existing image file such as the Sample.tif file.
• If OmniPage Pro does not launch or run properly in safe mode, then there may be a problem with the installation. Uninstall and reinstall OmniPage Pro, and then run it in Windows safe mode.
• If OmniPage Pro runs in safe mode, then a device driver on your system may be interfering with OmniPage Pro operation. Troubleshoot the problem by restarting Windows in Step-by-Step Confirmation mode. See Windows online help for more information.
To test OmniPage Pro in VGA mode (Windows NT):
1 Restart your computer.
2 Select Windows NT Workstation Version 4.00 [VGA mode] and press Enter.
3 Press Ctrl+Alt+Delete and select Task Manager.
4 In the Task Manager dialog box, select all background applications and click End Process. See your Windows documentation for more information.
5 Launch OmniPage Pro and try performing OCR on an image.
Use an existing image file such as the Sample.tif file.
Low Memory Problems

OmniPage Pro may run poorly under low memory conditions. Signs of low memory include various error messages, slow operation, and frequent hard drive access. Try these solutions for low memory conditions:
• Restart your computer.
• Close other open applications to free up memory.
• Close unnecessary OmniPage Pro windows.
• Defragment your hard disk to free up contiguous blocks of disk space. See Windows online help for instructions.
• Increase the amount of free hard disk space.
• Increase your computer’s physical memory (RAM). More memory optimizes OCR performance. See “Minimum System Requirements” on page 15 for more information.

Low Disk Space Problems

Problems may occur if your system runs low on free disk space. Try these solutions for low disk space problems:
• Empty the Windows Recycle Bin.
• Delete the .tmp files in the Temp folder. This folder is usually located in your Windows folder.
• Run ScanDisk for Windows 95 or Check Disk for Windows NT to check for errors that may be using up disk space. See Windows online help for instructions.
• Back up unneeded files onto floppy disks or other media and delete them from your hard disk.
• Remove Windows applications that you do not use.
• Defragment your hard disk. See Windows online help for instructions.
• Clean the cache for your web browser and limit its size.
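Before working through the solutions above, it can help to confirm how much free space a drive actually has. The following is a minimal Python sketch, not part of OmniPage Pro. The function names and the 50 MB threshold are assumptions chosen for illustration; this manual does not state a specific free-space figure.

```python
import shutil


def free_megabytes(path="."):
    """Return the free space, in whole megabytes, on the drive holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free // (1024 * 1024)


def is_low_on_space(path=".", threshold_mb=50):
    """Return True if free space is below the threshold.

    The 50 MB default is an arbitrary illustrative value, not a
    figure taken from the OmniPage Pro documentation.
    """
    return free_megabytes(path) < threshold_mb


if __name__ == "__main__":
    print(f"Free space: {free_megabytes()} MB")
    if is_low_on_space():
        print("Disk space is low; try the solutions listed above.")
```

Pass the path of the drive that holds your Windows folder and Temp folder to check the drive that matters most for temporary files.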

Using Visioneer Scanners with OmniPage Pro

During installation, OmniPage Pro automatically integrates with your Visioneer PaperPort software. However, you cannot scan directly into OmniPage Pro if you use a Visioneer scanner or if your scanner is set up to work with PaperPort software (such as the HP ScanJet 5s). Instead, scan pages into PaperPort and then drag the page images onto the OmniPage Pro icon at the bottom of the PaperPort Desktop. The page images will be loaded into OmniPage Pro. See OmniPage Pro’s online help for more information.

Supported File Formats

OmniPage Pro can open these file formats:
Bitmap (*.bmp) OmniPage Document (*.met)
DCX (*.dcx) PCX (*.pcx)
JPEG (*.jpg) TIFF (*.tif)
Caere Documents from version 6.0 and earlier can only be opened if the original images were preserved.
TIFF files can be single- or multiple-page, line art or grayscale, compressed or uncompressed. They can be 200, 300, or 400 dpi, but 300 dpi is recommended. OmniPage Pro stores and displays TIFF files as 300 dpi line art.
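To get a feel for what these resolutions mean in practice, the Python sketch below works out the pixel dimensions of a page at each supported resolution and the size of an uncompressed line art image. It is an illustration only. The 8.5 x 11 inch page size is an assumption, and the function names are hypothetical. The one-bit-per-pixel figure is the usual definition of line art (black and white) images.

```python
SUPPORTED_DPI = (200, 300, 400)  # resolutions named in the text above


def page_pixels(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at `dpi`."""
    if dpi not in SUPPORTED_DPI:
        raise ValueError(f"{dpi} dpi is not one of {SUPPORTED_DPI}")
    return round(width_in * dpi), round(height_in * dpi)


def line_art_bytes(width_px, height_px):
    """Uncompressed size of a 1-bit image, each row padded to a whole byte."""
    row_bytes = (width_px + 7) // 8
    return row_bytes * height_px


if __name__ == "__main__":
    for dpi in SUPPORTED_DPI:
        w, h = page_pixels(8.5, 11, dpi)  # assumed page size: 8.5 x 11 inches
        print(f"{dpi} dpi: {w} x {h} pixels, {line_art_bytes(w, h)} bytes")
```

At the recommended 300 dpi, the assumed page comes to 2550 x 3300 pixels, or roughly one megabyte per uncompressed line art page.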
OmniPage Pro can save original images to these file formats:
Bitmap (*.bmp) TIFF Uncompressed (*.tif)
OmniPage Document (*.met) TIFF Packbits (*.tif)
PCX (*.pcx) TIFF Group 4 Compressed (*.tif)
Saving Image Files
OmniPage Pro saves each page of a multiple-page image separately. If you select Save all pages in the Save Image dialog box, Page# is appended to file names to distinguish separately saved pages. If you select Save each graphic zone to a file, then Zone# is appended to file names to distinguish separately saved graphic zones.
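The naming pattern described above can be pictured with a short Python sketch. This is only an illustration of the idea. The manual does not spell out the exact scheme, so the placement of the label before the file extension, the numbering starting at 1, and the function name are all assumptions.

```python
from pathlib import Path


def numbered_names(file_name, count, label="Page"):
    """Build one file name per page (or zone) by appending label + number.

    Assumption: the label and number go between the base name and the
    extension. OmniPage Pro's actual naming may differ.
    """
    path = Path(file_name)
    return [f"{path.stem}{label}{n}{path.suffix}" for n in range(1, count + 1)]


if __name__ == "__main__":
    print(numbered_names("invoice.tif", 3))
    print(numbered_names("invoice.tif", 2, label="Zone"))
```

Under these assumptions, saving a three-page image as invoice.tif would produce invoicePage1.tif, invoicePage2.tif, and invoicePage3.tif.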
OmniPage Pro can save recognized text to these file formats:
Ami Professional 2.0, 3.0, 3.1
ANSI
ANSI Standard
ANSI Stripped
ASCII
ASCII Standard
ASCII Stripped
dBase III, III+, IV
DisplayWrite (DCA/RFT)
Excel 3.0, 4.0, 5.0, 6.0, 7.0, 97
FrameMaker
HTML
Lotus 123
Microsoft PowerPoint 4.0
Microsoft Publisher
OmniPage Document (.met)
PageMaker (MS Word)
Quattro Pro 4.0
Quattro Pro for Windows
Rich Text Format (.rtf)
Text Only
Ventura Publisher (MS Word)
Windows Write 3.x
Word for DOS 5.0, 5.5
Word for Windows 2.0, 6.0, 7.0, 97
Wordpad
WordPerfect 5.0, 5.1, 6.0, 6.1
WordPerfect for Windows 5.1, 5.2, 6.0, 6.1
WordPro 96, 97
WordStar for Windows 1.x, 2.0
XyWrite III Plus, IV
When saving to HTML, all graphics are saved as separate image files using JPEG format.

Scanner Setup Issues

This section contains information on scanner setup and solutions for scanning problems you may encounter.
For more detailed scanner information, please read the Scanner Setup Notes included in the OmniPage Pro package.
Topics in this section include:
• Scanner Drivers Supplied by the Manufacturer
• Scanner Drivers Supplied by Caere
• Problems Connecting OmniPage Pro to Your Scanner
• Missing Scan Image Command
• Scanner Message on Launch
• System Crash Occurs While Scanning

Scanner Drivers Supplied by the Manufacturer

Many scanners are shipped with one or more scanner drivers. This is software that allows your computer to communicate with your scanner. Some scanners do not require drivers and other scanners require more than one driver. Refer to your scanner documentation for information about installing any required scanner drivers.
Make sure that your scanner and scanner drivers are properly installed and configured before installing OmniPage Pro. Make sure that you have installed the appropriate scanner drivers supplied by the manufacturer.
For HP IIp, IIc, IIcx, 3p, and 3c scanners, use the drivers that came with the scanners, or select a TWAIN or ISIS driver in the Caere Scan Manager.

Scanner Drivers Supplied by Caere

OmniPage Pro is shipped with special scanner drivers that allow it to communicate with supported scanners. These scanner driver files are installed on your computer when you install the Caere Scan Manager. These drivers often work in conjunction with the drivers from your scanner manufacturer. In order to use your scanner with OmniPage Pro, you must select the appropriate scanner in the Caere Scan Manager. See “Setting Up Your Scanner with OmniPage Pro” on page 16 for more information.

Problems Connecting OmniPage Pro to Your Scanner

Try these solutions if you experience a problem between OmniPage Pro and your scanner or if you receive a scanner error message when you launch OmniPage Pro.
• Make sure the scanner is supported by OmniPage Pro with your version of Windows 95 or Windows NT.
A list of tested scanners is provided in the Scanner Setup Notes. If your scanner is not listed, call your scanner manufacturer to find out if it is supported.
• Make sure the Caere Scan Manager is installed and that you have selected the correct scanner in the Scan Manager.
See “Setting Up Your Scanner with OmniPage Pro” on page 16.
• Make sure you have installed the appropriate scanner driver. See the Scanner Setup Notes for more information.
• Make sure your scanner is connected, compatible with your system, and runs with the software provided by the manufacturer before you use it with OmniPage Pro.
• Make sure your scanner is connected securely and turned on before you start Windows.
Scanner drivers must be loaded at startup. Turn on your scanner first and then restart your computer.
• Make sure the scanner is not in use by another application.
• Uninstall and then reinstall the Caere Scan Manager.

Missing Scan Image Command

The Scan Image command does not appear in the Image button’s drop-down list in the following cases:
• You did not install the Caere Scan Manager or select an appropriate scanner. See “Setting Up Your Scanner with OmniPage Pro” on page 16 for instructions.
• Your scanner is not connected to your computer or is not functioning properly. See “Scanner Setup Issues” on page 91.
• You use a Visioneer scanner or your scanner is set up to work with Visioneer’s PaperPort software such as the HP ScanJet 5s. See the Scanner Setup Notes for more information.

Scanner Message on Launch

The first time you launch OmniPage Pro after installing or changing your current scanner in the Caere Scan Manager, you may get this message:
This scanner’s configuration is set using the system-level driver.
If it asks for no more information, click OK in the dialog box. You may also have the option to select the following:
• SCSI ID or scanner configuration information
Consult your scanner documentation for the correct information.
• Page size information
Enter the largest size page that your scanner supports.

System Crash Occurs While Scanning

Try these solutions if a crash occurs during a scan:
• Turn your computer off. Turn your scanner off and on again to return the scanner to its default state. Then restart your computer.
• Check your scanner setup. See “Scanner Setup Issues” on page 91 for more information.
• Check the TWAIN Scanner Settings tab in the Caere Scan Manager if you are using a TWAIN scanner.
• Check with the scanner manufacturer to make sure you have the appropriate driver for your scanner.
• Resolve low memory problems. See “Low Memory Problems” on page 88 for more information.
• Resolve low disk space problems. See “Low Disk Space Problems” on page 88 for more information.
• Check Caere Corporation’s web site (www.caere.com) for Scan Manager updates.

Scanner Not Listed in Supported Scanners List Box

Try these solutions if your scanner is not listed in the Scan Manager Supported Scanners list box:
• Check Caere Corporation’s web site (www.caere.com) for Scan Manager updates.
• Select TWAIN scanner as your current scanner in the Supported Scanners list box.

Scanning Tips

OCR results will be poor if an image is not scanned properly. Remember the following tips when you scan:
• Take the color and quality of your document into account when scanning.
High-quality documents return better recognition results than low-quality documents. Shaded, colored, or low-quality documents may result in poor recognition accuracy unless adjustments are made before scanning. See “What is the quality of the original document?” on page 57 for more information.
• Always try to scan an original document instead of a photocopy.
• Make sure the page is properly aligned in the scanner. Select Automatically straighten page image in the Page Format settings of the Options dialog box to automatically straighten a page image by up to 10 degrees if necessary.
• Check the glass, mirrors, and lenses on your scanner for dust, smudges, or scratches. Clean if necessary.
• Make sure the proper settings are selected in the Scanner section of the Options dialog box before scanning.
See “Scanner Settings” on page 49 for more information.

OCR Problems

This section contains information and solutions for possible OCR problems.
Topics in this section include:
• System Crash During OCR
• Text Does Not Get Recognized Properly
• Problems With Fax Recognition

System Crash During OCR

Try these solutions if a crash occurs during OCR or if processing takes a very long time:
• Resolve low memory problems. See “Low Memory Problems” on page 88 for more information.
• Resolve low disk space problems. See “Low Disk Space Problems” on page 88 for more information.
• Minimize all applications or press Alt+Tab to check for Windows error messages.
• Check the quality of the image you are recognizing. See “What is the quality of the original document?” on page 57 for more information.
See “Scanning Tips” in the previous section for ways to improve the quality of scanned images.
• Break complex page images (lots of text and graphics or elaborate formatting) into smaller jobs. Draw zones manually or modify automatically created zones and perform OCR on one page area at a time. See “Customizing Zones” on page 65 for more information.
• Restart Windows 95 in safe mode or Windows NT in VGA mode and test OmniPage Pro by performing OCR on Sample.tif. See “Testing OmniPage Pro” on page 87.
• If you are performing multiple tasks at once, such as recognizing and printing, OCR may take longer.

Text Does Not Get Recognized Properly

Try these solutions if any part of the original document is not converted to text properly during OCR:
• Look at the original page image and make sure that all text areas are enclosed by text zones. If an area is not enclosed by a zone, it is ignored during OCR. See “Creating Zones for OCR” on page 26 for more information.
• Make sure text zones are identified correctly. Alphanumeric text zones are marked by an A. Graphic zones are marked by a G. Reidentify zones, if necessary, and perform OCR on the document again. See “Changing Zone Properties” on page 71 for more information.
• Make sure the correct main and secondary document languages are selected in the Language settings. Only languages included in the document should be selected. See “Language Settings” on page 50 for more information.
• Select Use Language Analyst in the Accuracy settings. The Language Analyst evaluates words and corrects likely errors during OCR. See “Accuracy Settings” on page 49 for more information.
• Train OmniPage Pro to recognize special characters that might normally be difficult to recognize, such as the copyright symbol © or the registered trademark symbol ®. See “Training OCR for Special Characters” on page 74 for more information.
• If you use True Page as the Output Format setting, recognized text gets put into frames (formatting boxes) in the text viewer. Some text may be hidden from view if a frame is too small. To view the text, place the cursor in the text frame and use the arrow keys on your keyboard to scroll to the top, bottom, left, or right of the frame.
• Check the glass, mirrors, and lenses on your scanner for dust, smudges, or scratches. Clean if necessary.
OmniPage Pro only recognizes printed text characters such as typewritten or laser-printed text. However, it can retain handwritten text, such as a signature, as a graphic. See “Do you want to retain graphics in your document?” on page 60 for guidelines on retaining graphics.

Problems With Fax Recognition

Try these solutions to improve OCR accuracy on fax images:
• Ask senders to select Fine or Best mode when they send you a fax. This produces a resolution of 200x200 dpi.
• Ask senders to transmit files directly to your computer via fax modem if you both have one. You can save fax images as image files and then load them into OmniPage Pro. See “Supported File Formats” on page 89 for more information.
• Ask senders to use clean, original documents if possible. Sans serif fonts (such as the one used for headings in this manual) are easier to recognize than serif fonts (such as the one used for body text in this manual).

Uninstalling the Software

Sometimes uninstalling and then reinstalling OmniPage Pro and the Caere Scan Manager will solve a problem.
OmniPage Pro’s Uninstall program will not remove any files saved to the OmniPage Install directory or subdirectories, in addition to the following files:
• Zone templates (.zon)
• Training files (.trn)
• User dictionaries (.ud)
• Temp files (.tmp)
To uninstall OmniPage Pro:
1 Close OmniPage Pro.
2 Click Start in the Windows taskbar and choose Caere Applications > Uninstall OmniPage Pro.
3 Click Yes to confirm that you want to remove OmniPage Pro.
4 Restart your computer.
To uninstall the Caere Scan Manager:
1 Close OmniPage Pro.
2 Click Start in the Windows taskbar and choose Settings > Control Panel > Add/Remove Programs.
3 Select Caere Scan Manager 3.0 and click Add/Remove.
4 Click OK to confirm that you want to remove the Caere Scan Manager.
5 Restart your computer.
Some icons and program files may remain on your system if they have been renamed, modified, or moved to different locations.
Glossary Terms
3D OCR® A technology developed by Caere that uses grayscale information to increase accuracy when recognizing scanned text characters.
ADF See automatic document feeder.
AnyPage A technology developed and licensed by Caere that improves the combined performance of grayscale scanners and OmniPage Pro. AnyPage uses the quality of grayscale images to improve the recognition of scanned pages. It is especially useful for text printed on shaded backgrounds.
auto zoning The process OmniPage Pro uses to automatically draw and order zones on a page image.
automatic document feeder (ADF) A device that allows you to scan multiple pages without having to place each page in the scanner. Some ADFs are built into scanners; others are add-on products.
automatic processing Using OmniPage Pro’s AUTO button to process an open document or a new document from start to finish according to the selected process commands.
fax Short for facsimile machine. Fax machines scan a page, convert the image into digital data, and send the data over a phone line to another fax or computer. The receiving machine creates the image on paper or stores the data on disk as a fax file.
file format The way an application records and stores information in a file. A document’s file format must be converted in order to open it in another application that does not support the current file format.
font In typography, a complete set of type in one size and style of character. In computer usage, a collection of letters, numbers, punctuation marks, and other typographical symbols with a consistent appearance; the size can be changed readily.
font mapping Matching a font type with a particular font. OmniPage Pro can map selected TrueType fonts to the font types that it detects in a document during recognition.
format The form in which information on a printed page is organized and presented, including page size, column layout, paragraph spacing, fonts, and so on.
frame A formatting box containing text or graphics that is used to design page layout. For example, columns in a document may be contained within a separate frame.
HP AccuPage® A technology developed and licensed by Hewlett-Packard that improves the combined performance of HP scanners and OmniPage Pro. While preserving the quality of grayscale images, HP AccuPage technology retains the format of scanned pages, improves the recognition of text printed on shaded backgrounds, and accurately recognizes text printed at small point sizes.
image An electronic picture of text and/or graphics such as a scanned paper document or an electronic fax file. Images do not have editable text characters; they have many tiny dots (pixels) that together form a picture of text.
image viewer The area on the OmniPage Pro desktop that displays the original page image. Zones are created in the image viewer before OCR takes place.
Language Analyst® A Caere technology that uses information about language context and usage rules to evaluate text and correct likely errors during OCR.
mapping See font mapping.
monospaced font Any font in which all characters have the same width. For example, in Courier New (a monospaced font), the letter M is the same width as the letter l. Thus, MMMMM is the same width as lllll.
OCR See optical character recognition.
OCR Aware A feature that allows you to use OmniPage Pro OCR in another application such as Microsoft Word. You can perform OCR on an image and paste the resulting text directly into an open document.
OmniPage Document A file that is saved in OmniPage Pro’s proprietary format (.met). OmniPage Documents can consist of original page images, zones, and recognized text.
optical character recognition (OCR) The process of turning an image, such as a scanned paper document or an electronic fax file, into computer-editable text so you do not have to retype the text manually.
point A typographic unit of measurement equal to 1/72 inch, measured vertically. Points are used to describe font size.
proportional font Any font in which characters differ in width. For example, in Palatino (a proportional font), the letter M is wider than the letter l. Thus, MMMMM is wider than lllll.
recognition The OCR process. A page is recognized when OmniPage Pro performs OCR on it.
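The glossary definition of a point (1/72 inch) can be combined with a scanning resolution to estimate how tall text will be in a scanned image. The Python sketch below shows the arithmetic only. The function names are hypothetical, and the result is the nominal font size in pixels, not the measured height of any particular character.

```python
def points_to_inches(points):
    """Convert a size in points to inches (1 point = 1/72 inch)."""
    return points / 72


def points_to_pixels(points, dpi):
    """Nominal height in pixels of a font size scanned at `dpi` dots per inch."""
    return points * dpi / 72


if __name__ == "__main__":
    for size in (8, 10, 12):
        print(f"{size} pt at 300 dpi: {points_to_pixels(size, 300):.1f} pixels")
```

For example, 12-point text scanned at 300 dpi has a nominal height of 50 pixels, while the same text scanned at 200 dpi has fewer pixels available to describe each character.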