European Offices: Caere GmbH, Innere Wiener Strasse 5, 81667 Munich, Germany
Please Note
In order to use this program, you should know how to work in the Microsoft Windows
environment. Please refer to Windows documentation if you have questions about how to use menu
commands, dialog boxes, scroll bars, edit boxes, and so on.
Many of the designations used by manufacturers and sellers to distinguish their products are
claimed as trademarks. Such designations appearing in this manual have been printed in initial
caps.
PD 802-0584-030A
Welcome
Welcome to OmniPage Pro, and thank you for buying our software!
The following documentation has been provided to help you learn
about OmniPage Pro.
This User's Manual
This manual introduces you to the basics of using OmniPage Pro. It
includes an introduction to OmniPage Pro, installation and setup
instructions, task-oriented instructions, ways to customize processing,
settings guidelines, and technical information.
Online Help
OmniPage Pro’s online help contains detailed information on features,
settings, and procedures. The online help conforms to Windows online
help conventions and has been designed for quick and easy information
retrieval. Please see “Getting Online Help” on page 12 for more
information.
Release Notes
The Release Notes booklet contains last-minute information about
OmniPage Pro. Please read this before installing the application.
Scanner Setup Notes
The Scanner Setup Notes booklet contains the latest information about
supported scanners and scanner setup.
Using This Manual
This manual is written with the assumption that you know how to work
in the Microsoft Windows environment. Please refer to your Windows
user’s manual or online help if you have questions about how to use
dialog boxes, menus, and so on.
The following conventions are used in this manual.
Convention        Purpose
Italicized text   • Emphasizes menu commands, dialog box options, labeled
                    buttons, and file names. For example: “Choose Open...
                    in the File menu.”
                  • Emphasizes new terms the first time they are used
                  • Emphasizes important words in a sentence
Note symbol       Introduces a tip or an item of note
Warning symbol    Introduces cautionary text
Chapter 1
Introduction to
OmniPage Pro
You probably use your computer for most business correspondence and
other written projects. The problem is that certain sources of information
cannot be immediately used on a computer.
For example, if you want to incorporate information from a magazine
article into a document in your word processor, you somehow have to
get the text from the article into your computer. Painstakingly retyping
the article is not an appealing solution.
OmniPage Pro offers a smart solution to increase your work
productivity. OmniPage Pro’s optical character recognition (OCR)
technology accurately and easily converts scanned paper documents
and image files into editable text for use in your favorite computer
applications. OmniPage Pro does the retyping for you.
Please continue reading this chapter for information on these topics:
• What Is Optical Character Recognition (OCR)?
• The OmniPage Pro Desktop
• Getting Online Help
• Product Support
What Is Optical Character Recognition (OCR)?
Optical character recognition (OCR) is the process of turning an image
into computer-editable text. An image is an electronic picture of text such as
a scanned paper document or an electronic fax file. Images do not have
editable text characters; they have many tiny dots (pixels) that together
form a picture of text.
During OCR, OmniPage Pro analyzes an image and defines characters
to produce editable text. After OCR, you can export the resulting text to
a variety of word-processing, page layout, and spreadsheet
applications.
OmniPage Pro OCR
In addition to text recognition, OmniPage Pro can retain the following
elements of a document during OCR.
Graphics
Photos, logos, and drawings are examples of graphics.
Text formatting
Font types, font sizes, and font styles (such as bold or italic) are examples
of text formatting.
Page formatting
Column structure, paragraph spacing, and placement of graphics are
examples of page formatting.
The graphics, text formatting, and page formatting elements that
OmniPage Pro retains are determined by the settings you select. See
“Settings Guidelines” on page 54 for more information.
OmniPage Pro only recognizes machine-printed characters such as
laser-printed or typewritten text. However, it can retain handwritten
text, such as a signature, as a graphic.
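To make the difference between an image and editable text concrete, here is a small Python sketch. The 5x5 bitmap and all names in it are invented for illustration and have nothing to do with OmniPage Pro's internal formats. It only shows that an image stores dots, while OCR output stores characters that can be edited.

```python
# A hypothetical 5x5 black-and-white image of the letter "T".
# 1 = dark pixel, 0 = light pixel. An image stores only these dots.
LETTER_T_PIXELS = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]


def render(pixels):
    """Draw the bitmap with '#' for dark dots and '.' for light dots."""
    return "\n".join("".join("#" if p else "." for p in row) for row in pixels)


# After OCR, the same information is held as an editable character.
recognized_text = "T"

print(render(LETTER_T_PIXELS))
# Text can be edited like any string; the pixel grid has no such operation.
print(recognized_text.lower())
```

A word processor can change the font, case, or spelling of the recognized character, which is not possible with the grid of dots.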
Basic Steps of OmniPage Pro OCR
These are the basic steps of OmniPage Pro’s OCR process.
1. Bring a document image into OmniPage Pro.
   You can scan a paper document, load an image file, or load a fax
   from Microsoft Exchange. The resulting image appears in OmniPage Pro’s
   image viewer. See “Bringing Document Images into OmniPage
   Pro” on page 23 for more information.
2. Create zones to identify areas you want to recognize as
   text or retain as graphics.
   Zones are borders that enclose the parts of a document image that
   will get processed. You can create zones manually, automatically,
   or with a template. Any areas not enclosed by zones are ignored
   during OCR. See “Creating Zones for OCR” on page 26 for more
   information.
3. Perform OCR to convert text information into editable
   text characters.
   During OCR, OmniPage Pro defines text characters in an image.
   After OCR, you can check and correct errors in the text. See
   “Performing OCR on a Document” on page 27 for more
   information.
4. Export the document to the desired location.
   You can save your document to a specified file format, place it on
   the Clipboard, or send it as a mail attachment. See “Exporting
   Documents” on page 39 for more information.
There are different ways to start the OCR process in OmniPage Pro. See
“Ways to Process Documents” on page 21 for more information.
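The four steps above can be pictured as a small data flow. The following Python sketch is only an illustration: the `Zone` and `Page` classes and the `perform_ocr` function are hypothetical names, and "recognition" is faked by copying a string. It shows the rules the steps describe: text zones become characters, graphic zones are retained as graphics, and anything outside a zone is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    order: int    # position in the reading order
    kind: str     # "text" or "graphic"
    content: str  # stand-in for the pixels inside the zone border


@dataclass
class Page:
    zones: list = field(default_factory=list)
    unzoned: str = ""  # stand-in for everything outside the zones


def perform_ocr(page):
    """Step 3: convert text zones to characters, keep graphic zones as graphics."""
    output = []
    for zone in sorted(page.zones, key=lambda z: z.order):
        if zone.kind == "text":
            output.append(zone.content)  # pretend recognition succeeded
        else:
            output.append("[graphic]")   # retained, not recognized
    # page.unzoned is never read: areas outside zones are ignored.
    return output


# Step 1 (bring in an image) and step 2 (create zones) are simulated here.
page = Page(
    zones=[Zone(2, "graphic", "<logo>"), Zone(1, "text", "Dear customer,")],
    unzoned="coffee stain",
)

# Step 4: export, here simply by joining the results into one string.
exported = "\n".join(perform_ocr(page))
print(exported)
```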
The OmniPage Pro Desktop
OmniPage Pro’s desktop displays the pages of a document in its
thumbnail viewer, image viewer, and text viewer. You can use buttons
in the Standard, AutoOCR, and Zone toolbars to perform various tasks
on the document.
[Figure: the OmniPage Pro desktop, with the following callouts]
• Standard toolbar
• AutoOCR toolbar
• Zone toolbar
• The thumbnail viewer displays a picture of each page in the document.
  The current page has a box around it.
• The image viewer displays the current page’s original image.
• Drag the splitter to the left or right to resize a viewer.
• The text viewer displays the current page’s recognized text and
  retained graphics.
AutoOCR Toolbar
The AutoOCR toolbar contains buttons that can activate each step of the
OCR process.
Set commands in the AutoOCR toolbar buttons for the operations you
want to perform. You can choose commands in a button’s drop-down
list.
• The AUTO button allows you to activate automatic processing or
  use the OCR Wizard.
• The Image button allows you to bring in images by scanning or
  loading pages.
• The Zone button allows you to automatically create zones on
  images based on their original page layouts or predefined
  templates.
• The OCR button allows you to perform OCR or train characters
  for OCR.
• The Export button allows you to save, copy, or send your
  recognized document as a mail attachment.
[Figure: the AutoOCR toolbar, showing the AUTO, Image, Zone, OCR, and
Export buttons. Click the down arrow to display the commands in a
button’s drop-down list.]
Please see “Setting AutoOCR Toolbar Commands” on page 43 for more
information on each toolbar button.
Standard Toolbar
The Standard toolbar contains buttons and drop-down lists for
performing various tasks.
[Figure: the Standard toolbar. Its buttons and lists are labeled New,
Open, Save, Print, Check Recognition, Cut, Copy, Paste, Undo, View,
Image Editor, Options, Rotate Image, Straighten Image, Zoom, and Help.]
Zone Toolbar
The Zone toolbar contains buttons that allow you to draw and define
zones on a page image.
[Figure: the Zone toolbar. Its buttons are labeled Draw Rectangular
Zones, Draw Irregular Zones, Add to Zone, Subtract from Zone, Reorder
Zones, and Zone Properties.]
See “Customizing Zones” on page 65 for more information.
Options Dialog Box
You can select settings for OmniPage Pro in the Options dialog box. To
open it, click the Options button or choose Options... in the Tools menu.
[Figure: the Options dialog box. Click the tabs in the Options dialog box
to view and select different settings.]
See Chapter 4, OmniPage Pro Settings, for more information on settings.
Getting Online Help
After installing OmniPage Pro, you can use its online help system to get
information on features and procedures.
Please refer to your Windows documentation to learn more about using
Windows online help systems.
Help Menu
Use commands in the Help menu to open topics that provide
information on features and procedures.
• Choose OmniPage Pro Help Topics to get contents and index
  listings for OmniPage Pro help topics.
• Choose Getting Started to get introductory topics to OmniPage
  Pro, including tutorial exercises.
• Choose Product Support to find out how to get product support
  services for OmniPage Pro.
• Choose Tip of the Day to get hints for using OmniPage Pro.
Context-Sensitive Help
You can get on-the-spot information about a particular OmniPage Pro
command, toolbar button, or dialog box option in the following ways.
• Click the Help button in the Standard toolbar and then click any
  toolbar button, menu command, or area of the OmniPage Pro
  desktop to display information about that item.
• Click the question-mark button in the upper-right corner of a
  dialog box and then click an item in the dialog box to get an
  explanation of that item.
• Some dialog boxes have a Help button. Click Help to get
  information about that dialog box.
Product Support
For the fastest and easiest way to get help, please look for solutions in
this manual or in the online help. For troubleshooting tips, see “General
Troubleshooting Solutions” on page 86.
If you need additional help, product support and information are
available to registered users through the services listed in this table.
Service                                       How to Contact
World Wide Web home page (common Q&A,         http://www.caere.com
patches, updates, troubleshooting
procedures, and product information)
                                              +1 408 395-1631 (8 bits, no parity, 1 stop bit)
                                              +1 408 354-8471 (US fax numbers only)
Telephone support in North America            +1 408 395-8319
(fee-based troubleshooting)
For international telephone numbers, please refer to the insert in your
OmniPage Pro package.
Caere Product Support
Please have the following information ready for the best service if you
call Caere Product Support:
• OmniPage Pro version and serial number
• The make and model of your computer system, scanner, and
other peripheral devices (printer, monitor, and so on)
• The names and version numbers of any other scanning software
you use
• The amount of memory (RAM) on your system
To get memory information, click Start in the Windows taskbar and
choose Settings > Control Panel. Double-click the System icon in
the Control Panel to open the System Properties dialog box. On
Windows 95, click the Performance tab to see memory
information. On Windows NT, click the General tab to see
memory information.
• The amount of free hard disk space on your system
To get disk space information, open Windows Explorer and
select the drive letter for your hard disk. The status bar will
report how much free hard disk space is available.
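If you prefer to check free disk space from a script rather than from Windows Explorer, the following Python sketch shows one way to do it. It is not part of OmniPage Pro. It uses the standard library function `shutil.disk_usage`, and the helper name `free_megabytes` is made up for this example.

```python
import shutil


def free_megabytes(path):
    """Return the free space on the drive that contains `path`, in whole megabytes."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free // (1024 * 1024)


# "." is used so the sketch runs anywhere; on Windows you might pass "C:\\".
print(f"{free_megabytes('.')} MB free")
```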
Chapter 2
Installation and Setup
This chapter provides installation and setup information for OmniPage
Pro and the Scan Manager.
For technical and troubleshooting information, please read Chapter 6,
Technical Information. For specific scanner information, please read the
Scanner Setup Notes included in your OmniPage Pro package.
This chapter contains the following topics:
• Minimum System Requirements
• Installing OmniPage Pro
• Setting Up Your Scanner with OmniPage Pro
• Starting OmniPage Pro
• Registering OmniPage Pro
Minimum System Requirements
You need the following setup, at minimum, to install and run OmniPage
Pro:
• Computer with a 486 or higher processor
• Microsoft Windows 95 or Windows NT 4.0
• 8MB of memory (RAM) for Windows 95
16MB of memory for Windows NT
• 30MB of free hard disk space to install application files, the Scan
Manager, and one OCR language
40MB to install above files and all OCR languages
10MB of free hard disk space for temporary files during
installation
• SVGA or VGA monitor
• Windows-compatible mouse
• A compatible scanner if you plan to scan documents
  Please see the Scanner Setup Notes for a list of tested scanners.
Performance and speed will be enhanced if your system exceeds these
minimum requirements.
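The disk and memory figures above can be summarized as a small calculation. This Python sketch assumes that the 10MB of temporary space is needed in addition to the 30MB or 40MB installation footprint, which is a reading of the list rather than something the list states outright. The function names are invented for illustration.

```python
APP_WITH_ONE_LANGUAGE_MB = 30    # application files, Scan Manager, one OCR language
APP_WITH_ALL_LANGUAGES_MB = 40   # the same files plus all OCR languages
INSTALL_TEMP_MB = 10             # temporary files during installation


def disk_needed_mb(all_languages=False):
    """Free disk space to have available before installing (assumes the figures add up)."""
    base = APP_WITH_ALL_LANGUAGES_MB if all_languages else APP_WITH_ONE_LANGUAGE_MB
    return base + INSTALL_TEMP_MB


def min_ram_mb(operating_system):
    """Minimum memory for each supported operating system."""
    return {"Windows 95": 8, "Windows NT": 16}[operating_system]


print(disk_needed_mb())                    # one OCR language
print(disk_needed_mb(all_languages=True))  # all OCR languages
print(min_ram_mb("Windows NT"))
```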
Installing OmniPage Pro
OmniPage Pro’s Setup program takes you through installation with
onscreen instructions at every step. For best results, do not run any other
programs — especially anti-virus programs — during installation.
Be sure your scanner is connected, turned on, compatible with your
system, and runs with the software provided by the scanner
manufacturer before you install OmniPage Pro.
To install OmniPage Pro:
1. Insert OmniPage Pro’s CD-ROM in the CD-ROM drive.
   The Setup program should start automatically. If it does not
   start, locate your CD-ROM drive in Windows Explorer and
   double-click the Setup.exe program at the top level of the
   CD-ROM.
2. Click Next to continue with installation.
3. Follow the onscreen instructions to finish installation.
   During installation, you are prompted to enter a serial number.
   You can find the serial number on the label of the CD-ROM.
Setting Up Your Scanner with OmniPage Pro
To use your scanner with OmniPage Pro, you must install the Scan
Manager and select your scanner. You are prompted to do this during
OmniPage Pro’s regular installation. However, you can also install the
Scan Manager at a separate time.
The Scanner Setup Notes contain the most detailed information about
scanner support and setup. You can also find more information in
“Scanner Setup Issues” on page 91.
Use the following procedure to install the Scan Manager if you did not
install it during OmniPage Pro installation.
To install the Scan Manager:
1. Make sure your scanner is turned on when you start your
   computer.
2. Close OmniPage Pro if it is open.
3. Insert OmniPage Pro’s CD-ROM in the CD-ROM drive.
4. Cancel the regular Setup program if it starts automatically.
5. Double-click the setup.exe program located in the
   Scanmgr\Disk 1 folder.
6. Select your scanner when you are prompted.
   The Scan Manager finishes installing after you make your
   scanner selection.
Once your scanner is set up with OmniPage Pro, you can select
scanner settings in OmniPage Pro’s Options dialog box. See
“Scanner Settings” on page 49 for more information.
To change your scanner selection in the Scan Manager:
1. Make sure your scanner is turned on when you start your
   computer.
2. Close OmniPage Pro if it is open.
3. Click Start in the Windows taskbar and choose
   Settings > Control Panel.
4. Double-click the Caere Scan Manager 3.0 icon to open the Scan
   Manager.
5. Click the Select Scanner tab.
6. Select the name of the scanner you want to use in the
   Supported Scanners list box.
7. Click Set as Current Scanner and then click Apply.
   You are prompted to select the directory containing the files
   that need to be installed.
8. Insert OmniPage Pro’s CD-ROM in the CD-ROM drive.
   Cancel the regular Setup program if it starts automatically.
9. Select Scanmgr\Disk 1 as the installation directory and click OK.
10. Click Close in the Scan Manager after processing is complete.
Starting OmniPage Pro
If you plan to scan, make sure your scanner is attached to your computer
and turned on before you start OmniPage Pro.
To start OmniPage Pro, click Start in the Windows taskbar and choose
Programs > Caere Applications > OmniPage Pro 8.0. (Use the program
group you selected during installation if it is different from Caere
Applications.)
Or, double-click the OmniPage Pro icon located in the folder where you
installed OmniPage Pro.
See “The OmniPage Pro Desktop” on page 8 for an introduction to
OmniPage Pro’s user interface.
Registering OmniPage Pro
Registering your copy of OmniPage Pro entitles you to product support,
notification of special offers, and the lowest price offered on the next
OmniPage Pro upgrade.
You can use OmniPage Pro for 25 sessions without registering it. The
Register dialog box appears the 26th time you launch OmniPage Pro,
and the program exits if you do not register at that time.
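The session rule above can be stated as a short decision function. This Python sketch is only a restatement of the paragraph. The function name and return values are invented, and it assumes that launches after the 26th behave like the 26th when the program is still unregistered, which the paragraph does not spell out.

```python
FREE_SESSIONS = 25  # sessions allowed before registration is required


def on_launch(launch_number, registered, registers_now=False):
    """Return "run" or "exit" for the given launch (counted from 1)."""
    if registered or launch_number <= FREE_SESSIONS:
        return "run"
    # 26th launch or later without registration: the Register dialog box appears.
    return "run" if registers_now else "exit"


print(on_launch(25, registered=False))
print(on_launch(26, registered=False))
print(on_launch(26, registered=False, registers_now=True))
```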
If you purchased your product directly from Caere or if you are already
a registered user, you should not be prompted to register again.
To register OmniPage Pro by telephone:
1. Click the Register menu to open the Register dialog box.
   This dialog box appears automatically the first time you start
   OmniPage Pro.
   [Figure: the Register dialog box. Callouts: You will be asked to
   provide your serial and key numbers. When you get your registration
   number, enter it here. One button closes the Register dialog box
   without registering. One button prints out your registration
   information.]
2. Click the Call drop-down list and locate the phone number for
   your country.
3. Call the phone number and ask for a registration number.
   You will be asked to provide your serial and key numbers that
   are listed in the Register dialog box.
4. Enter the registration number in the Registration Number text
   box and click OK.
   The Register menu disappears from the menu bar after you
   register.
register.
To register OmniPage Pro at Caere’s Web site:
1. Click the Register menu to open the Register dialog box.
   [Figure: the Register dialog box. Callouts: You will need to enter
   your serial and key numbers. When you get your registration number,
   enter it here. One button opens a help topic that provides
   instructions and a link to Caere’s Web site.]
2. Open your Web browser and go to the following address:
   http://www.caere.com/registration
3. Enter the requested information in the fields provided.
   You will need to enter your serial and key numbers
   that are listed in the Register dialog box.
4. Click Submit Information when you are finished entering
   information.
   You will be given a registration number.
5. Enter the registration number in the Registration Number text
   box and click OK.
   The Register menu disappears from the menu bar after you
   register.
Chapter 3
Processing Documents
This chapter describes how to work with documents in OmniPage Pro,
including each step of the OCR process.
There are different ways to accomplish the same tasks in OmniPage Pro.
You can use toolbar buttons or menu commands to start procedures.
OmniPage Pro can perform all OCR steps automatically, or you can start
each step individually. You can even do different tasks at the same time.
Please continue reading this chapter for information on these topics:
• Ways to Process Documents
• Bringing Document Images into OmniPage Pro
• Creating Zones for OCR
• Performing OCR on a Document
• Checking OCR Results
• Using OCR in Other Applications
• Working with Documents
• Exporting Documents
For complete information on all OmniPage Pro commands, settings, and
procedures, please use OmniPage Pro’s online help. See “Getting Online
Help” on page 12 for more information.
Ways to Process Documents
Optical character recognition (OCR) is the process of turning an image
into computer-editable text so you do not have to retype the text
manually. Chapter 1 explains the basic steps of OmniPage Pro’s OCR
process. The following is a summary of those steps.
1. Bring a document image into OmniPage Pro.
   See page 23 for more information.
2. Create zones to identify areas you want to recognize as text or
   retain as graphics.
   See page 26 for more information.
3. Perform OCR to convert text information into editable text
   characters.
   See page 27 for more information.
4. Export the document to the desired location.
   See page 39 for more information.
Using the OCR Wizard
The OCR Wizard guides you through the entire OCR process by asking
you questions about your document and selecting the appropriate
settings for you.
To process your document using the OCR Wizard:
1. Set OCR Wizard as the command in the AUTO button’s drop-down
   list.
2. Click AUTO or choose OCR Wizard in the Process menu.
   The first wizard screen appears.
3. Answer the question in the first screen and click Next.
4. Continue answering questions in the screens that follow.
Automatic Processing
Use the AUTO button to process a new document from start to finish or
finish processing an open document.
To process your document automatically:
1. Set AutoOCR as the command in the AUTO button’s drop-down
   list.
2. Set the desired Image, Zone, OCR, and Export commands.
   See “Setting AutoOCR Toolbar Commands” on page 43 for
   more information.
3. Choose Options... in the Tools menu and check that settings are
   appropriate for your document.
   See “Settings Guidelines” on page 54 for more information.
4. Place your document in your scanner if you are scanning.
5. Click AUTO or choose AutoOCR in the Process menu.
   Each page of the document is processed and finished in order
   according to the selected commands. If page images in an open
   document already have zones, OmniPage Pro will skip zoning
   for those pages and continue with the selected OCR and export
   operations.
Performing Multiple Tasks at Once
OmniPage Pro takes advantage of your computer’s ability to handle
more than one process at a time. You can simultaneously scan, create
zones, recognize, and edit documents. You do not have to wait for any
process to complete before moving on to the next task.
For example, if you scan a multiple-page document, you can draw zones
on an image as soon as the first page is scanned and you can edit
recognized text as soon as it appears in the text viewer. These tasks can
be done at the same time other pages are being scanned and recognized.
Starting the OCR Process Outside OmniPage Pro
You can start the OCR process outside OmniPage Pro in a variety of
ways. For example, you can use the OCR Aware feature to initiate OCR
from another application and paste recognized text into an open
document. See “Using OCR in Other Applications” on page 33 for more
information.
Bringing Document Images into OmniPage Pro
You can bring document images into OmniPage Pro by:
• Scanning Pages
• Loading Image Files
• Loading Exchange Faxes
Scanning Pages
You can scan paper documents to convert them to electronic images in
OmniPage Pro. If a document is already open, scanned pages are
inserted as new pages.
To scan in OmniPage Pro, you must install the Scan Manager and select
your default scanner. See “Setting Up Your Scanner with OmniPage
Pro” on page 16 for more information.
If you use a Visioneer scanner or if your scanner is set up to work with
Visioneer’s PaperPort software, see “Using Visioneer Scanners with
OmniPage Pro” on page 89.
To scan pages into OmniPage Pro:
1. Place your page in your scanner.
   You can scan a stack of pages if you have an automatic
   document feeder (ADF).
2. Set Scan Image as the command in the Image button’s drop-down
   list.
3. Choose Options... in the Tools menu and click the Scanner tab to
   make sure the appropriate settings are selected.
   Select Scan Until Empty if you want to scan all pages in an ADF
   at once. Otherwise, you must click the Image button to scan
   each subsequent page.
4. Click the Image button or choose Scan Image in the Process
   menu.
   Pages are scanned in order and combined into one working
   document.
Loading Image Files
An image file is an electronic picture of text, such as a scanned paper
document or an electronic fax, that is saved in an image file format such
as PCX or TIFF. You can load image files into OmniPage Pro. If a
document is already open, loaded image files are inserted as new pages.
The following procedure is for loading image files only. To open an
OmniPage Document (.met), use the Open... command in the File
menu.
To load image files into OmniPage Pro:
1. Set Load Image as the command in the Image button’s drop-down
   list.
2. Click the Image button or choose Load Image in the Process
   menu.
   The Load Image dialog box appears.
3. Select the folder location and file type of the file you want to
   load.
   See “Supported File Formats” on page 89 for a complete list of
   supported file formats.
4. Select the files you want to load.
   You can Shift-click or Ctrl-click to select multiple files in the
   same folder.
5. Click Advanced if you want to select files from more than one
   folder.
   • Select a file and click Add to put it in the Selected Files
     list.
   • Click Add All to add all files from the current folder.
6. Click Open when you have selected all the files you want to
   load.
   Image files are loaded in the order selected and combined into
   one working document.
Loading Exchange Faxes
You can load fax images into OmniPage Pro from Microsoft Exchange or
Outlook if you have the Microsoft Fax component installed with those
applications. Please see Microsoft documentation for information on
configuring these applications.
If a document is already open, loaded faxes are inserted as new pages.
For best results, ask senders to use Fine or Best mode when they send
you faxes.
To load Exchange faxes into OmniPage Pro:
1. Set Load Exchange Fax as the command in the Image button’s
   drop-down list.
   This command only appears in the drop-down list if you have
   the Microsoft Fax component installed with Microsoft
   Exchange or Outlook.
2. Click the Image button or choose Load Exchange Fax in the
   Process menu.
   The Exchange dialog box appears.
3. Select the folder that contains the faxes you want to load.
4. Select the faxes you want to load.
   You can Shift-click or Ctrl-click to select multiple faxes.
5. Click Open when you have selected all the faxes you want to
   load.
   Exchange faxes are loaded in the order selected and combined
   into one working document.
Creating Zones for OCR
Page images are displayed in OmniPage Pro’s image viewer where zones
are created before OCR. Zones are borders that identify areas of an
image that will be recognized as text or retained as graphics. Any part of
an image not enclosed by a zone is ignored during OCR.
[Figure: a page image with zones. Callouts: This is a text zone. It will
be converted to text during OCR. This is a graphic zone. It will be kept
as a graphic image during OCR. All unzoned areas of the page will be
ignored during OCR.]
For information on drawing zones manually, modifying zones, deleting
unwanted zones, and using zone templates, please see “Customizing
Zones” on page 65.
Creating Zones Automatically
OmniPage Pro can analyze a page and create zones automatically for
you. It uses the selected setting in the Zone button to determine the text
flow on a page and breaks it into ordered zones.
To create zones automatically:
1. Choose a setting in the Zone button’s drop-down list that most
   closely matches the format of your document.
   You can choose Single-Column Pages, Multiple-Column Pages,
   Tables, Mixed Pages, or a template of your own. See “Zone
   Button Commands” on page 45 for more information on these
   settings.
   You can also choose HP AccuPage (an advanced Hewlett Packard
   scanning and zoning technology) as the zone setting if your
   scanner supports it and HP AccuPage is selected in the Scan
   Manager.
2. Click the Zone button or choose Auto Zones in the Process
   menu.
   OmniPage Pro automatically draws zones on the current page
   in the image viewer. Each zone has a number indicating its
   order and a letter indicating its zone properties.
   [Figure: a page with three zones. Zone #1: alphanumeric text.
   Zone #2: alphanumeric text. Zone #3: graphic.]
   Make sure zones are identified correctly before performing
   OCR. For example, if you want to retain an area as a graphic,
   that area should be identified as a Graphic zone type. See
   “Changing Zone Properties” on page 71 for more information.
Performing OCR on a Document
Performing OCR converts an image to editable text. This is also referred
to as recognizing text.
OmniPage Pro only recognizes printed characters such as laser-printed
or typewritten text. However, it can retain handwritten text, such as a
signature, as a graphic.
To perform OCR:
1. Choose Options... in the Tools menu and click the Page Format
   tab.
2. Select an Output Format setting for your document.
   OmniPage Pro uses this setting to determine the output
   formatting of a document during OCR.
3. Set OCR and Check as the command in the OCR button’s drop-down
   list.
   Or, set Perform OCR as the command if you do not want error
   checking to begin automatically after OCR.
4. Click the OCR button.
   The page is recognized according to the current zones and
   settings. If there are no zones on the page, zones are created
   according to the current command in the Zone button.
To schedule a group of documents for OCR at a particular time, see
“Scheduling OCR” on page 79.
Checking OCR Results
After performing OCR, recognized text appears in the text viewer where
you can check for errors. Error checking starts automatically if you chose
OCR and Check as the OCR process command.
OmniPage Pro marks suspected errors in green and inserts a red “reject”
character for any character it cannot recognize. To turn off these color
markers, choose Show Markers in the View menu.
To check and correct errors:
1. Click the Check Recognition button or choose Check
   Recognition... in the Tools menu.
   The Check Recognition dialog box displays the first suspected
   error and a picture of how it originally looked in the image.
   Click in the picture window to enlarge or reduce the picture.
2. Select one of these options for the word:
   • Click Ignore to allow the word to remain as is.
   • Click Ignore All to ignore all instances of the word in the
     current document.
   • Click Change to replace the word with the word in the Change to
     edit box.
   • Click Change All to replace all instances of the word with the
     word in the Change to edit box.
   • Click Add to add the word to the current user dictionary.
   After you choose an option for the word, OmniPage Pro
   automatically continues to find the next possible error.
3. Click Done to stop checking recognition.
   Color markers are removed from words that have been
   checked.
Verifying Text
After performing OCR, you can compare recognized text against the
original image to verify that the text was recognized correctly.
To verify text against its original image:
1. Double-click any word in the text viewer or select a word and
   choose Verify Text in the Tools menu.
   The Verify Text window opens and shows a picture of the
   original word and its surrounding area.
2. Click inside the window to enlarge or reduce the picture.
3. Continue double-clicking words that you want to verify.
   The window display changes as you select new words.
4. Click the standard Close button to close the window.
Checking OCR Results in Microsoft Word
You can check for OCR errors directly in Microsoft Word 7 or Microsoft
Word 97 if you have those versions installed on your computer.
To enable this feature, you must select settings in the Microsoft Word
section of OmniPage Pro’s Options dialog box. See “Microsoft Word
Settings” on page 53 for more information.
Make sure the *.doc file extension is associated with the version of
Word you plan to use. Please refer to your Windows documentation for
more information on associating file extensions with applications.
To check and correct errors in Microsoft Word:
1. Perform OCR on your document and then save it as the
appropriate file type:
• Save as Word for Windows 7.0 if you are using that version.
• Save as Word 97 if you are using that version.
2. Open the document in Microsoft Word.
The document must be opened on a system that has OmniPage Pro
installed.
An OmniPage menu appears in Microsoft Word’s menu bar
along with a corresponding toolbar. The toolbar buttons are
Check Recognition, Verify Text, Close Image Viewer, and
Remove Check Recognition Support.
3. Choose Check Recognition... in the OmniPage menu.
When the first suspected error is located, the Verify Text
window appears displaying the original image of the text.
Use the buttons in this window to zoom in or out on the image.
The Check Recognition dialog box also appears.
4. Select one of these options for the word:
• Click Ignore to allow the word to remain as is.
• Click Ignore All to ignore all instances of the word.
• Click Change to replace the word with the word in the Change to
edit box.
• Click Change All to replace all instances of the word with the
word in the Change to edit box.
• Click Add to add the word to the current user dictionary.
After you choose an option for the word, the next possible error
is located.
5. Click Done to stop checking recognition.
Color markers are removed from words that have been
checked.
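The effect of the five options can be pictured with a small model. The following Python sketch is only an illustration and does not describe how OmniPage Pro is implemented: the CheckSession class, its fields, and its method names are hypothetical, and word matching is assumed to be case-sensitive.

```python
from dataclasses import dataclass, field


@dataclass
class CheckSession:
    """Hypothetical model of one check-recognition pass (not OmniPage Pro code)."""
    words: list      # recognized words, in document order
    suspects: set    # indexes of words marked as possible errors
    ignored: set = field(default_factory=set)          # words skipped via Ignore All
    user_dictionary: set = field(default_factory=set)  # words accepted via Add

    def next_suspect(self):
        """Return the index of the next word needing a decision, or None."""
        for i in sorted(self.suspects):
            word = self.words[i]
            if word in self.ignored or word in self.user_dictionary:
                self.suspects.discard(i)  # already accepted elsewhere
                continue
            return i
        return None

    def ignore(self, i):
        """Ignore: leave this one word as is."""
        self.suspects.discard(i)

    def ignore_all(self, i):
        """Ignore All: leave every instance of this word as is."""
        self.ignored.add(self.words[i])
        self.suspects.discard(i)

    def change(self, i, change_to):
        """Change: replace this one word with the 'Change to' text."""
        self.words[i] = change_to
        self.suspects.discard(i)

    def change_all(self, i, change_to):
        """Change All: replace every instance of this word."""
        old = self.words[i]
        for j, word in enumerate(self.words):
            if word == old:
                self.words[j] = change_to
                self.suspects.discard(j)

    def add(self, i):
        """Add: accept the word and remember it in the user dictionary."""
        self.user_dictionary.add(self.words[i])
        self.suspects.discard(i)


session = CheckSession(
    words=["Tbe", "cat", "sat", "on", "tbe", "mat", "Tbe"],
    suspects={0, 4, 6},
)
session.change_all(session.next_suspect(), "The")  # fixes both "Tbe" words
session.change(session.next_suspect(), "the")      # fixes the remaining "tbe"
print(" ".join(session.words))  # prints "The cat sat on the mat The"
```

In this model, Change All clears every matching suspect in one step, which is why only two decisions are needed for three marked words.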
To verify recognized text against its original image in Microsoft Word,
you must process the document in OmniPage Pro and save it to the
appropriate Word format. You cannot verify text against original
images using the OCR Aware feature.
To verify text against its original image in Microsoft Word:
1. Follow steps 1 and 2 in the preceding instructions if your
document is not already open in Microsoft Word.
2. Select a suspect word.
Suspect words are marked in the color that was selected in the
Microsoft Word section of OmniPage Pro’s Options dialog box.
You can only verify words that are marked as suspected errors.
However, once the Verify Text window is open, you can use its
scroll bars and zoom buttons to see any part of the original image.
3. Choose Verify Text... in the OmniPage menu.
The Verify Text window opens and shows a picture of the
original word and its surrounding area. Use the zoom buttons
to zoom in or out on the image.
4. Repeat steps 2 and 3 to continue checking other suspect words.
The window display changes as you select new words.
5. Choose Close Image Viewer in the OmniPage menu to close the
window when you are done.
Removing OmniPage Pro Data from the Word Document
After checking for OCR errors, you should remove OmniPage Pro data
from your document to reduce its file size. You are automatically
prompted to remove OmniPage data after all suspect words have been
checked. You can also choose Remove Check Recognition Support in the
OmniPage menu. The OmniPage menu, toolbar, color markers, and
image data will all be removed from the document.
Using OCR in Other Applications
You can use OmniPage Pro's OCR Aware feature to use OCR in other
applications. For example, you can scan, recognize, and paste text
directly into a word-processing document without ever leaving the
application.
You can use OCR Aware with 32-bit (and some 16-bit) applications that
have been registered with OmniPage Pro. An application must be
installed on your computer in order to use it with OCR Aware. See page
51 for more information on registering applications with OCR Aware.
For information on other ways to start OCR outside OmniPage Pro,
please see the “Starting OCR Outside OmniPage Pro” online help topic.
To use OCR Aware in an application:
1. Align your document in your scanner if you plan to scan.
2. Open the application in which you want to insert recognized
text.
The application must be registered to work with OCR Aware.
You do not need to open OmniPage Pro itself.
3. Place the cursor at the location in your document where you
want to insert recognized text.
If no document is open, recognized text will be pasted to the
Clipboard.
4. Choose Acquire Text Settings... in the application's File menu if
you want to check the current settings.
5. Choose Acquire Text... in the application's File menu when you
are ready to start the OCR process.
OCR processing occurs according to the selected settings.
Recognized text appears at the cursor location in your
application. If no document is open, text is pasted to the
Clipboard.
Text formatting, such as bold and italics, is retained if the
application supports RTF information. Otherwise, only plain text
will be pasted. Graphics are retained if the application supports
bitmap images.
Working with Documents
OmniPage Pro’s thumbnail, image, and text viewers allow you to look
at and work with pages in the current document.
[Figure: the OmniPage Pro desktop, showing the thumbnail viewer, image viewer, and text viewer.]
This section describes the following procedures:
• Saving a Document as You Work
• Resizing a Page View
• Changing Pages
• Reordering Pages
• Deleting Pages
• Printing a Document
• Closing a Document
• Closing OmniPage Pro
Drag the splitter between viewers to the left or right to resize a view.
Saving a Document as You Work
Click the Save button in the Standard toolbar or choose Save in the File
menu to save changes to the current document as you work. The first
time a document is saved, the Save As dialog box appears. See “Saving
a Document” on page 39 for more information.
If a document has been saved as an OmniPage Document (*.met), all
the changes you make in the open document are saved. If a document
has been saved as a text-based file type, only the text changes are saved
out to that file.
For example, suppose you save the current document as a text file called
Memo.txt but continue to work with the recognized text in OmniPage
Pro. Whenever you click the Save button, changes in the recognized text
will overwrite the Memo.txt file.
Resizing a Page View
You can resize a page displayed in the image viewer or text viewer to
enlarge or reduce the view.
To resize a page view:
1. Click in the viewer you want to enlarge or reduce to make it
active.
2. Choose a size option in the Zoom drop-down list in the
Standard toolbar.
Or, choose Zoom in the View menu and select a size option in
the drop-down list.
The page resizes as specified.
You can also click your right mouse button in the viewer you want to
resize and select a size option in the shortcut menu.
Changing Pages
The thumbnail viewer, image viewer, and text viewer all display the
same page in a document.
You can change pages in a document in the following ways:
• Click the thumbnail of the page you want to display.
The thumbnail of the currently displayed page has a box around it.
• Click the Next Page or Previous Page buttons at the lower-right
corner of the OmniPage Pro desktop.
• Choose Next Page, Previous Page, or Go to Page... in the Edit menu.
Reordering Pages
You can reorder pages in a document by dragging their thumbnails to
different positions in the thumbnail viewer.
Click the thumbnail of the page you want to move and drag it above the desired page number.
Hold down the Ctrl key while you click thumbnails if you want to select
multiple thumbnails to move as a group.
Deleting Pages
If you delete a page from a document in OmniPage Pro, the thumbnail,
original image, and recognized text for that page are all deleted.
To permanently delete pages:
• Choose Delete Current Page in the Edit menu to delete the
currently displayed page.
• Select one or more thumbnails of pages you want to delete and
press the Delete key.
Undoing Changes
You can click the Undo button or choose Undo in the Edit menu to cancel
the very last change you made in the text viewer. You can also choose
Undo to cancel zone deletions in the image viewer. However, page
deletions cannot be undone.
Printing a Document
You can print the current document's original page images or
recognized text.
To print a document:
1. Choose Print... in the File menu and choose one of the
following in the submenu:
• Choose Image... to print original page images.
• Choose Text... to print recognized text.
2. Select the desired print settings in the Print dialog box.
3. Click OK to start the print job.
As a shortcut, you can click either the text or image viewer to make it
active and then click the Print button to print from that viewer.
Closing a Document
Choose Close in the File menu to close a document.
You are prompted to save your document if you have not saved it or
have modified it since the last save. Save a document as an OmniPage
Document (*.met) if you want to reopen it in OmniPage Pro again.
Closing OmniPage Pro
Choose Exit in the File menu to close OmniPage Pro. You are prompted
to save the current document if you have not saved it or have modified
it since the last save.
Exporting Documents
You can export a document to other applications by:
• Saving a Document
• Copying a Document to the Clipboard
• Sending a Document as a Mail Attachment
After you export a document, a copy of the document remains open in
OmniPage Pro. Save the document as an OmniPage Document (*.met)
if you want to reopen it in OmniPage Pro again. OmniPage Documents
retain all original images, zones, and recognized text.
Saving a Document
You can save recognized text and original images to disk in a variety of
file types.
To save recognized text:
1. Choose Save As... in the File menu.
You can also click the Export button with Save As selected in
the drop-down list.
The Save As dialog box appears.
2. Select a folder location and file type for your document.
See “Supported File Formats” on page 89 for a complete list of
supported file types.
3. Type in a file name and select save options.
4. Click OK.
The document is saved to disk as specified. Graphics and
formatting are saved in the document only if the selected file
type supports them.
To save original images:
1. Choose Save Image... in the File menu.
The Save Image dialog box appears.
2. Select a folder location and file type for your document.
See “Supported File Formats” on page 89 for a complete list of
supported file types.
3. Type in a file name and select Save and Image options.
4. Click OK.
The image is saved to disk as specified (zones and recognized
text are not saved with the file).
Copying a Document to the Clipboard
You can copy every page of a recognized document to the Clipboard
and then paste the text directly into another application.
To copy a document to the Clipboard:
1. Set Copy to Clipboard as the command in the Export button’s
drop-down list.
2. Click the Export button or choose Copy to Clipboard in the
Process menu.
The document is copied to the Clipboard.
Text formatting, such as bold and italics, is retained when you paste
into an application that supports RTF information. Otherwise, only
plain text will be pasted. Graphics are retained if the application
supports bitmap images.
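The paste behavior described above amounts to a simple fallback rule. The Python sketch below illustrates that rule only; the function name and the format labels ("rtf", "text", "bitmap") are assumptions made for the example and are not part of OmniPage Pro or of the Windows Clipboard interface.

```python
def choose_pasted_content(target_formats, rtf_text, plain_text, bitmaps):
    """Model what arrives in a target application after a paste.

    target_formats is a set of illustrative labels naming what the
    target application accepts. The labels are hypothetical.
    """
    pasted = {}
    # Formatted text is kept only when the target understands RTF.
    if "rtf" in target_formats:
        pasted["text"] = rtf_text
    else:
        pasted["text"] = plain_text
    # Graphics are kept only when the target accepts bitmap images.
    pasted["graphics"] = list(bitmaps) if "bitmap" in target_formats else []
    return pasted


rtf = r"{\rtf1 {\b Invoice} total}"
plain = "Invoice total"
logo = ["logo.bmp"]

word_processor = choose_pasted_content({"rtf", "text", "bitmap"}, rtf, plain, logo)
text_editor = choose_pasted_content({"text"}, rtf, plain, logo)
print(word_processor["text"])  # formatted text survives
print(text_editor["text"])     # plain text only, and no graphics
```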
Sending a Document as a Mail Attachment
You can send a recognized document as a file attached to a mail message
if you have a MAPI-compliant mail application, such as Microsoft
Exchange or Outlook, installed.
To send a document as a mail attachment:
1. Choose Send Mail... in the File menu.
You can also click the Export button with Send Mail selected in
the drop-down list.
The Send Mail dialog box appears.
2. Specify a file type and attachment options for your document.
3. Click OK.
4. Log into your mail application if you are prompted to do so.
A new message appears ready for addressing.
5. Address your mail message as desired and click the Send
button.
The document is sent as an attachment to the mail message.
Chapter 4
OmniPage Pro Settings
This chapter describes the settings in the AutoOCR toolbar and Options
dialog box. Please look in OmniPage Pro’s online help for more detailed
information on settings.
The settings you select for processing documents can greatly affect OCR
results. You may have to experiment with different settings to get the
results you want. Settings guidelines are provided at the end of the
chapter to get you started.
Please continue reading this chapter for information on these topics:
• Setting AutoOCR Toolbar Commands
• Selecting OmniPage Pro Settings
• Accuracy Settings
• Scanner Settings
• Page Format Settings
• Language Settings
• OCR Aware Settings
• Process Settings
• Microsoft Word Settings
• Settings Guidelines
Setting AutoOCR Toolbar Commands
The AutoOCR toolbar buttons allow you to take a document through
each step of the OCR process. Every toolbar button has different process
commands that can be set for the operations you want to perform.
OmniPage Pro can go through all steps automatically, or you can start
each step individually.
[Figure: the AutoOCR toolbar, showing the AUTO, Image, Zone, OCR, and Export buttons.]
You can set AutoOCR Toolbar commands in two locations:
• Click the down arrow next to each AutoOCR toolbar button and
select a process command in the drop-down list.
• Choose Process Settings... in the Process menu or click the Options
button and select process commands in the Options dialog box.
The pictures in the AutoOCR toolbar buttons change as you set different
process commands. The commands can be activated by clicking the
AutoOCR toolbar buttons or choosing commands in the Process menu.
AUTO Button Commands
Use the AUTO button to process a document from start to finish. The
AUTO button’s drop-down list contains the AutoOCR and OCR Wizard
commands.
AutoOCR
Select AutoOCR to finish processing a new or open document according
to the selected process commands. See “Automatic Processing” on page
22 for more information.
OCR Wizard
Select OCR Wizard to have the OCR Wizard guide you through the
entire OCR process. See “Using the OCR Wizard” on page 21 for more
information.
Image Button Commands
Use the Image button to bring a document image into OmniPage Pro’s
image viewer. The Image button’s drop-down list contains the Load
Image, Load Exchange Fax, and Scan Image commands.
Load Image
Select Load Image to load existing image files such as TIFF or PCX
files.
Load Exchange Fax
Select Load Exchange Fax to load faxes from Microsoft Exchange or
Outlook. This command only appears in the drop-down list if you have
the full Microsoft Fax application installed.
Scan Image
Select Scan Image to scan paper documents in your scanner. This
command only appears in the drop-down list if you have installed the
Caere Scan Manager and have selected your default scanner.
Please see “Bringing Document Images into OmniPage Pro” on page 23
for more information.
Zone Button Commands
Use the Zone button to automatically create zones on document images.
Zones are boxes that specify what will be recognized as text or retained
as graphics on an image. The Zone button’s drop-down list contains the
Single-Column Pages, Multiple-Column Pages, Tables, Mixed Pages, and
HP AccuPage commands and the names of any zone templates you have
created. See “Creating Zones for OCR” on page 26 for more information.
Single-Column Pages
Select Single-Column Pages to have OmniPage Pro automatically draw
and order zones on single-column document images such as letters or
memos.
Multiple-Column Pages
Select Multiple-Column Pages to have OmniPage Pro automatically draw
and order zones on multiple-column document images such as
magazine or newspaper articles.
Tables
Select Tables to have OmniPage Pro automatically draw and order zones
on table format document images such as spreadsheets, or any page that
contains a table.
Mixed Pages
Select Mixed Pages if your document contains multiple pages with a
variety of page layouts. OmniPage Pro will automatically draw and
order zones on each page.
HP AccuPage
If you use a scanner that supports HP AccuPage®, you can select
HP AccuPage as the auto zoning option for scanned pages.
Zone Templates
Select a zone template to create zones on document images using that
template. See “Creating Zone Templates” on page 72 for more
information.
OCR Button Commands
Use the OCR button to perform the selected OCR operation on
document images. The OCR button’s drop-down list contains the
Perform OCR, OCR and Check, Train OCR, and Defer OCR commands.
Perform OCR
Select Perform OCR to recognize text on document images. During OCR,
OmniPage Pro analyzes the image and identifies characters to produce
editable text. See “Performing OCR on a Document” on page 27 for more
information.
OCR and Check
Select OCR and Check to recognize text on document images and
automatically start checking for errors after OCR. See “Checking OCR
Results” on page 28 for more information.
Train OCR
Select Train OCR to teach OmniPage Pro how to recognize special
characters. These pre-recognized characters are saved in a training file,
which OmniPage Pro can use to compare with the characters in
document images during OCR. See “Training OCR for Special
Characters” on page 74 for more information.
Defer OCR
Select Defer OCR to delay text recognition during automatic processing.
OmniPage Pro will process your document up to the point of OCR and
then ask if you want to schedule the document to be finished later. See
“Scheduling OCR” on page 79 for more information.
Export Button Commands
Use the Export button to export recognized text and retained graphics to
other applications. The Export button’s drop-down list contains the Save
As, Send Mail, Copy to Clipboard, and Defer Export commands.
Save As
Select Save As to save a recognized document to disk in a specified file
format. See “Saving a Document” on page 39 for more information.
Send Mail
Select Send Mail to send a recognized document as a file attached to a
mail message if you have a MAPI-compliant mail application, such as
Microsoft Exchange or Outlook, installed. See “Sending a Document as
a Mail Attachment” on page 41 for more information.
Copy to Clipboard
Select Copy to Clipboard to place a copy of a recognized document on the
Clipboard. See “Copying a Document to the Clipboard” on page 40 for
more information.
Defer Export
Select Defer Export if you do not want to export your document right
after automatic processing. OmniPage Pro will process your document
up to the point of export and then stop.
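The way the selected process commands shape an automatic run can be sketched as a short pipeline. This Python example is a simplified illustration, not OmniPage Pro's implementation: the function name and the dictionary layout are hypothetical, and the prompt to schedule a deferred document is left out.

```python
# Order of the process steps on the AutoOCR toolbar (AUTO runs them in turn).
STEPS = ["Image", "Zone", "OCR", "Export"]


def plan_auto_run(commands):
    """Return the (step, command) pairs an automatic run would carry out.

    commands maps each step name to the process command selected for it.
    A "Defer ..." command stops the run before that step is performed.
    """
    performed = []
    for step in STEPS:
        command = commands[step]
        if command.startswith("Defer"):
            break  # the document is processed only up to this point
        performed.append((step, command))
    return performed


settings = {
    "Image": "Scan Image",
    "Zone": "Mixed Pages",
    "OCR": "Perform OCR",
    "Export": "Defer Export",
}
for step, command in plan_auto_run(settings):
    print(f"{step}: {command}")
```

With Defer Export selected, the plan covers scanning, zoning, and OCR and then stops. Selecting Defer OCR instead would end the plan after zoning.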
Selecting OmniPage Pro Settings
Click the Options button or choose Options... in the Tools menu to open
the Options dialog box. This is the central location for OmniPage Pro
settings.
[Figure: the Options dialog box. Callouts: Click each tab to view and select different settings. Click for a description of each setting.]
Documents require different settings depending on their input
attributes and your output goals. To get the best results, learn how to
identify document attributes and make selections for them. You may
have to experiment with different settings to get the results you want.
Refer to the Settings Guidelines beginning on page 54 for more
information.
Accuracy Settings
Click the Accuracy tab to select settings that affect OCR accuracy the
most.
[Figure: the Accuracy settings, with these callouts:]
• Language Analyst evaluates and replaces unknown words with
words most likely to be correct during OCR.
• Training files help recognize special characters during OCR.
• Select a brightness setting to account for variations in paper and
print quality when you scan.
• Select Small text if you are processing a page containing text that
is < 6 pt.
Scanner Settings
Click the Scanner tab to select settings for scanning pages.
[Figure: the Scanner settings, with these callouts:]
• This is recommended for black and white pages.
• This is recommended for pages with colored backgrounds, colored
text, or pages containing grayscale graphics.
• This is recommended for highest accuracy with HP scanners that
support HP AccuPage.
• Use these settings if your scanner has an automatic document
feeder.
Page Format Settings
Click the Page Format tab to select settings that determine how the
formatting of a page is handled during OCR.
[Figure: the Page Format settings, with these callouts:]
• Select a setting that describes how your original page looks.
• Select a setting to determine what your page will look like after
OCR.
• Click to select font options for recognized text.
Language Settings
Click the Language tab to select language settings for your document.
[Figure: the Language settings, with these callouts:]
• Select the document's main language.
• Select additional languages for a multi-language document.
• This is the language that will be used in dialog boxes, windows,
and menu commands.
• This is the character used in place of unknown characters.
OCR Aware Settings
Click the OCR Aware tab to select settings for the OCR Aware feature.
OCR Aware allows you to initiate OCR from another application. See
“Using OCR in Other Applications” on page 33 for more information.
[Figure: the OCR Aware settings, with these callouts:]
• An application must be registered to work with OCR Aware.
• If your application is not listed, click Browse... to locate the
application file (*.exe) and add it to the Registered list box.
• Click Register Office 97... to register Office 97 applications.
Some applications may be pre-registered with OCR Aware during
OmniPage Pro installation. These applications will display in the
Registered list box.
To register an application with OCR Aware:
1. Launch the application you want to register and open a
document in it.
This will ensure that the application name appears in the list
box in step 5.
2. Choose Options... in OmniPage Pro’s Tools menu.
3. Click the OCR Aware tab in the Options dialog box.
4. Make sure that Enable OCR Aware is selected.
5. Select the name of the application you want to register in the
Unregistered list box.
6. Click Add >> to add the selected application to the Registered
list box and then click OK.
OmniPage adds the Acquire Text... and Acquire Text Settings...
commands to the File menus of registered applications.
Process Settings
Click the Process tab to set commands and settings for each step of OCR.
[Figure: the Process settings, with these callouts:]
• The OCR Wizard will guide you through the OCR process when you
click AUTO.
• Specifies where newly loaded or scanned images are added to an
open document.
Microsoft Word Settings
Click the Microsoft Word tab to select settings for performing check
recognition directly in Microsoft Word. See “Checking OCR Results in
Microsoft Word” on page 30 for more information.
[Figure: the Microsoft Word settings, with these callouts:]
• Select this if you want to check for OCR errors in Microsoft Word.
• Select the color in which you want suspected errors to appear in
Microsoft Word.
Checking recognition in Microsoft Word is only supported in Microsoft
Word versions 7 and 97. Make sure you associate the *.doc extension
with the version you plan to use. Please refer to your Windows
documentation for more information.
Settings Guidelines
The settings you select in OmniPage Pro can greatly affect OCR results.
Make sure that settings are appropriate for your document before you
begin processing. You may have to experiment with different settings to
get the results you want.
Answer the following questions to get settings recommendations for
your documents.
What type of document are you processing?
• Magazine and newspaper pages, page 55
• Memos and letters, page 55
• Spreadsheets and tables, page 55
• Legal documents, page 56
• Mixed formats or not sure, page 56
What is the quality of the original document?
• Poor or not sure, page 57
• Good, page 57
How much original formatting do you want to keep?
• Minimal, page 58
• Some, page 58
• As much as possible, page 59
Do you want to retain graphics in your document?
• Yes, page 60
• No, page 60
How many languages are in your document?
• One language, page 61
• More than one language, page 61
Are you processing a multi-page document?
• Yes, page 62
• No, page 62
What type of document are you processing?
Magazine and newspaper pages
Recommendations
• Select Multiple columns in the Page Format settings.
• Select the appropriate page size and orientation in the Scanner
settings if you are scanning.
• Draw zones manually or modify automatically created zones if auto
zoning does not successfully create zones around all page areas
you want to process. See “Customizing Zones” on page 65 for more
information. Keep associated sections of text, such as paragraphs,
together in one zone. Omit unnecessary parts of the page such as
separator lines between columns.
Memos and letters
Recommendations
• Select Single column in the Page Format settings.
• Select the appropriate page size and orientation in the Scanner
settings if you are scanning.
• Identify graphics that you want to retain as Graphic zone types.
Spreadsheets and tables
Recommendations
• Select Table in the Page Format settings.
• Select the appropriate page size and orientation in the Scanner
settings if you are scanning.
• Select Retain flowing columns in the Page Format settings.
• Identify the zone type as Graphic for zones that contain graphics
you want to retain.
• Identify the zone content as Numeric for zones that only contain
numbers.
Legal documents
Recommendations
• Select Multiple columns in the Page Format settings if text appears
in two or more columns.
• Select Single column in the Page Format settings if the document
has one, page-wide text column.
• Select the appropriate page size and orientation in the Scanner
settings if you are scanning.
• Draw zones manually or modify automatically created zones to omit
unnecessary parts of the page. For example, do not include line
numbers in a zone if you plan to renumber lines in your word
processor.
• Select Table in the Page Format settings and select Hard carriage
return after every line in the Save As dialog box if you want to
preserve line numbering.
Mixed formats or not sure
Recommendations
• Select Mixed pages in the Page Format settings.
• Select the appropriate page size and orientation in the Scanner
settings if you are scanning.
• Draw zones manually or modify automatically created zones if auto
zoning does not successfully create zones around all page areas
you want to process. See “Customizing Zones” on page 65 for more
information. Keep associated sections of text, such as paragraphs,
together in one zone. Omit unnecessary parts of the page such as
unwanted graphics.
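The Page Format recommendations for each document type can be summarized as a lookup table. The Python sketch below is a reading aid under stated assumptions: the dictionary keys and the function name are hypothetical labels chosen for this example, only the primary Page Format choice is listed, and unknown document types fall back to the "not sure" recommendation.

```python
# Primary Page Format setting recommended for each document type.
# The keys are hypothetical labels used only in this sketch.
PAGE_FORMAT_BY_DOCUMENT_TYPE = {
    "magazine or newspaper": "Multiple columns",
    "memo or letter": "Single column",
    "spreadsheet or table": "Table",
    "legal, two or more columns": "Multiple columns",
    "legal, one page-wide column": "Single column",
    "legal, preserve line numbering": "Table",
    "mixed formats": "Mixed pages",
}


def recommend_page_format(document_type):
    """Return the recommended Page Format setting for a document type.

    Unknown or uncertain types get the "Mixed pages" recommendation,
    matching the "Mixed formats or not sure" guideline.
    """
    return PAGE_FORMAT_BY_DOCUMENT_TYPE.get(document_type, "Mixed pages")


print(recommend_page_format("memo or letter"))  # prints "Single column"
print(recommend_page_format("not sure"))        # prints "Mixed pages"
```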
What is the quality of the original document?
Poor or not sure
Degraded copies, colored or shaded backgrounds or text, run-together
or broken text characters
[Figure: examples of thick, run-together text characters and thin, broken text characters.]
Recommendations for scanning
• Select Grayscale with 3D OCR in the Accuracy settings if you have a
grayscale scanner and your page contains grayscale graphics,
colored background, or colored text.
• Select Grayscale with HP AccuPage in the Accuracy settings if you
have an HP scanner that supports HP AccuPage, and you selected
HP AccuPage in the Scan Manager.
• For best accuracy, use the Black and white Accuracy setting if
your pages are black and white. Lighten the setting for thick,
run-together text characters or dark backgrounds. Darken the
setting for thin, broken text characters.
• Try to scan original documents rather than photocopies.
Other recommendations
• Select Use Language Analyst in the Accuracy settings. OmniPage Pro
will evaluate words and make logical replacements for
hard-to-recognize characters.
• Draw zones manually to omit any smudges or scribbles on the page.
• Choose Check Recognition... in the Tools menu to locate possible
errors after OCR.
• Ask senders to select Fine or Best mode when they send faxes that
you plan to recognize.
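The advice to lighten or darken the setting can be pictured with a simple threshold model. The Python sketch below rests on an assumption: it treats the brightness setting as a cutoff applied to gray levels, which is a common way to describe black-and-white scanning but is not a statement about how OmniPage Pro or any particular scanner works. The function name and sample values are invented for illustration.

```python
def scan_row(gray_row, threshold):
    """Turn one row of gray levels (0 = black, 255 = white) into marks.

    A pixel becomes ink ("#") when it is darker than the threshold.
    Lowering the threshold models a lighter setting; raising it
    models a darker setting.
    """
    return "".join("#" if level < threshold else "." for level in gray_row)


# Two thick strokes separated by a smudged gap (gray level 140).
thick = [250, 40, 30, 140, 35, 45, 250]
print(scan_row(thick, 160))  # prints ".#####." (strokes run together)
print(scan_row(thick, 128))  # prints ".##.##." (lighter setting separates them)

# One thin, faint stroke (gray levels near 150).
thin = [250, 150, 145, 150, 250]
print(scan_row(thin, 128))   # prints "....." (stroke is lost)
print(scan_row(thin, 160))   # prints ".###." (darker setting recovers it)
```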
Good
Clear, well-formed, black text characters on a clean, white background
[Figure: example of well-formed text characters.]
Recommendations
• Select Black and white in the Accuracy settings for the fastest
processing if you are scanning. Use a setting near the middle of
the slider box.
• Deselect Use Language Analyst in the Accuracy settings for faster
processing.
How much original formatting do you want to keep?
Minimal
Keep one font and one font size only
Recommendations
• Select Remove formatting in the Page Format settings.
• Click Font Mapping... in the Page Format settings and select one
font and one font size to be used for all text.
• Select ANSI in the Save As dialog box if you want to be able to
open the document in any application.
Some
Keep font characteristics and paragraph formatting
Recommendations
• Select Retain font and paragraph formatting in the Page Format
settings.
• Click Font Mapping... in the Page Format settings and select the
fonts you want mapped to various font types.
• Save to a file format, such as Rich Text Format (RTF), that
supports the formatting. Text formatting, such as bold and italics,
is retained if the application supports RTF information. Otherwise,
only plain text will be retained. Graphics are retained if the
application supports bitmap images.
As much as possible
Keep font characteristics, paragraph formatting, column formatting and
graphic positioning
Recommendations
• Select True Page in the Page Format settings to retain the original
appearance of a page using frames. The formatting will be more
precise but will be more difficult to edit.
• Select Retain flowing columns in the Page Format settings if your
page contains multiple columns and you want text to flow between
paragraphs and columns in your target application. The formatting
may be less precise than True Page but will be easier to edit.
Please note: The Retain flowing columns setting uses frames when
necessary to maintain column formatting and graphic positioning.
Although frames will appear in the text viewer, only required
frames, such as frames around graphics, will be exported.
• Click Font Mapping... in the Page Format settings and select the
fonts you want mapped to various font types.
• Make sure all parts of the page are included within zones. Any
part not enclosed within a zone is ignored during OCR and will not
appear in the recognized document.
• Save to a file format, such as Rich Text Format (RTF), that
supports the formatting. Text formatting, such as bold and italics,
is retained if the application supports RTF information. Otherwise,
only plain text will be retained. Graphics are retained if the
application supports bitmap images.
Do you want to retain graphics in your document?
Yes
Keep graphics such as logos and photos during OCR processing
Recommendations for scanning
• Select Grayscale with 3D OCR in the Scanner settings if you are
scanning with a grayscale scanner or loading a grayscale image file
and you want to retain grayscale graphics.
• Select Black and white in the Scanner settings if you are scanning
line-art drawings.
Please note: The Grayscale with HP AccuPage setting does not support
grayscale graphics.
Other recommendations
• Select Multiple columns or Mixed pages in the Page Format settings.
The Single column setting will not automatically detect graphics.
• Manually draw zones around graphic areas if necessary.
• Make sure separate zones are drawn around graphic areas and text
areas.
• Make sure graphic zones are identified as Graphic zone types.
These are marked with a G in the upper-right corner.
• Select Retain graphics in the Save As dialog box when you save a
document to another file format.
• To save graphics separately from text after OCR, choose
Save Image... in the File menu and select Save each graphic zone
to a file.
No
Ignore graphics such as logos and photos during OCR processing
Recommendations
• For best accuracy, select Black and white in the Accuracy settings
if your page contains black text on a white background.
• Deselect Retain graphics in the Save As dialog box when you save
a document to another file format.
How many languages are in your document?
One language
Recommendations
• Select the document language in the Language settings.
• For faster processing and more accurate results, select only the language that appears in your document in the Language settings.
• If your document contains a language that is not installed in OmniPage Pro, you can add languages to OmniPage Pro by uninstalling and then reinstalling it.
More than one language
Recommendations
• Select the main document language and any additional languages in the Language settings.
• For faster processing and more accurate results, select only the languages that appear in your document in the Language settings.
• If your document contains languages that are not installed in OmniPage Pro, you can add languages to OmniPage Pro by uninstalling and then reinstalling it. You will be prompted during installation to select which languages you want installed. Select all languages that your document contains, as well as any other languages you commonly use.
Are you processing a multi-page document?
Yes
Recommendations if you have an automatic document feeder (ADF)
• Select Scan until empty in the Scanner settings to scan a stack of pages at once. Otherwise, you must click the Image button to scan each subsequent page.
• Select Double-sided pages to scan pages with print on both sides. You will be prompted to turn the stack over.
• Insert blank pages to separate more than one job within a stack of pages. You can save pages between blank pages as separate files after OCR.
Other recommendations
• Set the desired process commands and click AUTO to automatically process each page of your document in order.
• Create and use a zone template if all pages have similar zoning requirements. See “Creating Zone Templates” on page 72 for more information.
• Choose Schedule OCR... in the Process menu to schedule processing for a specific time. Pick a time that you plan to be away from your computer.
• After OCR, choose Save As... in the File menu. You can select an option to save the recognized document as a single file, one file per page, or a new file after each blank page.
No
Recommendations
• Set the desired process commands and click AUTO to automatically process the page.
• Click the Image button to add more pages to the document by scanning or loading images.
Chapter 5
Customizing OCR
OmniPage Pro has many features that allow you to customize the way
your documents are handled during OCR. This chapter describes how
to use these features.
Please continue reading this chapter for information on these topics:
• Adjusting Page Images Before OCR
• Customizing Zones
• Specifying Fonts
• Training OCR for Special Characters
• Creating User Dictionaries
• Saving Settings Files
• Scheduling OCR
Adjusting Page Images Before OCR
You can rotate and straighten page images in OmniPage Pro’s image
viewer before zoning and OCR take place. This is recommended to
improve OCR accuracy on pages that are not oriented correctly.
If you need to rotate or straighten a page, be sure to do so before you create zones because all zones are deleted during these operations.
To rotate a page image:
1 Click on the page image to make the image viewer active.
2 Click the Rotate Image button to rotate the image 90 degrees (clockwise) at a time.
Or, choose Rotate in the View menu and select 90, 180, or 270 degrees.
To straighten a page image:
1 Click on the page image to make the image viewer active.
2 Click the Straighten Image button.
Or, choose Straighten Image in the View menu.
OmniPage Pro straightens the page image up to a maximum of 10 degrees. OmniPage Pro will not straighten a page if it determines that it is unnecessary.
You can also have OmniPage Pro automatically rotate or straighten pages as necessary during OCR by selecting those options in the Page Format section of the Options dialog box.
Customizing Zones
Zones are borders created around areas of a page image to identify what
will be recognized as text or retained as a graphic during OCR. Zones
play a big part in determining OCR results.
You can create zones automatically, manually, or with a template.
Topics in this section describe how you can customize zones including:
• Drawing Zones Manually
• Modifying Zones
• Deleting Zones
• Changing Zone Properties
• Creating Zone Templates
For information on creating zones automatically, please see “Creating
Zones for OCR” on page 26.
Zone toolbar
The Zone toolbar contains buttons for drawing and modifying zones. Its buttons are:
• Draw Rectangular Zones
• Draw Irregular Zones
• Add to Zone
• Subtract from Zone
• Reorder Zones
• Zone Properties
Drawing Zones Manually
You can draw zones manually on a page image using buttons in the Zone toolbar. Rectangular zones are the most common, but you can also draw irregular-shaped zones.
To draw rectangular zones:
1 Click the Zone Properties button and select the zone type and content for the zone you are about to draw.
See “Changing Zone Properties” on page 71 for more information.
2 Click the Draw Rectangular Zones button.
The mouse pointer in the image viewer becomes a drawing tool.
3 Enclose an area of the image you want as a zone by holding down the mouse button and dragging the drawing tool to form a rectangular box.
Try to keep areas of text, such as paragraphs or single columns, together in the same zone.
4 Release the mouse button when you are done.
A number appears within the zone indicating its processing order.
5 Repeat steps 3 and 4 until you have finished drawing zones around the desired areas of the page.
You cannot draw overlapping zones. If you attempt to draw a zone over an existing zone, the borders of the new zone will wrap around the boundaries of the existing zone.
To draw irregular-shaped zones:
1 Click the Zone Properties button and select the zone type and content for the zone you are about to draw.
See “Changing Zone Properties” on page 71 for more information.
2 Click the Draw Irregular Zones button.
The mouse pointer in the image viewer becomes a drawing tool.
3 Position the drawing tool where you want to start drawing the first side of the zone.
4 Click the mouse button once.
5 Drag the drawing tool to form the first side of your zone.
6 Click the mouse button when you have drawn the desired line length.
7 Draw a perpendicular line in either direction to form the next side of the zone.
8 Repeat steps 6 and 7 to finish drawing each side of your zone.
You will not be allowed to draw a line if it constitutes a restricted shape. The following zone shapes are restricted:
• Indented along the bottom
• Indented along the top
Modifying Zones
You can modify zones by moving, resizing, reordering, extending, subtracting, connecting, or dividing them.
To move zones:
1 Deselect the buttons in the Zone toolbar.
(If one of the first two drawing buttons is selected, you do not have to deselect it.)
2 Place the mouse pointer inside a zone.
3 Hold down the mouse button and drag the zone to the desired location.
To resize zones:
1 Deselect the buttons in the Zone toolbar.
(If one of the first two drawing buttons is selected, you do not have to deselect it.)
2 Select the zone you want to resize by clicking inside it.
The selected zone is shaded and handles appear on its border.
3 Place the mouse pointer over a handle so that it changes to a two-way arrow.
4 Hold down the mouse button and drag the handle in the direction that you want to enlarge or reduce the zone.
5 Release the mouse button when you are done.
The zone border changes to display the modified zone area.
To reorder zones:
1 Click the Reorder Zones button.
The numbers in the zones disappear.
2 Click within the zone you want recognized first.
The number 1 appears in the zone.
3 Click within the zone you want recognized next.
The number 2 appears in the zone.
4 Repeat step 3 until all the zones are appropriately ordered.
If you do not number all the zones, they are automatically numbered for you when you start OCR.
The numbered order of zones determines the order in which text will be placed on a recognized page. However, if you select True Page or Retain flowing columns as the Output Option for a page, the order of the text will be based on the order of the original page.
To extend an area of a zone:
1 Click the Add to Zone button.
The mouse pointer in the image viewer becomes a drawing tool with a plus sign.
2 Position the drawing tool at the point where you want to start extending the zone.
3 Hold down the mouse button and drag the drawing tool in the direction that you want to extend the zone.
4 Release the mouse button when you are finished extending the zone.
The zone border changes to display the modified zone area.
To subtract an area of a zone:
1 Click the Subtract from Zone button.
The mouse pointer in the image viewer becomes a drawing tool with a minus sign.
2 Position the drawing tool at the point where you want to start subtracting from the zone.
3 Hold down the mouse button and drag the drawing tool in the direction that you want to subtract from the zone.
4 Release the mouse button when you are finished subtracting from the zone.
The zone border changes to display the modified zone area.
To connect two or more zones:
1 Click the Add to Zone button.
The mouse pointer in the image viewer becomes a drawing tool with a plus sign.
2 Hold the mouse button down and drag the drawing tool over the area where you want the zones to be connected.
3 Release the mouse button when you are done.
The zone border changes to display the modified zone area.
To divide a zone:
1 Click the Subtract from Zone button.
The mouse pointer in the image viewer becomes a drawing tool with a minus sign.
2 Hold the mouse button down and drag the drawing tool over the area where you want to divide the zone.
3 Release the mouse button when you are done.
The zone border changes to display the modified zone area.
Deleting Zones
You can delete the current zones if you want to create new zones. You can also delete individual zones that you do not want to process during OCR. Any part of a page image not enclosed by a zone is ignored during OCR.
To delete and replace the current zones automatically, click the Zone button. You will be prompted to replace the current zones.
To delete zones:
1 Select the zone you want to delete by clicking inside the zone.
• Shift-click to select additional zones.
• Choose Select All in the Edit menu to select all zones on the current page.
Selected zones are shaded.
2 Press the Delete key or choose Clear in the Edit menu.
The selected zones disappear.
Changing Zone Properties
You can set certain properties for zones to customize how each zone will be treated during OCR. The Zone Properties dialog box contains settings for zone type and zone content.
Zone Type
Every zone on a page has a zone type setting. You can select the following zone types:
• Single-column zone for text zones that contain a single column
• Multiple-column zone for text zones that contain multiple columns
• Table zone for text zones that contain text in tabbed columns
• Mixed zone for text zones that contain a mixture of column layouts
• Graphic zone for photos, drawings, and areas of text that you want to retain as a graphic. The letter G appears within graphic zones. OCR is not performed on graphic zones.
Zone Content
All text zones on a page also have a zone content setting. This specifies the characters OmniPage Pro looks for within a zone during OCR. You can select Alphanumeric or Numeric as the zone content setting. The letter A appears within an alphanumeric zone and the letter N appears within a numeric zone.
For example, if a particular zone only contains numbers and mathematical signs, you can specify the contents of that zone to be Numeric. OmniPage Pro will only look for numeric characters in that zone during recognition.
OmniPage Pro assigns zone properties to each zone when it creates zones automatically. You do not need to change the zone properties unless you want to modify the way zones will be treated during OCR.
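The relationship between zone type and zone content can be pictured as a small data structure. The following Python sketch is illustrative only and is not how OmniPage Pro stores zones: the class and attribute names are hypothetical, and the Alphanumeric default for text zones is an assumption. It encodes only what this section states: graphic zones carry a G and are not recognized, and only text zones have a content setting (A or N).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ZoneType(Enum):
    SINGLE_COLUMN = "Single-column zone"
    MULTIPLE_COLUMN = "Multiple-column zone"
    TABLE = "Table zone"
    MIXED = "Mixed zone"
    GRAPHIC = "Graphic zone"


class ZoneContent(Enum):
    ALPHANUMERIC = "A"
    NUMERIC = "N"


@dataclass
class Zone:
    """Hypothetical model of one zone; names are illustrative only."""
    order: int
    zone_type: ZoneType
    content: Optional[ZoneContent] = None

    def __post_init__(self) -> None:
        if self.zone_type is ZoneType.GRAPHIC:
            # Zone content can only be set for text zones.
            if self.content is not None:
                raise ValueError("zone content applies only to text zones")
        elif self.content is None:
            # Assumption: text zones default to Alphanumeric.
            self.content = ZoneContent.ALPHANUMERIC

    @property
    def is_recognized(self) -> bool:
        # OCR is not performed on graphic zones.
        return self.zone_type is not ZoneType.GRAPHIC

    @property
    def marker(self) -> str:
        # Letter shown inside the zone: G, A, or N.
        if self.zone_type is ZoneType.GRAPHIC:
            return "G"
        return self.content.value
```

For example, `Zone(1, ZoneType.TABLE, ZoneContent.NUMERIC)` models a table of figures that should be read as numbers only, while `Zone(2, ZoneType.GRAPHIC)` models a logo that is kept as an image.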
To change the properties of a zone:
1 Select the zone you want to modify by clicking it.
You can Shift-click to select multiple zones. Selected zones are shaded.
2 Click the Zone Properties button to open the Zone Properties dialog box.
The settings in this dialog box will be blank if multiple zones with different settings are selected at once.
3 Select a zone type for the selected zones.
4 Select a zone content for the selected zones.
You can only select a zone content setting for text zones.
5 Click the standard Close button when you are done.
You can also change a zone’s type and content settings individually by clicking your right mouse button over the zone and choosing a setting in the shortcut menu that appears.
Creating Zone Templates
You can use zone templates to create zones on a page image. A zone
template contains zone attributes including size, shape, position, order,
type, and content. Zone templates are useful if you frequently process
documents that have the same layouts and similar content.
To create a zone template:
1 Load a page image and create the desired zones.
2 Choose Save Zone Template... in the Tools menu.
The New Template dialog box appears.
3 Type a name for your file in the File name text box.
4 Click OK.
The zone template file is saved in the data folder in your installation folder. It can be selected in the Zone button drop-down list.
To create zones with a template:
1 Select the zone template that you want to use in the Zone button drop-down list.
2 Click the Zone button or choose Template in the Process menu.
OmniPage Pro creates zones on the page image using the zone template.
Specifying Fonts
You can retain the font characteristics in your document during OCR if you select an Output Format option other than Remove formatting in the Page Format section of the Options dialog box. OmniPage Pro automatically maps detected font types to specified fonts.
To map fonts, OmniPage Pro analyzes text and categorizes it as one of these font types:
• Proportional Serif
Character spacing varies depending on the character; short lines finish off the letter strokes. The body text in this manual is an example of this font type.
• Proportional Sans-Serif
Character spacing varies depending on the character; letter strokes do not have finishing lines. The headings in this manual are an example of this font type.
• Monospaced Serif
Character spacing is the same for each character; short lines finish off the letter strokes. Courier is an example of this font type.
• Monospaced Sans-Serif
Character spacing is the same for each character; letter strokes do not have finishing lines.
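The four font types come from two independent properties: whether character spacing varies (proportional or monospaced) and whether strokes end in short finishing lines (serif or sans-serif). The Python sketch below illustrates that two-by-two classification and a lookup from font type to a chosen font. It does not describe how OmniPage Pro detects fonts. The function names are hypothetical, and every font in the example mapping except Courier is an arbitrary illustration.

```python
from typing import Dict, Tuple

# (proportional spacing?, serif strokes?) -> font type name used in this manual
FONT_TYPES: Dict[Tuple[bool, bool], str] = {
    (True, True): "Proportional Serif",
    (True, False): "Proportional Sans-Serif",
    (False, True): "Monospaced Serif",
    (False, False): "Monospaced Sans-Serif",
}


def font_type(proportional: bool, serif: bool) -> str:
    """Return the font type for the two detected properties."""
    return FONT_TYPES[(proportional, serif)]


def map_font(proportional: bool, serif: bool, mapping: Dict[str, str]) -> str:
    """Return the font that the user mapped to the detected font type."""
    return mapping[font_type(proportional, serif)]


# Hypothetical user choices; only Courier is named in this manual.
EXAMPLE_MAPPING: Dict[str, str] = {
    "Proportional Serif": "Times New Roman",
    "Proportional Sans-Serif": "Arial",
    "Monospaced Serif": "Courier",
    "Monospaced Sans-Serif": "Letter Gothic",
}
```

With this mapping, text detected as monospaced with serifs would be output in Courier: `map_font(False, True, EXAMPLE_MAPPING)`.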
To customize the font mapping for font types:
1 Choose Options... in the Tools menu to open the Options dialog box.
2 Click the Page Format tab.
3 Click Font Mapping...
4 Select the font you want mapped to each font type.
The selected fonts are applied to text when their corresponding font types are detected during OCR.
5 Click OK when you are done.
The fonts available in the drop-down lists depend on the True
Training OCR for Special Characters
The Specify Character dialog box shows how the selected character appeared in the original page image. It displays the original image of the selected character and the associated character, and lets you click the character you want to associate with the selected character.
6 Specify how you want OmniPage Pro to interpret the character during OCR by entering a character in the Character edit box.
7 Click OK to return to the Train Characters dialog box.
8 Repeat steps 5–7 to continue specifying characters.
9 Click Save to save the specified characters to a training file.
Or, click Append to add the specified characters to another training file.
After saving or appending to a file, you are asked if you want to make this the current training file. Click Yes to recognize the current page using the training file you just created. Click No to return to the image without recognizing it.
Training files are saved in the data folder in your installation folder. You can select them in the Accuracy section of the Options dialog box.
To edit a training file:
1 Choose Edit Training File... in the Tools menu.
A dialog box appears listing all your training files.
2 Double-click the training file you want to edit. Or, select it and click Edit.
The Train Character dialog box displays characters in the selected file, showing each original image with its associated characters.
3 Edit the characters as desired.
• Double-click a character that you want to edit.
• Click a character that you want to remove and click Delete.
4 Do one of the following after editing the training file:
• Click Save to save changes in the training file.
• Click Append to add all trained characters to another training file.
• Click Cancel to exit without saving the edits to the training file.
Creating User Dictionaries
A user dictionary is used when you perform OCR and check for errors afterward. You can select a user dictionary in the Language section of the Options dialog box.
To customize a user dictionary:
1 Choose Edit User Dictionary... in the Tools menu.
A dialog box lists all user dictionary files. The list can include Microsoft Word’s user dictionary, which you can use with OmniPage Pro, and OmniPage Pro’s default user dictionary.
2 Do one of the following:
• Select a file and click Edit to edit an existing user dictionary.
• Click New to create a new user dictionary. Enter a name in the dialog box that appears and click OK.
The User Dictionary dialog box appears. Words in the user dictionary appear in its list box.
3 Add or delete words as desired:
• Type a word in the User word edit box and click Add to add it.
• Select a word in the list box and click Delete to delete it. Click Delete All to remove all words from the dictionary.
• Click Import... to add words from a text file.
4 Click Close when you are finished editing the user dictionary.
OmniPage Pro’s user dictionaries are saved in the data folder in your installation folder.
Saving Settings Files
You can save OmniPage Pro settings to a file. A settings file is useful for
quickly loading particular settings that you need for certain documents.
The settings you select in OmniPage Pro can greatly affect OCR results.
For help in selecting settings for different kinds of documents, see
“Settings Guidelines” on page 54.
To save settings to a file:
1 Choose Options... in the Tools menu.
2 Select the desired settings in the Options dialog box.
3 Click Save Settings... to open the Save Settings dialog box.
4 Select a folder location for the settings file.
5 Type in a file name for the settings file and click OK.
All the current settings in the Options dialog box are saved into a settings file with an .ini extension.
6 Click OK to close the Options dialog box.
To load a settings file:
1 Choose Options... in the Tools menu to open the Options dialog box.
2 Click Load Settings... to open the Load Settings dialog box.
3 Select the folder location of the settings file you want to load.
4 Select the name of the settings file you want to load and click OK.
The settings change according to the selected file.
5 Click OK to close the Options dialog box.
Scheduling OCR
You can schedule OCR to take place on one or more OmniPage Documents, supported image files, and pages in your scanner. This processing can take place while you are away from your computer as long as OmniPage Pro is still running. Scheduled documents are opened at the specified time, unfinished pages are recognized, and the documents are saved in a preselected format and location.
Scheduled documents are deleted from the processing queue if you
close OmniPage Pro. Therefore, you should keep OmniPage Pro
running until the documents are processed.
Topics in this section include:
• Scheduling Individual Documents
• Scheduling Documents from an Input Folder
• Modifying Output Options for Documents
Scheduling Individual Documents
You can schedule individual documents from different folders.
Scheduled documents are recognized at the specified time and then
saved in the designated output folder.
To schedule individual documents:
1 Choose Schedule OCR... in the Process menu.
The Schedule OCR dialog box appears. All scheduled documents are displayed in its processing queue. Click Add... to add documents to the processing queue, Remove to remove a selected document from the processing queue, or Options... to modify default output options. OmniPage Pro starts processing scheduled documents, in order, at the specified time.
2 Click Add... to open the Add Jobs dialog box.
Click Advanced to select documents from more than one folder.
3 Locate and select the files you want to add to the schedule.
You can select OmniPage Documents and supported image files.
4 Click Open after selecting the desired files.
The Schedule OCR dialog box displays the newly added files.
5 Select the time that you want OmniPage Pro to process the scheduled documents.
Select Finish now if you want OmniPage Pro to process all scheduled documents as soon as you close the dialog box.
6 Click OK in the Schedule OCR dialog box to save your settings as specified.
All scheduled files are processed, in order, at the scheduled time.
Scheduling Documents from an Input Folder
You can set up OmniPage Pro to automatically schedule documents from a specified input folder. Scheduled documents are recognized at the specified time and then saved in the designated output folder.
To schedule documents from an input folder:
Scheduling Documents from an Input Folder
You can set up OmniPage Pro to automatically schedule documents
from a specified input folder. Scheduled documents are recognized at
the specified time and then saved in the designated output folder.
To schedule documents from an input folder:
1 Choose Schedule OCR... in the Process menu.
The Schedule OCR dialog box appears. All scheduled documents are displayed in its processing queue, and OmniPage Pro starts processing documents in the queue at the specified time.
2 Click the Options... button to open the Schedule OCR Options dialog box.
This dialog box contains an option to schedule documents in your scanner’s ADF and an option to automatically schedule documents in the specified folder. The selected output options are used for all newly scheduled documents.
3 Select Auto add new jobs from folder and select the desired input folder.
If you use the auto-add feature to schedule documents and you do not select Delete original file after OCR, original files will be moved from the input folder to the output folder after processing.
4 Click OK in the Schedule OCR Options dialog box to accept the selected settings.
The Schedule OCR dialog box reappears and adds documents from the input folder to the processing queue.
5 Select the time that you want OmniPage Pro to process scheduled documents.
6 Click OK in the Schedule OCR dialog box to save the settings and close the dialog box.
Processing begins at the specified time. Right before processing begins, OmniPage Pro checks the input folder again and adds any new documents to the processing queue.
After scheduled jobs are processed, the Auto add new jobs from folder option will be deselected.
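The auto-add behavior described above (documents are queued when the option is set, and the input folder is checked again right before processing) can be pictured with a short Python sketch. This is an illustration under stated assumptions, not OmniPage Pro’s implementation: the function name is hypothetical, files are queued in alphabetical order for simplicity, and the extension list is taken from “Supported File Formats” in Chapter 6.

```python
from pathlib import Path
from typing import List

# Extensions OmniPage Pro can open, per "Supported File Formats".
SUPPORTED = {".bmp", ".dcx", ".jpg", ".met", ".pcx", ".tif"}


def add_new_jobs(queue: List[Path], input_folder: Path) -> List[Path]:
    """Append supported files from input_folder that are not yet queued.

    Calling this once when the option is selected and once more just
    before processing mimics the second check of the input folder.
    """
    queued = set(queue)
    for path in sorted(input_folder.iterdir()):
        if (
            path.is_file()
            and path.suffix.lower() in SUPPORTED
            and path not in queued
        ):
            queue.append(path)
            queued.add(path)
    return queue
```

Files that arrive in the folder between the two calls are added at the end of the queue, and files that are already queued are not added twice.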
Modifying Output Options for Documents
All newly scheduled documents have the same default output folder
and file format assigned to them. The default output file name uses the
original file name and the extension of the output file format. You can
modify all of these output options for any scheduled document.
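As a worked example of the default naming rule, the Python sketch below builds an output path from the original file name, the output folder, and the extension of the output file format. The function name and the folder paths are hypothetical and serve only to illustrate the rule stated above.

```python
from pathlib import PureWindowsPath


def default_output_path(original_file: str, output_folder: str,
                        output_extension: str) -> str:
    """Combine the original file name with the output format's extension."""
    stem = PureWindowsPath(original_file).stem
    ext = output_extension
    if not ext.startswith("."):
        ext = "." + ext
    return str(PureWindowsPath(output_folder) / (stem + ext))
```

Under these assumptions, an image at `C:\Scans\invoice.tif` saved as Rich Text Format to `C:\OCR\Out` would get the default output path `C:\OCR\Out\invoice.rtf`.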
Click the Options... button in the Schedule OCR dialog box to change the default options used for all newly scheduled documents.
To modify the output options for an individual document:
1 Choose Schedule OCR... in the Process menu.
The Schedule OCR dialog box appears.
2 Select a scheduled file and click Modify… to open the Modify Scheduled Job dialog box.
3 Select the desired options for the document.
You can select output options for this particular document and choose whether the original document is deleted after processing.
4 Click OK to accept the selected options.
The Schedule OCR dialog box reappears.
5 Click OK to close the Schedule OCR dialog box.
Chapter 6
Technical Information
This chapter provides troubleshooting and other technical information
about using OmniPage Pro.
Please also read the Release Notes and Scanner Setup Notes that came in your OmniPage Pro package. These contain the latest information on OmniPage Pro and its supported scanners.
Please continue reading this chapter for information on these topics:
• General Troubleshooting Solutions
• Using Visioneer Scanners with OmniPage Pro
• Supported File Formats
• Scanner Setup Issues
• OCR Problems
• Uninstalling the Software
General Troubleshooting Solutions
Although OmniPage Pro is designed to be easy to use, problems
sometimes occur. Many of the onscreen error messages contain self-explanatory descriptions of what to do: check connections, close other
applications to free up memory, and so on. Sometimes that is all the
troubleshooting help you need.
Please see your Windows documentation for information on optimizing
your system and application performance.
Topics in this section include:
• Solutions to Try First
• Testing OmniPage Pro
• Low Memory Problems
• Low Disk Space Problems
Solutions to Try First
Try these possible solutions if you experience problems using
OmniPage Pro:
• Make sure that your system meets all requirements listed under
“Minimum System Requirements” on page 15.
• Restart your computer and make sure other applications are
functioning properly.
• Make sure that your scanner is plugged in and that all cable
connections are secure.
• Turn off your computer and your scanner, turn your scanner
back on, and then restart your computer.
• Use the software that came with your scanner to verify that the
scanner works properly before using it with OmniPage Pro.
• Make sure you have the correct drivers for your scanner, printer, and video card. See the Scanner Setup Notes for more information.
• Run ScanDisk for Windows 95 or Check Disk for Windows NT to check your hard disk for errors. See Windows online help for more information.
• Defragment your hard disk. See Windows online help for more information.
• Uninstall and reinstall OmniPage Pro and the Scan Manager.
Testing OmniPage Pro
Restarting Windows 95 in safe mode or Windows NT in VGA mode allows you to test OmniPage Pro on a simplified system. This is recommended when you cannot resolve crashing problems or if OmniPage Pro has stopped running altogether. See Windows online help for more information.
Your scanner will not run with OmniPage Pro in safe mode or VGA mode, so do not test scanner problems in this configuration.
To test OmniPage Pro in safe mode (Windows 95):
1 Restart your computer in safe mode by pressing F8 immediately after you see the “Starting Windows 95” message.
2 Launch OmniPage Pro and try performing OCR on an image.
Use an existing image file such as the Sample.tif file.
• If OmniPage Pro does not launch or run properly in safe mode, then there may be a problem with the installation. Uninstall and reinstall OmniPage Pro, and then run it in Windows safe mode.
• If OmniPage Pro runs in safe mode, then a device driver on your system may be interfering with OmniPage Pro operation. Troubleshoot the problem by restarting Windows in Step-by-Step Confirmation mode. See Windows online help for more information.
To test OmniPage Pro in VGA mode (Windows NT):
1 Restart your computer.
2 Select Windows NT Workstation Version 4.00 [VGA mode] and press Enter.
3 Press Ctrl+Alt+Delete and select Task Manager.
4 In the Task Manager dialog box, select all background applications and click End Process. See your Windows documentation for more information.
5 Launch OmniPage Pro and try performing OCR on an image.
Use an existing image file such as the Sample.tif file.
Low Memory Problems
OmniPage Pro may run poorly under low memory conditions. This may
be indicated by various error messages or if OmniPage Pro works
slowly and accesses the hard drive often. Try these solutions for low
memory conditions:
• Restart your computer.
• Close other open applications to free up memory.
• Close unnecessary OmniPage Pro windows.
• Defragment your hard disk to free up contiguous blocks of disk
space. See Windows online help for instructions.
• Increase the amount of free hard disk space.
• Increase your computer’s physical memory (RAM).
More memory optimizes OCR performance. See “Minimum
System Requirements” on page 15 for more information.
Low Disk Space Problems
Problems may occur if your system runs low on free disk space. Try
these solutions for low disk space problems:
• Empty the Windows Recycle Bin.
• Delete the .tmp files in the Temp folder. This folder is usually located in your Windows folder.
• Run ScanDisk for Windows 95 or Check Disk for Windows NT to check for errors that may be using up disk space. See Windows online help for instructions.
• Back up unneeded files onto floppy disks or other media and delete them from your hard disk.
• Remove Windows applications that you do not use.
• Defragment your hard disk. See Windows online help for instructions.
• Clean the cache for your web browser and limit its size.
Using Visioneer Scanners with OmniPage Pro
During installation, OmniPage Pro automatically integrates with your
Visioneer PaperPort software. However, you cannot scan directly into
OmniPage Pro if you use a Visioneer scanner or if your scanner is set up
to work with PaperPort software (such as the HP ScanJet 5s). Instead,
scan pages into PaperPort and then drag the page images onto the
OmniPage Pro icon at the bottom of the PaperPort Desktop. The page
images will be loaded into OmniPage Pro. See OmniPage Pro’s online
help for more information.
Supported File Formats
OmniPage Pro can open these file formats:
• Bitmap (*.bmp)
• DCX (*.dcx)
• JPEG (*.jpg)
• OmniPage Document (*.met)
• PCX (*.pcx)
• TIFF (*.tif)
Caere Documents from version 6.0 and earlier can only be opened if the original
images were preserved.
TIFF files can be single- or multiple-page, line art or grayscale, compressed or
uncompressed. They can be 200, 300, or 400 dpi, but 300 dpi is recommended.
OmniPage Pro stores and displays TIFF files as 300 dpi line art.
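To see what these resolutions mean in practice, pixel dimensions and uncompressed line art size follow directly from page size and dpi. The sketch below is illustrative Python, not part of OmniPage Pro. It assumes a US Letter page (8.5 x 11 inches) and 1 bit per pixel with each row padded to a whole byte; the function names are hypothetical.

```python
import math


def pixel_size(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


def line_art_bytes(width_in, height_in, dpi):
    """Approximate size of an uncompressed 1-bit (line art) image.

    Assumes each row of pixels is padded up to a whole number of bytes.
    """
    width_px, height_px = pixel_size(width_in, height_in, dpi)
    bytes_per_row = math.ceil(width_px / 8)
    return bytes_per_row * height_px


if __name__ == "__main__":
    for dpi in (200, 300, 400):
        w, h = pixel_size(8.5, 11, dpi)
        size_kb = line_art_bytes(8.5, 11, dpi) / 1024
        print(f"{dpi} dpi: {w} x {h} pixels, about {size_kb:.0f} KB uncompressed")
```

Doubling the resolution roughly quadruples the image size, which is one reason higher resolutions need more memory and disk space.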
OmniPage Pro can save original images to these file formats:
Bitmap (*.bmp)               TIFF Uncompressed (*.tif)
OmniPage Document (*.met)    TIFF PackBits (*.tif)
PCX (*.pcx)                  TIFF Group 4 Compressed (*.tif)
Saving Image Files
OmniPage Pro saves each page of a multiple-page image separately.
If you select Save all pages in the Save Image dialog box, Page# is
appended to file names to distinguish separately saved pages. If you
select Save each graphic zone to a file, then Zone# is appended to file
names to distinguish separately saved graphic zones.
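The naming behavior described above can be sketched as follows. This is an illustrative Python example only, not OmniPage Pro's actual implementation. It assumes the number is appended to the base name before the extension, with no separator or zero padding; the manual does not specify the exact format, and the function name is hypothetical.

```python
from pathlib import Path


def numbered_names(filename, count, start=1):
    """Return the file names produced when a number is appended to each copy.

    Assumption: the number goes directly after the base name and before
    the extension, for example report.tif -> report1.tif.
    """
    path = Path(filename)
    return [f"{path.stem}{n}{path.suffix}" for n in range(start, start + count)]


if __name__ == "__main__":
    # A three-page image saved with "Save all pages" selected.
    print(numbered_names("report.tif", 3))
```

Under these assumptions, a three-page image saved as report.tif would produce report1.tif, report2.tif, and report3.tif.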
OmniPage Pro can save recognized text to these file formats:
When saving to HTML, all graphics are saved as separate image files using
JPEG format.
Scanner Setup Issues
This section contains information on scanner setup and solutions for
scanning problems you may encounter.
For more detailed scanner information, please read the Scanner Setup
Notes included in the OmniPage Pro package.
Topics in this section include:
• Scanner Drivers Supplied by the Manufacturer
• Scanner Drivers Supplied by Caere
• Problems Connecting OmniPage Pro to Your Scanner
• Missing Scan Image Command
• Scanner Message on Launch
• System Crash Occurs While Scanning
Scanner Drivers Supplied by the Manufacturer
Many scanners are shipped with one or more scanner drivers. This is
software that allows your computer to communicate with your scanner.
Some scanners do not require drivers and other scanners require more
than one driver. Refer to your scanner documentation for information
about installing any required scanner drivers.
Make sure that your scanner and scanner drivers are properly installed
and configured before installing OmniPage Pro. Make sure that you
have installed the appropriate scanner drivers supplied by the
manufacturer.
For HP IIp, IIc, IIcx, 3p, and 3c scanners, use the drivers that came with
the scanners, or select a TWAIN or ISIS driver in the Caere Scan
Manager.
Scanner Drivers Supplied by Caere
OmniPage Pro is shipped with special scanner drivers that allow it to
communicate with supported scanners. These scanner driver files are
installed on your computer when you install the Caere Scan Manager.
These drivers often work in conjunction with the drivers from your
scanner manufacturer. In order to use your scanner with OmniPage Pro,
you must select the appropriate scanner in the Caere Scan Manager. See
“Setting Up Your Scanner with OmniPage Pro” on page 16 for more
information.
Problems Connecting OmniPage Pro to Your Scanner
Try these solutions if you experience a problem between OmniPage Pro
and your scanner or if you receive a scanner error message when you
launch OmniPage Pro.
• Make sure the scanner is supported by OmniPage Pro with your
version of Windows 95 or Windows NT.
A list of tested scanners is provided in the Scanner Setup Notes. If
your scanner is not listed, call your scanner manufacturer to find
out if it is supported.
• Make sure the Caere Scan Manager is installed and that you have
selected the correct scanner in the Scan Manager.
See “Setting Up Your Scanner with OmniPage Pro” on page 16.
• Make sure you have installed the appropriate scanner driver. See
the Scanner Setup Notes for more information.
• Make sure your scanner is connected, compatible with your
system, and runs with the software provided by the
manufacturer before you use it with OmniPage Pro.
• Make sure your scanner is connected securely and turned on
before you start Windows.
Scanner drivers must be loaded at startup. Turn on your scanner
first and then restart your computer.
• Make sure the scanner is not in use by another application.
• Uninstall and then reinstall the Caere Scan Manager.
Missing Scan Image Command
The Scan Image command does not appear in the Image button’s drop-
down list in the following cases:
• You did not install the Caere Scan Manager or select an
appropriate scanner. See “Setting Up Your Scanner with
OmniPage Pro” on page 16 for instructions.
• Your scanner is not connected to your computer or is not
functioning properly. See “Scanner Setup Issues” on page 91.
• You use a Visioneer scanner or your scanner is set up to work
with Visioneer’s PaperPort software, such as the HP ScanJet 5s.
See the Scanner Setup Notes for more information.
Scanner Message on Launch
The first time you launch OmniPage Pro after installing or changing your
current scanner in the Caere Scan Manager, you may get this message:
This scanner’s configuration is set using the system-level driver.
If it asks for no more information, click OK in the dialog box. You may
also have the option to select the following:
• SCSI ID or scanner configuration information
Consult your scanner documentation for the correct information.
• Page size information
Enter the largest size page that your scanner supports.
System Crash Occurs While Scanning
Try these solutions if a crash occurs during a scan:
• Turn your computer off. Turn your scanner off and on again to
return the scanner to its default state. Then restart your computer.
• Check your scanner setup. See “Scanner Setup Issues” on page 91
for more information.
• Check the TWAIN Scanner Settings tab in the Caere Scan Manager
if you are using a TWAIN scanner.
• Check with the scanner manufacturer to make sure you have the
appropriate driver for your scanner.
• Resolve low memory problems. See “Low Memory Problems” on
page 88 for more information.
• Resolve low disk space problems. See “Low Disk Space
Problems” on page 88 for more information.
• Check Caere Corporation’s web site (www.caere.com) for Scan
Manager updates.
Scanner Not Listed in Supported Scanners List Box
Try these solutions if your scanner is not listed in the Scan Manager
Supported Scanners list box:
• Check Caere Corporation’s web site (www.caere.com) for Scan
Manager updates.
• Select TWAIN scanner as your current scanner in the Supported
Scanners list box.
Scanning Tips
OCR results will be poor if an image is not scanned properly. Remember
the following tips when you scan:
• Take the color and quality of your document into account when
scanning.
High-quality documents return better recognition results than
low-quality documents. Shaded, colored, or low-quality
documents may result in poor recognition accuracy unless
adjustments are made before scanning. See “What is the quality
of the original document?” on page 57 for more information.
• Always try to scan an original document instead of a photocopy.
• Make sure the page is properly aligned in the scanner.
Select Automatically straighten page image in the Page Format
settings of the Options dialog box to automatically straighten a
page image by up to 10 degrees if necessary.
• Check the glass, mirrors, and lenses on your scanner for dust,
smudges, or scratches. Clean if necessary.
• Make sure the proper settings are selected in the Scanner section
of the Options dialog box before scanning.
See “Scanner Settings” on page 49 for more information.
OCR Problems
This section contains information and solutions for possible OCR
problems.
Topics in this section include:
• System Crash During OCR
• Text Does Not Get Recognized Properly
• Problems With Fax Recognition
System Crash During OCR
Try these solutions if a crash occurs during OCR or if processing takes a
very long time:
• Resolve low memory problems. See “Low Memory Problems” on
page 88 for more information.
• Resolve low disk space problems. See “Low Disk Space
Problems” on page 88 for more information.
• Minimize all applications or press Alt+Tab to check for Windows
error messages.
• Check the quality of the image you are recognizing. See “What is
the quality of the original document?” on page 57 for more
information.
See “Scanning Tips” in the previous section for ways to improve
the quality of scanned images.
• Break complex page images (lots of text and graphics or elaborate
formatting) into smaller jobs. Draw zones manually or modify
automatically created zones and perform OCR on one page area
at a time. See “Customizing Zones” on page 65 for more
information.
• Restart Windows 95 in safe mode or Windows NT in VGA mode
and test OmniPage Pro by performing OCR on Sample.tif. See
“Testing OmniPage Pro” on page 87.
• If you are performing multiple tasks at once, such as recognizing
and printing, OCR may take longer.
Text Does Not Get Recognized Properly
Try these solutions if any part of the original document is not converted
to text properly during OCR:
• Look at the original page image and make sure that all text areas
are enclosed by text zones. If an area is not enclosed by a zone, it
is ignored during OCR. See “Creating Zones for OCR” on page
26 for more information.
• Make sure text zones are identified correctly. Alphanumeric text
zones are marked by an A. Graphic zones are marked by a G.
Reidentify zones, if necessary, and perform OCR on the
document again. See “Changing Zone Properties” on page 71 for
more information.
• Make sure the correct main and secondary document languages
are selected in the Language settings. Only languages included in
the document should be selected. See “Language Settings” on
page 50 for more information.
• Select Use Language Analyst in the Accuracy settings. The
Language Analyst evaluates words and corrects likely errors
during OCR. See “Accuracy Settings” on page 49 for more
information.
• Train OmniPage Pro to recognize special characters that might
Special Characters” on page 74 for more information.
• If you use True Page as the Output Format setting, recognized text
gets put into frames (formatting boxes) in the text viewer. Some
text may be hidden from view if a frame is too small. To view the
text, place the cursor in the text frame and use the arrow keys on
your keyboard to scroll to the top, bottom, left, or right of the
frame.
• Check the glass, mirrors, and lenses on your scanner for dust,
smudges, or scratches. Clean if necessary.
OmniPage Pro only recognizes printed text characters such as
typewritten or laser-printed text. However, it can retain handwritten
text, such as a signature, as a graphic. See “Do you want to retain
graphics in your document?” on page 60 for guidelines on retaining
graphics.
Problems With Fax Recognition
Try these solutions to improve OCR accuracy on fax images:
• Ask senders to select Fine or Best mode when they send you a fax.
This produces a resolution of 200x200 dpi.
• Ask senders to transmit files directly to your computer via fax
modem if you both have one. You can save fax images as image
files and then load them into OmniPage Pro. See “Supported File
Formats” on page 89 for more information.
• Ask senders to use clean, original documents if possible. Sans
serif fonts (such as the one used for headings in this manual) are
easier to recognize than serif fonts (such as the one used for body
text in this manual).
Uninstalling the Software
Sometimes uninstalling and then reinstalling OmniPage Pro and the
Caere Scan Manager will solve a problem.
OmniPage Pro’s Uninstall program will not remove any files saved to
the OmniPage Install directory or subdirectories, in addition to the
following files:
• Zone templates (.zon)
• Training files (.trn)
• User dictionaries (.ud)
• Temp files (.tmp)
To uninstall OmniPage Pro:
1. Close OmniPage Pro.
2. Click Start in the Windows taskbar and choose Caere
Applications > Uninstall OmniPage Pro.
3. Click Yes to confirm that you want to remove OmniPage Pro.
4. Restart your computer.
To uninstall the Caere Scan Manager:
1. Close OmniPage Pro.
2. Click Start in the Windows taskbar and choose Settings >
Control Panel > Add/Remove Programs.
3. Select Caere Scan Manager 3.0 and click Add/Remove.
4. Click OK to confirm that you want to remove the Caere Scan
Manager.
5. Restart your computer.
Some icons and program files may remain on your system if
they have been renamed, modified, or moved to different
locations.
Glossary Terms
3D OCR® A technology developed by Caere that uses grayscale
information to increase accuracy when recognizing scanned text
characters.
ADF See automatic document feeder.
AnyPage A technology developed and licensed by Caere that
improves the combined performance of grayscale scanners and
OmniPage Pro. AnyPage uses the quality of grayscale images to
improve the recognition of scanned pages. It is especially useful for
text printed on shaded backgrounds.
auto zoning The process OmniPage Pro uses to automatically draw and
order zones on a page image.
automatic document feeder (ADF) A device that allows you to scan
multiple pages without having to place each page in the scanner. Some
ADFs are built into scanners; others are add-on products.
automatic processing Using OmniPage Pro’s AUTO button to process
an open document or a new document from start to finish according to
the selected process commands.
fax Short for facsimile machine. Fax machines scan a page, convert the
image into digital data, and send the data over a phone line to another
fax or computer. The receiving machine creates the image on paper or
stores the data on disk as a fax file.
file format The way an application records and stores information in a
file. A document’s file format must be converted in order to open it in
another application that does not support the current file format.
font In typography, a complete set of type in one size and style of
character. In computer usage, a collection of letters, numbers,
punctuation marks, and other typographical symbols with a consistent
appearance; the size can be changed readily.
font mapping Matching a font type with a particular font. OmniPage
Pro can map selected TrueType fonts to the font types that it detects in
a document during recognition.
format The form in which information on a printed page is organized
and presented, including page size, column layout, paragraph
spacing, fonts, and so on.
frame A formatting box containing text or graphics that is used to
design page layout. For example, columns in a document may be
contained within a separate frame.
HP AccuPage® A technology developed and licensed by Hewlett-Packard
that improves the combined performance of HP scanners and
OmniPage Pro. While preserving the quality of grayscale images, HP
AccuPage technology retains the format of scanned pages, improves
the recognition of text printed on shaded backgrounds, and accurately
recognizes text printed at small point sizes.
image An electronic picture of text and/or graphics such as a scanned
paper document or an electronic fax file. Images do not have editable
text characters; they have many tiny dots (pixels) that together form a
picture of text.
image viewer The area on the OmniPage Pro desktop that displays the
original page image. Zones are created in the image viewer before OCR
takes place.
Language Analyst® A Caere technology that uses information about
language context and usage rules to evaluate text and correct likely
errors during OCR.
mapping See font mapping.
monospaced font Any font in which all characters have the same width.
For example, in Courier New (a monospaced font), the letter M is the
same width as the letter l. Thus, MMMMM is the same width as lllll.
OCR See optical character recognition.
OCR Aware A feature that allows you to use OmniPage Pro OCR in
another application such as Microsoft Word. You can perform OCR on
an image and paste the resulting text directly into an open document.
OmniPage Document A file that is saved in OmniPage Pro’s
proprietary format (.met). OmniPage Documents can consist of
original page images, zones, and recognized text.
optical character recognition (OCR) The process of turning an image,
such as a scanned paper document or an electronic fax file, into
computer-editable text so you do not have to retype the text manually.
point A typographic unit of measurement equal to 1/72 inch, measured
vertically. Points are used to describe font size.
proportional font Any font in which characters differ in width. For
example, in Palatino (a proportional font), the letter M is wider than
the letter l. Thus, MMMMM is wider than lllll.
recognition The OCR process. A page is recognized when OmniPage Pro
performs OCR on it.