The software described in this book is furnished under license and may be used or copied only
in accordance with the terms of such license.
IMPORTANT NOTICE
ScanSoft, Inc. provides this publication "as is" without warranty of any kind, either express or
implied, including but not limited to the implied warranties of merchantability or fitness for a
particular purpose. Some states or jurisdictions do not allow disclaimer of express or implied
warranties in certain transactions; therefore, this statement may not apply to you. ScanSoft
reserves the right to revise this publication and to make changes from time to time in the
content hereof without obligation of ScanSoft to notify any person of such revision or
changes.
TRADEMARKS AND CREDITS
ScanSoft, OmniPage, OmniPage SE, OmniPage Pro, PaperPort, Pagis, True Page, Direct OCR,
AutoOCR, OCR Proofreader are registered trademarks or trademarks of ScanSoft, Inc., in the
United States and/or other countries.
All other trademarks and tradenames are hereby recognized and may be registered to their
respective holders.
ScanSoft, Inc.
9 Centennial Drive
Peabody, MA 01960,
United States of America
Part Number 58-28001-05A
CONTENTS
WELCOME vii
Using this guide viii
Getting online help ix
Online HTML Help ix
Context-Sensitive Help ix
Tech Notes x
Glossary x
OmniPage SE x
1 INSTALLATION AND SETUP 11
System requirements 12
Installing OmniPage SE 13
Setting up your scanner with OmniPage SE 14
How to start the program 16
Registering your software 17
New features in OmniPage Pro 11 18
OmniPage SE and OmniPage Pro 11 19
2 INTRODUCTION 21
What is optical character recognition 22
The OmniPage SE desktop 24
The Standard toolbar 25
The Menu bar 25
The Image toolbar 26
The Formatting toolbar 26
The OmniPage Toolbox 27
Managing documents 28
Thumbnail view 28
Detail view 29
Customizing columns in Detail view 30
Deleting pages from a document 30
Printing a document 30
Closing a document 31
OmniPage Documents 31
Why save to OPD 32
How to save to OPD 32
Settings 33
3 TUTORIAL: PROCESSING DOCUMENTS 35
Quick Start Guide 36
Loading and recognizing sample image files 36
Scanning and recognizing a single page 36
Processing documents using the OCR Wizard 39
Processing documents automatically 42
Command buttons 43
Processing documents manually 44
Processing a document automatically and finishing it manually 46
Processing from other applications 47
How to set up Direct OCR 47
How to use Direct OCR 47
How to use OmniPage SE with your PaperPort software 48
Processing documents with Schedule OCR 49
Defining the source of page images 50
Input from image files 50
Input from scanner 51
Scanning with an ADF 52
Scanning long documents without an ADF 53
Describing the layout of the document 53
Manual zoning 55
Working with zones 55
Zone properties 56
Table grids in the image 58
Using zone templates 59
4 PROOFING AND EDITING 61
Proofreading OCR results 62
Checking recognized text against original 63
User dictionaries 64
IntelliTrain 65
The editor display and views 68
Text and image editing 69
Reading text aloud 70
Page outline 72
5 SAVING AND EXPORTING 73
Preparing recognition results for export 74
Saving to file 75
Saving original images 75
Saving recognition results 76
Saving a document as you work 77
Copying a document to the Clipboard 78
Sending a document as a mail attachment 79
OMNIPAGE SE USER’S GUIDE v
6 TECHNICAL INFORMATION 81
Troubleshooting 82
Solutions to try first 82
Testing OmniPage SE 83
Low memory problems 84
Low disk space problems 84
Supported file types 85
File types for opening and saving images 85
File types for saving recognition results 86
Saving to PDF 87
OCR problems 88
Text does not get recognized properly 88
Problems with fax recognition 89
System or performance problems during OCR 89
Uninstalling the software 90
INDEX 91
Welcome
Welcome to OmniPage SE™, and thank you for using our software!
The following documentation has been provided to help you get
started and give you an overview of the program.
This User’s Guide
This Guide introduces you to using OmniPage SE. It includes
installation and setup instructions, a description of the program’s
commands and working areas, task-oriented instructions, ways to
customize and control processing, and technical information. The
Guide is presented in PDF format, allowing you to use hyperlink
jumps on cross-references and other navigation tools in your PDF
viewer.
Online Help
OmniPage SE’s online Help contains information on features,
settings, and procedures. The online Help is provided as HTML help,
and has been designed for quick and easy information retrieval.
Comprehensive context-sensitive help aims to provide just enough
assistance to let you keep working without delay. Please see the section
Getting online help.
Readme File
The Readme file contains last-minute information about the software.
Please read it before using OmniPage SE. To open this HTML file,
choose Readme in the OmniPage SE Installer or afterwards in the
Help menu.
Scanning and other information
ScanSoft’s web site at www.scansoft.com provides timely information
on the program. The Scanner Guide contains updated information
about supported scanners and related issues. Access ScanSoft’s web site
from the OmniPage SE Installer or afterwards from the Help menu.
USING THIS GUIDE
This Guide is written with the assumption that you know how to
work in the Microsoft Windows environment. Please refer to your
Windows documentation if you have questions about how to use
dialog boxes, menu commands, scroll bars, drag and drop
functionality, shortcut menus, and so on.
We also assume you are familiar with your scanner and its supporting
software, and that the scanner is installed and working correctly before
it is set up with OmniPage SE. Please refer to the scanner’s own
documentation as necessary.
The following conventions are used in this Guide:
Bold        Introduces new terms and presents sub-headings.
Italic      Names sections in this Guide (unless otherwise stated,
            the section is located in the same chapter as the
            reference).
            Names the main buttons used in automatic processing:
            Start, Stop, Finish, Additional.
Non-serif   Presents file names: sample.tif
Note        Presents an item of additional information.
Tip         Presents ideas for using program features to accomplish
            specific tasks.
GETTING ONLINE HELP
In addition to using this Guide, you can use OmniPage SE’s online
Help to learn about features, settings, and procedures. Online Help is
available after you install OmniPage SE.
Online HTML Help
Open OmniPage SE’s online Help at its top level by choosing
OmniPage SE Help Topics at the top of the Help menu. This allows
you to see topics arranged in a Table of Contents, search an
alphabetical list of keywords or make full-text searches through the
topics. Other items in the Help menu provide access to useful topics
or web pages.
Press F1 as you are working with the program to see an online help
topic relating to the current screen area, dialog box or warning
message.
Context-Sensitive Help
You can get concise on-the-spot information in a popup window
about a particular OmniPage SE menu item, toolbar button, screen
area or dialog box, in the following ways:
Click the Help button in the Standard toolbar to get the help icon.
Click this on any item on the desktop outside a dialog box or warning
message.
Press Shift + F1 to get the same help icon.
Click the question mark button in the upper right corner of a dialog
box and then click an item in the dialog box to see the popup window.
Some dialog boxes or warning messages have their own Help button,
or a help text. Click the button or the text to get information on the
dialog or message box.
Click anywhere to remove a context-sensitive popup Help window.
Tech Notes
ScanSoft’s web site at www.scansoft.com contains Tech Notes on
commonly reported issues using OmniPage SE. Web pages may also
offer assistance on the installation process and troubleshooting.
Glossary
This Guide does not include a glossary. The online Help has a
comprehensive glossary, with its own alphabetical index and a table of
contents. Please consult it if you want to find the meaning of a term
used in this Guide or in the program.
OMNIPAGE SE
The product you have is a Special Edition of the world-renowned
OmniPage Pro™ software. This edition has been developed for
distribution by selected scanner manufacturers and contains a subset
of the features of the OmniPage Pro 11 product. This Guide and the
online Help describe the features of the full product, using an SE icon
to document the differences between the two products.
If you find the additional features of the professional product would
be of benefit to you, you can use online facilities to upgrade your
Special Edition to OmniPage Pro 11.
1
Installation and setup
This chapter provides information on installing and starting OmniPage
SE. It presents the following topics:
• System requirements
• Installing OmniPage SE
• Setting up your scanner with OmniPage SE
• How to start the program
• Registering your software
• New features in OmniPage Pro 11
• OmniPage SE and OmniPage Pro 11
SYSTEM REQUIREMENTS
Your system must meet the following minimum requirements to install and
run OmniPage SE:
• A computer with a Pentium or higher processor
• Microsoft Windows 95, Windows 98, Windows ME, Windows
2000, or Windows NT 4.0
• 32MB of memory (RAM), 64MB recommended
• 75MB of free hard disk space for the application files plus 10MB
working space during installation
• 9MB for Microsoft Installer (MSI) if not present and 44MB for
Internet Explorer if not present. (These are present as part of the
operating system in Windows 98, Windows ME and Windows
2000.)
• SVGA monitor with 256 colors and 800 x 600 pixel resolution
• Windows-compatible pointing device
• CD-ROM drive for installation
• A compatible scanner if you plan to scan documents. Please see
the Scanner Guide at ScanSoft’s web site (www.scansoft.com) for
a list of supported scanners.
Note Performance and speed will be enhanced if your computer’s
processor, memory, and available disk space exceed minimum
requirements.
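The disk space figures above can be added up for a given machine. The following is only an illustrative sketch in Python, not part of OmniPage SE: the function names and the two flags are hypothetical, and the megabyte values are the ones quoted in the list of requirements.

```python
import shutil

# Figures quoted in the system requirements list (megabytes).
APP_FILES_MB = 75          # application files
INSTALL_WORKSPACE_MB = 10  # working space during installation
MSI_MB = 9                 # Microsoft Installer, if not already present
INTERNET_EXPLORER_MB = 44  # Internet Explorer, if not already present


def required_disk_mb(has_msi, has_internet_explorer):
    """Return the free disk space (MB) needed while installing."""
    total = APP_FILES_MB + INSTALL_WORKSPACE_MB
    if not has_msi:
        total += MSI_MB
    if not has_internet_explorer:
        total += INTERNET_EXPLORER_MB
    return total


def has_enough_space(path, needed_mb):
    """Compare the free space on the drive holding `path` with the need."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= needed_mb


# Hypothetical case: a system with neither MSI nor Internet Explorer.
worst_case = required_disk_mb(has_msi=False, has_internet_explorer=False)
print("Space needed:", worst_case, "MB")
print("Enough space on this drive:", has_enough_space(".", worst_case))
```

With both components already present, the same sum comes to 85MB (75MB plus 10MB).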
INSTALLING OMNIPAGE SE
OmniPage SE’s installation program takes you through installation with
instructions on every screen.
Before installing OmniPage SE:
• Make sure your scanner is connected, turned on, and compatible
with your system.
• Close all other applications, especially anti-virus programs.
• Log into your computer with administrator privileges if you are
installing on Windows 2000 or Windows NT.
• If you have previous OmniPage software on your system, the
installer will ask for your consent to uninstall that software first.
To install OmniPage SE:
1. Insert OmniPage SE’s CD-ROM in the CD-ROM drive. The
installation program should start automatically. If it does not start,
locate your CD-ROM drive in Windows Explorer and double-click
the Autorun.exe program at the top level of the CD-ROM.
2. Choose a language to use during installation. This language will be
used for the Text-to-Speech system and as the program’s interface
language. The program interface language is used for displays such as
menu items, dialog boxes, warning messages and so on. You can
change the interface language later from within OmniPage SE, but
your choice at installation time determines which Text-to-Speech
system will be installed with the program. References to the
Text-to-Speech facility do not apply to OmniPage SE.
3. Follow the instructions on each screen to install the software. All files
needed for scanning are copied automatically during installation.
Note Sometimes uninstalling and then reinstalling OmniPage SE will
solve a problem. See Uninstalling the software at the end of chapter 6.
Note In OmniPage Pro 11, Text-to-Speech is available for English
(British and US), French, German, Italian, Portuguese or Spanish. This is
not available in OmniPage SE. See also the section Reading text aloud in
chapter 4.
SETTING UP YOUR SCANNER WITH OMNIPAGE SE
All files needed for scanner setup and support are copied automatically
during the program’s installation. Before using OmniPage SE for
scanning, your scanner should be correctly installed and tested for correct
functionality.
Scanner installation and setup are done through the Scanner Wizard. You
can start this yourself, as described below. Otherwise, the Scanner Wizard
appears when you first attempt to perform scanning from OmniPage SE.
Please follow these steps to use the Scanner Wizard to set up your scanner
with OmniPage SE:
• Choose Start > Programs > ScanSoft OmniPage SE > Scanner
Wizard,
or click the Setup button in the Scanner panel of the Options
dialog box,
or choose a scan command in the Get Page drop-down list in the
OmniPage Toolbox.
• Choose Select scanning source, then click Next.
• Click once on your scanner’s TWAIN driver to select it, then
click Next.
• Choose Yes to test your scanner configuration, then click Next.
• The wizard will now test the connection from the computer to
your scanner. Click on Next.
• Insert a test page into your scanner.
• The wizard is now prepared to do a basic scan using your scanner
manufacturer’s software. Click on Next.
• Your scanner’s native user-interface will appear. Click on Scan to
begin the sample scan.
• If necessary, click on Inverse Image… or Missing Image… and
make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
• Select the item that most appropriately describes your scanner,
then click on Next.
• Click on Next to proceed to page size.
• The page sizes that the Scanner Wizard believes your
scanner supports are listed in the window. To make any changes
to the page sizes, click on Advanced, make the changes and then
click on Next.
• Insert a page with text but no pictures into your scanner. Click
on Next to begin a scan in black and white mode.
• If necessary, click on Inverse Image… or Missing Image… and
make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
• If you have a color scanner, insert a color photograph or a page
with a color picture into your scanner. Click on Next to begin a
scan in color mode. If necessary, click on Inverse Image… or
Missing Image… and make the appropriate selections. Once the
image appears correctly in the window, click on Next. If your
scanner cannot scan in color, skip this step.
• Insert a photograph or a page containing a picture into your
scanner. Click on Next to begin a scan in grayscale mode. If
necessary, click on Inverse Image… or Missing Image… and
make the appropriate selections. Once the image appears
correctly in the window, click on Next.
• You have successfully configured your scanner to work with
OmniPage SE! Click on Finish.
To change the scanner settings at a later time, or to set up a different
scanner, or to test and repair an installed scanner, please follow one of
these two methods to reopen the Scanner Wizard:
• Start > Programs > ScanSoft OmniPage SE > Scanner Wizard, or
• Start > Programs > ScanSoft OmniPage SE > OmniPage SE >
Tools menu > Options > Scanner… > Setup button.
Note To test and repair an improperly functioning scanner, follow the
procedure above, selecting ‘Test and configure current scanning source’ at
the start of the process.
HOW TO START THE PROGRAM
To start OmniPage SE do one of the following:
• Click Start in the Windows taskbar and choose
Programs > ScanSoft OmniPage SE > OmniPage SE.
• Double-click the OmniPage SE icon in the program’s installation
folder or on the Windows desktop if you placed it there.
• Double-click an OmniPage Document (OPD) icon or file name;
the clicked document is loaded into the program. See OmniPage
Documents in chapter 2.
On opening, OmniPage SE’s title screen is displayed and then its desktop.
See chapter 2 for an introduction to OmniPage SE’s desktop.
There are several ways of running the program with a limited interface:
• Use the Schedule OCR program. Click Start in the Windows
taskbar and choose Programs > ScanSoft OmniPage SE >
Schedule OCR. See Processing documents with Schedule OCR in
chapter 3.
• Click Acquire Text from the File menu of an application
registered with the Direct OCR™ facility. See How to set up
Direct OCR in chapter 3.
• Right-click an image file icon or file name for a shortcut menu.
Select a sub-menu item from ‘Convert To...’ to define a target.
• Use OmniPage SE with ScanSoft’s PaperPort® or Pagis®
document management products, to add OCR services. See How to
use OmniPage SE with your PaperPort software in chapter 3.
REGISTERING YOUR SOFTWARE
ScanSoft’s registration Wizard runs at the end of installation. We provide
an easy electronic form that can be completed in less than five minutes.
When the form is filled in and you click Send, the program will search for
an Internet connection to perform the registration online immediately.
If you did not register the software during installation, you will be
periodically invited to register later. You can go to www.scansoft.com to
register online. Click on Support and from the main support screen
choose Register in the left-hand column.
For a statement on the use of your registration data, please see ScanSoft’s
Privacy Policy.
NEW FEATURES IN OMNIPAGE PRO 11
The OmniPage® product family is augmented by OmniPage Pro 11 and
OmniPage SE. This section lists enhancements introduced in the
professional product OmniPage Pro 11. Some of these are incorporated
in OmniPage SE, as detailed in the next section.
New features in OmniPage Pro 11 compared to OmniPage Pro 10 are:
• Greater accuracy - redeveloped recognition engines make
OmniPage Pro 11 the most accurate OmniPage ever.
• Improved page layout - OmniPage Pro 11 will allow you to retain
formatting that is true to the original, even on pages with
non-gridded tables, headers and footers and dropped capitals.
• More intelligent proofreading - new IntelliTrain feature
automatically uses previous corrections to generate better OCR
results.
• PDF capability - now you can import PDF files (even read-only
files) and convert them to your favorite program files (Word,
Excel, etc.). You can also create PDF files from any paper
document or image files.
• Better HTML - new WYSIWYG (What You See Is What You
Get) HTML output will handle graphics, text, and backgrounds
to keep your web output looking like the original document.
• Language support - OmniPage Pro 11 now supports over 100
languages and extends to the Greek and Cyrillic alphabets.
• Detail view - this provides more customizable information about
each page, making it easier to handle pages in a document.
• Text Editor - a new fully-featured WYSIWYG editor for
recognition results, with a wide range of editing tools, color
support, and a choice of four formatting levels for display and
export.
• Better results on degraded text - a new despeckle module
significantly reduces errors on spotty, shaded and color
backgrounds.
OMNIPAGE SE AND OMNIPAGE PRO 11
This list documents features which are not incorporated in OmniPage
SE, but which can become available by upgrading to OmniPage Pro 11:
• Significant improvement in recognition accuracy.
• Access to the IntelliTrain character training facility.
• Ability to open and read the contents of PDF files.
• Ability to save recognized documents to PDF format.
• Ability to open TIFF FX image files.
• Handling LZW TIFF and GIF image files for input and output.
• Support for WYSIWYG HTML 4.0 output.
• Language support rises from about 50 to over a hundred.
• Access to text-to-speech software, allowing recognized texts to be
read aloud.
For more information or to upgrade, please visit www.scansoft.com,
make a selection from the country/continent list if you prefer a different
language, then click on the OmniPage icon.
2
Introduction
You probably use your computer for business correspondence, preparing
reports, handling data and an ever-increasing number of other uses. The
challenge is that, in spite of the digital revolution, certain sources of
information still circulate in printed, paper form and cannot be used
immediately in a computer.
For example, if you want to incorporate information from a magazine
article in a report you are preparing, you somehow have to get the text
from the article into your computer. Painstakingly retyping the article is
not an appealing solution.
This chapter introduces you to the solution: optical character recognition
(OCR). It describes how OmniPage SE uses OCR technology to
transform text from scanned pages or image files into editable text for use
in your favorite computer applications.
The chapter includes the following sections:
• What is optical character recognition
• Documents in OmniPage SE
• Basic processing steps
• The OmniPage SE desktop
• Managing documents
• OmniPage Documents
• Settings
WHAT IS OPTICAL CHARACTER RECOGNITION
Optical character recognition is the process of extracting text from an
image. This image can result from scanning a paper document or
opening an electronic image file.
Images do not have editable text
characters; they have many tiny dots (pixels) that together form character
shapes. These present a picture of the text on a page.
During OCR, OmniPage SE 11 analyzes the character shapes in an image
and defines solutions to produce editable text. After OCR, you can save
the resulting text to a variety of word-processing, desktop publishing or
spreadsheet applications.
OmniPage SE’s OCR capabilities
In addition to text recognition, OmniPage SE can retain the following
elements of a document through the OCR process.
Graphics
Photos, logos, and drawings are examples of graphics.
Text formatting
Font types, sizes and styles (such as bold, italic and underlines) are
examples of character formatting. Indents, tabs, margins and line spacing
are examples of paragraph formatting.
Page formatting
Column structure, table formats, and placement of graphics and headings
are examples of page formatting.
The graphics, text and page formatting elements that OmniPage SE
retains are determined by the settings you select. Refer to the Settings
Guidelines in the online Help for more information about selecting
settings.
Note OmniPage SE only recognizes machine-generated characters such
as offset or laser-printed or typewritten text. However, it can retain
handwritten text, such as a signature, as a graphic.
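The idea that a page image holds only pixels, from which character shapes have to be inferred, can be illustrated with a toy example. The sketch below is written in Python and is purely an assumption for illustration: it compares a tiny 5 x 5 bitmap with three hand-made templates and picks the closest one. It does not describe how OmniPage SE's recognition engines actually work, and all names in it are hypothetical.

```python
# Toy 5 x 5 bitmaps: '#' is a dark pixel, '.' is a light one.
TEMPLATES = {
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
}


def count_matching_pixels(shape_a, shape_b):
    """Count the positions where two equally sized bitmaps agree."""
    return sum(
        pixel_a == pixel_b
        for row_a, row_b in zip(shape_a, shape_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognize(shape):
    """Return the template character whose bitmap agrees with `shape` most."""
    return max(TEMPLATES, key=lambda ch: count_matching_pixels(shape, TEMPLATES[ch]))


# A scanned 'T' with one stray dark speck in the fourth row.
scanned = ["#####", "..#..", "..#..", ".##..", "..#.."]
print(recognize(scanned))  # prints T
```

The stray speck leaves 24 of the 25 pixels in agreement with the T template, which is still more than for the other two templates.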
Documents in OmniPage SE
OmniPage SE handles documents one at a time. When you acquire your
first image (from scanner or from file) a new document is started. Further
acquired images are added to the same document, until you save and
close it.
A document in OmniPage SE consists of one image for each document
page. After you perform OCR, the document will also contain recognized
text, displayed in the Text Editor, possibly along with graphics and tables.
For more information on screen areas, see the section The OmniPage SE desktop.
Basic processing steps
There are two main ways of handling documents: with automatic
processing or manual processing. See chapter 3, Processing documents automatically and Processing documents manually. The basic steps for both
processing methods are broadly the same:
1. Bring a set of images into OmniPage SE.
You can scan a paper document with or without an Automatic
Document Feeder (ADF) or load one or more image files. The
resulting images appear in miniature in the Document Manager’s
Thumbnail view and the pages are summarized in its Detail view.
The image of the current page is displayed in the Original Image
area.
2. Perform OCR to generate editable text.
During OCR, OmniPage SE creates zones around elements on the
page that will be processed, and then interprets text characters or
graphics in each zone. Manual and template zoning are also possible.
After OCR, you can check and correct errors in the document using
the OCR Proofreader and edit the document in the Text Editor.
3. Export the document to the desired location.
You can save your document to a specified file name and type, place
it on the Clipboard, or send it as a mail attachment. You can save it as
an OmniPage Document (OPD) as described later. You can save the
same document repeatedly to different destinations, different file
types, with different settings and levels of formatting. See chapter 5.
THE OMNIPAGE SE DESKTOP
OmniPage SE’s desktop has a title bar and a menu bar along the top and
a status bar along the bottom. It has three main working areas, separated
by splitters: the Document Manager, the Original Image area and the
Text Editor. The Document Manager has two tabbed panels: Thumbnail
view and Detail view. The Original Image area has an Image toolbar and
the Text Editor has a Formatting toolbar.
[Figure: the OmniPage SE desktop, with the following parts labeled.]
• Standard toolbar
• OmniPage Toolbox
• Formatting toolbar
• Image toolbar
• Thumbnail view shows a picture of each page in the document. The
current page has a pale border. An icon marks a page that has been
recognized.
• Page navigation buttons
• Buttons to show, hide or rearrange the working areas.
• Original Image area: This displays the image of the current page,
together with any zones automatically or manually placed on the image.
• Drag the splitter to left or right to resize the working areas.
• The Text Editor view buttons offer four formatting levels.
• Text Editor: This is displaying the recognition results from the
current page in True Page™ view.
Note To control which of the three views (Document Manager,
Original Image, and Text Editor) are displayed, check or uncheck each
view from the View menu or with the status bar buttons.
The OmniPage Toolbox lets you control processing. It can have three
states, depending on which of the three tab buttons on the left is clicked. In
the picture, we display its appearance for Manual OCR. We show the
program with a three-page document. Page one is the current page, which
has been recognized and proofed. Page two has been recognized but not
proofed yet. Page three has been acquired and manually zoned, but not
recognized yet. The icons at the bottom right of the thumbnail images
show page status.
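The three-page example above can be pictured as a list of pages, each carrying one of the four statuses that the Document Manager displays (Acquired, Zoned, Recognized, Proofed). The Python sketch below is an illustrative model only; the class and function names are hypothetical and do not correspond to anything in the program.

```python
from enum import IntEnum


class PageStatus(IntEnum):
    """The four page statuses, in the order a page passes through them."""
    ACQUIRED = 1
    ZONED = 2
    RECOGNIZED = 3
    PROOFED = 4


def pages_awaiting_recognition(statuses):
    """Page numbers (1-based) that have not been recognized yet."""
    return [n for n, s in enumerate(statuses, start=1) if s < PageStatus.RECOGNIZED]


def pages_awaiting_proofing(statuses):
    """Page numbers (1-based) that are recognized but not fully proofed."""
    return [n for n, s in enumerate(statuses, start=1) if s == PageStatus.RECOGNIZED]


# The three-page document described above.
document = [PageStatus.PROOFED, PageStatus.RECOGNIZED, PageStatus.ZONED]
print(pages_awaiting_recognition(document))  # prints [3]
print(pages_awaiting_proofing(document))     # prints [2]
```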
Status bar buttons let you show, hide or rearrange the main screen areas
and move to other pages in the document. A right mouse click in any
screen area brings up a shortcut menu with the most useful commands
for that area.
The Standard toolbar
The Standard toolbar contains buttons and a drop-down list for
performing standard tasks. It can be floated and docked to any edge of
the OmniPage SE desktop. All these functions can also be accessed from
menus.
[Figure: the Standard toolbar, with the following buttons labeled.]
• New: start a new document.
• Open an OmniPage Document.
• Save the current document under the name and type of its last save.
• Print images or recognition results from all or selected pages.
• Proofread the recognized text.
• Cut the current selection in the Text Editor.
• Copy the current Text Editor selection.
• Paste selection into the Text Editor.
• Undo the last editing action.
• Open the Options dialog box.
• Zoom the active area: Original Image or Text Editor.
• Context-sensitive Help
The Menu bar
For concise information on any menu item, click the context-sensitive
help button and then click a menu item. A popup text explains the
purpose of the menu item. Click anywhere to close the popup.
The Image toolbar
The Image toolbar contains buttons that allow you to zoom in or out on
the current image or to rotate it. They also allow you to work with zones
and table dividers on the page. See chapter 3, Manual zoning and Table
grids in the image. Here we summarize the purpose of the buttons. The
Image toolbar can be floated (that is, undocked and moved anywhere on
the desktop). It can be docked to any edge of the Original Image area.
[Figure: the Image toolbar, with the following buttons labeled.]
• Draw rectangular zones.
• Draw irregular zones.
• Add to a zone or combine zones.
• Subtract from zone or separate zones.
• Reorder zones.
• Zone properties
• Move row or column dividers in a table.
• Insert column dividers in a table.
• Insert row dividers in a table.
• Remove row or column dividers one by one.
• Remove/replace all row and column dividers.
• Rotate images.
• Zoom in on page image.
• Zoom out from page image.
Tip You can also resize or rotate the original image with a shortcut
menu. Right click in the Original Image area outside a zone and select a
zoom or rotation value.
The Formatting toolbar
The Formatting toolbar contains buttons that allow you to edit
recognized text in the Text Editor. See Text and image editing in chapter
4. Here we summarize the purpose of the buttons. The Formatting
toolbar always remains along the top of the Text Editor.
[Figure: the Formatting toolbar, with the following controls labeled.]
• Paragraph styles
• Font name
• Font size
• Bold
• Underline
• Paragraph alignment
• Bullets
• Italic
• Show/hide non-printing characters.
The OmniPage Toolbox
This Toolbox lets you drive the processing. By default it is located along
the top of the OmniPage SE desktop, just above the working areas. It can
be floated and also be docked along the bottom of the desktop.
It has three tabs on the left: AutoOCR™, Manual OCR and OCR
Wizard. Click one to see its controls in the Toolbox. The picture at the
beginning of this section showed the OmniPage desktop with the Manual
OCR toolbar. The AutoOCR toolbar looks like this.
Automatic processing is started, and can be stopped and re-started with
the buttons on the right of the toolbar. The use of these buttons is
explained in Processing documents automatically in chapter 3. The effects
of other settings are also described in chapter 3, Tutorial: Processing
documents.
You can switch between automatic and manual processing any time the
program is not busy with processing. That means you can switch between
them while you are working within a document. You can automatically
process some pages, then add more pages with manual processing. After
processing a stack of pages automatically, you can inspect the results and
then go back to reprocess certain pages manually. This procedure is
described in chapter 3 in the section Processing a document automatically and finishing it manually.
OmniPage SE must be empty when you start the OCR Wizard. See the
section Processing documents using the OCR Wizard in chapter 3. When
you have used the OCR Wizard to process and save a document, it
remains in the program and can be further processed (adding more pages,
rerecognizing pages etc.) with either manual or automatic processing.
MANAGING DOCUMENTS
The Document Manager is situated on the left of the OmniPage SE
desktop. It has two tabbed panels: Thumbnail view and Detail view.
Click a tab to see its view. Both views summarize the pages in the
document and are synchronized: the current and selected pages remain
the same when you switch views. Our pictures show the two views with
the same four-page document. Pages 1 and 2 are selected and page 4 is
the current page, that is, the one shown in the Original Image area. The
Document Manager shows page status with the following icons:
Page  Status      Page image has been...
1     Acquired    Acquired with no manual or template zones and has
                  not yet been recognized.
2     Zoned       Acquired and manual or template zones have been
                  placed; not yet recognized.
3     Recognized  Recognized, but not proofread, or proofing was
                  interrupted on the page.
4     Proofed     Recognized, and proofing has reached the end of
                  the page.
[Table columns Thumbnail icon and Detail icon: icon images. A dash
appears in place of an icon for Acquired and Zoned.]
Thumbnail view
This presents a vertical set of numbered thumbnail images, one for each
page in the document. Scroll to see pages as necessary. The current page has
a paler background and its page number text appears bold. You can select
multiple pages in the document; these have a ‘pushed-in’ appearance. A
status icon appears at the bottom right of each page as described above.
Jump to a page: Click the icon of the desired page.
Reorder a page: Click the thumbnail of the page you want to move and
drag it above the desired page number. Pages are renumbered
automatically.
Delete a page: Select the thumbnail of the page you want to delete and
press the Delete key.
Select multiple pages: Hold down the Shift key and click two
thumbnails to select all pages between and including them. Hold down
the Ctrl key as you click thumbnails to add pages to a selection one by
one. Then you can move or delete the selected pages as a group, or send
them to (re)recognition.
Detail view
This facility is new to OmniPage SE. It provides an overview of your
document with a table. Each row represents one page. Columns present
statistical or status information for each page, and (where appropriate)
document totals. The picture below shows the default columns on the left
and four columns which a user has specified.
[Picture: Detail view, with these callouts:]
• Move the cursor onto the page’s status icon to see a thumbnail of the page.
• One column shows the number of zones of each type on the page.
The current page is shown with a highlight. You can use Detail view for
page operations, as follows:
Jump to a page: Click the row of the desired page.
Reorder a page: Click the row of the page you want to move and drag it
to the desired location. An arrow indicator on the left shows where the
page will be inserted. Pages are renumbered automatically.
Delete a page: Select the row of the page you want to delete and press the
Delete key.
Select multiple pages: Hold down the Shift key and click two page rows
to select all pages between and including them. Hold down the Ctrl key
as you click rows to add pages to a selection one by one. Then you can
move or delete the selected pages as a group, or send them to
(re)recognition.
When multiple pages are being selected, the page set as current does not
change. All selected pages are highlighted.
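The page operations listed for the two views follow a simple model, sketched here in Python. This is an illustration only and not part of OmniPage SE: the class DocumentManager and its method names are hypothetical, and the handling of the current page after a deletion is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentManager:
    """Hypothetical model of the Document Manager page list."""
    pages: list                                  # page images, in document order
    current: str = ""                            # page shown in the Original Image area
    selected: set = field(default_factory=set)   # pages chosen for a group operation

    def jump_to(self, page):
        # Clicking a thumbnail or a row makes that page the current page.
        self.current = page

    def shift_select(self, first, second):
        # Shift+click on two pages selects them and every page between them.
        i, j = sorted((self.pages.index(first), self.pages.index(second)))
        self.selected = set(self.pages[i:j + 1])

    def ctrl_select(self, page):
        # Ctrl+click adds one page to the selection; the current page is unchanged.
        self.selected.add(page)

    def move(self, page, before):
        # Dragging a page above another page reorders the list.
        if page == before:
            return
        self.pages.remove(page)
        self.pages.insert(self.pages.index(before), page)

    def delete_selected(self):
        # Pressing Delete removes all selected pages as a group.
        self.pages = [p for p in self.pages if p not in self.selected]
        self.selected = set()
        if self.current not in self.pages:
            # Assumption: fall back to the first remaining page, if any.
            self.current = self.pages[0] if self.pages else ""

    def numbered(self):
        # Page numbers are list positions, so renumbering is automatic.
        return list(enumerate(self.pages, start=1))


doc = DocumentManager(pages=["p1.tif", "p2.tif", "p3.tif", "p4.tif"], current="p4.tif")
doc.shift_select("p1.tif", "p2.tif")    # pages 1 and 2 selected, page 4 stays current
doc.move("p4.tif", before="p1.tif")     # the last page becomes page 1
print(doc.numbered())
```

Keeping the selection separate from the current page mirrors the description above: selecting several pages never changes which page is current.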
Tip Get image size information by hovering the cursor over a thumbnail
or outside a zone on an original image. A popup text displays the image
size in pixels and the program’s unit of measurement. Image resolution is
also shown.
Customizing columns in Detail view
You can specify which columns of information you want to see in Detail
view. Click Customize Details... in the View menu for the following
dialog box:
[Picture: dialog box, with these callouts:]
• This item is highlighted.
• Click a checkbox to select the item.
• Image sizes are expressed in pixels.
• Define a width for the highlighted item.
• Highlight an item and use these arrows to change the order of columns.
Define which columns should appear, their widths, and column order.
The topic Customizing Detail view columns in online Help clarifies what is
presented in each column. You can change column widths easily in Detail
view; just drag the column dividers in the title bar.
Deleting pages from a document
Page deletions must be confirmed and can be undone. Delete the current
page only with the item Delete Current Page in the Edit menu. Delete all
selected pages in the Document Manager (either view) by pressing the
Delete key or using the shortcut menu command Clear.
Printing a document
You can print the document with the Print item in the File menu.
Choose whether to print images or text (that is, recognition results as
they appear in the Text Editor). You can print all pages or a range of
pages. The Print button in the Standard toolbar prints images or text,
depending whether the Original Image area or the Text Editor is active.
Closing a document
Choose Close in the File menu to close a document. You are prompted to
save your document if you have not saved it or you have modified it since
the last save. See the next section on saving the document as an
OmniPage Document (*.opd). You will also be prompted to save unsaved
training data if you selected ‘Prompt to save IntelliTrain data when
closing document’ in the Proofing panel of the Options dialog box.
The last sentence does not apply to OmniPage SE.
OMNIPAGE DOCUMENTS
The OmniPage Document is the program’s proprietary file type; it has
the extension .opd. It is one of the file types offered when saving a
document to file. You save the document to the OPD file type if you
want to work with it again in OmniPage SE during a future session. You
can then process unfinished pages, add more pages and proof or edit
recognition results.
An OmniPage Document contains the original page images with any
zones placed on them. After recognition, the OPD also contains the
recognition results. Recognized characters are stored along with their
coordinate and confidence data. This preserves the links between image
and text, so that verification and proofing remain available when the
OPD is reopened in future sessions.
When you save an OmniPage Document, the current settings (and
unsaved training) are also saved. When you open an OmniPage
Document, its settings are applied, temporarily replacing those existing in
the program.
Why save to OPD
You do not have to save your documents to the OPD file type. You would
typically do this for the following reasons:
• You cannot finish working with the document in the current session.
• You want to pass the document to other users who have OmniPage
  SE or OmniPage Pro 11. For example, you can pass an OPD file to a
  specialist for proofing. In an office network, you may have one
  scanner generating images for recognition and proofing at several
  workstations.
• You want to build up an archive of recognized documents whose
  original images remain accessible. The recognized texts allow
  searching by keywords and other document retrieval techniques.
Note Recognition results should be saved away from OPD files before
installing any OmniPage upgrade. These files may not be upwards
compatible to newer OPD file formats, or possibly only the images will
be retained when the files are upgraded.
How to save to OPD
If you intend to create an OPD, you can save it to this format at an early
stage, for protection. Use the Save button to save it periodically as you
work. Save it again at the end of your session.
The Save button saves the document to the name and file type of its last
save. You can save your document repeatedly to different formats. If your
first save was to another format (for instance .DOC), use the item Save
As... from the File menu to save it as an OPD. If a document is saved as
an OPD, then you later save it to another format, it is not automatically
resaved as an OPD. When you close the document or exit the program,
you will be prompted to save the document as an OPD.
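The save rules above can be pictured with a short Python sketch. It is a hypothetical model, not the program's own logic: the class SaveState and its methods are invented for illustration, and it assumes that the closing prompt depends on whether the document has changed since it was last saved as an OPD, which the text does not spell out.

```python
class SaveState:
    """Hypothetical bookkeeping for the Save button and the OPD prompt."""

    def __init__(self):
        self.last_target = None        # (file name, file type) of the most recent save
        self.opd_up_to_date = False    # True when an OPD holds the latest changes

    def save_as(self, name, file_type):
        # Save As... sets the name and file type used by later Save clicks.
        self.last_target = (name, file_type)
        if file_type == "opd":
            self.opd_up_to_date = True
        return self.last_target

    def save(self):
        # The Save button reuses the name and file type of the last save.
        if self.last_target is None:
            # Simplification: the sketch has no dialog to fall back on.
            raise ValueError("no earlier save; use save_as first")
        return self.save_as(*self.last_target)

    def modify(self):
        # Any change to the document makes an earlier OPD out of date.
        self.opd_up_to_date = False

    def prompt_on_close(self):
        # Saving to another format never refreshes the OPD.
        return not self.opd_up_to_date


state = SaveState()
state.save_as("report", "opd")   # early save to OPD, for protection
state.modify()                   # more proofing or editing
state.save_as("report", "doc")   # export to a word processor format
print(state.prompt_on_close())   # the OPD is stale, so a prompt is expected
```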
SETTINGS
The Options dialog box is the central location for OmniPage SE settings.
It has seven panels. Context-sensitive help provides information on each
setting. In overview, the settings panels are:
OCR
Use this to specify recognition language(s), a user dictionary, a reject
character, an OCR method (optimize for speed or accuracy) and font
matching.
Scanner
Use this to define page size and orientation for scanning. You can also
make brightness and contrast settings and define options for scanning
multi-page documents, with or without an Automatic Document Feeder
(ADF). You can change scanner setup settings or install a new scanner or
change the default scanner.
Direct OCR™
This feature provides OCR services directly from your favorite word
processor or similar application. Use this panel to register and unregister
applications for Direct OCR and to enable or disable this service. You can
also specify automatic or manual zoning and whether proofreading is
desired or not.
Process
Use this to define where new images should be placed in the document
and set other preferences governing the behavior of the processing. You
can change the interface language here.
Proofing
Use this to define whether proofreading should begin automatically after
recognition. Define also whether IntelliTrain should run, and use it to
load or work with a training file. For more detail, see chapter 4,
Proofreading OCR results.
The references to IntelliTrain and training files do not apply to
OmniPage SE.
Custom Layout
Use this to describe the layout of your input document pages very
precisely. This gives you maximum control over the auto-zoning process,
instructing it to search or ignore columns, graphics and tables.
Text Editor
Use this to show or hide some features in the Text Editor, to define the
unit of measurement to be used and to turn word wrapping on or off.
Note Some settings have an effect only on future recognition. Examples
are the recognition languages, a training file and scanner brightness.
These settings should be correctly adjusted before you start processing.
To have changes in these settings applied to already recognized pages,
you will have to rerecognize them. Other settings are implemented
immediately in all existing pages. Examples are Text Editor settings like
word wrap and measurement units.
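The distinction made in this Note can be expressed as a small lookup, sketched here in Python. The setting labels are simply the examples named in the Note, and the function name pages_to_rerecognize is hypothetical; the sketch makes no claim about settings that the Note does not mention.

```python
# Examples from the Note: settings that affect only future recognition.
FUTURE_RECOGNITION_ONLY = {"recognition languages", "training file", "scanner brightness"}

# Examples from the Note: settings applied immediately to all existing pages.
IMMEDIATE = {"word wrap", "measurement units"}


def pages_to_rerecognize(changed_settings, recognized_pages):
    """Return the already recognized pages that need rerecognition."""
    if FUTURE_RECOGNITION_ONLY & set(changed_settings):
        # At least one changed setting only acts during recognition.
        return list(recognized_pages)
    # Immediate settings need no further processing.
    return []


print(pages_to_rerecognize(["word wrap"], [1, 2, 3]))
print(pages_to_rerecognize(["recognition languages"], [1, 2, 3]))
```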
3
Tutorial:
Processing documents
This chapter describes different ways you can process a document and
also provides information on key parts of this processing.
• Quick Start Guide
• Processing documents using the OCR Wizard
• Processing documents automatically
• Processing documents manually
• Processing a document automatically and finishing it manually
• Processing from other applications
• Processing documents with Schedule OCR
The detailed topics are:
• Defining the source of page images
• Describing the layout of the document
• Manual zoning
• Table grids in the image
• Using zone templates
QUICK START GUIDE
This topic takes you step-by-step through the basic OCR process.
Loading and recognizing sample image files
You will find sample image files in the program folder, both single-page
and multi-page files. First try reading these files using the procedure
presented below, except for the references to a scanner. See Input from image files for more information on acquiring the images. The results
provide you with a benchmark of the recognition quality you should
expect from your own files of comparable quality.
Next, try scanning a page from your scanner.
Scanning and recognizing a single page
Turn your scanner on and be sure it is working correctly. Choose a page
with good-quality clear text for this test.
We assume OmniPage SE’s default settings are set and that your
document is in the language you specified for interface language during
installation. Open the Options dialog box from the Tools menu and
choose Use Defaults if you are not using the program for the first time.
You will process the document automatically and save the recognition
results to a file. You will proof the document but will not edit it inside
OmniPage SE’s Text Editor.
What you do / What happens
1. Set up your scanner using the Scanner Wizard, if this is not already done.
   What happens: Configures OmniPage SE to work with your scanner.
2. Select Start > Programs > ScanSoft > OmniPage SE.
   What happens: Opens OmniPage SE on your computer.
3. Place the document correctly in your scanner.
4. Check the three tab buttons to the left of the OmniPage Toolbox. The
   AutoOCR button should be selected. If not, click on it.
   What happens: Specifies that you want OmniPage SE to process the
   document automatically according to the given settings.
5. From the Get Page drop-down menu, select a scan option for your
   document: black-and-white, grayscale or color.
   What happens: Allows you to determine how pictures or colored texts
   and backgrounds will look in the exported document. Color scanning
   needs a color scanner.
6. From the Describe Original drop-down menu, check that Automatic is
   selected. For a wide range of documents, this is the best choice.
   What happens: Configures OmniPage SE to place zones on the page
   and decide their properties automatically.
7. From the Export Results drop-down menu, check that Save as File is
   selected.
   What happens: This means you will be able to name your export file
   after you have proofed the document.
8. Click on Start.
   What happens: OmniPage SE will start to scan in your document.
9. The OCR Proofreader appears and invites you to modify words that the
   program suspects have not been recognized correctly.
   What happens: The OCR Proofreader operates like a spell checker in a
   word processing program, but with added OCR-specific features.
10. Click in the Text Editor. Select Text Editor views one after another, to
   see how the page appears in each view. Choose the view you want for
   export.
   What happens: Each Text Editor view defines a formatting level. The
   view set at saving time is applied to the text in the saved file.
11. Click Resume to restart proofing. When the message OCR Proofreading
   is complete appears, click on OK.
   What happens: This ends the OCR Proofreader process. The Save As
   dialog box will appear.
12. Choose the location and file type to save your recognized document.
   Click on OK.
   What happens: By default, Save and Launch is enabled, so your
   document will be automatically opened in the word processing
   program associated with the file type that you selected.
13. Inspect the document in your word processing program.
   What happens: You have successfully used OmniPage SE to recognize
   your document and open it in your target application!
Tip If you succeeded in getting good results from the sample image files,
but not from the scanned page, check your scanner installation and
settings: in particular brightness and image resolution. See Input from scanner for a model of optimum brightness. See also the online Help
topics Setting up your Scanner and Scanner Troubleshooting.
Here is an overview of the processing methods you can use. You will find
step-by-step guidance for each of them in the following pages.
Using the OCR Wizard
The OCR Wizard guides you through the selection of settings and
commands by asking you questions. It then launches automatic processing.
This is a good way to get started if you are new to OmniPage SE.
Automatically
The fastest and easiest way to process documents is to let OmniPage SE
do it automatically for you. Select settings in the Options dialog box and
commands in the AutoOCR toolbar and then click Start. It will take each
page through the whole process from beginning to end, when possible
running in parallel. It will typically auto-zone the pages.
Manually
Manual processing gives you more precise control over the way your
pages are handled. You can process the document page-by-page with
different settings for each page. The program also stops between each
step: acquiring images, performing recognition, exporting. This lets you,
for instance, draw zones manually or change recognition language(s). You
start each step by clicking buttons on the Manual OCR toolbar.
Automatically with manual finishing
You can process a document automatically and view results in the Text
Editor. If most pages are in order, but a few have not turned out as
expected, you can switch to manual processing to adjust settings and
rerecognize just those problem pages.
In other applications
You can use the Direct OCR feature to call on the recognition services of
OmniPage SE while working in your usual word-processor or similar
application. OmniPage SE automatically links itself to ScanSoft’s
PaperPort and Pagis document management programs.
At a later time
You can schedule OCR jobs to be performed automatically at a later
time, when you may not even be present at your computer. The Add Job
Wizard in Schedule OCR allows you to specify settings and a starting
time.
PROCESSING DOCUMENTS USING THE OCR WIZARD
The OCR Wizard takes you through six settings panels, guiding you to
make settings for your document and then launching automatic
processing. Context-sensitive help is available for all Wizard panels. The
OCR Wizard can run only when there is no document open in
OmniPage SE.
Click the OCR Wizard tab in the OmniPage Toolbox and click the
Wizard button to see the first wizard screen:
1. The first panel lets you define your document source: scanner or
image file. For more information, see the section Defining the source
of page images. Answer the questions in the first screen and click Next.
2. The second panel asks you to describe the layout of the input
document, to assist the auto-zoning. For more information, see the
section Describing the layout of the document.
3. The third panel (shown below) lets you define recognition languages
and decide OCR method. Languages with dictionary support are
marked with an icon.
4. The fourth panel lets you define the formatting level to be applied to
your document for display and export. See The editor display and views in chapter 4 for more information.
5. The fifth panel asks if you want to proofread the text before export. If
you choose Yes you can also edit the text before saving. You also
decide whether to create and use IntelliTrain data during proofing.
See chapter 4 for more information. The reference to IntelliTrain
does not apply to OmniPage SE.
6. The last panel asks you to define the export choice: saving to file or
copying to Clipboard. After setting the choice, click Finish to close
the Wizard and start the automatic processing.
7. If you requested proofing and the text contains suspect words, the
OCR Proofreader™ dialog box will appear. When proofing is
finished or closed, recognition results either go directly to the
Clipboard, or the Save As dialog box appears so you can specify file
export settings.
8. The document remains in OmniPage SE. You can edit recognition
results and save it again to other formats. You can change zones
manually or change other settings and then use manual processing to
rerecognize single pages from the document. You can add pages with
automatic or manual processing.
Note The Wizard panels present settings as they were last set in the
program. Also, OmniPage SE will remember the settings you make in the
OCR Wizard panels and apply them to future automatic or manual
processing, until you change them. So, if you have more documents for
which your OCR Wizard settings are suitable, just switch to the AutoOCR
toolbar and click Start.
Note Applicable settings not offered by the OCR Wizard take the values
last set in the program. This concerns mainly scanner settings, a user
dictionary or a training file. Zone templates cannot be used with the OCR
Wizard. If a template file was set when the OCR Wizard starts, it is unloaded
and Automatic is set as input description. You cannot export a recognized
document as a mail attachment. Please use automatic or manual processing
for this.
PROCESSING DOCUMENTS AUTOMATICALLY
Automatic processing provides an efficient way of handling documents,
especially larger ones. First you select all settings needed, then you can use
the AutoOCR™ toolbar in the OmniPage Toolbox to process a new
document from start to finish or to restart and finish processing on an
open document.
1. Click the AutoOCR tab in the OmniPage Toolbox to display the
AutoOCR toolbar.
2. Select the desired Get Page command in the drop-down list. You
define the document source, which can be from image files or from a
scanner. For more detail see the section Defining the source of page images.
3. Select a command from the Describe Original drop-down list, as
shown above. This guides the program in auto-zoning the pages. You
describe the incoming pages or specify a zone template file. For more
information on the choices, see the section Describing the layout of the document.
4. Select a command from the Export Results drop-down list. You can
save the recognized document to file, copy it to Clipboard or send it
as a mail attachment. For information on the choices, see chapter 5.
5. Choose Options in the Tools menu and check that settings are
appropriate for your document. You can, for instance, specify
recognition languages and whether you want to proofread the
document or not. See Settings at the end of chapter 2.
6. Click Start or choose Start in the Process menu. Each page of the
document is processed and finished one after the other. The program
may perform tasks simultaneously, for instance it may start loading
and recognizing a new page as you proofread the previous page.
Command buttons
Start: This lets you begin automatic processing on a new document.
Stop: This lets you interrupt automatic processing. You may do this if
you find that some settings need to be changed. Then the Start button
changes to Finish.
The Start button takes different values when processing is stopped or
finished.
Finish: This appears if processing is incomplete. It lets you:
• Finish processing unfinished pages.
• Export the document, dropping any unrecognized pages.
Additional: This appears if all existing pages are processed and have
been exported once. It lets you:
• Export the document again, maybe with changes, to a
  different file type, name or location, or with a different
  formatting level.
• Add more pages: from the same source or a different source,
  with changed or unchanged settings.
• Re-process all pages: Discard all recognition results and
  rerecognize all pages in the document with different settings.
  You can specify auto-zoning or a template file.
Tip You may reprocess all pages if an unsuitable setting caused poor
results on all pages. An example is incorrect language choice, resulting
in almost all words marked suspect during proofing. ‘Re-process’ lets
you perform rerecognition without having to scan or load or rezone all
the images again.
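The three values of the Start button can be summarized as a small decision function. The Python sketch below is only an illustration of the rules stated above. The function name start_button_label is hypothetical, and treating the page statuses Recognized and Proofed as "processed" is an assumption borrowed from the page status table in chapter 2.

```python
# Assumption: a page counts as processed once it has been recognized.
PROCESSED = {"Recognized", "Proofed"}


def start_button_label(page_statuses, exported_once):
    """Return the label expected on the Start button of the AutoOCR toolbar."""
    if not page_statuses:
        # No pages yet: automatic processing starts on a new document.
        return "Start"
    all_processed = all(status in PROCESSED for status in page_statuses)
    if all_processed and exported_once:
        # Everything is done; further work is optional.
        return "Additional"
    # Some pages are unfinished, or the document has not been exported.
    return "Finish"


print(start_button_label([], False))
print(start_button_label(["Recognized", "Acquired"], False))
print(start_button_label(["Proofed", "Proofed"], True))
```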
PROCESSING DOCUMENTS MANUALLY
Manual processing gives you more precise control over the way your
pages are handled. You can process the document page-by-page with
different settings for each page. The program also stops between each
step: acquiring images, performing recognition, exporting. This lets you,
for instance, draw zones manually on each page. You start each step in the
process by clicking the buttons on the Manual OCR toolbar.
1. Click the Manual OCR tab in the OmniPage Toolbox to display the
Manual OCR toolbar.
2. Click the Options button in the Standard toolbar or Options in the Tools menu to
check or make settings in the Options dialog box. See Settings at the
end of chapter 2.
3. Select the desired value for the Get Page button. You define the
document source, which can be from image files or from a scanner.
Access the scanner settings dialog box and make settings as desired.
For more detail see the section Defining the source of page images.
4. Click the Get Page button. This either brings up the Load File dialog
box allowing you to name image files, or initiates scanning. The
result is one or more images displayed in the Document Manager and
one in the Original Image area.
5. Now you can manually draw and modify zones on one or more images
and assign properties. Status bar buttons let you move to other pages.
Any image without zones will be auto-zoned when recognition is
requested. For guidance, see the section Manual zoning.
6. Select a value for the Perform OCR button. You describe the layout
of the incoming pages. This value has an influence if auto-zoning
runs on any pages. You can also select a template to have its zones
placed on the current page. For more detail see the sections
Describing the layout of the document and Using zone templates.
7. Click the Perform OCR button to have the current page recognized.
To have selected pages recognized, make a multiple selection in the
Document Manager (see Managing documents in chapter 2) and then
click the Perform OCR button.
8. The Zoning Instructions dialog box appears, unless you disabled it.
When you choose one of its options, recognition starts.
9. If you requested proofing, the OCR Proofreader dialog box displays
suspect words one after the other from the recognized page(s). You
can proof and edit the recognized text. See Proofreading OCR results
in chapter 4.
10. Continue loading pages, performing OCR, editing and proofing as
desired.
11. Select a value for the Export Results button. You can save the
recognized document to file (including as an OmniPage Document),
copy it to Clipboard or send it as a mail attachment. You can save the
document more than once; see Saving recognition results in chapter 5.
Note If you deselect ‘Find zones in addition to template/current zones’ in
the Process panel of the Options dialog box, the Zoning Instructions dialog
box will not appear and recognition will always run with current zones only.
PROCESSING A DOCUMENT AUTOMATICALLY AND
FINISHING IT MANUALLY
When you have a large document with only a few pages needing special
attention, you do not have to manually process the whole document. You
can process it automatically and view results in the Text Editor. You can
determine which pages are in order, and which need different settings or
some manual zoning. Then you can switch to manual processing to
adjust settings and zones and rerecognize just those pages.
1. Prepare the document and perform automatic processing, as already
described.
2. If you close or finish proofing you will be invited to save the
document. This is recommended, even if it is not in its final form.
3. Select a page needing rezoning or changed settings and click the
Manual OCR tab at the left of the OmniPage Toolbox.
4. Delete or modify the existing zones in the Original Image area. You
can also load a template to let its zones replace existing ones. Draw
new zones as desired. See Manual zoning.
5. Change other settings as required for the current page. See Settings at
the end of chapter 2.
6. Click the Perform OCR button to rerecognize the current page.
Confirm that the previous recognition results should be overwritten.
The Zoning Instructions dialog box will appear, unless disabled.
7. To rerecognize more than one page, select the required pages in the
Document Manager before clicking the Perform OCR button.
8. When all pages have been rerecognized with acceptable results, save
the document again.
PROCESSING FROM OTHER APPLICATIONS
You can use the Direct OCR feature to call on the recognition services of
OmniPage SE while you work in your usual word-processor or other
application. First you must establish the direct connection with the
application. Then, two items in its File Menu open the door to OCR
facilities.
How to set up Direct OCR
1. Start the application you want connected to OmniPage SE. Start
OmniPage SE, open the Options dialog box at the Direct OCR panel
and select ‘Enable Direct OCR’.
2. The Unregistered panel displays running or previously registered
applications. Select the desired one(s) and click Add. You can browse
for an unlisted application. Select the process options as desired, to
function as preferences.
How to use Direct OCR
1. Open your registered application and work in a document. To
acquire recognition results from scanned pages, place them correctly
in the scanner.
2. Use the File Menu item Acquire Text Settings... to specify settings to
be used during recognition. Any settings not offered take their values
from those last used in OmniPage SE. Settings changed for Direct
OCR are also changed in OmniPage SE.
3. Use the File Menu item Acquire Text to acquire images from scanner
or file.
4. If you selected ‘Draw zones automatically’ in the Direct OCR panel
of the Options dialog box, or under Acquire Text Settings...,
recognition proceeds immediately.
5. If ‘Draw zones automatically’ is not selected, each page image will be
presented to you, allowing you to draw zones manually. Click the
Perform OCR button to start recognition.
6. If proofing was specified, this follows recognition. Then the
recognized text is placed at the cursor position in your application,
with the formatting level specified by Acquire Text Settings... .
Note If OmniPage SE is running when Direct OCR is called from a target
application, a second instance of OmniPage SE is launched.
How to use OmniPage SE with your PaperPort software
PaperPort® is a paper management software product from ScanSoft.
It lets you link pages with suitable applications. Pages can contain
pictures, text or both. If PaperPort exists on a computer when
OmniPage SE is installed, its OCR services become available and
amplify the power of PaperPort. You can choose an OCR program by
right clicking on a text application’s PaperPort link, selecting
Preferences and then selecting OmniPage SE as the OCR package.
OCR settings can be specified, as with Direct OCR.
Here OmniPage SE has been selected as the OCR package for MS
Word 2000. Then you can drag page images from the PaperPort
desktop onto the MS Word link on the PaperPort. While the text is
being recognized, only a progress monitor is displayed. OmniPage
SE’s manual zoning window or proofing facility will appear if
requested. The recognition results are placed in a new unnamed
document in the target application.
PROCESSING DOCUMENTS WITH SCHEDULE OCR
You can schedule OCR jobs to be performed automatically at any time
within the following 24 hours. Each job handles one document. The
document pages can come from a scanner with an ADF or from image files.
You do not have to be present at your computer at job start time, nor does
OmniPage SE have to be running. It does not matter if your computer is
turned off after the job is set up, so long as it is running at job start time. If
you are scanning pages, your scanner must be functioning at job start time,
with the pages loaded in the ADF. Here is how to set up a job:
1. Click Schedule OCR in the Process menu or in the Windows Start
menu: select Programs > ScanSoft > OmniPage SE > Schedule OCR.
2. The Schedule OCR dialog box appears. Click Add Job... to get the
Add Job Wizard. It takes you through six panels, similar to the OCR
Wizard.
3. In the first panel you define image source. An additional feature lets
you process all supported image files in a defined folder.
4. The next three panels are similar to those in the OCR Wizard, but
you can also specify a user dictionary. In OmniPage Pro 11 you can
specify a training file and/or run IntelliTrain. These are not available
in OmniPage SE.
5. The fifth panel lets you specify an export file name, type, location
and a file separation choice.
6. The last panel lets you define the job start time, retain or delete input
files after processing and specify use of a log file to note job completion
and any problems encountered. Click Finish to close the Wizard.
Note The Schedule OCR dialog box lists all jobs, with status Waiting,
Running, Error or Complete. Use Modify Job... to change settings for a
waiting job. You can modify and reuse finished jobs to process new jobs
needing similar settings. You can delete completed jobs when they are no
longer needed.
For more information, please see Scheduling OCR in the online Help.
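The scheduling rules in this section (one document per job, a start time within the following 24 hours, and the four job statuses) can be sketched in Python as follows. The record OcrJob and both functions are hypothetical names chosen for illustration; Schedule OCR is configured through its dialog boxes, not through code, and treating the 24-hour limit as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class OcrJob:
    """Hypothetical record for one scheduled job; each job handles one document."""
    source: str                # "scanner" or a folder of image files
    export_file: str           # export file name chosen in the wizard
    start_time: datetime       # job start time
    status: str = "Waiting"    # Waiting, Running, Error or Complete


def start_time_is_valid(start_time, now):
    # Jobs can be scheduled at any time within the following 24 hours.
    return now <= start_time <= now + timedelta(hours=24)


def due_jobs(jobs, now):
    # Waiting jobs whose start time has arrived are ready to run.
    return [job for job in jobs if job.status == "Waiting" and job.start_time <= now]


now = datetime(2024, 1, 1, 9, 0)
job = OcrJob("scanner", "minutes.doc", now + timedelta(hours=8))
print(start_time_is_valid(job.start_time, now))
```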
DEFINING THE SOURCE OF PAGE IMAGES
There are two possible image sources: from image files and from a
scanner. There are two main types of scanners: flatbed or sheetfed. A
scanner may have a built-in or added Automatic Document Feeder
(ADF), which makes it easier to scan multi-page documents. The images
from scanned documents can be input directly into OmniPage SE or may
be saved with the scanner’s own software to an image file, which
OmniPage SE can later open.
Input from image files
You can create image files from your own scanner, or receive them by email or as fax files. OmniPage SE can open a wide range of image file
types; see Supported file types in chapter 6. Image files are specified in the
Load File dialog box. This appears when you start automatic processing.
In manual processing, click the Load File button or use the Process menu.
The lower part of the dialog box provides advanced settings, and can be
shown or hidden. Here it is displayed.
[Figure: the Load File dialog box with the advanced settings displayed. Callouts:]
• This is the current folder.
• Specify the file type(s) you want listed.
• This can be used for multipage TIFF and DCX files.
• This is a blank image file for the saving option "New file for each blank page".
• Use Shift+clicks or Ctrl+clicks to place more than one file in the File name text box.
• Click Advanced to open the lower panel and Basic to close it.
• Use this to add files one by one from different folders and to control file order precisely.
Normally the Add button places each file at the bottom of the file list. To
place a file at a different location, highlight a file in the list. The new file
will be added immediately below the lowest highlighted file.
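The placement rule for the Add button can be sketched in a few lines of Python. This is only an illustration of the rule described above, not OmniPage code; the function name add_file and the highlighted parameter are hypothetical.

```python
def add_file(file_list, new_file, highlighted=()):
    """Return a new list with new_file placed as the Add button would.

    highlighted holds the list positions (0-based) of highlighted files.
    """
    result = list(file_list)
    if highlighted:
        # Insert immediately below the lowest highlighted file.
        result.insert(max(highlighted) + 1, new_file)
    else:
        # Normal case: the new file goes to the bottom of the list.
        result.append(new_file)
    return result


files = ["page1.tif", "page2.tif", "page4.tif"]
print(add_file(files, "page5.tif"))
print(add_file(files, "page3.tif", highlighted=[0, 1]))
```

With the first two files highlighted, the new file lands between page2.tif and page4.tif, which is how you would control file order precisely.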
Input from scanner
You must have a functioning, supported scanner correctly installed with
OmniPage SE. See Setting up your scanner with OmniPage SE in chapter 1
for more information. You have a choice of scanning modes. In making
your choice, there are two main considerations:
• Which type of output do you want in your export document?
• Which mode will yield best OCR accuracy?
Scan black and white
Select this to scan in black-and-white. This is not suitable if you want
color in your output document, nor if you want pictures to look like so-called
‘black-and-white’ photographs: they need grayscale scanning. For
best OCR accuracy, use this for crisp black texts on a white or light
background. Black-and-white images can be scanned and handled
more quickly than others and occupy less disk space.
Scan grayscale
Select this to use grayscale scanning. Choose this to keep
‘black-and-white’ photographs in the output document. For best OCR accuracy, use
this for pages with varying or low contrast (not much difference between
light and dark) and with text on colored or shaded backgrounds.
Scan color
Select this to scan in color. Available only with color scanners. Choose
this if you want colored graphics, texts or backgrounds in the output
document. For OCR accuracy, it offers no more benefit than grayscale
scanning (for a given resolution), but will require much more time,
memory resources and disk space.
Brightness and contrast
Good brightness and contrast settings play an important role in OCR
accuracy. Set these in the Scanner panel of the Options dialog box. The
diagram illustrates an optimum brightness setting. After loading an
image, check its appearance. If characters are thick and touching, lighten
the brightness. If characters are thin and broken, darken it. Then rescan
the page.
[Figure: sample text at a range of brightness settings, labeled in order: Unsuitable, Tolerable, Good, Best, Good, Tolerable, Unsuitable]
Scanning with an ADF
The best way to scan multi-page documents is with an Automatic
Document Feeder (ADF). Simply load pages in the correct order into the
ADF. Place blank pages if you want to save your document to multiple
output files using the ‘Create a new file at each blank page’ option. See
Saving to file in chapter 5.
If you have a document longer than the capacity of your ADF, select
‘Automatically prompt for more pages’ in the Process panel of the
Options dialog box. Then a dialog box lets you add further page batches
and signal when all pages are scanned.
You can scan double-sided documents with an ADF. A duplex scanner
will manage this automatically. For non-duplex scanners, select ‘Scan
double-sided pages’ in the Scanner panel of the Options dialog box. Then
you can scan the document in just a few passes, with even pages grouped
together and odd pages also grouped. OmniPage SE will merge the pages
for you.
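The merging of double-sided pages can be pictured as interleaving two groups of page images. The Python sketch below is an illustration only and does not show how OmniPage SE does it internally. It assumes a single group of odd pages and a single group of even pages; the function name and the evens_reversed flag (for a stack that was flipped so the last even page was scanned first) are hypothetical.

```python
def merge_double_sided(odd_pages, even_pages, evens_reversed=False):
    """Interleave two scanning passes into reading order.

    odd_pages:  images of pages 1, 3, 5, ... in scan order.
    even_pages: images of pages 2, 4, 6, ... in scan order.
    evens_reversed: True if the even pages were scanned last page first
        (an assumption about the scanning workflow).
    """
    evens = list(reversed(even_pages)) if evens_reversed else list(even_pages)
    # A document has either as many even pages as odd pages, or one fewer.
    if len(odd_pages) - len(evens) not in (0, 1):
        raise ValueError("page counts of the two passes do not match")
    merged = []
    for i, odd in enumerate(odd_pages):
        merged.append(odd)
        if i < len(evens):
            merged.append(evens[i])
    return merged


print(merge_double_sided(["p1", "p3", "p5"], ["p2", "p4"]))
```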
Scanning long documents without an ADF
You can scan multi-page documents efficiently from a flatbed scanner,
even without an ADF. Select ‘Automatically scan pages’ in the Scanner
panel of the Options dialog box, and define a pause value in seconds.
Then the scanner will make scanning passes automatically, pausing
between each scan by the defined number of seconds, giving you time to
place the next page. A dialog box allows you to finish the pause early or
request a longer pause and to specify when the last page is scanned.
DESCRIBING THE LAYOUT OF THE DOCUMENT
Before starting recognition you are requested to describe the layout of the
incoming pages to assist the auto-zoning process. When you use the
OCR Wizard, auto-zoning always runs. When you do automatic
processing, auto-zoning always runs unless you specify a template to be
used on its own. When you do manual processing, auto-zoning
sometimes runs. See online Help for more detail.
Here are your input description choices:
Automatic
Choose this to let the program make all auto-zoning decisions. It decides
whether text is in columns or not, whether an item is a graphic or text to
be recognized and whether to place tables or not. Choose Automatic if
your document contains pages with different or unknown layouts.
Choose it for a page with multiple columns and a table, and for any pages
with more than one table.
Single column, no table
Choose this setting if your pages contain only one column of text and no
table. Business letters or pages from a book are normally like this. Choose
it also for a page with words or numbers arranged in columns if you do
not want these placed in a table or decolumnized or treated as separate
columns. Graphics may be detected.
Multiple columns, no table
Choose this if some of your pages contain text in columns and you want
this decolumnized or kept in separate columns, similar to the original
layout. Columns can be retained in the output document, either with
frames (if True Page is set) or without frames (if Retain Flowing Columns
is set). If tabular data is encountered, it is likely to be treated as flowing
text. Graphics may be detected.
Single column with table
Choose this if your page contains only one column of text and a table.
Auto-zoning will not look for columns but will try to find a table and
place it in a grid in the Text Editor. You can later specify whether to
export it in a grid or as tab separated text columns. Graphics may be
detected.
Spreadsheet
Choose this if your whole page consists of a table which you want to
export to a spreadsheet program, or have treated as a single table. No
flowing text or graphics zones will be detected.
Custom
Choose this for maximum control over auto-zoning. You can prevent or
encourage the detection of columns, graphics and tables. Make your
settings in the Custom Layout panel of the Options dialog box.
Template
Choose a zone template file if you wish to have its zones and properties
applied to all acquired pages from now on. In manual processing the
template zones are also applied to the current page, replacing any existing
zones. Other zones are permitted in addition to template zones. For more
detail, see the section Using zone templates.
If auto-zoning yielded unexpected recognition results, use manual
processing to rezone individual pages and rerecognize them.
MANUAL ZONING
Zones define areas on the page to be processed. Zones are rectangular or
irregular (with sides formed by vertical and horizontal lines). Zones
cannot overlap. They have a zone number in the top left corner and a
zone type icon top right. Click in a zone to select it. Use Shift+clicks for a
multiple selection. Current and selected zones are shaded. Click outside a
zone to remove the selection. Zones appear on an original image in the
following cases:
• The page has been recognized.
• A zone template file was specified in manual processing while the
page was current.
• You have drawn manual zones on the image.
Working with zones
The Image toolbar provides zone editing tools. One is always selected.
When you no longer want the service of a tool, click a different tool.
Normally this will be the Draw Rectangular Zones tool.
Draw rectangular zones
Click this and drag the cursor to define rectangular zones. The new zone
takes its properties from the last drawn or selected zone. You can also
move or resize existing zones when this tool is active.
Draw irregular zones
Click this for a tool allowing you to draw irregular zones. Click and drag
to draw a single line. Repeat until only one line remains undrawn.
Double-click to close the shape. Irregular zones snap to rectangles if you
set them as table type zones. You can also move or resize existing zones
when this tool is active.
Add to zone
Click this to make irregular additions to an existing zone or combine
separate zones into one. You cannot move or resize existing zones when
this tool is active. You cannot use this with a table type zone.
Subtract from zone
Click this to subtract irregular parts from an existing zone or split a zone
into smaller ones. You cannot move or resize existing zones when this tool
is active. You cannot use this with a table type zone.
Reorder zones
Click this for the zone reordering tool. Then click in zones in the desired
reading order. For your order to be respected, choose ‘Use current zones
only’ and avoid having multiple-column or auto-detect zone types on
the page.
Zone properties
Click this for the Zone Properties dialog box. This lets you define zone
type and content for the currently selected zone(s) on the page. You can
also do this from a zone’s shortcut menu. See the next section.
Zone properties
Each zone has a zone type. Zones containing text can also have a zone
contents setting: alphanumeric or numeric. The zone type and zone
contents together constitute the zone properties. Right-click in a zone for
a shortcut menu allowing you to change the zone’s properties. Select
multiple zones to change their properties in one move. The zone
properties button in the Image toolbar can be used for the same purpose.
The following types are available:
Single-column flowing text zone
Use this to have zone contents treated as flowing text, without columns
being found.
Multiple-column flowing text zone
Use this to have zone contents treated as flowing text. The program will try
to detect columns inside the zone. Text will be decolumnized or retained in
columns, depending on the Text Editor view. During recognition, a
multi-column zone may be replaced by separate zones for each column. To do
this, auto-zoning must run, which may also result in changed zone order.
Table zone
Use this to have the zone contents treated as a table. Table grids can be
automatically detected, or placed manually as described in the next
section. Table zones must be rectangular. The Text Editor displays the
table in an editable grid. You can choose whether to export tables in grids
or in columns separated by tabs.
Auto-detect zone
Use this to let the program decide the zone type. To do this, auto-zoning
runs, which may also result in changed zone order on the page. After
recognition you can see the type that was applied. If you use an auto-detect
zone to cover a page area with varied contents, the program may
replace the auto-detect zone with a number of smaller zones.
Graphic zone
Use this to enclose a picture, diagram, drawing, signature or anything you
want transferred to the Text Editor as an embedded image, and not as
recognized text. A graphics zone has a green border. Embedded images
can be exported with the document to target applications supporting
graphics.
Ignore zone
Use this to define a page area you do not want in the Text Editor.
Auto-zoning will not place zones here. To exclude a given page area from many
pages (for example a header or page numbers), place ignore zones in a
template and select ‘Find zones in addition to template/current zones’ in
the Process panel of the Options dialog box.
Zone contents
This is available for zone types containing text. Alphanumeric contents
validates all characters needed for your language choice. Recognition
results from a numeric zone will contain only numbers and number-related punctuation. No letters will be placed.
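One way to picture a numeric zone is as a restriction on which characters the recognizer may choose. The Python sketch below is a simplified illustration, not the program's actual method. It assumes each character position comes with a ranked list of alternatives; the set of permitted characters and the function name are hypothetical.

```python
# Assumed set of digits and number-related punctuation (hypothetical).
NUMERIC_CHARS = set("0123456789.,-+%/() ")


def pick_numeric(candidates):
    """Pick the best permitted character for each position.

    candidates: one list per character position, holding alternatives
    ranked from most to least likely.
    """
    out = []
    for alternatives in candidates:
        # Take the first alternative that is allowed in a numeric zone.
        choice = next((ch for ch in alternatives if ch in NUMERIC_CHARS), None)
        if choice is not None:
            out.append(choice)  # positions with no numeric reading are skipped
    return "".join(out)


print(pick_numeric([["l", "1"], ["O", "0"], [","], ["5"], ["S"]]))
```

Here a shape that looks most like the letter l is read as 1, and a shape with no numeric reading places no letter at all.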
Note Right-click outside a zone for a shortcut menu tailored for the whole
image. It allows you to zoom in or out or rotate the image. When an image is
rotated, all zones on it are deleted.
TABLE GRIDS IN THE IMAGE
After automatic processing you may see table zones placed on a page.
They are denoted with a table zone icon in the top right corner of the
zone. To change a zone to or from a table zone, use its shortcut menu.
You can also draw a table type zone. If there is already a table zone on the
page, select it, then draw the new rectangular zone. It will inherit the
table type. Otherwise draw a rectangular zone and use its shortcut menu
to change it to a table type.
You draw or move table dividers to determine where gridlines will appear
when the table is placed in the Text Editor. You can use the Add or
Subtract tools to enlarge or reduce a table zone, but it must remain
rectangular. You can do this to discard unneeded columns or rows from a
table.
The five table handling tools on the Image toolbar become active if the
current page contains a table type zone. Use them as follows:
Move row or column dividers
Click the tool and move the cursor to the divider to be moved. It displays
a double-headed arrow. Drag the border as desired. You cannot drag it
beyond its neighbor. Avoid placing dividers so they overlap one another
or cut through text. Press the Ctrl key as you drag a column divider, to
move it in the current row only.
Insert column dividers
Click the tool then click at the location in a table zone where you want to
place a column divider. Press the Ctrl key as you click to place the divider
in the current row only.
Insert row dividers
Click the tool then click at the location in a table zone where you want to
place a row divider. Avoid placing a divider on top of another one or so it
cuts through text.
Remove column or row dividers
Click the tool then click on a single divider you want to delete. Do this if
a divider is wrongly located, or if you want to change the appearance of
the table in the final document. For example, you can place two columns
of data in a single column by deleting the divider between the columns.
Remove/replace all dividers
Click this tool and click inside a table zone. Its dividers will all disappear.
Click again to have dividers automatically (re)detected. Divider
placement usually occurs during recognition; clicking twice with this tool
lets you see and edit the dividers before recognition.
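The rule that a divider cannot be dragged beyond its neighbor amounts to clamping the new position between the dividers on either side. The Python sketch below illustrates the idea only. The coordinates, the function name and the gap parameter (a minimum spacing that keeps dividers from overlapping) are hypothetical and not taken from the program.

```python
def move_divider(dividers, index, new_pos, zone_start, zone_end, gap=1):
    """Return a copy of the divider list with dividers[index] moved.

    dividers: sorted positions of the column (or row) dividers.
    The target is clamped so the divider stays between its neighbors,
    or between a neighbor and the zone edge for the first and last divider.
    """
    lower = dividers[index - 1] if index > 0 else zone_start
    upper = dividers[index + 1] if index < len(dividers) - 1 else zone_end
    clamped = max(lower + gap, min(new_pos, upper - gap))
    moved = list(dividers)
    moved[index] = clamped
    return moved


# Dragging the middle divider too far right stops just short of its neighbor.
print(move_divider([100, 200, 300], 1, 350, zone_start=0, zone_end=400))
```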
USING ZONE TEMPLATES
A template is a set of zones, their properties and reading order, stored in a
file. A zone template file can be loaded to have template zones used
during recognition. Load a template file in the Perform OCR drop-down
list or from the Tools menu.
When you load a template with the Manual OCR toolbar, its zones
appear immediately on the current page, replacing any already there.
Existing pages are not affected. The template zones are placed on all
further acquired pages until the template is unloaded. You can modify the
template zones and add new zones before performing recognition.
When you load a template with the AutoOCR toolbar, it does not affect
the current or existing pages. The template zones are placed on all further
acquired pages until the template is unloaded. The Process panel of the
Options dialog box presents the option ‘Find zones in addition to
template/current zones’. If this is turned on during automatic processing,
auto-zoning will run on page areas outside the template zones.
How to save a zone template
Prepare zones on a page. Check their locations, properties and reading order.
Click Zone Template File... in the Tools menu. In the dialog box, select
[zones on page] and click Save.
How to modify a zone template
Load the template and acquire a suitable
image with manual processing. The template zones appear. Modify the
zones and/or properties as desired. Open the Zone Template File dialog
box. The current template is selected. Click Save and then Close.
How to unload a template
Select a non-template setting for layout description in the Perform OCR
drop-down list. The template zones are not removed from the current or
existing pages, but template zones will no longer be used for future
processing. You can also open the Zone Template Files dialog box, select
[none] and click the Set As Current button. In this case, the layout
description setting returns to Automatic.
How to replace one template with another
Select a different template in the Perform OCR drop-down list, or open
the Zone Template Files dialog box, select the desired template and click
the Set As Current button. When the AutoOCR toolbar is active, no
existing zones are changed and the new template is used for future
processing. When the Manual OCR toolbar is active, zones from the new
template are applied to the current page, replacing any existing zones.
How to delete a template file
Open the Zone Template Files dialog box. Select a template and click the
Delete button. Zones already placed by this template are not removed.
Tip Templates accept ignore and auto-detect type zones. A template can
therefore be useful to define which parts of the page to read, and which
parts to ignore.
Note Auto-detect type zones from a template may be replaced during
recognition by smaller ones; specific zone types will be assigned to these
zones. Multi-column zones may also be split into smaller single-column
zones, one for each detected column.
Note Templates and the additional auto-zoning feature are available in
Schedule OCR and Direct OCR, but not in the OCR Wizard.
4
Proofing and editing
Recognition results are placed in the Text Editor. This newly developed
WYSIWYG (What You See Is What You Get) editor offers the following
features, detailed in this chapter:
• Proofreading OCR results
• Checking recognized text against original (Verifying text)
• User dictionaries
• IntelliTrain
• The editor display and views
• Text and image editing
• Reading text aloud
• Page outline
The Text Editor offers four views for displaying its pages. You can switch
freely from one view to another. These provide different levels of
formatting. The views are:
No Formatting view
This displays plain decolumnized text in a single font and font size.
Retain Fonts and Paragraphs view
This displays decolumnized text with font and paragraph styling.
True Page view
This view tries to conserve as much of the formatting of the original
document as possible. Character and paragraph styling is retained. All
page elements, including columns, are placed in frames.
Retain Flowing Columns view
This view is identical to True Page view, except that the reading order of
zones is shown by arrows. This view’s difference from True Page relates
mainly to export, as explained in the section Preparing recognition results for export in chapter 5.
[Figure callouts for the OCR Proofreader dialog box:]
• This tells why the word is suspected.
• This window shows the relevant part of the original image. Click inside it to enlarge or reduce the display.
PROOFREADING OCR RESULTS
After a page is recognized, the recognition results appear in the Text
Editor. Proofreading starts automatically if that was requested in the
Proofing panel of the Options dialog box or in the OCR Wizard. You can
start proofing manually any time the program is not busy. Work as
follows:
1. Click the Proofread OCR button in the Standard toolbar, or choose
Proofread OCR... in the Tools menu.
2. Proofing starts from the beginning of the document, but skips text
already proofed. If a suspected error is detected, the OCR
Proofreader dialog box displays the error and a picture of how it
originally looked in the image.
[Figure: the OCR Proofreader dialog box. Callouts:]
• This is what OmniPage SE thought the word was.
• The image of the suspect word is highlighted.
• Drag a corner or the bottom of the dialog box to resize it.
3. If the recognized word is correct, click Ignore or Ignore All to move
to the next suspect word. Click Add to add it to the current user
dictionary and move to the next suspect word.
4. If the recognized word is not correct, edit the word in the Change to
edit box, or type in the desired word or select a dictionary suggestion.
Click Change or Change All to implement the change and move to
the next suspect word. Click Add to add the word in the Change to
edit box to the current user dictionary and move to the next suspect
word.
5. Color markers are removed from words in the Text Editor as they are
proofread. You can switch to the Text Editor during proofing to
make corrections there. Use the Resume button to restart proofing.
Click Close to stop proofreading before the end of the document is
reached.
CHECKING RECOGNIZED TEXT AGAINST ORIGINAL
After performing OCR, you can compare any part of the recognized text
against the corresponding part of the original image, to verify that the
text was recognized correctly. Work as follows:
1. Double-click any word in the Text Editor or select a word and choose
Verify Text in the Tools menu. The Verify Text window opens and
shows a picture of the original word and its surrounding area. Modify
the word in the Text Editor as necessary.
[Figure callout: This is the original image of the word you are verifying.]
2. Click inside the window to enlarge or reduce the picture. The picture
is enlarged on the first two clicks and reduced on the next two clicks.
[Figure callouts: Close button. This is the word you double-clicked in the Text Editor.]
3. Continue double-clicking words that you want to verify, and
correcting them as necessary. The display changes as you select new
words.
4. Click the Close button to close the verifier window.
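The click behavior of the verifier window in step 2 (enlarge on the first two clicks, reduce on the next two) can be modeled as a small repeating cycle. The Python sketch below is an illustration only. The class name and the numeric zoom levels are hypothetical, and it assumes the cycle starts again after the fourth click.

```python
from itertools import cycle


class VerifyZoom:
    """Model of the enlarge, enlarge, reduce, reduce click cycle."""

    STEPS = (+1, +1, -1, -1)

    def __init__(self):
        self.level = 0  # 0 = initial size
        self._steps = cycle(self.STEPS)

    def click(self):
        """Apply the next step in the cycle and return the new zoom level."""
        self.level += next(self._steps)
        return self.level


zoom = VerifyZoom()
print([zoom.click() for _ in range(4)])
```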
Tip You should proofread and verify texts before doing large-scale
The program has built-in dictionaries for many languages. These assist
during recognition and may offer suggestions during proofing. They can
be supplemented by user dictionaries. You can save any number of user
dictionaries, but only one can be loaded at a time. Your user dictionaries
from Microsoft Word are also available; a dictionary called Custom is the
default user dictionary for Microsoft Word.
Starting a user dictionary
Click Add in the OCR Proofreader dialog box with no user dictionary
loaded or open the User Dictionary Files dialog box from the Tools menu
and click New. You will be asked to name the dictionary immediately.
Loading or unloading a user dictionary
Do this from the OCR panel of the Options dialog box or from the User
Dictionary Files dialog box. Select a dictionary file to load it or [none] to
unload a user dictionary.
Editing a user dictionary
Add words by loading a user dictionary and then clicking Add in the
OCR Proofreader dialog box. You can add and delete words in the User
Dictionary Files dialog box.
Tip While editing a user dictionary, you can import a word list from a
text file to add words to the dictionary quickly.
INTELLITRAIN
IntelliTrain is a newly developed and automated form of training. It takes
input from the corrections you make during proofing. When you make a
change, it remembers the character shape involved, and your proofing
change. It searches for other similar character shapes in the document,
especially in suspect words. It assesses whether to apply the user
correction or not.
IntelliTrain and training files are not supported in OmniPage SE. This
section applies only to OmniPage Pro 11. Any training data in an OPD
file will be ignored when it is opened in OmniPage SE.
You can turn IntelliTrain on or off in the OCR panel of the Options
dialog box. It is useful for uniformly degraded documents or when an
unusual typeface is used throughout a document. IntelliTrain will be less
useful for texts with random distortions. Here is an example, based on the
letter “g”, which can be printed in different ways:
The first two examples do not need IntelliTrain, because both shapes are
normal for the letter “g” and the program can handle them. The third
example could benefit from IntelliTrain because the shape of “g” is
unusual, and all instances of “g” in the text are likely to look like this. The
fourth example is not good for IntelliTrain, because the first “g” is poorly
printed, and this shape is unlikely to appear again in the document.
The following shows how IntelliTrain works, using the original image.
Our example involves the letters c and e. With some typefaces and
scanning settings, the horizontal line in e can become very thin, leading
to OCR errors that IntelliTrain can repair.
[Figure: an IntelliTrain example. Callouts:]
• OmniPage Pro read this as bcnefit.
• You changed it during proofing to benefit.
• IntelliTrain remembers this shape and the rule: This is not c. This is e.
• IntelliTrain changes: thcrc to there, likc to like, Whcncvcr to Whenever, etc.
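The idea of carrying one proofing correction over to other suspect words can be sketched in Python. This is a text-only illustration and not how IntelliTrain is implemented: the real feature compares character shapes in the image, whereas this sketch simply tries the learned substitution at each occurrence of the misread letter and accepts a result only if it is found in a word list. The function names and the small word list are hypothetical.

```python
from itertools import combinations


def candidates(word, wrong, right):
    """Yield variants of word with some occurrences of wrong replaced by right,
    trying the variants with the most replacements first."""
    positions = [i for i, ch in enumerate(word) if ch == wrong]
    for count in range(len(positions), 0, -1):
        for subset in combinations(positions, count):
            chars = list(word)
            for i in subset:
                chars[i] = right
            yield "".join(chars)


def apply_training(suspect_words, wrong, right, dictionary):
    """Map each suspect word to a corrected form, where one is found.

    The dictionary lookup stands in for the program's own assessment
    of whether to apply the user correction.
    """
    changes = {}
    for word in suspect_words:
        for candidate in candidates(word, wrong, right):
            if candidate.lower() in dictionary:
                changes[word] = candidate
                break
    return changes


words = {"there", "like", "whenever", "check"}
print(apply_training(["thcrc", "likc", "Whcncvcr", "chcck"], "c", "e", words))
```

Note that chcck becomes check rather than having every c replaced, because only the variant found in the word list is accepted.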
IntelliTrain remembers the training data it collects, and you can save this
to a training file for future use with similar documents. If you want to be
prompted to save your unsaved training data when you close the
document, select that option in the Proofing panel of the Options dialog
box. Unsaved training data is stored in an OmniPage Document.
Saving training to file, loading, editing and unloading training files are all
done in the Training Files dialog box. Open this from the Proofing panel
of the Options dialog box or the Tools menu.
[Figure: the Training Files dialog box. Callouts:]
• Select this, click Save and type in a name to save a new training file.
• Click this to edit the selected training file (see below).
• Select this to unload a training file.
• Use this also to save new training into a loaded training file. It is listed as: File name [modified]
Unsaved training can be edited in the Edit Training dialog box; an
asterisk is displayed in the title bar in place of a training file name. It
remains unsaved when you close the dialog box.
A training file can also be edited; its name appears in the title bar. If it has
unsaved training added to it, an asterisk appears after its name. Both the
unsaved and the modified training are saved when you close the dialog
box.
The dialog box displays frames containing a character shape and an OCR
solution assigned to that shape. Click a frame to select it. Then you can
delete it with the Delete key, or change the assignation. Use arrow keys to
move to the next or previous frame.
[Figure: the Edit Training dialog box. Callouts:]
• You are editing your unsaved training.
• This frame is grayed. It has been deleted. To undelete it, select it again and press the Delete key.
• Characters marked as deleted are really deleted when you close the dialog box.
• This frame is selected. The top part shows the shape from the image. The bottom part shows the assigned OCR solution.
• Double-click a frame or press Enter to change its OCR solution. Enter the new solution in the text box that appears and press Enter.
• Changed assignations appear in red.
THE EDITOR DISPLAY AND VIEWS
The editor displays recognized texts and can mark words that were
suspected during recognition. Marking is done with a wavy underline;
red underlines for words not found in a dictionary (this applies only to
languages with dictionary support) and blue underlines for words
containing suspect or reject characters. These markers can be shown or
hidden as selected in the Text Editor panel of the Options dialog box.
You can also show or hide non-printing characters and header/footer
indicators. The Text Editor panel also lets you define a unit of
measurement for the program and a word wrap setting for use in all Text
Editor views except No Formatting view.
Here are the main differences between the views:
No Formatting view
This displays plain decolumnized left-aligned text in a single font and font
size, with the same line breaks as in the original document. Most
formatting buttons and dialog boxes are disabled. Rulers are not displayed.
You may find this view convenient for verifying and editing the text.
Retain Fonts and Paragraphs view
This displays decolumnized text with font and paragraph styling. The
horizontal ruler is displayed. You may find this view convenient for
verifying, editing and modifying the text together with its styling.
True Page view
This view tries to conserve as much of the formatting of the original
document as possible. Character and paragraph styling is retained. All
page elements, including columns, are placed in frames. It may be more
difficult to verify and edit text in this view; you may need to scroll within
a frame to see all the frame contents. A row of arrows denotes contents
extending beyond frame borders.
Retain Flowing Columns view
This view is identical to True Page view, except that the reading order of
zones is shown by arrows. This view differs from True Page during
export; see the section Preparing recognition results for export in chapter 5.
Select a view with the four buttons at the bottom left of the Text Editor
or from the View menu. Graphics and tables can appear in all four views.
TEXT AND IMAGE EDITING
This is a WYSIWYG Text Editor, providing many editing facilities.
These work very similarly to those in leading word processors.
Editing character attributes
In all views except No Formatting view, you can change the font type,
size and attributes (bold, italic, underlined) for selected text. Use the
Formatting toolbar or the Font dialog box from the Format menu. The
latter also offers subscripts, superscripts and colored text or backgrounds.
In No Formatting view you can use the Formatting toolbar to specify one
font type and size to be applied to the whole document. This is not
transferred to other views; their previous settings are restored.
Open the Font Matching dialog box from the OCR panel of the Options
dialog box to specify which fonts to use for texts entering the Text Editor.
Editing paragraph attributes
In all views except No Formatting view, you can change the alignment of
selected paragraphs and apply bulleting to paragraphs. Use the
Formatting toolbar or the Paragraph dialog box from the Format menu.
The latter allows you to modify indents, line spacing and spacing
between paragraphs. The Text Editor’s horizontal ruler lets you define
indent and tab positions easily. Advanced tab settings are done in the
Tabs dialog box from the Format menu.
Paragraph styles
Paragraph styles are auto-detected during recognition. A list of styles is
built up and presented in a selection box on the left of the Formatting
toolbar. Use this to assign a style to selected paragraphs. Use the Style
dialog box from the Format menu to rename or modify a style and to
define a new style. When you save a document to file, you can choose
whether to export the paragraph styles with the document or not. This is
valid only if the target application supports paragraph styles.
Graphics
You can edit the contents of a selected graphic zone if you have an image
editor in your computer. Click Edit Picture in the Tools menu. This will
activate the image editor associated with BMP files in your Windows
system, and load the graphic. Edit the graphic, then close the editor to
have it reembedded in OmniPage SE’s Text Editor. Do not change the
graphic’s size, resolution or type, because this will prevent the
reembedding.
Tables
Tables are displayed in the Text Editor in grids. Move the cursor into a
table area. It changes appearance, allowing you to move gridlines. You can
also use the Text Editor’s rulers to modify a table. Modify the placement
of text in table cells with the alignment buttons in the Formatting toolbar
and the tab controls in the ruler. When saving the document to file, you
can choose whether to have the tables exported in grids or as tab
separated columns.
READING TEXT ALOUD
The Text-to-Speech facility is enabled or disabled with the Tools menu
item Speech Mode or with the F5 key. A second menu item Speech
Settings... allows you to select a voice (for example, male or female for a
given language), a reading speed and the volume.
This speech facility is designed for the visually impaired, but it can also
be useful to anyone during text checking and verification. The speaking is
controlled by movements of the insertion point in the Text Editor which
can be mouse or keyboard driven.
The Text-to-Speech facility is not included in OmniPage SE. It is
available in OmniPage Pro 11.
To hear text:                                  Use these keys:
One character at a time, forward or back       Right or left arrow. Letter, number or punctuation names are spoken.
Current word                                   Ctrl + Numpad 1
One word to the right                          Ctrl + right arrow *
One word to the left                           Ctrl + left arrow *
A single line                                  Place the insertion point in the line
Next line                                      Down arrow
Previous line                                  Up arrow
Current sentence                               Ctrl + Numpad 2
From insertion point to end of sentence        Ctrl + Numpad 6
From start of sentence to insertion point      Ctrl + Numpad 4
Current page                                   Ctrl + Numpad 3
From top of current page to insertion point    Ctrl + Home
From insertion point to end of current page    Ctrl + End
Previous, next or any page                     Ctrl + PgUp, PgDown or navigation buttons
Typed characters                               Each typed character is pronounced, one by one, including punctuation.
* If the cursor is in the middle of a word, you will first hear a word fragment, but from the
second keystroke you will hear whole words.
The three basic speech keys are grouped together on the numeric keypad.
It is planned to provide speech programs for the following languages:
English, French, German, Italian, Portuguese and Spanish. Please consult
the Readme file for the latest information. Only one speech system will
be installed with OmniPage Pro, depending on your language choice at
the start of installation. If you specify a language with no speech system
available, English is installed.
If you have other SAPI-compliant speech systems on your computer, they
will be detected and available. Their voices will be available in the Speech
Settings dialog box. Once you have associated a voice with a language,
OmniPage Pro will remember this, and switch voices according to the
recognition language of your document.
PAGE OUTLINE
The Page outline window lets you change the order of areas on a page or
of paragraphs inside areas. It also lets you define how text should flow if
you export with Retain Flowing Columns view. Open the page outline
window from the View menu. The areas correspond to the zones used
during recognition and also to frames used in the Text Editor. Click and
drag an item to the desired location. Reordered paragraphs display
immediately in the Text Editor and are exported. Reordered areas display
and are exported in No Formatting View and Retain Fonts and
Paragraphs view. In True Page view they have no practical effect. In
Retain Flowing Columns view, arrows show the order of text flow. Move
areas to change this order. The positions of the areas do not change, but
the arrows show the changed text flow.
5 Saving and exporting
Once you have acquired at least one image for a document, you can
export the image(s) to file. Once you have recognized at least one page,
you can export recognition results to a target application by:
1. Saving to file
2. Copying a document to the Clipboard
3. Sending a document as a mail attachment
The document remains in OmniPage SE after export. This allows you to
save, copy or send it repeatedly, for example with different formatting
levels, using different file types, names or locations. You can also add or
rerecognize pages or modify the recognized text.
With automatic processing and using the OCR Wizard, you specify the
first saving destination before processing starts. When the last available
page is recognized (or proofread, if that was requested), the exporting
occurs.
You can specify export any time the program is not busy. If you ask to
export a document with unrecognized pages, you will be asked whether
they should be recognized first. If you answer No, only results from
recognized pages will be exported. If zones have been modified on
recognized pages, you will be invited to rerecognize those pages before
exporting.
PREPARING RECOGNITION RESULTS FOR EXPORT
Text is exported to file, Clipboard or mail with the formatting level
defined by the view set in the Text Editor at export time, if that is
possible. However, some export file types and target applications cannot
support all formatting elements. You may be warned if there is a
mismatch and offered the highest permissible view. You can accept that,
or cancel export, set a different view and restart the export.
The table in the section File types for saving recognition results in chapter 6
tells you which file types support which formatting levels.
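The fallback to the highest permissible view can be pictured with a minimal Python sketch. It is an illustration only, not the program's own logic: the ranking of the views from lowest to highest is an assumption, and the function name view_for_export is hypothetical. The supported views for the three sample file types are taken from the table in chapter 6.

```python
# Formatting levels, ordered from lowest to highest. This ranking is an
# assumption made for the sketch; the guide only speaks of the
# "highest permissible view".
VIEWS = [
    "No Formatting",
    "Retain Fonts and Paragraphs",
    "Retain Flowing Columns",
    "True Page",
]

# Views accepted by a few file types, transcribed from the chapter 6 table.
SUPPORTED_VIEWS = {
    "Word for Windows": set(VIEWS),  # listed as "All"
    "Excel": {"No Formatting", "Retain Fonts and Paragraphs"},
    "Freelance Graphics": {"No Formatting"},
}


def view_for_export(file_type, editor_view):
    """Return (view, warned): the view used for export and whether a
    mismatch warning would apply. Hypothetical helper for illustration."""
    supported = SUPPORTED_VIEWS[file_type]
    if editor_view in supported:
        return editor_view, False
    # Mismatch: fall back to the highest view the file type accepts.
    best = max(supported, key=VIEWS.index)
    return best, True


print(view_for_export("Excel", "True Page"))
```

In this sketch, exporting to Excel from True Page view falls back to Retain Fonts and Paragraphs view and flags a warning, while exporting to Word keeps the view that is set in the Text Editor.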
Here is how you can use the views for export:
No Formatting view
This view is needed when exporting to ASCII, Unicode or other formats
with extension .TXT. These file types cannot accept graphics or tables.
Of course, you can export plain text to any file type and target
application.
Retain Fonts and Paragraphs view
This is suitable for all formats except those with the TXT or PDF
extensions. These formats can all handle graphics and tables.
True Page view
This is suitable only for file types and target applications capable of
handling frames or text boxes. When you export to PDF, True Page is
used as source, regardless of your editor view (Not applicable to
OmniPage SE). The reading order of zones, or areas reordered in the Page
outline window have no influence when True Page is used for export.
Retain Flowing Columns view
Set this at export time to keep the original layout of the pages, including
columns. This is done wherever possible with column settings, not with
frames. Text will then flow from one column to the other, which does not
happen when frames are used. Arrows show the text flow order. You can
change this order with the Page outline window, as described in Page outline in chapter 4.
SAVING TO FILE
You can save recognized pages and original images to disk in a wide
variety of file types. See chapter 6 for a complete list of supported file
types: File types for opening and saving images and File types for saving recognition results.
Saving original images
1. Choose Save Image... in the File menu. In the dialog box that
appears, select a folder location and a file type for your images. Type
in a file name.
2. Select to save the current image only or all images in the document.
In the second case you can have all images in a single multi-page
image file, providing you set TIFF or DCX as file type. Otherwise
each image is placed in a separate file. OmniPage SE adds numerical
suffixes to the file name you provide, to generate unique file names.
3. Click OK to save the image(s) as specified. Zones and recognized text
are not saved with the file. If possible, the file is saved as displayed:
that is black-and-white, grayscale or color. Black-and-white images
are saved at their original resolutions. Grayscale and color images are
reduced to approximately 150 dpi.
Tip To see the image size and original resolution of an image, hover the
cursor over it in the Original Image area or over its thumbnail in the
Document Manager.
Note In OmniPage Pro you can save your document to four variants of
PDF, including ‘image only’. This is saving the recognition results as
image, not the original images. PDF saving is not available in OmniPage
SE.
Saving recognition results
1. Choose Save As... in the File menu, or click the Export Results
button in the Manual OCR toolbar with Save as File selected in the
drop-down list.
2. The Save As dialog box appears. Click Advanced to open the lower
panel and Basic to close it. The dialog box offers these options:
• Select Save and Launch to automatically open the saved file in its
target application.
• Select the paragraph styles option to have the paragraph styles from
the Text Editor exported with the recognized text.
• Choose from: Create one file for all pages, Create one file per page,
or Create a new file at each blank page.
3. Select a folder location and a file type for your document. The special
OPD file type is the last in the file type list.
4. Type in a file name. Click the Advanced button to see all the saving
options. Select these as desired.
5. Click OK. The document is saved to disk as specified. If ‘Save and
Launch’ is selected, the exported file will appear in its target
application; that is the one associated with the selected file type in
your Windows system.
Note Graphics and formatting are saved in the document only if the
selected file type supports them. The formatting level for export is the
Editor view set at saving time. You will be warned if the formatting level
is not supported by the export file type.
Note If more than one export file is created, OmniPage SE will append
a numerical suffix to your file name to create unique file names. If you
select ‘Create a new file at each blank page’ with input from image files,
see how to place blank images in the section Input from image files in
chapter 3.
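The three file separation options and the numerical suffixes can be illustrated with a short Python sketch. It rests on assumptions that the guide does not confirm: the exact suffix format is not documented, so the names generated here are hypothetical, and the sketch assumes that a blank page acts only as a separator and is not itself exported.

```python
def split_into_files(pages, mode):
    """Group page numbers into output files.

    pages: list of (page_number, is_blank) tuples in document order.
    mode:  "all", "per_page" or "blank_page", mirroring the three
           options in the Save As dialog box.
    """
    if mode == "all":
        return [[number for number, _ in pages]]
    if mode == "per_page":
        return [[number] for number, _ in pages]
    if mode == "blank_page":
        groups, current = [], []
        for number, is_blank in pages:
            if is_blank:
                # Assumption: a blank page closes the current file and
                # is dropped from the output.
                if current:
                    groups.append(current)
                current = []
            else:
                current.append(number)
        if current:
            groups.append(current)
        return groups
    raise ValueError(f"unknown mode: {mode}")


def export_names(base, extension, count):
    """Build unique file names. The suffix style (memo1, memo2, ...) is
    hypothetical; the guide only says a numerical suffix is appended."""
    if count == 1:
        return [f"{base}.{extension}"]
    return [f"{base}{i}.{extension}" for i in range(1, count + 1)]


pages = [(1, False), (2, False), (3, True), (4, False)]
groups = split_into_files(pages, "blank_page")
print(groups, export_names("memo", "txt", len(groups)))
```

With a blank third page, the four-page sample splits into two groups, so two file names are generated from the single name that was typed in.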
SAVING A DOCUMENT AS YOU WORK
Click the Save button in the Standard toolbar or choose Save in the File
menu to save changes to the current document as you work. If you do
this with an untitled document, the Save As dialog box appears.
With a named document, the Save command saves it to the name and
format of its last save, as displayed in the title bar. If the document was
last saved as an OmniPage Document, the save command updates this
document: new or changed images, changed zoning, recognition results
and training are all saved. If the document was last saved to a text-based
file type, only changes to the recognition results are saved.
If you want to work with your document again in OmniPage SE in a later
session, save it as an OmniPage Document. This is a special output file
type. It saves the original images together with the recognition results,
settings and training. See the section OmniPage Documents in chapter 2.
The Save As dialog box lists available file types in its Save as Type drop-down list. The OmniPage Document is the last format in the list.
Your OmniPage Documents can be passed between OmniPage SE and
OmniPage Pro 11. In OmniPage SE any training data in the OPD is
ignored and training cannot be done.
If you first save the document as an OmniPage Document (for instance as
memo.opd), then modify it and later save it to a text file (for instance as
memo.txt), then modify it again and click Save, the recent changes are
saved to the memo.txt file, not to the OPD. When you close the document
or exit the program, you will be prompted to save the document if it has
not been saved as an OmniPage Document, or there are changes since the
last OPD save.
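The memo.opd and memo.txt example can be traced with a small Python model. It is only a sketch of the behavior described above: the Document class and its method names are hypothetical and do not correspond to any programming interface of OmniPage SE.

```python
class Document:
    """Toy model of where the Save command writes (hypothetical names)."""

    def __init__(self):
        self.last_save = None     # name used for the most recent save
        self.opd_current = False  # True while an OPD save holds all changes

    def modify(self):
        # Any change makes the last OPD save out of date.
        self.opd_current = False

    def save_as(self, name):
        self.last_save = name
        if name.lower().endswith(".opd"):
            self.opd_current = True
        return name

    def save(self):
        # Save reuses the name and format of the last save.
        if self.last_save is None:
            raise RuntimeError("untitled document: the Save As dialog box appears")
        return self.save_as(self.last_save)

    def prompts_on_close(self):
        # Prompted if never saved as OPD, or changed since the last OPD save.
        return not self.opd_current


doc = Document()
doc.save_as("memo.opd")
doc.modify()
doc.save_as("memo.txt")
doc.modify()
print(doc.save(), doc.prompts_on_close())
```

In this model the final Save goes to memo.txt, and closing the document would prompt for a save because the OPD no longer holds the latest changes.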
COPYING A DOCUMENT TO THE CLIPBOARD
You can copy the recognition results from every recognized page of a
document to the Clipboard. The copying is reported by a progress
monitor. You can then paste the Clipboard contents into another
application.
Text formatting, such as bold and italics, is retained when you paste into
an application that supports RTF information. Otherwise, only plain text
will be pasted. Graphics are retained if the application supports insertion
of images.
To copy a document to the Clipboard:
• With automatic processing, select Copy to Clipboard as the
command in the Export Results drop-down list on the AutoOCR
toolbar or in the OCR Wizard. The text is sent to Clipboard as soon
as the last available page is recognized or proofed.
• With manual processing, select the Copy to Clipboard command in
the Export Results drop-down list and then click its button on the
Manual OCR toolbar. Copying starts immediately.
SENDING A DOCUMENT AS A MAIL ATTACHMENT
You can send recognition results as one or more files attached to a mail
message if you have installed a MAPI-compliant mail application, such as
Microsoft Outlook.
To send a document by e-mail:
• With automatic processing, select Send as Mail as the command in
the Export Results drop-down list on the AutoOCR toolbar. The
Send Mail dialog box appears as soon as the last available page in the
document is recognized or proofed.
• With manual processing, select Send as Mail as the command in the
Export Results drop-down list and then click its button on the
Manual OCR toolbar. The Send Mail dialog box appears
immediately.
At any time the program is not busy, choose Send as Mail in the File
menu to call up the Send Mail dialog box.
1. The Send Mail dialog box lets you specify a file type and attachment
options: one attachment for all pages, one attachment per page, new
attachment at each blank page. Set all options and click OK.
2. Log into your mail application if you are prompted to do so.
3. Your mail application appears with the attachment(s) in a new empty
message. Attachments take the name used for the last save of the
document in OmniPage SE, or ‘Untitled from OmniPage’. The
suitable file extension is added, and numerical suffixes for multiple
attachments.
4. Address your mail message, add message text as desired and click the
Send button.
6 Technical information
This chapter provides troubleshooting and other technical information
about using OmniPage SE.
Please also read the online Readme file and other help topics, or visit the
ScanSoft web pages. The Scanner Information web page contains
detailed and regularly updated information about scanner setup and
support. The Readme file contains last-minute information relating to
OmniPage SE. Access to the Readme file and to ScanSoft’s web pages is
provided in the Help menu.
This chapter contains the following information:
• Troubleshooting
    - Solutions to try first
    - Testing OmniPage SE
    - Low memory problems
    - Low disk space problems
• Supported file types
    - File types for opening and saving images
    - File types for saving recognition results
    - Saving to PDF
• OCR problems
    - Text does not get recognized properly
    - Problems with fax recognition
    - System or performance problems during OCR
• Uninstalling the software
TROUBLESHOOTING
Although OmniPage SE is designed to be easy to use, problems
sometimes occur. Many of the error messages contain self-explanatory
descriptions of what to do – check connections, close other applications
to free up memory, and so on. Sometimes that is all the troubleshooting
help you need.
Please see your Windows documentation for information on optimizing
your system and application performance.
Solutions to try first
Try these solutions if you experience problems starting or using
OmniPage SE:
• Make sure that your system meets all requirements listed under
System requirements in chapter 1.
• Make sure that your scanner is plugged in and that all cable
connections are secure.
• Visit the support section of ScanSoft’s web site at
www.scansoft.com. It contains Tech Notes on commonly
reported issues using OmniPage SE. Our web pages may also
offer assistance on the installation process and troubleshooting.
• Turn off your computer and your scanner, turn your scanner
back on, and then restart your computer. Make sure other
applications are functioning properly.
• Use the software that came with your scanner to verify that the
scanner works properly before using it with OmniPage SE.
• Make sure you have the correct drivers for your scanner, printer,
and video card. Visit ScanSoft’s Scanner Information web
page through the Help menu for more information.
• Run ScanDisk for Windows 95, 98 or Me, or Check Disk for
Windows NT and Windows 2000 to check your hard disk for
errors. See Windows online Help for more information.
• Defragment your hard disk. See Windows online Help for more
information.
• Uninstall and reinstall OmniPage SE, as described in Uninstalling
the software at the end of this chapter.
Testing OmniPage SE
Restarting Windows 95, 98, 2000 or Me in safe mode or Windows NT
in VGA mode allows you to test OmniPage SE on a simplified system.
This is recommended when you cannot resolve crashing problems or if
OmniPage SE has stopped running altogether. See Windows online Help
for more information.
Note Your scanner will not run with OmniPage SE in safe mode or
VGA mode, so do not test scanner problems in this configuration.
To test OmniPage SE in safe mode (Windows 95, 98, 2000 or Me):
1. Restart your computer in safe mode by pressing F8 immediately after
you see the ‘Starting Windows’ message.
2. Launch OmniPage SE and try performing OCR on an image. Use a
known image file, for instance one of the supplied sample image files.
• If OmniPage SE does not launch or run properly in safe
mode, then there may be a problem with the installation.
Uninstall and reinstall OmniPage SE (see Uninstalling the software), and then run it in Windows safe mode.
• If OmniPage SE runs in safe mode, then a device driver on
your system may be interfering with OmniPage SE
operation. Troubleshoot the problem by restarting Windows
in Step-by-Step Confirmation mode. See Windows online
Help for more information.
To test OmniPage SE in VGA mode (Windows NT):
1. Restart your computer.
2. Select Windows NT Workstation Version 4.00 [VGA mode] and
press Enter.
3. Press Ctrl+Alt+Del and select Task Manager.
4. In the Task Manager dialog box, select all background applications
and click End Process. See Windows online Help for more
information.
5. Launch OmniPage SE and try performing OCR on an image. Use a
known image file such as one of the supplied sample files.
Note You can also run OmniPage SE from a command line in its own
safe mode. Choose Start > Run, browse for the file OmniPage.exe and
add the command line option /safe. This starts the program, but ignores
previously stored settings and does not try to recover a document from an
abnormal termination.
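For readers who prefer a script to the Run dialog box, the same command line could be assembled as in the following Python sketch. Only the file name OmniPage.exe and the /safe option come from this guide. The installation folder shown is a hypothetical placeholder and must be replaced with the actual location on your system.

```python
import subprocess
from pathlib import Path

# Hypothetical install location, for illustration only. Browse for the real
# OmniPage.exe on your system and substitute its path here.
OMNIPAGE_EXE = Path(r"C:\Program Files\ScanSoft\OmniPageSE\OmniPage.exe")


def safe_mode_command(exe=OMNIPAGE_EXE):
    """Return the command line that starts the program in its own safe mode."""
    return [str(exe), "/safe"]


def launch_in_safe_mode(exe=OMNIPAGE_EXE):
    """Start the program with /safe; fails early if the file is not found."""
    if not Path(exe).is_file():
        raise FileNotFoundError(f"OmniPage.exe not found at: {exe}")
    return subprocess.Popen(safe_mode_command(exe))


print(safe_mode_command())
```

Calling launch_in_safe_mode() has the same effect as typing the path followed by /safe in the Run dialog box.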
Low memory problems
OmniPage SE may run poorly under low-memory conditions. This may
be indicated by various error messages or if OmniPage SE works slowly
and accesses the hard drive often. Try these solutions for low memory
conditions:
• Restart your computer.
• Close other open applications to release memory.
• Close unnecessary OmniPage SE applications.
• Defragment your hard disk to free up contiguous blocks of disk
space. See Windows online Help for instructions.
• Increase the amount of free hard disk space.
• Increase your computer’s physical memory (RAM).
• More memory optimizes OCR performance. See System
requirements in chapter 1 for more information.
Low disk space problems
Problems may occur if your system runs low on free disk space. Try these
solutions for low disk space problems:
• Empty the Windows Recycle Bin.
• Close all open applications and delete the *.tmp files in the Temp
folder. This folder is usually located in your Windows folder.
• Run ScanDisk for Windows 95, 98 or Me, or Check Disk for
Windows NT or Windows 2000 to check for errors that may be
using disk space. See Windows online Help for instructions.
• Back up unneeded files onto floppy disks or other media and
delete them from your hard disk.
• Remove Windows applications that you do not use.
• Defragment your hard disk. See Windows online Help for
instructions.
• Clear the cache for your web browser and limit its size.
SUPPORTED FILE TYPES
The program supports a wide range of file types. Several important types
have been added in OmniPage SE.
File types for opening and saving images
File type              Extension  Multipage  Open / Save     B/W, Grayscale, Color
BMP, Bitmap            *.bmp      No         Open and Save   All
DCX                    *.dcx      Yes        Open and Save   All
GIF                    *.gif      N/A        N/A             N/A
JPEG                   *.jpg      No         Open and Save   Grayscale, color
PCX                    *.pcx      No         Open and Save   All
PDF                    *.pdf      N/A        N/A (see note)  N/A
PNG                    *.png      No         Open and Save   All
TIFF Compressed G3     *.tif      Yes        Open            B/W
TIFF Compressed G4     *.tif      Yes        Open and Save   B/W
TIFF Compressed LZW    *.tif      N/A        N/A             N/A
TIFF FX                *.xif      N/A        N/A             N/A
TIFF PackBits          *.tif      Yes        Open and Save   All
TIFF Uncompressed      *.tif      Yes        Open and Save   All
Input image files can have resolutions up to 600 dpi, but 300 dpi (both
horizontally and vertically) is recommended for optimum OCR accuracy.
The program stores black-and-white images at their original resolution,
but grayscale and color images are not usually saved above 150 dpi.
Hover the cursor over an image for a popup window showing the size and
resolution of the original image.
Note If you try to save a black-and-white image to JPEG format, the
program will offer conversion to grayscale. With TIFF G3 and G4 it will
offer conversion to black-and-white.
Note Saving to PDF format is supported in OmniPage Pro 11, with
four options. One of these is to export image only. But this exports the
recognition results as images, not the original images, through the Save
As dialog box. This is not available in OmniPage SE. Also, OmniPage SE
cannot handle GIF, LZW TIFF and TIFF FX files.
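The image file type table and the conversion note can be expressed as a lookup, sketched below in Python. The data follows the table for the types OmniPage SE can handle, including the entry that lists TIFF G3 as open only. The names ImageType, save_plan and multipage_save_types are hypothetical and exist only for this illustration.

```python
from collections import namedtuple

ImageType = namedtuple("ImageType", "extension multipage can_save modes")

ALL = {"bw", "gray", "color"}

# Transcribed from the table above; types marked N/A are left out.
IMAGE_TYPES = {
    "BMP": ImageType("bmp", False, True, ALL),
    "DCX": ImageType("dcx", True, True, ALL),
    "JPEG": ImageType("jpg", False, True, {"gray", "color"}),
    "PCX": ImageType("pcx", False, True, ALL),
    "PNG": ImageType("png", False, True, ALL),
    "TIFF G3": ImageType("tif", True, False, {"bw"}),  # table: open only
    "TIFF G4": ImageType("tif", True, True, {"bw"}),
    "TIFF PackBits": ImageType("tif", True, True, ALL),
    "TIFF Uncompressed": ImageType("tif", True, True, ALL),
}


def save_plan(file_type, mode):
    """Return the mode an image would be saved in, or None if the type
    cannot be saved. mode is "bw", "gray" or "color"."""
    info = IMAGE_TYPES[file_type]
    if not info.can_save:
        return None
    if mode in info.modes:
        return mode
    if file_type == "JPEG" and mode == "bw":
        return "gray"  # the program offers conversion to grayscale
    if info.modes == {"bw"}:
        return "bw"    # the program offers conversion to black-and-white
    return None


def multipage_save_types():
    """File types that can hold all images of a document in one file."""
    return sorted(n for n, t in IMAGE_TYPES.items() if t.multipage and t.can_save)


print(save_plan("JPEG", "bw"), multipage_save_types())
```

A lookup like this makes it easy to check in advance whether a chosen file type will keep the color mode of an image or trigger an offer to convert it.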
File types for saving recognition results
File type                            Extension  Formatting levels
ASCII text
Adobe PDF, normal                    *.pdf      N/A                        N/A
Adobe PDF with image substitutes     *.pdf      N/A                        N/A
Adobe PDF with image on text         *.pdf      N/A                        N/A
Adobe PDF, image only                *.pdf      N/A                        N/A
Excel (3.0 to 7.0, 97, 2000)         *.xls      NFV, RFP (Spreadsheet)     Yes
FrameMaker (5.5.3)                   *.mif      All                        Yes
Freelance Graphics                   *.txt      No Formatting view (NFV)   No
Harvard Graphics                     *.txt      No Formatting view (NFV)   No
HTML (3.2 or 4.0)
PowerPoint 97                        *.rtf      All                        Yes
Microsoft Publisher 98               *.rtf      All                        Yes
Word for Windows (6.0, 97, 2000)     *.doc      All                        Yes
PageMaker 6.5.2                      *.doc      All                        Yes
Quattro Pro for Windows 4.0, 8       *.xls      NFV, RFP (Spreadsheet)     No
1 ASCII and Unicode text can be with flowing text, with line breaks or comma
separated. The latter have the extension .csv and are used for plain text input
of tables into spreadsheet programs.
2 When saving to HTML, all graphics are saved as separate image files using
JPEG format. HTML 4.0 is supported only in OmniPage Pro 11; OmniPage
SE support is limited to HTML 3.2.
3 Recognition results are sent to Clipboard in this format and will be pasted in
RTF if possible, and as Unicode or ASCII text if not.
4 Unicode text can handle the widest range of accented characters.
5 True Page or Retain Flowing Columns (RFC) views will not be refused, but
will appear as Retain Fonts and Paragraphs (RFP) view, that is, without
columns.
6 OmniPage Documents created by OmniPage SE or OmniPage Pro 11 can be
reopened by OmniPage SE. It can also open OPD files created by OmniPage
Pro 10 and the similar MET files from OmniPage Pro 9. These files remain in
their old format and a copy is converted to OmniPage SE.
Saving to PDF
This section does not apply to OmniPage SE.
In OmniPage Pro 11, you have four choices when saving recognition
results to Portable Document Format (PDF) files.
Normal:
Pages are exported as they appeared in the Text Editor in True Page view.
The PDF file can be viewed and searched in a PDF viewer and edited in a
PDF editor.
With image substitutes:
As above, but reject and suspect characters have image overlays, so these
uncertain characters display as they were in the original document. The
PDF file can be viewed, searched and edited.
Image only:
The PDF file is viewable only and cannot be modified in a PDF editor
and text cannot be searched.
Image on text:
The PDF file is viewable only and cannot be modified in a PDF editor.
But there is a linked text file behind each image, so the text can be
searched. A found word is highlighted in the image.
OCR PROBLEMS
This section contains information and solutions for possible OCR
problems. First we provide suggestions for improving recognition
accuracy, second on getting good results from fax input and finally on
system or performance problems arising during OCR.
Text does not get recognized properly
Try these solutions if any part of the original document is not converted
to text properly during OCR:
• Look at the original page image and ensure that all text areas are
enclosed by text zones. If an area is not enclosed by a zone, it is
generally ignored during OCR. See Manual zoning in chapter 3.
• Make sure text zones are identified correctly. Reidentify zone
types and contents, if necessary, and perform OCR on the
document again. See Zone properties in chapter 3.
• Be sure you do not have an unsuitable template loaded by
mistake. If zone borders cut through text, recognition is
impaired.
• Adjust the brightness and contrast sliders in the Scanner panel of
the Options dialog box. You may need to experiment with
different settings combinations to get the desired results.
• Check the resolution of the original image. Hover the cursor over
the Original Image area for a popup display. If the resolution is
significantly above or below 300 dpi, recognition is likely to
suffer.
• Make sure the correct document languages are selected in the
OCR panel of the Options dialog box. Only languages included
in the document should be selected.
• Turn IntelliTrain on and make some proofing corrections. This is
most likely to help with stylized fonts or uniformly degraded
documents. If IntelliTrain was running, try turning it off – on
some types of degraded documents it may not be able to help.
This does not apply to OmniPage SE.
• If you use True Page as the Text Editor view or for export,
recognized text is put into frames (formatting boxes). Some text
may be hidden if a frame is too small. To view the text, place the
cursor in the text frame and use the arrow keys on your keyboard
to scroll to the top, bottom, left, or right of the frame.
• Check the glass, mirrors, and lenses on your scanner for dust,
smudges, or scratches. Clean if necessary.
Note OmniPage SE only recognizes machine-printed text characters
such as typewritten or laser-printed text. It can handle dot-matrix
characters, though accuracy may be lower on draft-quality texts. It cannot
read handprint or handwriting. However, it can retain signatures or other
handwritten text as a graphic.
Problems with fax recognition
Try these solutions to improve OCR accuracy on fax images:
• Ask senders to use clean, original documents if possible.
• Ask senders to select Fine or Best mode when they send you a
fax. This produces a resolution of 200 x 200 dpi.
• Ask senders to transmit files directly to your computer via fax
modem if you both have one. You can save fax images as image
files and then load them into OmniPage SE. See the section
Input from image files in chapter 3.
System or performance problems during OCR
Try these solutions if a crash occurs during OCR or if processing takes a
very long time:
• Resolve low memory problems. See Low memory problems.
• Resolve low disk space problems. See Low disk space problems.
• Minimize all applications or press Alt+Tab to check for Windows
error messages.
• Check the quality of the image you are recognizing.
• Consult your scanner documentation on ways to improve the
quality of scanned images.
• Break complex page images (lots of text and graphics or elaborate
formatting) into smaller jobs. Draw zones manually or modify
automatically created zones and perform OCR on one page area
at a time. See Working with zones in chapter 3 on creating and
modifying zones.
• Restart Windows 95, 98, 2000 or Me in safe mode, or
Windows NT in VGA mode, and test OmniPage SE by
performing OCR on the included sample image files. See the
section Testing OmniPage SE.
If you are performing multiple tasks at once, such as recognizing and
printing, OCR may take longer.
UNINSTALLING THE SOFTWARE
Sometimes uninstalling and then reinstalling OmniPage SE will solve a
problem. You should uninstall OmniPage SE before installing OmniPage
Pro 11 or any OmniPage evaluation software. OmniPage SE’s Uninstall
program will not remove any of the following user-created files:
Zone templates (*.zon)
Training files (*.otd) (Not applicable to OmniPage SE)
User dictionaries (*.ud)
OmniPage Documents (*.opd)
To uninstall from Windows NT or Windows 2000, you must be logged
into your computer with administrator privileges.
To uninstall or reinstall OmniPage SE:
• Close OmniPage SE.
• Click Start in the Windows taskbar and choose Settings >
Control Panel > Add/Remove Programs.
• Select OmniPage SE and click Change.
• Click Next in the dialog box that appears.
• Select Remove or Repair, then Next.
• Follow instructions until the process is finished.
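Because the Uninstall program leaves user-created files in place, it can be useful to list them before or after uninstalling. The following Python sketch searches a folder for the four file types named above. The function name find_user_files is hypothetical, and the folder to search is your choice; this guide does not state where such files are stored.

```python
from pathlib import Path

# File types that the Uninstall program does not remove, as listed above.
USER_FILE_PATTERNS = {
    "*.zon": "Zone templates",
    "*.otd": "Training files",
    "*.ud": "User dictionaries",
    "*.opd": "OmniPage Documents",
}


def find_user_files(folder):
    """Return {description: sorted file names} for user-created files found
    in folder and its subfolders. Hypothetical helper for illustration."""
    found = {}
    for pattern, label in USER_FILE_PATTERNS.items():
        found[label] = sorted(p.name for p in Path(folder).rglob(pattern))
    return found


if __name__ == "__main__":
    # Example: search the current folder.
    for label, names in find_user_files(".").items():
        print(f"{label}: {names}")
```

The resulting list can serve as a checklist for backing up zone templates, user dictionaries and OmniPage Documents before you reinstall.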
zone types, 57
Character attribu tes
Checking OCR results
Clipboard
Closing a document
Color
images, 75
markers, 63
scanning, 51
Columns
changing text flow, 72
in tables, 58
,
Command buttons for automatic
processing
Comparing recognized words
with originals
Contents of OmniPage
Documents
Context-Sensitive Help
33
Contrast
Control over processing
Conversion of images
Copying
and pasting text, 25
document to Clipboard, 40,
Creating training data
, 52, 88
72
, 78
, 52, 88
78
, 43
, 77
, 63
, 69
, 63
, 31
, 85
, 67
, ix, 25,
, 44
Custom Layout
Customizing columns in Deta il
view
, 30
Cutting and pasting text
, 34, 54
, 25
D
Deferred processing
Deleting
a zone template, 59
pages, 28, 30
Describing document layout
53
Desktop
Detail view
Direct OCR
Disk space
Dividers, placing in tables
Document
Document Manager
Dot-matrix texts
, 24
customizing colum ns in, 30
description of, 29
in Document Manager, 24
, 33, 47
, 12, 84
closing, 31
copying to Clipboard, 40, 78
double-sided, 53
export, 23
finishing, 43
in OmniPageSE, 23
layout description, 53
OmniPage Document, 28
overview, 28
saving, 73
saving as you work, 32, 77
unfinished, 31
with varied layout, 53
, 31
, 42,
, 26
, 24, 28
, 89
OMNIPAGE SE USER’S GUI DE91
Double-sided documents
Drawing zones
Drivers for scanners
Dropping graphics from export
76
Duplex scanners
, 48
, 53
, 53
, 14
E
Earlier OmniPage versions
Editing
a training file, 67
a user dictionary, 64
character attributes, 69
graphics, 70
paragraph attributes, 69
PDF output, 87
recognized text, 26, 69
table dividers, 26, 58
table grids, 58
tables, 70
Effect of settings
Export Results button
Exporting
file types for, 86
graphics, 76
preparing for, 74
repeated, 73, 77
to a target application, 23,
44, 73
to Clipboard, 78
to file, 76
, 34
, 13
, 42, 45
F
Fax recognition
Features new to version 11 of
OmniPage Pro
Features of OmniPage SE
File
as export target, 75
as image source, 50
retained on uninstalling, 90
separation options, 76, 79
types, 76
, 89
, 18
, 19
types for export, 74, 86
types, supported, 85, 86
Finding
,
non-dictionary words, 62
suspect words, 62
Finishing a doc ument
Formatting levels
86
Formatting levels for export
Formatting toolbar
Frames
in export document, 74
recognized text in, 68, 89
G
Generating table dividers
Get Page butt o n
Getting online Help
Graphic zone
Graphics
saving to, 32
Opening image files
Opimizing brightness
Optical character recognition
Optimizing image quality
Options dialog box
Original Image
area, 24
saving, 75, 85
Overview
of document, 28
of processing steps, 23
, 24, 27, 42
, 50, 85
, 52
, 52
, 33
P
Page
acquired, 28
adding to a document, 43
deleting, 28, 30
Get Page button, 42, 44
moving between pages, 28
multi-page image files, 50,
75, 85
multiple column, 54
navigation, 24
new file on blank page, 50
outline, 72
proofed, 28
recogniz ed, 28
reordering, 28
rerecognizing all, 43
selectin g multiple , 28
single column, 54, 56
single co lumn page s with
tables, 54
spreadsheet pages, 54
, 22
OMNIPAGE SE USER’S GUI DE93
status, 28
zoned, 28
PaperPort
Paragraph
alignment, 26
changing order, 72
editing attributes, 69
reordering, 72
retaining paragraph styles, 76
styles, 26, 69, 76
PDF
editing PDF output, 87
image substitutes in, 87
PDF file input, 50, 85
PDF output, 87
saving to, 87
searching PDF output, 87
viewing PDF output, 87
Perform OCR button
Performance problems during
OCR
Performing
OCR, 23
recognition, 45
Placing dividers in tables
Preparing recognition results for
export
Printing
a document, 30
images, 25
recognition results, 25
Problems with fax recognition
acquire Text, 47
effect of settings, 34
for Direct OCR, 47
in OCR Wizard, 41
in Options dialog box, 33
zone types, 58
pages, 54, 56
pages with tables, 54
zone, 56
82
, 79
, 14
, 47
, 57
, 89
, x
Speed maximised
Splitting zones
Spreadsheet pages
Standard toolbar
Starting a user dictionary
Starting the program
Step-by-step processing
Stopping automatic processing
43
Subtracting from zones
Suggestion from dictionaries for
proofing
Supplementing template zones
59
Supported file types
Suspect words in proofing
Switching between manual and
automatic processing
Switching between Text Editor
views
, 68
System or performance problems
during OCR
System requirements
, 33
, 56
, 54
, 24, 25
, 62
, 89
T
Tables
columns in, 58
editing, 70
editing dividers, 26, 58
editing grids, 58
generating dividers, 59
in single column pages, 54
inserting dividers, 26, 58
moving dividers, 58
removing dividers, 58
rows in, 58
table handling in Text
Editor, 70
zones, 26, 57, 58
Task Manager
,
Technical information
Templates, zone
Testing OmniPage SE
, 83
, 54, 59, 88
, 64
, 14
, 44
, 56
, 85, 86
, 27, 46
, 12
, 81
, 83
, 62
Text
Acquire Text Settings, 47
ASCII output, 86
attributes text, 26
Text Editor
Text saving
Text-to-Speech facility
,
Thumbnail view
TIFF image files
Toolbars
image, 26
standard, 25
,
Training
creating training data, 67
editing a training file, 67
loading a training file, 67
saving a training file, 67
training files, 65, 67
unloading a training file, 67
unsaved training data, 31
Troubleshooting
True Page view
TWAIN
, 24, 34, 61, 68
, 76
, 14
U
Underlined text
Unfinished documents
Unicode text output
Uninstalling OmniPage SE
Unit of measurement
Unloading a training file
Unloading a user dictionary
Unloading a zone template
Unsaved training data
Upgrading to OmniPage Pro
User dictionaries