OmniPage SE User Manual

Welcome to OmniPage SE™, and thank you for using our software!
This User's Guide
This Guide introduces you to using OmniPage SE. It includes
The Guide is presented in PDF format, allowing you to use hyperlink jumps on cross-references and other navigation tools in your PDF viewer.
Online Help
OmniPage SE's online Help contains information on features, settings, and procedures. The online Help is provided as HTML help, and has been designed for quick and easy information retrieval. Comprehensive context-sensitive help aims to provide just enough assistance to let you keep working without delay.
In addition to using this Guide, you can use OmniPage SE's online Help to learn about features, settings, and procedures. Online Help is available after you install OmniPage SE.
Online HTML Help
Open OmniPage SE's online Help at its top level by choosing OmniPage SE Help Topics at the top of the Help menu. This allows you to see topics arranged in a Table of Contents, search an alphabetical list of keywords or make full-text searches through the topics. Other items in the Help menu provide access to useful topics or web pages.
Press F1 as you are working with the program to see an online help topic relating to the current screen area, dialog box or warning message.
Context-Sensitive Help
You can get concise on-the-spot information in a popup window about a particular OmniPage SE menu item, toolbar button, screen area or dialog box, in the following ways:
• Click the Help button in the Standard toolbar to get the help icon. Click this on any item on the desktop outside a dialog box or warning message.
• Press Shift + F1 to get the same help icon.
• Click the question mark button in the upper right corner of a dialog box and then click an item in the dialog box to see the popup window.
• Some dialog boxes or warning messages have their own Help button, or a help text. Click the button or the text to get information on the dialog or message box.
Click anywhere to remove a context-sensitive popup Help window.
The product you have is a Special Edition of the world-renowned OmniPage Pro™ software. This edition has been developed for distribution by selected scanner manufacturers and contains a subset of the features of the OmniPage Pro 11 product. This Guide and the online Help describe the features of the full product, using an SE icon to document the differences between the two products.
If you find the additional features of the professional product would be of benefit to you, you can use online facilities to upgrade your Special Edition to OmniPage Pro 11.
Installation and setup
This chapter provides information on installing and starting OmniPage SE. It presents the following topics:
• System requirements
• Installing OmniPage SE
• Setting up your scanner with OmniPage SE
• How to start the program
• Registering your software
• New features in OmniPage Pro 11
• OmniPage SE and OmniPage Pro 11
You need the following minimum system requirements to install and run OmniPage SE:
• A computer with a Pentium or higher processor
• Microsoft Windows 95, Windows 98, Windows ME, Windows 2000, or Windows NT 4.0
• 32MB of memory (RAM), 64MB recommended
• 75MB of free hard disk space for the application files plus 10MB working space during installation
• 9MB for Microsoft Installer (MSI) if not present and 44MB for Internet Explorer if not present. (These are present as part of the operating system in Windows 98, Windows ME and Windows
• SVGA monitor with 256 colors and 800 x 600 pixel resolution
• Windows-compatible pointing device
• CD-ROM drive for installation
• A compatible scanner if you plan to scan documents. Please see the Scanner Guide at ScanSoft’s web site for a list of supported scanners.
Note: Performance and speed will be enhanced if your computer’s processor, memory, and available disk space exceed minimum requirements.
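The disk-space figures in the list above can be checked before installation. The following is a minimal sketch in Python and is not part of OmniPage SE; the function names `required_mb` and `has_enough_space`, and the reading of 1MB as 1024 x 1024 bytes, are assumptions made for illustration.

```python
import shutil

MB = 1024 * 1024  # assumption: 1MB is taken as 1024 * 1024 bytes

# Figures taken from the requirements list above.
APP_FILES_MB = 75      # application files
INSTALL_WORK_MB = 10   # working space during installation
MSI_MB = 9             # Microsoft Installer, if not present
IE_MB = 44             # Internet Explorer, if not present


def required_mb(need_msi=False, need_ie=False):
    """Return the disk space in MB that installation needs."""
    total = APP_FILES_MB + INSTALL_WORK_MB
    if need_msi:
        total += MSI_MB
    if need_ie:
        total += IE_MB
    return total


def has_enough_space(free_bytes, need_msi=False, need_ie=False):
    """Compare the free bytes on the target drive with the requirement."""
    return free_bytes >= required_mb(need_msi, need_ie) * MB


if __name__ == "__main__":
    free = shutil.disk_usage(".").free  # free space on the current drive
    print("Required (MB):", required_mb(need_msi=True, need_ie=True))
    print("Enough space:", has_enough_space(free, need_msi=True, need_ie=True))
```

Passing `need_msi=True` and `need_ie=True` covers the worst case, in which neither component is already present on the system.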
OmniPage SE's installation program takes you through installation with instructions on every screen.
Before installing OmniPage SE:
• Make sure your scanner is connected, turned on, and compatible with your system.
• Close all other applications, especially anti-virus programs.
• Log into your computer with administrator privileges if you are installing on Windows 2000 or Windows NT.
• If you have previous OmniPage software on your system, the installer will ask for your consent to uninstall that software first.
To install OmniPage SE:
1. Insert OmniPage SE’s CD-ROM in the CD-ROM drive. The installation program should start automatically. If it does not start, locate your CD-ROM drive in Windows Explorer and double-click the Autorun.exe program at the top level of the CD-ROM.
2. Choose a language to use during installation. This language will be used for the Text-to-Speech system and as the program’s interface language. The program interface language is used for displays such as menu items, dialog boxes, warning messages and so on. You can change the interface language later from within OmniPage SE, but your choice at installation time determines which Text-to-Speech system will be installed with the program. References to the Text-to-Speech facility do not apply to OmniPage SE.
3. Follow the instructions on each screen to install the software. All files needed for scanning are copied automatically during installation.
Note: Sometimes uninstalling and then reinstalling OmniPage SE will solve a problem. See Uninstalling the software at the end of chapter 6.
Note: In OmniPage Pro 11, Text-to-Speech is available for English (British and US), French, German, Italian, Portuguese or Spanish. This is not available in OmniPage SE. See also the section Reading text aloud in chapter 4.
All files needed for scanner setup and support are copied automatically during the program’s installation. Before using OmniPage SE for scanning, your scanner should be correctly installed and tested for correct functionality.
Scanner installation and setup are done through the Scanner Wizard. You can start this yourself, as described below. Otherwise, the Scanner Wizard appears when you first attempt to perform scanning from OmniPage SE.
Please follow these steps to use the Scanner Wizard to set up your scanner with OmniPage SE:
• Choose Start > Programs > ScanSoft OmniPage SE > Scanner Wizard, or click the Setup button in the Scanner panel of the Options dialog box, or choose a scan command in the Get Page drop-down list in the OmniPage Toolbox.
• Choose Select scanning source, then click Next.
• Click once on your scanner's TWAIN driver to select it, then click Next.
• Choose Yes to test your scanner configuration, then click Next.
• The wizard will now test the connection from the computer to your scanner. Click on Next.
• Insert a test page into your scanner.
• The wizard is now prepared to do a basic scan using your scanner manufacturer’s software. Click on Next.
• Your scanner’s native user-interface will appear. Click on Scan to begin the sample scan.
• If necessary, click on Inverse Image or Missing Image and make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
• Select the item that most appropriately describes your scanner, then click on Next.
• Click on Next to proceed to page size.
• The page sizes that the Scanner Wizard believes that your scanner supports are listed in the window. To make any changes to the page sizes, click on Advanced, make the changes and then click on Next.
• Insert a page with text but no pictures into your scanner. Click on Next to begin a scan in black and white mode.
• If necessary, click on Inverse Image or Missing Image and make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
• If you have a color scanner, insert a color photograph or a page with a color picture into your scanner. Click on Next to begin a scan in color mode. If necessary, click on Inverse Image or Missing Image and make the appropriate selections. Once the image appears correctly in the window, click on Next. If your scanner cannot scan in color, skip this step.
• Insert a photograph or a page containing a picture into your scanner. Click on Next to begin a scan in grayscale mode. If necessary, click on Inverse Image or Missing Image and make the appropriate selections. Once the image appears correctly in the window, click on Next.
• You have successfully configured your scanner to work with OmniPage SE! Click on Finish.
To change the scanner settings at a later time, or to set up a different scanner, or to test and repair an installed scanner, please follow one of these two methods to reopen the Scanner Wizard:
• Start > Programs > ScanSoft OmniPage SE > Scanner Wizard, or
• Start > Programs > ScanSoft OmniPage SE > OmniPage > Tools menu > Options > Scanner > Setup button.
Note: To test and repair an improperly functioning scanner, follow the procedure above, selecting Test and configure current scanning source at the start of the process.
To start OmniPage SE do one of the following:
• Click Start in the Windows taskbar and choose Programs > ScanSoft OmniPage SE > OmniPage SE.
• Double-click the OmniPage SE icon in the program's installation folder or on the Windows desktop if you placed it there.
• Double-click an OmniPage Document (OPD) icon or file name; the clicked document is loaded into the program. See OmniPage Documents in chapter 2.
On opening, OmniPage SE’s title screen is displayed and then its desktop. See chapter 2 for an introduction to OmniPage SE’s desktop.
There are several ways of running the program with a limited interface:
• Use the Schedule OCR program. Click Start in the Windows taskbar and choose Programs > ScanSoft OmniPage SE > Schedule OCR. See Processing documents with Schedule OCR in chapter 3.
• Click Acquire Text from the File menu of an application registered with the Direct OCR facility. See How to set up Direct OCR in chapter 3.
• Right-click an image file icon or file name for a shortcut menu. Select a sub-menu item from Convert To... to define a target.
• Use OmniPage SE with ScanSoft’s PaperPort or Pagis® document management products, to add OCR services. See How to use OmniPage SE with your PaperPort software in chapter 3.
ScanSoft’s registration Wizard runs at the end of installation. We provide an easy electronic form that can be completed in less than five minutes.
When you have filled in the form and clicked Send, the program searches for an Internet connection to perform the registration online immediately.
If you did not register the software during installation, you will be periodically invited to register later. You can also register online at ScanSoft's web site: click on Support and from the main support screen choose Register in the left-hand column.
For a statement on the use of your registration data, please see ScanSoft’s Privacy Policy.
The OmniPage® product family is augmented by OmniPage Pro 11 and OmniPage SE. This section lists enhancements introduced in the professional product OmniPage Pro 11. Some of these are incorporated in OmniPage SE, as detailed in the next section.
New features in OmniPage Pro 11 compared to OmniPage Pro 10 are:
• Greater accuracy - redeveloped recognition engines make OmniPage Pro 11 the most accurate OmniPage ever.
• Improved page layout - OmniPage Pro 11 will allow you to retain formatting that is true to the original, even on pages with non-gridded tables, headers and footers and dropped capitals.
• More intelligent proofreading - new IntelliTrain feature automatically uses previous corrections to generate better OCR results.
• PDF capability - now you can import PDF files (even read-only files) and convert them to your favorite program files (Word, Excel, etc.). You can also create PDF files from any paper document or image files.
• Better HTML - new WYSIWYG (What You See Is What You Get) HTML output will handle graphics, text, and backgrounds to keep your web output looking like the original document.
• Language support - OmniPage Pro 11 now supports over 100 languages and extends to the Greek and Cyrillic alphabets.
• Detail view - this provides more customizable information about each page, making it easier to handle pages in a document.
• Text Editor - a new fully-featured WYSIWYG editor for recognition results, with a wide range of editing tools, color support, and a choice of four formatting levels for display and export.
• Better results on degraded text - a new despeckle module significantly reduces errors on spotty, shaded and color backgrounds.
This list documents features which are not incorporated in OmniPage SE, but which can become available by upgrading to OmniPage Pro 11:
• Significant improvement in recognition accuracy.
• Access to the IntelliTrain character training facility.
• Ability to open and read the contents of PDF files.
• Ability to save recognized documents to PDF format.
• Ability to open TIFF FX image files.
• Handling LZW TIFF and GIF image files for input and output.
• Support for WYSIWYG HTML 4.0 output.
• Language support rises from about 50 to over a hundred.
• Access to text-to-speech software, allowing recognized texts to be read aloud.
For more information or to upgrade, please visit ScanSoft's web site, make a selection from the country/continent list if you prefer a different language, then click on the OmniPage icon.
You probably use your computer for business correspondence, preparing reports, handling data and an ever-increasing number of other uses. The challenge is that, in spite of the digital revolution, certain sources of information still circulate in printed, paper form and cannot be used immediately in a computer.
For example, if you want to incorporate information from a magazine article in a report you are preparing, you somehow have to get the text from the article into your computer. Painstakingly retyping the article is not an appealing solution.
This chapter introduces you to the solution: optical character recognition (OCR). It describes how OmniPage SE uses OCR technology to transform text from scanned pages or image files into editable text for use in your favorite computer applications.
The chapter includes the following sections:
• What is optical character recognition
• Documents in OmniPage SE
• Basic processing steps
• The OmniPage SE desktop
• Managing documents
• OmniPage Documents
• Settings
Optical character recognition is the process of extracting text from an image. This image can result from scanning a paper document or opening an electronic image file. Images do not have editable text characters; they have many tiny dots (pixels) that together form character shapes. These present a picture of the text on a page.
During OCR, OmniPage SE analyzes the character shapes in an image and defines solutions to produce editable text. After OCR, you can save the resulting text to a variety of word-processing, desktop publishing or spreadsheet applications.
OmniPage SE’s OCR capabilities
In addition to text recognition, OmniPage SE can retain the following elements of a document through the OCR process.
Graphics
Photos, logos, and drawings are examples of graphics.
Text formatting
Font types, sizes and styles (such as bold, italic and underlines) are examples of character formatting. Indents, tabs, margins and line spacing are examples of paragraph formatting.
Page formatting
Column structure, table formats, and placement of graphics and headings are examples of page formatting.
The graphics, text and page formatting elements that OmniPage SE retains are determined by the settings you select. Refer to the Settings Guidelines in the online Help for more information about selecting settings.
Note: OmniPage SE only recognizes machine-generated characters such as offset or laser-printed or typewritten text. However, it can retain handwritten text, such as a signature, as a graphic.
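The idea that an image holds only pixels, which OCR must turn into characters, can be illustrated with a toy example. The sketch below uses simple template matching on 5 x 5 bitmaps; this is an illustrative technique chosen for brevity and is not a description of OmniPage SE's recognition engine. The names `TEMPLATES`, `mismatches` and `recognize` are hypothetical.

```python
# Each glyph is a tiny 5x5 bitmap: '#' is a dark pixel, '.' is a light one.
TEMPLATES = {
    "I": ["#####",
          "..#..",
          "..#..",
          "..#..",
          "#####"],
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
    "T": ["#####",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
}


def mismatches(a, b):
    """Count the pixels that differ between two bitmaps of equal size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(bitmap):
    """Return the template character whose pixels differ least."""
    return min(TEMPLATES, key=lambda ch: mismatches(bitmap, TEMPLATES[ch]))


# A scanned "T" with one speck of noise in the lower left corner.
noisy_t = ["#####",
           "..#..",
           "..#..",
           "..#..",
           "#.#.."]

print(recognize(noisy_t))  # prints "T"
```

The noisy bitmap differs from the "T" template in one pixel only, so it is still read as "T"; a real engine has to cope with far more variation in fonts, sizes and image quality.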
Documents in OmniPage SE
OmniPage SE handles documents one at a time. When you acquire your first image (from scanner or from file) a new document is started. Further acquired images are added to the same document, until you save and close it.
A document in OmniPage SE consists of one image for each document page. After you perform OCR, the document will also contain recognized text, displayed in the Text Editor, possibly along with graphics and tables. For more information on screen areas, see the section The OmniPage SE desktop.
Basic processing steps
There are two main ways of handling documents: with automatic processing or manual processing. See chapter 3, Processing documents automatically and Processing documents manually. The basic steps for both processing methods are broadly the same:
1. Bring a set of images into OmniPage SE.
You can scan a paper document with or without an Automatic Document Feeder (ADF) or load one or more image files. The resulting images appear in miniature in the Document Manager’s Thumbnail view and the pages are summarized in its Detail view. The image of the current page is displayed in the Original Image area.
2. Perform OCR to generate editable text.
During OCR, OmniPage SE creates zones around elements on the page that will be processed, and then interprets text characters or graphics in each zone. Manual and template zoning are also possible. After OCR, you can check and correct errors in the document using the OCR Proofreader and edit the document in the Text Editor.
3. Export the document to the desired location.
You can save your document to a specified file name and type, place it on the Clipboard, or send it as a mail attachment. You can save it as an OmniPage Document (OPD) as described later. You can save the same document repeatedly to different destinations, different file types, with different settings and levels of formatting. See chapter 5.
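The three basic steps can be pictured as a small data model. The sketch below is an illustration only and does not reflect how OmniPage SE is implemented; the classes `Page` and `Document` and the placeholder function `fake_ocr` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    image: str        # stands in for the page image
    text: str = ""    # filled in by recognition


@dataclass
class Document:
    pages: list = field(default_factory=list)

    def acquire(self, image):
        """Step 1: each acquired image becomes a new page."""
        self.pages.append(Page(image))

    def recognize(self, ocr):
        """Step 2: run the supplied OCR function on every page."""
        for page in self.pages:
            page.text = ocr(page.image)

    def export(self):
        """Step 3: return the recognized text of all pages, in order."""
        return "\n".join(page.text for page in self.pages)


def fake_ocr(image):
    """Placeholder for a recognition engine (hypothetical)."""
    return "text of " + image


doc = Document()
doc.acquire("scan-1")
doc.acquire("scan-2")
doc.recognize(fake_ocr)
print(doc.export())
```

Because the document keeps its pages after export, `export` can be called again, which mirrors saving the same document repeatedly to different destinations.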
OmniPage SE's desktop has a title bar and a menu bar along the top and a status bar along the bottom. It has three main working areas, separated by splitters: the Document Manager, the Original Image area and the Text Editor. The Document Manager has two tabbed panels: Thumbnail view and Detail view. The Original Image area has an Image toolbar and the Text Editor has a Formatting toolbar.
[Figure: the OmniPage SE desktop. Callouts:]
• Standard toolbar
• OmniPage Toolbox
• Image toolbar
• Formatting toolbar
• Thumbnail view shows a picture of each page in the document. The current page has a pale border. This page has been recognized.
• Page navigation buttons
• Buttons to show, hide or rearrange the working areas.
• Original Image area: This displays the image of the current page, together with any zones automatically or manually placed on the image.
• Drag this splitter to left or right to resize the working areas.
• The Text Editor view buttons offer four formatting levels.
• Text Editor: This is displaying the recognition results from the current page in True Page™ view.
Note: To control which of the three views (Document Manager, Original Image, and Text Editor) are displayed, check or uncheck each view from the View menu or with the status bar buttons.
The OmniPage Toolbox lets you control processing. It can have three states, depending on which of the three tab buttons on the left is clicked. In the picture, we display its appearance for Manual OCR. We show the program with a three-page document. Page one is the current page, which has been recognized and proofed. Page two has been recognized but not proofed yet. Page three has been acquired and manually zoned, but not recognized yet. The icons at the bottom right of the thumbnail images show page status.
Status bar buttons let you show, hide or rearrange the main screen areas and move to other pages in the document. A right mouse click in any screen area brings up a shortcut menu with the most useful commands for that area.
The Standard toolbar
The Standard toolbar contains buttons and a drop-down list for performing standard tasks. It can be floated and docked to any edge of the OmniPage SE desktop. All these functions can also be accessed from menus.
• New: start a new document.
• Open an OmniPage Document.
• Save the current document under the name and type of its last save.
• Print images or recognition results from all or selected pages.
• the recognized text.
• Cut the current selection in the Text Editor.
• Copy the current Text Editor selection.
• Paste selection into the Text Editor.
• Undo the last editing action.
• Open the Options dialog box.
• Zoom the active area: Original Image or Text Editor.
• Context-sensitive Help
The Menu bar
For concise information on any menu item, click the context-sensitive help button and then click a menu item. A popup text explains the purpose of the menu item. Click anywhere to close the popup.
The Image toolbar
The Image toolbar contains buttons that allow you to zoom in or out on the current image or to rotate it. They also allow you to work with zones and table dividers on the page. See chapter 3, Manual zoning and Table grids in the image. Here we summarize the purpose of the buttons. The Image toolbar can be floated (that is, undocked and moved anywhere on the desktop). It can be docked to any edge of the Original Image area.
• Draw rectangular zones.
• Draw irregular zones.
• Add to a zone or combine zones.
• Subtract from zone or separate zones.
• Reorder zones.
• Zone properties
• row or column dividers in a table.
• Insert column dividers in a table.
• Insert row dividers in a table.
• Remove row or column dividers one by one.
• Remove/replace all row and column dividers.
• Rotate images.
• Zoom in on page image.
• Zoom out from page image.
Tip: You can also resize or rotate the original image with a shortcut menu. Right click in the Original Image area outside a zone and select a zoom or rotation value.
The Formatting toolbar
The Formatting toolbar contains buttons that allow you to edit recognized text in the Text Editor. See Text and image editing in chapter 4. Here we summarize the purpose of the buttons. The Formatting toolbar always remains along the top of the Text Editor.
• Paragraph styles
• Font name
• Font size
• Bold
• Underline
• Show/hide non-printing characters.
The OmniPage Toolbox
This Toolbox lets you drive the processing. By default it is located along the top of the OmniPage SE desktop, just above the working areas. It can be floated and also be docked along the bottom of the desktop.
It has three tabs on the left: AutoOCR, Manual OCR and OCR Wizard. Click one to see its controls in the Toolbox. The picture at the beginning of this section showed the OmniPage desktop with the Manual OCR toolbar. The AutoOCR toolbar looks like this.
Automatic processing is started, and can be stopped and re-started with the buttons on the right of the toolbar. The use of these buttons is explained in Processing documents automatically in chapter 3. The effects of other settings are also described in chapter 3, Tutorial: Processing
You can switch between automatic and manual processing any time the program is not busy with processing. That means you can switch between them while you are working within a document. You can automatically process some pages, then add more pages with manual processing. After processing a stack of pages automatically, you can inspect the results and then go back to reprocess certain pages manually. This procedure is described in chapter 3 in the section Processing a document automatically and finishing it manually.
OmniPage SE must be empty when you start the OCR Wizard. See the section Processing documents using the OCR Wizard in chapter 3. When you have used the OCR Wizard to process and save a document, it remains in the program and can be further processed (adding more pages, rerecognizing pages etc.) with either manual or automatic processing.
The Document Manager is situated on the left of the OmniPage SE desktop. It has two tabbed panels: Thumbnail view and Detail view. Click a tab to see its view. Both views summarize the pages in the document and are synchronized: the current and selected pages remain the same when you switch views. Our pictures show the two views with the same four-page document. Pages 1 and 2 are selected and page 4 is the current page, that is, the one shown in the Original Image area. The Document Manager shows page status with the following icons; each status has a Thumbnail icon and a Detail icon:
Page status: Page image has been...
1 Acquired: Acquired with no manual or template zones and has not yet been recognized.
2 Zoned: Acquired and manual or template zones have been placed; not yet recognized.
3 Recognized: Recognized, but not proofread, or proofing was interrupted on the page.
4 Proofed: Recognized, and proofing has reached the end of the page.
Thumbnail view
This presents a vertical set of numbered thumbnail images, one for each page in the document. Scroll to see pages as necessary. The current page has a paler background and its page number text appears bold. You can select multiple pages in the document; these have a ‘pushed-in’ appearance. A status icon appears at the bottom right of each page as described above.
Jump to a page: Click the icon of the desired page.
Reorder a page: Click the thumbnail of the page you want to move and drag it above the desired page number. Pages are renumbered automatically.
Delete a page: Select the thumbnail of the page you want to delete and press the Delete key.
Select multiple pages: Hold down the Shift key and click two thumbnails to select all pages between and including them. Hold down the Ctrl key as you click thumbnails to add pages to a selection one by one. Then you can move or delete the selected pages as a group, or send them to (re)recognition.
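The four page statuses listed earlier in this section follow from three yes/no facts about a page. The sketch below shows one way to express that rule in Python; the names `PageStatus` and `page_status` and the three parameters are hypothetical and are not taken from OmniPage SE.

```python
from enum import IntEnum


class PageStatus(IntEnum):
    ACQUIRED = 1
    ZONED = 2
    RECOGNIZED = 3
    PROOFED = 4


def page_status(has_zones, recognized, proofing_finished):
    """Derive a page status from three yes/no facts about the page.

    has_zones: manual or template zones have been placed on the page.
    recognized: OCR has been performed on the page.
    proofing_finished: proofing has reached the end of the page.
    """
    if recognized:
        # Interrupted proofing leaves the page at "Recognized".
        return PageStatus.PROOFED if proofing_finished else PageStatus.RECOGNIZED
    if has_zones:
        return PageStatus.ZONED
    return PageStatus.ACQUIRED
```

For example, `page_status(True, False, False)` gives `PageStatus.ZONED`, matching a page that has been acquired and manually zoned but not yet recognized.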
Detail view
This facility is new to OmniPage SE. It provides an overview of your document with a table. Each row represents one page. Columns present statistical or status information for each page, and (where appropriate) document totals. The picture below shows the default columns on the left and four columns which a user has specified.
[Figure: Detail view. Callouts:]
• Move the cursor onto the page’s status icon to see a thumbnail of the page.
• This shows the number of zones of each type on the page.
The current page is shown with a highlight. You can use Detail view for page operations, as follows:
Jump to a page: Click the row of the desired page.
Reorder a page: Click the row of the page you want to move and drag it to the desired location. An arrow indicator on the left shows where the page will be inserted. Pages are renumbered automatically.
Delete a page: Select the row of the page you want to delete and press the Delete key.
Select multiple pages: Hold down the Shift key and click two page rows to select all pages between and including them. Hold down the Ctrl key as you click rows to add pages to a selection one by one. Then you can move or delete the selected pages as a group, or send them to (re)recognition.
When multiple pages are being selected, the page set as current does not change. All selected pages are highlighted.
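The selection and reordering rules described for Thumbnail view and Detail view can be summarized in a few lines of Python. This is a minimal sketch for illustration; the function names are hypothetical, and page numbers are assumed to be 1-based positions in a list.

```python
def shift_click(first, second):
    """Select all pages between and including the two clicked pages."""
    low, high = sorted((first, second))
    return set(range(low, high + 1))


def ctrl_click(selection, page):
    """Add one page to an existing selection."""
    return selection | {page}


def move_page(pages, from_number, to_number):
    """Move a page so that it ends up at page number to_number.

    A page number is simply a 1-based position in the list, so the
    other pages are renumbered automatically by the move.
    """
    reordered = list(pages)
    page = reordered.pop(from_number - 1)
    reordered.insert(to_number - 1, page)
    return reordered


print(sorted(shift_click(4, 2)))                # [2, 3, 4]
print(sorted(ctrl_click({1, 2}, 4)))            # [1, 2, 4]
print(move_page(["a", "b", "c", "d"], 4, 2))    # ['a', 'd', 'b', 'c']
```

Keeping the page number implicit in the list position is what makes renumbering automatic: no page stores its own number.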
Tip: Get image size information by hovering the cursor over a thumbnail or outside a zone on an original image. A popup text displays the image size in pixels and the program’s unit of measurement. Image resolution is also shown.