Nuance ScanSoft OmniPage Pro 12.0 User Manual


LEGAL NOTICES

Copyright © 2002 ScanSoft, Inc. All rights reserved. No part of this publication may be transmitted, transcribed, reproduced, stored in any retrieval system or translated into any language or computer language in any form or by any means, mechanical, electronic, magnetic, optical, chemical, manual, or otherwise, without prior written consent from ScanSoft, Inc., 9 Centennial Drive, Peabody, Massachusetts 01960. Printed in the United States of America and in the Netherlands.

The software described in this book is furnished under license and may be used or copied only in accordance with the terms of such license.

IMPORTANT NOTICE

ScanSoft, Inc. provides this publication "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability or fitness for a particular purpose. Some states or jurisdictions do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you. ScanSoft reserves the right to revise this publication and to make changes from time to time in the content hereof without obligation of ScanSoft to notify any person of such revision or changes.

TRADEMARKS AND CREDITS

ScanSoft, OmniPage, OmniPage Pro, PaperPort, Pagis, True Page and Direct OCR are registered trademarks or trademarks of ScanSoft, Inc., in the United States and/or other countries.

All other company names or product names referenced herein may be the trademarks of their respective holders.

ScanSoft, Inc.

9 Centennial Drive

Peabody, MA 01960

U.S.A.

ScanSoft Belgium BVBA

Guldensporenpark 32

BE-9820 Merelbeke

Belgium

Part Number 50-281201-00A

CONTENTS

WELCOME  7
  Using this Guide  8
  Getting online Help  9
  Online HTML Help  9
  Context-Sensitive Help  9
  Tech Notes  10
  Glossary  10

1  INSTALLATION AND SETUP  11
  System requirements  12
  Installing OmniPage Pro  13
  Setting up your scanner with OmniPage Pro  14
  How to start the program  16
  Registering your software  17
  New features in OmniPage Pro 12  17

2  INTRODUCTION  19
  What is optical character recognition  20
  OmniPage Pro’s OCR capabilities  20
  Documents in OmniPage Pro  21
  Basic processing steps  21
  The OmniPage Desktop  22
  The Menu bar  23
  The Toolbars  23
  The Image Panel  24
  The Text Editor  24
  The OmniPage Toolbox  25
  Managing documents  26
  Thumbnails  26
  Document Manager  27
  Customizing Document Manager columns  28
  Deleting pages from a document  28
  Printing a document  29
  Closing a document  29
  OmniPage Documents  29
  Why save to OPD  30
  How to save to OPD  30
  Settings  31

3  PROCESSING DOCUMENTS  33
  Quick Start Guide  34
  Loading and recognizing sample image files  34
  Scanning and recognizing a single page  34
  Processing overview  36
  Automatic processing  38
  Stopping and restarting automatic processing  39
  Manual processing  40
  Combined processing  41
  Processing with the OCR Wizard  43
  Processing from other applications  45
  How to set up Direct OCR  45
  How to use Direct OCR  45
  How to use OmniPage Pro with PaperPort  46
  Processing with Schedule OCR  47
  Defining the source of page images  48
  Input from image files  48
  Input from scanner  49
  Scanning with an ADF  50
  Scanning without an ADF  51
  Describing the layout of the document  51
  Zones and backgrounds  53
  Automatic zoning  53
  Manual zoning  54
  Zone types and properties  55
  Working with zones  57
  Table grids in the image  59
  Using zone templates  61

4  PROOFING AND EDITING  63
  The editor display and views  64
  Proofreading OCR results  65
  Verifying text  66
  User dictionaries  68
  Training  69
  Manual training  69
  IntelliTrain  70
  Training files  71
  Text and image editing  72
  On-the-fly editing  74
  Reading text aloud  75

5  SAVING AND EXPORTING  77
  Saving original images  78
  Saving recognition results  79
  Saving a document as you work  80
  Selecting a formatting level  81
  Selecting advanced saving options  82
  Saving to PDF  84
  Copying pages to Clipboard  84
  Sending pages by mail  85

6  TECHNICAL INFORMATION  87
  Troubleshooting  88
  Solutions to try first  88
  Testing OmniPage Pro  89
  Increasing memory resources  90
  Increasing disk space  90
  Text does not get recognized properly  91
  Problems with fax recognition  92
  System or performance problems during OCR  92
  ODMA support  93
  Advanced features in Schedule OCR  93
  Supported file types  94
  File types for opening and saving images  94
  File types for saving recognition results  95
  Uninstalling the software  96

INDEX  97

Welcome

Welcome to OmniPage Pro®, and thank you for using our software! The following documentation has been provided to help you get started and give you an overview of the program.

This User’s Guide

This guide introduces you to using OmniPage Pro 12. It includes installation and setup instructions, a description of the program’s commands and working areas, task-oriented instructions, ways to customize and control processing, and technical information. The guide is presented in PDF format, allowing you to use hyperlink jumps on cross-references and other navigation tools in your PDF viewer.

Online Help

OmniPage Pro’s online Help contains information on features, settings, and procedures. The online Help is provided as HTML help, and has been designed for quick and easy information retrieval. Comprehensive context-sensitive help aims to provide just enough assistance to let you keep working without delay. See “Getting online Help” on page 9.

Readme File

The Readme file contains last-minute information about the software. Please read it before using OmniPage Pro. To open this HTML file, choose Readme in the OmniPage Pro Installer or afterwards in the Help menu.

Scanning and other information

ScanSoft’s web site at www.scansoft.com provides timely information on the program. The Scanner Guide contains updated information about supported scanners and related issues; ScanSoft tests the 25 most widely used scanner models. Access ScanSoft’s web site from the OmniPage Pro Installer or afterwards from the Help menu.

OmniPage Pro User’s Guide


Using this Guide

This guide is written with the assumption that you know how to work in the Microsoft Windows environment. Please refer to your Windows documentation if you have questions about how to use dialog boxes, menu commands, scroll bars, drag and drop functionality, shortcut menus, and so on.

We also assume you are familiar with your scanner and its supporting software, and that the scanner is installed and working correctly before it is set up with OmniPage Pro 12. Please refer to the scanner’s own documentation as necessary.

The following conventions are used in this guide:

Bold: Introduces new terms and presents sub-headings.

Italic: Names topics in the online Help system. Presents longer option texts in dialog boxes.

Non-serif: Presents file names: sample.tif

A note presents an item of additional information.

A tip presents ideas for using program features to accomplish specific tasks.


Getting online Help

In addition to using this guide, you can use OmniPage Pro’s online Help to learn about features, settings, and procedures. Online Help is available after you install OmniPage Pro.

Online HTML Help

Open OmniPage Pro’s online Help at its top level by choosing Help Topics at the top of the Help menu. This allows you to see topics arranged in a Table of Contents, search an alphabetical list of keywords or make full-text searches through the topics. Other items in the Help menu provide access to useful topics or web pages.

Press F1 as you are working with the program to see an online help topic relating to the current screen area, dialog box or warning message.

Context-Sensitive Help

You can get concise on-the-spot information in a popup window about a particular OmniPage Pro menu item, toolbar button, screen area or dialog box, in the following ways:

Click the Help tool in the Standard toolbar to get the help icon. Click this on any item on the desktop outside a dialog box or warning message.

Press Shift + F1 to get the same help icon. Use Shift + F1 to get context-sensitive help for shortcut menu items.

Click the question mark button in the upper right corner of a dialog box and then click an item in the dialog box to see the popup window.

Some dialog boxes or warning messages have their own Help button, or a help text. Click the button or the text to get information on the dialog or message box.

Click anywhere to remove a context-sensitive popup Help window.


Tech Notes

ScanSoft’s web site at www.scansoft.com contains Tech Notes on commonly reported issues using OmniPage Pro 12. Web pages may also offer assistance on the installation process and troubleshooting.

Glossary

This guide does not include a glossary. The online Help has a comprehensive glossary, with its own alphabetical index and a table of contents. Please consult it if you want to find the meaning of a term used in this guide or in the program.


Chapter 1

Installation and setup

This chapter provides information on installing and starting OmniPage Pro 12. It presents the following topics:

• System requirements
• Installing OmniPage Pro
• Setting up your scanner with OmniPage Pro
• How to start the program
• Registering your software
• New features in OmniPage Pro 12


System requirements

You need the following minimum system requirements to install and run OmniPage Pro 12:

• A computer with a Pentium or higher processor
• Microsoft Windows 98 (from second edition), Windows Me, Windows NT 4.0 (with at least Service Pack 6), Windows 2000 or Windows XP
• 64MB of memory (RAM), 128MB recommended
• 90MB of free hard disk space for the application files plus 5MB working space during installation
• 5MB for Microsoft Installer (MSI) if not present (This is present as part of the operating system in Windows Me, Windows 2000 and Windows XP)
• SVGA monitor with 256 colors, but preferably 16-bit color (called High Color in Windows 2000 and Medium Color in XP) and 800 x 600 pixel resolution
• Windows-compatible pointing device
• CD-ROM drive for installation
• A compatible scanner with its own scanner driver software, if you plan to scan documents. Please see the Scanner Guide at ScanSoft’s web site (www.scansoft.com) for a list of supported scanners.

Performance and speed will be enhanced if your computer’s processor, memory, and available disk space exceed minimum requirements.
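The memory and disk figures above can be added up and checked against a machine description. The following is only a minimal sketch: the `Machine` record, the Windows version labels and the function names are hypothetical and are not part of OmniPage Pro or its installer. It also assumes the worst case that Microsoft Installer is missing on any Windows version other than Me, 2000 and XP.

```python
from dataclasses import dataclass

# Figures from the requirements list; all names here are hypothetical.
MIN_RAM_MB = 64
APP_DISK_MB = 90        # application files
INSTALL_WORK_MB = 5     # working space during installation
MSI_DISK_MB = 5         # Microsoft Installer, only if not already present
SUPPORTED_WINDOWS = {"98SE", "Me", "NT4SP6", "2000", "XP"}  # illustrative labels
MSI_BUILT_IN = {"Me", "2000", "XP"}


@dataclass
class Machine:
    windows: str
    ram_mb: int
    free_disk_mb: int


def required_disk_mb(windows: str) -> int:
    """Disk space needed, assuming MSI is absent unless it ships with Windows."""
    extra = 0 if windows in MSI_BUILT_IN else MSI_DISK_MB
    return APP_DISK_MB + INSTALL_WORK_MB + extra


def check(machine: Machine) -> list:
    """Return a list of unmet minimum requirements (empty if all are met)."""
    problems = []
    if machine.windows not in SUPPORTED_WINDOWS:
        problems.append(f"unsupported Windows version: {machine.windows}")
    if machine.ram_mb < MIN_RAM_MB:
        problems.append(f"needs {MIN_RAM_MB}MB RAM, found {machine.ram_mb}MB")
    needed = required_disk_mb(machine.windows)
    if machine.free_disk_mb < needed:
        problems.append(f"needs {needed}MB free disk, found {machine.free_disk_mb}MB")
    return problems


print(required_disk_mb("XP"))    # 90 + 5
print(required_disk_mb("98SE"))  # 90 + 5 + 5
print(check(Machine("98SE", 32, 99)))
```

Under these assumptions a Windows XP machine needs 95MB of free disk space and a Windows 98 Second Edition machine needs 100MB.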


Installing OmniPage Pro

OmniPage Pro 12’s installation program takes you through installation with instructions on every screen.

Before installing OmniPage Pro:

• Close all other applications, especially anti-virus programs.
• Log into your computer with administrator privileges if you are installing on Windows NT, 2000 or XP.
• If you own a previous version of OmniPage Pro, or if you are upgrading from demonstration software or an OmniPage Special Edition, the installer asks your consent to uninstall that product.

To install OmniPage Pro:

1. Insert OmniPage Pro’s CD-ROM in the CD-ROM drive. The installation program should start automatically. If it does not start, locate your CD-ROM drive in Windows Explorer and double-click the Autorun.exe program at the top-level of the CD-ROM.

2. Choose a language to use during installation. This language will be used for the Text-to-Speech system and as the program’s interface language. The program interface language is used for displays such as menu items, dialog boxes, warning messages and so on. You can change the interface language later from within OmniPage Pro 12, but your choice at installation time determines which Text-to-Speech system will be installed with the program. See the second note below.

3. Follow the instructions on each screen to install the software. All files needed for scanning are copied automatically during installation.

Sometimes uninstalling and then reinstalling OmniPage Pro will solve a problem. See “Uninstalling the software” on page 96.

It is planned to provide Text-to-Speech for English, French, German, Italian, Portuguese and Spanish. This may vary depending on region or version. The Readme file provides latest information. A speech system for only one language can be installed with OmniPage Pro. See “Reading text aloud” on page 75.


Setting up your scanner with OmniPage Pro

All files needed for scanner setup and support are copied automatically during the program’s installation. Before using OmniPage Pro 12 for scanning, your scanner should be installed with its own scanner driver software and tested for correct functionality. Scanner driver software is not included with OmniPage Pro.

Scanner installation and setup are done through the Scanner Wizard. You can start this yourself, as described below. Otherwise, the Scanner Wizard appears when you first attempt to perform scanning.

Please follow these steps to use the Scanner Wizard to set up your scanner with OmniPage Pro 12:

• Choose Start > Programs > ScanSoft OmniPage Pro 12.0 > Scanner Wizard, or click the Setup button in the Scanner panel of the Options dialog box, or choose a scan setting in the Get Page drop-down list in the OmniPage Toolbox and click the Get Page button.

The Scanner Setup Wizard starts. The first panel appears only on first setup when called from inside OmniPage Pro.

• Choose ‘Select scanner or digital camera’, then click Next. You see a list of all detected TWAIN scanner drivers, with the system default scanner selected.
• Click once to select the driver of the scanner you want to use. Click ‘Other drivers...’ if you need to browse for a driver. Select ‘Configure Advanced Settings’ for an extra panel if you want your scanner’s own interface to be hidden during scanning or to modify the image transfer method. Click on Next.
• Choose Yes to test your scanner configuration, then click Next. The wizard will now test the connection from the computer to your scanner. When completed, click on Next.
• Insert a test page into your scanner. The wizard is now prepared to do a basic scan using your scanner manufacturer’s software. Click on Next. Your scanner’s native user-interface will appear.


• Click on Scan to begin the sample scan.
• If necessary, click on Inverse Image… or Missing Image… and make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
• Select the item that most appropriately describes your scanner, then click on Next.
• Click on Next to proceed to page size.
• The page sizes that the Scanner Wizard believes your scanner to support are listed in the window. To make any changes to the page sizes, click on Advanced, make the changes and then click on Next.
• Insert a page with text but no pictures into your scanner. Click on Next to begin a scan in black-and-white mode.
• If necessary, click on Inverse Image… or Missing Image… and make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
• If you have a color scanner, insert a color photograph or a page with a color picture into your scanner. Click on Next to begin a scan in color mode. If necessary, click on Inverse Image… or Missing Image… and make the appropriate selections. Once the image appears correctly in the window, click on Next. If your scanner cannot scan in color, skip this step.
• Insert a photograph or a page containing a picture into your scanner. Click on Next to begin a scan in grayscale mode. If necessary, click on Inverse Image… or Missing Image… and make the appropriate selections. Once the image appears correctly in the window, click on Next.
• You have successfully configured your scanner to work with OmniPage Pro 12! Click on Finish.

To change the scanner settings at a later time, or to set up a different scanner, reopen the Scanner Setup Wizard from the Windows Start menu or from the Scanner panel of the Options dialog box. To test and repair an improperly functioning scanner, open the Scanner Setup Wizard from the Windows Start menu and select ‘Test scanner or digital camera’ in the first panel, then work through the procedure described above.


How to start the program

To start OmniPage Pro 12 do one of the following:

• Click Start in the Windows taskbar and choose Programs > ScanSoft OmniPage Pro 12.0 > OmniPage Pro 12.0.
• Double-click the OmniPage Pro icon in the program’s installation folder or on the Windows desktop if you placed it there.
• Double-click an OmniPage Document (OPD) icon or file name; the clicked document is loaded into the program. See “OmniPage Documents” on page 29.

On opening, OmniPage Pro’s title screen is displayed and then its desktop. See “The OmniPage Desktop” on page 22. It provides an introduction to the program’s main working areas.

There are several ways of running the program with a limited interface:

• Use the Schedule OCR program. Click Start in the Windows taskbar and choose Programs > ScanSoft OmniPage Pro 12.0 > Schedule OCR. See “Processing with Schedule OCR” on page 47.
• Click Acquire Text from the File menu of an application registered with the Direct OCR™ facility. See “How to set up Direct OCR” on page 45.
• Right-click an image file icon or file name for a shortcut menu. Select a sub-menu item from ‘Convert To...’ to define a target.
• Use OmniPage Pro 12 with ScanSoft’s PaperPort® or Pagis® document management products, to add OCR services. See “How to use OmniPage Pro with PaperPort” on page 46.


Registering your software

ScanSoft’s registration Wizard runs at the end of installation. We provide an easy electronic form that can be completed in less than five minutes. When you have filled in the form and clicked Send, the program searches for an Internet connection to perform the registration online immediately.

If you did not register the software during installation, you will be periodically invited to register later. You can go to www.scansoft.com to register online. Click on Support and from the main support screen choose Register on the left-hand column.

For a statement on the use of your registration data, please see ScanSoft’s Privacy Policy.

New features in OmniPage Pro 12

The OmniPage® product family is augmented by OmniPage Pro 12. If you are upgrading, you may not need to consult this guide very much. Here are some main areas of innovation compared to OmniPage Pro 11:

• Dramatic increase in accuracy

Improved synergy between recognition engines, support for professional dictionaries and the ability to train characters chosen by the user boost accuracy to new levels.

• Streamlined interface

Automatic and manual processing are now driven directly from the OmniPage Toolbox without separate toolbars. See page 25. Thumbnails now display in the Image Panel; choose to see the current page, thumbnails or both. See page 26. The previous Detail view becomes the Document Manager and includes a Note column for comments and searchable keywords.

• New zoning concepts

On-the-fly zoning allows zone changes to be processed immediately without having to re-recognize the whole page. See page 74. Page backgrounds are defined as process (auto-zone) or ignore, so all zoning instructions appear on the page and can be saved to zone templates. See page 53. Irregular zones can be drawn and zones split and joined more simply, without the need for separate tools. See page 57.

• Better proofing and verifying

The Proofing dialog box now shows suspect words in a wider context. A dynamic verifier can stay open as text is being checked, with the image display and window tracking the editing position. See page 65.

• Formatting levels for display and saving

There are three formatting levels for Text Editor display. See page 64. The output formatting level is now chosen at export time; the choices depend on the specified file type. An export choice ‘Flowing Page’ is an improved version of the previous ‘Retain Flowing Columns’ view. It preserves page layout without boxes and frames whenever possible, so text can flow between columns. See page 81.

• Superior page analysis

The transfer of table formatting has improved, in particular the detection of tables without gridlines in original pages. Web and e-mail addresses can be detected and transferred to the Text Editor; hyperlinks can be inserted. Reading order can now be viewed and changed after recognition in the Text Editor’s True Page® view. See from page 72.

• Improved PDF handling

OmniPage Pro 12 searches background text in PDFs it opens, to deliver higher recognition accuracy. A new file type ‘PDF Edited’ allows good format retention on pages that were modified in the Text Editor after recognition.

• Advanced saving options

A wider range of saving options is offered for each output file type. User-defined output file types can be created with customized settings. See page 82. If your edition of OmniPage Pro 12 includes the new saving formats XML and eBook, see page 95.


Chapter 2

Introduction

You probably use your computer for business correspondence, preparing reports, handling data and an ever-increasing number of other uses. The challenge is that, in spite of the digital revolution, certain sources of information still circulate in printed, paper form and cannot be used immediately in a computer.

For example, if you want to incorporate information from a magazine article in a report you are preparing, you somehow have to get the text from the article into your computer. Painstakingly retyping the article is not an appealing solution.

This chapter introduces you to the solution: optical character recognition (OCR). It describes how OmniPage Pro 12 uses OCR technology to transform text from scanned pages or image files into editable text for use in your favorite computer applications.

We present the following topics:

• What is optical character recognition
    Documents in OmniPage Pro
    Basic processing steps
• The OmniPage Desktop
• Managing documents
• OmniPage Documents
• Settings


What is optical character recognition

Optical character recognition is the process of extracting text from an image. This image can result from scanning a paper document or opening an electronic image file. Images do not have editable text characters; they have many tiny dots (pixels) that together form character shapes. These present a picture of the text on a page.

During OCR, OmniPage Pro 12 analyzes the character shapes in an image and defines solutions to produce editable text. After OCR, you can save the resulting text to a variety of word-processing, desktop publishing or spreadsheet applications.
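The idea that pixels form character shapes, and that recognition means choosing the best interpretation of a shape, can be illustrated with a toy example. This is only a sketch of simple template matching on tiny hand-made bitmaps. It is not how OmniPage Pro's recognition engines work, and every name in it is made up for the illustration.

```python
# Toy 3x5 glyph bitmaps: '#' is a dark pixel, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def matching_pixels(a, b):
    """Count the positions where two bitmaps of equal size agree."""
    return sum(pa == pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def recognize(glyph):
    """Return the template character that agrees with the glyph on most pixels."""
    return max(TEMPLATES, key=lambda ch: matching_pixels(glyph, TEMPLATES[ch]))


# A 'T' that lost one dark pixel, as might happen in a poor scan.
noisy_t = ["###", ".#.", ".#.", "...", ".#."]
print(recognize(noisy_t))  # prints "T"
```

The damaged shape still agrees with the "T" template on 14 of its 15 pixels, more than with any other template, so "T" is chosen.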

OmniPage Pro’s OCR capabilities

In addition to text recognition, OmniPage Pro can retain the following elements of a document through the OCR process.

Graphics

Photos, logos, and drawings are examples of graphics.

Text formatting

Font types, sizes and styles (such as bold, italic and underlines) are examples of character formatting. Indents, tabs, margins and line spacing are examples of paragraph formatting.

Page formatting

Column structure, table formats, and placement of graphics and headings are examples of page formatting.

The graphics, text and page formatting elements that OmniPage Pro retains are determined by the settings you select. Refer to the Settings Guidelines in the online Help for more information about selecting settings.

OmniPage Pro only recognizes machine-generated characters such as offset or laser-printed or typewritten text. However, it can retain handwritten text, such as a signature, as a graphic.


Documents in OmniPage Pro

OmniPage Pro 12 handles documents one at a time. When you acquire your first image (from scanner or from file) a new document is started. Further acquired images are added to the same document, until you save and close it.

A document in OmniPage Pro consists of one image for each document page. After you perform OCR, the document will also contain recognized text, displayed in the Text Editor, possibly along with graphics and tables. See “The OmniPage Desktop” on page 22.

Basic processing steps

There are two main ways of handling documents: with automatic processing or manual processing. See “Automatic processing” on page 38 and “Manual processing” on page 40. The basic steps for both processing methods are broadly the same:

1. Bring a set of images into OmniPage Pro.

You can scan a paper document with or without an Automatic Document Feeder (ADF) or load one or more image files. The resulting images can appear as thumbnails in the Image Panel along with the image of the first page entered. The document pages are summarized in the Document Manager. See “Defining the source of page images” on page 48.

2. Perform OCR to generate editable text.

During OCR, OmniPage Pro creates zones around elements on the page that will be processed, and then interprets text characters or graphics in each zone. Manual and template zoning are also possible. After OCR, you can check and correct errors in the document using the OCR Proofreader and edit the document in the Text Editor.

3. Export the document to the desired location.

You can save your document to a specified file name and type, place it on the Clipboard, or send it as a mail attachment. You can save it as an OmniPage Document (OPD) as described later. You can save the same document repeatedly to different destinations, different file types, with different settings and levels of formatting. See “Saving and exporting” on page 77.
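The three steps can be pictured as a small pipeline acting on a document that holds one image per page. The sketch below is a conceptual model only: OmniPage Pro is driven through its desktop interface, and the `Document` and `Page` classes, the method names and the stand-in recognizer are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Page:
    image: bytes                 # one image per document page
    text: Optional[str] = None   # filled in when the page is recognized


@dataclass
class Document:
    pages: List[Page] = field(default_factory=list)

    def get_page(self, image: bytes) -> None:
        """Step 1: add an acquired image to the document as a new page."""
        self.pages.append(Page(image))

    def perform_ocr(self, recognizer: Callable[[bytes], str]) -> None:
        """Step 2: recognize every page that has no text yet."""
        for page in self.pages:
            if page.text is None:
                page.text = recognizer(page.image)

    def export(self) -> str:
        """Step 3: join the recognized text; unrecognized pages are skipped."""
        return "\n\n".join(p.text for p in self.pages if p.text is not None)


def fake_recognizer(image: bytes) -> str:
    """Stand-in for real OCR: treats the image bytes as ASCII text."""
    return image.decode("ascii").upper()


doc = Document()
doc.get_page(b"page one")
doc.get_page(b"page two")
doc.perform_ocr(fake_recognizer)
print(doc.export())
```

Because the export step only reads the document, it can be repeated as often as needed, which mirrors saving the same document to several destinations.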

The OmniPage Desktop

The OmniPage Desktop has a title bar and a menu bar along the top and a status bar along the bottom. It has three main working areas, separated by splitters: the Document Manager, the Image Panel and the Text Editor. Each has close, maximize and restore buttons top right. The Image Panel has an Image toolbar and the Text Editor has a Formatting toolbar.

[Figure: The OmniPage Desktop, with the following callouts.]

• Standard toolbar
• OmniPage Toolbox
• Thumbnails show a picture of each page in the document. The current page has an “eye” icon.
• This page has been recognized.
• Image toolbar
• Page navigation buttons
• Buttons to show or hide the Document Manager, Text Editor and the Image Panel’s thumbnails and current page display. This can also be done from the View menu.
• Formatting toolbar
• Drag these splitters to resize the working areas.
• Image Panel: This is displaying the image of the current page, together with its zones. The image panel can display the current page, thumbnails, or both.
• The Text Editor view buttons offer three formatting levels.
• Text Editor: This is displaying the recognition results from the current page in True Page view.


We show the program with a three-page document. Page one is the current page, which has been recognized and proofed. Page two has been recognized but not proofed yet. Page three has been acquired and manually zoned, but not recognized yet. The icons at the bottom of the thumbnail images show page status.

Status bar buttons let you show or hide the main screen areas and move to other pages in the document. A right mouse click in any screen area brings up a shortcut menu with the most useful commands for that area.

The Menu bar

For concise information on any menu item, click the context-sensitive help button and then click a menu item. A popup text explains the purpose of the menu item. Click anywhere to close the popup.

The Toolbars

The program has three main toolbars; all can be floated. Use the View menu to show, hide or customize them. Context-sensitive help explains the purpose of all tools. Two further toolbars govern specific tasks.

Standard toolbar
  Default location: Horizontal under Menu bar
  Other docking locations: Any edge of the OmniPage Desktop
  Purpose: Performing basic program functions. See page 29 and page 65.

Image toolbar
  Default location: Vertically to left of current page image
  Other docking locations: Vertically to right of current page image
  Purpose: Image, zoning and table operations. See page 53 and page 59.

Formatting toolbar
  Default location: Horizontal at top of Text Editor
  Other docking locations: None
  Purpose: Formatting recognized text in the Text Editor. See page 72.

Verifier toolbar
  Location: Hover the cursor over the verifier window to see this floating toolbar.
  Purpose: Controlling the location and appearance of the verifier. See page 66.

Reorder toolbar
  Location: Click the Change reading order tool. This toolbar replaces the Formatting toolbar.
  Purpose: Modifying the order of elements in recognized pages. See page 72.


The Image Panel

When this displays the current page image, the Image toolbar is available. All page images have a background value: process or ignore. Zones can be manually drawn on page images, or can be placed automatically after recognition. There are five zone types: Process, Ignore, Text, Table, Graphics. Areas inside process zones and on a process background outside other zones have zones automatically drawn and their zone types determined during processing. See “Zones and backgrounds” on page 53.
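The rule for where automatic zoning takes place can be written down as a small decision function. This is a sketch under stated assumptions, not the program's own logic: zones are simplified to non-overlapping rectangles (the program also allows irregular zones), and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ZoneType(Enum):
    PROCESS = "process"
    IGNORE = "ignore"
    TEXT = "text"
    TABLE = "table"
    GRAPHICS = "graphics"


class Background(Enum):
    PROCESS = "process"
    IGNORE = "ignore"


@dataclass
class Zone:
    kind: ZoneType
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def is_auto_zoned(x: int, y: int, zones: List[Zone], background: Background) -> bool:
    """True if the point lies in an area that gets zones drawn automatically.

    Assumes rectangular, non-overlapping zones; the first containing zone decides.
    """
    for zone in zones:
        if zone.contains(x, y):
            # Inside a zone: only a process zone is auto-zoned.
            return zone.kind is ZoneType.PROCESS
    # Outside all zones: the page background value decides.
    return background is Background.PROCESS


zones = [
    Zone(ZoneType.TEXT, 0, 0, 100, 50),
    Zone(ZoneType.PROCESS, 0, 60, 100, 120),
]
print(is_auto_zoned(10, 10, zones, Background.PROCESS))   # inside the text zone
print(is_auto_zoned(10, 70, zones, Background.IGNORE))    # inside the process zone
print(is_auto_zoned(10, 200, zones, Background.PROCESS))  # on the background
```

A point inside a manually drawn text, table, graphics or ignore zone is never auto-zoned, whatever the background value.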

If the current page image is hidden, the thumbnails appear in rows to make the best use of the available space.

The Text Editor

This displays recognition results in any of three formatting levels:

• No Formatting view (NF)
• Retain Fonts and Paragraphs view (RFP)
• True Page (TP)

True Page retains page layout using text, table and picture boxes, and frames. It can display multicolumn areas, to show text blocks that can be treated as flowing columns at export time.

True Page is also an export formatting level, along with Flowing Page that retains page layout without boxes and frames. See “The editor display and views” on page 64.


The OmniPage Toolbox

This Toolbox lets you drive the processing. By default it is located along the top of the OmniPage Desktop, just above the working areas. It can be floated and also be docked along the bottom of the desktop.

[Figure: The OmniPage Toolbox. Labeled parts: Start button, Get Page button, Perform OCR button, Export Results button, Get Pages drop-down list, Layout Description drop-down list, Export Results drop-down list.]

Automatic processing is started, and can be stopped and re-started with the Start (1-2-3) button. See “Automatic processing” on page 38.

Manual processing allows you to process documents page-by-page and step-by-step. Start each step with the three large buttons: the Get Page button (1), the Perform OCR button (2) and the Export Results button (3). See “Manual processing” on page 40.

You can switch between automatic and manual processing any time the program is not busy with processing. That means you can switch between them while you are working within a document. You can automatically process some pages, then add more pages with manual processing. After processing a stack of pages automatically, you can inspect the results and then go back to reprocess certain pages manually. This procedure is described in chapter 3. See “Combined processing” on page 41.

The OCR Wizard is designed for new users. See “Processing with the OCR Wizard” on page 43. If you have a document open when you start the OCR Wizard, the document will be closed after you are prompted to save it. When you have used the OCR Wizard to process and save a document, it remains in the program and can be further processed (adding more pages, re-recognizing pages etc.) with either manual or automatic processing.

Managing documents

Document management can be done by thumbnails in the Image Panel or by the Document Manager, situated along the bottom of the OmniPage Desktop. Both summarize the pages in the document and are synchronized. Our pictures show these with the same seven-page document. Pages 1 and 2 are selected and page 4 is the current page, that is, the one shown in the Image Panel. Page status is shown as follows:

Page  Status               Icon    Page image has been...
1     Acquired             [icon]  acquired but has not yet been recognized.
2     Recognized           [icon]  recognized, but not proofread, or proofing was interrupted on the page.
3     Recognized, Proofed  [icon]  recognized, and proofing has reached the end of the page.
4     Modified             [icon]  recognized with at least one editing or formatting change made in the Text Editor.
5     Modified, Proofed    [icon]  recognized, edited in the Text Editor, and proofing has reached the end of the page.
6     Pending              [icon]  acquired, maybe recognized; some zone changes are stored but not yet processed.
7     Saved                [icon]  recognized and saved at least once.

Thumbnails

These present a set of numbered thumbnail images, one for each page in the document. Scroll to see pages as necessary. The current page has an ‘eye’ icon. You can select multiple pages in the document; these have a distinctive appearance. Use thumbnails for page operations, as follows:

Jump to a page: Click the thumbnail of the desired page.

Reorder a page: Click the thumbnail of the page you want to move and drag it above the desired page number. Pages are renumbered automatically.

Delete a page: Select the thumbnail of the page you want to delete and press the Delete key.

Select multiple pages: Hold down the Shift key and click two thumbnails to select all pages between and including them. Hold down the Ctrl key as you click thumbnails to add pages to a selection one by one. Then you can move or delete the selected pages as a group, or send them to (re)recognition. You can also export selected pages.

Move the cursor onto the page’s status icon to see a thumbnail of the page.

Get information on an input image by hovering the cursor over its thumbnail (so long as ToolTips are enabled). A popup text displays the image size in pixels and the program’s unit of measurement. Image resolution is also shown.

Document Manager

This provides an overview of your document with a table. Each row represents one page. Columns present statistical or status information for each page, and (where appropriate) document totals. The picture shows columns that a user has specified.

Enter comments or searchable keywords here.

The current page is shown with an ‘eye’ icon. You can use the Document Manager for page operations, as follows:

Jump to a page: Click the leftmost part of the page row or double click anywhere in its row.

Reorder a page: Click the row of the page you want to move and drag it to the desired location. An indicator on the left shows where the page will be inserted. Pages are renumbered automatically.

Delete a page: Select the row of the page you want to delete and press the Delete key.

Select multiple pages: Hold down the Shift key and click two page rows to select all pages between and including them. Hold down the Ctrl key as you click rows to add pages to a selection one by one. Then you can move or delete the selected pages as a group, or send them to (re)recognition. You can also export selected pages.

When multiple pages are being selected, the page set as current does not change. All selected pages are highlighted.
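The selection and reordering rules above can be sketched in a few lines of Python. This is only an illustration of the behavior described in this guide, not the program's own code: the names PageSelection, jump_to, shift_click, ctrl_click and move_page are hypothetical, and what a plain click does to an existing selection is not stated here, so the sketch leaves the selection alone.

```python
class PageSelection:
    """Hypothetical model of page selection (not OmniPage code)."""

    def __init__(self, page_count, current=1):
        self.page_count = page_count
        self.current = current   # the page shown in the Image Panel
        self.selected = set()    # 1-based page numbers

    def jump_to(self, page):
        # Clicking a thumbnail or row makes that page the current page.
        # Assumption: the existing selection is left untouched.
        self.current = page

    def shift_click(self, first, last):
        # Shift + two clicks: select all pages between and including them.
        low, high = sorted((first, last))
        self.selected = set(range(low, high + 1))

    def ctrl_click(self, page):
        # Ctrl + click: add one more page to the selection.
        self.selected.add(page)


def move_page(pages, page_number, above_page_number):
    """Move one page so that it sits just above another page.

    Pages are held in a list, so renumbering is implicit:
    a page's number is always its list position plus one.
    """
    page = pages.pop(page_number - 1)
    # Removing the page shifts every later page up by one position.
    target = above_page_number - 1
    if above_page_number > page_number:
        target -= 1
    pages.insert(target, page)
    return pages


# The seven-page document from the pictures: pages 1 and 2 selected, page 4 current.
selection = PageSelection(page_count=7, current=4)
selection.shift_click(1, 2)
```

In this sketch, selecting pages never touches the current page, which mirrors the statement that the page set as current does not change while multiple pages are being selected.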

Customizing Document Manager columns

You can specify which columns of information you want to see in the Document Manager. Click Customize Columns... in the View menu for the following dialog box:

[Figure: the Customize Columns dialog box. Callouts: This item is highlighted. Click a checkbox to select the item. Highlight an item and use these arrows to change the order of columns. Image sizes are expressed in pixels. Define a width for the highlighted item.]

Define which columns should appear, their widths, and column order. The topic Customizing Document Manager columns in online Help clarifies what is presented in each column. You can change column widths easily in the Document Manager; just drag the column dividers in the title bar.

Deleting pages from a document

Page deletions must be confirmed and can be undone. To delete the current page, use the item Delete Current Page in the Edit menu. To delete all selected pages, press the Delete key in the Document Manager or the thumbnails, or use the shortcut menu command Clear.

Printing a document

You can print the document with the Print item in the File menu. Choose whether to print images or text (that is, recognition results as they appear in the Text Editor). You can print all pages or a range of pages. The Print tool in the Standard toolbar prints images or text, depending on whether the Image Panel or the Text Editor is active.

Closing a document

Choose Close in the File menu to close a document. You are prompted to save your document if you have not saved it or you have modified it since the last save. See the next section on saving the document as an OmniPage Document (*.opd). You will also be prompted to save unsaved training data if you selected ‘Prompt to save training data when closing document’ in the Proofing panel of the Options dialog box.

OmniPage Documents

The OmniPage Document is the program’s proprietary file type; it has the extension .opd. It is one of the file types offered when saving a document to file. You save the document to the OPD file type if you want to work with it again in OmniPage Pro during a future session. You can then process unfinished pages, add more pages and proof or edit recognition results.

An OmniPage Document contains the original page images (deskewed and pre-processed) with any zones placed on them. After recognition, the OPD also contains the recognition results. Recognized characters are stored along with their coordinate and confidence data. This preserves the links between image and text, so that verification and proofing remain available when the OPD is reopened in future sessions.

When you save an OmniPage Document, the current settings (and unsaved training) are also saved. When you open an OmniPage Document, its settings are applied, replacing those existing in the program.

Why save to OPD

You do not have to save your documents to the OPD file type. You would typically do this for the following reasons:

• You cannot finish working with the document in the current session.

• You want to pass the document to other users who have OmniPage Pro. For example, you can pass an OPD file to a specialist for proofing. In an office network, you may have one scanner generating images for recognition and proofing at several workstations.

• You want to build up an archive of recognized documents whose original images remain accessible. The recognized texts allow searching by keywords and other document retrieval techniques.

Recognition results should be saved from OPD files before installing any OmniPage Pro upgrade. These files may not be upward compatible with newer OPD file formats, or possibly only the images will be retained when the files are upgraded. When you open an OPD created by OmniPage Pro 10, only images are loaded. When you open an OPD created by OmniPage Pro 11, images and recognized pages are loaded, but no zones are retained.
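As a quick reference, the compatibility rules above can be written as a small lookup. This is a sketch for illustration only: the names LOADED_FROM_OPD and lost_on_open are hypothetical, the entries for versions 10 and 11 restate the paragraph above, and the entry for version 12 is an assumption based on the description of what an OmniPage Document contains.

```python
# What is loaded when OmniPage Pro 12 opens an OPD,
# keyed by the program version that created the file.
LOADED_FROM_OPD = {
    10: {"images"},
    11: {"images", "recognized pages"},
    # Assumption: a version 12 OPD reopens with everything it stores.
    12: {"images", "recognized pages", "zones"},
}


def lost_on_open(creator_version):
    """Return what is not carried over from an older OPD."""
    return LOADED_FROM_OPD[12] - LOADED_FROM_OPD[creator_version]
```

For example, lost_on_open(11) returns only the zones, which would then have to be drawn again.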

How to save to OPD

If you intend to create an OPD, you can save it to this format at an early stage, for protection. Use the Save button to save it periodically as you work. Save it again at the end of your session.

The Save button saves the document to the name and file type of its last save. You can save your document repeatedly to different formats. If your first save was to another format (for instance .doc), use the item Save As... from the File menu to save it as an OPD. If a document is saved as an OPD and you later save it to another format, it is not automatically resaved as an OPD. When you close the document or exit the program, you will be prompted to save the document as an OPD.

The title bar shows the file name of the most recent whole-document save.
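The saving rules in this section can be modeled roughly as follows. This is a minimal sketch under stated assumptions, not a description of how the program is implemented: the class Document and its methods are hypothetical names, edits made between saves are not modeled, and the behavior of the Save button before any first save is not covered by this section, so the sketch simply raises an error in that case.

```python
class Document:
    """Hypothetical model of the Save and Save As rules (not OmniPage code)."""

    def __init__(self):
        self.last_save = None     # (file name, file type) of the most recent save
        self.opd_current = False  # True while the latest save was to OPD

    def save_as(self, name, file_type):
        # Save As...: choose a name and file type for this save.
        self.last_save = (name, file_type)
        # Saving to another format does not resave the OPD.
        self.opd_current = (file_type == "opd")

    def save(self):
        # Save button: reuse the name and file type of the last save.
        if self.last_save is None:
            # Assumption: the first save must go through Save As.
            raise ValueError("no previous save; use save_as first")
        self.save_as(*self.last_save)

    def title_bar_name(self):
        # The title bar shows the file name of the most recent save.
        return None if self.last_save is None else self.last_save[0]

    def prompts_for_opd_on_close(self):
        # Closing prompts for an OPD save unless the OPD is up to date.
        return not self.opd_current


doc = Document()
doc.save_as("report", "opd")   # early OPD save, for protection
doc.save_as("report", "doc")   # later save to another format
```

After the second save in this example, the Save button would write a .doc file again, and closing the document would bring up the prompt to save it as an OPD.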
