ABBYY FineReader - 12.0 User Manual

ABBYY® FineReader Version 12
User’s Guide
ABBYY FineReader 12 User’s Guide
Information in this document is subject to change without notice and does not bear any commitment on the part of ABBYY. The software described in this document is supplied under a license agreement. The software may only be used or copied in strict accordance with the terms of the agreement. It is a breach of the "On legal protection of software and databases" law of the Russian Federation and of international law to copy the software onto any medium unless specifically allowed in the license agreement or nondisclosure agreements. No part of this document may be reproduced or transmitted in any form or by any means, electronic or other, for any purpose, without the express written permission of ABBYY.
© 2013 ABBYY Production LLC. All rights reserved. ABBYY, ABBYY FineReader, ADRT are either registered trademarks or trademarks of ABBYY Software Ltd. © 1984-2008 Adobe Systems Incorporated and its licensors. All rights reserved. Protected by U.S. Patents 5,929,866; 5,943,063; 6,289,364; 6,563,502; 6,185,684; 6,205,549; 6,639,593; 7,213,269; 7,246,748; 7,272,628; 7,278,168; 7,343,551; 7,395,503; 7,389,200; 7,406,599; 6,754,382 Patents Pending.
Adobe® PDF Library is licensed from Adobe Systems Incorporated. Adobe, Acrobat®, the Adobe logo, the Acrobat logo, the Adobe PDF logo and Adobe PDF Library are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States and/or other countries.
Portions of this computer program are copyright © 2008 Celartem, Inc. All rights reserved. Portions of this computer program are copyright © 2011 Caminova, Inc. All rights reserved. DjVu is protected by U.S. Patent № 6,058,214. Foreign Patents Pending.
Powered by AT&T Labs Technology.
Portions of this computer program are copyright © 2013 University of New South Wales. All rights reserved. © 2002-2008 Intel Corporation.
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Outlook, Excel, PowerPoint, Windows Vista, Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
© 1991-2013 Unicode, Inc. All rights reserved. © 2010, Oracle and/or its affiliates. All rights reserved.
OpenOffice.org, OpenOffice.org logo are trademarks or registered trademarks of Oracle and/or its affiliates.
JasPer License Version 2.0: © 2001-2006 Michael David Adams © 1999-2000 Image Power, Inc. © 1999-2000 The University of British Columbia
EPUB® is a registered trademark of the IDPF (International Digital Publishing Forum).
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit. (http://www.openssl.org/). This product includes cryptographic software written by Eric Young (eay@cryptsoft.com).
© 1998-2011 The OpenSSL Project. All rights reserved. ©1995-1998 Eric Young (eay@cryptsoft.com) All rights reserved.
This product includes software written by Tim Hudson (tjh@cryptsoft.com). Portions of this software are copyright © 2009 The FreeType Project (www.freetype.org). All rights reserved. All other trademarks are the sole property of their respective owners.
Contents
Introducing ABBYY FineReader 12
What's New in ABBYY FineReader 12
Quick Start
Microsoft Word Tasks
Microsoft Excel Tasks
Adobe PDF Tasks
Tasks for Other Formats
Adding Images Without Processing
Creating Custom Automated Tasks
Integration with Other Applications
Scanning Paper Documents
Photographing Documents
Opening an Image or PDF Document
Scanning and Opening Options
Image Preprocessing
Recognizing Documents
What Is a FineReader Document?
Document Features to Consider Prior to OCR
OCR Options
Working with Complex–Script Languages
Tips for Improving OCR Quality
If the Complex Structure of a Paper Document Is Not Reproduced
If Areas Are Detected Incorrectly
If You Are Processing a Large Number of Documents with Identical Layouts
If a Table Is Not Detected
If a Picture Is Not Detected
If a Barcode Is Not Detected
Adjusting Area Properties
Incorrect Font Is Used or Some Characters Are Replaced with "?" or "□"
If Your Printed Document Contains Non–Standard Fonts
If Your Text Contains Too Many Specialized or Rare Terms
If the Program Fails to Recognize Some of the Characters
If Vertical or Inverted Text Is Not Recognized
Checking and Editing Texts
Checking Texts in the Text Window
Using Styles
Editing Hyperlinks
Editing Tables
Removing Confidential Information
Copying Content from Documents
Saving OCR Results
Saving an Image of a Page
E–mailing OCR Results
Group Work in a Local Area Network
Automating and Scheduling OCR
Automated Tasks
ABBYY Hot Folder
Customizing ABBYY FineReader
Main Window
Toolbars
Customizing the Workspace
Options Dialog Box
Changing the User Interface Language
Installing, Activating, and Registering ABBYY FineReader
Installing and Starting ABBYY FineReader
Activating ABBYY FineReader
Registering ABBYY FineReader
Privacy Policy
ABBYY Screenshot Reader
Appendix
Glossary
Shortcut Keys
Supported Image Formats
Supported Saving Formats
Required Fonts
Regular Expressions
Technical Support
Introducing ABBYY FineReader 12
ABBYY FineReader is an optical character recognition (OCR) system that converts scanned documents, PDF documents, and image files (including digital photos) into editable formats.
ABBYY FineReader 12 advantages
Fast and accurate recognition
The OCR technology used in ABBYY FineReader quickly and accurately recognizes any document and retains its original formatting.
Thanks to ABBYY's Adaptive Document Recognition Technology (ADRT®), ABBYY FineReader can analyze and process a document in its entirety, rather than one page at a time. This approach retains the source document's structure, including formatting, hyperlinks, e–mail addresses, headers and footers, image and table captions, page numbers, and footnotes.
ABBYY FineReader is largely immune to printing defects and can recognize texts printed in virtually any font.
ABBYY FineReader can recognize photos of text taken with a regular camera or a mobile phone. Additional image preprocessing can greatly improve the quality of your photos, resulting in more accurate OCR.
For faster processing, ABBYY FineReader makes efficient use of multi–core processors and offers a special black–and–white processing mode for documents where colors need not be preserved.
Supports most of the world's languages*
ABBYY FineReader can recognize texts written in any of the 190 languages that it supports, or in a combination of those languages. Among the supported languages are Arabic, Vietnamese, Korean, Chinese, Japanese, Thai, and Hebrew. ABBYY FineReader can automatically detect the language of a document.
Ability to check OCR results
ABBYY FineReader has a built–in text editor which allows you to compare recognized texts against their original images and make any necessary changes.
If you are not satisfied with the results of automatic processing, you can manually specify image areas to capture and train the program to recognize less common or unusual fonts.
Intuitive user interface
The program comes with a number of preconfigured automated tasks that cover the most common OCR scenarios and enable you to convert scans, PDFs, and image files into editable documents with a click of a button. Integration with Microsoft Office and Windows Explorer means that you can recognize documents directly from within Microsoft Outlook, Microsoft Word, Microsoft Excel or simply by right–clicking a file on your computer.
The program supports the usual Windows shortcut keys and touchscreen swipes, e.g. to scroll or zoom in and out of images.
Quick quoting
You can easily copy and paste recognized fragments into other applications. Page images will open instantly, and will be available for viewing, selection, and copying before the entire document has been recognized.
Recognition of digital photos
You can take a picture of a document with your digital camera, and ABBYY FineReader 12 will recognize the text just as if it were an ordinary scan.
PDF archiving
ABBYY FineReader can convert your paper documents or scanned PDFs into searchable PDF and PDF/A documents.
MRC compression can be applied to reduce the size of PDF files without impairing their visual quality.
Supports multiple saving formats and cloud storage services
ABBYY FineReader 12 can save recognized texts in Microsoft Office formats (Word, Excel, and PowerPoint), in searchable PDF/A and PDF for long–term storage, and in popular e–book formats.
You can save results either locally or in cloud storage services (Google Drive, Dropbox, and SkyDrive) and access them from anywhere in the world. ABBYY FineReader 12 can also export documents directly to Microsoft SharePoint Online and Microsoft Office 365 (ABBYY FineReader 12 Corporate only).
Includes two bonus applications: ABBYY Business Card Reader and ABBYY Screenshot Reader
ABBYY Business Card Reader (available only with ABBYY FineReader 12 Corporate) is a handy utility that captures data from business cards and saves it directly to Microsoft® Outlook®, Salesforce, and other contact management software.
ABBYY Screenshot Reader is an easy–to–use program that can take screenshots of whole windows or selected areas and recognize the text inside.
Free technical support for registered users
* The set of supported languages may vary in different editions of the product.
What's New in ABBYY FineReader 12
Below is a brief overview of the major new features and improvements introduced in ABBYY FineReader 12.
Improved recognition accuracy
The new version of ABBYY FineReader delivers more accurate OCR and better recreates the original formatting of your documents thanks to improvements in ABBYY's proprietary Adaptive Document Recognition Technology (ADRT). The program now better detects document styles, headings, and tables, so that you don't have to fix the formatting of your documents once they are recognized.
Recognition languages
ABBYY FineReader 12 can now recognize Russian texts with stress marks. OCR quality has been improved for Chinese, Japanese, Korean, Arabic, and Hebrew.
Faster and friendlier user interface
Background processing
It may take quite some time to recognize very large documents. In the new version, time–consuming processes run in the background, allowing you to continue working on those parts of the document which have already been recognized. Now you don't have to wait for the OCR process to complete before you can adjust image areas, view non–recognized pages, force–start the OCR of a particular page or image area, add pages from other sources, or change the order of pages in the document.
Faster image loading
Page images will appear in the program as soon as you scan the paper originals, so that you can immediately see the scanning results and select pages and image areas to recognize.
Easier quoting
Any image area containing text, pictures or tables can be easily recognized and copied to the Clipboard with a click of the mouse.
All the basic operations, including scrolling and zooming, are now also supported on touchscreens.
Image preprocessing and camera OCR
The improved image preprocessing algorithms ensure better recognition of photographed texts and produce photos of text that look as good as scans. The new photo correction capabilities include automatic cropping, correction of geometrical distortions, and evening out of brightness and background colors.
ABBYY FineReader 12 allows you to select the preprocessing options you wish to apply to any newly added image, so that you won't need to correct each image separately.
Better visual quality for archived documents
ABBYY FineReader 12 includes new PreciseScan technology, which smoothes characters to improve the visual quality of scanned documents. As a result, characters do not look pixelated even when you zoom in on the page.
New tools for manual editing of recognition output
Verification and correction capabilities have been expanded in the new version. In ABBYY FineReader 12, you can format recognized texts in the verification window, which now also includes a tool for inserting special symbols not available on standard keyboards. You can also use keyboard shortcuts for the most frequent verification and correction commands.
In ABBYY FineReader 12, you can disable recreation of such structural elements as headers, footers, footnotes, tables of contents, and numbered lists. This may be necessary if you want these elements to appear as normal text for better compatibility with other products, e.g. translation software and e–book authoring software.
New saving options
When saving OCR results to XLSX, you can now save pictures, remove text formatting, and save each page on a separate Excel worksheet.
ABBYY FineReader 12 can create ePub files compliant with the EPUB 2.0.1 and EPUB 3.0 standards.
Improved integration with third–party services and applications
Now you can export your recognized documents directly to SharePoint Online and Microsoft Office 365 (FineReader 12 Corporate only), and the new opening and saving dialog boxes provide easy access to cloud storage services, such as Google Drive, Dropbox, and SkyDrive.
Quick Start
ABBYY FineReader converts scanned documents, PDF documents, and image files (including digital photos) into editable formats.
To process a document with ABBYY FineReader, you need to complete the following four steps:
1. Acquire an image of the document
2. Recognize the document
3. Verify the results
4. Save the results in a format of your choice
If you need to repeat the same steps over and over again, you can use an automated task, which will execute the required actions with just one click of a button. To process documents with complex layouts, you can customize and run each step separately.
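As a rough mental model only, the relationship between the four steps and an automated task can be sketched in Python. ABBYY FineReader 12 is operated through its user interface, so every name below (`Job`, `acquire`, `recognize`, `verify`, `save`, `run_task`) is hypothetical and does not correspond to any ABBYY API. The sketch simply treats an automated task as a fixed, ordered list of steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Job:
    """Hypothetical record of one document passing through the four steps."""
    source: str
    log: List[str] = field(default_factory=list)


def acquire(job: Job) -> Job:
    # Step 1: scan a page or open an image/PDF file.
    job.log.append(f"acquired image from {job.source}")
    return job


def recognize(job: Job) -> Job:
    # Step 2: run OCR on the acquired image.
    job.log.append("recognized text")
    return job


def verify(job: Job) -> Job:
    # Step 3: review and correct the recognition results.
    job.log.append("verified results")
    return job


def save(job: Job) -> Job:
    # Step 4: save the results in the chosen format.
    job.log.append("saved results")
    return job


# In this model an "automated task" is just a fixed, ordered list of steps.
Step = Callable[[Job], Job]
AUTOMATED_TASK: List[Step] = [acquire, recognize, verify, save]


def run_task(source: str, steps: List[Step] = AUTOMATED_TASK) -> Job:
    """Run every step in order, like starting a task with one click."""
    job = Job(source)
    for step in steps:
        job = step(job)
    return job


if __name__ == "__main__":
    for entry in run_task("scan_001.tif").log:
        print(entry)
```

Passing a shorter list, for example `run_task("photo.jpg", steps=[acquire, recognize])`, mirrors running individual steps separately for documents with complex layouts.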
Built–in automated tasks
When you start ABBYY FineReader, the Task window is displayed, listing the automated tasks for the most common processing scenarios. If you can't see the Task window, click the Task button on the main toolbar.
1. In the Task window, click a tab on the left:
o Quick Start contains the most common ABBYY FineReader tasks
o Microsoft Word contains tasks that automate conversion of documents to Microsoft Word
o Microsoft Excel contains tasks that automate conversion of documents to Microsoft Excel
o Adobe PDF contains tasks that automate conversion of documents to PDF
o Other contains tasks that automate conversion of documents to other formats
o My Tasks contains your custom tasks (ABBYY FineReader Corporate only)
2. From the Document language drop–down list, select the languages of your document.
3. From the Color mode drop–down list, select a color mode:
o Full color preserves the colors of the document;
o Black and white converts the document to black and white, which reduces its size and speeds up the processing.
Important! Once the document is converted to black and white, you will not be able to restore the colors. To obtain a color document, either scan a paper document in color or open a file that contains color images.
4. If you are going to run a Microsoft Word, Microsoft Excel or PDF task, specify additional document options in the right–hand part of the window.
5. Start the task by clicking its button in the Task window.
When you start a task, it will use the options currently selected in the Options dialog box (click Tools > Options… to open the dialog box).
While a task is running, a task progress window is displayed, showing the list of steps and alerts issued by the program.
Once the task is executed, the images will be added to a FineReader document, recognized, and saved in the format of your choice. You can adjust the areas detected by the program, verify the recognized text, and save the results in any other supported format.
Document conversion steps
You can set up and start any of the processing steps from the ABBYY FineReader main window.
1. On the main toolbar, select the document languages from the Document language drop–down list.
2. Scan pages or open page images.
Note: By default, ABBYY FineReader will automatically analyze and recognize the scanned or opened pages. You can change this default behavior on the Scan/Open tab of the Options dialog box (click Tools > Options… to open the dialog box).
3. In the Image window, review the detected areas and make any necessary adjustments.
4. If you have adjusted any of the areas, click Read on the main toolbar to recognize them again.
5. In the Text window, review the recognition results and make any necessary corrections.
6. Click the arrow to the right of the Save button on the main toolbar and select a saving format. Alternatively, click a saving command on the File menu.
Microsoft Word Tasks
Using the tasks on the Quick Start tab of the Task window, you can easily scan paper documents and convert them into editable Microsoft Word files. The currently selected program options will be used. If you want to customize the conversion options, use the tasks on the Microsoft Word tab.
1. From the Document language drop–down list at the top of the window, select the languages of your document.
2. From the Color mode drop–down list, select either full–color or black–and–white mode.
Important! Once the document is converted to black and white, you will not be able to restore the colors.
3. Select desired document options in the right–hand section of the window:
o Document layout options
o Select Keep pictures if you want to preserve the pictures in the output document
o Select Keep headers and footers if you want to preserve the headers and footers in the output document
4. Click the button of the task that you need:
o Scan to Microsoft Word scans a paper document and converts it to Microsoft Word
o Image or PDF File to Microsoft Word converts PDF documents or image files to Microsoft Word
o Photo to Microsoft Word converts photos of documents to Microsoft Word
As a result, a new Microsoft Word document will be created containing the text of your original document.
Important! When you start a built–in task, the currently selected program options are used. If you decide to change any of the options, you will need to restart the task.
Microsoft Excel Tasks
Using the tasks on the Microsoft Excel tab of the Task window, you can easily convert images of tables to Microsoft Excel.
1. From the Document language drop–down list at the top of the window, select the languages of your document.
2. From the Color mode drop–down list, select either full–color or black–and–white mode.
Important! Once the document is converted to black and white, you will not be able to restore the colors.
3. Select desired document options in the right–hand section of the window:
o Document layout options
o Select Keep pictures if you want to preserve the pictures in the output document
o Select Create separate worksheet for each page if you want each page of the original document to be saved as a separate Microsoft Excel worksheet
4. Click the button of the task that you need:
o Scan to Microsoft Excel scans a paper document and converts it to Microsoft Excel
o Image or PDF File to Microsoft Excel converts PDF documents or image files to Microsoft Excel
o Photo to Microsoft Excel converts photos of documents to Microsoft Excel
As a result, a new Microsoft Excel document will be created containing the text of your original document.
Important! When you start a built–in task, the currently selected program options are used. If you decide to change any of the options, you will need to restart the task.
Adobe PDF Tasks
Using the tasks on the Adobe PDF tab of the Task window, you can easily convert images (e.g. scanned documents, PDF files, and image files) to PDF.
1. From the Document language drop–down list at the top of the window, select the languages of your document.
2. From the Color mode drop–down list, select either full–color or black–and–white mode.
Important! Once the document is converted to black and white, you will not be able to restore the colors.
3. Select desired document options in the right–hand section of the window:
o Text and pictures only
This option saves only the recognized text and the pictures. The text will be fully searchable and the size of the PDF file will be small. The appearance of the resulting document may slightly differ from the original.
o Text over the page image
This option saves the background and pictures of the original document and places the recognized text over them. Usually, a PDF file saved using this option requires more disk space than a file that has been saved with the Text and pictures only option enabled. The resulting PDF document is fully searchable. In some cases, the appearance of the resulting document may slightly differ from the original.
o Text under the page image
This option saves the entire page image as a picture and places the recognized text underneath. Use this option to create a fully searchable document that looks virtually the same as the original.
o Page image only
This option saves the exact image of the page. This type of PDF document will be virtually indistinguishable from the original but the file will not be searchable.
4. From the Picture drop–down list, select the desired quality of the pictures.
5. Select either PDF or PDF/A.
6. Click the button of the task that you need:
o Scan to PDF scans a paper document and converts it to PDF
o Image File to PDF converts image files to PDF
o Photo to PDF converts photos of documents to PDF
As a result, a new PDF document will be created and opened in a PDF viewing application.
Important! When you start a built–in task, the currently selected program options are used. If you decide to change any of the options, you will need to restart the task.
Tip: When saving recognized text in PDF, you can specify passwords to protect the document from unauthorized opening, printing, and editing. For details, see "PDF Security Settings."
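The trade–offs between the four PDF saving modes listed in step 3 can be summarized as a small lookup table. The Python sketch below is an illustration only: the class `PdfMode`, the list `PDF_MODES`, and the function `suitable_modes` are hypothetical names, not part of ABBYY FineReader, and the two flags are a simplified reading of the mode descriptions above (a mode counts as looking like the original only where the guide says the result is virtually the same as or indistinguishable from it).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PdfMode:
    """Hypothetical summary of one PDF saving mode from the guide."""
    name: str
    searchable: bool           # can the text in the PDF be searched?
    looks_like_original: bool  # described as virtually identical to the original?


# Flags follow the descriptions in step 3; they are a simplification.
PDF_MODES: List[PdfMode] = [
    PdfMode("Text and pictures only", searchable=True, looks_like_original=False),
    PdfMode("Text over the page image", searchable=True, looks_like_original=False),
    PdfMode("Text under the page image", searchable=True, looks_like_original=True),
    PdfMode("Page image only", searchable=False, looks_like_original=True),
]


def suitable_modes(need_search: bool, need_exact_look: bool) -> List[str]:
    """Return the names of the modes that satisfy the stated requirements."""
    return [
        mode.name
        for mode in PDF_MODES
        if (mode.searchable or not need_search)
        and (mode.looks_like_original or not need_exact_look)
    ]


if __name__ == "__main__":
    # A searchable archive copy that should look like the paper original.
    print(suitable_modes(need_search=True, need_exact_look=True))
```

File size is deliberately left out of the table, because the guide only compares it for the first two modes.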
Tasks for Other Formats
Use the Other tab in the Task window to access other built–in automated tasks.
1. From the Document language drop–down list at the top of the window, select the languages of your document.
2. From the Color mode drop–down list, select either full–color or black–and–white mode.
Important! Once the document is converted to black and white, you will not be able to restore the colors.
3. Click the button of the task that you need:
o Scan to HTML scans a paper document and converts it to HTML
o Image or PDF File to HTML converts PDF documents or image files to HTML
o Scan to EPUB scans a paper document and converts it to EPUB
o Image or PDF File to EPUB converts PDF documents or image files to EPUB
o Scan to Other Formats scans a paper document and converts it to a format of your choice
o Image or PDF File to Other Formats converts PDF documents or image files to a format of your choice
As a result, a new FineReader document will be created containing the text of your original document.
Important! When you start a built–in task, the currently selected program options are used. If you decide to change any of the options, you will need to restart the task.
Adding Images Without Processing
You can use the Quick Scan, Quick Open or Scan and Save as Image automated tasks in the Task window to scan or open images in ABBYY FineReader without preprocessing or OCR. This may be useful if you have a very large document and need only some of its pages recognized.
1. From the Color mode drop–down list, select either full–color or black–and–white mode.
Important! Once the document is converted to black and white, you will not be able to restore the colors.
2. Click the automated task that you need:
o Quick Scan scans a paper document and opens the images in ABBYY FineReader
without image preprocessing or OCR.
o Quick Open opens PDF documents and image files in ABBYY FineReader without
image preprocessing or OCR.
o Scan and Save as Image scans a document and saves the scans. Once the
scanning is complete, an image saving dialog box will open.
As a result, the images will be added to a new FineReader document or saved in a folder of your choice.
Creating Custom Automated Tasks
(ABBYY FineReader Corporate only)
You can create your own automated tasks if you need to include processing steps that are not available in the built–in automated tasks.
1. In the Task window, click the My Tasks tab, and then click the Create New button.
2. In the Task Settings dialog box, enter a name for your task in the Task name box.
3. In the left–hand pane, choose what kind of FineReader document to use for the task:
o Create new document
If you choose this option, a new FineReader document will be created when you start the task. You will also need to specify which set of document options the program needs to use when processing your document: the global options specified in the program or the options which you can specify for this particular task.
o Select existing document
Select this option if you want the task to process images from an existing FineReader document. You will need to either specify a FineReader document or choose to have the program prompt you to select a document every time the task starts.
o Use current document
If you choose this option, the images from the active FineReader document will be processed.
4. Choose how you will acquire images:
o Open image or PDF
Select this option if you want the task to process images or PDF documents from a folder. You will need to either specify a folder or choose to have the program prompt you to select one every time the task starts.
o Scan
If you choose this option, you will need to scan the pages.
Note:
a. This step is optional if earlier you chose Select existing document or Use current document.
b. If images are added to a document that already contains images, only the newly added images will be processed.
c. If a FineReader document to be processed contains some pages that have already been recognized and some pages that have already been analyzed, the recognized pages will not be processed again and the analyzed pages will be recognized.
Add the Analyze step to detect areas on the images and configure this step:
o Analyze the layout automatically, then adjust areas manually
ABBYY FineReader will analyze the images and identify the areas based on their content.
o Draw areas manually
ABBYY FineReader will ask you to draw the appropriate areas manually.
o Use an area template
Select this option if you want an existing area template to be used when the program analyzes the document. You will need to either specify a template or choose to have the program prompt you to select one every time the task starts. For details, see "If You Are Processing a Large Number of Documents with Identical Layouts."
Add the Read step if you need the images to be recognized. The program will use the recognition options you specified in step 3.
Note: When you add the Read step, the Analyze step is added automatically.
Add a saving or sending step to save the recognized text in a format of your choice, e–mail the text or images, or create a copy of the FineReader document. A task may include multiple steps of this kind:
o Save document
Here you can specify the name of the file, its format, file options and the folder where the file should be saved.
Note: To avoid specifying a new folder each time the task is started, select Create a time–stamped subfolder.
o Send document
Here you can select the application in which to open the resulting document.
o E–mail document
Here you can specify the name of the file, its format, file options, and the e–mail address to which the file should be sent.
o Save images
Here you can specify the name of the file, its format, file options, and the folder where the image file should be saved.
Note: To save all images to one file, select Save as one multipage image file (applicable only to images in TIFF, PDF, JB2, JBIG2, and DCX).
o E–mail images
Here you can specify the name of the file, its format, file options, and the e–mail address to which the image file should be sent.
o Save FineReader document
Here you can specify the folder to which the FineReader document should be saved.
Specify what options the program should use to save the results. You can choose between the global options specified in the program at the time of saving or the options which you will specify for this particular task.
Remove any unnecessary steps from the task using the button.
Note: Sometimes, removing one step will also cause another step to be removed. For instance, if you remove the Analyze step, the Read step will also be removed, as recognition cannot be carried out without analyzing an image.
Once you have configured all the required steps, click Finish.
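The dependency between steps described in the notes above (adding Read adds Analyze, and removing Analyze removes Read) can be pictured with a short Python sketch. This is only an illustrative model and not the format in which FineReader stores tasks: the Task class, the REQUIRES table and the step names used as plain strings are assumptions made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical dependency table: a step can only exist if the steps it
# requires are present (Read needs Analyze, as the notes above explain).
REQUIRES = {"Read": ["Analyze"]}


@dataclass
class Task:
    name: str
    steps: list = field(default_factory=list)

    def add(self, step):
        # Adding a step first adds anything it depends on.
        for needed in REQUIRES.get(step, []):
            if needed not in self.steps:
                self.add(needed)
        self.steps.append(step)

    def remove(self, step):
        # Removing a step also removes every step that depends on it.
        dependents = [s for s in self.steps if step in REQUIRES.get(s, [])]
        self.steps = [s for s in self.steps if s != step]
        for s in dependents:
            self.remove(s)


task = Task("Invoices to PDF")
task.add("Read")            # Analyze is added automatically
task.add("Save document")
print(task.steps)           # ['Analyze', 'Read', 'Save document']
task.remove("Analyze")      # Read is removed as well
print(task.steps)           # ['Save document']
```

Keeping the dependencies in one table means that the add and remove rules cannot drift apart if further steps are modelled later.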
The newly created task will appear on the My Tasks tab of the Task window. You can save your task as a file using the Task Manager (click Tools > Task Manager… to open the Task Manager).
You can also load a previously created task: on the My Tasks tab, click Load from Disk and select the file containing the task that you need.
In ABBYY FineReader you can modify, copy, delete, import, and export custom automated tasks. For details, see "Automated Tasks."
Integration with Other Applications
ABBYY FineReader 12 supports integration with Microsoft Office applications and Windows Explorer. This enables you to recognize documents when using Microsoft Outlook, Microsoft Word, Microsoft Excel and Windows Explorer.
Follow the instructions below to recognize a document when using Microsoft Word or Microsoft Excel.
1. Click the button on the ABBYY FineReader 12 tab.
2. In the dialog box that opens, specify the following:
o The source of the image (a scanner or a file)
o Document languages
o Saving options
3. Click the Start button.
ABBYY FineReader 12 will open and the recognized text will be sent to the Microsoft Office application.
Follow the instructions below to recognize a document when using Microsoft Outlook:
1. Open Microsoft Outlook.
2. Select a message with one or more documents attached.
Tip: You can select specific documents if you do not want to recognize all of the documents in the e–mail attachment.
3. On the ABBYY FineReader 12 tab, click the Convert Image or PDF Attachment
button.
4. In the dialog box that opens, specify the following:
o The document's languages
o Saving options
5. Click the Start button.
Tip: If the recognized document's appearance is significantly different from that of the source document, try using different recognition settings or specifying text areas manually. You can find more information about recognition settings in the "Tips for Improving OCR Quality" section.
To open an image or PDF file from Windows Explorer:
1. Select the file in Windows Explorer.
2. Right–click the file and then click ABBYY FineReader 12 > Open in ABBYY FineReader 12 on the shortcut menu.
Note: If the format of the file you selected is not supported by ABBYY FineReader 12, its shortcut
menu will not contain these items.
ABBYY FineReader 12 will start and the image from the selected file will be added to a new FineReader document. If ABBYY FineReader is already running and a FineReader document is open, the image will be added to the FineReader document.
If the ABBYY FineReader button doesn't appear on the Microsoft Office application toolbar or ribbon...
If the ABBYY FineReader 12 tab doesn't appear on the Microsoft Office application ribbon/toolbar:
Click ABBYY FineReader 12 on the shortcut menu of the Microsoft Office application
toolbar.
If the ribbon or toolbar of the Microsoft Office application does not contain the ABBYY FineReader 12 button, FineReader 12 was not integrated with this application during installation. Integration with Microsoft Office applications can be disabled when FineReader 12 is installed manually.
To enable integration:
1. On the taskbar, click the Start button, and then click Control Panel > Programs and
Features.
Note: In Microsoft Windows XP this item is called Add and remove programs. In Microsoft
Windows 8, click Start > All Apps > Control Panel > Programs and Features.
2. Select ABBYY FineReader 12 from the list of installed programs and click the Change
button.
3. Select the desired components in the Custom Installation dialog box.
4. Follow the instructions in the installation wizard.
The first step of the data capture process in ABBYY FineReader is providing images to the program. There are several ways to get document images:
o Scan a hardcopy document
o Take a photo of a document
o Open an existing image file or PDF document
Recognition quality depends on the quality of the image and on the scanning settings. This section contains information on scanning and taking pictures of documents and on how to remove common defects from scans and photographs.
Scanning Paper Documents
You can scan a paper document and recognize the resulting image in ABBYY FineReader 12. Complete the following steps to scan an image.
1. Make sure that the scanner is properly connected to your computer and turn it on.
When connecting a scanner to your computer, follow the instructions in the scanner's manual or other accompanying documentation, and make sure you install the software that comes with the scanner. Some scanners have to be turned on before the computer they are connected to.
2. Place the page you want to scan in the scanner. You can place multiple pages if your scanner is equipped with an automatic document feeder. Try to make sure that the pages in the scanner are positioned as straight as possible. The document may be converted incorrectly if the text on the scanned image is skewed too much.
3. Click the Scan button or click Scan Pages… on the File menu.
In the scanning dialog box, specify the scanning settings and scan the document. The resulting images will be displayed in the Pages window.
Note: If a FineReader document is already open, newly scanned pages will be appended to the end of this document. If there is no open FineReader document, a new one will be created from these pages.
Tip: If you need to scan documents that were printed on a regular printer, use the grayscale mode and a resolution of 300 dpi for best results.
Recognition quality depends on the quality of the hardcopy document and on the settings used when the document was scanned. Low image quality may adversely affect recognition, so specifying the correct scanning settings and taking the characteristics of the source document into account is important.
Brightness settings
If the brightness was specified incorrectly in the scanning settings, a message prompting you to change the brightness setting will appear during recognition. Scanning some documents in black–and–white mode may require additional brightness adjustments.
Complete the following steps to change the brightness setting:
1. Click the Scan button.
2. Specify the brightness in the dialog box that opens.
Note: The standard brightness setting (50%) works in most cases.
3. Scan the image.
If the resulting image contains many defects such as letters blending together or becoming disjointed, refer to the table below for recommendations on how to get a better image.
Problems with the image and recommendations
Text like this is ready for recognition and no adjustments need to be made.
Problem: Characters are disjointed, too bright and too thin.
Recommendations:
o Decrease the brightness to make the image darker
o Use the grayscale scanning mode (brightness is adjusted automatically in this mode)
Problem: Characters blend together and become distorted because they are too dark and thick.
Recommendations:
o Increase the brightness to make the image lighter
o Use the grayscale scanning mode (brightness is adjusted automatically in this mode)
What to do if you see a message prompting you to change the resolution
Recognition quality depends on the resolution of the document image. Low image resolutions (below 150 dpi) may have a negative impact on recognition quality, while images with excessively high image resolutions (over 600 dpi) do not yield any significant improvements in recognition quality and take a long time to process.
The message prompting you to change the image's resolution can appear if:
o The resolution of the image is less than 250 dpi or greater than 600 dpi.
o The image has a non–standard resolution. For example, some faxes have a resolution of 204 by 96 dpi. For best recognition results, the vertical and horizontal resolutions of the image must be the same.
Complete the following steps to change the resolution of an image:
1. Click the Scan button.
2. Select a different resolution in the scanning dialog box.
Note: We recommend using a resolution of 300 dpi for documents that do not contain any text smaller than 10 points. Use a resolution of 400–600 dpi for text that is 9 points or smaller.
3. Scan the image.
Tip: You can also use the Image Editor to change an image's resolution. To open the Image Editor, on the Page menu, click Edit Image….
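The resolution rules in this section can be summarized in a short Python sketch. It only restates the numbers given above and does not show how ABBYY FineReader itself checks images. The function names are hypothetical, and treating "non–standard" as "horizontal and vertical resolutions differ" is an assumption based on the fax example.

```python
def resolution_issues(x_dpi, y_dpi, low=250, high=600):
    """List the reasons a resolution message might appear for an image."""
    issues = []
    if min(x_dpi, y_dpi) < low:
        issues.append("below %d dpi" % low)
    if max(x_dpi, y_dpi) > high:
        issues.append("above %d dpi" % high)
    if x_dpi != y_dpi:
        issues.append("horizontal and vertical resolutions differ")
    return issues


def recommended_dpi(smallest_font_pt):
    """Return a (min_dpi, max_dpi) range following the note above."""
    if smallest_font_pt >= 10:
        return (300, 300)
    # The guide names 9 points or smaller; sizes between 9 and 10 points
    # are treated the same way here, which is an assumption of this sketch.
    return (400, 600)


print(resolution_issues(300, 300))  # []
print(resolution_issues(204, 96))   # ['below 250 dpi', 'horizontal and vertical resolutions differ']
print(recommended_dpi(8))           # (400, 600)
```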
Scanning facing pages
When you scan facing pages of a book, both pages will appear on the same image.
To improve OCR quality, images with facing pages need to be split into two separate images. ABBYY FineReader 12 features a special mode that automatically splits such images into separate pages within the FineReader document.
Follow the instructions below to scan facing pages from a book or dual pages.
1. Open the Options dialog box (Tools > Options…) and click the Scan/Open tab.
2. Select the Split facing pages option in the General fixes group.
Note: For best results, make sure that the pages are oriented correctly when you scan them and enable the Detect page orientation option in the Scan/Open tab of the Options dialog box.
3. Scan the facing pages.
You can access automatic processing settings by clicking the Options… button in the Open Image dialog box (File > Open PDF File or Image…) or the scanning dialog box.
You can also split facing pages manually:
1. Open the Image Editor (Page > Edit Image…).
2. Use the tools in the Split group to split the image.
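As a rough picture of what splitting a facing–pages image involves, the Python sketch below computes the two crop rectangles for a scan of a given size. It is not the algorithm used by ABBYY FineReader: the function name is hypothetical, and the gutter is simply assumed to be in the middle of the image unless a position is supplied. The boxes follow the (left, top, right, bottom) convention used by common image libraries.

```python
def split_facing_pages(width, height, gutter_x=None):
    """Return crop boxes (left, top, right, bottom) for the two pages."""
    if gutter_x is None:
        gutter_x = width // 2  # assumption: the gutter is in the middle
    if not 0 < gutter_x < width:
        raise ValueError("gutter must lie inside the image")
    left_page = (0, 0, gutter_x, height)
    right_page = (gutter_x, 0, width, height)
    return left_page, right_page


print(split_facing_pages(3400, 2200))
# ((0, 0, 1700, 2200), (1700, 0, 3400, 2200))
```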
Photographing Documents
Scanning isn't the only way to acquire images of your documents. You can recognize photos of documents taken with a camera or a mobile phone. Simply take a picture of text, save it to your hard disk, and open it in ABBYY FineReader.
When taking pictures of documents, a number of factors should be kept in mind to make the photo better suited for recognition. These factors are described in detail in the sections that follow:
o Camera requirements
o Lighting
o Taking photos
o How to improve an image
Camera requirements
Your camera should meet the following requirements in order to obtain document images that can be reliably recognized.
Recommended camera characteristics
o Image sensor: 5 million pixels for A4 pages. Smaller sensors may be sufficient for taking pictures of smaller documents such as business cards.
o Flash disable feature
o Manual aperture control, i.e. availability of Av or full manual mode
o Manual focusing
o An anti–shake system or ability to use a tripod
o Optical zoom
Minimum requirements
o 2 million pixels for A4 pages.
o Variable focal distance.
Note: For detailed information about your camera, please refer to the documentation supplied with your device.
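To see how the sensor sizes above relate to scanning resolutions, the following Python sketch estimates the effective resolution of a photo of an A4 page. It is only a back–of–the–envelope calculation under stated assumptions: the sensor is taken to have a 4:3 aspect ratio, the page is assumed to fill as much of the frame as possible, and the function name is hypothetical.

```python
import math

A4_MM = (210.0, 297.0)  # short side, long side


def effective_dpi(megapixels, page_mm=A4_MM, aspect=(4, 3)):
    """Approximate dpi when the page fills as much of the frame as possible."""
    pixels = megapixels * 1_000_000
    long_px = math.sqrt(pixels * aspect[0] / aspect[1])
    short_px = pixels / long_px
    # The tighter of the two sides limits how large the page can appear.
    px_per_mm = min(short_px / page_mm[0], long_px / page_mm[1])
    return px_per_mm * 25.4  # 25.4 mm per inch


for mp in (2, 5):
    print(mp, "MP ->", round(effective_dpi(mp)), "dpi")
```

Under these assumptions a 2–megapixel photo of an A4 page comes out at roughly 140 dpi and a 5–megapixel photo at roughly 220 dpi, which helps explain why the larger sensor is recommended.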
Lighting
Lighting greatly affects the quality of the resulting photo.
Best results can be achieved with bright and evenly distributed light, preferably daylight. On a bright sunny day, you can increase the aperture number to get a sharper picture.
Using a flash and additional lighting sources
o When using artificial lighting, use two light sources positioned so as to avoid shadows or glare.
o If there is enough light, turn the flash off to prevent sharp highlights and shadows. When using the flash in poor lighting conditions, be sure to take photos from a distance of approximately 50 cm.
Important! The flash must not be used to take pictures of documents printed on glossy paper.
Compare an image with glare and a good quality image:
If the image is too dark
o Set a lower aperture value to open up the aperture.
o Set a higher ISO value.
o Use manual focus, as automatic focus may fail in poor lighting conditions.
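The effect of the first two recommendations can be estimated with standard photographic exposure arithmetic, which is not part of this guide. The Python sketch below assumes that the shutter speed stays the same. Each stop it reports doubles the recorded brightness. The function name is hypothetical.

```python
import math


def stops_gained(f_old, f_new, iso_old, iso_new):
    """Exposure gain in stops when the aperture value and ISO are changed."""
    # Light through the lens is proportional to 1 / f_number squared.
    aperture_stops = 2 * math.log2(f_old / f_new)
    # Sensor sensitivity is proportional to the ISO value.
    iso_stops = math.log2(iso_new / iso_old)
    return aperture_stops + iso_stops


# Going from f/8 to f/4 and from ISO 100 to ISO 400:
print(stops_gained(8, 4, 100, 400))  # 4.0
```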
Compare an image that is too dark with a good quality image:
Taking photos
To obtain good quality photos of documents, be sure to position the camera correctly and follow these simple recommendations.
o Use a tripod whenever possible.
o The lens should be positioned parallel to the page. The distance between the camera and the document should be selected so that the entire page fits within the frame when you zoom in. In most cases this distance will be between 50 and 60 cm.
o Even out the paper document or book pages (especially in the case of thick books). The text lines should not be skewed by more than 20 degrees, otherwise the text may not be converted properly.
o To get sharper images, focus on the center of the image.
o Enable the anti–shake system, as longer exposures in poor lighting conditions may cause blur.
o Use the automatic shutter release feature. This will prevent the camera from moving when you press the shutter release button. The use of automatic shutter release is recommended even if you use a tripod.
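The 20–degree limit on skew mentioned above can be checked on a photo by picking two points on the same text line. The Python sketch below is an illustration only, with hypothetical function names. It measures the angle of the line through the two points and is not how ABBYY FineReader detects skew.

```python
import math

MAX_SKEW_DEGREES = 20  # limit given in the recommendations above


def skew_degrees(x1, y1, x2, y2):
    """Angle between a text baseline through two points and the horizontal."""
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    # Fold into [-90, 90] so the direction of the baseline does not matter.
    if angle > 90:
        angle -= 180
    elif angle < -90:
        angle += 180
    return abs(angle)


def skew_is_acceptable(x1, y1, x2, y2):
    return skew_degrees(x1, y1, x2, y2) <= MAX_SKEW_DEGREES


print(skew_is_acceptable(0, 0, 100, 10))  # True (about 5.7 degrees)
print(skew_is_acceptable(0, 0, 100, 50))  # False (about 26.6 degrees)
```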
How to improve an image if:
the image is too dark or its contrast is too low.
Solution: Try to improve the lighting. If that is not an option, try setting a lower aperture value.
the image is not sharp enough.
Solution: Autofocus may not work properly in poor lighting or when taking pictures from a close distance. Try using brighter lighting. Use a tripod and self–timer to avoid moving the camera when taking the picture. If the image is only slightly blurred, try the Photo Correction tool that is available in the Image Editor. For more information, see "Editing Images Manually."
a part of the image is not sharp enough.
Solution: Try setting a higher aperture value. Take pictures from a greater distance at maximum optical zoom. Focus on a point between the center and the edge of the image.
the flash causes glare.
Solution: Turn off the flash or try using other light sources and increasing the distance between the camera and the document.
Opening an Image or PDF Document
ABBYY FineReader 12 lets you open PDF files and image files of supported formats.
Complete the following steps to open a PDF file or an image file:
1. Click the Open button on the main toolbar or click Open PDF File or Image… on the File
menu.
2. Select one or more files in the dialog box that opens.
3. If you selected a file with multiple pages, you can specify the range of pages you want to open.
4. Enable the Automatically process pages as they are added option if you want to automatically preprocess images.
Tip: The Options dialog box lets you choose how images are preprocessed: which defects will be removed, whether the document will be analyzed and so forth. To open the Options dialog box, click the Options… button. For more on preprocessing settings, see "Scanning and Opening Options."
Note: If there is a FineReader document open when you open new page images or documents, the new pages will be added to the end of this FineReader document. If no FineReader document is open, a new one will be created from the newly added pages.
Note: Access to some PDF files is restricted by their authors. Such restrictions include password protection, restrictions on opening the document and restrictions on copying content. When opening such files, ABBYY FineReader may request a password.
Scanning and Opening Options
To customize the process of scanning and opening pages in ABBYY FineReader, you can:
o enable/disable automatic analysis and recognition of newly added pages
o select various image preprocessing options
o select a scanning interface
You can access these settings from dialog boxes for opening and scanning documents (if you are using the scanning interface of ABBYY FineReader 12) and on the Scan/Open tab of the Options dialog box (Tools > Options…).
Important! Any changes you make in the Options dialog box will only be applied to newly scanned/opened images.
The Scan/Open tab of the Options dialog box contains the following options:
Automatic analysis and recognition settings
By default, FineReader documents are analyzed and recognized automatically, but you can change this behavior. The following modes are available:
Read page images (includes image preprocessing)
Any images added to a FineReader document are preprocessed automatically using settings from the Image Processing options group. Analysis and recognition are also performed automatically.
Analyze page images (includes image preprocessing)
Image preprocessing and document analysis are performed automatically, but recognition has to be started manually.
Preprocess page images
Only preprocessing is carried out automatically. Analysis and recognition have to be started by hand. This mode is commonly used for documents with complex structures.
If you do not want the images you add to a FineReader document to be automatically processed, clear the Automatically process pages as they are added option. This lets you quickly open large documents, recognize only select pages in a document and save documents as images.
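The three modes above, together with the case where automatic processing is switched off, differ only in which stages run automatically and which have to be started by hand. The Python sketch below tabulates this. The enum, the stage names and the function are assumptions made for the example and do not correspond to any FineReader programming interface.

```python
from enum import Enum


class AutoMode(Enum):
    READ = "Read page images"
    ANALYZE = "Analyze page images"
    PREPROCESS = "Preprocess page images"
    NONE = "Automatic processing disabled"


ALL_STAGES = ["preprocess", "analyze", "recognize"]

# Stages that run automatically in each mode, as described above.
AUTOMATIC_STAGES = {
    AutoMode.READ: ["preprocess", "analyze", "recognize"],
    AutoMode.ANALYZE: ["preprocess", "analyze"],
    AutoMode.PREPROCESS: ["preprocess"],
    AutoMode.NONE: [],
}


def manual_stages(mode):
    """Return the stages that must be started by hand in the given mode."""
    return [s for s in ALL_STAGES if s not in AUTOMATIC_STAGES[mode]]


for mode in AutoMode:
    print(mode.value, "-> manual:", manual_stages(mode))
```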
Image preprocessing options
ABBYY FineReader 12 lets you automatically remove common scan and digital photo defects.
General fixes
Split facing pages
The program will automatically split images that contain facing pages into two images containing a page each.
Detect page orientation
The orientation of pages that are added to a FineReader document will be automatically detected and corrected if necessary.
Deskew images
Skewed pages will be automatically detected and deskewed if necessary.
Correct trapezoid distortions
The program will automatically detect trapezoidal distortions and uneven text lines on digital photographs and scans of books. These defects will be corrected when appropriate.
Straighten text lines
The program will automatically detect uneven text lines on images and straighten them without correcting trapezoidal distortions.
Invert images
When appropriate, ABBYY FineReader 12 will invert an image's colors so that the image contains dark text on a light background.
Remove color marks
The program will detect and remove any color stamps and marks made in pen to facilitate the recognition of the text obscured by such marks. This tool is designed for scanned documents with dark text on a white background. Do not select this option for digital photos and documents with color backgrounds.
Correct image resolution
ABBYY FineReader 12 will automatically determine the best resolution for images, and will change the resolution of images when necessary.
Photo correction
Detect page edges
Sometimes digital photographs have borders that do not contain any useful data. The program will detect such borders and delete them.
Whiten background
ABBYY FineReader will whiten backgrounds and select the best brightness for images.
Reduce ISO noise
Noise will be automatically removed from photographs.
Remove motion blur
The sharpness of blurry digital photos will be increased.
Note: You can disable all of these options when scanning or opening document pages and still apply any desired preprocessing in the Image Editor. For details, see "Preprocessing Images."
Scanning interfaces
By default, ABBYY FineReader uses its own scanning interface. The scanning dialog box contains the following options:
o Resolution, Scanning mode, and Brightness
o Paper Settings
o Image Processing
Tip: You can choose which preprocessing features to enable, which defects to remove, and whether the document should be automatically analyzed and recognized. To do so, enable the Automatically process pages as they are added option and click the Options… button.
Multi–page Scanning:
a. Use automatic document feeder (ADF)
b. Duplex scanning
c. Set the page scanning delay in seconds
If the scanning interface of ABBYY FineReader 12 is incompatible with your scanner, you can use your scanner's native interface. The scanner's documentation should contain descriptions of this dialog box and its elements.
Image Preprocessing
Distorted text lines, skew, noise, and other defects commonly found in scanned images and digital photos can lower recognition quality. ABBYY FineReader can remove these defects automatically, and also lets you remove them manually.
Automatic image preprocessing
ABBYY FineReader has several image preprocessing features. If these features are enabled, the program automatically determines how an image can be improved based on its type and applies any necessary enhancements: removes noise, corrects skew, straightens text lines, and corrects trapezoidal distortions.
Note: These operations may take a significant amount of time.
Complete the steps below if you want ABBYY FineReader 12 to automatically preprocess all images that are opened or scanned.
1. Open the Options dialog box (Tools > Options…).
2. Click the Scan/Open tab and make sure that the Automatically process pages as they are added option in the General group is enabled and the necessary operations are selected in the Image preprocessing group.
Note: Automatic image preprocessing can also be enabled and disabled in the Open Image dialog box (File > Open PDF File or Image…) and in the scanning dialog box.
Editing images manually
You can disable automatic preprocessing and edit images manually in the Image Editor.
Follow the instructions below to edit an image manually:
1. Open the Image Editor by clicking Edit Image… on the Page menu.
The left–hand part of the Image Editor contains the page of the FineReader document that was selected when you opened the Image Editor. The right–hand part contains multiple tabs with tools for editing images.
2. Select a tool and make the desired changes. Most of the tools can be applied to selected pages or to all pages in the document. You can select pages using the Selection drop–down list or in the Pages window.
3. Click the Exit Image Editor button after you are done editing the image.
The image editor contains the following tools:
o Recommended Preprocessing: The program automatically determines which adjustments need to be made to the image. Adjustments that may be applied include noise and blur removal, color inversion to make the background color light, skew correction, straightening of text lines, correction of trapezoidal distortion, and trimming of image borders.
o Deskew: Corrects image skew.
o Straighten Text Lines: Straightens any curved text lines on the image.
o Photo Correction: Tools in this group let you straighten text lines, remove noise and blur, and turn the document's background color into white.
o Correct Trapezoid Distortion: Corrects trapezoidal distortions and removes image edges that don't contain any useful data. When this tool is selected, a blue grid appears on the image. Drag the grid's corners to the corners of the image. If you do this correctly, the grid's horizontal lines will be parallel to the text lines. Now click the Correct button.
o Rotate & Flip: Tools in this group let you rotate images and flip them vertically or horizontally to get the text on the image facing in the right direction.
o Split: Tools in this group let you split the image into parts. This can be helpful if you are scanning a book and need to split facing pages.
o Crop: Removes image edges that don't contain any useful information.
o Invert: Inverts image colors. This can be useful if you're dealing with non–standard text coloring (light text on a dark background).
o Resolution: Changes image resolution.
o Brightness & Contrast: Changes the brightness and contrast of the image.
o Levels: This tool lets you adjust the color levels of the images by changing the intensity of shadows, light, and halftones. To raise the contrast of an image, move the left and right sliders on the Input levels histogram. The left slider sets the color that will be considered to be the blackest part of the image, and the right slider sets the color that will be considered to be the whitest part of the image. Moving the middle slider to the right will darken the image, and moving it to the left will lighten the image. Adjust the output level slider to decrease the contrast of the image.
o Eraser: Removes a part of the image.
o Remove Color Marks: Removes any color stamps and marks made in pen to facilitate the recognition of the text obscured by such marks. This tool is designed for scanned documents with dark text on a white background. Do not use this tool for digital photos and documents with color backgrounds.
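The behavior described for the Levels tool matches the levels adjustment commonly found in image editors. The Python sketch below applies that common formula to a single 0–255 channel value. It is an assumption that ABBYY FineReader works the same way internally, and the function and parameter names are hypothetical.

```python
def apply_levels(value, black=0, white=255, gamma=1.0, out_low=0, out_high=255):
    """Map one 0-255 channel value through input levels, gamma, output levels."""
    if white <= black:
        raise ValueError("white point must be above black point")
    # Input sliders: clip to [black, white] and stretch to the range 0..1.
    t = (min(max(value, black), white) - black) / (white - black)
    # Middle slider: gamma above 1 lightens midtones, below 1 darkens them.
    t = t ** (1.0 / gamma)
    # Output sliders: compress the result into [out_low, out_high].
    return round(out_low + t * (out_high - out_low))


# Raising contrast: values at or below 50 become black, at or above 200 white.
print(apply_levels(50, black=50, white=200))    # 0
print(apply_levels(200, black=50, white=200))   # 255
# Lowering contrast with the output levels: white is mapped to light gray.
print(apply_levels(255, out_high=200))          # 200
```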
Recognizing Documents
ABBYY FineReader uses Optical Character Recognition (OCR) technologies to convert document images into editable text. Prior to OCR, the program analyzes the structure of the entire document and detects the areas that contain text, barcodes, images, and tables. OCR quality can be improved by selecting the correct document language, reading mode and print type prior to recognition.
By default, FineReader documents are recognized automatically. The current program settings are used for automatic recognition.
Tip: You can disable automatic analysis and OCR for newly added images on the Scan/Open tab of the Options dialog box (Tools > Options…).
In some cases, the OCR process can be started manually. For example, if you disabled automatic recognition, selected areas on an image manually, or changed the following settings in the Options dialog box (Tools > Options…):
o the recognition language on the Document tab
o the document type on the Document tab
o the color mode on the Document tab
o the recognition options on the Read tab
o the fonts to use on the Read tab
To launch the OCR process manually:
o Click the Read button on the main toolbar, or
o Click Read Document on the Document menu
Tip: To recognize the selected area or page, use the appropriate options on the Page and Area menus, or use the shortcut menu.
What Is a FineReader Document?
While working with the program, you can save your interim results in a FineReader document so that you can resume your work where you left off. A FineReader document contains the source images, the text that has been recognized on the images, your program settings, and any user patterns, languages or language groups that you have created in order to recognize the text on the images.
Working with a FineReader document:
o Opening a FineReader document
o Adding images to a FineReader document
o Removing a page from a document
o Saving documents
o Closing a document
o Splitting FineReader documents
o Ordering pages in a FineReader document
o Document properties
o Patterns and languages
Opening a FineReader document
When you start ABBYY FineReader, a new FineReader document is created. You can use this document or open an existing one.
To open an existing FineReader document:
1. On the File menu, click Open FineReader Document…
2. Select the desired document in the dialog box that opens.
Note: When you open a FineReader document that was created in an earlier version of the program, ABBYY FineReader will try to convert it to the current version of the FineReader document format. This process is irreversible, and you will be prompted to save the converted document under a different name. Recognized text from the old document will not be carried over to the new document.
Tip: If you want the last document you worked on to be opened when you start ABBYY FineReader, select the Open the last used FineReader document when the program starts option on the Advanced tab of the Options dialog box (click Tools > Options… to open the dialog box).
You can also open a FineReader document from Windows Explorer by right-clicking it and then clicking Open in ABBYY FineReader 12. FineReader documents are shown with their own icon.
Adding images to a FineReader document
1. On the File menu, click Open PDF File or Image…
2. Select one or more image files in the dialog box that opens and click Open. The images will be added to the end of the open FineReader document, and copies of them will be saved in the document's folder.
You can also add images from Windows Explorer to a FineReader document. Right-click an image in Windows Explorer and then click Open in ABBYY FineReader on the shortcut menu. If a FineReader document is open when you do so, the images will be added to the end of this document. If this is not the case, a new FineReader document will be created from the images.
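The rule described above (append to the open document if there is one, otherwise create a new document) can be sketched as follows. This is an illustrative Python model, not an ABBYY API: the function name open_in_finereader is hypothetical, and a document is represented simply as a list of image paths.

```python
from typing import List, Optional


def open_in_finereader(open_document: Optional[List[str]],
                       image_paths: List[str]) -> List[str]:
    """Hypothetical model of the 'Open in ABBYY FineReader' shortcut command.

    open_document is the currently open document (a list of image paths),
    or None if no FineReader document is open.
    """
    if open_document is None:
        # No document is open: a new document is created from the images.
        return list(image_paths)
    # A document is open: the images are added to the end of it.
    open_document.extend(image_paths)
    return open_document


# No document open yet, so a new one is created.
doc = open_in_finereader(None, ["page1.png"])
# A document is now open, so further images are appended to it.
doc = open_in_finereader(doc, ["page2.png", "page3.png"])
```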
Scans can also be added. For details, see "Scanning Paper Documents."
Removing a page from a document
Select a page in the Pages window and press the Delete key, or
On the Page menu, click Delete Page from Document, or
Right-click the selected page and click Delete Page from Document.
You can select and delete more than one page in the Pages window.
Saving documents
1. On the File menu, click Save FineReader Document…
2. In the dialog box that opens, specify the path to the folder in which you want to save the document and the document's name.
Important! When you save a FineReader document, any user patterns and languages that were created when you were working with this document are saved in addition to page images and text.