13. Support and Contact ......................................................................................... 102
www.foxitsoftware.com
info@luratech.com
PDF Compressor Manual: v
PDF Compressor Manual
Chapter 1. Introduction
PDF Compressor is an industrial strength solution for automated document conversion and
compression. Through its scalability it is suited for processing small amounts of data as well
as large quantities of input.
The PDF Compressor compresses and converts scanned documents to PDF or PDF/A. With
the Born Digital Module, further digital input formats such as MS Office™ documents, PDF
and e-mails in MSG or EML format (including attachments) can be converted to PDF/A.
The outstanding document compression provided by the PDF Compressor greatly reduces
file sizes while ensuring prime document quality in an ISO standard compliant format suitable for long-term archiving.
The fully integrated OCR engine provides for full-text searchable PDF compression results.
Created documents can automatically be handed over to downstream systems for various
further business processes.
PDF Compressor can generate PDF files for different requirements, e.g. files that avoid newer
PDF features in order to be compatible with Adobe™ Reader™ 5.0 (or other older viewers), or
archivable documents that meet the PDF/A standard (PDF/A-1, PDF/A-2, PDF/A-3).
PDF Compressor offers the following features:
• Scalable high quality compression of color document images
• Applying OCR to create full-text searchable PDFs from scanned pages
• Conversion from PDF to PDF/A
• Optimization of images or scanned pages inside existing PDF documents
• Merging, splitting and renaming of documents
• 1D and 2D barcode recognition; splitting, renaming and PDF bookmark creation by
barcode
• Combining digital documents with scanned pages, e.g. appending certificates and credentials to an online application
• Unified conversion of digitally created and scanned documents to PDF/A for long term
archiving
• Automatic input ingestion from watched input folders
• Time-triggered starting and stopping of jobs
• Fully automated processing in Job List Processing mode
Chapter 2. Installation
Run the self-extracting installer and follow the installer instructions. The installer should
be named PDFC_Setup_<xXX>_v<N.N.NN.NNN>.exe, where xXX corresponds to the system
architecture (either "x86" or "x64") and N.N.NN.NNN corresponds to the product’s version
number (e.g. 7.0.0.156).
Note: You need Administrator rights to install PDF Compressor.
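The installer naming scheme described above can be checked with a short script, for example before deploying a setup package automatically. The following Python snippet is only a sketch: the function name parse_installer_name is hypothetical, and the regular expression assumes that each version component consists of digits only.

```python
import re

# Pattern derived from the naming scheme PDFC_Setup_<xXX>_v<N.N.NN.NNN>.exe.
# Assumption: every version component is a plain decimal number.
INSTALLER_RE = re.compile(
    r"^PDFC_Setup_(?P<arch>x86|x64)_v(?P<version>\d+(?:\.\d+){3})\.exe$"
)

def parse_installer_name(filename):
    """Return (architecture, version tuple), or None if the name does not match."""
    match = INSTALLER_RE.match(filename)
    if match is None:
        return None
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("arch"), version

print(parse_installer_name("PDFC_Setup_x64_v7.0.0.156.exe"))
```

Returning the version as a tuple of integers makes it easy to compare two installer versions numerically.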
System Requirements
• Windows™ 32 Bit or 64 Bit, Vista / Windows 7 / Windows 8 / Server 2008 / Server 2012
• CPU: Intel / AMD or compatible x86 or x64 processors; single-core and multi-core processors are supported
• CPU Speed: Minimum 1 GHz, 2 GHz or more recommended
• RAM: Minimum 1 GB per licensed processor core, 2 GB per core recommended, more
than 2 GB for conversion of very large documents on 64 bit systems
• Disk space required for installation: 1 GB
• Microsoft™ .NET Framework 4.0
Note: These are minimum installation requirements. Required resources for production environments should be clarified with Foxit Europe.
Requirements of the Born Digital Conversion Option
In addition to the general system requirements, the following conditions must be met for
Born Digital document conversion of the PDF Compressor standard version.
To convert PDF to PDF/A no preparations are necessary.
For successful conversion of other born digital file formats the following conditions must
be met:
1. Windows™ 64 Bit, Windows 7 / Windows 8 / Server 2008 R2 / Server 2012
2. The necessary office applications have to be installed.
• For conversion of Microsoft Office™ Documents the appropriate components of Office
2007 or higher must be installed such that all Office document types submitted to
conversion can be opened. Ensure that the Microsoft Office applications can save as
PDF (the 'Microsoft Save as PDF or XPS' Add-In must be installed for Office 2007 prior
to SP2).
• For conversion of ODF documents (Open Document Format) the appropriate components of OpenOffice.org 3.2 must be installed.
3. The service "PDF Compressor" must run under a dedicated user account.
• Use the button "Configure system for office conversion" in File→Options→Conversion
Options to set the logon settings and necessary system configurations.
• The account can be a local or a domain user account, but it should have administrative
privileges on the computer.
• The account must have sufficient permissions to access the actual files to convert and
to access the office applications via (D)COM interface.
4. Manual preparation before first automatic conversion:
• Log in with the dedicated user account (set up for the service) and start each of the
Office applications used for conversion.
• Verify that all required input file types can be opened properly in Microsoft Office™
and/or OpenOffice.org. This procedure is required to complete the Office installation
and to ensure its completeness.
• For Microsoft Office™ test the "Save as PDF" feature.
• Ensure that all startup dialogs requiring user input are deactivated.
• In the office application options the trust center settings should disable macros and
other active elements (ActiveX) without notification.
• Automatic updates of references, fields and external content during document
opening or printing must be deactivated.
• For some conversions it is necessary to have a default printer installed.
Network Requirements
PDF Compressor can run as a stand-alone application without any network, but typically it
will access net shares by reading and/or writing files. The system administrator must ensure
that the account used to run the PDF Compressor’s service has the corresponding access
rights. See “Managing the PDF Compressor Service” (p. 87) for more details.
Service Requirements for OCR
The OCR function is implemented by using the ABBYY™ FineReader™ Engine SDK. To use the
OCR function it is necessary that the ABBYY SDK 11 Licensing Service is running. The service
is set up during installation and should not be disabled.
Additional OCR Languages
In addition to the languages installed there are more OCR languages available. Please contact Foxit Europe Support (support@luratech.com) to receive the files necessary for the installation of additional OCR languages. PDF Compressor
supports OCR for almost any language with Latin script as well as some languages with Greek
and Cyrillic characters. With an additional license the recognition of Asian languages (Chinese, Japanese, Korean) and Arabic or Gothic text type is also available.
There is a separate setup package for the additional OCR languages. It provides a number of feature add-ons and language bundles to choose from. Please note that the additional
OCR languages must match the version of the OCR engine of the installed PDF Compressor.
Therefore, when updating PDF Compressor to a version with a newer OCR engine, a warning
is shown that the installed additional OCR languages must also be updated. In that case,
please contact support to get the new package.
Custom OCR Dictionary
In addition to the standard OCR languages PDF Compressor supports a user-supplied custom
dictionary. This dictionary can, for example, contain words that do not belong to any particular
language, such as place or brand names. A custom dictionary has to be provided in the form of an
ABBYY .amd file. Foxit Europe Support can assist you with creating such a file.
If a file "PDFCUserDictionary.amd" is present in the "ocr\Data\ExtendedDictionaries" directory in the PDF Compressor’s installation folder, it is ingested as a custom dictionary and will
be applied to all jobs for which OCR is activated. As a consequence, special words contained
in this dictionary should be recognized better by the OCR. Nevertheless, you have to select
one or more of the standard OCR languages for each job with OCR.
In case an additional file "PDFCUserAlphabet.txt" is present in the "ocr\Data\ExtendedDictionaries" folder its contents are taken as additional characters for the custom dictionary. In
this way recognition of special characters - e.g. letters with diacritic marks - that are not part
of the base language, but appear in the custom dictionary, can be enabled. This alphabet
need not contain the standard letters of the base language(s) - only the additional characters. The "PDFCUserAlphabet.txt" file must be UTF-8 encoded.
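Before starting OCR jobs it can be useful to verify that the optional custom dictionary files are in place and that the alphabet file is valid UTF-8. The following Python snippet is a minimal sketch of such a check. The function name check_custom_ocr_files is hypothetical and the installation folder must be supplied by the caller. The script only inspects the files; it does not validate the contents of the .amd file.

```python
from pathlib import Path

def check_custom_ocr_files(install_dir):
    """Report which optional custom OCR files exist below install_dir."""
    folder = Path(install_dir) / "ocr" / "Data" / "ExtendedDictionaries"
    dictionary = folder / "PDFCUserDictionary.amd"
    alphabet = folder / "PDFCUserAlphabet.txt"

    report = {
        "dictionary_present": dictionary.is_file(),
        "alphabet_present": alphabet.is_file(),
        "alphabet_is_utf8": None,  # stays None when no alphabet file exists
    }
    if report["alphabet_present"]:
        try:
            # The manual requires the alphabet file to be UTF-8 encoded.
            alphabet.read_bytes().decode("utf-8")
            report["alphabet_is_utf8"] = True
        except UnicodeDecodeError:
            report["alphabet_is_utf8"] = False
    return report
```

A result with alphabet_is_utf8 set to False indicates that the alphabet file should be re-saved with UTF-8 encoding.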
Location of temporary files
PDF Compressor may store temporary files during processing. These files will be located in
a subfolder "LT_PDF_Compressor" in the temporary folder of the service account. The path
of the temporary folder is specified by the following environment variables: TMP, TEMP or
USERPROFILE. If none of these variables is specified for the user or system the Windows
directory is used instead. Alternatively the environment variable LT_PDFCOMP_TMP may
be used to specify a temporary folder for PDF Compressor.
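The lookup order described above can be sketched in Python as follows. This is an illustration, not the product's actual implementation. Two points are assumptions, because the manual does not state them: that LT_PDFCOMP_TMP takes precedence over the other variables, and that the folder given in LT_PDFCOMP_TMP is used as-is without the "LT_PDF_Compressor" subfolder. The function name resolve_temp_folder and the default Windows directory are hypothetical.

```python
import os
from pathlib import PureWindowsPath

def resolve_temp_folder(env, windows_dir=r"C:\Windows"):
    """Return the folder for temporary files, given a mapping of environment variables."""
    # Assumption: the product-specific variable overrides everything else
    # and is used directly, without appending a subfolder.
    override = env.get("LT_PDFCOMP_TMP")
    if override:
        return PureWindowsPath(override)
    # Order as listed in the manual: TMP, TEMP, USERPROFILE.
    for name in ("TMP", "TEMP", "USERPROFILE"):
        value = env.get(name)
        if value:
            return PureWindowsPath(value) / "LT_PDF_Compressor"
    # Fallback described in the manual: the Windows directory.
    return PureWindowsPath(windows_dir) / "LT_PDF_Compressor"

print(resolve_temp_folder(dict(os.environ)))
```

Note that the result is only meaningful when the script runs under the service account, since the temporary folder depends on that account's environment.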
Chapter 3. Concept of PDF Compressor
The PDF Compressor is designed to convert files automatically without any user interaction.
The PDF Compressor, after being configured, will run in the background. You do not need
to be logged in on the system any longer. Even after a shutdown and reboot of your system,
PDF Compressor will continue to process its jobs.
This kind of functionality results from PDF Compressor consisting of two basic components:
• A graphical user interface (GUI) that allows you to configure and monitor the system.
• An underlying Windows™ service, which performs the conversion jobs.
However, you can use the GUI like any ordinary Windows™ application without even
noticing the underlying Windows™ service.
Graphical User Interface
Job List
The job list is the main element of the GUI. When you first start the PDF Compressor you
will see an empty list. Before you can do anything you need to add an entry to this list. This
can be done by creating a new entry or by simply dragging & dropping files or folders onto
the list window (compare “Adding Entries to the List” (p. 8)).
An entry has many properties. One of them is the input file or folder you want to process,
others include all the parameters that can be configured for the file conversion. Since you
have a list with a virtually unlimited number of entries, you can set up different types of file
conversions, each with its own, independent set of parameters.
You can configure entries to access different net shares within your network, or you might
use them to create different versions of PDF files (high resolution, high quality up to low
resolution, high compression) from the same input files.
The complete job list can be stored to a file or restored from a file to enable backup functionality or the management of completely different setups.
Job Entry and its Properties
Within each entry you define its properties:
• what kind of input is accessed (a file, all files within a folder, what file types, etc.),
• how it is accessed (once or frequently to make a folder a hot folder),
• where to put the output,
• what to do in case of an error,
• what additional data (metadata) is to be added,
• how to compress the data (quality, resolution, PDF compatibility, etc.).
Each entry can be started and stopped individually, allowing you to run individual file conversions according to your needs.
Default Properties
Since most of the time you will not want to create job entries with completely different
settings, you can configure Default Properties for new entries. Whenever you create a new
entry the settings of the default entry will be used as a template.
With a properly chosen set of default properties, converting files to PDF is achieved by simply dragging & dropping files onto the list and pressing the start button.
PDF Compressor Service
Normally you will not notice the underlying Windows™ service. It just processes the files
you set up within the GUI. Since the GUI is independent of the service, you can configure
and start one or more job entries, close the GUI and log off if you want. When you open the
GUI again, you will be informed of the progress of your running jobs.
Born Digital Conversion
The standard version of PDF Compressor comprises an optional component for conversion
of Born Digital files, such as Microsoft Office™ and OpenOffice™ documents and e-mail messages.
You can choose whether to install this component, since using this component requires an additional license.
An installed and licensed Born Digital component provides for the following types of conversions:
1. Microsoft Office™, OpenOffice™, RTF, HTML and plain text documents are converted to
PDF or PDF/A. The distinction between these input formats is specified via the options
on the Input tab of the Properties dialog (see “Configuring Input Data” (p. 13)).
The exact list of format types depends on your installation and configuration of the MS
Office and OpenOffice applications. Generally supported applications and formats include
Microsoft Word, Excel, PowerPoint and Outlook as well as OpenOffice Writer, Calc
and Impress.
2. PDF Compressor also supports the conversion of e-mail messages saved in the EML and
MSG formats. Attachments of the supported input file types are extracted and converted, as well.
3. PDF documents can be converted to PDF or PDF/A. Again, the distinction between these
output formats is specified via the options on the Output tab of the Properties dialog
(see “Configuring Data Output” (p. 25)). Note that the conversion of PDF does not
involve the MS Office or OpenOffice applications.
Certain options are available to improve the stability of the Born Digital conversion, e.g. by
restarting it automatically. For details see “Born Digital Options” (p. 90).
Chapter 4. Getting Started
This section describes the basic steps to convert scanned document files to PDF files. More
details can be found under Chapter 5, Setting up Job Entries (p. 10).
Starting the GUI
To start the GUI, either double-click the PDF Compressor taskbar icon or use the Windows™ start menu entry
Foxit → PDF Compressor → PDF Compressor.
When you start the GUI for the first time you will see an empty list:
Adding Entries to the List
There are several ways to add new entries. The most straightforward one is opening the
Windows™ Explorer and dragging & dropping a file or a folder onto the list window. The
new entry will be set up to convert the given file (or all files within the given folder) to PDF.
This output will be placed next to the corresponding input file.
Note: Changing the default properties can change this behavior. The explanations given here refer to the Factory Default settings.
Running Jobs
There are several ways to start jobs represented by the list entries. Here we merely describe
one of them:
Select the entry you want to start and press the start button on the toolbar. The status
of the entry will change from stopped via starting to working, and back to stopped when
the job is finished.
While the job is running, you can abort it by pressing the stop button on the toolbar. It
may take some seconds until the conversion has been aborted. Any incomplete PDF output
file will be removed.
Removing List Entries
You can close the GUI without saving your current job list. The list will reappear in exactly
the same state when you open the GUI again, because the current list is instantly stored
within the system to make it available to the underlying PDF Compressor service. To remove
an entry from the list, select it and press the delete button on the toolbar.
Chapter 5. Setting up Job Entries
You can edit a job entry’s properties once it has been added to the list (see “Adding Entries to
the List” (p. 8)). The properties dialog window can be opened by any of the following actions:
• Double-click the corresponding entry within the list.
• Use the right-click context menu associated with the list entry.
• Select the entry and use the toolbar’s properties button.
• Select the entry and use the menu Entry → Properties.
Within the properties dialog you can change the name of the entry to make it easier for you
to identify the corresponding job. The entry name is also used within the log file (see “Log Files” (p. 80)) to identify information, warnings and errors.
Configuring General Job Properties
The General tab lets you configure general properties of the selected job. You can define
the priority of the job here and you can configure a delayed start and periodical stop and
restart of the job.
Job List Processing Order
The Job priority can be set to values from 1 (highest) to 10 (lowest). Lower numbers correspond to higher priority. This setting is only available when the menu File → Priority Processing Order is enabled. The corresponding toolbar button can also be used to switch Priority Processing mode on and off (see footnote 1).
Priority Processing Order
If Priority Processing Order is enabled, jobs with higher priorities are processed before lower
priority jobs. All jobs with the same priority are executed concurrently by processing their
job units in interleaved order: Once a unit is done, the next unit of the next job with the same
priority is processed. When more than one CPU core is licensed and enabled, more than one
job unit will be processed at the same time. The list of progress bars located in the lower
part of the main window shows information for each CPU core, including the job units that
are currently processed.
Footnote 1: All entries of the job list must be stopped to change the Priority Processing mode.
A job unit contains all input files that contribute to one output PDF document (see footnote 2). If input merging is enabled (see “Configuring Input Data” (p. 13)), this includes all files that are
merged into a single document; otherwise it is just one single input file.
Hot folder jobs (see “Configuring Input Data” (p. 13)) are considered done when their
input folder is empty. Once new input files are detected, a hot folder job becomes active
again and is processed according to its priority setting.
Priority Processing Order allows setting up complex job dependencies. High priority hot folder jobs can be used to dynamically suspend other lower priority jobs. The priority of a job
can be changed at any time, even when it is running. Thus it allows you to start an interim
job to be processed immediately without stopping a currently running job.
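The interleaving of job units in Priority Processing Order can be illustrated with a small simulation. The following Python sketch is a simplified model under stated assumptions, not the service's actual scheduler: it assumes a single CPU core, a fixed set of jobs, and no hot folders that become active again later. The function name processing_order and the sample job names are hypothetical.

```python
from collections import deque
from itertools import groupby

def processing_order(jobs):
    """jobs: list of (name, priority, [units]); return the unit order on one core."""
    order = []
    # Priority 1 is the highest, so an ascending sort puts it first.
    ranked = sorted(jobs, key=lambda job: job[1])
    for _priority, group in groupby(ranked, key=lambda job: job[1]):
        queue = deque((name, deque(units)) for name, _, units in group)
        # Jobs of equal priority take turns, one job unit at a time.
        while queue:
            name, units = queue.popleft()
            if not units:
                continue  # this job has no units left and drops out
            order.append((name, units.popleft()))
            queue.append((name, units))
    return order

jobs = [
    ("archive", 5, ["a1", "a2"]),
    ("invoices", 1, ["i1", "i2", "i3"]),
    ("mail", 1, ["m1"]),
]
for job_name, unit in processing_order(jobs):
    print(job_name, unit)
```

In this model the two priority 1 jobs alternate until both are done, and the priority 5 job only starts afterwards. With several licensed cores, multiple units would be processed at the same time, which this sketch does not model.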
Please find more information on priority processing options under “Priority Processing” (p. 82).
List Processing Order
Priority Processing Order is the preferred way to control the order in which jobs are
processed. For backward compatibility List Processing Order is still available. To enable List Processing Order, disable Priority Processing Order from the File menu.
In List Processing Order mode, all jobs are processed in the order they appear in the job list.
Priorities are ignored and jobs are executed sequentially. There is no way to dynamically
suspend a running job.
Job List Processing
In Job List Processing mode (see Chapter 7, Automatic Job List Processing (p. 73)) PDF
Compressor dynamically reads job lists from the job list folder. In API Mode, jobs can be
passed via a programmatic interface (see Chapter 8, Submitting Jobs via the PDF Compressor API (p. 77)). These modes permit controlling PDF Compressor externally in a workflow
setup.
For Job List Processing, the List Processing Order mode can be used to enforce sequential
processing of the job lists and the jobs contained. In normal Priority Processing Order mode
the jobs are processed in parallel and new job lists are added as soon as at least one processor core becomes idle and there are no more jobs pending.
Time Scheduled Processing and Processing Timeout
Settings for Time Scheduled Processing are enabled by selecting their check boxes. The following parameters can then be set for a time-scheduled job:
1. The desired start time is entered in the Start job at field. You can either type in a date
and time or pick it from the calendar widget, which appears when clicking on the arrow
to the right of the entry. The job is then started at this point in time.
2. The Duration specifies the time span for which the job should run. You can select days,
hours or minutes to specify the duration. Once this time has elapsed the job is stopped.
Footnote 2: Please note that a PDF input file might be split into multiple output files. For further information, see “Configuring Data Output” (p. 25).
3. The Restart interval denotes a second time span, after which the stopped job is started
again for the given duration. This value can only be given in days. Once the job has been
started, such re-starts occur periodically, unless the configuration is changed.
Note: The Duration and Restart interval settings can be enabled independently. This means
that a job can run just once for a given duration without restart. And on the other hand a
job can be periodically restarted, without a fixed duration, assuming that it will finish once
it has processed all its current input.
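The interplay of start time, Duration and Restart interval can be illustrated by computing the resulting run windows. The following Python snippet is only a sketch. It assumes that the Restart interval is measured from the previous start time; the manual does not specify the reference point, so the actual behavior may differ. The function name run_windows is hypothetical.

```python
from datetime import datetime, timedelta

def run_windows(start, duration=None, restart_days=None, count=3):
    """Return up to `count` (start, stop) pairs; stop is None without a Duration."""
    windows = []
    current = start
    for _ in range(count):
        stop = current + duration if duration is not None else None
        windows.append((current, stop))
        if restart_days is None:
            break  # no Restart interval configured: the job runs only once
        # Assumption: the interval is counted from the previous start time.
        current = current + timedelta(days=restart_days)
    return windows

# A job that starts at 22:00, runs for 6 hours and restarts every day.
for begin, end in run_windows(datetime(2024, 1, 1, 22, 0), timedelta(hours=6), 1):
    print(begin, "->", end)
```

A window with a stop value of None corresponds to a job without a fixed duration, which ends once it has processed all its current input.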
With the Processing Timeout option you can set a maximum duration for the processing of a
job unit. (The definition of job unit is given above under “Priority Processing” (p. 82).) If
processing of a job unit takes longer than the specified duration, the processing of this unit
is aborted. Such an abort is considered an error, which is recorded in the log file. Concerning
optionally configured post-processing steps an abort is also treated as an error.
Configuring Advanced Job Options
The Advanced Job Options dialog lets you configure additional error handling and Foxit Europe maintenance settings.
1. If the option Continue job on critical error is turned on, PDF Compressor will not stop a
hot folder when a critical error occurs (such as "input folder does not exist" or "output
folder cannot be created") but will go idle instead and try again later.
2. The Foxit Maintenance entry should not be used unless advised by the Foxit Europe Support Team. The Key must be left blank for normal processing.
Configuring Input Data
The Input tab lets you configure which input documents should be converted to PDF format.
Currently TIFF, JPEG, JP2 (JPEG 2000), BMP, PNM, GIF, PNG, PDF and JPM (JPEG 2000 part 6)
formats are accepted for input files. With the optional Born Digital Module, additional input
formats - such as Microsoft Office and saved e-mail messages - can be converted.
Multipage TIFF and PDF files are recognized and converted to multipage PDF documents.
Moreover you can set up input resampling to adjust the resolution of your data.
1. Choose File if you want to convert a single file, or Directory if you want to convert all files
within the given folder. The browse button lets you select the file or directory.
You cannot use mapped network drives to specify the input file or directory. The underlying PDF Compressor Service has no knowledge of mapped network drives. Instead you
can specify the network share using the syntax \\host\share\dir. See “Choosing the Service Account” (p. 88) for setting access rights in this case.
2. When Directory is selected, you have the following options:
a. Enable Include subfolders if the whole directory sub tree should be converted. Otherwise
only files directly within the given folder will be processed.
b. Enable Delete empty folders if you have selected Include subfolders and want to clean
up the input sub tree. This can be useful if you configure a setup that moves or deletes
all processed input files (see also “Configuring Post-Processing” (p. 37)). Once all
files of the given job have been processed, the output folder will be examined and all
empty subfolders will be deleted. If the input folder is a hot folder (see next item), the
cleanup process will be performed whenever the hot folder becomes idle, i.e. when
no more files are found to be processed at the moment.
c. Enable Check every <number> seconds to make the given folder a hot folder, i.e. to
let the job check that folder frequently for newly generated files (e.g. by an external
scanner or capture software). A job that has this option enabled will never stop automatically, since it keeps on monitoring the given folder. You have to stop it manually.
Note: This option only makes sense if you delete successfully converted input files
or move them into another folder. See also “Configuring Post-Processing” (p. 37).
d. Enable Merge all pages if all pages from all processed files of a folder should
be converted to a single PDF output document. The output file will be named
<name_of_the_input_folder>.pdf. The input files of a folder will be processed in alphabetical, case-sensitive order. This defines the page order of the output PDF.
This option can be used to merge pages from scanners that put numbered single page
files (such as page000.tif, page001.tif, etc.) into a single folder.
e. Input file formats can be selected from the given list. The format is derived from the
file’s extension.
• TIFF enables the conversion of *.tif and *.tiff files. Currently PDF Compressor supports bilevel, grayscale and RGB images. CMYK and CIELab color as well as palettized
images are not supported yet.
• JPEG enables the conversion of *.jpg files.
• JPEG 2000 enables the conversion of JPEG 2000 files. The extensions *.jp2, *.jpf,
*.jpx, *.j2k, *.jpc and *.j2c are recognized.
• PDF enables the conversion of PDF files. If Always is chosen under Rasterize PDF Input, the input will be rasterized to the resolution specified on the Advanced
tab of the Advanced Input Options dialog (see “Configuring Advanced Input Options” (p. 20)) before applying the MRC compression technique. If Automatic
has been chosen for Rasterize PDF Input, PDF files may be rasterized, e.g. if conversion to PDF/A is not possible otherwise.
• BMP enables the conversion of *.bmp files. Palettized images as well as some exotic
variants are not supported yet.
• PNM enables the conversion of *.ppm and *.pgm files.
• GIF enables the conversion of *.gif files. For animated GIF files only the first frame
is converted and a warning is logged.
• PNG enables the conversion of *.png files.
• The formats below Born Digital are only available if the Born Digital conversion option
is installed. Any combination of these input document types can be converted
to PDF or PDF/A. Refer to “Born Digital Conversion” (p. 6) and “Requirements of the Born Digital Conversion Option” (p. 2) for details on Born Digital Conversion. The
following list shows the file extensions matching the input formats.
• MS Word : *.rtf, *.doc, *.docx, *.docm
• MS Excel : *.xls, *.xlsx, *.xlsm
• MS PowerPoint : *.ppt, *.pps, *.pptx, *.pptm, *.ppsx
• MS Project : *.mpp
• MS Visio : *.vsd, *.vdx, *.vsw
• E-Mail : *.msg, *.eml, *.emlx
• OpenOffice Writer : *.odt, *.sxw
• OpenOffice Calc : *.ods, *.sxc
• OpenOffice Impress : *.odp, *.sxi
• Text files : *.txt
• HTML : *.htm, *.html
• Extra Extensions List serves to supply a list of additional file extensions to process.
This entry is only enabled, when Born Digital Conversion is installed. The individual
extensions entered must be separated by blank spaces.
3. The Advanced Input Options button opens a dialog for configuring more input options.
See “Configuring Advanced Input Options” (p. 20) for details.
4. If you configure multiple PDF Compressor installations to work with the same input directory (typically a net share), you need to enable Use lock files for PDF Compressor synchronization. This ensures that input files currently processed by one PDF Compressor
are not accessed by another PDF Compressor. This is implemented by the use of lock files *.lock and *.dlock that PDF Compressor uses to lock certain input files or all files of
a directory (when Merge all pages is enabled).
Note: The PDF Compressor Service must have write access to the input directories when
using lock files.
Important note: Files ending with *.lock and *.dlock will be overwritten without notice
when using lock files. You should not use these extensions for your own files.
Note: A file PDF_Compressor.ulock can be used as a user defined lock for all files of a
directory. In contrast to the lock files *.lock and *.dlock, this file can be managed by the
user. If Use lock files for PDF Compressor synchronization is enabled, PDF Compressor
will never process any file it finds within a directory that contains a file PDF_Compressor.ulock.
This can be used to configure a hot folder with Merge all pages enabled: The external
process creates a new directory underneath the PDF Compressor hot folder containing a
file PDF_Compressor.ulock and copies all files that need to be merged afterwards. When
the last file copy is done, the external process removes PDF_Compressor.ulock. Next time
PDF Compressor inspects the hot folder, the processing of the given files begins.
5. The Rasterize PDF Input setting offers different modes for handling PDF input documents:
a. Automatic is the default setting. In this mode PDF Compressor will automatically determine
in which cases to convert typeset or vector contents of PDF input files into
raster graphics. It will try to avoid such conversions, but will resort to them as a fallback option for documents that cannot be converted otherwise.
b. Always - this mode will convert all pages of all PDF input documents to raster graphics
before compressing or converting them. For PDF input this mode is equivalent to the
former MRC Compression mode.
c. With the Never setting the PDF Compressor will not try to rasterize any non-raster
input. In this way you avoid unintentional conversion of typeset or vector content into
raster graphics. On the other hand, this may result in failure to convert certain kinds
of PDF input files to PDF/A.
Note: With the Automatic or Never setting chosen, raster graphics already present in
the input documents may still be resampled or compressed if Color image optimization or Bitonal image optimization are selected on the Compression tab. (Such graphics
may e.g. be scanned pages or large images inside a PDF document.)
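The user-defined lock file PDF_Compressor.ulock described under item 4 above lets an external process deliver a complete batch of files to a hot folder before PDF Compressor starts merging them. The following Python snippet sketches the external side of this procedure. It is an illustration under assumptions: the function name deliver_batch and the batch folder name are hypothetical, and the job is assumed to have Use lock files for PDF Compressor synchronization, Include subfolders and Merge all pages enabled.

```python
import shutil
from pathlib import Path

LOCK_NAME = "PDF_Compressor.ulock"  # file name as given in the manual

def deliver_batch(files, hot_folder, batch_name):
    """Copy files into a new subfolder of hot_folder, guarded by a user lock."""
    batch_dir = Path(hot_folder) / batch_name
    batch_dir.mkdir(parents=True)
    lock = batch_dir / LOCK_NAME
    lock.touch()  # create the lock before any input file appears
    for source in files:
        shutil.copy2(source, batch_dir / Path(source).name)
    # The lock is removed only after every copy has succeeded. If a copy
    # fails, the exception propagates and the lock stays in place, so the
    # incomplete batch is not processed.
    lock.unlink()
    return batch_dir

# Example call with hypothetical paths:
# deliver_batch(["page000.tif", "page001.tif"], r"\\host\share\hot", "batch-0001")
```

Leaving the lock in place after a failed copy is a deliberate design choice in this sketch: an operator can inspect the folder, complete the batch and remove the lock manually.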
Input Filter Options
File Extensions
On the File Extensions tab of the Input Filter Options dialog you can specify additional input
file extensions, so that files with these non-standard extensions are processed by PDF Compressor, too.
Each line of the list widget on the File Extensions tab shows one of PDF Compressor’s input
file types. You can find the file extensions configured by default for each of these types
under “Configuring Input Data” (p. 13). For the selected job additional file extensions
can be associated with each file type in the list’s right hand column. For the selected job
each additional extension may only be associated with one input file type.
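The rule that each additional extension may belong to only one input file type can be sketched as a small consistency check. This is an illustration only, not PDF Compressor's implementation: the function name build_extension_map, the file type names and the case-insensitive comparison of extensions are all assumptions.

```python
def build_extension_map(additional_extensions):
    """Map each additional extension to its single input file type.

    additional_extensions: dict of file type -> list of extra extensions,
    mirroring the two columns of the list on the File Extensions tab.
    Raises ValueError if one extension is assigned to two file types.
    """
    owner = {}
    for file_type, extensions in additional_extensions.items():
        for extension in extensions:
            # Assumption: extensions are compared without dot and case.
            key = extension.lower().lstrip(".")
            if key in owner and owner[key] != file_type:
                raise ValueError(
                    f"extension '{key}' is already associated with {owner[key]}"
                )
            owner[key] = file_type
    return owner


# Hypothetical job configuration with two extra extensions.
print(build_extension_map({"TIFF": [".tf"], "JPEG": ["jpe"]}))
```

Rejecting the duplicate at configuration time keeps the decision about how to read a file unambiguous when the job runs.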
File Names
On the File Names tab of the Input Filter Options dialog you can specify a file name pattern
to include only input files with matching names in the processing by PDF Compressor. Alternatively, you can exclude the matched files from processing.
1. Depending on the selection of either Process matching or Skip matching, potential input
files that match the filter criteria are either included in the processing or skipped.
2. In the File Name Filter Pattern (may contain * and ?) entry you specify a filter pattern,
against which the names of all potential input files are matched. This pattern may contain
the special characters '*' and '?'.
These characters have the same significance as with the Windows™ file search. The question
mark '?' matches an arbitrary single character and the asterisk '*' matches an arbitrary
sequence of characters, which may be empty. The pattern is matched only against
the file’s base name, i.e. without the file extension, and all alphabetical characters are
matched in a case-insensitive fashion.
Examples: The filter pattern 'Client-????-??' matches the file names 'Client-0000-12.jpeg',
'Client-9900-00.jpeg' and 'client-ABCD-XY.jpg'. It will not match the names 'Client-0000-1.jpg' or 'Client-ABC-XY.jpeg'.
The filter pattern 'File*' matches the file names 'file.jpg', 'File_0123.jpg', 'FILE-A-B-C-D.jpg'
and 'File-2016-08.jpg'. It will not match the names 'Fille.jp2', 'Fil.jpeg' or '2016_File.jpg'.
The filter pattern '??_scan*' matches the file names '00_SCAN.TIF', 'ab_Scan.jpg',
'09_scan_2014.jpg' and 'XY_Scan_2014.jpeg'. It will not match the names 'scan2014.jpg' or
'00_SCA.jpeg'.
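The matching rules described above can be reproduced in a few lines of Python, which is handy for checking a pattern against sample file names before configuring a job. This is a minimal sketch and not PDF Compressor's own code: the function names are hypothetical, and the base name is assumed to be everything before the last dot.

```python
import os
import re


def pattern_to_regex(pattern):
    """Translate a filter pattern with '*' and '?' into a compiled regex."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")   # arbitrary sequence, possibly empty
        elif char == "?":
            parts.append(".")    # exactly one arbitrary character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def matches_filter(file_name, pattern):
    """Match the pattern against the base name only (extension removed)."""
    base_name, _extension = os.path.splitext(file_name)
    return pattern_to_regex(pattern).fullmatch(base_name) is not None


def should_process(file_name, pattern, process_matching=True):
    """Apply the Process matching / Skip matching selection."""
    matched = matches_filter(file_name, pattern)
    return matched if process_matching else not matched


for name in ["Client-0000-12.jpeg", "client-ABCD-XY.jpg", "Client-0000-1.jpg"]:
    print(name, matches_filter(name, "Client-????-??"))
```

With Skip matching selected (process_matching=False), the same pattern excludes exactly the files it would otherwise include.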
PDF Input Options
On the Input Filter tab of the Input Filter Options dialog you can configure options that exclude certain input files from being processed by PDF Compressor.
1. The Bypass PDF files smaller than … kB option serves to skip the processing of input files
smaller than the given file size. This can e.g. be used to suppress the processing of already
compressed PDF files.
2. The Metadata Filter options allow you to skip processing of PDF input files depending on the PDF metadata. Skipped files will not be processed to output files. Instead
they are passed directly to input file post-processing (see “Configuring Post-Processing” (p. 37)).
a. Depending on the selection of either Process matching or Skip matching, PDF files that
match the filter criteria are either included in or omitted from the processing.
b. The filter criterion is matched in case the Metadata key to check is found in the PDF
Metadata and its value matches the Search text below.
c. Depending on the option Match whole value the Search text is either compared to the
complete value or searched as part of the metadata value.
d. Skipped input files can be copied to a directory by enabling Copy skipped files to
directory.
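The way the size bypass and the metadata filter combine can be sketched as a single decision function. This is an assumption-laden illustration rather than the product's logic: the function and parameter names are hypothetical, the PDF metadata is modeled as a plain dictionary, 1 kB is taken as 1024 bytes, and text comparison is assumed to be case-sensitive.

```python
def metadata_matches(metadata, key, search_text, match_whole_value):
    """True if the key exists and its value matches the search text."""
    if key not in metadata:
        return False
    value = metadata[key]
    if match_whole_value:
        return value == search_text   # compare against the complete value
    return search_text in value       # search as part of the value


def should_process_pdf(size_bytes, metadata, bypass_below_kb=None, key=None,
                       search_text="", match_whole_value=False,
                       process_matching=True):
    """Decide whether a PDF input file is processed or skipped."""
    # Bypass PDF files smaller than ... kB (assumption: 1 kB = 1024 bytes).
    if bypass_below_kb is not None and size_bytes < bypass_below_kb * 1024:
        return False
    if key is None:                   # no metadata filter configured
        return True
    matched = metadata_matches(metadata, key, search_text, match_whole_value)
    return matched if process_matching else not matched


# Hypothetical example: skip files that an earlier run already produced.
info = {"Producer": "PDF Compressor 8"}
print(should_process_pdf(500_000, info, key="Producer",
                         search_text="PDF Compressor",
                         process_matching=False))
```

A file for which this function returns False would go straight to input file post-processing and, if the corresponding option is enabled, be copied to the directory for skipped files.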
Configuring Advanced Input Options
Clicking the Advanced Input Options button brings up a dialog with further settings concerning
input options.
Advanced Input Options
1. Enable Read only up to <number> pages from input file to restrict the number of pages
read from each input file to the given limit. When you merge all files from an input directory (see “Configuring Input Data” (p. 13)) to a single PDF output file, setting the
maximum number of input pages to 1 can be used to create documents that only contain
the cover pages of these files.
2. Select Case insensitive sorting of files and folders if you want the input file processing to
be in case insensitive alphanumeric order. Otherwise, the ordering will be case sensitive
(with capital letters lexically smaller than lower case letters and therefore "Ac" coming
before "ab").
3. Enable Read local copy of each input file to solve problems related to the location of input
files on a network share. The files will be temporarily copied to the local temporary folder
where they will be opened and read. This is especially useful when the performance for
transmitting small parts of data in the network is low.
4. Default Resolution: If the input image file does not contain any resolution information,
the value (in dpi) given here will be used. In addition this value is used when an input
PDF file is rasterized (PDF files do not have specific resolution information in general).
a. Enable Optimized dpi for PDF input to automatically choose an optimal dpi value for
the rendering of each PDF input page. The optimal value is calculated from the resolutions of all images embedded in the PDF input page. If there are no images on the
page, the setting of the Default Resolution will be used. Otherwise the highest resolution of the detected images is calculated and clipped to the resolution range given
by the Resampling option of the Input tab. We recommend enabling this option only
when the input PDF documents originate from a scan process.
b. Born digital PDF documents may contain images of small physical size (e.g. less
than an inch in either dimension) but containing many pixels (e.g. 1000 x 1000 pixels)
and thus yielding an unrealistic resolution (more than 1000 dpi in the example). For
born digital documents you should either use a fixed Default Resolution value without enabling Optimized dpi for PDF input, or specify a narrow resolution range in the
Resampling settings.
5. Resampling lets you change the input’s resolution. There are two reasons why you might
want to change the input’s resolution:
a. If the input resolution is too low (below 150 dpi), the PDF Compressor output might
have a quality that is too low for your needs (although the file size will be extremely
small). Changing the resolution e.g. to 300 dpi will produce better results.
b. If your original scans are high resolution (e.g. more than 600 dpi), you might want to
produce a lower resolution PDF output.
The Minimal and Maximal dpi settings address these requirements. Whenever the
resolution of the input files is outside the given [min, max] range, the resolution will
be changed by resampling to the minimum value or the maximum value.
Note: Please bear in mind that the Resampling setting only affects grayscale and color
images. The resolution of bilevel or B/W images will not be changed.
In case of JPEG 2000 input, the ability to decode a JPEG 2000 file to a lower resolution
will be used. This enables fast access to huge JPEG 2000 images, when only a lower
resolution is needed.
6. When Directly create PDF pages from JPEG input without recoding is enabled, JPEG input
images are not submitted to MRC or JPEG 2000 compression but are directly embedded
as pages in the PDF output document. In this fashion repeated compression of such input
documents is avoided. On the other hand these pages are not reduced in size.
7. Skip JPEG validation when adding images without recoding omits all integrity checks for
JPEG images that are embedded directly into PDF pages. This involves the risk of creating corrupt
PDF output by incorporating corrupt JPEG input images.
8. With the Use orientation stored in TIFF input files option the orientation information
optionally contained in TIFF input files determines the orientation of the corresponding
pages in the PDF output document.
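Two of the options above are easy to illustrate with a few lines of Python: the sort order from item 2 and the resolution handling from items 4 and 5. The sketch below only mirrors the behavior as described in this section. All function names are hypothetical, and the dpi limits used in the example are sample values, not product defaults.

```python
def sort_names(names, case_insensitive):
    """Order file or folder names as described for the sorting option."""
    if case_insensitive:
        return sorted(names, key=str.lower)
    return sorted(names)   # capital letters sort before lower case letters


def clamp(value, low, high):
    """Clip a value to the range [low, high]."""
    return max(low, min(value, high))


def optimized_dpi(image_dpis, default_dpi, min_dpi, max_dpi):
    """Rendering resolution for one PDF page with Optimized dpi enabled."""
    if not image_dpis:     # no images on the page: use Default Resolution
        return default_dpi
    return clamp(max(image_dpis), min_dpi, max_dpi)


def resampled_dpi(input_dpi, min_dpi, max_dpi, bilevel):
    """Resolution after Resampling; bilevel images are left unchanged."""
    if bilevel:
        return input_dpi
    return clamp(input_dpi, min_dpi, max_dpi)


print(sort_names(["ab", "Ac"], case_insensitive=False))
print(sort_names(["ab", "Ac"], case_insensitive=True))

# A 1000 x 1000 pixel image placed at 0.9 inch width has about 1111 dpi.
logo_dpi = 1000 / 0.9
print(optimized_dpi([logo_dpi], default_dpi=300, min_dpi=150, max_dpi=600))
```

The last call shows why a narrow resolution range helps with born digital input: the unrealistic value of the small image is clipped to the configured maximum instead of driving the rendering resolution.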
PDF Input Options
The PDF Input tab of the Advanced Input Options dialog lets you configure options that are
relevant for opening PDF input files.
1. The Document Open Password is used to open password protected PDF files. The given
password is stored within PDF Compressor in an encrypted way. Note that this encryption
does not meet the highest security demands. The given password is ignored, when the
input PDF file is not password protected.
2. In addition to a password needed to open the PDF document, it can be protected by a
Change Permissions Password. This password is used whenever the PDF access permissions need to be changed.
3. PDF input rasterization options are available unless Rasterize PDF Input has been set to
Never:
a. The Interpolate images switch influences the rasterization of PDF input files. PDF
input files may contain embedded images that need to be scaled for rasterization.
Interpolate images should be turned on to obtain best image quality. However, if an
input PDF file contains only bitonal image content, and this PDF is rasterized to the
same resolution as the embedded images' resolution, the image interpolation can be
turned off to speed up an OCR process.
b. Default PDF Page Size lets you select a page size that is used only if the input file is a
PDF file that is missing its page size information, but has to be rasterized. Normally all
pages of a PDF file carry page size information. However, some applications create
ill-formed PDF files without page size information. This option has been added to deal
with such malformed PDF files.
Select user defined at the very end of the list to enter a user defined page size by
specifying the width and the height of a page.
c. Selecting the Alternative rasterization method can be useful when input PDF files
contain very large pages which lead to out-of-memory errors. The alternative rasterization method consumes significantly less memory but is slower.
4. The option Restrict PDF Conversion - Use only PDF Optimization for converting PDF input
files is only enabled when the Born Digital Module is installed. If selected, it serves to
suppress the attempt to use Born Digital Conversion functionality when converting PDF
input files. Instead these files are only processed using PDF Optimization features.
Note: This option primarily serves to guarantee backward compatibility with jobs set up
with pre-7.0 versions of PDF Compressor, which used the PDF Optimization mode for
converting PDF input. In general it is preferable to use the conversion provided by the
Born Digital Module.
Office Input Options
On the Office Input tab you can configure options for the Born Digital conversion of Microsoft Office input files.
1. If the Use Excel page layout option is selected, the print page layout as specified inside
an Excel input file is used when converting this file to PDF. This option is disabled by
default, so that each non-empty sheet inside an Excel input file is converted to one
PDF output page.
2. When Process MS Office documents with OpenOffice is chosen, all office input files are
converted using OpenOffice. Obviously, an OpenOffice installation must be present in
this case. This option is switched off by default and each input document is processed
using its native application.
3. The option Remove tagging information ensures that structure information generated
by office applications during PDF export is discarded. Structure information is important
for accessible documents and should generally be kept. However, if file size is critical, this
option can help to create smaller PDF output.
Blank Page Detection Options
On the Blank Page Detection tab you can configure options for the Blank Page Detection of
input files. Pages that are detected as blank are always exempt from text recognition (OCR).
1. The option Detect blank pages serves to enable or disable blank page detection.
2. In the Excluded margins box, you can choose how much of the margins of all pages should
not be part of the blank page detection. This way, it is possible to exclude invariable page
headers or footers or punched holes, for example. You can also choose between the units
millimeter (mm) and hundredths of an inch (1/100 inch).
3. The blank page detection segments the area of the page to be worked on into tiles of
configurable size. The detection is done tile by tile. If in the end all tiles are classified
as blank, the whole page is considered blank. The unit for the Tile size value is 'pt', which
is equal to 1/72 inch.
4. The analysis of a tile’s content is block-based. The size of a block should be set to the size
of the smallest meaningful symbol, e.g. the size of a punctuation dot. Depending on the
page image’s resolution a block spans multiple pixels. The unit for the Block size is 'pt'
which is equal to 1/72 inch.
5. For each block a value is calculated that represents how much the block’s color intensity
differs from other blocks in the tile. Each block that has a deviance that exceeds a certain
value is considered non-blank. This value is influenced by the Sensitivity property. The
sensitivity is a value in the range of 1 through 100. A low sensitivity causes only strong
colored structures to be counted as being non-blank. In contrast, a high sensitivity also
takes pale textures into account.
6. A tile is considered non-blank if the number of non-blank blocks exceeds a certain threshold. The Threshold is given in per mill (one tenth of a percent) defining the maximum
fraction of non-blank blocks in relation to blank blocks for a full size tile to still be classified as blank.
7. The option Exclude blank pages from output defines whether detected blank pages are
included in the output. Checking this option results in the exclusion of detected blank
pages.
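The interplay of tiles, blocks, Sensitivity and Threshold can be made concrete with a simplified sketch. It is not the algorithm used by PDF Compressor, whose exact deviance calculation is not documented here. The assumptions are: the page is a grayscale image given as a list of rows, a block's deviance is the difference between its mean and the tile's mean, Sensitivity maps linearly to a deviance limit, the threshold is applied to the share of non-blank blocks among all blocks of a tile, and excluded margins are not modeled. All names are hypothetical.

```python
def pt_to_px(points, dpi):
    """Convert a length in pt (1/72 inch) to whole pixels, at least 1."""
    return max(1, round(points * dpi / 72))


def block_means(page, top, left, tile_px, block_px):
    """Mean gray value of every block inside one tile."""
    height, width = len(page), len(page[0])
    bottom = min(top + tile_px, height)
    right = min(left + tile_px, width)
    means = []
    for block_top in range(top, bottom, block_px):
        for block_left in range(left, right, block_px):
            rows = range(block_top, min(block_top + block_px, bottom))
            cols = range(block_left, min(block_left + block_px, right))
            values = [page[y][x] for y in rows for x in cols]
            means.append(sum(values) / len(values))
    return means


def tile_is_blank(means, sensitivity, threshold_per_mille):
    """Classify one tile from the mean values of its blocks."""
    tile_mean = sum(means) / len(means)
    # Assumed mapping: high sensitivity means a small deviance limit.
    limit = 101 - sensitivity
    non_blank = sum(1 for mean in means if abs(mean - tile_mean) > limit)
    # Threshold is in per mille of the blocks in the tile.
    return non_blank * 1000 <= threshold_per_mille * len(means)


def page_is_blank(page, dpi, tile_pt, block_pt, sensitivity,
                  threshold_per_mille):
    """A page is blank only if every one of its tiles is blank."""
    tile_px = pt_to_px(tile_pt, dpi)
    block_px = pt_to_px(block_pt, dpi)
    for top in range(0, len(page), tile_px):
        for left in range(0, len(page[0]), tile_px):
            means = block_means(page, top, left, tile_px, block_px)
            if not tile_is_blank(means, sensitivity, threshold_per_mille):
                return False
    return True


# A white 1 x 1 inch page at 72 dpi with a small black mark.
page = [[255] * 72 for _ in range(72)]
for y in range(10, 16):
    for x in range(10, 16):
        page[y][x] = 0
print(page_is_blank(page, dpi=72, tile_pt=36, block_pt=2,
                    sensitivity=50, threshold_per_mille=10))
```

Raising the threshold tolerates more non-blank blocks per tile, while raising the sensitivity makes paler marks count as non-blank, so the two settings are best adjusted together on representative sample pages.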
Configuring Data Output
The Output tab lets you configure PDF output options including the exact version of PDF to
generate and the location where generated PDF files are placed.
1. Select Place output next to input file, and the generated PDF file will be placed into the
same directory as the input file. Select Place output in directory and specify a folder by