Business objects DATARIGHT IQ 7.80C User Manual

Page 1

DataRight IQ

Transition Guide

DataRight IQ 7.80c

September 2007

Page 2

Contact information

Contact us on the Web at http://www.firstlogic.com/customer

If you find any problem with this documentation, please report it to Business Objects in writing at documentation@businessobjects.com.

Patents

Business Objects owns the following U.S. patents, which may cover products that are offered and sold by Business Objects: 5,555,403, 6,247,008 B1, 6,578,027 B2, 6,490,593 and 6,289,352.

Trademarks

Business Objects, the Business Objects logo, Crystal Reports, and Crystal Enterprise are trademarks or registered trademarks of Business Objects SA or its affiliated companies in the United States and other countries. All other names mentioned herein may be trademarks of their respective owners.

Third-party contributors

Business Objects products in this release may contain redistributions of software licensed from third-party contributors. Some of these individual components may also be available under alternative licenses. A partial listing of third-party contributors that have requested or permitted acknowledgments, as well as required notices, can be found at: http://www.businessobjects.com/thirdparty

DataRight IQ Transition Guide

Page 3

Contents

Chapter 1: Welcome to DataRight IQ ..... 7
    Comparing DataRight and DataRight IQ ..... 8
    DataRight IQ’s implementation methods ..... 10
    Where to look for more information ..... 11

Chapter 2: Installing and running DataRight IQ ..... 13
    Installing your product ..... 14
    Updating your jobs (DataRight IQ Job and Views) ..... 16
    Your first job is ready to run ..... 17
    Editing files in Job ..... 19
    Editing files in Views ..... 21

Chapter 3: What and how DataRight IQ parses ..... 23
    Parsing—DataRight versus DataRight IQ ..... 24
    How DataRight IQ differs ..... 25
    DataRight IQ uses rule-based parsing ..... 26
    DataRight IQ’s multiline parsing order ..... 28
    Modify how DataRight IQ parses ..... 29

Chapter 4: How DataRight IQ parses new types of data ..... 31
    Parse e-mail addresses ..... 32
    Parse Social Security numbers ..... 34
    Parse dates ..... 37
    Parse phone numbers ..... 38
    Parse user-defined patterns ..... 40

Chapter 5: DataRight IQ’s additional features ..... 41
    Control name order ..... 42
    Generate statistics files ..... 44
    Use confidence scores ..... 46

Chapter 6: Fields DataRight IQ uses ..... 49
    Comparing fields in DataRight IQ and DataRight ..... 50
    More DataRight IQ output fields ..... 53

Appendix A: Job file comparisons ..... 55

Page 4

Page 5

Preface

About DataRight IQ

DataRight IQ is advanced data-parsing software that identifies information in your database so that you can use it more effectively. Here are just a few things that you can do with DataRight IQ:

• parse and standardize data

• assign gender and prenames

• perform advanced search-and-replace

• combine split names or split combined names

About this guide

Use this document to learn about the new features in DataRight IQ, how they compare with DataRight, and how to perform the same tasks you used in DataRight with DataRight IQ.

Conventions

This document follows these conventions:

Convention       Description

Bold             We use bold type for file names, paths, emphasis, and text that you should type exactly as shown. For example, “Type cd\dirs.”

Italics          We use italics for emphasis and text for which you should substitute your own data or values. For example, “Type a name for your file, and the .txt extension (testfile.txt).”

Menu commands    We indicate commands that you choose from menus in the following format: Menu Name > Command Name. For example, “Choose File > New.”

We use this symbol to alert you to important information and potential problems.

We use this symbol to point out special cases that you should know about.

We use this symbol to draw your attention to tips that may be useful to you.

Page 6

Documentation

Related documents

DataRight IQ comes with other documentation to help you fully use the application’s abilities.

Document             For those who use    Description

Release Notes        any                  Contains any necessary installation information and explains DataRight IQ’s capabilities in relation to the previous version.

User’s Guide         any                  Learn more about what DataRight IQ can do, and how to do it. Includes Job-file and Library information.

Views online help    Views                Contains information, accessible on-line, about running UMD through its Views implementation.

Modifier’s Guide     any                  Explains how to modify UMD to suit your needs, including how to create custom dictionaries using the command-line version of the User-Modifiable Dictionary (UMD) program, how to edit your rule file, and how to set up your user-defined patterns.

Here is a list of other product documentation that you may find useful.

Document                                   Description

Views: A Quick Guide to Get You Started    Gives you the basic information you need to get started with any Views software.

Edjob booklet                              Explains how to use the Edjob utility to update your job files. Use this utility to update your job files when you receive a new version of UMD.

System Administrator’s Guide               Describes installation procedures, system requirements, and more.

Access the latest documentation

You can access product documentation in several places:

• On your computer. Release notes, manuals, and other documents for each product that you have installed are available in the Documentation folder. Choose Start > Programs > Firstlogic Applications > Documentation.

• On the Customer Portal. Go to www.firstlogic.com/customer, and then click the Documentation link to access all the latest product documentation. You can view the PDFs online or save them to your computer for viewing or printing.

Page 7

Chapter 1: Welcome to DataRight IQ

This Transition Guide introduces you to the DataRight IQ product, contrasting its features with those of DataRight. As such, this guide is designed for those familiar with DataRight 2.56 or previous versions.

What’s new in DataRight IQ

DataRight IQ was developed from the convergence and evolution of products such as DataRight and TrueName Library.

If you’re familiar with DataRight, you have a good basic understanding of DataRight IQ’s Views and Job implementations. DataRight IQ shares many things with the DataRight product. However, many features, as well as some fundamental differences in how they parse data, make these two products distinct from each other.

To see the differences, refer to “Comparing DataRight and DataRight IQ” on page 8.

Because of features such as its increased parsing capabilities, DataRight IQ has numerous parameters in its underlying job file that are not in DataRight's master job file. See “Job file comparisons” on page 55.

Converting from DataRight

Because DataRight and DataRight IQ are two distinctly different products (DataRight IQ is not an update for DataRight, but an upgrade), the process of upgrading from DataRight to DataRight IQ may be a little different from what you are used to when merely updating to a new version.

For more information on updating DataRight IQ Job, see “Installing and running DataRight IQ” on page 13.

Page 8

Comparing DataRight and DataRight IQ

DataRight IQ builds on DataRight’s features, but it also goes so much further.

On the surface, DataRight and DataRight IQ may appear very similar. They both have a job-file implementation (whose job file blocks contain a lot of the same parameters). They both have a Views implementation too (and their windows look very similar). However, DataRight IQ contains several enhancements over DataRight. Below is a brief list of DataRight IQ features.

Parsing more data

DataRight IQ parses several more types of data than DataRight 2.5x parsed, including e-mail addresses, phone numbers (U.S. and international), user-defined patterns, and U.S. Social Security numbers. DataRight IQ also parses dates. (DataRight 2.56 could format dates but didn’t actually parse the data.)

For more information on parsing dates and other data, see “How DataRight IQ parses new types of data” on page 31.

Rule-based parsing

DataRight IQ brings you the flexibility of rule-based parsing. With parsing based on user-modifiable rules, you can customize how DataRight IQ parses data by editing the rules.

In addition to rule-based parsing, DataRight IQ also provides presumptive parsing, which is similar to the way DataRight 2.5x parsed. (The presumptive parsing option is applicable only to name and firm data.)

For more information, see “DataRight IQ uses rule-based parsing” on page 26.

Controlling parsing

DataRight IQ gives you control over what you parse on multi-line input. You can turn each parsing capability on or off for each specific input line.

For more information, see “Turn off parsing engines” on page 29.

Application fields

DataRight IQ brings you many new PW and AP fields. Most of these fields have been added because of the new parsing capabilities.

For more information, see “Fields DataRight IQ uses” on page 49.

User-defined patterns

DataRight IQ lets you define patterns that you can then parse. You define these patterns using regular expressions.

For general information about parsing user-defined patterns, see “Parse user-defined patterns” on page 40. For detailed information about setting up (defining) user-defined patterns, see the DataRight IQ Modifier’s Guide.

Name order

Name order can be FML (First, Middle, Last) or LFM (Last, First, Middle). You set this in your DEF file, just as you did with DataRight 2.5x.

But in addition to FML or LFM, DataRight IQ lets you determine how stringently name order is applied to parsed names. If you apply “strict” name order, DataRight IQ parses name data the way you set it in the DEF file.

Page 9

If you don’t apply “strict” name order, DataRight IQ uses the order set in your DEF file only when the input is ambiguous.

For more information, see “Control name order” on page 42.

Title associations

DataRight IQ lets you associate names and titles more precisely than DataRight 2.5x. DataRight had a parameter called “Associate Name & Title.” However, in DataRight IQ you can also associate names and titles on discrete lines and on multilines.

For more information, see your DataRight IQ User’s Guide.

Search-and-Replace

DataRight IQ’s Search-and-Replace capabilities are very similar to DataRight 2.5x’s. However, because of DataRight IQ’s user-defined pattern matching feature, patterns are now an option when you choose your Search-and-Replace method.

In addition, DataRight IQ has modified its Search-and-Replace function to handle any casing. You can specify whether you want to ignore casing when performing Search-and-Replace.

For more information, see your DataRight IQ User’s Guide.

Scan-and-Split

DataRight IQ’s Scan-and-Split capabilities are very similar to DataRight 2.5x’s. However, DataRight IQ has modified its Scan-and-Split function to handle any casing. You can specify whether you want to ignore casing when performing Scan-and-Split. (Note: Scan-and-Split does not support DataRight IQ’s new user-defined pattern matching.)

For more information, see your DataRight IQ User’s Guide.

Statistics files

In addition to all the reports you could create in DataRight 2.5x, DataRight IQ lets you generate several statistics files. Statistics files are text files containing all the information from a specific report.

For more information, see “Generate statistics files” on page 44.

Unicode support

DataRight IQ can read and write data that follows the Unicode standard. Strictly speaking, this ability isn’t different between DataRight and DataRight IQ. However, because it was added only in DataRight 2.56, many DataRight users may not be familiar with the implementation of Unicode.

For more information, see the document Unicode and Firstlogic: Introduction to Firstlogic’s Unicode-Enabled Technology, which is available on the customer portal.

Page 10

DataRight IQ’s implementation methods

You may be familiar with only one “flavor” of DataRight or DataRight IQ. However, keep in mind that DataRight IQ spans a range of deployment options.

Implementation    Description

Job file          A batch program for processing database files. It sets processing modes and options by reading a text file (“job file”) that contains specific information on how DataRight IQ should process data.

Views             A graphical interface program for setting up DataRight IQ jobs. Views is available in two forms:

                  “Local” Views: Installed and run entirely on Microsoft Windows.

                  Remote Views: Has two components. The Views GUI is installed on a Windows PC, while the software for actually processing jobs is installed on a UNIX server.

Library           Toolkits for highly-customized integration of DataRight IQ technology into your existing software applications.

RAPID             A simplified approach to integrating DataRight IQ technology into your applications. (RAPID stands for Rapid Application Integration Deployment.) RAPID contains some qualities of a library or toolkit (it has an API and requires some programming to implement) while it also offers the amenities of our batch tools (GUI setup and easy generation of reports).

Although they all have common DataRight IQ capabilities, each may have unique installation and operating requirements. This guide deals primarily with the Job file and Views implementations.

Page 11

Where to look for more information

DataRight IQ includes a number of user guides, reference guides, and online documentation (see below).

DataRight IQ-specific documentation

The documentation that you receive may depend on how you use DataRight IQ, for example, as a Job File, Views, Library, or RAPID product. (For information on the RAPID implementation of DataRight IQ, see the table “DataRight IQ-related documentation.”)

Document                          For those who use    Description

DataRight IQ Views online help    Views                Contains the online documentation for your Views product (access the help either through your application or from the Documentation CD).

UMD Views online help             any                  Contains information about using the Views version of the User-Modifiable Dictionary (UMD) program.

DataRight IQ-related documentation

In addition to DataRight IQ-specific documentation, there may be other product documentation that you need to refer to. To find these documents, look on your Documentation CD.

Document                        Description

System Administrator’s Guide    Explains how to install your software.

Database Prep                   Explains how to prepare input files for processing, including how to create definition and format files (DEF, FMT, DMT). Contains tips for converting from one database type to another. Explains filters and functions and how to use them.

Views: Quick Start Guide        Gives you the basic information you need to get started with any Views software.

Page 12

Document                                           Description

RAPID Online Documentation                         Provides the specific information (such as API calls) needed for DataRight IQ RAPID (applies to those who use the RAPID implementation of any related product).

Quick Reference for Views and Job-File Products    Contains handy reference information:

                                                   • descriptions of input and output fields

                                                   • command-line options

                                                   • ASCII code values

                                                   • summary of functions and filter operators

Page 13

Chapter 2: Installing and running DataRight IQ

For additional installation information, see Business Objects’ System Administrator’s Guide.

Paths

You must set your paths appropriately.

Operating system    Comment

Windows             The installation program prompts you to select a drive and specify a directory location. The default directory name is \pw. We recommend that you accept this default. The subdirectories underneath \pw are created automatically, as shown below.

UNIX                The postware directory structure is shown below. Note that postware must not be a root directory.

Directory structure

If you accept the default settings during installation, the installer creates the following directory structure.

pw or postware
    adm              Edjob and those utilities used by multiple products
    dirs             City and ZCF directories
    csgui dir        Remote Views
    dtr_iq           DataRight IQ executables and dictionaries
        utils
            quickparse
        samples      Sample job

Page 14

Installing your product

On Windows

When you insert the application CD, the installation program should start automatically. If it doesn’t, follow these steps:

1. Access your Windows Start menu and choose Run.

2. In the Run window, type x:\setup (where x is the letter of your CD-ROM drive) and click OK.

The installation program should start. For more information about installing, see Business Objects’ System Administrator’s Guide.

On UNIX

To install DataRight IQ on UNIX, use install_console for the Java Runtime Environment as explained in Business Objects’ System Administrator’s Guide.

When you install your software on UNIX, you must set path and pw_path. See the System Administrator’s Guide for details.

UNIX users: Access the shared libraries

In addition to setting path and pw_path, you must also set the shared library path environment variable for each user in the appropriate login script. This ensures that products use the correct files from the /disk_n/postware/adm directory. (We use disk_n to refer to the file system or disk where you choose to install the software.)

You need to do this only once. If you already set the shared library path environment variable for a different product in this CSR, you don’t need to do it again.

To set the shared library path environment variable, first determine the appropriate environment variable from the following table.

Platform      Environment variable

AIX           LIBPATH
HP/UX 11.0    SHLIB_PATH
Solaris       LD_LIBRARY_PATH
Linux         LD_LIBRARY_PATH

Then follow these steps:

1. If there is no shared library path, create one referring to /disk_n/postware/adm.

2. If there is an existing shared library path, append the path to /disk_n/postware/adm at the end of your existing entries.

The examples that follow show steps 1 and 2 from above for both the Bourne and C shells on Solaris. You may need to make adjustments based on the type of shell that you use. See the System Administrator’s Guide for details.

Bourne shell example

If you use Bourne shell on Solaris, add product entries to your .profile or .login file as follows:

1. Create a new shared library path:

   LD_LIBRARY_PATH=/disk_n/postware/adm
   export LD_LIBRARY_PATH

2. Append to the existing shared library path:

   LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/disk_n/postware/adm
   export LD_LIBRARY_PATH

Page 15

C shell example

If you use C shell (Berkeley) on Solaris, add product entries to your .cshrc file as follows:

1. Create a new shared library path:

   setenv LD_LIBRARY_PATH /disk_n/postware/adm

2. Append to the existing shared library path:

   setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/disk_n/postware/adm

Quick Parse

If you use DataRight IQ on the UNIX platform and you have Trolltech’s Qt installed (see www.trolltech.com), you need to set up your path in a specific way. Otherwise, you may have a problem running QuickParse, a utility shipped with DataRight IQ.

The following directions apply only if you have QuickParse installed on your UNIX system and you’re not using Qt version 3.1.2.

The QuickParse utility lets you quickly see how data that you input (either manually or from a database) would parse if input through your DataRight IQ application. QuickParse is written using the Qt toolkit, and it’s shipped with the Qt library version 3.1.2 (libqt.so.3 on UNIX).

You need to set LD_LIBRARY_PATH correctly so that you don’t have problems running QuickParse. The directory that contains libqt.so.3 is:

/postware/DTR_IQ/utils/quickparse

This directory must be the first item in the path statement. It should read like this:

/postware/DTR_IQ/utils/quickparse:$LD_LIBRARY_PATH

If you have questions about setting your LD_LIBRARY_PATH, see your system administrator.
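
The two path rules in this section (create or append for the shared library directory, and put the QuickParse directory first) can be sketched in Python as plain string handling. This is only an illustration and is not part of DataRight IQ. The helper names append_path and prepend_path are made up for this sketch, and the sketch does not change your environment; it only shows the value you would place in your login script.

```python
import os


def append_path(existing, directory):
    """Return a path value with `directory` added at the end.

    Covers both steps: if `existing` is empty or None, the result is just
    `directory` (step 1); otherwise it is appended (step 2). A directory
    that is already listed is not added twice.
    """
    parts = [p for p in (existing or "").split(":") if p]
    if directory not in parts:
        parts.append(directory)
    return ":".join(parts)


def prepend_path(existing, directory):
    """Return a path value with `directory` as the first entry.

    Any later occurrence of the same directory is dropped so that it
    appears only once, at the front (the QuickParse requirement).
    """
    parts = [p for p in (existing or "").split(":") if p and p != directory]
    return ":".join([directory] + parts)


if __name__ == "__main__":
    # Read the current value, if any; nothing is written back.
    current = os.environ.get("LD_LIBRARY_PATH")
    print(append_path(current, "/disk_n/postware/adm"))
    print(prepend_path(current, "/postware/DTR_IQ/utils/quickparse"))
```

On AIX or HP/UX you would read LIBPATH or SHLIB_PATH instead, as listed in the platform table.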

Page 16

Updating your jobs (DataRight IQ Job and Views)

Before you can use your existing jobs and dictionaries in the latest version of DataRight IQ, you must update them.

Use Edjob to update jobs

To update your existing jobs to the latest version of DataRight IQ, use the Edjob update utility that’s installed with DataRight IQ. Don’t try to update your job files by hand.

The general syntax of the Edjob command line is:

edjob [options] [path]scriptfilename.upd job_file.dtr

There are two update script files for you to run on your existing jobs:

• To update jobs from DataRight to DataRight IQ: dtr2dtr_iq.upd

• To update jobs from an older version of DataRight IQ to the latest version of DataRight IQ: pwdiqjob.upd

Find the DataRight IQ update scripts in the dtr_iq subdirectory.

For complete instructions on running Edjob, see Business Objects’ Edjob User’s Guide.

Use DCTCONV to update parsing dictionaries

If you created any custom parsing dictionaries (using DataRight 2.56 or earlier), you need to convert the transaction file to DataRight IQ’s format and then build a new dictionary.

To convert, enter the following at your DOS prompt:

dctconv transaction_file new_transaction_file

where “transaction_file” is the name of your existing transaction file and “new_transaction_file” is the name of your DataRight IQ transaction file.

This process converts the transaction file to the format accepted by DataRight IQ. For information about building a dictionary, see the documentation for UMD (User Modifiable Dictionary) in the DataRight IQ Modifier’s Guide.

Format of your existing transaction file

Your existing transaction file must be formatted with these field lengths:

• action,1,C

• primary,100,C

• secondary,100,C

• usage,3,C

• intl,15,C

• info,50,C

• stdtype,50,C
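
As a rough way to check a transaction file against these field lengths before running DCTCONV, the layout can be written down in Python. This sketch assumes that the file is fixed-width, that the fields appear in the order listed, and that the “C” marker means a character field; none of that is confirmed beyond the list above. The helper names and the sample values are made up for illustration.

```python
# Field layout copied from the list above as (name, length) pairs.
# Assumption: "C" marks a character field and fields are fixed-width.
FIELDS = [
    ("action", 1),
    ("primary", 100),
    ("secondary", 100),
    ("usage", 3),
    ("intl", 15),
    ("info", 50),
    ("stdtype", 50),
]

# Total width of one record under the fixed-width assumption.
RECORD_LENGTH = sum(length for _, length in FIELDS)


def split_record(line):
    """Slice one fixed-width record into a dict of field values."""
    record = {}
    offset = 0
    for name, length in FIELDS:
        # Trailing padding is removed; leading characters are kept.
        record[name] = line[offset:offset + length].rstrip()
        offset += length
    return record


def build_record(values):
    """Pad (or truncate) each value to its field length and join them."""
    return "".join(
        str(values.get(name, "")).ljust(length)[:length]
        for name, length in FIELDS
    )


if __name__ == "__main__":
    # Hypothetical values, used only to show the round trip.
    line = build_record({"action": "A", "primary": "SMITH"})
    print(len(line) == RECORD_LENGTH)
    print(split_record(line)["primary"])
```

A record whose length differs from RECORD_LENGTH would be a sign that the file does not follow the layout assumed here.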

Page 17

Your first job is ready to run

A sample job is included with your DataRight IQ software. The sample job is provided so that you can verify your DataRight IQ installation. The sample job also introduces you to the files used in a DataRight IQ job.

Note: DataRight had a run utility named rundtr, which you could use to run your jobs. For DataRight IQ Job, run your jobs using a command line. Or in Views, run your jobs by clicking the Run icon.

Supporting files

To read your input files, DataRight IQ needs to know the input database type, input field names and, in some cases, the length and type of each field.

To provide this information, you create a definition file for each input file, and perhaps a format file. The definition file tells DataRight IQ the database type and “translates” your input field names to names that DataRight IQ understands. The format file tells DataRight IQ how the input file is formatted.

We’ve created definition and format files for the sample input file. Look in the samples subdirectory for files with the .def and .fmt extensions. Use any text editor to open the files to view their contents, but don’t change the files in any way.

For more information about definition files and format files, see Business Objects’ Database Prep documentation.

If you use DataRight IQ Job

The job file is a set of instructions that tells DataRight IQ how to process your input file. When you create a DataRight IQ job, copy the master job file (master.diq) that is installed in the dtr_iq directory and insert the instructions that are unique to your job.

A sample job file (called quikwin.diq) is included in the samples subdirectory.

Use any text editor to open the job file. Notice that the job file includes parameters that are displayed in groups called blocks. Entries typed at the parameters give DataRight IQ instructions on how to process the job. Scroll through the entire sample job file to get an idea of how this job is set up, but do not change any of the parameter entries.

Check file paths. You should access the Auxiliary Files block and check the paths for the dictionary and directory files. If you placed any of the files in different locations, change these entries before you run the sample job.

If you use DataRight IQ Views

DataRight IQ Views gives you an easy-to-use interface to set up and run your job. Instead of customizing the job file (such as the sample job file mentioned above) through a text editor, you can use the windows, menus, and controls of DataRight IQ Views. For assistance with the Views interface, see the product’s online help, which includes window-level and control-level context-sensitive topics.

Running the sample job

To run the sample in Job, type the commands shown below for the UNIX platform. In Views, follow the steps shown below for the Windows platform.

Page 18

If you didn’t install DataRight IQ in the default directory, change the path name accordingly.

Before and after you run the job, look at the contents of the samples directory. Notice which files are input and which files DataRight IQ creates.

Platform           Directions

UNIX (Job)         Enter the following:

                   $ cd /usr/postware/dtr_iq/samples
                   $ fldiq quikwin.diq ↵

Windows (Views)    Click Start > Programs > Firstlogic Applications. Then select DataRight IQ. You can either browse for the sample job or type the path and file name for the sample job and click the Run icon.

Verification and processing

When you enter the command line or click the Run icon, DataRight IQ verifies that all the parameter entries in the job file are valid. As DataRight IQ verifies the job, it displays progress messages. If DataRight IQ detects a problem, it issues a verification warning or error.

Once the job file passes verification, DataRight IQ begins processing the input file. As DataRight IQ processes the input file, it displays messages to keep you informed of the job’s progress.

Reports

After you run the sample job, you can look at the reports that DataRight IQ generated. For more information, see the Reports chapter in your DataRight IQ User’s Guide.

Page 19

Editing files in Job

DataRight IQ offers two ways in which you can set up and edit your jobs. Use Job to enter your options and set up your job in a text-based file, then run the job using a command line in DOS. Or, use Views to set up your job in a Windows environment and run your job with the click of a button.

When you use Views to set up your job, there is less chance of making setup mistakes than there is in Job. Therefore, here are a few things to pay attention to when you set up DataRight IQ using Job.

Copy and edit the master job

In the dtr_iq subdirectory, we provide a master job file called master.diq. To create your own job file, copy the master job file and then edit the copy. Do not try to type your own job file from scratch. When you name your job file, use the file extension .diq (for example, filename.diq).

To edit job files, use a text editor or word processor. If you use a word processor, save the file as simple ASCII text.

* MASTER JOB FILE FOR Firstlogic DataRight IQ
BEGIN General DataRight IQ 7.10c =========================
Job Description (to 80 chars)........ =
Job Owner (to 20 chars).............. =
END
BEGIN Execution ===========================================
Post to Input File(s) (Y/N).......... = Y
Post to Output File(s) (Y/N)......... = Y
Create Reports (Y/N)................. = Y

In this excerpt, each group of lines from BEGIN to END is a block, each line inside a block is a parameter, and the entries to the right of the equal signs are the job settings.

Keep blocks intact

Keep the basic structure of the job blocks intact:

• Do not delete parameters or rearrange them within a block.

• Do not edit parameter names (anything to the left of the equal sign).

• Do not edit block titles.

• Do not edit the BEGIN or END lines. (Exception: If you want DataRight IQ to ignore a block, insert an asterisk in front of the word BEGIN.)

You may add comments at the beginning or end of the job file and between blocks. Start all comment lines with an asterisk (*). Do not use the key words BEGIN or END in your comments.
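
The block structure described above can be read with a few lines of Python. This is a minimal sketch for inspecting a job file, based only on the rules stated in this section (asterisk lines are comments, BEGIN and END delimit a block, the parameter name is everything to the left of the equal sign). It is not how DataRight IQ itself verifies a job. The function name parse_job_file is made up, and the "Example Block" in the sample is hypothetical.

```python
def parse_job_file(text):
    """Return a list of (block_title, {parameter_name: entry}) tuples."""
    blocks = []
    current = None  # parameters of the block being read, or None between blocks
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and comment lines (leading asterisk) are skipped.
        # This also skips "*BEGIN", so a disabled block is never opened
        # and its parameter lines are ignored below.
        if not line or line.startswith("*"):
            continue
        if line.split(None, 1)[0] == "BEGIN":
            # The title is the text after BEGIN, without the "=====" rule.
            title = line[len("BEGIN"):].strip(" =")
            current = {}
            blocks.append((title, current))
        elif line == "END":
            current = None
        elif current is not None and "=" in line:
            name, _, entry = line.partition("=")
            # Drop the dot leaders that pad the parameter name.
            current[name.rstrip(" .")] = entry.strip()
    return blocks


# Excerpt from the master job file shown above, with an END added to close
# the second block and a hypothetical disabled block appended.
SAMPLE = """\
* MASTER JOB FILE FOR Firstlogic DataRight IQ
BEGIN General DataRight IQ 7.10c =========================
Job Description (to 80 chars)........ =
Job Owner (to 20 chars).............. =
END
BEGIN Execution ===========================================
Post to Input File(s) (Y/N).......... = Y
Create Reports (Y/N)................. = Y
END
*BEGIN Example Block ======================================
Some Parameter (Y/N)................. = N
END
"""

if __name__ == "__main__":
    for title, parameters in parse_job_file(SAMPLE):
        print(title, parameters)
```

Because the parameter name, including its parenthesized hint, is the dictionary key, the sketch also shows why parameter names must not be edited.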

Type entries correctly

Parameter names are often followed by guidelines or options shown in parentheses. You can distinguish a guideline from an option by its case.

Page 20

Guidelines are shown in lowercase, options in UPPERCASE (see the examples below). Case does not matter when you type your entry, but be sure to spell options exactly as shown. (There is one exception: At Y/N parameters, you may spell out “Yes” or “No.”)

Guidelines are shown in lowercase:

Job Description (to 80 chars)........ =

Options are shown in UPPERCASE:

Cache Buffer Size (SPEED/SPACE)...... = SPEED

If you’re entering a long parameter entry, never press the Enter key. Simply let the entry wrap onto the next line.
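
The case rules above can be expressed as a small check in Python. This is a sketch under the rules stated in this section only: an UPPERCASE hint in parentheses is treated as a list of options separated by slashes, a lowercase hint is treated as a guideline, and Y/N parameters also accept “Yes” or “No.” The function names are made up, and DataRight IQ’s own verification may apply rules that are not described here.

```python
import re


def options_from_name(parameter_name):
    """Return the option list from a parameter name, or None for a guideline.

    Only the last parenthesized hint is examined, so the "(s)" in
    "Post to Input File(s) (Y/N)" is not mistaken for the hint.
    """
    hints = re.findall(r"\(([^()]*)\)", parameter_name)
    if not hints or not hints[-1].isupper():
        return None
    return hints[-1].split("/")


def normalize_entry(entry, options):
    """Match a typed entry against the options, ignoring case."""
    typed = entry.strip().upper()
    # Exception from the text: Y/N parameters may be spelled out.
    if set(options) == {"Y", "N"} and typed in ("YES", "NO"):
        typed = typed[0]
    if typed not in options:
        raise ValueError("entry %r is not one of %s" % (entry, "/".join(options)))
    return typed


if __name__ == "__main__":
    opts = options_from_name("Cache Buffer Size (SPEED/SPACE)")
    print(opts, normalize_entry("speed", opts))
    print(options_from_name("Job Description (to 80 chars)"))
```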

Include all required information

Blocks

Certain blocks are required in every job. The documentation that describes each block indicates whether that block is required or optional. For more information about each block, see the DataRight IQ User’s Guide.

Parameters

Most parameters require an entry. Very few parameters are optional and may be left blank.

$job, $time, and $date macros

In some parameter entries, you can include the macros $job, $time, and $date. DataRight IQ converts the macro to a specific piece of information. (You can also use these macros in Views.)

$job

DataRight IQ automatically converts $job to the base name of the job file (without path or extension). For example, your entry at the Output File Name parameter might be $job.dat. If your job file were named my_file.diq, DataRight IQ would name the output file my_file.dat. You can include the $job macro in file names, the job description, and report headers.

$time and $date

DataRight IQ automatically converts $time and $date to the time and date, which are taken from your computer’s clock when the job starts running. The time is ten characters long in the format hh:mm:ss with am or pm. The date is eleven characters long in the format dd-mmm-yyyy.

You can include the $time and $date macros in the job description or in report headers.
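The macro behavior described above can be sketched in Python. This is only an illustration of the substitution, not DataRight IQ code: the function name expand_macros is hypothetical, and the exact time layout (no space before am/pm, which gives ten characters) is an assumption drawn from the stated length.

```python
from datetime import datetime
from pathlib import PureWindowsPath

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def expand_macros(entry: str, job_file: str, started: datetime) -> str:
    """Replace $job, $time, and $date in a parameter entry (illustrative sketch)."""
    # $job: base name of the job file, without path or extension
    job = PureWindowsPath(job_file).stem
    # $time: hh:mm:ss plus am/pm, ten characters (assumed layout, no space)
    ampm = "am" if started.hour < 12 else "pm"
    time_text = started.strftime("%I:%M:%S") + ampm
    # $date: dd-mmm-yyyy, eleven characters
    date_text = f"{started.day:02d}-{MONTHS[started.month - 1]}-{started.year:04d}"
    return (entry.replace("$job", job)
                 .replace("$time", time_text)
                 .replace("$date", date_text))

# A job file named my_file.diq with the entry $job.dat yields my_file.dat
print(expand_macros("$job.dat", r"C:\jobs\my_file.diq", datetime(2007, 9, 3, 14, 5, 9)))
```

The time and date are taken once, from the moment the job starts, so every expanded entry in one run carries the same values.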


Page 21

Editing files in Views

DataRight IQ Views is provided with DataRight IQ Job. You can choose which software you want to use based on your experience and comfort level. Some of you are used to setting up your jobs using the master job file and running them with a DOS command. Some of you may be more familiar with using a GUI (graphical user interface).

Advantages of Views

With Views, you eliminate many of the chances for mistakes that you can make in Job. You can’t delete parameters like you can in Job, and many times Views tells you when you’ve entered an incorrect option. Most options don’t need to be typed, but are chosen from a drop-down list or by selecting or deselecting an option. This eliminates the chance for mistakes or misspellings in your entries.

Views provides help at the click of a button. If you don’t know what an option does, simply click the question mark icon in the Views window and click on the option. An explanation appears telling you what the option does. For more in-depth information, access the online help by choosing Help from the menu.

While there is always a chance to make errors in setting up your jobs, Views provides an environment that eliminates many of the chances for error.

For details on using Views, see the Views Quick Start Guide.


Page 22

Views provides a menu bar and tool bar that provide you with easy ways to set up and run your jobs.

Views presents you the elements of DataRight IQ Job, but in Views, you open windows to enter parameter information.

Access DataRight IQ Job blocks by expanding groups in the tree view and double-clicking the block that you want to set up.

The block appears in a window as shown.

Notice how the entries in Views are usually options that you select. You can see how the parameters and the options match up in this example.


Page 23

Chapter 3: What and how DataRight IQ parses

You can use DataRight IQ to identify and isolate various data and data components from fields of information. We call this parsing.

What DataRight IQ parses—compared to DataRight

With DataRight IQ you can parse more data than you can using DataRight. Comparing what you can parse with DataRight versus DataRight IQ, you can tell that DataRight IQ takes you well beyond just name and address.

With its increased parsing capabilities, DataRight IQ lets you parse the following types of data:

 names
 job titles
 firms (company data)
 U.S. street addresses
 e-mail addresses
 Social Security numbers
 U.S. phone numbers
 international phone numbers
 dates (see the note below)
 user-defined patterns

For more information, see your DataRight IQ User’s Guide and the next chapter.

DataRight and dates. Previously, with DataRight 2.5x, you could format dates, but not parse them. DataRight IQ recognizes dates in a variety of formats and breaks those dates into components.

For a complete list of the input fields that DataRight IQ accepts, see Business Objects’ Quick Reference documentation. For a more detailed description about how DataRight IQ parses each type of data, see the next chapter.


Page 24

Parsing—DataRight versus DataRight IQ

You can use DataRight to identify and isolate a wide variety of data. However, it can’t recognize as many data fields as DataRight IQ.

In the example below (using the jobfile implementation), notice all the data that DataRight doesn’t know what to do with—and so puts into the Extra field. But when you use DataRight IQ to parse the same data, it properly identifies data like phone, e-mail address, and Social Security number.

Input data

Mr. Dan R. Smith, Jr., CPA
Director of Admissions
Jones Inc.
PO Box 567
1234 Main St S
Biron, WI 54494
421-55-2424
dsmith@rdrindustries.com
507-555-3423
Apr 20, 2003

Data parsed by DataRight Job 2.56

Field                Data
Prename              Mr.
First Name           Dan
Middle Name          R.
Last Name            Smith
Maturity Postname    Jr.
Honorary Postname    CPA
Title                Director of Admissions
Firm                 Jones Inc.
Address              PO Box 567
Address              1234 Main St S
Lastline             Biron, WI 54494
Extra                421-55-2424
Extra                dsmith@rdrindustries.com
Extra                507-555-3423
Extra                April 20, 2003

Data parsed by DataRight IQ Job 7.10

Field                Data
Prename              Mr.
First Name           Dan
Middle Name          R.
Last Name            Smith
Maturity Postname    Jr.
Honorary Postname    CPA
Title                Director of Admissions
Firm                 Jones Inc.
Address              POB 567
Address              1234 Main St S
Lastline             Biron, WI 54494
Social Security      421-55-2424
E-mail address       dsmith@rdrindustries.com
Phone                507-555-3423
Date                 April 20, 2003

More application fields


DataRight IQ brings you many new PW and AP fields. Most of these fields have been added because of the new parsing capabilities. For more information, see “Fields DataRight IQ uses” on page 49.

Page 25

How DataRight IQ differs

DataRight IQ and DataRight have some underlying differences in their respective parsing behaviors. DataRight IQ significantly improves parsing and identification of name and firm results through configurable “rules” based parsing.

DataRight IQ is not your old DataRight

What’s the basis for this difference?

DataRight IQ and DataRight parse some data differently. You may find that you receive different results when you parse the same data through DataRight IQ and through DataRight.

DataRight IQ makes fewer assumptions about data than DataRight makes. For example, when DataRight encounters data on a name line, it assumes that data is a name. DataRight IQ, however, exercises more caution in its parsing behavior.

DataRight IQ doesn’t automatically assume data on a name line is name data. It parses the data, and if the data does not parse as name data, it places the data in an Extra field. To make that data parse as a name, you may need to add a rule or edit a dictionary.

Rule-based and presumptive. In addition to DataRight IQ’s rule-based way of parsing, DataRight IQ can also parse presumptively (similar to DataRight) when no parsing rule is hit. For more information, see “DataRight IQ uses rule-based parsing” on page 26.

DataRight 2.xx uses a methodology for parsing that’s based on identifying and isolating words and then comparing them to an empirical source, also known as a dictionary lookup, to determine their meaning.

DataRight IQ builds on this method and uses configurable rules. This parsing method enhances the DataRight approach by using external rules, in combination with dictionary lookup, to guide the parser’s actions.

What’s the benefit of this difference?

DataRight IQ’s approach to parsing provides the following benefits:

 Context sensitivity—DataRight IQ can perform limited contextually based

parsing along with existing dictionary lookups where the parser can survey its surroundings and sometimes infer a word’s meaning by its relationship to other elements in addition to dictionary lookups.

 Flexibility—DataRight IQ has greater flexibility because rules are no longer

hard coded. Rules can be changed to meet the project needs without changing the program. This allows advanced users to change, add, or delete rules to meet their specific needs.


Page 26

DataRight IQ uses rule-based parsing

In DataRight 2.56 and earlier, you could only perform presumptive parsing. You didn’t have the option of rule-based parsing. Now DataRight IQ only uses presumptive parsing on name and firm data when the rule-based parsing does not work.

DataRight IQ follows sets of rules to determine how to parse data. It decides how to parse based only on the data itself and the rules. When the data matches a rule, DataRight IQ outputs the data accordingly.

Presumptive parsing

The parsing in DataRight was all hard coded (not like DataRight IQ, where there is an editable rule file). There was no way to change any parsing results. If you knew that your data was a name or a firm, you’d input it on a nameline or firmline and it would always parse as such regardless of what it was.

Although DataRight IQ introduced rule-based parsing, some users prefer how DataRight worked, when it sent all entries to name or firm. So we incorporated the presumptive parsing option into DataRight IQ.

DataRight IQ tries its rule-based parsing with name (or firm) rules when input on a name or firm line, respectively. If data doesn't match a rule (and you have activated presumptive parsing in your setup), it uses presumptive parsing to make a best guess. With presumptive parsing, a name or firm will always be parsed out of a name or firm line.

The input does go to the rule set first, so in some cases the rules will match only part of the entry and parse that out. The remaining will go to extra.

Some examples

In the example below, the nameline data “xb1wc so34bod2jc” is recognized as junk data when parsing by the rules (the data has numbers and/or no letters in it) and so it’s sent to the Extra output field.

Input (on a name line)    Output with rule-based parsing
                          Field     Data
xb1wc so34bod2jc          Extra1    xb1wc so34bod2jc

If you turn presumptive parsing on, the same data is parsed as first and last name because it came in on a nameline.

Input (on a name line)    Output with presumptive parsing
                          Field         Data
xb1wc so34bod2jc          First name    xb1wc
                          Last name     so34bod2jc


Page 27

If you have a legitimate name in your data (like “john smith” among the same junk data, below), parsing will pull out John as a first name, Smith as a last name, and the rest as Extra—regardless if you use rule-based parsing or presumptive parsing—because the data matched a parsing rule.

Input                          Output with presumptive parsing on
                               Field         Data
xb1wc john smith so34bod2jc    First name    John
                               Last name     Smith
                               Extra1        xb1wc so34bod2jc
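The relationship between the two parsing modes can be sketched as a rule attempt followed by an optional fallback. The following Python is a toy model only: the two-word dictionaries and the single "first name followed by last name" rule are hypothetical stand-ins for DataRight IQ's dictionaries and rule file, and the function name is invented for illustration.

```python
# Hypothetical, tiny stand-ins for the name dictionaries
KNOWN_FIRST = {"john"}
KNOWN_LAST = {"smith"}

def parse_name_line(line: str, presumptive: bool = False) -> dict:
    """Try the rule first; fall back to presumptive parsing only if enabled."""
    tokens = line.split()
    # Rule-based pass: a known first name directly followed by a known last name
    for i in range(len(tokens) - 1):
        if tokens[i].lower() in KNOWN_FIRST and tokens[i + 1].lower() in KNOWN_LAST:
            fields = {"First name": tokens[i].capitalize(),
                      "Last name": tokens[i + 1].capitalize()}
            rest = tokens[:i] + tokens[i + 2:]
            if rest:
                fields["Extra1"] = " ".join(rest)  # unmatched remainder goes to Extra
            return fields
    # No rule matched: presume a name because the data arrived on a name line
    if presumptive and len(tokens) >= 2:
        return {"First name": tokens[0], "Last name": " ".join(tokens[1:])}
    return {"Extra1": line.strip()}

print(parse_name_line("xb1wc so34bod2jc"))
print(parse_name_line("xb1wc so34bod2jc", presumptive=True))
print(parse_name_line("xb1wc john smith so34bod2jc", presumptive=True))
```

Because the rule pass always runs first, turning presumptive parsing on changes the result only for input that no rule matches.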

Turn presumptive parsing on

If you use DataRight IQ Library, you use an API call to accomplish this.

In Job, activate presumptive parsing for both name and firm lines in the Parsing Control block:

BEGIN Parsing Control =============================================

Parsing Mode (NONE/PARSE)............ = PARSE

Presumptive Parse Name Lines......... = Y

Presumptive Parse Firm Lines......... = Y

END
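If you need to inspect job-file settings from a script, a block like the one above can be read with a few lines of Python. This is a hedged sketch based only on the conventions described in this guide (BEGIN/END blocks, comment lines starting with an asterisk, and "Name (hint)..... = value" parameter lines); parse_job_blocks is a hypothetical helper, not part of DataRight IQ, and it does not handle long entries that wrap onto a second line.

```python
import re

def parse_job_blocks(text: str) -> dict:
    """Collect {block name: {parameter name: value}} from job-file text."""
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue                      # blank line or comment (also *BEGIN)
        if line.upper().startswith("BEGIN"):
            name = line[5:].strip().rstrip("=").strip()
            current = blocks.setdefault(name, {})
        elif line.upper() == "END":
            current = None
        elif current is not None and "=" in line:
            left, _, value = line.partition("=")
            left = left.rstrip(". ")                    # drop the dot leaders
            left = re.sub(r"\s*\([^)]*\)$", "", left)   # drop the (guideline/OPTIONS) hint
            current[left] = value.strip()
    return blocks

sample = """\
* Presumptive parsing for name and firm lines
BEGIN Parsing Control =============================================
Parsing Mode (NONE/PARSE)............ = PARSE
Presumptive Parse Name Lines......... = Y
Presumptive Parse Firm Lines......... = Y
END
"""
print(parse_job_blocks(sample))
```

Parameter lines that follow a commented-out *BEGIN are skipped in this sketch, because no block is open until the next real BEGIN line.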

In Views, select the Parsing Setup group in the main window, then open the Parsing Control window. Select the Use Presumptive Parsing for Name Lines and Use Presumptive Parsing for Firm Lines options:


Page 28

DataRight IQ’s multiline parsing order

When input is on a multiline, DataRight IQ parses data in the following order:

Order    Parsed item
1        Street address and lastline
2        E-mail address
3        U.S. Social Security number
4        Date
5        Phone number (U.S. or Canadian)
6        Phone number (International)
7        User-defined pattern
8        Name and title
9        Firm

Why order?

The order in which DataRight IQ parses your data is important. Why? Because if DataRight IQ identifies data as one thing before it can evaluate it as another, you may get unexpected results.

For example, if DataRight IQ identifies a nine-digit number as a U.S. Social Security number, then it won’t evaluate that data as a potential international phone number. Likewise, if you set up a custom pattern that looks for 5-digit numbers, anything recognized as a ZIP code is not going to make it through to get evaluated against your pattern.

When parsing, DataRight IQ looks through each record for different types of data. For each type of data, DataRight IQ makes a separate “pass” through the data. If DataRight IQ finds something on one pass, it extracts that data—and on the next pass examines only the data that remains.

When an item is recognized, it doesn’t go to the next step.
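The effect of the pass order can be illustrated with a small Python sketch. Everything here is a simplification: the regular expressions are rough stand-ins written for this example (DataRight IQ's actual recognizers are not shown in this guide), only a subset of the passes is modeled, and the five-digit "Pattern" entry plays the role of a hypothetical user-defined pattern.

```python
import re

# Passes run in this fixed order; each one sees only what earlier passes left behind.
PASSES = [
    ("Lastline", re.compile(r"\b[A-Z]{2} \d{5}\b")),            # crude state + ZIP stand-in
    ("E-mail", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("Phone", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
    ("Pattern", re.compile(r"\b\d{5}\b")),                      # hypothetical 5-digit pattern
]

def parse_in_passes(record: str):
    """Return (list of (label, text) found, whatever remains unparsed)."""
    remaining = record
    found = []
    for label, pattern in PASSES:
        match = pattern.search(remaining)
        if match:
            found.append((label, match.group()))
            # Extract the recognized data so later passes never see it
            remaining = remaining[:match.start()] + " " + remaining[match.end():]
    return found, " ".join(remaining.split())

found, extra = parse_in_passes("WI 54494 421-55-2424 dsmith@rdrindustries.com 507-555-3423")
print(found)
print(repr(extra))
```

In this run the five-digit pattern finds nothing, because the ZIP code was already extracted by the earlier lastline pass.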


Page 29

Modify how DataRight IQ parses

DataRight IQ parses better than DataRight, in part because you can decide how it parses. You have more control over how DataRight IQ parses.

You can, of course, set the input fields and retrieve the output fields as usual. And with DataRight IQ you can create custom dictionaries just like you can with DataRight—by using the User-Modifiable Dictionary (UMD) utility.

But with DataRight IQ you can also create or edit parsing rules in the rule file (drlrules.dat). The rule file controls how DataRight IQ parses name and firm data.

Use pre-defined rules

DataRight IQ already provides hundreds of rules for many different possible combinations of data. These rules will likely satisfy the parsing needs of most users. However, you may encounter data that isn’t being parsed the way that you want it to be parsed. Or, maybe you want to tweak a rule so that it returns a different confidence score. In situations like this, it is very handy to be able to edit the rule file.

For more information on editing the rules by which DataRight IQ parses, see the DataRight IQ Modifier’s Guide.

Turn off parsing engines

To help you control how DataRight IQ parses, the program gives you the ability to directly control what DataRight IQ parses. For each input line you can selectively turn off the parsing of addresses, names, firms, Social Security numbers, dates, phone numbers, user-defined patterns, and e-mail addresses.

In your DataRight IQ Job product, see the Multiline Parsing block. In your DataRight IQ Library product, see the drl_disable_iline_parsers() function (if using C) or the DisableILineParsers() method (if using C++).


Page 30


Page 31

Chapter 4: How DataRight IQ parses new types of data

DataRight IQ can identify and isolate various data and data components from fields of information:

Fields that DataRight IQ parses that DataRight did not    For more information, see
e-mail addresses                                          page 32
U.S. Social Security numbers                              page 34
dates                                                     page 37
phone numbers                                             page 38
user-defined patterns                                     page 40

For a complete list of the input fields that DataRight IQ accepts, see Business Objects’ Quick Reference for Views and Job-File Products.


Page 32

Parse e-mail addresses

When DataRight IQ parses input data that it determines is an e-mail address, it places the components of that data into specific fields for output. Below is an example of a simple e-mail address:

sales@firstlogic.com

By identifying the various data components (user name, host, and so on) by their relationships to each other, DataRight IQ then assigns the data to specific fields.

Fields used

DataRight IQ outputs the individual components of a parsed e-mail address: the e-mail user name, complete domain name, top domain, second domain, third domain, fourth domain, fifth domain, and host name.

For inputting and outputting e-mail address information, DataRight IQ uses the following fields:

Input fields                        Output fields
PW.Email1-6 and multiline fields    AP.Email1-6
                                    AP.EmailUser1-6
                                    AP.EmailAllD1-6
                                    AP.EmailTopD1-6
                                    AP.Email2ndD1-6
                                    AP.Email3rdD1-6
                                    AP.Email4thD1-6
                                    AP.Email5thD1-6
                                    AP.EmailHost1-6
                                    AP.EmailISP1-6

What DataRight IQ does

With DataRight IQ, you can do the following things with an e-mail address:

 Parse the e-mail address, either in a field by itself or combined in a field with other data.
 Break the domain name down into sub-elements.
 Verify that an e-mail address is properly formatted.
 Flag the address for special handling (see “Flag addresses” on page 32).

Not verified

Several aspects of an e-mail address are not verified by DataRight IQ. DataRight IQ does not verify:

 whether the domain name (the portion to the right of the @ sign) is registered.
 whether an e-mail server is active at that address.
 whether the user name (the portion to the left of the @ sign) is registered on that e-mail server (if any).
 whether the person named in the record can be reached at this e-mail address.

Flag addresses

You can flag e-mail addresses based on a list of criteria you create or maintain. For example, if you focus on B2B (business to business), you might want to flag consumer-oriented domain names such as hotmail, yahoo, or aol.com.

You flag addresses by matching them against a list of hosts and domain names in a file named drlemail.dat. You can post a flag (to the field EmailISP1-6) that indicates whether the address was found in that list. This flag can then be used to separate records either in the current job or in a future process.

Page 33

E-mail components

The AP field where DataRight IQ places the data depends on the position of the data in the record. DataRight IQ follows the Domain Name System (DNS) in determining the correct output field.

When DataRight IQ parses the following data:

expat@london.home.office.city.co.uk

it assigns these components to the following fields according to DNS.

Sample data                      Field                          Field description
expat                            EmailUser1                     The user name, or “addressee”: the person or department, for example, for whom the e-mail is intended.
london.home.office.city.co.uk    EmailAllD1                     The “all D” is the entire domain.
uk                               EmailTopD1                     The top-level domain. In DNS, the highest level of hierarchy after the root (the last “dot”). In a domain name, the portion that appears farthest to the right, often “com,” “org,” “gov,” and so on.
home.office.city.co              Email2ndD1 through Email5thD1  The elements between the host and the top-level domain.
london                           EmailHost1                     The element immediately to the right of the “at” symbol (@).

For example, with the input data, expat@london.home.office.city.co.uk, DataRight IQ outputs each element in the following fields:

AP field         Output value
AP.Email1        expat@london.home.office.city.co.uk
AP.EmailUser1    expat
AP.EmailAllD1    london.home.office.city.co.uk
AP.EmailTopD1    uk
AP.Email2ndD1    co
AP.Email3rdD1    city
AP.Email4thD1    office
AP.Email5thD1    home
AP.EmailHost1    london
AP.EmailISP1     f
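The component assignment shown above can be reproduced with a short Python sketch. It models only the splitting of one address into the documented output fields (field names shortened by dropping the AP. prefix and the numeric suffix); split_email is a hypothetical helper, and it performs no format verification or drlemail.dat lookup.

```python
def split_email(address: str) -> dict:
    """Split one e-mail address into user, domain levels, and host (sketch)."""
    user, _, domain = address.partition("@")
    labels = domain.split(".")
    fields = {
        "Email": address,
        "EmailUser": user,
        "EmailAllD": domain,
        "EmailTopD": labels[-1],   # farthest right
        "EmailHost": labels[0],    # immediately right of the @ sign
    }
    # Elements between host and top-level domain, counted from the right
    between = labels[1:-1][::-1]
    for name, label in zip(["Email2ndD", "Email3rdD", "Email4thD", "Email5thD"], between):
        fields[name] = label
    return fields

for name, value in split_email("expat@london.home.office.city.co.uk").items():
    print(f"{name:10} = {value}")
```

For a short address such as sales@firstlogic.com, this sketch fills only the user, entire domain, top domain, and host; how DataRight IQ fills the second-domain field in that case is not stated here.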


Page 34

Parse Social Security numbers

DataRight IQ parses U.S. Social Security numbers (SSNs) that are either by themselves or on an input line surrounded by other text.

Fields used

For inputting and outputting U.S. Social Security number information, DataRight IQ uses the following fields:

Input fields                      Output fields
PW.SSN1-6 and multiline fields    AP.SSN1-6

The six available PW.SSN fields (PW.SSN1-6) store outputted Social Security number data.

Example of data:         123-45-6789
Typical field length:    9-11 characters

How this differs from DataRight

DataRight (as opposed to DataRight IQ) had one PW.SSN field that could contain a Social Security number. DataRight did not perform any processing on the data in this field.

You could use this field to overcome field-naming differences among input files. For example, if one input file contained a field named SS_Number and another input file contained a field named Soc_Sec_No, you could define both fields as PW.SSN. This would give you a common Social Security Number field (PW.SSN) to use in filters and output posting.

Setting up SSN parse in DataRight IQ

DataRight IQ has two SSN options to set up for parsing:

 Specify the location of the file that contains SSN information
 Select the type of delimiter to use

SSN information file

Specify the location of the SSN information file in the Auxiliary Files block in DataRight IQ.

SSN delimiter

In the Standardization/Assignment Control block, you can determine what delimiter you want to output the SSN with by setting the SSN Delimiter option.

In Job, there is a note at the end of the block specifying the delimiters to choose from. In Views, choose from options in a drop-down list.


Page 35

How DataRight IQ handles Social Security numbers

DataRight IQ handles Social Security numbers in two steps:

1. Identifies a potential SSN by looking for any of three patterns:

Pattern        Digits per grouping                          Delimited by
nnnnnnnnn      9 consecutive digits                         n.a.
nnn nn nnnn    3, 2, and 4 (for area, group, and serial)    spaces
nnn-nn-nnnn    3, 2, and 4 (for area, group, and serial)    all supported delimiters

2. Performs a validity check on the first five digits only. Two outcomes of this validity check are possible:

Outcome    Description
Pass       DataRight IQ successfully parses the data, and the Social Security number comes out in an AP field.
Fail       DataRight IQ does not parse the data because it’s not a valid SSN as defined by the U.S. government, so the data comes out as Extra, unparsed data.

Check validity

When performing a validity check, DataRight IQ doesn’t verify that a particular 9-digit Social Security number has been issued, or that it’s the correct number for any named person. Instead, it validates only the first 5 digits (area and group). DataRight IQ doesn’t validate the last 4 digits (serial), except to confirm they are digits.
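The two steps (pattern match, then a check of area and group only) can be sketched in Python as follows. This is an illustration under stated assumptions: the hyphen is the only delimiter modeled besides spaces and no delimiter, VALID_AREA_GROUP is a made-up one-entry stand-in for the table supplied in drlssn.dat, and parse_ssn is a hypothetical name.

```python
import re

# Step 1: nnnnnnnnn, nnn nn nnnn, or nnn-nn-nnnn (same delimiter in both positions)
SSN_PATTERN = re.compile(r"(?<!\d)(\d{3})([- ]?)(\d{2})\2(\d{4})(?!\d)")

# Step 2 stand-in: hypothetical set of valid (area, group) pairs
VALID_AREA_GROUP = {("123", "45")}

def parse_ssn(text: str, delimiter: str = "-"):
    """Return the SSN formatted with the chosen delimiter, or None if not parsed."""
    match = SSN_PATTERN.search(text)
    if match is None:
        return None                       # no SSN-like pattern in the input
    area, _, group, serial = match.groups()
    if (area, group) not in VALID_AREA_GROUP:
        return None                       # fails validation: stays unparsed (Extra)
    return delimiter.join([area, group, serial])

print(parse_ssn("SSN: 123-45-6789"))
print(parse_ssn("123456789"))
print(parse_ssn("923-45-6789"))
```

Only the first five digits influence the validity decision; the serial is accepted as long as it is four digits.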

SSA data

DataRight IQ’s validation of the first 5 digits is driven by a table from the Social Security Administration (http://www.ssa.gov/foia/highgroup.htm). That table is updated monthly as the SSA opens new groups. The rules and data that guide this check are available at http://www.ssa.gov/history/ssn/geocard.html.

Update your SSN file

Business Objects provides the Social Security Number (SSN) file (drlssn.dat) for DataRight IQ customers interested in parsing recently issued and existing U.S. Social Security numbers. The SSN file is updated monthly with the latest SSN information from the U.S. government. Business Objects converts the data to a format that DataRight IQ can use and posts the data by the 5th of every month.

You can download the most current drlssn.dat file, which DataRight IQ uses to parse U.S. Social Security numbers, from the Customer Portal site at http://download.firstlogic.com.

Outputs valid SSNs

DataRight IQ outputs only Social Security numbers that pass its validation. If an apparent SSN fails validation, DataRight IQ does not pass on the number as a parsed, but invalid, Social Security number.


Page 36

Other U.S. ID numbers

Your data may include other numbers used in the United States for governmental identification purposes. DataRight IQ’s capability is aimed at U.S. Social Security numbers, which are, in effect, Tax IDs for individuals. However, other numbers include ITIN and EIN.

Name    Description

ITIN    Individual Taxpayer Identification Number. This “cousin” to the Social Security number is what the IRS assigns to people who earn money and pay federal income taxes but who are not citizens (they are resident or non-resident aliens). An ITIN looks like an SSN except that it begins with the number 9. DataRight IQ treats an ITIN as an invalid SSN. It might match the pattern, but not make it through the check against the SSN table, so an ITIN will come out as unparsed Extra.

EIN     Employer Identification Number. Synonymous with a corporate Tax Identification Number (TIN) or Tax ID, this number is also 9 digits. However, its pattern is nn-nnnnnnn. Because of that, the EIN is not recognized by DataRight IQ’s SSN parser.

Use UDPM to parse other patterns. If you need to parse patterns that aren’t covered by one of DataRight IQ’s usual parsing engines, use DataRight IQ’s UDPM (user-defined pattern matching) feature.


Page 37

Parse dates

DataRight IQ recognizes dates in a variety of formats and breaks those dates into components.

Fields used

For inputting and outputting date information, DataRight IQ uses the following fields:

Input fields                       Output fields
PW.Date1-6 and multiline fields    AP.Date1-6

Formats and delimiters

DataRight IQ supports the following formats and delimiters. That is, you can select any one of these formats to standardize dates.

Format         Example
yyyy*mm*dd     2004 01 27
yy*mm*dd       04 01 27
dd*mm*yyyy     27 01 2004
dd*mm*yy       27 01 04
mm*dd*yyyy     01 27 2004
mm*dd*yy       01 27 04
dd*mmm*yy      27 Jan 04
dd*mmm*yyyy    27 Jan 2004
mmm*dd*yyyy    Jan 27 2004
mmm*dd*yy      Jan 27 04
yyyymmdd       20040127
yymmdd         040127
ddmmyyyy       27012004
ddmmyy         270104
mmddyyyy       01272004
mmddyy         012704

Delimiter*    Description
<none>        no space
<space>       a space
/             forward slash
-             dash
\             backward slash
.             period

* Delimiters appear between date components only for formats that have delimiters. That is, you also have the option of using no delimiters (<none>).

DataRight IQ can parse up to six dates from your defined record. That is, DataRight IQ identifies one or more dates (up to six) in the input, breaks found dates into components, and makes dates available as output in either the original format or a user-selected standard format.
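To see how a format and a delimiter combine into a standardized date, consider this Python sketch. It assumes that the asterisk in a format marks where the chosen delimiter goes, and that formats without asterisks never take a delimiter; standardize_date is a hypothetical helper that covers only output formatting, not recognition of dates in input.

```python
import re
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def standardize_date(value: date, fmt: str, delimiter: str = " ") -> str:
    """Render a date in one of the listed formats, e.g. 'dd*mmm*yyyy'."""
    parts = {
        "yyyy": f"{value.year:04d}",
        "yy": f"{value.year % 100:02d}",
        "mmm": MONTHS[value.month - 1],
        "mm": f"{value.month:02d}",
        "dd": f"{value.day:02d}",
    }
    if "*" in fmt:
        tokens, sep = fmt.split("*"), delimiter       # delimiter replaces each asterisk
    else:
        tokens, sep = re.findall(r"yyyy|yy|mmm|mm|dd", fmt), ""
    return sep.join(parts[token] for token in tokens)

d = date(2004, 1, 27)
print(standardize_date(d, "yyyy*mm*dd", " "))
print(standardize_date(d, "dd*mmm*yy", "-"))
print(standardize_date(d, "mmddyy"))
```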


Page 38

Parse phone numbers

DataRight IQ can parse both North American (U.S. and Canada) and international phone numbers. When DataRight IQ parses a phone number, it outputs the individual components of the number into the appropriate AP fields (see the examples below).

Fields used

For inputting and outputting phone number information, DataRight IQ uses the following fields:

Input fields                        Output fields
PW.Phone1-6 and multiline fields    AP.USPhone1-6
                                    AP.USAreaCod1-6
                                    AP.USPhonPre1-6
                                    AP.USPhonLin1-6
                                    AP.USPhonExt1-6
                                    AP.USPhonTyp1-6
                                    AP.IntPhone1-6
                                    AP.IntCtryCd1-6
                                    AP.IntCityCd1-6
                                    AP.IntPhNum1-6
                                    AP.IntPhDesc1-6

U.S. versus international phone numbers

Phone numbering systems differ around the world. DataRight IQ recognizes phone numbers by their pattern and (for non-U.S. numbers) by their country code, too.

U.S. and Canada

What DataRight IQ calls U.S. phone numbers should more properly be called North American phone numbers. The Canadian phone number standard follows the same pattern as U.S. phone numbers. Because of this, when DataRight IQ parses a phone number that’s either from the U.S. or Canada, it posts the data to AP.USPhoneX.

DataRight IQ searches for U.S. phone numbers by commonly used patterns such as: (234) 567-8901, 234-567-8901, and 2345678901.

DataRight IQ gives you the option for some reformatting on output (such as your choice of delimiters). Below is an example with extension text:

Input data:     (901) 234-5678 EXT 1234
Output data:    901-234-5678 Ext. 1234

Europe and Pacific-Rim

DataRight IQ searches for European and Pacific-Rim numbers by pattern. The patterns used are stored in drlphint.dat. They require that the country code appear at the beginning of the number. DataRight IQ doesn’t offer any options for reformatting international phone numbers. Also, DataRight IQ doesn’t cross-compare to the address to see if the country and city codes in the phone match the address.


Page 39

Phone number components

Phone numbers consist of different output components depending on whether they’re U.S. or international numbers.

Individual components for:

U.S. phone numbers    Non-U.S. phone numbers
 area code           country code
 prefix              city code
 line number         number
 extension           description
 line type

Example of U.S. phone number

Say you have the following U.S. phone data for input:

Work (308)-555-8402 ext 34

DataRight IQ parses the data in the following AP fields:

AP field         Output value
AP.USAreaCod1    308
AP.USPhonPre1    555
AP.USPhonLin1    8402
AP.USPhonExt1    34
AP.USPhonTyp1    Work

Some of these fields (namely, area code, extension, and type) are optional. If your input doesn’t have appropriate values for these fields, DataRight IQ leaves them empty.

Example of non-U.S. phone number

Say you have the following international (non-U.S.) phone data for input:

61-9-0123-4567

DataRight IQ parses the data in the following AP fields (all data must be present for the phone data to be valid):

AP field         Output value    Note
AP.IntCtryCd1    61
AP.IntCityCd1    9
AP.IntPhNum1     0123-4567
AP.IntPhDesc1    Australia       Populated based on the Country ID.

DataRight IQ accepts international phone numbers only as they would be dialed from the U.S. For example, the number must start with the appropriate country code.

Also, if presented on a line with other data, the international phone number must start the line.
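The U.S. example above can be mimicked with one regular expression in Python. This is a rough sketch for illustration: the pattern is written for this example and is not the one DataRight IQ uses, the field names drop the AP. prefix and numeric suffix, and parse_us_phone is a hypothetical helper. Optional components that are absent come back empty, as described above.

```python
import re

US_PHONE = re.compile(
    r"(?:(?P<type>[A-Za-z]+)\s+)?"              # optional line type, e.g. Work
    r"(?:\(?(?P<area>\d{3})\)?[-.\s]*)?"        # optional area code, with or without ()
    r"(?P<prefix>\d{3})[-.\s]?(?P<line>\d{4})"  # prefix and line number
    r"(?:\s*ext\.?\s*(?P<ext>\d+))?",           # optional extension
    re.IGNORECASE,
)

def parse_us_phone(text: str):
    """Return the phone components as a dict, or None if the text does not match."""
    match = US_PHONE.fullmatch(text.strip())
    if match is None:
        return None
    return {
        "USAreaCod": match.group("area") or "",
        "USPhonPre": match.group("prefix"),
        "USPhonLin": match.group("line"),
        "USPhonExt": match.group("ext") or "",
        "USPhonTyp": match.group("type") or "",
    }

print(parse_us_phone("Work (308)-555-8402 ext 34"))
print(parse_us_phone("555-8402"))
```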


Page 40

Parse user-defined patterns

Parse any number or alphanumeric

With DataRight IQ you can parse data that’s outside the range of name, title, address, and so on. With DataRight IQ’s user-defined pattern matching (UDPM) feature, you can parse a wide variety of data such as:

 account numbers
 part numbers
 purchase orders
 invoice numbers
 VINs (vehicle identification numbers)
 driver license numbers

In other words, DataRight IQ can parse any kind of number or alphanumeric for which you can define a pattern.

Fields used

For inputting and outputting user-defined pattern information, DataRight IQ uses the following fields:

Input fields                          Output fields        Description of output field
PW.Pattern1-4 and multiline fields    AP.Pattern1-4        The pattern
                                      AP.PatnLabel1-4      The label for the pattern
                                      AP.Patnsub1-4_1-5    The subpattern(s) of the pattern

The pattern label is created in the drludpm.dat file when the pattern is defined.

How it’s done

DataRight IQ is able to parse patterns through its user-defined pattern matching (UDPM) feature, which uses regular expressions. That is, you can set up data patterns to suit your data (such as part numbers), and DataRight IQ can parse your data according to those user-defined patterns.

DataRight IQ’s UDPM feature makes possible the parsing and extraction of virtually any kind of data that conforms to a pattern—any type of data pattern that can be expressed using regular expressions.

Define your pattern

When you create a user-defined pattern, you must include a carriage return/linefeed at the end of the line. All characters before the carriage return/linefeed, even blank spaces, are considered part of the pattern.
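As a rough idea of what pattern, label, and subpatterns mean, here is a Python sketch using the standard re module. The part-number layout, the label PartNumber, and the helper name are all invented for this example; the actual syntax of entries in drludpm.dat is described in the DataRight IQ Modifier’s Guide and is not reproduced here.

```python
import re

# Hypothetical user-defined pattern: two letters, four digits, two digits
PATTERNS = [
    ("PartNumber", re.compile(r"\b([A-Z]{2})-(\d{4})-(\d{2})\b")),
]

def match_user_patterns(text: str):
    """Return the first matching pattern with its label and subpatterns, else None."""
    for label, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return {
                "Pattern": match.group(0),       # the whole matched pattern
                "PatnLabel": label,              # the label given at definition time
                "Patnsub": list(match.groups()), # each parenthesized subpattern
            }
    return None

print(match_user_patterns("Order for AB-1234-07 shipped"))
```

Each parenthesized group in the expression corresponds to one subpattern. A stray blank inside a regular expression is significant too, which is why trailing spaces in a pattern definition matter.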

For more information

For more information on UDPM and setting up the patterns that DataRight IQ will parse, see the DataRight IQ Modifier’s Guide, which accompanies the DataRight IQ product. This ability is for advanced users of DataRight IQ. You should read and follow all warnings before changing how your product works.


Page 41

Chapter 5: DataRight IQ’s additional features

In addition to improvements in parsing, DataRight IQ has several other new features to enhance your control over file parsing.

 Control name order. DataRight IQ provides you with more control over how your name formats are followed.
 Generate statistics files. DataRight IQ generates statistics files for you to use to create customized reports.
 Use confidence scores. DataRight IQ assigns calculated numeric scores to parsed names and firms to indicate the accuracy of the parse.


Page 42

Control name order

DataRight IQ lets you determine name order—FML (First Middle Last) or LFM (Last First Middle)—just like DataRight did. However, instead of only two name orders (FML and LFM), DataRight IQ provides you with more control over how those orders are followed.

“Strict” or “Suggest”

Now you can decide how name order is applied to parsed names. DataRight IQ determines name order according to how you set the following parameter(s):

Job                        Library
Strict Name Order (Y/N)    DRL_FML_STRICT
                           DRL_FML_SUGGEST
                           DRL_LFM_STRICT
                           DRL_LFM_SUGGEST
                           DRL_UNKNOWN

You set the name order (FML or LFM) in the definition (DEF) file.

Turn on strict name order

Library

To make name order in DataRight IQ work as you may be used to with DataRight, use one of the two *_STRICT values. When you choose one of the *_SUGGEST values, DataRight IQ uses the suggested order only when the name is ambiguous, to determine which name is First and which is Last.

When you use the *_STRICT values, DataRight IQ parses name data the way it is set. This means that if you use DRL_FML_STRICT, “Tommy Jones” on input parses as First name Tommy and Last name Jones on output. Likewise, if you use DRL_LFM_STRICT, “Tommy Jones” parses as Last name Tommy and First name Jones.

Know your data. Be aware that when you choose the strict values for name parsing, every name is parsed in the order you choose. This could produce unexpected results such as the example with “Tommy Jones” above.
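The difference between the strict and suggest settings can be sketched in Python. This is an illustration only and not DataRight IQ's parsing logic: the function name and the parser_guess argument (a stand-in for whatever the parsing engine would decide on its own, or None when the name is ambiguous) are hypothetical.

```python
def assign_first_last(tokens, order, strict, parser_guess=None):
    """Return (first, last) for a two-word name.

    order        -- "FML" or "LFM", the configured name order
    strict       -- True for a *_STRICT setting, False for *_SUGGEST
    parser_guess -- hypothetical: the (first, last) pair the parser would
                    pick by itself, or None if the name is ambiguous
    """
    if len(tokens) != 2:
        raise ValueError("this sketch handles two-word names only")
    if order not in ("FML", "LFM"):
        raise ValueError("order must be 'FML' or 'LFM'")

    # Reading the words purely by position, as the configured order says.
    if order == "FML":
        by_order = (tokens[0], tokens[1])
    else:
        by_order = (tokens[1], tokens[0])

    # Strict: the configured order always wins.
    # Suggest: the configured order is used only when the name is ambiguous.
    if strict or parser_guess is None:
        return by_order
    return parser_guess


# Strict LFM forces "Tommy" to be the last name, as in the example above.
print(assign_first_last(["Tommy", "Jones"], "LFM", strict=True,
                        parser_guess=("Tommy", "Jones")))
```

With strict set to False and the same guess, the sketch returns the guess unchanged, which mirrors the "only when ambiguous" behavior described for the suggest values.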

Job

If you use DataRight IQ Job, you enable strict name order in the Input File block by changing the value of the Strict Name Order parameter to Yes. The parameter appears in the block as follows (shown here with strict name order turned off):

Strict Name Order (Y/N).......... = N


The only valid values for this parameter are Y or N (yes or no).

Value   Description

Yes     DataRight IQ will use the name order set in your DEF file for every record.

No      DataRight IQ will use the name order set in your DEF file only when the name is ambiguous.

Views

Set up strict name order in Views the same way that you do for Job: open the Input File window and select Strict Name Order.


Generate statistics files

You can have DataRight IQ generate statistics files. Statistics files are text files containing all the information from a specific report. You can then use these files when you create custom reports.

DataRight IQ can create the following statistics files:

Statistics file: Job Statistics file
Based on this report: Job Summary
Description of statistics file: This statistics file contains a single record. The statistics in this file represent the significant aspects of your DataRight IQ job.

Statistics file: Exec Statistics file
Based on this report: Executive Summary
Description of statistics file: This statistics file contains everything that you can find in the Executive Summary report.

Statistics file: Output Statistics file
Based on this report: Output File
Description of statistics file: This statistics file includes one record per list per output file. Totals for the lists are not provided; to determine totals, add up the appropriate fields, based on the type of output file.

Enable statistics file generation

In the same way that DataRight IQ creates reports, DataRight IQ creates each statistics file during the process that the file describes.

For example, to create the output Statistics File, you must have your job set up to create an output file and an Output File report. So, in addition to the Output Statistics File block, your job must include a Report: Output File block, and a Create File for Output block.
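The block requirement above can be checked mechanically before a job is run. The following Python sketch scans job-file text for BEGIN lines and reports which of the three blocks are missing. It rests on assumptions: the helper names are hypothetical, the block names are taken from the BEGIN lines shown in this guide, and a BEGIN line prefixed with an asterisk is treated as commented out (inactive).

```python
import re

# Matches lines such as "BEGIN Report: Output File =========".
# A line that starts with "*" does not match, so it is treated as
# commented out (an assumption about the job-file convention).
BLOCK_RE = re.compile(r"^BEGIN\s+(.+?)\s*=+\s*$")

# Blocks this guide says are needed for the Output Statistics file.
REQUIRED_FOR_OUTPUT_STATS = (
    "Statistics File",
    "Report: Output File",
    "Create File for Output",
)


def active_blocks(job_text):
    """Return the set of block names that have an uncommented BEGIN line."""
    names = set()
    for line in job_text.splitlines():
        match = BLOCK_RE.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


def missing_blocks(job_text):
    """Return the required block names that are absent, in sorted order."""
    present = active_blocks(job_text)
    return sorted(name for name in REQUIRED_FOR_OUTPUT_STATS
                  if name not in present)


sample_job = """\
BEGIN Statistics File =====================
END
*BEGIN Report: Output File ================
END
BEGIN Create File for Output ==============
END
"""
print(missing_blocks(sample_job))
```

In the sample, the Report: Output File block is commented out, so it is the one reported as missing.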

Activate statistics file generation in the Execution block by typing Y for the Create Report Statistics Files parameter in Job, or by selecting the Create Reports Statistics File(s) option in Views.

Yes and No (or selected/deselected) are the only options for this parameter.

Value             Description

Yes (selected)    If the parameter is set to Y (selected in Views), then DataRight IQ verifies the Statistics Files block parameters for valid file type and path. Valid file types are ASCII, delimited, or dBASE3.

No (deselected)   If the parameter is set to N (deselected in Views), then DataRight IQ ignores the Statistics File block and doesn’t create any statistics files during processing.

After you have enabled statistics files, you need to set them up.


Setting options for statistics files

Job

Views

You can set options and file paths for your statistics files through the Statistics Files block. In this block, you specify the file names, locations, and types. To save time, use the $job macro wherever a file name is needed.

Note: The options in the block come in pairs. At the first parameter, type a full path for the statistics file. On the second line type the file type that you want (ASCII, Delimited, or dBASE3).

BEGIN Statistics File =========================================
Job Stats Name (path & file name).... = $jobj.dsj
File Type (ASCII/DBASE3/DELIMITED)... = ASCII
Exec Stats Name (path & file name) .. = $jobe.dse
File Type (ASCII/DBASE3/DELIMITED)... = ASCII
Output Stats Name (path & file name). = $jobo.dso
File Type (ASCII/DBASE3/DELIMITED)... = ASCII
END
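To make the file names in the block above easier to read, here is a small Python sketch of how a $job value might expand. It assumes that $job is replaced by the job file's base name without its extension; that expansion rule, the function name, and the job file name monthly.job are assumptions for illustration, so check the $job, $time, and $date macros section of your documentation for the actual behavior.

```python
from pathlib import PurePath


def expand_job_macro(value, job_file):
    """Replace $job in a parameter value with the job file's base name.

    Assumption: the macro expands to the file name without directory
    or extension. This is a sketch, not the product's implementation.
    """
    job_name = PurePath(job_file).stem
    return value.replace("$job", job_name)


# "monthly.job" is a hypothetical job file name.
for value in ("$jobj.dsj", "$jobe.dse", "$jobo.dso"):
    print(value, "->", expand_job_macro(value, "jobs/monthly.job"))
```

Under that assumption, the three statistics files for one job share a common prefix and differ only in the trailing letter and extension.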

For more information

For complete information about statistics files, see the Reports chapter in your DataRight IQ User’s Guide.


Use confidence scores

DataRight IQ assigns a confidence score to each parse.

A confidence score is a number from 0 to 100 that quantifies the confidence that a piece of data was correctly parsed. In DataRight IQ, there is a confidence score for the items parsed by a rule, as well as for the subcomponents that make up the pieces of that rule. You can use this score to analyze your parsing results.

Discrete fields are parsed at 100 percent confidence. For non-discrete fields you can specify the breakpoint for confidence ranges.

For reports only. Confidence scores are used for reporting purposes only. How you set the confidence scores has no effect on the way your data is parsed. If you change the High Confidence setting to a lower percentage, your report will show more data as parsed with high confidence, but that doesn’t change how DataRight IQ parsed that data.
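The reporting-only effect of the breakpoints can be sketched in Python. The function below sorts a score into a report band using the High and Medium breakpoints; the defaults 31 and 21 come from the master job file values shown in this guide. Treating a score equal to a breakpoint as belonging to the higher band is an assumption, as are the function name and the band labels.

```python
def confidence_band(score, high=31, medium=21):
    """Classify a confidence score for reporting.

    Assumption: a score at or above a breakpoint falls in that band.
    The score itself is never modified; only its label changes.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be from 0 to 100")
    if medium >= high:
        raise ValueError("medium breakpoint must be below high breakpoint")
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


score = 25
print(confidence_band(score))             # with the default breakpoints
print(confidence_band(score, high=25))    # after lowering High Confidence
```

The same score of 25 is reported in a different band after the High Confidence setting is lowered, while the parse that produced the score is unchanged.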

Changing scores using a confidence booster

DataRight IQ chooses the rule with the highest confidence score. You cannot change the initial confidence score determined by DataRight IQ. However, if you want another rule to be used, you can add a “confidence booster” to the rule you want to use so that it returns a higher confidence. DataRight IQ then uses that rule for the parse.
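The selection idea can be illustrated with a short Python sketch. Everything here is hypothetical: the rule names, the scores, and in particular the way a booster is modeled (as a number added to the initial score and capped at 100). The Modifier's Guide describes how boosters actually work in rule files.

```python
def choose_rule(initial_scores, boosters=None):
    """Return the name of the rule with the highest (boosted) score.

    initial_scores -- mapping of rule name to initial confidence score
    boosters       -- optional mapping of rule name to a boost amount
    Assumption: a boost is additive and the result cannot exceed 100.
    """
    boosters = boosters or {}
    boosted = {
        name: min(100, score + boosters.get(name, 0))
        for name, score in initial_scores.items()
    }
    return max(boosted, key=boosted.get)


scores = {"rule_a": 70, "rule_b": 60}   # hypothetical rules and scores
print(choose_rule(scores))
print(choose_rule(scores, boosters={"rule_b": 15}))
```

Without a booster the higher-scoring rule is chosen; with the booster applied to the other rule, that rule overtakes it, even though neither initial score was edited.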

Refer to the Modifier’s Guide for more details about confidence boosters and changing rule files.

Set confidence scores

Job

In DataRight IQ’s job-file implementation, the Report Defaults block has the following parameters controlling confidence scores (non-discrete fields).

High Confidence (2 to 100)........... = 31
Medium Confidence (1 to 99).......... = 21
Name High Confidence (2 to 100)...... = 31
Name Medium Confidence (1 to 99)..... = 21
Firm High Confidence (2 to 100)...... = 66
Firm Medium Confidence (1 to 99)..... = 21

The High Confidence and Medium Confidence parameters are “whole record” confidence scores that correspond with similar parameters in DataRight 2.5x. The Name and Firm confidence scores are new for DataRight IQ.


Views

In Views, you control the confidence scores for non-discrete fields in the Report Defaults block.

For more information

For more information, see your DataRight IQ User’s Guide. For an overview of and details about confidence scores, see the chapter “Data-quality scores and codes.”


Chapter 6: Fields DataRight IQ uses

DataRight and DataRight IQ don’t use the same fields. Well, not all the same fields. DataRight IQ has a number of fields that you won’t find in DataRight (necessary because of its increased parsing abilities). Conversely, DataRight IQ has no need for, and so it doesn’t use, some fields that DataRight made use of.

Avoid these input fields:

If you’re a jobfile user, you should not input data on certain fields that you may be familiar with from DataRight 2.56.

Avoid:          Use instead:

PW.Name_line    PW.Name_line1-6

PW.Phone        PW.Phone1-6

PW.SSN          PW.SSN1-6

If your definition file contains a field with one of the names in the Avoid column, you will receive an invalid field error message.
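Before running a converted job, you could check your field list for the names above. The following Python sketch takes a list of field names and reports each one to avoid along with its suggested replacement. The function name is hypothetical, the comparison ignores case as an assumption, and the sketch does not read DEF files itself because their layout is not described in this guide.

```python
# Fields to avoid and their replacements, from the table above.
REPLACEMENTS = {
    "PW.NAME_LINE": "PW.Name_line1-6",
    "PW.PHONE": "PW.Phone1-6",
    "PW.SSN": "PW.SSN1-6",
}


def find_avoided_fields(field_names):
    """Return (field, suggested replacement) pairs for fields to avoid.

    Assumption: field names are compared without regard to case.
    """
    return [
        (name, REPLACEMENTS[name.upper()])
        for name in field_names
        if name.upper() in REPLACEMENTS
    ]


fields = ["PW.Name_line", "PW.Phone1", "PW.SSN", "PW.City"]
for old, new in find_avoided_fields(fields):
    print(f"{old}: use {new} instead")
```

Numbered fields such as PW.Phone1 pass the check because only the unnumbered names are listed as ones to avoid.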


Comparing fields in DataRight IQ and DataRight

Input fields

You can compare PostWare (PW) fields in DataRight IQ and DataRight.

DataRight IQ 7.10   DataRight 2.56   How DataRight IQ differs ([ ] = has been removed)

Pre_Name            Pre_Name
First_Name          First_Name
Mid_Name            Mid_Name
Last_Name           Last_Name
Post_Name           Post_Name
Phone1-6            Phone            Phone1-6 [ Phone ]
SSN1-6              SSN              SSN1-6 [ SSN ]
(none)              Name_Line        [ Name_Line ]
Name_Line1-6        Name_Line1-6
Title1-6            Title1-6
Non_Addr1-6         (none)           Non_Addr1-6
Name_Firm1-6        Name_Firm1-6
Firm1-2             Firm1-2
Firmline1-2         Firmline1-2
Firmloc1-2          Firmloc1-3
(none)              Firstpart1-2     [ Firstpart1-2 ]
(none)              Lastpart1-2      [ Lastpart1-2 ]
Date1-6             Birthdate        Date1-6 [ Birthdate ] Instead, use Date1-6.

Email1-6 Email1-6

Pattern1-4 Pattern1-4

Line1-12 Line1-12

Address Address

Unit Unit

City State ZIP ZIP4

[ZIP10]

ZIP10

Urb Urb

Last_Line Last_Line

List_ID List_ID

Country Country

Delete Delete


Output fields

The following table compares application (AP) fields in DataRight and DataRight IQ.

DataRight IQ 7.10   DataRight 2.56   How DataRight IQ differs

Name_Cnt            Name_Cnt
Namedesig1-6        Namedesig1-2     Namedesig3-6
Name1-6             Name1-6
Pre_Name1-6         Pre_Name1-2      Pre_Name3-6
FirstName1-6        FirstName1-2     FirstName3-6
Mid_Name1-6         Mid_Name1-2      Mid_Name3-6
Last_Name1-6        Last_Name1-2     Last_Name3-6
Mat_Post1-6         Mat_Post1-2      Mat_Post3-6
Oth_Post1-6         Oth_Post1-2      Oth_Post3-6
Title1-6            Title1-6
Dual_name           Dual_name
Name_Line1-6        Name_Line1-6
All_Names           (none)           All_Names
Extra1-12           Extra1-4         Extra5-12
Gender1-6           Gender1-6
Gender_Rec          Gender_Rec
Salute1-6           Salute1-2        Salute3-6
Dual_salut          Dual_salut
Salute_Rec          Salute_Rec
Firm1-2             Firm1-2
Firm_Line1-2        Firm_Line1-2
Firm_Loc1-2         Firm_Loc1-2
Address1-3          Address1-3
City                City
State               State
ZIP                 ZIP
ZIP4                ZIP4
Last_Line           Last_Line
Prim_Range          Prim_Range
Sec_Range           Sec_Range
Prim_Addr           Prim_Addr
Sec_Addr            Sec_Addr
POBox               POBox
POBox_Line          POBox_Line
RR                  RR
RRBox               RRBox
RR_Line             RR_Line
Spec_Name           Spec_Name
URB                 URB
Name_Error          Name_Error
Firm_Error          Firm_Error
Addr_Error          Addr_Error
LL_Error            LL_Error


DataRight IQ 7.10   DataRight 2.56   How DataRight IQ differs

Name_Chng           Name_Chng
Firm_Chng           Firm_Chng
Addr_Chng           Addr_Chng
LL_Chng             LL_Chng
Name_Qual           Name_Qual
Firm_Qual           Firm_Qual
Addr_Qual           Addr_Qual
LL_Qual             LL_Qual
Name_stat           Name_stat
Dname_stat          Dname_stat
Firm_stat           Firm_stat
Addr_stat           Addr_stat
Ll_stat             Ll_stat
File_No             File_No
List_No             List_No
Record_No           Record_No
Rec_no_out          Rec_no_out
NewLine             NewLine
list_name           list_name

Date1-6 Date1-6

Email1-6 EmailUser1-6 EmailAllD1-6 EmailTopD1-6 Email2ndD1-6 Email3rdD1-6 Email4thD1-6 Email5thD1-6 EmailHost1-6 EmailISP1-6

Pattern1-4 PatnLabel1-4 Patnsub1-4_1-5

Intphone1-6 IntCtryCd1-6 IntCityCd1-6 IntPhNum1-6 IntPhDesc1-6

Email1-6 EmailUser1-6 EmailAllD1-6 EmailTopD1-6 Email2ndD1-6 Email3rdD1-6 Email4thD1-6 Email5thD1-6 EmailHost1-6 EmailISP1-6

Pattern1-4 PatnLabel1-4 Patnsub1-4_1-5

Intphone1-6 IntCtryCd1-6 IntCityCd1-6 IntPhNum1-6 IntPhDesc1-6

SSN1-6 SSN1-6

Usphone1-6 USAreaCod1-6 USPhonPre1-6 USPhonLin1-6 USPhonExt1-6 USPhonTyp1-6


More DataRight IQ output fields

DataRight IQ’s other application fields are very similar to DataRight’s. The following tables contrast DataRight IQ’s APU and APC fields with DataRight 2.56’s.

APU fields

DataRight IQ 7.10 DataRight 2.56 How DataRight IQ differs

Namedesig1-6 Name1-6 Pre_Name1-6 FirstName1-6 Mid_Name1-6 Last_Name1-6 Mat_Post1-6 Oth_Post1-6 Title1-6 Dual_name Name_Line1-6 Spec_Name All_Names

Gender1-6 Gender1-6

Firm1-2 Firm_Line1-2 Firm_Loc1-2 Address1-3

City State ZIP ZIP4 Country

Namedesig1-2 Name1-6 Pre_Name1-2 FirstName1-2 Mid_Name1-2 Last_Name1-2 Mat_Post1-2 Oth_Post1-2 Title1-6 Dual_name Name_Line1-6 Spec_Name

Firm1-2 Firm_Line1-2 Firm_Loc1-2 Address1-3

City State ZIP ZIP4

Namedesig3-6

Pre_Name3-6 FirstName3-6 Mid_Name3-6 Last_Name3-6 Mat_Post3-6 Oth_Post3-6

All_Names

Country

Last_Line Last_Line

Prim_Range Sec_Range Prim_Addr Sec_Addr

POBox POBox_Line RR RRBox RR_Line Urb

Prim_Range Sec_Range Prim_Addr Sec_Addr

POBox POBox_Line RR RRBox RR_Line Urb


APC fields The biggest change in the APC fields is that DataRight IQ has many fields that

have

DataRight IQ 7.10 DataRight 2.56 How DataRight IQ differs

Namedesig1-6 Name1-6 Pre_Name1-6 FirstName1-6 Mid_Name1-6 Last_Name1-6 Mat_Post1-6 Oth_Post1-6 Title1-6 Dual_name Name_Line1-6

Namedesig1-2 Name1-6 Pre_Name1-2 FirstName1-2 Mid_Name1-2 Last_Name1-2 Mat_Post1-2 Oth_Post1-2 Title1-6 Dual_name

Name_Line1-6 Spec_Name All_Names

Firm1-2 Firm_Line1-2 Firm_Loc1-2

Firm1-2

Firm_Line1-2

Firm_Loc1-3 [ Firm_Loc3 ]

Record Record

Namedesig3-6

Pre_Name3-6 FirstName3-6 Mid_Name3-6 Last_Name3-6 Mat_Post3-6 Oth_Post3-6

Spec_Name All_Names


Appendix A: Job file comparisons

Here you can make a side-by-side comparison of the entire master job files for DataRight 2.56 and DataRight IQ 7.10c revision 2.

On the following pages, the job file for DataRight 2.56 appears in the left column with the corresponding blocks of DataRight IQ’s job file alongside in the right column.

Change bars

By using change bars, the following pages show you what’s different between DataRight and DataRight IQ. Change bars to the right of the DataRight column show which DataRight blocks and parameters don’t exist in DataRight IQ. Similarly, change bars next to the DataRight IQ column show which DataRight IQ blocks don’t exist in DataRight.


DataRight 2.56

DataRight IQ 7.10

* MASTER JOB FILE FOR DataRight

BEGIN General DataRight 2.56c ==========================

Job Description (to 80 chars)........ =

Job Owner (to 20 chars).............. =

END

BEGIN Execution ========================================

Parsing Mode (NONE/PARSE)............ = Parse

Post to Input File(s) (Y/N).......... = Y

Post to Output File(s) (Y/N)......... = Y

Create Reports (Y/N)................. = Y

Warn Before File Overwrite (Y/N)..... = Y

Show Detailed Process messages (Y/N). = Y

Message Update Increment............. = 1000

Work File Directory.................. =

Create Backup File(s) (Y/N).......... = N

Backup Directory (path).............. =

Cache Buffer Size (SPEED/SPACE)...... = Speed

END

BEGIN Auxiliary Files ==================================

Multi-line Rules (path & mlrules.gcf) = mlrules.gcf

Firm Rules (path & fprules.gcf)...... = fprules.gcf

Parsing Dct (path & parsing.dct)..... = parsing.dct

Address Dct (path & addrln.dct)...... = addrln.dct

Lastline Dct (path & lastln.dct)..... = lastln.dct

Firmline Dct (path & firmln.dct)..... = firmln.dct

City Directory (path & city08.dir)... = city08.dir

ZCF Directory (path & zcf08.dir)..... = zcf08.dir

Capitalization Dct 1(path & pwcap.dct)= pwcap.dct

Capitalization Dct 2(path & file.dct) =

Default ASCII FMT (path & file.FMT).. =

Default DEF (path & file.DEF)........ =

END

BEGIN Input File =======================================

Input File (location & file name).... =

Process Deleted Records (Y/N)........ = n

Starting Record Number............... =

Input Filter (to 916 chars).......... =

Bypass Filter (to 1024 chars)........ = USER

Nth Select Type (USER/AUTO/RANDOM)... = USER

User Nth Select (1.0 - ???).......... =

Maximum Number of Records to Input... =

Character Encoding (see NOTE)........ =

Unicode Conversion Name.............. =

Modify PW Field (source,destination). =

END

* Master Job File For Firstlogic DataRight IQ

BEGIN General DataRight IQ 7.10c =====================

Job Description (to 80 chars)........ =

Job Owner (to 20 chars).............. =

END

BEGIN Execution ======================================

Post to Input File(s) (Y/N).......... = N

Post to Output File(s) (Y/N)......... = Y

Create Reports (Y/N)................. = Y

Create Report Statistics Files (Y/N). = Y

Warn Before File Overwrite (Y/N)..... = Y

Show Detailed Process messages (Y/N). = Y

Message Update Increment............. = 1000

Work File Directory (path)........... =

Create Backup File(s) (Y/N).......... = Y

Backup Directory (path).............. =

Cache Buffer Size (SPEED/SPACE)...... = SPEED

END

BEGIN Parsing Control =================================

Parsing Mode (NONE/PARSE)............ = PARSE

Presumptive Parse Name Lines......... = Y

Presumptive Parse Firm Lines......... = Y

END

*BEGIN Multiline Parsing =============================

Multiline Number (SEE NOTE).......... = DEFAULT

Parse Address (Y/N)................. = Y

Parse Name (Y/N)..................... = Y

Parse Firm (Y/N)..................... = Y

Parse SSN (Y/N)...................... = Y

Parse Date (Y/N)..................... = Y

Parse Intl Phone Number (Y/N)........ = Y

Parse US Phone Number (Y/N).......... = Y

Parse UDPM (Y/N)..................... = Y

Parse Email (Y/N).................... = Y

END

* NOTE: * Multiline Number - DEFAULT = all fields

BEGIN Auxiliary Files ===============================

Rule File (path & drlrules.dat)...... = drlrules.dat

Pattern File (path & drludpm.dat).... = drludpm.dat

SSN File (path & drlssn.dat)......... = drlssn.dat

Email File (path & drlemail.dat)..... = drlemail.dat

Int’l Phone (path & drlphint.dat).... = drlphint.dat

Parsing Dct (path & parsing.dct)..... = parsing.dct

Address Dct (path & addrln.dct)...... = addrln.dct

Lastline Dct (path & lastln.dct)..... = lastln.dct

Firmline Dct (path & firmln.dct)..... = firmln.dct

City Directory (path & city08.dir)... = city08.dir

ZCF Directory (path & zcf08.dir)..... = zcf08.dir

Capitalization Dct 1(path & pwcap.dct)= pwcap.dct

Capitalization Dct 2(path & file.dct) =

Default ASCII FMT (path & file.FMT).. =

Default DEF (path & file.DEF)........ =

END

BEGIN Input File ====================================

Input File (location & file name).... =

Process Deleted Records (Y/N)........ = N

Starting Record Number............... =

Input Filter (to 1024 chars)......... =

Bypass Filter (to 1024 chars)........ = USER

Nth Select Type (USER/AUTO/RANDOM)... = USER

User Nth Select (1.0 - ???).......... =

Input Date Format (Month before Day). = Y

Input Date Format (Year First)....... = Y

Maximum Number of Records to Input... =

Strict Name Order (Y/N).............. = N

Character Encoding (see NOTE)........ =

Unicode Conversion Name.............. =

Modify PW Field (source,destination). =

END

* NOTE: Character Encoding
* UTF_8
* UTF_16
* ISO_8859_1
* US_ASCII
* IBM_1252
* IBM_858
* WINDOWS_1252

DataRight 2.56

DataRight IQ 7.10

BEGIN Unicode Conversion ============================

No difference

BEGIN Input List Description ===========================

No difference

BEGIN Post to Input File ===============================

No difference

BEGIN Undetermined List Action =========================

No difference

BEGIN Standardization/Assignment Control ===============

Standardize Lastline (Y/N)........... = N

Non-Mailing Cities (CONVERT/PRESERVE) = PRESERVE

Case (UPPER/lower/Mixed/SAME)........ = Mixed

Use Generated Prenames (Y/N)......... = N

Female Prenames if Couple (MS/MRS)... = MS

Provide Firm Acronyms (Y/N).......... = N

END

BEGIN Salutation Options ===============================

No difference

BEGIN Unicode Conversion ============================

No difference

BEGIN Input List Description =========================

No difference

*BEGIN Post to Input File ============================

No difference

BEGIN Undetermined List Action =======================

No difference

BEGIN Standardization/Assignment Control =============

Standardize Lastline (Y/N)........... = Y

Non-Mailing Cities (CONVERT/PRESERVE) = PRESERVE

Case (UPPER/lower/Mixed/SAME)........ = Mixed

Use Generated Prenames (Y/N)......... = Y

Female Prename Assignment (MS/MRS)... = MS

Provide Firm Acronyms (Y/N).......... = Y

Compound LastName (COMBINE/PRESERVE). = PRESERVE

Associate Name & Title (Y/N)......... = Y

Associate Name & Title Typed (Y/N)... = Y

Associate Name & Title Multi (Y/N)... = Y

Output Date Format (SEE NOTE)........ = 5

Output Date Delimiter (SEE NOTE)..... = 2

Output Date Zero Pad (Y/N)........... = Y

Use 20xx for Years 00 to ?? (00-99).. = 20

Phone Number Format (SEE NOTE)....... = 1

Standard Phone Extension............. = ext

SSN Delimiter........................ = 4

END

* NOTE:
* Date Format Options:
* Format 1 - YYYY*MM*DD
* Format 2 - YY*MM*DD
* Format 3 - DD*MM*YYYY
* Format 4 - DD*MM*YY
* Format 5 - MM*DD*YYYY
* Format 6 - MM*DD*YY
* Format 7 - DD*MMM*YY
* Format 8 - DD*MMM*YYYY
* Format 9 - MMM*DD*YYYY
* Format 10 - MMM*DD*YY
* Format 11 - YYYYMMDD
* Format 12 - YYMMDD
* Format 13 - DDMMYYYY
* Format 14 - DDMMYY
* Format 15 - MMDDYYYY
* Format 16 - MMDDYY
* Date Delimiter Options:
* Delimiter 1 - ‘’ (No delimiter)
* Delimiter 2 - ‘ ‘ (Space)
* Delimiter 3 - ‘/’
* Delimiter 4 - ‘-’
* Delimiter 5 - ‘\’
* Delimiter 6 - ‘:’
* For Phone Format, choose one of the following options:
* Phone 1 - (xxx)xxx-xxxx (default)
* Phone 2 - xxx-xxx-xxxx
* Phone 3 - xxxxxxxxxx
* International (non US) phone numbers have no formatting option.
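To show how the date options in the NOTE combine, here is a Python sketch that builds an output date from a format number and a delimiter number. It covers formats 1 through 10 only (the delimited ones). Several details are assumptions rather than documented behavior: the asterisk in each format is taken to mark where the delimiter goes, MMM is rendered as a three-letter English abbreviation, zero padding is applied to day and month only, and a two-digit year at or below the "Use 20xx for Years 00 to ??" value is taken to fall in 20xx while others fall in 19xx. All function names are hypothetical.

```python
from datetime import date

# Formats 1-10 from the NOTE, written as ordered lists of parts.
FORMATS = {
    1: ["YYYY", "MM", "DD"], 2: ["YY", "MM", "DD"],
    3: ["DD", "MM", "YYYY"], 4: ["DD", "MM", "YY"],
    5: ["MM", "DD", "YYYY"], 6: ["MM", "DD", "YY"],
    7: ["DD", "MMM", "YY"], 8: ["DD", "MMM", "YYYY"],
    9: ["MMM", "DD", "YYYY"], 10: ["MMM", "DD", "YY"],
}
# Delimiters 1-6 from the NOTE.
DELIMITERS = {1: "", 2: " ", 3: "/", 4: "-", 5: "\\", 6: ":"}
# Assumption: alphabetic months use these abbreviations.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_date(d, fmt=5, delim=2, zero_pad=True):
    """Render a date using a format number and a delimiter number."""
    parts = []
    for token in FORMATS[fmt]:
        if token == "YYYY":
            parts.append(f"{d.year:04d}")
        elif token == "YY":
            parts.append(f"{d.year % 100:02d}")
        elif token == "MMM":
            parts.append(MONTHS[d.month - 1])
        elif token == "MM":
            parts.append(f"{d.month:02d}" if zero_pad else str(d.month))
        else:  # "DD"
            parts.append(f"{d.day:02d}" if zero_pad else str(d.day))
    return DELIMITERS[delim].join(parts)


def expand_two_digit_year(yy, pivot=20):
    """Assumption: years 00..pivot map to 20xx, the rest to 19xx."""
    if not 0 <= yy <= 99:
        raise ValueError("yy must be from 0 to 99")
    return 2000 + yy if yy <= pivot else 1900 + yy


# Format 5 with delimiter 2, the values shown in the master job file.
print(format_date(date(2007, 9, 5)))
print(expand_two_digit_year(7), expand_two_digit_year(85))
```

The defaults in the sketch (format 5, delimiter 2, zero padding on, pivot 20) match the values shown in the Standardization/Assignment Control block above.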

BEGIN Salutation Options ============================

No difference

BEGIN Parse Confidence Criteria ========================

Name/Firm High Confidence (2 to 100). = 87

Name/Firm Good Confidence (1 to 99).. = 75

END

*BEGIN Create Internal Table ============================

No difference

*BEGIN Create Search/Replace Function ===================

*BEGIN Create Internal Table =========================

No difference

*BEGIN Create Search/Replace Function ================


DataRight 2.56

DataRight IQ 7.10

Function Name (to 10 chars).......... =

Internal Table Name (to 20 chars).... =

External Table (path & file name).... =

Search Priority (INTERNAL/EXTERNAL).. = INTERNAL

Search & Repl. Method (FIELD/WORD/STR)= WORD

Default Return Action (ORIG/DFLT).... = ORIG

Default Return Value ................ =

END

*BEGIN Create Scan/Split Table ==========================

No difference

*BEGIN Create Scan/Split Function =======================

Function Name (to 10 chars).......... =

Internal Table Name (to 20 chars).... =

External Table (path & file name).... =

Table Priority (INTERNAL/EXTERNAL)... = INTERNAL

Scan Method (WORD/STR)............... = WORD

Split Method (BEFORE/AFTER/3PART).... = BEFORE

END

BEGIN Date Conversion =================================

Function Name (to 10 chars).......... =

Input Date Format (See NOTE)......... =

Output Date Format (See NOTE)........ =

Output Field Type (DATE/CHARACTER)... = DATE

Use 20xx for Years 00 to ?? (00-99).. = 50

END

* NOTE:
* For Date Format, specify any combination of the ...
* DD for day
* MM or MMM for month (MM is numerals, MMM is alpha)
* YY or YYYY for year
* Punctuation and/or spaces can be used for delimiters.

BEGIN Create File for Output ===========================

Output File (location & file name)... =

File Type (See NOTE)................. = ASCII

Create DEF file (Y/N)................ = N

Rec Format to Clone (path & file name)=

Field (name,length,type[,misc])...... =

END

* NOTE: The following are valid File Types:
* DBASE3
* ASCII
* DELIMITED
* EBCDIC
* RMS (VMS only)
* RMS_FIXED (VMS only)

BEGIN Post to Output File =============================

No difference

Function Name (to 10 chars).................... =

Internal Table Name (to 20 chars).............. =

External Table (path & file name).............. =

Search Priority (INTERNAL/EXTERNAL)............ =

Search & Repl. Method (FIELD/WORD/STR/PATTERN). =

Default Return Action (ORIG/DFLT).............. =

Default Return Value .......................... =

Case Insensitive Search/Replace(Y/N)........... =

END

*BEGIN Create Scan/Split Table =======================

No difference

*BEGIN Create Scan/Split Function ====================

Function Name (to 10 chars).......... =

Internal Table Name (to 20 chars).... =

External Table (path & file name).... =

Table Priority (INTERNAL/EXTERNAL)... =

Scan Method (WORD/STR)............... =

Split Method (BEFORE/AFTER/3PART).... =

Case Insensitive Scan/Split (Y/N).... =

END

BEGIN Create File for Output ========================

Output File (location & file name)... =

File Type (See NOTE)................. =

Create DEF file (Y/N)................ =

Rec Format to Clone (path & file name)=

Field (name,length,type[,misc])...... =

END

* NOTE: The following are valid File Types:
* DBASE3
* ASCII
* DELIMITED
* EBCDIC

BEGIN Post to Output File ===========================

No difference

BEGIN Output Control ===================================

No difference

BEGIN Report Defaults ==================================

Location and File Name/Printer Device =

Existing File (APPEND/REPLACE)....... =

Number of Copies (1 to 10)........... = 1

Case (UPPER/Mixed)................... = Mixed

Page Header Line 1 (to 80 chars)..... =

Page Header Line 2 (to 80 chars)..... =

Page Header Line 3 (to 80 chars)..... =

Page Header Line 4 (to 80 chars)..... =

Printer Init, For Reports ........... =

Printer Reset, For Reports .......... =

Page Length (in lines)............... =

Page Width (in chars)................ =

Top Margin (in lines)................ =

Bottom Margin (in lines)............. =

Left Margin (in chars)............... =

Right Margin (in chars).............. =


BEGIN Output Control ================================

No difference

BEGIN Report Defaults ===============================

Location and File Name/Printer Device =

Existing File (APPEND/REPLACE)....... =

Number of Copies (1 to 10)........... = 1

Case (UPPER/Mixed)................... = Mixed

Page Header Line 1 (to 80 chars)..... =

Page Header Line 2 (to 80 chars)..... =

Page Header Line 3 (to 80 chars)..... =

Page Header Line 4 (to 80 chars)..... =

Printer Init, For Reports ........... =

Printer Reset, For Reports .......... =

Page Length (in lines)............... = 80

Page Width (in chars)................ = 120

Top Margin (in lines)................ =

Bottom Margin (in lines)............. =

Left Margin (in chars)............... =

Right Margin (in chars).............. =


DataRight 2.56

DataRight IQ 7.10

Print Banner Page (JOB/REPORT/NONE).. = none

Suppress Product Name (Y/N).......... = n

END

BEGIN Report: Executive Summary ========================

No difference

BEGIN Report: Job Summary ==============================

No difference

BEGIN Report: Input File Summary =======================

No difference

BEGIN Report: Input List Summary ===========================

No difference

BEGIN Report: List Quality ============================

No difference

BEGIN Report: Output File =============================

No difference

Print Banner Page (JOB/REPORT/NONE).. = none

Suppress Product Name (Y/N).......... = n

High Confidence (2 to 100)........... = 31

Medium Confidence (1 to 99).......... = 21

Name High Confidence (2 to 100)...... = 31

Name Medium Confidence (1 to 99)..... = 21

Firm High Confidence (2 to 100)...... = 66

Firm Medium Confidence (1 to 99)..... = 21

END

*BEGIN Report: Executive Summary =====================

No difference

*BEGIN Report: Job Summary ===========================

No difference

*BEGIN Report: Input File Summary ====================

No difference

*BEGIN Report: Input List Summary ===================

No difference

*BEGIN Report: List Quality =========================

No difference

*BEGIN Report: Output File ==========================

No difference

BEGIN Report: Parsing Error ===========================

No difference

BEGIN Report: Change ==================================

No difference

BEGIN Report: All Records ==============================

No difference

*BEGIN Report: Parsing Error ========================

No difference

*BEGIN Report: Change ===============================

No difference

*BEGIN Report: All Records ===========================

No difference

BEGIN Statistics File ===============================

Job Stats Name (path & file name).... = $jobj.dsj
