BusinessObjects Data Quality Management SDK 4.0 User Manual

Developer Guide
SAP BusinessObjects Data Quality Management SDK 4.0 (14.0.0.1)
2010-12-09
Copyright
© 2010 SAP AG. All rights reserved. SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP Business ByDesign, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries. Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects S.A. in the United States and in other countries. Business Objects is an SAP company. All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary. These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

Contents

Chapter 1 Overview
1.1 Data Quality Management SDK overview
1.1.1 Relationship to Data Services
1.1.2 EmDQ

Chapter 2 Installing Data Quality Management SDK
2.1 Upgrading
2.2 To install the SDK on Windows
2.3 To install the SDK on Unix

Chapter 3 Directory data
3.1 Directory Data
3.1.1 Directory listing and update schedule
3.1.2 U.S. Directory expiration
3.1.3 Where to copy directories
3.1.4 To install and set up SAP Download Manager
3.1.5 To download directory files
3.1.6 To extract directory files

Chapter 4 Cleansing packages
4.1 To install Data Cleanse cleansing packages

Chapter 5 Samples
5.1 Getting started with the samples
5.2 Sample program files
5.3 Building the sample
5.3.1 To build the samples
5.4 Running the samples
5.4.1 To run a sample

Chapter 6 API Reference for C++
6.1 C++ API reference overview
6.1.1 ToLatin1
6.2 CertifiedReportGenerator
6.3 DataRecordSchema
6.4 Date
6.5 DateTime
6.6 EmdqException
6.7 InputDataRecord
6.8 MessageHandler
6.9 MultiRecordTransform
6.10 MultiRecordTransformHelper
6.11 OutputDataRecord
6.12 ProgressHandler
6.13 RecordTransform
6.14 RecordTransformHelper
6.15 StatisticsHandler
6.16 StatisticsSchema
6.17 Time
6.18 TransformFactory

Chapter 7 API Reference for Java
7.1 Java API reference overview
7.2 CertifiedReportGenerator
7.3 DataRecordSchema
7.4 EmdqException
7.5 InputDataRecord
7.6 MessageHandler
7.7 MultiRecordTransform
7.8 MultiRecordTransformHelper
7.9 OutputDataRecord
7.10 ProgressHandler
7.11 RecordTransform
7.12 RecordTransformHelper
7.13 StatisticsHandler
7.14 StatisticsSchema
7.15 TransformFactory

Chapter 8 API Reference for .Net
8.1 .Net API reference overview
8.2 EmDQException
8.3 LogHandler
8.4 MultiRecordProgressHandler
8.5 MultiRecordTransform
8.6 MultiRecordTransformHelper
8.7 RecordTransform
8.8 RecordTransformHelper
8.9 TransformFactory

Chapter 9 Address cleanse concepts
9.1 Address cleanse basics
9.2 Set up the reference files

Chapter 10 USA Regulatory Address Cleanse
10.1 USA Regulatory Address Cleanse overview
10.2 USPS DPV®
10.2.1 Benefits of DPV
10.2.2 DPV security
10.2.3 DPV monthly directories
10.2.4 Required information in the job setup
10.2.5 DPV output fields
10.2.6 Non certified mode
10.2.7 DPV performance
10.2.8 DPV locking
10.2.9 Unlocking DPV
10.2.10 DPV No Stats indicators
10.2.11 DPV Vacant indicators
10.3 USPS eLOT®
10.4 Early Warning System (EWS)
10.4.1 Overview of EWS
10.4.2 EWS directory
10.5 SuiteLink
10.5.1 Benefits of SuiteLink
10.5.2 How SuiteLink works
10.5.3 SuiteLink directory
10.5.4 Improve processing speed
10.6 LACSLink®
10.6.1 Benefits of LACSLink
10.6.2 How LACSLink works
10.6.3 Conditions for address processing
10.6.4 LACSLink directory files
10.6.5 Required information in the job setup
10.6.6 Reasons for errors
10.6.7 LACSLink output fields
10.6.8 Memory usage and caching for LACSLink processing
10.6.9 LACSLink® security
10.6.10 Unlocking LACSLink
10.6.11 USPS Form 3553
10.7 USPS RDI®
10.7.1 How RDI works
10.7.2 RDI directory files
10.7.3 RDI output field
10.7.4 CASS Statement, USPS Form 3553
10.8 Z4Change (USA Regulatory Address Cleanse)
10.8.1 Enable Z4Change for faster processing
10.8.2 Z4Change and USPS rules
10.8.3 Z4Change directory
10.9 Introduction to suggestion lists
10.9.1 Breaking ties
10.9.2 More information is needed
10.9.3 CASS rule
10.10 USPS certifications
10.10.1 To complete USPS certifications
10.10.2 Static directories
10.10.3 CASS self-certification
10.10.4 NCOALink certification
10.11 NCOALink (USA Regulatory Address Cleanse)
10.11.1 The importance of move updating
10.11.2 Benefits of NCOALink
10.11.3 How NCOALink works
10.11.4 Software performance
10.11.5 Address not known (ANKLink)
10.11.6 Getting started with NCOALink
10.11.7 What to expect from the USPS and SAP BusinessObjects
10.11.8 About NCOALink directories
10.11.9 About the NCOALink daily delete file
10.11.10 Output file strategies
10.11.11 Improving NCOALink processing performance
10.11.12 NCOALink log files
10.12 Multiple data source statistics reporting
10.12.1 Data_Source_ID field
10.12.2 USPS Form 3553 and group reporting

Chapter 11 USA Regulatory Address Cleanse Reference
11.1 USA Regulatory Address Cleanse
11.2 System group
11.3 Report and analysis
11.4 Transform performance
11.5 Reference files
11.6 Assignment options
11.7 Standardization options
11.8 Z4 Change options
11.9 CASS Report options
11.10 Suggestion List options
11.10.1 Suggestion List output options
11.10.2 Suggestion list components
11.11 Non Certified options
11.12 USPS license information options
11.12.1 Required options for USPS License Information
11.13 NCOALink options
11.13.1 Processing options
11.13.2 Report Options
11.13.3 Output options
11.13.4 Processing Acknowledgment Form (PAF) Details
11.13.5 Service provider options
11.13.6 Contact Details

Chapter 12 Global Address Cleanse
12.1 Supported countries (Global Address Cleanse)
12.2 Process Japanese addresses
12.2.1 Standard Japanese address format
12.2.2 Special Japanese address formats
12.3 Process Chinese addresses
12.3.1 Chinese address format
12.3.2 Sample Chinese address

Chapter 13 Global Address Cleanse Reference
13.1 Global Address Cleanse
13.2 System group
13.3 Report and analysis
13.4 Reference files
13.5 Country ID Options (Global Address Cleanse)
13.6 Engines
13.7 Standardization Options
13.8 Canada engine
13.8.1 Canada engine Options
13.8.2 Canada engine Report Options
13.8.3 Canada engine Suggestion List Options
13.9 Global Address Country Options
13.10 Global Address Engine Report Options
13.10.1 Report options for Australia
13.10.2 Report options for New Zealand
13.11 USA engine
13.11.1 USA engine Options
13.11.2 USA engine Suggestion Lists Options

Chapter 14 Data Cleanse
14.1 About Cleansing Data
14.2 Ranking and prioritizing parsing engines
14.3 About parsing data
14.3.1 About parsing phone numbers
14.3.2 About parsing dates
14.3.3 About parsing Social Security numbers
14.3.4 About parsing Email addresses
14.3.5 About parsing street addresses
14.4 About standardizing data
14.5 About assigning gender descriptions and prenames
14.6 Prepare records for matching
14.7 Cleansing packages and transforms
14.8 About Japanese data
14.8.1 Text width in output fields
14.8.2 Process Japanese data

Chapter 15 Data Cleanse Reference
15.1 Data Cleanse
15.2 System group
15.3 Cleansing Package
15.4 Engines
15.5 Person standardization options
15.5.1 Gender standardization options
15.6 Firm standardization options
15.7 Other standardization options
15.8 Input word breaker
15.9 Date options
15.10 Parser configuration

Chapter 16 Geocoder
16.1 Geocoding
16.1.1 POI and address geocoding
16.1.2 POI and address reverse geocoding

Chapter 17 Geocoder Reference
17.1 Geocoder
17.2 Directories
17.3 System group
17.4 Geocoder options
17.4.1 Report and analysis
17.4.2 Reference files
Chapter 18 Match .......... 249
18.1 Matching strategies .......... 249
18.1.1 Match samples .......... 249
18.2 Match components .......... 250
18.3 Physical and logical sources .......... 251
18.4 Using sources .......... 252
18.4.1 Source types .......... 253
18.4.2 Source groups .......... 253
18.5 Prepare data for matching .......... 254
18.5.1 Fields to include for matching .......... 255
18.6 Compare tables .......... 255
18.7 Data Salvage .......... 255
18.7.1 Data salvaging and initials .......... 256
18.8 Overview of match criteria .......... 258
18.9 Matching methods .......... 259
18.9.1 Similarity score .......... 259
18.9.2 Rule-based method .......... 260
18.9.3 Weighted-scoring method .......... 261
18.9.4 Combination method .......... 262
Page 10
18.10 Matching business rules .......... 263
18.10.1 Matching on strings, abbreviations, and initials .......... 263
18.10.2 Extended abbreviation matching .......... 263
18.10.3 Name matching .......... 264
18.10.4 Numeric data matching .......... 265
18.10.5 Blank field matching .......... 267
18.10.6 Multiple field (cross-field) comparison .......... 269
18.11 Group statistics .......... 269
18.12 Input source select records .......... 270
Chapter 19 Match Reference .......... 273
19.1 Match XML .......... 273
19.2 System group .......... 273
19.2.1 MatchSettings .......... 274
19.3 Report and analysis .......... 275
19.4 Match control .......... 275
19.4.1 Match levels group .......... 276
19.4.2 Input fields group .......... 276
19.5 Match level group .......... 277
19.6 Match criteria standard keys .......... 277
19.7 Match criteria key layout .......... 281
19.8 Compare table group .......... 286
19.9 Compare match criteria group .......... 288
19.9.1 Standard key match options .......... 290
19.9.2 Criteria definition group .......... 293
19.10 Post match processing group .......... 302
19.11 Group statistics group .......... 303
19.12 Input sources .......... 306
19.12.1 Source groups .......... 309
19.12.2 Input sources / Input fields .......... 310
19.13 Field algorithm numeric difference group .......... 311
19.14 Field algorithm numeric percent difference group .......... 312
19.15 Field algorithm geo proximity group .......... 312
19.16 Input source group statistics group .......... 313
19.17 Input source select record group .......... 314
Chapter 20 Data Quality fields .......... 317
20.1 Input fields .......... 317
20.2 Output fields .......... 318
20.3 Data type support .......... 319
Page 11
20.4 Data Cleanse fields .......... 321
20.4.1 Input fields .......... 321
20.4.2 Output fields .......... 323
20.5 Geocoder fields .......... 329
20.5.1 Input fields .......... 329
20.5.2 Output fields .......... 331
20.6 Global Address Cleanse fields .......... 337
20.6.1 Input fields .......... 337
20.6.2 Output Fields .......... 341
20.6.3 Global Address Cleanse Suggestion List fields .......... 353
20.7 USA Regulatory Address Cleanse fields .......... 359
20.7.1 Input fields .......... 359
20.7.2 Output fields .......... 362
20.8 Match output fields .......... 381
Chapter 21 Data Quality Appendix .......... 387
21.1 Address Cleanse reference .......... 387
21.2 Country ISO codes and assignment engines .......... 387
21.3 Information codes (Global Address Cleanse) .......... 405
21.4 Status Codes (USA Regulatory Address Cleanse) .......... 408
21.5 Quality codes (Global Address Cleanse) .......... 412
21.6 Status codes (Global Address Cleanse) .......... 413
21.7 About ShowA and ShowL (USA and Canada) .......... 418
21.7.1 USA ShowA command line options .......... 419
21.7.2 Canada ShowA command line options .......... 421
21.7.3 Canada ShowL command line options .......... 422
21.8 Geocoder reference .......... 424
Chapter 22 Glossary .......... 425
Index .......... 437
Page 12
Page 13

Overview
1.1 Data Quality Management SDK overview
The Data Quality Management SDK provides a framework and APIs that allow you to write applications that use SAP BusinessObjects Data Quality technology, such as parsing, standardization, correction, and matching of data. You can use it to create applications that target the specific Data Quality functionality you want to employ with an in-process integration.
1.1.1 Relationship to Data Services
This product provides functionality similar to SAP BusinessObjects Data Services, but deploys that technology as an API.
The Data Quality Management SDK provides a lighter footprint than Data Services. This product requires no server components (either from SAP or a third party) or user interface to access the Data Quality functionality.
Many customers nevertheless choose to use this product in conjunction with Data Services. You can use the same release version of Data Services to configure transform options in the Data Services Designer and create a configuration XML file for use with this SDK. To create the file, right-click a transform in the Data Services Designer and select Export for DQM SDK. For more information on using the Data Services Designer, see the Data Services documentation.
When you use Data Services as a configuration tool for the Data Quality Management SDK, Data Services does not support the creation of a change log for changes to the configuration. That is, you can employ the Data Services central repository concept to manage changes to the Data Quality transforms, but no change log is created. Instead, the developer must implement a change log within a custom application created using the SDK.
1.1.2 EmDQ
Page 14
Many names in this product contain the letters "emdq" (or, with mixed case, "EmDQ"). You can see this convention in namespaces, folder names, and file names. Because the Data Quality SDK is an embedded, in-line processing data quality solution, you can think of the letters emdq as standing for Embedded Data Quality.
Page 15

Installing Data Quality Management SDK
Installing Data Quality Management SDK is as simple as running a self-extracting executable.
2.1 Upgrading
If you are upgrading this product from the previous release, you can install this version while the existing version remains on the same machine. Do not overwrite the files from the previous version. The default installation location is different in this version.
This product adds a new method to the TransformFactory class, UpgradeTransformSettings(), that makes transform settings built with the previous version of this product compatible with this version.
For information about using UpgradeTransformSettings(), see the TransformFactory class documentation for C++, Java, or .Net.
2.2 To install the SDK on Windows
Before installing this product, download the appropriate package file (named *.exe) from the SAP Service Marketplace.
1. Run the executable file. The "Welcome" screen appears.
   Tip: If the installer does not start when you run the executable, you can begin the installation routine by running setup.exe, which is contained in the archive.
2. Click Next. The "License Agreement" screen appears.
3. After reading and indicating that you accept the license agreement, click Next. The "Specify the destination folder" screen appears.
4. After choosing a folder in which to install the files for this product, click Next.
Page 16
The "Start Installation" screen appears.
5. Click Next. The installation routine extracts the files for this product and places them in the folder you specified. When it finishes, the "Install Complete" screen appears.
6. Click Finish to dismiss the installer.
The files for the SDK are now installed.
You must also install the Addressing Directories and Cleansing Packages before using the address correction and data cleanse functionality of this product, or the sample applications.
2.3 To install the SDK on Unix
Before installing this product, download the appropriate package file (named *.tgz) from the SAP Service Marketplace.
1. Unpack the *.tgz file. The files required for installation are copied to your system.
2. Run setup.sh. The "Destination Path" screen appears.
3. Type a destination path for the installation. The "Welcome" screen appears.
   Note: You must choose a different path than the default (which is the current working directory).
4. Press Enter to dismiss the "Welcome" screen. The "License Agreement" screen appears.
5. Press Enter to accept the license agreement. The installation routine places the files for this product in the path you specified until the installation is complete.
The files for the SDK are now installed.
You must also install the Addressing Directories and Cleansing Packages before using the address correction and data cleanse functionality of this product, or the sample applications.
Page 17

Directory data

3.1 Directory Data
To correct addresses and assign codes with SAP BusinessObjects Data Quality Management SDK, the transforms rely on directories, or databases. When this product uses the directories, it’s similar to the way that you use the telephone directory. A telephone directory is a large table in which you look up something you know—someone’s name—and locate something that you don’t know—their phone number.
Depending on which option you own, some disks or online packages that you receive may contain extra files in addition to your directories. You may not need to use all of these reference files depending on which transforms or options you use. For example, you may see an Extract folder. If you do not need these extra files, do not copy them to your computer. For information about extra folders, see the ReadMe.txt file included with the reference files.
3.1.1 Directory listing and update schedule
Page 18
Directory type | Directory filename | Approximate size | Updated: Monthly (M), Bimonthly (B), Quarterly (Q)
ZIP4 and Auxiliary Directories | zip4us.dir | 699 MB | M, B
 | cityxx.dir | 2 MB |
 | zcfxx.dir | 2 MB |
 | revzip4.dir | 1 MB |
 | zip4us.rev | 97 MB |
 | zip4us.shs | 4 MB |
Early Warning System Directory | ewyymmdd.dir | 1 MB | Weekly
DPV Data | dpv_path | 653 MB | M
Enhanced Line of Travel Directory | elot.dir | 486 MB | M
Canada engine - Address Data | canada.dir, cancity.dir, canfsa.dir, canpci.dir | 42 MB | M
Australia engine - Address Data | apc.dir, aucity.dir, aus.dir | 200 MB | M, Q
Global Address engine and EMEA Engine - Data | all files | up to 12.2 GB (for all countries) | Q
Centroid Level Geo Data | cgeox.dir | 720 MB | Q
Address Level Geo Data | ageox.dir | 4.67 GB | Q
Note:
For the Global Address engine and EMEA engine, you will receive files only for those countries your company has purchased.
Page 19
Directory type | Directory filename | Approximate size | Updated: Monthly (M), Bimonthly (B), Quarterly (Q)
Geocoder | geo_addr_ca_vendorx.dir | Canada: 1.6 GB | Q
 | geo_cent_ca_vendorx.dir | Canada: 1 MB |
 | geo_addr_fr_vendorx.dir | France: 1.6 GB |
 | geo_cent_fr_vendorx.dir | France: 6 MB |
 | geo_addr_us_vendorx<num>.dir | USA: < 2 GB |
 | geo_cent_us_vendorx<num>.dir | USA: < 2 GB |
Japan engine - Address Data | ga_region_jp_paf.dir, ga_loc12_jp_paf.dir, ga_loc34_jp_paf.dir, ga_dp_jp_paf.dir | 248 MB | Q
LACSLink | all files | 461 MB |
Z4Change Data | z4change.dir | 199 MB | M
3.1.2 U.S. Directory expiration
We publish and distribute the ZIP4 and supporting directory files under a non-exclusive license from the USPS. The USPS requires that our software disable itself when a user attempts to use expired directories.
Page 20
If you do not install new directories as you receive them, the software issues a warning in the log files when the directories are due to expire within 30 days. To ensure that your projects are based on up-to-date directory data, it's recommended that you heed the warning and install the latest directories.
Note:
Incompatible or out-of-date directories can render the software unusable. The directories are lookup files used by SAP BusinessObjects solution portfolio software. The system administrator must install monthly or bimonthly directory updates to ensure that they are compatible with the current software.
Expiration schedule
You can choose to receive updated U.S. national directories on a monthly or bimonthly basis. Bimonthly updates are distributed during the even months. Directory expiration guidelines are:
ZIP4 and Auxiliary Directories expire on the first day of the fourth month after directory creation. When running in Non-Certified mode, ZIP4 and Auxiliary Directories expire on the first day of the fourteenth month after directory creation.
LACSLink directories expire 105 days after directory creation.
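The expiration guidelines above are simple date arithmetic, and a short sketch can make them concrete. The following C++ is not part of the SDK: the CivilDate type and all function names are hypothetical, and the code assumes that "the Nth month after directory creation" means the creation month plus N calendar months. The software enforces expiration itself; this only estimates the dates for planning directory updates.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this sketch; it is not part of the SDK.
struct CivilDate {
    int year;
    unsigned month;  // 1..12
    unsigned day;    // 1..31
};

// Days since 1970-01-01 in the proleptic Gregorian calendar.
std::int64_t DaysFromCivil(int y, unsigned m, unsigned d) {
    y -= (m <= 2) ? 1 : 0;
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Inverse of DaysFromCivil.
CivilDate CivilFromDays(std::int64_t z) {
    z += 719468;
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const unsigned doe = static_cast<unsigned>(z - era * 146097);
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const int y = static_cast<int>(yoe) + static_cast<int>(era) * 400;
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const unsigned mp = (5 * doy + 2) / 153;
    const unsigned d = doy - (153 * mp + 2) / 5 + 1;
    const unsigned m = mp < 10 ? mp + 3 : mp - 9;
    return CivilDate{y + (m <= 2 ? 1 : 0), m, d};
}

// First day of the month that lies `months` calendar months after the creation month.
CivilDate FirstOfMonthAfter(const CivilDate& created, unsigned months) {
    assert(created.month >= 1 && created.month <= 12);
    const unsigned zeroBased = (created.month - 1) + months;
    return CivilDate{created.year + static_cast<int>(zeroBased / 12), zeroBased % 12 + 1, 1};
}

// ZIP4 and Auxiliary Directories: fourth month in certified mode,
// fourteenth month in Non-Certified mode (assumed reading of the guideline).
CivilDate ZipDirectoryExpiry(const CivilDate& created, bool certifiedMode) {
    return FirstOfMonthAfter(created, certifiedMode ? 4u : 14u);
}

// LACSLink directories: 105 days after directory creation.
CivilDate LacsLinkExpiry(const CivilDate& created) {
    return CivilFromDays(DaysFromCivil(created.year, created.month, created.day) + 105);
}
```

Under these assumptions, a directory created on 2010-12-09 would give 2011-04-01 for ZIP4 in certified mode, 2012-02-01 in Non-Certified mode, and 2011-03-24 for LACSLink.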
3.1.2.1 U.S. National and Auxiliary files
The self-extracting U.S. National and Auxiliary files are named as follows.
Page 21
Directory name | Zip file name
2004-2008 U.S. National directory | us_dirs_2004.exe
U.S. Address-level GeoCensus | us_ageo1_2.exe, us_ageo3_4.exe, us_ageo5_6.exe, us_ageo7_8.exe, us_ageo9_10.exe
U.S. Centroid-level GeoCensus | us_cgeo.exe, us_cgeo1.exe, us_cgeo2.exe
3.1.3 Where to copy directories
We recommend that you install the directory files in the reference_data folder for each transform created during the Data Quality Management SDK installation. By default, the software looks for directories in <LINK_DIR>\DataQuality\reference_data (Windows) or <LINK_DIR>/DataQuality/reference_data (Unix). If you place your directories in a different location, you must change the individual reference file option values in the XML files.
3.1.4 To install and set up SAP Download Manager
Before you can download directory files, you need to install and set up SAP Download Manager.
To install and set up SAP Download Manager:
1. Access the SAP Service Marketplace (SMP): http://service.sap.com/bosap-support
2. Select Downloads.
3. Select Download Basket.
4. Click the Get Download Manager button.
5. Follow the steps to install and set up the Download Manager.
Page 22
3.1.5 To download directory files
The directories are available for download from the SAP Service Marketplace (SMP). To download directories:
1. Access the SAP Service Marketplace (SMP) site: http://service.sap.com/bosap-support
2. Select Software Downloads.
3. From the left pane, select Downloads > SAP Software Distribution Center > Installations and Upgrades > My Company's Application Components. A list of your company's applications and any license-free products or components appears.
4. Select the files you want to download and add them to the Download Basket. The files you select are placed in the Download Basket.
5. To access the Download Basket, click Download Basket.
6. To access the Download Manager documentation, click Get Download Manager.
7. Follow the steps included in the Download Manager documentation to download the directory files.
3.1.6 To extract directory files
The steps listed here describe how to install the zipped directories using Info-Zip. If you use a different unzip tool, see the unzip procedure included with that tool.
1. Copy the self-extracting directory files manually from the download package to the \temporary\ folder.
2. Locate and double-click the file. The files are extracted and placed in the \temporary\ folder.
3. Copy the directory files from the \temporary\ folder to the location where you keep your directories.
4. Copy the zipped directory files manually from the location of the extracted files to the location where you keep your directories.
5. Type unzip filename.zip -d outputfolder. For example, for ZIP4US, type unzip us_dirs_2004.zip -d "/SAP BusinessObjects/SAP BusinessObjects Data Quality Management SDK/linux_x86_32/DataQuality/gac".
6. Repeat these steps for each required file.
Page 23

Cleansing packages
4.1 To install Data Cleanse cleansing packages
Before you can install cleansing packages, you must have successfully installed this product and downloaded SBOP DQM Cleansing Packages for the appropriate platform from the SAP Service Marketplace.
Installing a cleansing package prepares your system to use Data Cleanse to control parsing of person and firm data for the specific cleansing package.
You must install the cleansing packages to the same relative path as the directory holding the Data Cleanse reference files.
1. Go to the directory where you downloaded cleansing packages from the SAP Service Marketplace, and run the setup file (setup.exe or setup.sh) to start the installer. The DQM Cleansing Package installer starts.
2. Read the explanatory text on screen.
3. Review and accept the license agreement.
4. Choose DQM SDK.
5. Specify the destination folder.
   Note: For Windows, the destination folder is determined by the location of the installation of the SDK itself and cannot be altered. For Unix, you must ensure that the destination directory is in the same relative path as the installation of the SDK.
6. Choose the operating system for the machine on which you are installing the cleansing packages.
7. Choose the cleansing packages that you want to install.
8. Click the Disk Cost button (or, in Unix, select Disk Cost) to learn how much disk space is needed for this installation and how much disk space is available.
9. Proceed until installation is complete.
The cleansing packages that you selected are now installed and available for you to use.
Page 24
Page 25

Samples
5.1 Getting started with the samples
The best way to get started with this SDK is to examine, build, and run the provided sample programs.
The installation routine places folders containing the sample program files for each supported operating system and language (for example, <install_path>\windows_32\Java\samples). The \samples folder contains files needed for integrating the public API and running Data Quality transforms built using this SDK.
5.2 Sample program files
The following is the folder structure for the source code, build scripts, and run scripts that the samples use to demonstrate a given Data Quality transform.
cpp\ - contains files needed to integrate this SDK into a C++ environment.
    inc\ - contains the SDK public headers you will include in your code.
    lib\ - contains the SDK public libraries you need to link against for C++ code. You need not link the certifiedreportgenerator.lib library unless you intend to use the certified report generator.
    samples\ - contains C++ sample drivers for several Data Quality transforms.
dotNet\ - contains files needed to integrate the SDK into a .Net environment.
    bin\ - contains the SDK public library you need to link against for .Net code.
    samples\ - contains .Net sample drivers for several Data Quality transforms.
Java\ - contains files needed to integrate the SDK into a Java environment.
    bin\ - contains the SDK public library you need to link against for Java code.
    samples\ - contains Java sample drivers for several Data Quality transforms.
bin\ - contains many of the shared libraries and binaries needed to run the Data Quality transforms included in this package. This directory must be in your PATH and shared library load environment variable (such as LD_LIBRARY_PATH on Linux) for the shared objects and other required files to be loaded properly. The run scripts for the included samples set these variables for you.
DataQuality\ - contains many files required for the Data Quality transforms to run.
redist\ - contains the MSVC VS 2005 redistributable package that you must have installed to run the Windows executables.
Page 26
xsd\ - contains the XSD files for all Data Quality transform configuration files. Note that the xsi:schemaLocation element in your configuration XML files must be able to locate these XSD files.
5.3 Building the sample
All the build scripts included in a samples folder assume you are running from a command prompt with your compiler paths set up correctly. On Windows, for C++ and .Net builds, the devenv and dumpbin executables from Visual Studio 2005 SP1 or later should be available in your PATH environment variable. Likewise, on Unix platforms, the appropriate compiler for that platform should be available.
For Java projects, JAVA_HOME must be set to a compatible JDK location so that the javac executable can be found.
For .Net projects, ensure the <install_path>\<platform>\bin folder is added to your PATH environment variable prior to launching Visual Studio. The samples require these libraries to be found to create an instance of an SDK transform object.
5.3.1 To build the samples
To build all samples for a particular programming language, use the build.bat (Windows) or build.sh (Unix) script.
1. Navigate to your desired language folder.
2. Navigate to the samples folder.
3. Run the build.bat (Windows) or build.sh (Unix) script to build all samples. The script builds the samples.
Example:
To build all C++ samples on Windows 32 bit, navigate to <install_path>\windows_32\cpp\samples, and type build.bat.
To build a specific sample, navigate one level further to the transform's subdirectory and run the build.bat (Windows) or build.sh (Unix) script within that subdirectory.
5.4 Running the samples
All of the run scripts included in a samples folder assume you are running from a command prompt.
Page 27
For Java projects, JAVA_HOME must be set to a compatible JDK or JRE location so that the Java executable can be found.
5.4.1 To run a sample
Before running the samples, you must have installed the address directory reference files and cleansing packages, and built the sample.
Each run script takes at least the configuration XML file as the first command line argument. Multi-record transforms such as Match also require you to list the input .txt file as the second argument to the run script.
1. Navigate to the sample transform directory.
2. Run the run.bat (Windows) or run.sh (Unix) run script with the necessary command line arguments to set up your environment and run the sample.
Example:
To run the Global Address Cleanse C++ sample on Windows 32 bit, navigate to the folder <install_location>/windows_32/cpp/samples/gac and type run.bat EmDQ_GlobalAddressCleanse.xml.
To run the Match C++ sample on Windows 32 bit, navigate to the folder <install_location>/windows_32/cpp/samples/match and type run.bat EmDQ_NameAddressMatch.xml MatchNameAddrUSSingleSource.txt.
Page 28
Page 29

API Reference for C++
6.1 C++ API reference overview
This section details the API for the C++ implementation.
The C++ namespace is EmDQ.
6.1.1 ToLatin1
The ToLatin1 method converts Unicode data to standard Latin1 data. It is defined in the file utility.h.
char* ToLatin1(const uint16_t* str, char* dstBuf, int32_t dstBufLength, char invalidCharReplacement = '?');
Parameter                      Description
str [IN]                       The string to convert
dstBuf [OUT]                   The buffer to receive Latin1 chars and a NULL terminator
dstBufLength [IN]              The size of dstBuf in bytes
invalidCharReplacement [IN]    The character used to replace invalid Latin1 characters (0 means to drop invalid characters)
This method converts the UCS2 characters of the str object to Latin1 characters and copies the Latin1 characters into the passed-in buffer. A NULL terminator is always added, so there must be room in the buffer for the entire string plus a NULL terminator.
A UCS2 value greater than 255 is considered to be an invalid Latin1 character. An invalid Latin1 character is replaced with the invalidCharReplacement value. If this value is set to 0, then the invalid Latin1 character is deleted.
No locale-specific conversions are made.
Returns the dstBuf parameter.
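The conversion rules above can be illustrated with a small standalone sketch. ToLatin1Sketch is a hypothetical stand-in written for this example, not the SDK function, and stopping early when the buffer is full is an assumption (the manual only says the buffer must be large enough).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in that follows the documented rules for ToLatin1.
// It is NOT the SDK implementation.
char* ToLatin1Sketch(const uint16_t* str, char* dstBuf, int32_t dstBufLength,
                     char invalidCharReplacement = '?') {
    assert(str != nullptr && dstBuf != nullptr && dstBufLength > 0);
    int32_t out = 0;
    // Assumption: stop when the buffer is full, keeping one byte for the terminator.
    for (const uint16_t* p = str; *p != 0 && out < dstBufLength - 1; ++p) {
        if (*p <= 255) {
            // Values up to 255 map directly onto Latin1.
            dstBuf[out++] = static_cast<char>(static_cast<unsigned char>(*p));
        } else if (invalidCharReplacement != 0) {
            // Values above 255 are invalid Latin1 and are replaced.
            dstBuf[out++] = invalidCharReplacement;
        }
        // With a replacement value of 0, the invalid character is dropped.
    }
    dstBuf[out] = '\0';  // A NULL terminator is always added.
    return dstBuf;       // The destination buffer is returned.
}
```

For example, converting the UCS2 sequence 0x41, 0xE9, 0x4E2D, 0x62 yields four Latin1 characters with '?' in the third position, or three characters when the replacement value is 0.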
6.2 CertifiedReportGenerator
Class CertifiedReportGenerator is an implementation of the public StatisticsHandler interface that can generate the certified mailing reports: Statement of Address Accuracy (SERP), Address Matching Processing Summary (AMAS), and Coding Accuracy Support System (CASS) 3553.
void SetReportFile(const char* fileName, REPORT_TYPE report);
Parameter        Description
fileName [IN]    A valid filename and path where the report is to be generated
report [IN]      Which report to write to fileName
This method tells CertifiedReportGenerator the path and file name at which to create the report. Valid report types are REPORT_3553, REPORT_AMAS, and REPORT_SERP.
If the specified file exists, the previous version of the file is overwritten. If the path to the specified file does not exist, the file is not created and an error occurs.
This method must be called for each report you want generated, prior to using any transform.
Related Topics
StatisticsHandler
6.3 DataRecordSchema
Class DataRecordSchema defines the layout of a Data Record.
int GetFieldCount();
This method returns the number of fields defined in the data record.
int GetFieldIndex(const uint16_t* fieldName);
Parameter         Description
fieldName [IN]    The name of the field to get
This method returns the 0-based index of the fieldName field, which can be used in the other methods that take a field index as a parameter. Field names are case-insensitive (that is, NAME is equivalent to name). If fieldName is invalid, a value of -1 is returned.
int GetFieldIndex(const char* fieldName);
Parameter         Description
fieldName [IN]    The name of the field to get
This method returns the 0-based index of the fieldName field, which can be used in the other methods that take a field index as a parameter. Field names are case-insensitive (that is, NAME is equivalent to name). If fieldName is invalid, a value of -1 is returned.
int GetFieldLength(int fieldIndex);
Parameter          Description
fieldIndex [IN]    The field for which to get information (0-based)
This method returns the length of the fieldIndex field; on a non-fatal error, it returns 0.
const uint16_t* GetFieldName(int fieldIndex);
Parameter          Description
fieldIndex [IN]    The field for which to get information (0-based)
This method returns the name of the fieldIndex field; on a non-fatal error, it returns 0.
bool GetFieldName(int fieldIndex, char* buffer, int bufferSize, char unicodeReplacement = '?');
Parameter                  Description
fieldIndex [IN]            The field for which to get information (0-based)
buffer [OUT]               The buffer where the field name is to be placed
bufferSize [IN]            The size of the buffer
unicodeReplacement [IN]    The character that is substituted when a character is encountered that cannot be represented as Latin1; set to 0 to remove the character instead
This method gets the name of the fieldIndex field. Returns TRUE if successful; on a non-fatal error, it returns FALSE.
DATATYPE GetDataType(int fieldIndex, bool& status);
Parameter          Description
fieldIndex [IN]    The field for which to get information (0-based)
status [OUT]       TRUE upon success; FALSE if a non-fatal error has occurred
This method gets the datatype of the fieldIndex field.
6.4 Date
Class Date represents a datatype. It is used to hold a date value for a record field.
Date();                   Default Constructor
Date(const Date& rhs);    Copy Constructor
bool SetDate(const char* dateStr);
Parameter       Description
dateStr [IN]    The date in the format YYYYMMDD
This method sets the date of this object. The string must be in the format YYYYMMDD, where YYYY is the year, MM is the month and DD is the day. The year can range from 0 to 9999. The month can range from 1 to 12. The day can range from 1 to 31. If the string is not formatted correctly, the date value is not changed. Returns TRUE if the date is valid; otherwise, it returns FALSE.
bool SetDate(int year, int month, int day);
Parameter     Description
year [IN]     The year value from 0 to 9999
month [IN]    The month value from 1 to 12
day [IN]      The day value from 1 to 31
This method sets the date of this object. Returns TRUE if the date is valid; otherwise, it returns FALSE.
bool SetDay(int day);
Parameter    Description
day [IN]     The day value from 1 to 31
This method sets the day of this object. The day can range from 1 to 28-31, depending on the month. Returns TRUE if the day is valid for the month and year; otherwise, it returns FALSE.
bool SetMonth(int month);
Parameter     Description
month [IN]    The month value from 1 to 12
This method sets the month of this object. The month can range from 1 to 12. Returns TRUE if the month is valid for the day and year; otherwise, it returns FALSE.
bool SetYear(int year);
Parameter    Description
year [IN]    The year value from 0 to 9999
This method sets the year of this object. The year can range from 0 to 9999. Returns TRUE if the year is valid for the day and month; otherwise, it returns FALSE.
void GetDate(char* dateStr, int bufferSize) const;
Parameter          Description
dateStr [OUT]      The date value in YYYYMMDD format
bufferSize [IN]    The size of the dateStr destination buffer
This method gets the date value of this object. The date is returned as a string with the format YYYYMMDD, where YYYY is the year, MM is the month and DD is the day.
int GetDay() const;
This method gets the day value of this object. It returns the day value from 1 to 31.
int GetMonth() const;
This method gets the month value of this object. It returns the month value from 1 to 12.
int GetYear() const;
This method gets the year value of this object. It returns the year value from 0 to 9999.
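The validity rules described for the Date setters can be sketched as a standalone check. The helper names below (IsLeapYear, DaysInMonth, IsValidDate, ParseDate) are hypothetical and not part of the SDK, and the use of Gregorian leap-year rules for every year from 0 to 9999 is an assumption; the manual does not state which calendar rules the Date class applies.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical helpers for illustration only; none of these are SDK functions.
// Assumption: Gregorian leap-year rules apply to the whole 0..9999 range.
bool IsLeapYear(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

int DaysInMonth(int year, int month) {
    assert(month >= 1 && month <= 12);
    static const int kDays[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    return (month == 2 && IsLeapYear(year)) ? 29 : kDays[month - 1];
}

// Mirrors "the day is valid for the month and year".
bool IsValidDate(int year, int month, int day) {
    if (year < 0 || year > 9999 || month < 1 || month > 12) return false;
    return day >= 1 && day <= DaysInMonth(year, month);
}

// Parses a YYYYMMDD string. On malformed or invalid input the outputs are
// left unchanged, like the documented "the date value is not changed".
bool ParseDate(const char* dateStr, int& year, int& month, int& day) {
    if (dateStr == nullptr || std::strlen(dateStr) != 8) return false;
    for (int i = 0; i < 8; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(dateStr[i]))) return false;
    }
    auto num = [dateStr](int pos, int len) {
        int value = 0;
        for (int i = 0; i < len; ++i) value = value * 10 + (dateStr[pos + i] - '0');
        return value;
    };
    const int y = num(0, 4), m = num(4, 2), d = num(6, 2);
    if (!IsValidDate(y, m, d)) return false;
    year = y;
    month = m;
    day = d;
    return true;
}
```

An application could run a check like this before calling SetDate, so that malformed input is reported with its own message instead of relying only on the FALSE return value.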
6.5 DateTime
Class DateTime represents a datatype. It is used to hold a date and time value for a record field. This class inherits from both the Date and the Time classes, so the methods of those classes are also available.
DateTime();                       Default Constructor
DateTime(const DateTime& rhs);    Copy Constructor
bool SetDateTime(const char* dateTimeStr);
Parameter           Description
dateTimeStr [IN]    The date time in the format YYYYMMDDHHMMSSF
This method sets the date time of this object. The string must be in the format YYYYMMDDHHMMSSF, where YYYY is the year, the first MM is the month, DD is the day, HH is the hours, the second MM is the minutes, SS is the seconds, and F is an optional fraction digit, which can be repeated. The year can range from 0 to 9999. The month can range from 1 to 12. The day can range from 1 to 31. The hours can range from 0 to 23. The minutes can range from 0 to 59. The seconds can range from 0 to 59. Each optional fraction digit can range from 0 to 9. If the string is not formatted correctly, the date time value is not changed.
Returns TRUE if the date time is valid; otherwise, it returns FALSE.
void GetDateTime(char* dateTimeStr, int bufferSize) const;
Parameter            Description
dateTimeStr [OUT]    The date time in the format YYYYMMDDHHMMSSF
bufferSize [IN]      The size of the dateTimeStr buffer, in characters
This method gets the date time value of this object. The date time is returned as a string with the format YYYYMMDDHHMMSSF, where YYYY is the year, the first MM is the month, DD is the day, HH is the hours, the second MM is the minutes, SS is the seconds, and F is an optional fraction digit, which can be repeated.
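The YYYYMMDDHHMMSSF layout can be made concrete with a small splitter. DateTimeParts and SplitDateTime are hypothetical names used only for this sketch; it applies the per-field ranges listed above and keeps any trailing fraction digits as text, which is an assumption about how an application might want to consume them.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical holder for the pieces of a YYYYMMDDHHMMSSF string (not an SDK type).
struct DateTimeParts {
    int year, month, day, hours, minutes, seconds;
    std::string fraction;  // zero or more fraction digits, kept as text
};

// Splits the string and applies the documented per-field ranges.
// On failure, out is left unchanged.
bool SplitDateTime(const std::string& s, DateTimeParts& out) {
    if (s.size() < 14) return false;  // the 14 fixed digits are mandatory
    for (char c : s) {
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
    }
    auto num = [&s](std::size_t pos, std::size_t len) {
        return std::stoi(s.substr(pos, len));
    };
    DateTimeParts p{num(0, 4), num(4, 2), num(6, 2),
                    num(8, 2), num(10, 2), num(12, 2), s.substr(14)};
    assert(p.year >= 0 && p.year <= 9999);  // four digits always fit this range
    if (p.month < 1 || p.month > 12 || p.day < 1 || p.day > 31) return false;
    if (p.hours > 23 || p.minutes > 59 || p.seconds > 59) return false;
    out = p;
    return true;
}
```

With the input 20101209235959123, the fields are 2010, 12, 9, 23, 59, 59 and the fraction digits are 123.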
6.6 EmdqException
Class EmdqException is the exception class thrown by all public interfaces of this product. This class is required for processing. It is defined in the file exception.h.
virtual const uint16_t* GetMessage() const = 0;
This method returns this object's message in UCS2 characters.
virtual const char* GetMessageId() const = 0;
This method returns the message ID of this exception object. The message ID is in the format CCCNNN, where C is an alpha character and N is a numeric character. The CCC represents the source of the error. The NNN is the message number. For example, REC001 is the first message for the DataRecord class.
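An application that logs or filters exceptions may want to split the CCCNNN message ID into its two parts. The following sketch shows one way to do that; MessageIdParts and SplitMessageId are hypothetical names and are not provided by the SDK.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch (not an SDK type).
struct MessageIdParts {
    std::string source;  // CCC: the source of the error, for example "REC"
    int number;          // NNN: the message number
};

// Splits a message ID of the documented form CCCNNN.
// Returns false, leaving out untouched, if the ID does not match that form.
bool SplitMessageId(const std::string& id, MessageIdParts& out) {
    if (id.size() != 6) return false;
    for (std::size_t i = 0; i < 6; ++i) {
        const unsigned char c = static_cast<unsigned char>(id[i]);
        if (i < 3 ? !std::isalpha(c) : !std::isdigit(c)) return false;
    }
    out.source = id.substr(0, 3);
    out.number = std::stoi(id.substr(3));
    assert(out.number >= 0 && out.number <= 999);
    return true;
}
```

For the manual's own example, REC001 splits into the source REC and the message number 1.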
6.7 InputDataRecord
Class InputDataRecord is the main interface to the Input Data Record functionality. It inherits from the superclass DataRecord. This class is required for processing.
void Clear();
This method clears all of the fields of the data record. Each character field has a data length of 0.
void SetFieldData(int fieldIndex, const uint16_t* fieldData, int fieldDataLength = -1);
Parameter               Description
fieldIndex [IN]         The field to set (0-based)
fieldData [IN]          The field data to set
fieldDataLength [IN]    The number of UCS2 characters in fieldData
This method sets the data of the fieldIndex field of the data record. fieldDataLength UCS2 characters are copied from the fieldData buffer to the specified data record field.
If the field is set as null, the null will be cleared.
If fieldDataLength is -1, which is the default, then fieldData is assumed to be NULL terminated and its length will be calculated.
void SetFieldData(int fieldIndex, const char* fieldData, int fieldDataLength = -1);
Parameter               Description
fieldIndex [IN]         The field to set (0-based)
fieldData [IN]          The field data to set
fieldDataLength [IN]    The number of Latin1 characters in fieldData
This method sets the data of the fieldIndex field of the data record. fieldDataLength Latin1 characters are copied from the fieldData buffer to the specified data record field.
If the data is longer than the specified field's length, the data is truncated. If the data is shorter than the specified field's length, the data is copied left-justified into the field.
If the field is set as null, the null will be cleared.
If fieldDataLength is -1, which is the default, then strlen() will be used to determine the length of fieldData.
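The length and truncation rules above can be modeled in a few lines. StoreFieldData is a hypothetical function that returns what a field of a given length would hold; it is not how the SDK stores data, and treating every negative length like -1 is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical model of the documented copy rules (not SDK code).
// fieldLength is the defined length of the destination field.
std::string StoreFieldData(const char* fieldData, int fieldDataLength,
                           std::size_t fieldLength) {
    assert(fieldData != nullptr);
    // A length of -1 means "NULL terminated, measure with strlen()".
    // Assumption: any negative value is treated the same way.
    const std::size_t given = (fieldDataLength < 0)
                                  ? std::strlen(fieldData)
                                  : static_cast<std::size_t>(fieldDataLength);
    // Data longer than the field is truncated; shorter data is kept
    // left-justified, so it simply starts at position 0.
    return std::string(fieldData, given < fieldLength ? given : fieldLength);
}
```

Passing an explicit length is useful when the source buffer is not NULL terminated, for example when reading fixed-width input files.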
void SetFieldData(int fieldIndex, const Date& fieldData);
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record.
If the field is set as null, the null will be cleared.
If the field datatype is not compatible with Date, an exception is thrown.
void SetFieldData(int fieldIndex, const DateTime& fieldData);
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record.
If the field is set as null, the null will be cleared.
If the field datatype is not compatible with DateTime, an exception is thrown.
void SetFieldData(int fieldIndex, const Time& fieldData);
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record.
If the field is set as null, the null will be cleared.
If the field datatype is not compatible with Time, an exception is thrown.
void SetFieldData(int fieldIndex, double fieldData);
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record.
If the field is set as null, the null will be cleared.
If the field datatype is not compatible with double, an exception is thrown.
void SetFieldData(int fieldIndex, int fieldData);
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record.
If the field is set as null, the null will be cleared.
If the field datatype is not compatible with int, an exception is thrown.
void SetFieldNull(int fieldIndex);
Parameter          Description
fieldIndex [IN]    The field to set to NULL (0-based)
This method sets the field to NULL.
6.8 MessageHandler
Class MessageHandler is a callback class to handle messages from a transform. This class is required for processing and its interface must be implemented by the integrating application.
virtual bool HandleMessage(MESSAGETYPE messageType, const char* messageId, const uint16_t* message) = 0;
Parameter           Description
messageType [IN]    The type of message to handle
messageId [IN]      The ID of message to handle
message [IN]        The message to handle
Implement this method to handle a message passed in by the transform. Returns TRUE upon success; returns FALSE upon error, which stops processing.
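A minimal sketch of an application-side handler follows. So that it compiles without the SDK headers, it declares its own stand-ins: the MESSAGETYPE enumerator names and the reduced MessageHandler base class are assumptions made for this example, and only the HandleMessage signature is taken from this guide. Returning FALSE for error messages is an example policy, not a requirement.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins so the sketch is self-contained. The enumerator names are
// hypothetical; a real application would include the SDK headers instead.
enum MESSAGETYPE { MESSAGE_INFO, MESSAGE_WARNING, MESSAGE_ERROR };

class MessageHandler {
public:
    virtual ~MessageHandler() {}
    virtual bool HandleMessage(MESSAGETYPE messageType, const char* messageId,
                               const uint16_t* message) = 0;
};

// Copies every message it receives and asks the transform to stop on errors.
class CollectingMessageHandler : public MessageHandler {
public:
    bool HandleMessage(MESSAGETYPE messageType, const char* messageId,
                       const uint16_t* message) override {
        assert(messageId != nullptr && message != nullptr);
        std::u16string text;
        for (const uint16_t* p = message; *p != 0; ++p) {
            text.push_back(static_cast<char16_t>(*p));  // copy the UCS2 text
        }
        ids_.push_back(messageId);
        texts_.push_back(text);
        // Example policy: returning false stops processing on an error message.
        return messageType != MESSAGE_ERROR;
    }
    const std::vector<std::string>& ids() const { return ids_; }
    const std::vector<std::u16string>& texts() const { return texts_; }

private:
    std::vector<std::string> ids_;
    std::vector<std::u16string> texts_;
};
```

Copying the ID and text inside the callback is a conservative choice, since this guide does not say how long the passed-in pointers remain valid.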
6.9 MultiRecordTransform
Class MultiRecordTransform is the record processing Transform class for processing multiple records. This class is required for processing. Some of the methods listed here are inherited from the Transform class.
MultiRecordTransform cannot be instantiated directly. You must use the CreateMultiRecordTransform methods in TransformFactory to create valid MultiRecordTransform instances.
MultiRecordTransform objects are used to represent Match.
MultiRecordTransformHelper* CreateHelper();
This method is used to create a helper for this transform. A helper is used to process records, just like the transform itself does. Typically a helper is run in a different thread than the transform. The helper has all the same settings as this transform. The helper shares some of the transform's resources. It also shares the transform's handlers (for example, for logs and statistics). The advantage of using one or more helper objects instead of creating multiple, identical transforms is the savings on resources and the production of only one set of statistics. Returns a pointer to a newly created helper object.
This method is not thread safe.
void DestroyHelper(MultiRecordTransformHelper* helper);
Parameter      Description
helper [IN]    The helper to be destroyed
This method is used to destroy a helper that is no longer needed. All helpers of a transform must be destroyed before that transform is destroyed.
void LoadInputDataRecord(InputDataRecord* record);
Parameter      Description
record [IN]    The input data record to load
This method loads an input data record into this transform. The input data record is copied, so the passed-in input data record is available for use upon return from this method. Do not attempt to use an input data record that belongs to a different transform.
void Process();
This method processes the input data records that were loaded into this transform. When this method returns, the output data records with the posted results are ready to be unloaded.
void SetProgressHandler(ProgressHandler* handler);
Parameter       Description
handler [IN]    The progress event handler
This method sets the progress event handler. A progress event handler is called by the transform as it processes the loaded input records. A progress event handler is optional. This method saves a shallow copy of the passed-in progress event handler. It is the application's responsibility to not delete the event handler until this transform has been destroyed.
const OutputDataRecord* UnloadOutputDataRecord();
This method unloads the next available output data record from this transform. There should be one output data record for each input data record that was loaded. An output data record becomes available after the transform has finished processing the input data record and posting results to the output data record.
Normally output records will be available for unloading after Process() has been called. But it is possible for a transform to make an output record available for unloading immediately after the call to LoadInputDataRecord(). The application is free to call this method at any time to see if there are any output records available for unloading.
Once Process() is called, all current output records must be unloaded before any new input records are loaded for processing.
Returns the next available output record; returns 0 if no records are available.
void ClearRecords();
This method clears all input and output records, and readies the transform to process again. Call this method only after you have loaded your input records, processed, and extracted your output.
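The load, process, unload, and clear sequence described above can be sketched against a mock. Everything in this block is a hypothetical stand-in: the two record structs, MockMultiRecordTransform, and RunBatch are invented for illustration and only borrow the method names documented in this section. The mock simply copies input to output; a real transform is created through TransformFactory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the SDK record types.
struct InputDataRecord { std::string data; };
struct OutputDataRecord { std::string data; };

// Mock with the documented method names; it echoes input to output.
class MockMultiRecordTransform {
public:
    void LoadInputDataRecord(InputDataRecord* record) {
        assert(record != nullptr);
        inputs_.push_back(*record);  // the record is copied, as documented
    }
    void Process() {
        outputs_.clear();
        for (const InputDataRecord& in : inputs_) {
            outputs_.push_back(OutputDataRecord{in.data});
        }
        next_ = 0;
    }
    const OutputDataRecord* UnloadOutputDataRecord() {
        return next_ < outputs_.size() ? &outputs_[next_++] : nullptr;
    }
    void ClearRecords() {
        inputs_.clear();
        outputs_.clear();
        next_ = 0;
    }

private:
    std::vector<InputDataRecord> inputs_;
    std::vector<OutputDataRecord> outputs_;
    std::size_t next_ = 0;
};

// One batch: load every row, process, drain all output, then clear.
template <typename Transform>
std::vector<std::string> RunBatch(Transform& transform,
                                  const std::vector<std::string>& rows) {
    InputDataRecord record;  // with the SDK, obtained once via GetInputDataRecord()
    for (const std::string& row : rows) {
        record.data = row;  // with the SDK, SetFieldData() calls go here
        transform.LoadInputDataRecord(&record);
    }
    transform.Process();
    std::vector<std::string> results;
    // All output records must be unloaded before new input is loaded.
    while (const OutputDataRecord* out = transform.UnloadOutputDataRecord()) {
        results.push_back(out->data);
    }
    transform.ClearRecords();  // ready for the next batch
    return results;
}
```

Reusing one input record across the loop is safe in this pattern because LoadInputDataRecord() copies the record before returning.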
ProgressHandler* GetProgressHandler();
This method returns a pointer to the current progress event handler.
void SetStatisticsHandler(StatisticsHandler* handler);
Parameter       Description
handler [IN]    The statistics event handler
This method sets the statistics event handler. A statistics event handler is called whenever a transform wishes to output statistics. Normally, statistics are output when the transform is terminating (see the method DestroyTransform()). A statistics event handler is optional. If omitted, the statistics are not output.
This method saves a shallow copy of the passed-in statistics event handler to be used for handling statistics events. It is the application's responsibility not to delete the event handler until this transform has been destroyed.
InputDataRecord* GetInputDataRecord();
This method returns the input data record that holds this transform's input fields. An application can use this record to pass data to this transform. The application only has to get the input record once. The same input record may be used repeatedly.
StatisticsHandler* GetStatisticsHandler();
This method returns a pointer to the current statistics event handler.
const DataRecordSchema* GetInputSchema() const;
This method returns the schema of the input data record.
const DataRecordSchema* GetOutputSchema() const;
This method returns the schema of the output data record.
const StatisticsSchema* GetStatisticsSchema(int schemaIndex) const;
Parameter           Description
schemaIndex [IN]    The index of the statistics schema for which to get information
This method returns the statistics schema at index schemaIndex.
int GetStatisticsSchemaCount() const;
This method returns the number of statistics schemas defined for this transform. If statistics are not enabled, or if the transform does not provide statistics, the count returned will be 0.
6.10 MultiRecordTransformHelper
Class MultiRecordTransformHelper is the helper class for the multi-record processing Transform class.
void LoadInputDataRecord(InputDataRecord* record);
Parameter      Description
record [IN]    The input data record to load
This method loads an input data record into this transform. The input data record is copied, so the passed-in input data record is available for use upon return from this method.
This method cannot process an input data record owned by a different transform.
void Process();
This method processes the input data records that were loaded into this transform. When this method returns, the output data records with the posted results are ready to be unloaded.
const OutputDataRecord* UnloadOutputDataRecord();
This method unloads the next available output data record from this transform. There should be one output data record for each input data record that was loaded. An output data record becomes available after the transform has finished processing the input data record and posting results to the output data record.
Normally output records are available for unloading after Process() has been called. However, a transform can make an output record available for unloading immediately after the call to LoadInputDataRecord(). The application is free to call this method at any time to check if there are any output records available for unloading.
Once Process() is called, all current output records must be unloaded before any new input records are loaded for processing.
Returns the next available output record; returns 0 if no records are available.
void ClearRecords();
Clears all input and output records and makes the transform ready to process again.
Call this method only after you have loaded your input records, processed, and extracted your output.
InputDataRecord* GetInputDataRecord();
This method returns the input data record that holds this transform's input fields. An application can use this record to pass data to this transform. The application only has to get the input record once. The same input record may be used repeatedly.
6.11 OutputDataRecord
Class OutputDataRecord is the main interface to the Output Data Record functionality. It inherits its methods from the superclass DataRecord. This class is required for processing.
void GetFieldData(int fieldIndex, uint16_t* buffer, int bufferSize);
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
buffer [OUT]       Holds UCS2 data and a NULL terminator
bufferSize [IN]    The size of buffer
This method gets the data of the fieldIndex field of the data record. The data is copied to the buffer, including a NULL terminator. bufferSize indicates the number of UCS2 characters that will fit into the buffer.
If the data and NULL terminator are longer than the specified buffer length, the data is truncated. The NULL terminator is always copied.
If the field fieldIndex is NULL, no processing happens.
void GetFieldData(int fieldIndex, char* buffer, int bufferSize, char unicodeReplacement = '?');
Parameter                  Description
fieldIndex [IN]            The field to get (0-based)
buffer [OUT]               Holds Latin1 data and a NULL terminator
bufferSize [IN]            The size of buffer
unicodeReplacement [IN]    The character that is substituted if a character is encountered that cannot be represented as a Latin1 character; set this parameter to 0 if you want such characters to be removed
This method gets the data of the fieldIndex field of the data record. The data is copied to the buffer, including a NULL terminator. bufferSize indicates the number of Latin1 characters that will fit into the buffer.
All UCS2 values above 255 are converted or dropped. All UCS2 values <= 255 are saved as is. Locale and code page are not considered.
If the data and NULL terminator are longer than the specified buffer length, the data is truncated. The NULL terminator is always copied.
If the field fieldIndex is NULL, no processing happens.
void GetFieldData(int fieldIndex, uint16_t* buffer, int bufferSize, int& numCopied);
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
buffer [OUT]       Holds UCS2 data (not NULL-terminated)
bufferSize [IN]    The size of buffer
numCopied [OUT]    The number of UCS2 characters copied to buffer
This method gets the data of the fieldIndex field of the data record. The data is copied to the buffer. bufferSize indicates the number of UCS2 characters that will fit into buffer. numCopied is set to the number of UCS2 characters copied into buffer. The buffer is not NULL-terminated.
If the data is longer than the specified buffer length, the data is truncated.
If the field fieldIndex is NULL, no processing happens.
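The difference between the NULL-terminated overload and the numCopied overload is easy to miss, so here is a standalone model of the two copy rules. CopyTerminated and CopyCounted are hypothetical names; they operate on a std::u16string instead of an SDK record and only mirror the buffer behavior described above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Models the terminated overload: the terminator is always written, so at
// most bufferSize - 1 characters of data fit. (Hypothetical helper.)
void CopyTerminated(const std::u16string& field, uint16_t* buffer, int bufferSize) {
    assert(buffer != nullptr && bufferSize > 0);
    const int length = static_cast<int>(field.size());
    const int room = bufferSize - 1;
    const int n = length < room ? length : room;
    for (int i = 0; i < n; ++i) buffer[i] = static_cast<uint16_t>(field[i]);
    buffer[n] = 0;
}

// Models the numCopied overload: no terminator is written, so all bufferSize
// characters can hold data and the caller must use numCopied. (Hypothetical helper.)
void CopyCounted(const std::u16string& field, uint16_t* buffer, int bufferSize,
                 int& numCopied) {
    assert(buffer != nullptr && bufferSize >= 0);
    const int length = static_cast<int>(field.size());
    numCopied = length < bufferSize ? length : bufferSize;
    for (int i = 0; i < numCopied; ++i) buffer[i] = static_cast<uint16_t>(field[i]);
}
```

With a four-character buffer and the field value HELLO, the terminated copy holds HEL plus the terminator, while the counted copy holds HELL and reports 4.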
void GetFieldData(int fieldIndex, Date& output);
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
output [OUT]       The resulting Date
This method gets the data of the fieldIndex field of the data record. The data is copied to output.
If the field datatype is not compatible with Date, an exception is thrown.
If the field fieldIndex is NULL, no processing happens.
void GetFieldData(int fieldIndex, Time& output);
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
output [OUT]       The resulting Time
This method gets the data of the fieldIndex field of the data record. The data is copied to output.
If the field datatype is not compatible with Time, an exception is thrown.
If the field fieldIndex is NULL, no processing happens.
void GetFieldData(int fieldIndex, DateTime& output);
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
output [OUT]       The resulting DateTime
This method gets the data of the fieldIndex field of the data record. The data is copied to output.
If the field datatype is not compatible with DateTime, an exception is thrown.
If the field fieldIndex is NULL, no processing happens.
void GetFieldData(int fieldIndex, double& output);
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
output [OUT]       The resulting double
This method gets the data of the fieldIndex field of the data record. The data is copied to output.
If the field datatype is not compatible with double, an exception is thrown.
If the field fieldIndex is NULL, no processing happens.
void GetFieldData(int fieldIndex, int& output);
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
output [OUT]       The resulting int
This method gets the data of the fieldIndex field of the data record. The data is copied to output.
If the field datatype is not compatible with int, an exception is thrown.
If the field fieldIndex is NULL, no processing happens.
bool IsFieldNull(int fieldIndex);
Parameter          Description
fieldIndex [IN]    The field to check (0-based)
This method determines if the field fieldIndex is NULL. Returns TRUE if the field is NULL; otherwise, it returns FALSE.
int GetFieldDataLength(int fieldIndex);
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
This method returns the number of characters in the field fieldIndex of the data record, or it returns 0 if the field is NULL.
6.12 ProgressHandler
Class ProgressHandler is a callback class to show the progress of a MultiRecordTransform and allow the handler to end processing.
virtual bool HandleProgress(double percentDone) = 0;
Parameter           Description
percentDone [IN]    The percent done (0.0 - 100.0)
This method receives the percentage of completion for the current set of records being processed.
Returns TRUE on success; otherwise, returns FALSE and stops processing.
bool SetProgressInterval(int interval);
Parameter        Description
interval [IN]    The number of seconds to wait between calls to HandleProgress()
This method specifies the interval the transform should wait between calls to the HandleProgress() method. The interval is in seconds and should be greater than 0. Interval values less than or equal to 0 are invalid and will be ignored.
Returns TRUE on success; otherwise, returns FALSE to indicate an invalid interval.
int GetProgressInterval();
This method returns the current progress interval in seconds.
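The following sketch shows a handler that records progress and lets the application cancel a long run. The ProgressHandler base class here is a reduced stand-in so that the block compiles without the SDK headers; its default interval of 1 second and the CancellableProgress class are assumptions made for this example.

```cpp
#include <cassert>

// Reduced stand-in for the SDK base class, limited to the documented members.
// The default interval of 1 second is an assumption.
class ProgressHandler {
public:
    virtual ~ProgressHandler() {}
    virtual bool HandleProgress(double percentDone) = 0;
    bool SetProgressInterval(int interval) {
        if (interval <= 0) return false;  // invalid values are ignored
        interval_ = interval;
        return true;
    }
    int GetProgressInterval() { return interval_; }

private:
    int interval_ = 1;
};

// Hypothetical application handler: remembers the last percentage and
// returns false once a cancel has been requested, which stops processing.
class CancellableProgress : public ProgressHandler {
public:
    bool HandleProgress(double percentDone) override {
        assert(percentDone >= 0.0 && percentDone <= 100.0);
        lastPercent_ = percentDone;
        return !cancelRequested_;
    }
    void RequestCancel() { cancelRequested_ = true; }
    double lastPercent() const { return lastPercent_; }

private:
    bool cancelRequested_ = false;
    double lastPercent_ = 0.0;
};
```

If RequestCancel() is called from a different thread than the one running the transform, the flag would need to be a std::atomic<bool>; the plain bool keeps the sketch short.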
6.13 RecordTransform
Class RecordTransform is the record processing Transform class for processing single records. This class is required for processing. Some of the methods listed here are inherited from the Transform class.
RecordTransform cannot be instantiated directly. You must use the CreateRecordTransform methods in TransformFactory to create valid RecordTransform instances.
RecordTransform objects are used to represent Data Cleanse, USA Regulatory Address Cleanse, Global Address Cleanse and Geocoder.
RecordTransformHelper* CreateHelper();
This method is used to create a helper for this transform. A helper is used to process records. Typically a helper is run in a different thread than the transform. The helper has all the same settings as this transform. The helper shares some of the transform's resources. It also shares the transform's handlers (for example, for logs and statistics). The advantage of using one or more helper objects instead of creating multiple, identical transforms is the savings on resources and the production of only one set of statistics.
void DestroyHelper(RecordTransformHelper* helper);
Parameter      Description
helper [IN]    The helper to be destroyed
This method is used to destroy a helper that is no longer needed. All helpers of a transform must be destroyed before that transform is destroyed.
const OutputDataRecord* Process(InputDataRecord* record);
Parameter      Description
record [IN]    The input data record to process
This method processes the input data record owned by this transform. The input data record can be obtained by calling GetInputDataRecord(). The input data record should be loaded with data before being passed to this method. This method will read the fields of the input data record and post results to an output data record. The output data record is returned. The results can be queried from the output data record. Do not attempt to process an input data record owned by a different transform. Returns output data record on success; returns 0 on a nonfatal error.
void SetStatisticsHandler(StatisticsHandler* handler);
Parameter       Description
handler [IN]    The statistics event handler
This method sets the statistics event handler. A statistics event handler is called whenever a transform wishes to output statistics. Normally, statistics are output when the transform is terminating (see the method DestroyTransform()). A statistics event handler is optional. If omitted, the statistics are not output.
This method saves a shallow copy of the passed-in statistics event handler to be used for handling statistics events. It is the application's responsibility not to delete the event handler until this transform has been destroyed.
InputDataRecord* GetInputDataRecord();
This method returns the input data record that holds this transform's input fields. An application can use this record to pass data to this transform. The application only has to get the input record once. The same input record may be used repeatedly.
StatisticsHandler* GetStatisticsHandler();
This method returns a pointer to the current statistics event handler.
const DataRecordSchema* GetInputSchema() const;
This method returns the schema of the input data record.
const DataRecordSchema* GetOutputSchema() const;
This method returns the schema of the output data record.
const StatisticsSchema* GetStatisticsSchema(int schemaIndex) const;
Parameter           Description
schemaIndex [IN]    The index of the statistics schema for which to get information
This method returns the statistics schema at index schemaIndex.
int GetStatisticsSchemaCount() const;
This method returns the number of statistics schemas defined for this transform. If statistics are not enabled, or if the transform does not provide statistics, the count returned will be 0.
6.14 RecordTransformHelper
Class RecordTransformHelper is the helper class for the record processing Transform class.
const OutputDataRecord* Process(InputDataRecord* record);
Parameter      Description
record [IN]    The input data record to process
This method processes the input data record owned by this transform. The input data record can be obtained by calling GetInputDataRecord(). The input data record should be loaded with data before being passed to this method. This method will read the fields of the input data record and post results to an output data record.
Returns the output data record. The results can be queried from the output data record.
This method cannot process an input data record owned by a different transform.
InputDataRecord* GetInputDataRecord();
This method returns the input data record that holds this transform's input fields. An application can use this record to pass data to this transform. The application only has to get the input record once. The same input record may be used repeatedly, but make sure you call Clear() before re-loading the record.
6.15 StatisticsHandler
Class StatisticsHandler is a callback class to handle statistics records. This interface must be implemented by the integrating application.
virtual bool HandleStatistics(const OutputDataRecord* record) = 0;
Parameter      Description
record [IN]    The statistics record to output
This method is passed an output record that holds statistics information. The application may query the record to determine which statistics table the record belongs to.
Do not save the record pointer passed to this method. The pointer becomes invalid after this method returns, so the application must query and save any field data that it intends to keep.
Returns TRUE upon success; otherwise, returns FALSE, which produces an error and stops processing.
int GetRecordsRemainingCount() const;
This method gets the number of output records that remain to be passed to the Output() method. If the transform has a block of records to send, the transform calls SetRecordsRemainingCount() before each call to Output() to indicate how many records are left to send. The application may use this information to buffer the records instead of processing them individually.
Returns the number of additional records ready to be output.
const StatisticsSchema* GetStatisticsSchema() const;
This method returns the statistics schema.
6.16 StatisticsSchema
Class StatisticsSchema defines the layout of a statistics table.
DATATYPE GetFieldDataType(int fieldIndex, bool& status) const = 0;
Parameter          Description
fieldIndex [IN]    The field on which to get information (0-based)
status [OUT]       TRUE on success; FALSE on a nonfatal error
This method gets the data type of the fieldIndex field.
const uint16_t* GetTableName();
This method returns the name of the table that this schema describes.
bool GetTableName(char* buffer, int bufferSize, char unicodeReplacement = '?');
Parameter                  Description
buffer [OUT]               The buffer to hold the table name
bufferSize [IN]            The size of the buffer
unicodeReplacement [IN]    The character substituted if a character is encountered that cannot be represented as Latin1; set to 0 if you want characters that cannot be converted to be removed
This method returns the name of the table that this schema describes.
bool AllowNull(int fieldIndex);
Parameter          Description
fieldIndex [IN]    The field on which to get information (0-based)
This method indicates whether the fieldIndex field allows a NULL value. Returns TRUE if fieldIndex allows a NULL value or if fieldIndex is invalid; otherwise, returns FALSE.
bool IsPrimaryKey(int fieldIndex);
Parameter          Description
fieldIndex [IN]    The field on which to get information (0-based)
This method indicates whether the fieldIndex field is a primary key. Returns TRUE if fieldIndex is a primary key; otherwise, returns FALSE.
DataRecordSchema::DATATYPE GetDataType(int fieldIndex, bool& status);
Parameter          Description
fieldIndex [IN]    The field on which to get information (0-based)
status [OUT]       FALSE if a non-fatal error has occurred
This method gets the datatype for the field.
6.17 Time
Class Time represents a datatype. It is used to hold a time value for a record field.
bool SetTime(const char* timeStr);
Parameter       Description
timeStr [IN]    The time in the format HHMMSSF
This method sets the time of this object. The string must be in the format HHMMSSF, where HH is the hour, MM is the minutes, SS is the seconds, and F is an optional fraction digit, which can be repeated. The hours can range from 0 to 23, the minutes from 0 to 59, and the seconds from 0 to 59. Each optional fraction digit can range from 0 to 9. If the string is not formatted correctly, the time value is not changed. Returns TRUE if the time is valid; otherwise, returns FALSE.
bool SetHours(int hours);
Parameter     Description
hours [IN]    The hours value from 0 to 23
This method sets the hours of this object. The hours can range from 0 to 23. Returns TRUE if the hours value is valid; otherwise, returns FALSE.
bool SetMinutes(int minutes);
Parameter       Description
minutes [IN]    The minutes value from 0 to 59
This method sets the minutes of this object. The minutes can range from 0 to 59. Returns TRUE if the minutes value is valid; otherwise, returns FALSE.
bool SetSeconds(int seconds);
Parameter       Description
seconds [IN]    The seconds value from 0 to 59
This method sets the seconds of this object. The seconds can range from 0 to 59. Returns TRUE if the seconds value is valid; otherwise, returns FALSE.
bool SetFractionOfSeconds(double fraction);
Parameter        Description
fraction [IN]    The fractional seconds value
This method sets the fractional seconds of this object. The value must satisfy 0.0 <= fraction < 1.0. Returns TRUE if the fraction is valid; otherwise, returns FALSE.
void GetTime(char* timeStr, int bufferLength) const;
Parameter            Description
timeStr [OUT]        The time value in HHMMSSF format
bufferLength [IN]    The size of the timeStr buffer
This method gets the time value of this object. The time is returned as a string in the format HHMMSSF, where HH is the hours, MM is the minutes, SS is the seconds, and F is one digit of the fraction, which can be repeated.
int GetHours() const;
This method gets the hours value of this object. It returns the hours value from 0 to 23.
int GetMinutes() const;
This method gets the minutes value of this object. It returns the minutes value from 0 to 59.
int GetSeconds() const;
This method gets the seconds value of this object. It returns the seconds value from 0 to 59.
double GetFractionOfSeconds() const;
This method gets the fractional seconds of this object. The fractional seconds can range from 0 up to, but not including, 1.
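The HHMMSSF layout accepted by SetTime() is a plain string rule, so an application can check a string before handing it to the SDK. The following is a minimal sketch of such a check, written in Java; TimeStringCheck and isValidTimeString are hypothetical names that are not part of the SDK, and the sketch assumes that hours, minutes, and seconds are always written with two digits. The SDK's own validation may differ in edge cases.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the SDK: checks the documented HHMMSSF layout.
final class TimeStringCheck {
    // HH = 00-23, MM = 00-59, SS = 00-59, followed by zero or more fraction digits (F).
    private static final Pattern HHMMSSF =
            Pattern.compile("([01][0-9]|2[0-3])[0-5][0-9][0-5][0-9][0-9]*");

    // Returns true only when the whole string matches the HHMMSSF layout.
    static boolean isValidTimeString(String timeStr) {
        return timeStr != null && HHMMSSF.matcher(timeStr).matches();
    }
}
```

Under these assumptions, "235959" and "0930155" pass the check, while "240000" and "12:30:00" do not.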
6.18 TransformFactory
Class TransformFactory is used to create a Transform. This class is required for processing.
MultiRecordTransform* CreateMultiRecordTransform(const char* transformSettings, int transformSettingsBufferSize);
Parameter                           Description
transformSettingsBuffer [IN]        The transform settings buffer
transformSettingsBufferSize [IN]    The number of bytes in transformSettingsBuffer
This method creates a multi-record transform, using the XML found in transformSettingsBuffer.
transformSettingsBuffer is a buffer of bytes. The encoding is determined automatically or by the encoding XML attribute.
If you are passing UCS2 data (2 byte characters) to this method, then the encoding attribute in the XML must either not exist, or be UCS2/UTF16. If you are passing Latin1 (1 byte characters) to this method, the encoding attribute in the XML must either not exist, or be UTF-8/Latin1.
Returns a pointer to the created multi-record transform.
RecordTransform* CreateRecordTransform(const char* transformSettings, int transformSettingsBufferSize);
Parameter                           Description
transformSettingsBuffer [IN]        The transform settings buffer
transformSettingsBufferSize [IN]    The number of bytes in transformSettingsBuffer
This method creates a record transform, using the XML found in transformSettingsBuffer.
transformSettingsBuffer is a buffer of bytes. The encoding is determined automatically or by the encoding XML attribute.
If you are passing UCS2 data (2 byte characters) to this method, then the encoding attribute in the XML must either not exist, or be UCS2/UTF16. If you are passing Latin1 (1 byte characters) to this method, the encoding attribute in the XML must either not exist, or be UTF-8/Latin1.
Returns a pointer to the created record transform.
void DestroyTransform(Transform* transform);
Parameter         Description
transform [IN]    The transform to destroy
This method destroys a record transform or a multi-record transform. Destroying a transform may cause final statistics to be passed to the statistics event handler.
const char* UpgradeTransformSettings(const char* transformSettings, int transformSettingsLength, int& upgradedSettingsLength);
Parameter                       Description
transformSettings [IN]          The transform settings buffer
transformSettingsLength [IN]    The number of bytes in transformSettings
upgradedSettingsLength [OUT]    The actual length in bytes of the upgraded XML
This method upgrades the XML settings of a transform found in transformSettings. The transformSettings parameter is a buffer of bytes. The encoding is determined automatically or by the encoding XML attribute. If you are passing UCS2 data (2 byte characters) to this method, the encoding attribute in the XML must either not exist or be set to UCS2 or UTF16. If you are passing Latin1 data (1 byte characters) to this method, the encoding attribute in the XML must either not exist or be set to UTF-8 or Latin1.
Once the XML is successfully parsed, the version is checked. If the XML is current, the pointer to the passed-in buffer is returned. If the XML is not current, it is upgraded with the latest version string and other transform-specific changes. The upgraded XML is then stored as a string in an internal buffer, and that internal buffer is returned. The number of bytes in the returned buffer is stored in upgradedSettingsLength.
Returns the pointer to the passed-in buffer if the XML is current; otherwise, returns the internal buffer that holds the upgraded XML. If the internal buffer is returned, copy the data out of the buffer before making any other calls to this object.
bool ValidateTransformSettings(const char* transformSettings, int transformSettingsBufferSize);
Parameter                           Description
transformSettingsBuffer [IN]        The transform settings buffer
transformSettingsBufferSize [IN]    The number of bytes in transformSettingsBuffer
This method validates the XML for the transform found in transformSettingsBuffer.
transformSettingsBuffer is a buffer of bytes. The encoding is determined automatically or by the encoding XML attribute.
If you are passing UCS2 data (2 byte characters) to this method, then the encoding attribute in the XML must either not exist, or be UCS2/UTF16. If you are passing Latin1 (1 byte characters) to this method, the encoding attribute in the XML must either not exist, or be UTF-8/Latin1.
void SetMessageHandler(MessageHandler* handler);
Parameter       Description
handler [IN]    The log event handler
This method sets the log event handler. A log event handler is called whenever a transform needs to output log information. A log event handler is required.
This method saves a shallow copy of the passed-in log event handler to be used for logging messages. The object will be used by each subsequently created Transform. If the application needs each created transform to have its own log event handler, the application must call this method with a new log event handler before each new transform is created. It is the application's responsibility not to delete the event handler until all transforms that are using the event handler have been destroyed.
void SetLocale(const char* locale);
Parameter      Description
locale [IN]    The locale to use
This method sets the locale to use for messages produced by Transforms. If the locale is not supported, a warning will be logged to the MessageHandler set using SetMessageHandler and all messages will default to en_US.
MessageHandler* GetMessageHandler();
This method returns a pointer to the current log event handler.
const char* GetLocale() const;
This method gets the locale that is currently being used. If the locale set by a call to SetLocale is supported, this method will return that value. If the locale set using SetLocale was not supported, the default locale of en_US will be returned.
static const char* GetVersion();
This method gets the version of the Data Quality Management SDK being used.

API Reference for Java

7.1 Java API reference overview
This section details the API for the Java implementation.
The package for the Java API is com.sap.emdq.
7.2 CertifiedReportGenerator
Class CertifiedReportGenerator is an implementation of the public StatisticsHandler interface that can generate the certified mailing reports: the Statement of Address Accuracy (SERP), the Address Matching Processing Summary (AMAS), and the U.S. Coding Accuracy Support System (CASS) 3553 report.
CertifiedReportGenerator()
This method is the constructor and must be run before use of the Certified Report Generator.
void Destroy()
This method is required to create the reports. Only call this method after all processing is done. The object will no longer be valid after this call.
boolean handleStatistics(OutputDataRecord record)
Parameter    Description
record       The statistics record to output
This method implements the StatisticsHandler interface.
void setReportFile(ReportType reportType, String reportFile)
Parameter     Description
reportType    Which report to write to reportFile
reportFile    A valid filename and path where the report is to be generated
This method tells CertifiedReportGenerator the path and file name to use when creating a report. Valid reportType options are Cass3553Report, AmasReport, and SerpReport. You must call this method for each report you want generated, before using any transform.
If the specified file exists, the previous version of the file is overwritten. If the path to the specified file does not exist, the file is not created and an error occurs.
Throws EmdqException.
Related Topics
StatisticsHandler
7.3 DataRecordSchema
Class DataRecordSchema defines the layout of a Data Record.
int getFieldCount()
This method gets the field count.
Returns the number of fields defined in the data record.
Throws EmdqException.
int getFieldIndex(String fieldName)
Parameter         Description
fieldName [IN]    The name of the field to get
This method returns the field index of the fieldName field. Field names are case-insensitive (that is, NAME is equivalent to name).
Returns the field index (0-based), which can be used in the other methods that take a field index as a parameter. If fieldName is invalid, returns -1.
Throws EmdqException.
int getFieldLength(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field for which to get information (0-based)
This method gets the length of the field fieldIndex.
Returns the length of the fieldIndex field; otherwise, on a non-fatal error, returns 0.
Throws EmdqException.
String getFieldName(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field for which to get information (0-based)
This method gets the name of the field fieldIndex.
Returns the name of the fieldIndex field; otherwise, on a non-fatal error, returns 0.
Throws EmdqException.
DataType getDataType(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field for which to get information (0-based)
This method gets the datatype of the field fieldIndex.
Returns the datatype of the fieldIndex field; otherwise, on a non-fatal error, returns 0.
Throws EmdqException.
7.4 EmdqException
Class EmdqException is the exception class thrown by all public interfaces of this product. This class is required for processing.
public String getMessageId()
This method returns the message ID of this exception object. The message ID is in the format CCCNNN, where C is an alphabetic character and N is a numeric character. CCC identifies the source of the error and NNN is the message number. For example, REC001 is the first message for the DataRecord class.
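Because the message ID returned by getMessageId() follows the fixed CCCNNN layout, an application can split it into its source and number, for example to route errors by source. The following is a minimal sketch under that assumption; MessageIdParts and parse are hypothetical names that are not part of com.sap.emdq.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of com.sap.emdq: splits a CCCNNN message ID.
final class MessageIdParts {
    // Three alphabetic characters followed by three digits, as documented.
    private static final Pattern ID = Pattern.compile("([A-Za-z]{3})([0-9]{3})");

    final String source;   // CCC: the source of the error, for example "REC"
    final int number;      // NNN: the message number, for example 1

    private MessageIdParts(String source, int number) {
        this.source = source;
        this.number = number;
    }

    // Returns null when the ID does not follow the CCCNNN layout.
    static MessageIdParts parse(String messageId) {
        if (messageId == null) {
            return null;
        }
        Matcher m = ID.matcher(messageId);
        if (!m.matches()) {
            return null;
        }
        return new MessageIdParts(m.group(1), Integer.parseInt(m.group(2)));
    }
}
```

With the example ID from the text, MessageIdParts.parse("REC001") yields the source REC and the number 1.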
7.5 InputDataRecord
Class InputDataRecord is the main interface to the Input Data Record functionality. It inherits from the superclass DataRecord. This class is required for processing.
void clear()
This method clears all of the fields of the data record. Each character field has a data length of 0.
Throws EmdqException.
void setStringData(int fieldIndex, String fieldData)
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record. The data is copied from the fieldData buffer.
Throws EmdqException.
void setDateData(int fieldIndex, Calendar fieldData)
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record. If the field was previously set to null, the null is cleared.
Throws EmdqException if the field datatype is not compatible with Date.
void setDateTimeData(int fieldIndex, Calendar fieldData)
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record. If the field was previously set to null, the null is cleared.
Throws EmdqException if the field datatype is not compatible with DateTime.
void setTimeData(int fieldIndex, Calendar fieldData)
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record. If the field was previously set to null, the null is cleared.
Throws EmdqException if the field datatype is not compatible with Time.
void setDoubleData(int fieldIndex, double fieldData)
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record. If the field was previously set to null, the null is cleared.
Throws EmdqException if the field datatype is not compatible with double.
void setIntegerData(int fieldIndex, int fieldData)
Parameter          Description
fieldIndex [IN]    The field to set (0-based)
fieldData [IN]     The field data to set
This method sets the data of the fieldIndex field of the data record. If the field was previously set to null, the null is cleared.
Throws EmdqException if the field datatype is not compatible with int.
void setFieldNull(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to set to NULL (0-based)
This method sets the field to NULL.
Throws EmdqException.
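The date, time, and datetime setters of InputDataRecord all take a java.util.Calendar, so the application has to build a Calendar that carries only the parts it intends to pass. The following is a minimal sketch of one way to do that with the standard library; CalendarValues and its methods are hypothetical helpers, not part of com.sap.emdq, and how the SDK treats the unused Calendar fields is not specified here.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical helpers, not part of com.sap.emdq: build Calendar values for the setters.
final class CalendarValues {
    // Date only. Calendar months are 0-based, so pass constants such as Calendar.DECEMBER.
    static Calendar date(int year, int month, int day) {
        return new GregorianCalendar(year, month, day);
    }

    // Time only. clear() resets every field first so no stale values remain.
    static Calendar time(int hour, int minute, int second) {
        Calendar cal = Calendar.getInstance();
        cal.clear();
        cal.set(Calendar.HOUR_OF_DAY, hour);
        cal.set(Calendar.MINUTE, minute);
        cal.set(Calendar.SECOND, second);
        return cal;
    }

    // Date and time together.
    static Calendar dateTime(int year, int month, int day, int hour, int minute, int second) {
        return new GregorianCalendar(year, month, day, hour, minute, second);
    }
}
```

A call such as record.setDateData(fieldIndex, CalendarValues.date(2010, Calendar.DECEMBER, 9)) would then pass the value, where record is an InputDataRecord obtained from the transform and fieldIndex refers to a Date field.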
7.6 MessageHandler
Class MessageHandler is a callback class to handle messages from a transform. This class is required for processing and its interface must be implemented by the integrating application.
MessageHandler()
This method is the protected scope constructor. It must be called before the extended class is used.
abstract boolean handleMessage(MessageType type, String messageId, String message)
Parameter    Description
type         The type of message to handle
messageId    The ID of the message to handle
message      The message to handle
This method handles log messages produced by this product. Set the delegate object on TransformFactory using the LogHandler property.
Returns true upon success; returns false upon error and stops processing.
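A handler implementation can be as small as a class that records each message and reports whether processing should go on. The following is a minimal sketch of that callback shape. SketchMessageHandler and SketchMessageType are stand-ins declared only so the example compiles on its own; the real types are com.sap.emdq.MessageHandler and MessageType, and the enum constants shown here are assumptions, not the SDK's actual values. Returning false for error messages is a policy chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for illustration only; the real types live in com.sap.emdq.
// The enum constants are assumptions, not the SDK's actual values.
enum SketchMessageType { INFO, WARNING, ERROR }

abstract class SketchMessageHandler {
    protected SketchMessageHandler() { }
    abstract boolean handleMessage(SketchMessageType type, String messageId, String message);
}

// Collects every message; returns false on an error message so that processing stops.
final class CollectingMessageHandler extends SketchMessageHandler {
    final List<String> lines = new ArrayList<>();

    @Override
    boolean handleMessage(SketchMessageType type, String messageId, String message) {
        lines.add(type + " " + messageId + ": " + message);
        return type != SketchMessageType.ERROR;
    }
}
```

An application that prefers to keep processing after errors would simply return true in every case.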
7.7 MultiRecordTransform
Class MultiRecordTransform is the record processing Transform class for processing multiple records. This class is required for processing. Some of the methods listed here are inherited from the Transform class.
MultiRecordTransform cannot be instantiated directly. You must use the CreateMultiRecordTransform methods in TransformFactory to create valid MultiRecordTransform instances.
MultiRecordTransform objects are used to represent Match.
MultiRecordTransformHelper createHelper()
This method creates a helper for this transform. A helper processes records just as the transform itself does, and typically runs in a different thread than the transform. The helper has the same settings as this transform, shares some of the transform's resources, and shares the transform's handlers (for example, for logs and statistics). The advantage of using one or more helper objects instead of creating multiple identical transforms is the savings on resources and the production of only one set of statistics.
Returns a newly created helper object.
This method is not thread safe.
Throws EmdqException.
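The helper life cycle described here (create helpers from a single thread, run each helper on its own thread, destroy every helper before the transform) can be sketched as follows. SketchTransform and SketchHelper are stand-ins that only mimic the documented method names so the example compiles on its own; they are not the com.sap.emdq classes, and the counter stands in for real record processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-ins for illustration only; they mimic the documented method names
// but are not the com.sap.emdq classes.
interface SketchHelper {
    void process();
}

final class SketchTransform {
    final AtomicInteger processed = new AtomicInteger();
    private int liveHelpers = 0;

    // Documented as not thread safe, so call it from one thread only.
    SketchHelper createHelper() {
        liveHelpers++;
        return processed::incrementAndGet;   // placeholder for real processing
    }

    void destroyHelper(SketchHelper helper) {
        liveHelpers--;
    }

    int liveHelpers() {
        return liveHelpers;
    }
}

final class HelperLifecycle {
    // Creates the helpers up front, runs each on its own thread, then destroys them all.
    static int runWithHelpers(SketchTransform transform, int helperCount) {
        List<SketchHelper> helpers = new ArrayList<>();
        for (int i = 0; i < helperCount; i++) {
            helpers.add(transform.createHelper());   // single-threaded creation
        }
        List<Thread> threads = new ArrayList<>();
        for (SketchHelper helper : helpers) {
            Thread t = new Thread(helper::process);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();                            // wait for every helper to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        for (SketchHelper helper : helpers) {
            transform.destroyHelper(helper);         // before the transform itself is destroyed
        }
        return transform.processed.get();
    }
}
```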
void destroyHelper(MultiRecordTransformHelper helper)
Parameter      Description
helper [IN]    The helper to be destroyed
This method is used to destroy a helper that is no longer needed. All helpers of a transform must be destroyed before that transform is destroyed.
Throws EmdqException.
void loadInputDataRecord(InputDataRecord record)
Parameter      Description
record [IN]    The input data record to load
This method loads an input data record into this transform. The input data record is copied, so the passed-in input data record is available for use upon return from this method. Do not attempt to use an input data record that belongs to a different transform.
Throws EmdqException.
void process()
This method processes the input data records that were loaded into this transform. When this method returns, the output data records with the posted results are ready to be unloaded.
Throws EmdqException.
void setProgressHandler(ProgressHandler handler)
Parameter       Description
handler [IN]    The progress event handler
This method sets the progress event handler. A progress event handler is called by the transform as it processes the loaded input records. A progress event handler is optional. This method saves a shallow copy of the passed-in progress event handler. It is the application's responsibility not to delete the event handler until this transform has been destroyed.
Throws EmdqException.
OutputDataRecord unloadOutputDataRecord()
This method unloads the next available output data record from this transform. There should be one output data record for each input data record that was loaded. An output data record becomes available after the transform has finished processing the input data record and posting results to the output data record.
Normally, output records are available for unloading after process() has been called, but a transform can make an output record available immediately after the call to loadInputDataRecord(). The application is free to call this method at any time to check whether any output records are available for unloading.
Once process() is called, all current output records must be unloaded before any new input records are loaded for processing.
Returns the next available output record; returns null if no records are available.
Throws EmdqException.
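The load, process, and unload order described for MultiRecordTransform can be sketched as a simple batch loop that drains the output until null is returned. SketchMultiRecordTransform is a stand-in that mimics the documented call order with plain strings so the example compiles on its own; it is not the com.sap.emdq class, and the upper-casing is only a placeholder for real processing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;

// Stand-in for illustration only: mimics the documented load/process/unload
// call order with plain strings; it is not the com.sap.emdq MultiRecordTransform.
final class SketchMultiRecordTransform {
    private final List<String> loaded = new ArrayList<>();
    private final Deque<String> ready = new ArrayDeque<>();

    void loadInputDataRecord(String record) {
        loaded.add(record);                              // the record is copied into the transform
    }

    void process() {
        for (String record : loaded) {
            ready.add(record.toUpperCase(Locale.ROOT));  // placeholder for real processing
        }
        loaded.clear();
    }

    // Returns null when no output record is available.
    String unloadOutputDataRecord() {
        return ready.poll();
    }
}

final class BatchRunner {
    static List<String> runBatch(SketchMultiRecordTransform transform, List<String> inputs) {
        for (String input : inputs) {
            transform.loadInputDataRecord(input);
        }
        transform.process();
        List<String> outputs = new ArrayList<>();
        String out;
        // Drain every output record before loading the next batch.
        while ((out = transform.unloadOutputDataRecord()) != null) {
            outputs.add(out);
        }
        return outputs;
    }
}
```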
void clearRecords()
This method clears all input and output records, and readies the transform to process again. Call this method only after you have loaded your input records, processed, and extracted your output.
Throws EmdqException.
ProgressHandler getProgressHandler()
This method returns the current progress event handler.
Throws EmdqException.
void setStatisticsHandler(StatisticsHandler handler)
Parameter       Description
handler [IN]    The statistics event handler
This method sets the statistics event handler. A statistics event handler is called whenever a transform wishes to output statistics. Normally, statistics are output when the transform is terminating (see the method DestroyTransform()). A statistics event handler is optional. If omitted, the statistics are not output.
This method saves a shallow copy of the passed-in statistics event handler to be used for handling statistics events. It is the application's responsibility not to delete the event handler until this transform has been destroyed.
Throws EmdqException.
InputDataRecord getInputDataRecord()
This method returns the input data record that holds this transform's input fields. An application can use this record to pass data to this transform. The application needs to get the input record only once. The same input record may be used repeatedly.
Throws EmdqException.
StatisticsHandler getStatisticsHandler()
This method returns the current statistics event handler.
Throws EmdqException.
DataRecordSchema getInputSchema()
This method returns the schema of the input data record.
Throws EmdqException.
DataRecordSchema getOutputSchema()
This method returns the schema of the output data record.
Throws EmdqException.
StatisticsSchema getStatisticsSchema(int schemaIndex)
Parameter           Description
schemaIndex [IN]    The statistics schema for which to get information (0-based)
This method returns the statistics schema at index schemaIndex.
Throws EmdqException.
int getStatisticsSchemaCount()
This method returns the number of statistics schemas defined for this transform. If statistics are not enabled, or if the transform does not provide statistics, the count returned will be 0.
Throws EmdqException.
7.8 MultiRecordTransformHelper
Class MultiRecordTransformHelper is the helper class for the multi-record processing Transform class.
MultiRecordTransformHelper();             Default constructor
virtual ~MultiRecordTransformHelper();    Default destructor
void loadInputDataRecord(InputDataRecord record)
Parameter      Description
record [IN]    The input data record to load
This method loads an input data record into this transform. The input data record is copied, so the passed-in input data record is available for use upon return from this method.
This method cannot process an input data record owned by a different transform.
Throws EmdqException.
void process()
This method processes the input data records that were loaded into this transform. When this method returns, the output data records with the posted results are ready to be unloaded.
Throws EmdqException.
OutputDataRecord unloadOutputDataRecord()
This method unloads the next available output data record from this transform. There should be one output data record for each input data record that was loaded. An output data record becomes available after the transform has finished processing the input data record and posting results to the output data record.
Normally, output records are available for unloading after process() has been called. However, a transform can make an output record available for unloading immediately after the call to loadInputDataRecord(). The application is free to call this method at any time to check whether any output records are available for unloading.
Once process() is called, all current output records must be unloaded before any new input records are loaded for processing.
Returns the next available output record; returns null if no records are available.
Throws EmdqException.
void clearRecords()
This method clears all input records that were set and makes the transform ready to process again.
Call this method only after you have loaded your input records, processed, and extracted your output.
Throws EmdqException.
InputDataRecord getInputDataRecord()
This method returns the input data record that holds this transform's input fields. An application can use this record to pass data to this transform. The application must get the input record only once. The same input record may be used repeatedly.
7.9 OutputDataRecord
Class OutputDataRecord is the main interface to the Output Data Record functionality. It inherits its methods from the superclass DataRecord. This class is required for processing.
OutputDataRecord(long internalPtr)
Parameter      Description
internalPtr    The long value representing the C++ pointer to the OutputDataRecord object in native code
This method is the package scope constructor of this class.
String getStringData(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
This method gets the data of the fieldIndex field of the data record. A new string containing the field data is created and returned.
Returns the field data on success; otherwise, returns null if there is no data.
Throws EmdqException.
Calendar getDateData(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
This method gets the Date data from a field and returns a copy of it in a Calendar object. If the field datatype is not compatible with Date, an exception is thrown.
Any time information stored within the Calendar object will be invalid.
Returns the field data. If the fieldIndex field is null, no processing occurs.
Throws EmdqException.
Calendar getTimeData(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
This method gets the Time data from a field and returns a copy of it in a Calendar object. If the field datatype is not compatible with Time, an exception is thrown.
Any date information stored within the Calendar object will be invalid.
Returns the field data. If the fieldIndex field is null, no processing occurs.
Throws EmdqException.
Calendar getDateTimeData(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
This method gets the DateTime data from a field and returns a copy of it in a Calendar object. If the field datatype is not compatible with DateTime, an exception is thrown.
Returns the field data. If the fieldIndex field is null, no processing occurs.
Throws EmdqException.
double getDoubleData(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
This method gets the double data from a field and returns it as a double. If the field datatype is not compatible with double, an exception is thrown.
Returns the field data. If the fieldIndex field is null, no processing occurs.
Throws EmdqException.
int getIntData(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
This method gets the int data from a field and returns it as an int. If the field datatype is not compatible with int, an exception is thrown.
Returns the field data. If the fieldIndex field is null, no processing occurs.
Throws EmdqException.
boolean isFieldNull(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
This method determines if the field fieldIndex is null.
Returns true if the field is null; otherwise, returns false.
Throws EmdqException.
int getFieldDataLength(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field to get (0-based)
output [OUT]       The resulting int
This method gets the field data length.
Returns the number of characters in the field fieldIndex of the data record; returns -1 if fieldIndex is invalid.
Throws EmdqException.
7.10 ProgressHandler
Class ProgressHandler is a callback class to show the progress of a MultiRecordTransform and allow the handler to end processing.
protected abstract boolean handleProgress(double percentDone)
Parameter           Description
percentDone [IN]    The percent done (0.0 - 100.0)
This method shows the percentage of completion for the current set of records being processed. Returns TRUE to continue processing; otherwise, returns FALSE to stop processing.
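Because the return value of handleProgress() decides whether processing continues, a handler can double as a cancellation switch. The following is a minimal sketch of that idea. SketchProgressHandler is a stand-in declared only so the example compiles on its own; the real abstract class is com.sap.emdq.ProgressHandler, and CancellableProgress, cancel, and lastPercent are hypothetical names.

```java
// Stand-in for illustration only; the real abstract class is com.sap.emdq.ProgressHandler.
abstract class SketchProgressHandler {
    protected abstract boolean handleProgress(double percentDone);
}

// Remembers the last reported percentage and stops processing once cancel() is called.
final class CancellableProgress extends SketchProgressHandler {
    // volatile: cancel() may be called from a different thread than the callback.
    private volatile boolean cancelRequested = false;
    private volatile double lastPercent = 0.0;

    void cancel() {
        cancelRequested = true;
    }

    double lastPercent() {
        return lastPercent;
    }

    @Override
    protected boolean handleProgress(double percentDone) {
        lastPercent = percentDone;
        return !cancelRequested;   // true continues processing, false stops it
    }
}
```

A user interface thread could call cancel() at any time; the next progress callback would then return false and end the run.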
boolean setProgressInterval(int interval)
Parameter        Description
interval [IN]    The number of seconds to wait between calls to handleProgress()
This method specifies the interval the transform should wait between calls to handleProgress(). The interval is in seconds and must be greater than 0.
Throws EmdqException.
int getProgressInterval()
This method returns the current progress interval in seconds.
Throws EmdqException.
7.11 RecordTransform
Class RecordTransform is the record processing Transform class for processing single records. This class is required for processing. Some of the methods listed here are inherited from the Transform class.
RecordTransform cannot be instantiated directly. You must use the CreateRecordTransform methods in TransformFactory to create valid RecordTransform instances.
RecordTransform objects are used to represent Data Cleanse, USA Regulatory Address Cleanse, Global Address Cleanse, and Geocoder.
RecordTransform(long internalPtr)
Parameter      Description
internalPtr    The pointer representing the C++ pointer to the C++ RecordTransform object in native code
This method is the package scoped constructor for RecordTransform. It should be called only from TransformFactory.
RecordTransformHelper createHelper()
This method is used to create a helper for this transform. A helper is used to process records. Typically a helper is run in a different thread than the transform. The helper has all the same settings as this transform. The helper shares some of the transform's resources. It also shares the transform's handlers (for example, for logs and statistics). So the advantage of using one or more helper objects instead of creating multiple, identical transforms is the savings on resources and the production of only one set of statistics.
Returns a newly created helper object.
Throws EmdqException.
void destroyHelper(RecordTransformHelper helper)
This method is used to destroy a helper that is no longer needed. All helpers of a transform must be destroyed before that transform is destroyed.
Parameter      Description
helper [IN]    The helper to be destroyed
Throws EmdqException.
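The ordering rule above (destroy every helper before the transform is destroyed) can be sketched with a try/finally block. The interfaces below are stand-ins that mirror only the createHelper and destroyHelper names documented here; FakeTransform and withHelpers are hypothetical and exist only so that the sketch runs without the SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class HelperLifecycleSketch {
    /** Stand-ins mirroring only the method names documented above. */
    interface RecordTransformHelper {}

    interface RecordTransform {
        RecordTransformHelper createHelper();
        void destroyHelper(RecordTransformHelper helper);
    }

    /** In-memory fake that records the order of calls. */
    static final class FakeTransform implements RecordTransform {
        final List<String> calls = new ArrayList<>();

        @Override
        public RecordTransformHelper createHelper() {
            calls.add("createHelper");
            return new RecordTransformHelper() {};
        }

        @Override
        public void destroyHelper(RecordTransformHelper helper) {
            calls.add("destroyHelper");
        }
    }

    /** Creates helpers, runs the work, and destroys every helper even if the work fails. */
    static void withHelpers(RecordTransform transform, int count,
                            Consumer<List<RecordTransformHelper>> work) {
        List<RecordTransformHelper> helpers = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                helpers.add(transform.createHelper());
            }
            work.accept(helpers);
        } finally {
            for (RecordTransformHelper helper : helpers) {
                transform.destroyHelper(helper);
            }
            // Only after this point is it safe to destroy the transform itself.
        }
    }
}
```

Keeping the cleanup in a finally block means a failure in one worker does not leave helpers alive when the application later destroys the transform.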
OutputDataRecord process(InputDataRecord record)
Parameter      Description
record [IN]    The input data record to process
This method processes the input data record owned by this transform. The input data record can be obtained by calling getInputDataRecord(). The input data record should be loaded with data before being passed to this method. This method reads the fields of the input data record and posts the results to an output data record, which it returns. The results can be queried from the output data record. Do not attempt to process an input data record owned by a different transform.
Values in the returned OutputDataRecord are valid only until process() is called again.
Returns the output data record on success; returns null on a nonfatal error.
Throws EmdqException.
void setStatisticsHandler(StatisticsHandler handler)
Parameter       Description
handler [IN]    The statistics event handler
This method sets the statistics event handler. A statistics event handler is called whenever a transform wishes to output statistics. Normally, statistics are output when the transform is terminating (see the method destroyTransform()). A statistics event handler is optional. If omitted, the statistics are not output.
This method saves a shallow copy of the passed-in statistics event handler to be used for handling statistics events. It is the application's responsibility not to delete the event handler until this transform has been destroyed.
Throws EmdqException.
InputDataRecord getInputDataRecord()
This method returns the input data record that holds this transform's input fields. An application can use this record to pass data to this transform. The application needs to get the input record only once; the same input record may be used repeatedly.
Throws EmdqException.
StatisticsHandler getStatisticsHandler()
This method returns the current statistics event handler.
Throws EmdqException.
DataRecordSchema getInputSchema()
This method returns the schema of the input data record.
Throws EmdqException.
DataRecordSchema getOutputSchema()
This method returns the schema of the output data record.
Throws EmdqException.
StatisticsSchema getStatisticsSchema(int schemaIndex)
Parameter           Description
schemaIndex [IN]    The statistics schema for which to get information (0-based)
This method returns the statistics schema at index schemaIndex.
Throws EmdqException.
int getStatisticsSchemaCount()
This method returns the number of statistics schemas defined for this transform. If statistics are not enabled, or if the transform does not provide statistics, the count returned will be 0.
Throws EmdqException.
7.12 RecordTransformHelper
Class RecordTransformHelper is the helper class for the record processing Transform class.
OutputDataRecord process(InputDataRecord record)
Parameter      Description
record [IN]    The input data record to process
This method processes the input data record owned by this transform. The input data record can be obtained by calling getInputDataRecord(). The input data record should be loaded with data before being passed to this method. This method reads the fields of the input data record and posts the results to an output data record.
Returns the output data record. The results can be queried from the output data record.
This method cannot process an input data record owned by a different transform.
Throws EmdqException.
InputDataRecord getInputDataRecord()
This method returns the input data record that holds this transform's input fields. An application can use this record to pass data to this transform. The application needs to get the input record only once; the same input record may be used repeatedly.
Throws EmdqException.
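A minimal sketch of the usage pattern described above: fetch the input record once, reload it for each row, and copy any needed values out of the output record before the next call. The interfaces are stand-ins; setField and getField are hypothetical accessor names chosen for illustration, and FakeHelper (which upper-cases field 0) exists only so that the sketch runs without the SDK.

```java
import java.util.ArrayList;
import java.util.List;

final class ReuseInputRecordSketch {
    /** Stand-ins; setField and getField are hypothetical accessor names. */
    interface InputDataRecord { void setField(int fieldIndex, String value); }
    interface OutputDataRecord { String getField(int fieldIndex); }

    interface RecordTransformHelper {
        InputDataRecord getInputDataRecord();
        OutputDataRecord process(InputDataRecord record);
    }

    /** Fake helper that upper-cases field 0 and reuses one output object for every call. */
    static final class FakeHelper implements RecordTransformHelper {
        private final String[] in = new String[1];
        private final String[] out = new String[1];
        private final InputDataRecord input = (i, v) -> in[i] = v;
        private final OutputDataRecord output = i -> out[i];

        @Override
        public InputDataRecord getInputDataRecord() { return input; }

        @Override
        public OutputDataRecord process(InputDataRecord record) {
            out[0] = in[0].toUpperCase();
            return output;
        }
    }

    static List<String> processAll(RecordTransformHelper helper, List<String> rows) {
        InputDataRecord input = helper.getInputDataRecord(); // fetched once, reused per row
        List<String> results = new ArrayList<>();
        for (String row : rows) {
            input.setField(0, row);
            OutputDataRecord output = helper.process(input);
            results.add(output.getField(0)); // copy the value before the next process() call
        }
        return results;
    }
}
```

Because the fake reuses a single output object, storing the OutputDataRecord itself instead of its field values would leave every entry pointing at the last result, which is the failure mode the copy avoids.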
7.13 StatisticsHandler
Class StatisticsHandler is a callback class to handle statistics records. This interface must be implemented by the integrating application.
abstract boolean handleStatistics(OutputDataRecord record)
Parameter      Description
record [IN]    The output statistics record to output
This method is passed an output record that holds statistics information. The application may query the record to determine which statistics table the record belongs to.
The record passed to this method should not be saved; it becomes invalid after this method returns. The application must query and save any field data from the record that it intends to keep.
Returns TRUE upon success; otherwise, returns FALSE, which produces an error and stops processing.
int getRecordsRemainingCount()
This method gets the number of output records that remain to be passed to the Output() method. If the transform has a block of records to send, the transform calls SetRecordsRemainingCount() before each call to Output() to indicate how many records are left to send. The application may use this information to buffer the records instead of processing them individually.
Returns the number of additional records ready to be output.
StatisticsSchema getStatisticsSchema()
This method returns the statistics schema.
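The two rules in this section (save field data rather than the record, and use the remaining count to buffer a block) can be sketched together. The classes below are stand-ins: getField is a hypothetical accessor name, the recordsRemaining field is set by a fake driver instead of by the SDK, and BufferingHandler is an illustrative implementation only.

```java
import java.util.ArrayList;
import java.util.List;

final class StatisticsBufferSketch {
    /** Stand-in record; getField is a hypothetical accessor name. */
    interface OutputDataRecord { String getField(int fieldIndex); }

    /** Stand-in base class mirroring the two methods documented above. */
    abstract static class StatisticsHandler {
        int recordsRemaining; // set by the fake driver below; set by the SDK in real use
        abstract boolean handleStatistics(OutputDataRecord record);
        int getRecordsRemainingCount() { return recordsRemaining; }
    }

    /** Copies field 0 of each record and flushes once the block is complete. */
    static final class BufferingHandler extends StatisticsHandler {
        final List<String> pending = new ArrayList<>();
        final List<List<String>> flushed = new ArrayList<>();

        @Override
        boolean handleStatistics(OutputDataRecord record) {
            pending.add(record.getField(0)); // save the data, never the record itself
            if (getRecordsRemainingCount() == 0) {
                flushed.add(new ArrayList<>(pending));
                pending.clear();
            }
            return true; // TRUE reports success
        }
    }

    /** Fake driver: sends a block of values, updating the remaining count before each call. */
    static void sendBlock(StatisticsHandler handler, List<String> values) {
        for (int i = 0; i < values.size(); i++) {
            handler.recordsRemaining = values.size() - 1 - i;
            String value = values.get(i);
            handler.handleStatistics(fieldIndex -> value);
        }
    }
}
```

Flushing once per block lets an application write statistics with a single batched operation instead of one write per record.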
7.14 StatisticsSchema
Class StatisticsSchema defines the layout of a statistics table.
DataType getFieldDataType(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field on which to get information (0-based)
This method gets the data type of the fieldIndex field.
Throws EmdqException.
String getTableName()
This method returns the name of the table that this schema describes.
Throws EmdqException.
boolean allowNull(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field on which to get information (0-based)
This method indicates whether the fieldIndex field allows a NULL value. Returns TRUE if fieldIndex allows a NULL value or if fieldIndex is invalid; otherwise, returns FALSE.
Throws EmdqException.
boolean isPrimaryKey(int fieldIndex)
Parameter          Description
fieldIndex [IN]    The field on which to get information (0-based)
This method indicates whether the fieldIndex field is a primary key. Returns TRUE if fieldIndex is a primary key; otherwise, returns FALSE.
Throws EmdqException.
7.15 TransformFactory
Class TransformFactory is used to create a Transform. This class is required for processing.
synchronized MultiRecordTransform createMultiRecordTransform(String transformSettings)
Parameter                 Description
transformSettings [IN]    The transform settings buffer
This method creates a multi-record transform, using the XML found in transformSettings.
Returns a handle to the created multi-record transform.
Throws EmdqException.
synchronized RecordTransform createRecordTransform(String transformSettings)
Parameter                 Description
transformSettings [IN]    The transform settings buffer
This method creates a record transform, using the XML found in transformSettings.
Returns a handle to the created record transform.
Throws EmdqException.
void destroyTransform(Transform transform)
Parameter         Description
transform [IN]    The transform to destroy
This method destroys a record transform or a multi-record transform. Destroying a transform may cause final statistics to be passed to the statistics event handler. You must call this method when you are finished using a transform instance.
Throws EmdqException.
synchronized String upgradeTransformSettings(String transformSettings)
Parameter                 Description
transformSettings [IN]    The transform settings buffer
This method upgrades the transform settings. It upgrades a transform’s XML settings found in transformSettings and returns it as a String.
Throws EmdqException.
synchronized boolean validateTransformSettings(String transformSettings)
Parameter                 Description
transformSettings [IN]    The transform settings buffer
This method validates the XML for the transform found in transformSettings. Returns TRUE if the transform settings had no error; otherwise, returns FALSE.
Throws EmdqException.
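The factory methods described so far suggest a validate, create, use, destroy sequence. The following is a minimal sketch of that sequence under stated assumptions: the interfaces are stand-ins that mirror only the method names in this section, FakeFactory and its validation rule are hypothetical, and runWithTransform is an illustrative helper and not part of the SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class FactoryLifecycleSketch {
    /** Stand-ins mirroring only the method names documented in this section. */
    interface Transform {}
    interface RecordTransform extends Transform {}

    interface TransformFactory {
        boolean validateTransformSettings(String transformSettings);
        RecordTransform createRecordTransform(String transformSettings);
        void destroyTransform(Transform transform);
    }

    /** In-memory fake that records call order; its validation rule is hypothetical. */
    static final class FakeFactory implements TransformFactory {
        final List<String> calls = new ArrayList<>();

        @Override
        public boolean validateTransformSettings(String transformSettings) {
            calls.add("validate");
            return transformSettings != null && transformSettings.trim().startsWith("<");
        }

        @Override
        public RecordTransform createRecordTransform(String transformSettings) {
            calls.add("create");
            return new RecordTransform() {};
        }

        @Override
        public void destroyTransform(Transform transform) {
            calls.add("destroy");
        }
    }

    /** Returns false without creating anything when the settings do not validate. */
    static boolean runWithTransform(TransformFactory factory, String settingsXml,
                                    Consumer<RecordTransform> work) {
        if (!factory.validateTransformSettings(settingsXml)) {
            return false;
        }
        RecordTransform transform = factory.createRecordTransform(settingsXml);
        try {
            work.accept(transform);
        } finally {
            factory.destroyTransform(transform); // always destroy the transform when finished
        }
        return true;
    }
}
```

Placing destroyTransform in a finally block ensures that the transform is destroyed, and that any final statistics are released, even when processing fails partway through.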
void setMessageHandler(MessageHandler logEventHandler)
Parameter               Description
logEventHandler [IN]    The log event handler
This method sets the log event handler. A log event handler is called whenever a transform wishes to output log information. A log event handler is required.
public void setLocale(String locale)
Parameter      Description
locale [IN]    The locale to use
This method sets the locale to use for messages produced by Transforms. If the locale is not supported, a warning will be logged to the MessageHandler set using setMessageHandler and all messages will default to en_US.
public MessageHandler getMessageHandler()
This method returns the current log event handler.
String getLocale()
This method gets the locale that is currently being used. If the locale set by a call to setLocale is supported, this method will return that value. If the locale set using setLocale was not supported, the default locale of en_US will be returned.
static const char* GetVersion();
This method gets the version of the Data Quality Management SDK being used.
API Reference for .Net

8.1 .Net API reference overview
This section details the API for the .Net implementation.
The namespace for the .Net API is Sap.Emdq.
Ensure the <install_location>\<platform>\bin folder is added to your PATH environment variable prior to launching Visual Studio.
In Visual Studio, set the application type within the application integrating this product to x86 for 32-bit applications and x64 for 64-bit applications. The default value of “Either” is not sufficient.
8.2 EmDQException
Class EmDQException is the exception class thrown by all public interfaces of this product. This class is derived from the Exception class and the text of the message can be found in the Message member. This class is required for processing.
property System::String^ MessageId
This property returns the message ID of this exception object. The message ID is in the format CCCNNN, where C is an alpha character and N is a numeric character. The CCC represents the source of the error. The NNN is the message number. For example, REC001 is the first message for the DataRecord class.
property System::String^ Message
This property is a member of the base class, Exception. It contains the content of the message.
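The CCCNNN format lends itself to a small parser that separates the source prefix from the message number, for example to route errors by source. The sketch below is written in Java for illustration; the same regular expression applies in any .NET language. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MessageIdSketch {
    // Three letters (source of the error) followed by three digits (message number).
    private static final Pattern FORMAT = Pattern.compile("([A-Za-z]{3})(\\d{3})");

    /** Returns {source, number} for a well-formed ID, or null when the ID is not CCCNNN. */
    static String[] split(String messageId) {
        if (messageId == null) {
            return null;
        }
        Matcher matcher = FORMAT.matcher(messageId);
        return matcher.matches()
                ? new String[] { matcher.group(1), matcher.group(2) }
                : null;
    }
}
```

For the example ID in the text, split("REC001") yields the source "REC" and the number "001".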
8.3 LogHandler
Class LogHandler is a delegate object used to pass messages from the SDK to the integrating application.
delegate bool LogHandler(LogMessageType type, string messageId, string message)
Parameter    Description
type         The type of message to handle
messageId    The ID of the message to handle
message      The message to handle
This delegate passes messages from the SDK to the integrating application.
8.4 MultiRecordProgressHandler
Class MultiRecordProgressHandler is a delegate object used to pass progress information from the SDK to the integrating application.
delegate bool MultiRecordProgressHandler(double percentDone)
Parameter      Description
percentDone    The percent of progress
This delegate reports progress information.
8.5 MultiRecordTransform
Class MultiRecordTransform is the record processing Transform class for processing multiple records. This class is required for processing. Some of the methods listed here are inherited from the Transform class.
Instances of MultiRecordTransform cannot be instantiated directly. You must use the CreateMultiRecordTransform method in TransformFactory to create valid MultiRecordTransform instances.
MultiRecordTransform objects are used to represent Match.
System::Data::DataTable^ Process(System::Data::DataTable^ input);
Parameter    Description
input        The collection of input data records to process
This method processes the input data records that were loaded into this transform. When this method returns, the output data records with the posted results are ready to be unloaded.
To monitor the progress of processing, register a delegate with the RowChanged event of the input DataTable.
MultiRecordTransformHelper^ CreateHelper();
This method is used to create a helper for this transform. A helper is used to process records, just like the transform itself does. Typically a helper is run in a different thread than the transform. The helper will have all the same settings as this transform. The helper will share some of the transform's resources. It will also share the transform's handlers (log, statistics, etc). So the advantage of using one or more helper objects instead of creating multiple, identical transforms is the savings on resources and the production of only one set of statistics.
Returns a newly created helper object.
This method is not thread safe.
void DestroyHelper(MultiRecordTransformHelper^ helper);
This method is used to destroy a helper that is no longer needed. All helpers of a transform must be destroyed before that transform is destroyed.
Parameter      Description
helper [IN]    The helper to be destroyed
property System::Data::DataTable^ InputSchema
This property returns the schema of the input data record.
property System::Data::DataTable^ OutputSchema
This property returns the schema of the output data record.
property System::Data::DataSet^ StatisticsSchemas
This property gets the set of statistics tables that will be populated.
Statistics are received from the SDK by adding a DataRowChangeEventHandler delegate to each of the statistics tables contained in the StatisticsSchemas data set. The method associated with the delegate is called each time statistics are generated by a transform. This is generally done when the transform terminates.
property MultiRecordProgressHandler^ ProgressHandler
This property is a pointer to the progress handler delegate that will receive progress status from the SDK.
property int ProgressInterval
This property contains the interval, in seconds, that progress is reported from the SDK.
8.6 MultiRecordTransformHelper
Class MultiRecordTransformHelper is a shared resource based processing object that is a clone of a MultiRecordTransform.
System::Data::DataTable^ Process(System::Data::DataTable^ input);
Parameter    Description
input        The collection of input data records to process
This method processes the input data records that were loaded into this transform. When this method returns, the output data records with the posted results are ready to be unloaded.
8.7 RecordTransform
Class RecordTransform is the record processing Transform class for processing single records. This class is required for processing. Some of the methods listed here are inherited from the Transform class.
Instances of RecordTransform cannot be instantiated directly. You must use the CreateRecordTransform method in TransformFactory to create valid RecordTransform instances.
RecordTransform objects are used to represent Data Cleanse, USA Regulatory Address Cleanse, Global Address Cleanse, and Geocoder.
void Process(System::Data::DataRow^ input, System::Data::DataRow^ output);
Parameter      Description
record [IN]    The input data record to process
This method processes the input data record owned by this transform. The input data record can be obtained by calling GetInputDataRecord(). The input data record should be loaded with data before being passed to this method. This method will read the fields of the input data record and post results to an output data record. The output data record is returned. The results can be queried from the output data record. Do not attempt to process an input data record owned by a different transform. Returns output data record on success; returns nullptr on a nonfatal error.
RecordTransformHelper^ CreateHelper();
This method is used to create a helper for this transform. A helper is used to process records, just like the transform itself does. Typically a helper is run in a different thread than the transform. The helper has all the same settings as this transform. The helper shares some of the transform's resources. It also shares the transform's handlers (for example, for logs and statistics). So the advantage of using one or more helper objects instead of creating multiple, identical transforms is the savings on resources and the production of only one set of statistics.
void DestroyHelper(RecordTransformHelper^ helper);
Parameter      Description
helper [IN]    The helper to be destroyed
This method is used to destroy a helper that is no longer needed. All helpers of a transform must be destroyed before that transform is destroyed.
property System::Data::DataTable^ InputSchema;
This property returns the schema of the input data record.
property System::Data::DataTable^ OutputSchema;
This property returns the schema of the output data record.
property System::Data::DataSet^ StatisticsSchemas;
This property gets the set of statistics tables that will be populated.
8.8 RecordTransformHelper
Class RecordTransformHelper is the helper class for the record processing Transform class.
void Process(System::Data::DataRow^ input, System::Data::DataRow^ output);
This method processes the input data record owned by this transform. The input data record can be obtained by calling GetInputDataRecord(). The input data record should be loaded with data before being passed to this method. This method will read the fields of the input data record and post results to an output data record.
Returns the output data record or nullptr on a nonfatal error. The results can be queried from the output data record.
This method cannot process an input data record owned by a different transform.
8.9 TransformFactory
Class TransformFactory is used to create a Transform. This class is required for processing.
In the .Net implementation, this class also contains the methods used by the Certified Report Generator.
property LogHandler^ LoggerHandler;
Parameter       Description
handler [IN]    The log event handler
This property sets the log event handler. A log event handler is called whenever a transform wishes to output log information. A log event handler is required.
property System::String^ Locale;
Parameter      Description
locale [IN]    The locale to use
This property sets the locale to use for messages produced by Transforms. If the locale is not supported, a warning is logged to the LogHandler that has been set, and all messages default to en_US.
MultiRecordTransform^ CreateMultiRecordTransform(System::String^ transformSettings);
Parameter                 Description
transformSettings [IN]    The transform settings buffer
This method creates a multi-record transform, using the XML found in transformSettings.
Returns a handle to the created multi-record transform.
RecordTransform^ CreateRecordTransform(System::String^ transformSettings);
Parameter                 Description
transformSettings [IN]    The transform settings buffer
This method creates a record transform, using the XML found in transformSettings.
Returns a handle to the created record transform.
bool ValidateTransformSettings(System::String^ transformSettings);
Parameter                 Description
transformSettings [IN]    The transform settings buffer
This method validates the XML transform settings found in transformSettings.
Returns TRUE if the transform settings had no errors; otherwise, returns FALSE.
System::String^ UpgradeTransformSettings(System::String^ transformSettings);
Parameter                 Description
transformSettings [IN]    The transform settings
This method upgrades the transform settings. It upgrades a transform’s XML settings found in transformSettings and returns the result as a String.
void DestroyTransform(Transform^ transform);
Parameter         Description
transform [IN]    The transform to destroy
This method disposes of a transform instance. It ensures that operations that can be performed only once the transform has finished processing, such as producing statistics, are carried out.
You must call this method when you are finished using a transform instance.
void SetSerpReport(System::String^ fileName);
Parameter        Description
fileName [IN]    The filename and path to where the report is to be generated
This method generates certified mailing reports. You must call this method prior to creating a transform if you want the SERP report generated.
void SetAmasReport(System::String^ fileName);
Parameter        Description
fileName [IN]    The filename and path to where the report is to be generated
This method generates certified mailing reports. You must call this method prior to creating a transform if you want the AMAS report generated.
void SetCass3553Report(System::String^ fileName);
Parameter        Description
fileName [IN]    The filename and path to where the report is to be generated
This method generates certified mailing reports. You must call this method prior to creating a transform if you want the CASS 3553 report generated.
Address cleanse concepts

This product allows you to create applications that use many address cleanse features, from basic parsing and standardizing to more advanced concepts unique to only some transforms.
9.1 Address cleanse basics
Address cleanse provides a corrected, complete, and standardized form of your original address data. With the USA Regulatory Address Cleanse transform and for some countries with the Global Address Cleanse transform, address cleanse can also correct or add postal codes.
What happens during address cleanse?
The USA Regulatory Address Cleanse transform and the Global Address Cleanse transform cleanse your data in the following ways:
• Verify that the locality, region, and postal codes agree with one another. If your data has just a locality and region, the transform usually can add the postal code and vice versa (depending on the country).
• Standardize the way the address line looks. For example, they can add or remove punctuation and abbreviate or spell out the primary type (depending on what you want).
• Identify undeliverable addresses, such as vacant lots and condemned buildings (USA records only).
• Assign diagnostic codes to indicate why addresses were not assigned or how they were corrected.
Reports
The USA Regulatory Address Cleanse transform provides data for the creation of the USPS Form 3553 (required for CASS) and the NCOALink Summary Report. The Global Address Cleanse transform provides data for the creation of the Canadian SERP—Statement of Address Accuracy Report, the Australia Post’s AMAS report, and the New Zealand SOA Report.
9.2 Set up the reference files
The USA Regulatory Address Cleanse transform and the Global Address Cleanse transform and engines rely on directories (reference files) in order to cleanse your data.
Directories
To correct addresses and assign codes, the address cleanse transforms rely on databases called postal directories.
Besides the basic address directories, there are many specialized directories that the USA Regulatory Address Cleanse transform uses:
DPV®
Early Warning System (EWS)
eLOT®
GeoCensus
LACSLink®
NCOALink®
RDI™
SuiteLink™
Z4Change
These features help extend address cleansing beyond the basic parsing and standardizing.
Define directory file locations
In the transform settings, you must specify where the transform or engine can find your directory (reference) files.
Caution:
Incompatible or out-of-date directories can render the software unusable. The system administrator must install weekly, monthly or bimonthly directory updates for the USA Regulatory Address Cleanse Transform; monthly directory updates for the Australia and Canada engines; and quarterly directory updates for the Global Address engine to ensure that they are compatible with the current software.
Related Topics
Directory Data
USA Regulatory Address Cleanse

10.1 USA Regulatory Address Cleanse overview
The USA Regulatory Address Cleanse transform identifies, parses, validates, and corrects USA address data according to the U.S. Coding Accuracy Support System (CASS). This transform supports the generation of data that can be used to generate the USPS Form 3553 and can output many useful codes to your records. You can also run in a non-certification mode as well as produce suggestion lists.
Note:
If an input record has characters not included in the Latin1 code page, the USA Regulatory Address Cleanse transform will not process that data. Instead, the software sends the mapped input record to the corresponding standardized output field (if applicable). No other output fields will be populated for that record. If your Unicode database has valid U.S. addresses from the Latin1 character set, the transform processes as normal.
If you perform both data cleansing and matching, the USA Regulatory Address Cleanse transform typically should process the data before the Data Cleanse transform, as well as any of the Match transforms.
The following sections describe the configurations for the USA Regulatory Address Cleanse XML. You can find examples of the XML configurations with the samples installed with the product.
10.2 USPS DPV®
DPV is a USPS product developed to assist users in validating the accuracy of their address information. DPV compares Postcode2 information against the DPV directories to identify known addresses and potential problems that may cause an address to become undeliverable.
DPV is available for U.S. data in the USA Regulatory Address Cleanse transform only.
You can enable DPV in the Assignment options section of the USA Regulatory Address Cleanse configuration file.
Note:
DPV processing is required for CASS certification. If you are not processing for CASS certification, you can choose to run your jobs in non-certified mode and still enable DPV.
Caution:
If you choose to disable DPV processing, the software will not generate the CASS-required documentation and your mailing will not be eligible for postal discounts.
Related Topics
Assignment options
10.2.1 Benefits of DPV
DPV can be beneficial in the following areas:
• Mailing: DPV helps to screen out undeliverable-as-addressed (UAA) mail and helps to reduce mailing costs.
• Information quality: DPV increases the level of data accuracy by verifying an address down to the individual house, suite, or apartment instead of only the block face.
• Increased assignment rate: DPV may increase assignment rate through the use of DPV tie-breaking to resolve a tie when other tie-breaking methods are not conclusive.
• Preventing mail-order fraud: DPV can eliminate shipping of merchandise to individuals who place fraudulent orders by verifying valid delivery addresses and Commercial Mail Receiving Agencies (CMRA).
10.2.2 DPV security
The USPS has instituted processes that monitor the use of DPV. Each company that purchases the DPV functionality is required to sign a legal agreement stating that it will not attempt to misuse the DPV product. If a user abuses the DPV product, the USPS has the right to prohibit the user from using DPV in the future.
10.2.2.1 False positive addresses
The USPS has added security to prevent DPV abuse by including false positive addresses within the DPV directories. If the software finds a false positive address in the data, DPV can lock processing based on your provider level.
Related Topics
DPV locking
10.2.3 DPV monthly directories
DPV directories are shipped monthly with the USPS directories in accordance with USPS guidelines.
The directories expire in 105 days. The date on the DPV directories must be the same date as the Address directory.
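The two date rules above can be expressed as a small check. This is a sketch under stated assumptions: the 105 days are counted from the directory date, the final day is treated as still usable, and the class and method names are hypothetical. The manual does not state how the software itself computes the expiration.

```java
import java.time.LocalDate;

final class DpvExpirySketch {
    static final int VALID_DAYS = 105;

    /** Assumes the 105 days are counted from the directory date. */
    static LocalDate expiryDate(LocalDate directoryDate) {
        return directoryDate.plusDays(VALID_DAYS);
    }

    /** Usable only if the DPV date equals the Address directory date and has not expired. */
    static boolean usable(LocalDate dpvDate, LocalDate addressDate, LocalDate today) {
        return dpvDate.equals(addressDate) && !today.isAfter(expiryDate(dpvDate));
    }
}
```

Under these assumptions, directories dated 2010-12-01 would be treated as usable through 2011-03-16.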
Do not rename any of the files. DPV will not run if the file names are changed. The following is a list of the DPV directories:
dpva.dir
dpvb.dir
dpvc.dir
dpvd.dir
dpv_vacant.dir
dpv_no_stats.dir
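Because DPV does not run if any file is renamed, an application may want to confirm that all six files are present before starting a job. A minimal sketch, assuming the files sit together in one folder; the class name and the idea of a pre-flight check are illustrative and not part of the SDK.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DpvFilesSketch {
    /** File names exactly as listed in this section. */
    static final List<String> REQUIRED = Arrays.asList(
            "dpva.dir", "dpvb.dir", "dpvc.dir", "dpvd.dir",
            "dpv_vacant.dir", "dpv_no_stats.dir");

    /** Returns the required file names that are not present in the given folder. */
    static List<String> missing(Path directoryFolder) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!Files.isRegularFile(directoryFolder.resolve(name))) {
                result.add(name);
            }
        }
        return result;
    }
}
```

An empty result means every expected file name was found; the check says nothing about whether the file contents are current or compatible.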
10.2.4 Required information in the job setup
When you set up for DPV processing, the following options in the USPS License Information group are required:
Customer Company Name
Customer Company Address
Customer Company Locality
Customer Company Region
Customer Company Postcode1
Customer Company Postcode2
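A job setup can be checked for these required values before it is submitted. The sketch below assumes the options have already been read into a map keyed by the display names listed above; the actual option names in the XML configuration may differ, and the class is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class DpvLicenseOptionsSketch {
    /** Display names as listed in this section; the XML option names may differ. */
    static final List<String> REQUIRED = Arrays.asList(
            "Customer Company Name",
            "Customer Company Address",
            "Customer Company Locality",
            "Customer Company Region",
            "Customer Company Postcode1",
            "Customer Company Postcode2");

    /** Returns the required options that are absent or blank. */
    static List<String> missing(Map<String, String> options) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = options.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }
}
```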
10.2.5 DPV output fields
Several output fields are available for reporting DPV processing results:
Field           Description
DPV_CMRA        The DPV Commercial Mail Receiving Agency (CMRA) component that is generated for this record.
                L = The address triggered DPV locking.
                N = The address is not a CMRA.
                Y = The address is a valid CMRA.
                blank = A blank output value indicates that Enable DPV is set to No, DPV processing is currently locked, or the transform cannot assign the input address.
DPV_Footnote    DPV footnotes are required for CASS. The footnotes contain the following information:
                AA = Input address matches to the postcode2 file.
                A1 = Input address does not match to the postcode2 file.
                BB = All input address field values match to DPV.
                CC = Input address primary number matches to DPV, but the secondary number does not match (the secondary is present but invalid).
                F1 = Input address matches a military address.
                G1 = Input address matches a general delivery address.
                M1 = Input address primary number is missing.
                M3 = Input address primary number is invalid.
                N1 = Input address primary number matches to DPV but the address is missing the secondary number.
                P1 = Input address is missing the RR or HC Box number.
                P3 = Input address has an invalid PO, RR, or HC number.
                RR = Input address matches to CMRA.
                R1 = Input address matches to CMRA, but the secondary number is not present.
                U1 = Input address matches a unique address.
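The footnote codes above can be turned into readable text with a lookup table. The sketch below assumes that a DPV_Footnote value is a run of two-character codes with no separators (for example "AABB"); this section does not state the field layout, so treat that as an assumption. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DpvFootnoteSketch {
    static final Map<String, String> MEANING = new LinkedHashMap<>();
    static {
        MEANING.put("AA", "Input address matches to the postcode2 file");
        MEANING.put("A1", "Input address does not match to the postcode2 file");
        MEANING.put("BB", "All input address field values match to DPV");
        MEANING.put("CC", "Primary number matches to DPV, secondary number is present but invalid");
        MEANING.put("F1", "Input address matches a military address");
        MEANING.put("G1", "Input address matches a general delivery address");
        MEANING.put("M1", "Input address primary number is missing");
        MEANING.put("M3", "Input address primary number is invalid");
        MEANING.put("N1", "Primary number matches to DPV, secondary number is missing");
        MEANING.put("P1", "Input address is missing the RR or HC Box number");
        MEANING.put("P3", "Input address has an invalid PO, RR, or HC number");
        MEANING.put("RR", "Input address matches to CMRA");
        MEANING.put("R1", "Input address matches to CMRA, secondary number is not present");
        MEANING.put("U1", "Input address matches a unique address");
    }

    /** Splits the value into two-character codes (an assumed layout) and describes each. */
    static List<String> decode(String footnotes) {
        List<String> result = new ArrayList<>();
        String value = footnotes == null ? "" : footnotes.trim();
        for (int i = 0; i + 2 <= value.length(); i += 2) {
            String code = value.substring(i, i + 2);
            result.add(code + " = " + MEANING.getOrDefault(code, "unknown code"));
        }
        return result;
    }
}
```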
DescriptionField
No Stats indicator. No Stats means that the address is a vacant property, it receives mail as a part of a drop, or it does not have an established delivery yet.
Y = Address is flagged as No Stats in DPV data.
DPV_NoStats
DPV_Status
N = Address is not flagged as No Stats.
blank = Address was not looked up.
Note:
The US Addressing report contains DPV No Stats counts in the DPV Summary section.
The DPV status component that is generated for this record. D = The primary range is a confirmed delivery point, but the sec-
ondary range was not available on input.
L = The address triggered DPV locking.
N = The address is not a valid delivery point.
S = The primary range is a valid delivery point, but the parsed
secondary range is not valid in the DPV directory.
Y = The address is a confirmed delivery point. The primary range and secondary range (if present) are valid.
blank = A blank output value indicates that Enable DPV is set to No, DPV processing is currently locked, or the transform cannot assign the input address.
DPV_Vacant
Vacant address indicator.
Y = Address is vacant.
N = Address is not vacant.
blank = Address was not looked up.
Note:
The US Addressing report contains DPV Vacant counts in the DPV Summary section.
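The DPV_Status values listed above lend themselves to a simple lookup when you post-process output records. The following is a minimal sketch in Python, assuming each output record is available as a dictionary keyed by output field name. The function name describe_dpv_status and the record layout are illustrative assumptions and are not part of the SDK.

```python
# Illustrative lookup of DPV_Status codes. The descriptions paraphrase the
# field table; the dictionary-based record layout is an assumption.
DPV_STATUS = {
    "D": "Primary range confirmed; secondary range not available on input",
    "L": "Address triggered DPV locking",
    "N": "Not a valid delivery point",
    "S": "Primary range valid; parsed secondary range not valid in DPV directory",
    "Y": "Confirmed delivery point",
    "": "Not assigned (DPV disabled, DPV locked, or address not assignable)",
}


def describe_dpv_status(record):
    """Return a readable description of the record's DPV_Status value."""
    code = (record.get("DPV_Status") or "").strip().upper()
    return DPV_STATUS.get(code, "Unknown DPV_Status value: " + code)
```

A record without the field, or with a blank value, is reported as not assigned, which mirrors the blank entry in the table.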
10.2.6 Non certified mode
End users who are not CASS customers but still want a Postcode2 added to addresses can set up jobs with DPV disabled. The non-CASS option, Assign Postcode2 to Non DPV, enables the software to assign a Postcode2 when an address does not DPV-confirm.
Caution:
When DPV processing is disabled, the software does not generate the CASS-required documentation and the mailing is not eligible for postal discounts.
Related Topics
Non Certified options
10.2.7 DPV performance
DPV processing adds to the overall processing time. How much it adds depends on the operating system, system configuration, and other variables that may be unique to your operating environment.
You can decrease the time required for DPV processing by loading DPV directories into system memory before processing.
10.2.7.1 Memory usage
You may need to install additional memory on your system for DPV processing. We recommend a minimum of 768 MB to process with DPV enabled.
To determine the amount of memory required to run with DPV enabled, check the size of the DPV directories (recently about 600 MB; see footnote 1) and add that to the amount of memory required to run the software.
The size of the DPV directories will vary depending on the amount of new data in each directory release.
Make sure that your computer has enough memory available before performing DPV processing.
To find the amount of disk space required to cache the directories, see the Supported Platforms document in the SAP BusinessObjects Support portal.
Footnote 1: The directory size is subject to change each time new DPV directories are installed.
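The memory estimate described in this section (size of the DPV directories plus the memory the software itself needs) can be sketched as follows. This is a rough illustration in Python, not an SDK utility: the function names are hypothetical, and the memory required by the software must be supplied by you, since this guide does not state a figure for it.

```python
import os


def directory_size_mb(path):
    """Sum the sizes of all files under path, in megabytes."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
    return total_bytes / (1024 * 1024)


def estimated_memory_mb(dpv_directory, software_mb):
    """DPV directory size plus the caller-supplied memory for the software."""
    return directory_size_mb(dpv_directory) + software_mb
```

Rerun the estimate after each directory update, because the directory size changes between releases.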
10.2.7.2 Cache DPV directories
To better manage memory usage when you have enabled DPV processing, choose to cache the DPV directories.
Related Topics
Transform performance
10.2.7.3 Running multiple jobs with DPV
When running multiple DPV jobs and loading directories into memory, you should add a 10-second pause between jobs to allow time for the memory to be released. For more information about setting this properly, see your operating system manual.
If you don't add a 10-second pause between jobs, there may not be enough time for your system to release the memory used for caching the directories from the first job. The next job waiting to process may produce an error or access the directories from disk if there is not enough memory to cache directories, resulting in performance degradation.
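One way to enforce the pause is to run the jobs from a small driver script. The sketch below is in Python and assumes that each job can be started through a callable that returns when the job finishes; how you actually launch a job depends on your application, so the jobs list and the function name run_jobs_with_pause are hypothetical.

```python
import time


def run_jobs_with_pause(jobs, pause_seconds=10, sleep=time.sleep):
    """Run each job in order, pausing between jobs (not after the last one)."""
    results = []
    for index, job in enumerate(jobs):
        if index > 0:
            # Give the operating system time to release the memory that the
            # previous job used for caching the DPV directories.
            sleep(pause_seconds)
        results.append(job())
    return results
```

The sleep parameter exists only so that the pause can be replaced in tests; in normal use, leave it at its default.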
10.2.8 DPV locking
False-positive addresses are included in the DPV directories as a security precaution. If the software detects a false-positive address during processing, it marks the record as a false-positive and discontinues DPV processing.
Before releasing the mailing list that contains the false positive address, the mailer is required to send the DPV log files containing the false positive addresses to the USPS.
Related Topics
False positive addresses
10.2.8.1 Stop processing alternative
NCOALink end users, DPV users, and LACSLink users may use the stop processing alternative when the software locks files.
Stop processing alternative allows you to bypass any future DPV or LACSLink directory locks. The stop processing alternative is not an option in the software. It is a key code that you obtain from SAP BusinessObjects Business User Support.
First you must obtain the proper permissions from the USPS, and then provide proof of permission to SAP BusinessObjects Business User Support. Business User Support will provide a key code that disables the DPV or LACSLink directory locking.
For NCOALink end users with the stop processing alternative keycode entered into the SAP License Manager, the software takes the following actions when it detects a false positive address during DPV processing:
Marks the record as a false positive.
Generates a DPV log file containing the false positive address.
Notes the path to the DPV log files in the error log.
Generates a US Regulatory Locking Report containing the path to the DPV log files.
Continues DPV processing without interruption (however, you are required to notify the USPS that a false positive address was detected).
10.2.8.2 Reasons for errors
If a job setup is missing information in the USPS License Information group, and you have DPV and/or LACSLink enabled in your job, you will get error messages based on these specific situations:
missing required parameters
unwritable log file directory
Missing required parameters
When your job setup does not include the required options in the USPS License Information group, and you have DPV and/or LACSLink enabled, the software issues an error.
Unwritable log file directory
If you haven't specified a log file path, or if the path that you specified is not writable, the software issues an error.
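Both error situations can be checked before a job is started. The following Python sketch is an illustration only: this section does not list the names of the required options in the USPS License Information group, so the caller passes them in, and the function name preflight_errors is hypothetical.

```python
import os


def preflight_errors(options, required_names, log_path):
    """Return a list of problems that would make a DPV or LACSLink job fail."""
    errors = []
    # Situation 1: a required option is absent or empty.
    for name in required_names:
        if not str(options.get(name, "")).strip():
            errors.append("missing required parameter: " + name)
    # Situation 2: no log path given, or the path is not a writable directory.
    if not log_path or not os.path.isdir(log_path) or not os.access(log_path, os.W_OK):
        errors.append("unwritable log file directory: " + repr(log_path))
    return errors
```

An empty list means that neither situation described above applies to the job setup.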
10.2.8.3 DPV false positive logs
The software generates a false-positive log file any time it encounters a false positive record, regardless of how the job is set up. The software creates a separate log file for each mailing list that contains a false positive. If multiple false positives exist within one mailing list, the software writes them all to the same log file.
Note:
When the software locks because false-positive log files were created, end users must contact SAP BusinessObjects Business User Support to unlock the file.
Caution:
NCOALink limited and full service providers must not process additional lists for a customer that has given them a list that contains a false-positive record. The mailing list cannot be released until the USPS approves it.
Related Topics
To notify the USPS of DPV locking addresses
To retrieve the DPV unlock code
10.2.8.4 DPV log file location
The software stores DPV log files in the directory specified for the USPS Log Path in the Reference Files group.
Log file naming convention
The software automatically names DPV false positive logs like this: dpvl####.log, where #### is a number between 0001 and 9999. For example, the first log file generated is dpvl0001.log, the next one is dpvl0002.log, and so on.
Note:
When you have set the degree of parallelism to greater than 1, the software generates one log per thread. During a job run, if the software encounters only one false positive record, one log will be generated. However, if it encounters more than one false positive record and the records are processed on different threads, then the software will generate one log for each thread that processes a false positive record.
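The naming convention makes it straightforward to locate the log files after a job run. The following Python sketch builds a name from a log number and lists the matching files in a directory; the function names are illustrative, and the directory you pass in is whatever you specified for the USPS Log Path.

```python
import os
import re

# Matches dpvl####.log, where #### is four digits.
LOG_NAME = re.compile(r"^dpvl(\d{4})\.log$")


def dpv_log_name(number):
    """Build the log file name for a log number between 1 and 9999."""
    if not 1 <= number <= 9999:
        raise ValueError("log number must be between 1 and 9999")
    return "dpvl%04d.log" % number


def find_dpv_logs(log_dir):
    """Return the DPV false positive log names in log_dir, ordered by number."""
    names = [name for name in os.listdir(log_dir) if LOG_NAME.match(name)]
    return sorted(names, key=lambda name: int(LOG_NAME.match(name).group(1)))
```

When the degree of parallelism is greater than 1, expect find_dpv_logs to return more than one file for a single job run.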
10.2.8.5 Submit to USPS
All NCOALink service providers must submit the false-positive log to the USPS NCSC (National Customer Support Center) via email (dsf2stop@email.usps.gov), with the mailer's name, the total number of addresses processed, and the number of addresses matched. Use “DPV False Positive” as the subject line.
The NCSC uses this information to determine whether the list can be returned to the mailer.
Tip:
When the USPS releases the list that contained the locked record, you should delete the corresponding log file.
10.2.8.6 End users
If a lock occurs, end users of applications made with this product must contact SAP Business User Support to unlock DPV processing.
10.2.8.7 To notify the USPS of DPV locking addresses
Follow these steps only if you have received an alert that DPV false positive addresses are present in your address list and you are either an NCOALink service provider or an NCOALink end user with stop processing alternative enabled.
1. Send an email to the USPS at dsf2stop@usps.gov. Include the following:
Write “DPV False Positive” as the subject line.
Attach the dpvl####.log file or files from the job, where #### is a number between 0001 and 9999.
2. After the USPS has released the list that contained the locked or false positive record, delete the corresponding log files.
3. Remove the record that caused the lock from the database.
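Step 1 can be prepared with a short script. The sketch below uses Python's standard email package to build the message with the required subject line and the log files attached. It only constructs the message; sending it (for example through your mail server) is left to you. The function name and parameters are illustrative assumptions, and the recipient address is passed in by the caller.

```python
import os
from email.message import EmailMessage


def build_false_positive_email(sender, usps_address, log_paths, details=""):
    """Build (but do not send) the notification email with DPV logs attached."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = usps_address
    message["Subject"] = "DPV False Positive"  # subject line required by the steps above
    message.set_content(details or "DPV false positive log files are attached.")
    for path in log_paths:
        with open(path, "rb") as handle:
            message.add_attachment(
                handle.read(),
                maintype="application",
                subtype="octet-stream",
                filename=os.path.basename(path),
            )
    return message
```

The details argument is a convenient place for any information the USPS asks you to include in the body of the message.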
Related Topics
DPV security
10.2.9 Unlocking DPV
If DPV locking occurs, NCOALink full and limited-service providers must email the DPV False Positive log file to the USPS NCSC (National Customer Support Center) to obtain approval and the necessary information to unlock the list.
The software also locks DPV processing. You must contact SAP BusinessObjects Business User Support to obtain an unlock code.
10.2.9.1 To retrieve the DPV unlock code
1. Go to the SAP Service Marketplace (SMP) at http://service.sap.com/message and log a message using the component “BOJ-EIM-DS”.
2. Attach the dpvx.txt file and the log file named dpvl####.log (where #### is a number between 0001 and 9999) to your message.
The dpvx.txt file is located in the DPV directory referenced in the job. The log file is located in the directory specified for the USPS Log Path option in the USA Regulatory Address Cleanse transform.
Note:
If your files cannot be attached to the original message, include the unlock information in the message instead.
3. SAP Business User Support sends you an unlock file named dpvw.txt. Replace the existing dpvw.txt file with the new file.
4. Open your database and remove the record causing the lock.
Note:
Keep in mind that you can only use the unlock code one time. If the software detects another false-positive, you will need to retrieve a new DPV unlock code. Be sure to remove the record that is causing the lock from the database.
10.2.10 DPV No Stats indicators
The USPS uses No Stats indicators to mark addresses that fall under the No Stats category. The software uses the No Stats table when you have DPV enabled in a job. The USPS puts No Stats addresses in three categories:
Addresses that do not have delivery established yet.
Addresses that receive mail as part of a drop.
Addresses that have been vacant for a certain period of time.
10.2.10.1 No Stats table
You must install the No Stats table (dpv_no_stats.dir) before the software performs DPV processing. The No Stats table is supplied by SAP BusinessObjects with the DPV directory install.
The software automatically checks for the No Stats table in the directory folder that you indicate in your job setup. The software performs DPV processing based on the install status of the directory.
dpv_no_stats.dir | Type of processing | Results
Installed | DPV | The software automatically outputs No Stats indicators when you include the DPV_NoStats output field in your job.
Not installed | DPV | The software automatically skips the No Stats processing and does not issue an error message. The software will perform DPV processing but won't populate the DPV_NoStats output field.
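Because the software does not issue an error message when the table is missing, you may want to check for it yourself before a run. A minimal Python sketch, assuming you know the directory folder that your job setup points to (the function name is hypothetical):

```python
import os

NO_STATS_TABLE = "dpv_no_stats.dir"


def no_stats_table_installed(directory_folder):
    """Return True if the No Stats table is present in the directory folder."""
    return os.path.isfile(os.path.join(directory_folder, NO_STATS_TABLE))
```

If the function returns False, expect DPV processing to run with the DPV_NoStats output field left unpopulated.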
10.2.10.2 No Stats output field
Use the DPV_NoStats output field to post No Stats indicator information to an output file.
No Stats means that the address is a vacant property, it receives mail as a part of a drop, or it does not have an established delivery yet.
Related Topics
DPV output fields
10.2.11 DPV Vacant indicators
The software provides vacant information in output fields and reports using DPV vacant counts. The USPS DPV vacant lookup table is supplied by SAP BusinessObjects with the DPV directory install.
The USPS uses DPV vacant indicators to mark addresses that fall under the vacant category. The software uses DPV vacant indicators when you have DPV enabled in your job.
Tip:
The USPS defines vacant as any delivery point that was active in the past, but is currently not occupied (usually over 90 days) and is not currently receiving mail delivery. The address could receive delivery again in the future. Vacant does not apply to seasonal addresses.
10.2.11.1 DPV address-attribute output fields
Vacant indicators for the assigned address are available in the DPV_Vacant output field.
Note:
The US Addressing report contains DPV Vacant counts in the DPV Summary section.
Related Topics
DPV output fields
10.3 USPS eLOT®
eLOT is available for U.S. records in the USA Regulatory Address Cleanse transform only.
eLOT takes line of travel one step further. The original LOT narrowed the mail carrier's delivery route to the block face level (Postcode2 level) by discerning whether an address resided on the odd or even side of a street or thoroughfare.
eLOT narrows the mail carrier's delivery route walk sequence to the house (delivery point) level. This allows you to sort your mailings to a more precise level.
You can enable eLOT in the Assignment options section of the USA Regulatory Address Cleanse configuration file.
Related Topics
Assignment options
Set up the reference files
10.4 Early Warning System (EWS)
EWS helps reduce the amount of misdirected mail caused when valid delivery points are created between national directory updates. EWS is available for U.S. records in the USA Regulatory Address Cleanse transform only.
You can enable EWS in the Assignment options section of the USA Regulatory Address Cleanse configuration file.
Related Topics
Assignment options
10.4.1 Overview of EWS
The EWS feature is the solution to the problem of misdirected mail caused by valid delivery points that appear between national directory updates. For example, suppose that 300 Main Street is a valid address and that 300 Main Avenue does not exist. A mail piece addressed to 300 Main Avenue is assigned to 300 Main Street on the assumption that the sender is mistaken about the correct suffix.
Now consider that construction is completed on a house at 300 Main Avenue. The new owner signs up for utilities and mail, but it may take a couple of months before the delivery point is listed in the national directory. All the mail intended for the new house at 300 Main Avenue will be misdirected to 300 Main Street until the delivery point is added to the national directory.
The EWS feature solves this problem by using an additional directory which informs CASS users of the existence of 300 Main Avenue long before it appears in the national directory. When using EWS processing, the previously misdirected address now defaults to a 5-digit assignment.
10.4.2 EWS directory
The EWS directory contains four months of rolling data. Each week, the USPS adds new data and drops a week's worth of old data. The USPS then publishes the latest EWS data. Each Friday, SAP BusinessObjects converts the data to our format (EWyymmdd.zip) and posts it on the SAP Business User Support site at https://service.sap.com/bosap-downloads-usps.
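If you automate the weekly download, you need the file name for a given week. The Python sketch below assumes that yymmdd in EWyymmdd.zip stands for the two-digit year, month, and day of the Friday on which the file is posted; that interpretation is an assumption, so confirm it against the file names on the download site. The function names are illustrative.

```python
from datetime import date, timedelta


def most_recent_friday(today):
    """Return the Friday on or before the given date."""
    days_back = (today.weekday() - 4) % 7  # Monday is 0, Friday is 4
    return today - timedelta(days=days_back)


def ews_file_name(release_date):
    """Format a file name in the EWyymmdd.zip pattern for the given date."""
    return "EW" + release_date.strftime("%y%m%d") + ".zip"
```

For example, combining the two functions for Thursday, December 9, 2010 yields the name for Friday, December 3, 2010.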
10.5 SuiteLink
SuiteLink is an extra option in the USA Regulatory Address Cleanse transform. SuiteLink uses a USPS directory that contains multiple files of specially indexed address information like secondary numbers and unit designators for locations identified as high-rise business default buildings.
With SuiteLink you can build accurate and complete addresses by adding suite numbers to high-rise business addresses. With the secondary address information added to your addresses, more of your pieces are sorted by delivery sequence and delivered with accuracy and speed.
SuiteLink is not required when you process with CASS enabled. However, the USPS requires that NCOALink full service providers offer SuiteLink processing as an option to their customers.
You can enable SuiteLink in the Assignment options section of the USA Regulatory Address Cleanse configuration file.
Related Topics
Assignment options
10.5.1 Benefits of SuiteLink
Businesses that depend on website, mail, or in-store orders from customers will find that SuiteLink is a powerful money-saving tool. Businesses whose customers reside in buildings that house several businesses also benefit, because marketing materials, bank statements, and orders are delivered right to the customer's door.
The addition of secondary number information to your addresses allows for the most efficient and cost-effective delivery sequencing and postage discounts.
10.5.2 How SuiteLink works
The software uses the data in the SuiteLink directories to add suite numbers to an address. The software matches a company name, a known high-rise address, and the CASS-certified postcode2 in your database to data in SuiteLink. When there is a match, the software creates a complete business address that includes the suite number.
Example: SuiteLink
This example shows a record that is processed through SuiteLink, and the output record with the assigned suite number.
The input record contains:
Firm name (in FIRM input field)
Known high-rise address
CASS-certified postcode2
The SuiteLink directory contains:
secondary numbers
unit designators
The output record contains:
the correct suite number
Input record:
Telera
910 E Hamilton Ave Fl2
Campbell CA 95008 0610
Output record:
TELERA
910 E HAMILTON AVE STE 200
CAMPBELL CA 95008 0625
10.5.3 SuiteLink directory
The SuiteLink directory is distributed monthly. You must use the SuiteLink directory with a ZIP+4 directory labeled for the same month. For example, the December 2010 SuiteLink directory can be used with only the December 2010 ZIP+4 directory.
Caution:
SuiteLink will be disabled if you are running your job in non-certified mode (Non Certified Options > Disable Certification).
You cannot use a SuiteLink directory that is older than 60 days based on its release date. The software warns you 15 days before the directory expires. As with all directories, the software won't process your records with an expired SuiteLink directory.
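The 60-day limit and the 15-day warning described in this section can be expressed as a small date check. This Python sketch is an illustration, not the software's own logic: the function name is hypothetical, and the exact boundaries (expired when the age exceeds 60 days, warning from day 45 onward) are assumptions about how the limits are counted.

```python
from datetime import date

MAX_AGE_DAYS = 60      # directory cannot be older than this
WARNING_DAYS = 15      # warning period before expiration


def suitelink_directory_state(release_date, today):
    """Classify a SuiteLink directory as 'ok', 'warning', or 'expired'."""
    age_days = (today - release_date).days
    if age_days > MAX_AGE_DAYS:
        return "expired"
    if age_days >= MAX_AGE_DAYS - WARNING_DAYS:
        return "warning"
    return "ok"
```

A scheduled check of this kind can remind you to install the next monthly directory before processing stops.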
10.5.4 Improve processing speed
You may increase SuiteLink processing speed if you load the SuiteLink directories into memory. To activate this option, go to the Transform Performance group and set the Cache SuiteLink Directories option to Yes.
10.6 LACSLink®
LACSLink is a USPS product that is available for U.S. records with the USA Regulatory Address Cleanse transform only. LACSLink processing is required for CASS certification.
LACSLink updates addresses when the physical address does not move but the address has changed, for example, when a municipality changes rural route addresses to street-name addresses. Rural route conversions make it easier for police, fire, ambulance, and postal personnel to locate a rural address. LACSLink also converts addresses when streets are renamed or post office boxes renumbered.
LACSLink technology ensures that the data remains private and secure, and at the same time gives you easy access to the data. LACSLink is an integrated part of address processing; it is not an extra step. To obtain the new addresses, you must already have the old address data.
You can enable LACSLink in the Assignment options section of the USA Regulatory Address Cleanse configuration file.
Related Topics
Assignment options
Memory usage and caching for LACSLink processing
LACSLink® security
10.6.1 Benefits of LACSLink
LACSLink processing is required for all CASS customers (beginning with CASS Cycle L).
If you process your data without LACSLink enabled, you won't get the CASS-required reports or postal discounts.
10.6.2 How LACSLink works
LACSLink provides a new address when one is available. LACSLink follows these steps when processing an address:
1. The USA Regulatory Address Cleanse transform standardizes the input address.
2. The transform looks for a matching address in the LACSLink data.
3. If a match is found, the transform outputs the LACSLink-converted address and other LACSLink information.
10.6.3 Conditions for address processing
The transform does not process all of your addresses with LACSLink when it is enabled. Here are the conditions under which your data is passed into LACSLink processing:
The address is found in the address directory, and it is flagged as a LACS-convertible record within the address directory.
The address is found in the address directory, and, even though a rural route or highway contract default assignment was made, the record wasn't flagged as LACS convertible.
The address is not found in the address directory, but the record contains enough information to be sent into LACSLink.
For example, the following table shows an address that was found in the address directory as a LACS-convertible address.
Original address:
RR2 BOX 204
DU BOIS PA 15801
After LACSLink conversion:
463 SHOWERS RD
DU BOIS PA 15801-66675
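The three conditions listed in this section can be summarized as a single decision. The Python sketch below is a paraphrase for illustration only; the transform makes this decision internally, and the function and parameter names are hypothetical. In particular, what counts as "enough information" is not defined in this section, so it is passed in as a ready-made flag.

```python
def eligible_for_lacslink(found_in_directory, flagged_lacs_convertible,
                          rr_or_hc_default_assignment, enough_information):
    """Return True if an address would be passed into LACSLink processing."""
    if found_in_directory:
        # Conditions 1 and 2: flagged as LACS convertible, or a rural route or
        # highway contract default assignment was made without the flag.
        return flagged_lacs_convertible or rr_or_hc_default_assignment
    # Condition 3: not found, but the record has enough information.
    return enough_information
```

The RR2 BOX 204 address in the table above corresponds to the first condition: it was found in the address directory and flagged as LACS convertible.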
10.6.4 LACSLink directory files