Business Objects Data Integrator 11.7.2 User Manual

Data Integrator Release Summary
Data Integrator 11.7.2
for Windows and UNIX
Patents
Business Objects owns the following U.S. patents, which may cover products that are offered and licensed by Business Objects: 5,555,403; 6,247,008 B1; 6,578,027 B2; 6,490,593; and 6,289,352.
Trademarks
Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Xcelsius, Crystal Decisions, Intelligent Question, Desktop Intelligence, Crystal Enterprise, Crystal Analysis, WebIntelligence, RapidMarts, and BusinessQuery are trademarks or registered trademarks of Business Objects in the United States and/or other countries. All other names mentioned herein may be trademarks of their respective owners.
Copyright
© 2007 Business Objects. All rights reserved.
Third-party contributors
Business Objects products in this release may contain redistributions of software licensed from third-party contributors. Some of these individual components may also be available under alternative licenses. A partial listing of third-party contributors that have requested or permitted acknowledgments, as well as required notices, can be found at:
http://www.businessobjects.com/thirdparty
Date
April 26, 2007
Contents
Introduction
  Data Integrator information resources
Overview
New features in versions 11.7.2 and 11.7.0
Extreme scalability
  64-bit UNIX support enhancements
  Data Profiler enhancements
  Distributed data flows
  Load-balancing enhancements
  Pageable cache
  Parallel join enhancements
  Performance Optimization Guide enhancements
  Persistent cache
  Push-down enhancements
Trusted information—Data Quality XI integration
Maximum productivity
  Adapter interface installation enhancements
  Command-line options to export to XML
  Command line to log in to the Designer
  Datastore support enhancements
  Data type enhancements
  Function enhancements
  History preserving transform enhancements
  JMS and SalesForce Interface Integration
  Job scheduling improvements
  Management Console
    Administrator redesign
    New Administrator user role: Operator
    Enhanced Operational Dashboards
  Microsoft Excel workbook as a source
  Multiple source file enhancements
  Password management
  Performance monitor
  Repository support enhancement
  Self-tuning
  Teradata UPSERT functionality
  Tomcat Web server integration
  Variable support extended
  Web services enhancements
  XML schema enhancements

Introduction

Welcome to BusinessObjects Data Integrator XI Release 2 Accelerated version 11.7.2. This document highlights the new features available with this release and also lists features that were new in version 11.7.0.
For important information about this product release, including installation notes, resolved issues, and known issues, see the Data Integrator Release Notes.
Business Objects offers other products that complement Data Integrator and provide additional Enterprise Information Management solutions. These include:
BusinessObjects Composer
BusinessObjects Metadata Manager
BusinessObjects Data Quality
BusinessObjects Data Federator
See the Business Objects Web site or contact a Business Objects sales representative for more information.

Data Integrator information resources

Consult the Data Integrator Getting Started Guide for:
An overview of Data Integrator products and architecture
Data Integrator installation and configuration information
A list of product documentation and a suggested reading path
After you install Data Integrator (with associated documentation), you can view the technical documentation from several locations. To view documentation in PDF format:
If you accepted the default installation, select Start > Programs > Business Objects > Data Integrator > Data Integrator Documentation and select:
Data Integrator Release Notes—Opens the Release Notes PDF, which includes known and fixed bugs, migration considerations, and last-minute documentation corrections
Data Integrator Release Summary—Opens this document, which describes the latest Data Integrator features
Data Integrator Technical Manuals—Opens a “master” PDF document that has been compiled so you can search across the Data Integrator documentation suite
Data Integrator Core Tutorial—Opens the Data Integrator Tutorial PDF, which you can use for basic stand-alone training purposes
Select one of the following from the Designer’s Help menu:
Release Notes
Release Summary
Technical Manuals
Tutorial
Other links from the Designer Help menu include:
Data Integrator Community—Opens a browser window to Diamond, the Business Objects developer community Web site
Knowledge Base—Opens a browser window to the Business Objects Online Customer Support Web site
You can also open the Data Integrator documentation from links on the Start Page that opens automatically when you open the Designer. To open the Designer, choose Start > Programs > Business Objects > Data Integrator > Data Integrator Designer.
To obtain additional information that might have become available following the release of this document, or to find documentation for previous releases (including Release Summaries and Release Notes), visit the Business Objects documentation Web site at http://support.businessobjects.com/documentation/.

Overview

New features in Data Integrator XI Release 2 Accelerated support several key areas:
Extreme scalability
Trusted information—Data Quality XI integration
Maximum productivity
Browse each key area to read summary information for that group of new features.

New features in version 11.7.2

The following features are new in BusinessObjects Data Integrator XI Release 2 Accelerated version 11.7.2. This list itemizes the new features alphabetically.
64-bit UNIX support enhancements
Data Profiler enhancements
Datastore support enhancements
Data type enhancements
Microsoft Excel workbook as a source on UNIX
Repository support enhancement
Web services enhancements

New features in version 11.7.0

New features in Data Integrator XI Release 2 Accelerated version 11.7.0 support several key areas:
Extreme scalability
Trusted information—Data Quality XI integration
Maximum productivity
Browse each key area to read summary information for that group of new features. You can also navigate to specific new features from the following alphabetical list.
Adapter interface installation enhancements
Command-line options to export to XML
Command line to log in to the Designer
Distributed data flows
Function enhancements
History preserving transform enhancements
Job scheduling improvements
Load-balancing enhancements
Management Console
Microsoft Excel workbook as a source on Windows
Multiple source file enhancements
Data Profiler enhancements
Pageable cache
Parallel join enhancements
Password management
Performance monitor
Performance Optimization Guide enhancements
Persistent cache
Push-down enhancements
Self-tuning
Teradata UPSERT functionality
Tomcat Web server integration
Trusted information—Data Quality XI integration
Variable support extended
XML schema enhancements

Extreme scalability

The following features will improve the scalability of your Data Integrator projects.

64-bit UNIX support enhancements

For Solaris and AIX, Data Integrator is now a 64-bit executable rather than a 32-bit executable. Although Business Objects 32-bit executables are designed to perform effectively on 64-bit operating systems, with 64-bit executables you automatically benefit from better processing performance and enhanced memory management, which eliminates the impact of 32-bit memory limitations. Data Integrator is already available as a 64-bit executable on HP Itanium.

Data Profiler enhancements

In release 11.7.2, Data Integrator supports pageable cache for memory-intensive operations in the Data Profiler. The Data Profiler now uses pageable cache to complete the detailed and relationship profiling tasks.
For more information, see “Configuring Job Server run-time resources” on page 86 of the Data Integrator Getting Started Guide.

Distributed data flows

Data Integrator version 11.7.0 provides the ability to distribute the workload across multiple CPUs in a grid. Data Integrator provides capabilities to distribute CPU-intensive and memory-intensive data processing work (such as join, grouping, table comparison and lookups) across multiple CPUs and computers. This work distribution provides the following potential benefits:
Better memory management by taking advantage of more CPU resources and physical memory
Better job performance and scalability by using concurrent sub data flow execution to take advantage of grid computing
You can create sub data flows so that Data Integrator does not need to process the entire data flow in memory at one time. You can also distribute the sub data flows to different job servers within a server group to use additional memory and CPU resources.
For more information, see Chapter 7, “Distributing Data Flow Execution,” in the Data Integrator Performance Optimization Guide.

Load-balancing enhancements

Data Integrator version 11.7.0 provides the ability to distribute the workload across multiple servers in a grid.
You can distribute the execution of a job or a part of a job across multiple Job Servers within a Server Group to better balance resource-intensive operations. You can specify the following distribution levels when you execute a job:
Job level—A job can execute on an available Job Server.
Data flow level—Each data flow within a job can execute on an available Job Server.
Sub data flow level—A resource-intensive operation (such as a sort, table comparison, or table lookup) within a data flow can execute on an available Job Server.
For more information, see “Using grid computing to distribute data flows execution” on page 102 of the Data Integrator Performance Optimization Guide.

Pageable cache

In version 11.7.0, Data Integrator extends the maximum amount of memory you can use by using a packaged memory tool with paging capabilities. This capability means that Data Integrator can page information to disk instead of holding it all in memory, so the usable cache is limited only by disk space. This feature supports memory-intensive operations such as joining heterogeneous sources, in-memory aggregations, and large lookups, as well as some memory-intensive transforms. Now, instead of pushing these operations down to the database, you can perform all of them in Data Integrator without running out of memory.
For more information, see Chapter 6, “Using Caches,” in the Data Integrator Performance Optimization Guide.

Parallel join enhancements

Data Integrator version 11.7.0 provides an additional parallel hash join that you can use to improve joins of large volumes of data. In addition, you can distribute the parallel join execution over multiple sub data flows. For more information, see “Degree of parallelism and joins” on page 77 of the Data Integrator Performance Optimization Guide.

Performance Optimization Guide enhancements

Version 11.7.0 of Data Integrator provides a reorganized and enhanced Performance Optimization Guide to help you measure and tune performance of your ETL jobs. The new organization provides examples of tools to measure performance and determine performance bottlenecks, and it suggests tuning methods that subsequent chapters describe in detail. The enhancements include scenarios and examples that demonstrate usage of the new Extreme Scalability features in Data Integrator. For more information, see the Data Integrator Performance Optimization Guide.

Persistent cache

The persistent cache feature of Data Integrator version 11.7.0 allows you to store a large amount of table data to which multiple data flows require access. For example, you might have a very large lookup table that you want to cache in memory. You can store the lookup table in a persistent cache table which Data Integrator quickly pages into memory when each data flow executes.
For more information, see Chapter 6, “Using Caches,” in the Data Integrator Performance Optimization Guide.

Push-down enhancements

Version 11.7.0 of Data Integrator provides an easy, one-step way to push down resource-intensive operations in subsequent transforms in a data flow. You no longer need to manually save the data from the source into temporary storage to use as a source in a different data flow so that it can be pushed down to the database server. Instead, you add the Data_Transfer transform to your data flow and Data Integrator saves the data so that transforms later in your data flow can be pushed down. For details, see “Data_Transfer transform for push-down operations” on page 43 of the Data Integrator Performance Optimization Guide.

Trusted information—Data Quality XI integration

Version 11.7.0 of Data Integrator offers tight integration with Data Quality XI (previously Firstlogic IQ8). Data Quality projects created in the Project Architect can be imported as Data Quality transforms in Data Integrator. You can use these transforms in data flows to cleanse, match, and merge data to improve its quality.

Maximum productivity

The following products and features can enhance your productivity when working with Data Integrator.

Adapter interface installation enhancements

In version 11.7.0 of Data Integrator, the JMS and SalesForce.com adapter interfaces now install automatically. See “JMS and SalesForce Interface Integration” on page 79 of the Data Integrator Getting Started Guide.

Command-line options to export to XML

Version 11.7.0 of Data Integrator provides options on the al_engine command to export an entire repository or individual objects to an XML file. This feature facilitates migration from one repository to another (for example, DEV, TEST, and PROD) when you include the command in scripts. For more information, see “Command line options to export objects to an XML file” on page 34 of the Data Integrator Advanced Development and Migration Guide.

Command line to log in to the Designer

Version 11.7.0 of Data Integrator provides the ability to log in to the Designer from the command line. This feature facilitates logging in to multiple repositories. For more information, see “Command line login to the Designer” on page 25 of the Data Integrator Advanced Development and Migration Guide.

Datastore support enhancements

In previous versions of Data Integrator, you could use a generic ODBC connection to access Netezza, MySQL, and Data Federator sources. In Data Integrator 11.7.2, these three sources are available as separate items in the list of supported database types, making them easier to configure.
With the connection to Data Federator, Business Objects provides users with an easy way to create historical snapshots by moving data from Data Federator’s virtual tables to physical tables stored in any relational database.
For more information on Datastores, see Chapter 5, “Datastores,” in the Data Integrator Designer Guide.

Data type enhancements

With the release of Data Integrator 11.7.2, you can import Oracle Character Large Object (clob) and National Character Large Object (nclob) columns as long data type in datastores associated with Oracle 9i and later versions. You can use Data Integrator to read, load (including bulk load), and update both clob and nclob columns. Data Integrator can process up to 2 GB of clob and nclob data.
For more information, see “long” on page 238 of the Data Integrator Reference Guide.

Function enhancements

In version 11.7.0, Data Integrator expands the list of available built-in functions to make it easier for developers to create more complex calculations using more Data Integrator analytical capabilities. New functions include mathematical functions (sqrt, log, power), aggregation functions (count_distinct), string functions (asc, chr), and many more.
A new category of functions makes it possible to compare values between different rows in a table, making it easy to calculate trends (Previous_Row_Value) and detect changes in values.
Furthermore, several functions have been improved based on customer feedback; for example, the week_in_year function can now return week numbers that follow the ISO standard predominantly used in Europe.
For more information, see “Descriptions of built-in functions” on page 396 of the Data Integrator Reference Guide.
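To illustrate what ISO week numbering means, here is a minimal Python sketch. It is not Data Integrator syntax and does not call week_in_year; it uses Python's standard date.isocalendar(), and the helper name iso_week and the sample dates are assumptions chosen for illustration. Under ISO 8601, week 1 is the week that contains the year's first Thursday, so the first days of January can belong to the last week of the previous year.

```python
from datetime import date


def iso_week(d: date) -> int:
    """Return the ISO 8601 week number (1-53) for a date."""
    # isocalendar() returns (ISO year, ISO week, ISO weekday).
    return d.isocalendar()[1]


# January 1, 2005 was a Saturday, so it belongs to ISO week 53 of 2004.
print(iso_week(date(2005, 1, 1)))

# April 26, 2007 was a Thursday in ISO week 17.
print(iso_week(date(2007, 4, 26)))
```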

History preserving transform enhancements

As of version 11.7.0, the History Preserving transform now contains an extra option to set the Valid to column of your old record. Now you can set the Valid to result for the same day or the previous day.
For more information, see “History_Preserving” on page 302 of the Data Integrator Reference Guide.
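The two Valid to choices can be sketched in Python as follows. This is only an illustration of the date arithmetic, not the transform's implementation; the function name close_old_record and the use_previous_day flag are hypothetical.

```python
from datetime import date, timedelta


def close_old_record(new_valid_from: date, use_previous_day: bool) -> date:
    """Return the Valid to date to stamp on the superseded record.

    new_valid_from is the date on which the new version of the row
    becomes valid (hypothetical parameter name).
    """
    if use_previous_day:
        # The old record ends the day before the new record starts.
        return new_valid_from - timedelta(days=1)
    # The old record ends on the same day the new record starts.
    return new_valid_from


change_date = date(2007, 4, 26)
print(close_old_record(change_date, use_previous_day=False))
print(close_old_record(change_date, use_previous_day=True))
```

Choosing the previous day gives non-overlapping validity ranges when the range endpoints are treated as inclusive.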

JMS and SalesForce Interface Integration

In previous versions of Data Integrator, the JMS and SalesForce Interfaces were installed through a separate installation process that you ran after installing Data Integrator. These interfaces are now integrated with Data Integrator as optional components that you select during installation.
Now, when you upgrade Data Integrator, you can choose to keep your previously installed interfaces, or choose to install the new, integrated Salesforce.com and JMS adapter interfaces.
For details on these interfaces, see “JMS and SalesForce Interface Integration” on page 79 of the Data Integrator Getting Started Guide.

Job scheduling improvements

Version 11.7.0 of Data Integrator includes several enhancements for scheduling jobs including:
The ability to create and manage job schedules in Business Objects Enterprise
The option to execute a job multiple times in a day over a specified time range and interval
A Repository Schedules page that displays all schedules for all jobs in a repository. You can then link directly to a schedule’s configuration page.
For details, see “Scheduling jobs in BusinessObjects Enterprise” on page 60 of the Data Integrator Management Console: Administrator Guide.
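As a rough illustration of running a job several times a day over a time range and interval, the following Python sketch lists the run times such a schedule would produce. It is not the scheduler's implementation; the function name run_times is hypothetical, and treating the end of the range as inclusive is an assumption.

```python
from datetime import datetime, timedelta


def run_times(start: datetime, end: datetime, interval: timedelta) -> list:
    """List the execution times from start to end (inclusive) at a fixed interval."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += interval
    return times


# Every 30 minutes between 08:00 and 10:00 on one day.
for t in run_times(datetime(2007, 4, 26, 8, 0),
                   datetime(2007, 4, 26, 10, 0),
                   timedelta(minutes=30)):
    print(t.strftime("%H:%M"))
```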

Management Console

As of version 11.7.0, you can now launch all of the Web-based Data Integrator applications from a single interface called the Management Console. These applications include:
Administrator
Impact and Lineage Analysis
Operational Dashboards
Data Validation dashboards (formerly called Data Quality dashboards)
Auto Documentation

Administrator redesign

The Administrator (formerly called the Web Administrator) has been redesigned with an enhanced look and feel and more intuitive navigation, but it retains the familiar functionality of previous versions. It has also been combined with the metadata reporting applications and therefore launches from a unified Management Console.

New Administrator user role: Operator

In addition to the two existing Administrator roles (Administrator and Monitor), Data Integrator introduces the new Operator role, which has the same permissions as Monitor plus additional permissions for executing and scheduling jobs.

Enhanced Operational Dashboards

The Operational Dashboards application of metadata reporting provides enhanced graphics using Crystal Xcelsius.

Microsoft Excel workbook as a source

As of version 11.7.0, you can import a Microsoft Excel workbook on Windows directly without using ODBC. You can import the schema from a named range defined in the workbook, a custom range in a worksheet (for example A1:C10), or all fields. For details, see “Excel workbook format” on page 106 of the Data Integrator Reference Guide.
As of version 11.7.2, you can use a Microsoft Excel workbook as a source on UNIX. For more information, see “Designer Guide” on page 41 of the Data Integrator Release Notes and “Excel workbook format” on page 106 of the Data Integrator Reference Guide.

Multiple source file enhancements

Version 11.7.0 of Data Integrator provides the option to identify the specific source file for each row when reading multiple source files at once. You can read multiple files at one time by using wildcard characters in the source file names for COBOL copybooks, Excel workbooks, flat files, IDoc files, and XML files. Your source output can now include a column that contains the source file name for each row.
For details, see Chapter 2, “Data Integrator Objects,” in the Data Integrator Reference Guide and “About SAP R/3 reference information” on page 180 of the Data Integrator Supplement for SAP.
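The idea of reading several files through a wildcard and tagging each row with its file of origin can be sketched in Python as follows. This is an illustration only, not how Data Integrator reads sources; the function name read_with_source, the column name source_file_name, and the assumption of comma-separated files with a header row are all hypothetical.

```python
import csv
import glob
import os


def read_with_source(pattern):
    """Yield every row from each file matching the wildcard pattern.

    Each row gains an extra column (hypothetical name: source_file_name)
    that holds the name of the file the row came from.
    """
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                row["source_file_name"] = os.path.basename(path)
                yield row


# Example call (the file names are illustrative):
# for row in read_with_source("orders_*.csv"):
#     print(row["source_file_name"], row)
```

Sorting the matched paths keeps the row order stable between runs, which makes the output easier to compare.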

Password management

Data Integrator uses several types of user accounts and associated passwords. For various reasons, database account parameters such as user names or passwords change. For example, perhaps your company’s compliance and regulations policies require periodically changing account passwords for security. To accommodate these types of changes, version 11.7.0 of Data Integrator provides new mechanisms for:
Updating local repository login parameters—Using an optional, portable password file, you no longer need to manually regenerate job schedules associated with changed database or repository parameters
Updating datastore connection parameters—You can now update datastore connection parameters from the Administrator (instead of installing Designer and doing it from the datastore editor)
See the Data Integrator Management Console: Administrator Guide for details.

Performance monitor

Version 11.7.0 of Data Integrator provides a new Performance Monitor to help you investigate performance bottlenecks. You can view performance statistics in both graphical and tabular formats. When you execute jobs with the Collect statistics for monitoring option, you can view memory usage statistics.
For more information, see “Reading the Performance Monitor for execution statistics” on page 27 of the Data Integrator Performance Optimization Guide.

Repository support enhancement

Version 11.7.2 of Data Integrator allows you to host local, central, and profiler repositories on a MySQL database. Additionally, you can administer local, central, and profiler repositories created on a MySQL database and generate reports, documentation, and dashboards from the Management Console.
For more information on creating a MySQL repository, see “Creating a Data Integrator repository” on page 22 of the Data Integrator Designer Guide. For more information on administering local, central, and profiler repositories on a MySQL database, see the Data Integrator Management Console: Administrator Guide.

Self-tuning

As of version 11.7.0, Data Integrator uses cache statistics to automatically determine the optimal cache type for subsequent job executions. For more information, see “Using statistics for cache self-tuning” on page 61 of the Data Integrator Performance Optimization Guide.

Teradata UPSERT functionality

A Teradata UPSERT operation updates a row if a matching row exists; if no row matches, Teradata inserts the row instead. As of version 11.7.0, the Data Integrator Teradata bulk loader supports this functionality.
For more information, see “Bulk loading in Teradata” on page 123 of the Data Integrator Performance Optimization Guide.
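The update-or-insert logic can be illustrated with a small Python sketch. It models a table as a dictionary keyed by primary key; it does not use Teradata or the bulk loader, and the function name upsert and the sample data are hypothetical.

```python
def upsert(table, key, values):
    """Update the row stored under key, or insert it if no row matches.

    table is a dict that maps a primary key to a row (itself a dict).
    Returns "updated" or "inserted" to show which path was taken.
    """
    if key in table:
        table[key].update(values)
        return "updated"
    table[key] = dict(values)
    return "inserted"


customers = {1: {"name": "Ann", "city": "Paris"}}

# Key 1 exists, so its row is updated in place.
print(upsert(customers, 1, {"city": "Lyon"}))

# Key 2 does not exist, so a new row is inserted.
print(upsert(customers, 2, {"name": "Bob", "city": "Rome"}))
```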

Tomcat Web server integration

As of version 11.7.0, you can select an existing (previously installed) Tomcat instance for the Administrator rather than installing a new one with Data Integrator. Additionally, most EIM Web applications including the Administrator, Metadata Manager, and Composer can now use the same Tomcat server.

Variable support extended

With version 11.7.0 of Data Integrator, you can use variables in sources, targets, and transforms. Now, variables are supported in several places including the Date Generation, History Preserving, Key Generation, and Row Generation transforms as well as XML and flat file sources, database targets, COBOL copybooks, and IDOC sources.

Web services enhancements

The 11.7.2 release of Data Integrator provides enhanced Web service security as well as support for importing your Data Integrator-generated WSDL file to .NET.
Now, when you create or update a Web service adapter datastore, Data Integrator provides User Name and Password options to support HTTP basic authentication.
Because the .NET environment requires that you import only WSDL files containing XML schemas with complex types, Data Integrator now generates an enhanced WSDL file that redefines XML schemas with simple types as XML schemas with complex types.
For more information on Web service support, see “Support for Web services” on page 147 of the Data Integrator Management Console: Administrator Guide.
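For readers unfamiliar with HTTP basic authentication, the following Python sketch shows how a user name and password are generally encoded into a request header. It illustrates the standard scheme only, not Data Integrator's internal behavior; the function name basic_auth_header and the sample credentials are assumptions.

```python
import base64


def basic_auth_header(user_name, password):
    """Build the HTTP Authorization header for basic authentication."""
    # The credentials are joined with a colon and Base64-encoded.
    raw = f"{user_name}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


print(basic_auth_header("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so basic authentication is normally combined with a secure transport such as HTTPS.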

XML schema enhancements

Version 11.7.0 of Data Integrator allows you to import more XML elements including abstract types, substitution groups, and mixed content. For more information, see “Importing XML Schemas” on page 218 of the Data Integrator Designer Guide.