The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2011. SAS®Analytics Accelerator 1.3 for
Teradata: Guide. Cary, NC: SAS Institute Inc.
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or
by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS
Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time
you acquire this publication.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the
U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial
Computer Software-Restricted Rights (June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.
1st electronic book, February 2011
SAS®Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest
potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web
site at support.sas.com/publishing or call 1-800-727-3228.
SAS®and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in
the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
What’s New in SAS Analytics Accelerator 1.3 for Teradata F 3
What’s New in SAS Analytics Accelerator 1.3 for Teradata
Overview
This release contains the first implementation of in-database functionality for SAS®Enterprise Miner™.
SAS Analytics Accelerator 1.3 for Teradata supports three SAS Enterprise Miner procedures: DMDB,
DMINE, and DMREG.
Overview of In-Database Computing
SAS applications are often built to work with large volumes of data in environments that demand rigorous IT
security and management. When the data are stored in an external database, such as a Teradata database, the
transfer of large data sets to the computers that run the SAS System can cause a performance bottleneck and
possible unwanted security and resource management consequences for local data storage. SAS Analytics
Accelerator 1.3 for Teradata addresses these challenges by moving computational tasks closer to the data
and by improving the integration between the SAS System and the database management system (DBMS).
At present, there are in-database versions of six SAS/STAT procedures, one SAS/ETS procedure, and three
SAS Enterprise Miner procedures. The SAS/STAT in-database procedures include CANCORR, FACTOR,
PRINCOMP, REG, SCORE, and VARCLUS; the SAS/ETS procedure is TIMESERIES; and the SAS Enterprise Miner procedures are DMDB, DMINE, and DMREG.
There are also five Base SAS in-database procedures (CORR, FREQ, MEANS, RANK, and SUMMARY),
but Base SAS in-database procedures, with the exception of the CORR procedure, do not require the installation of SAS Analytics Accelerator 1.3 for Teradata. Table 1.1 lists the software licenses that are required
to take advantage of in-database computing.
4 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Table 1.1 Software Licensing Requirements for In-Database Computing
ProceduresLicenses Required
SAS/STAT proceduresBase SAS 9.2M3
(CANCORR, FACTOR, PRINCOMP,SAS/ACCESS 9.2 (SAS/ACCESS Interface to Teradata)
REG, VARCLUS, and SCORE)SAS/STAT 9.22
SAS Analytics Accelerator 1.3 for Teradata
SAS/ETS procedureBase SAS 9.2M3
(TIMESERIES)SAS/ACCESS 9.2 (SAS/ACCESS Interface to Teradata)
SAS/ETS 9.22
SAS Analytics Accelerator 1.3 for Teradata
SAS Enterprise Miner procedures
(DMDB, DMINE, and DMREG)Base SAS 9.2M3
SAS/ACCESS 9.2 (SAS/ACCESS Interface to Teradata)
SAS Enterprise Miner 6.2
SAS Analytics Accelerator 1.3 for Teradata
Base SAS statistical proceduresBase SAS 9.2M3
(FREQ, MEANS, RANK, and SUMMARY)SAS/ACCESS 9.2 (SAS/ACCESS Interface to Teradata)
BASE SAS statistical proceduresBase SAS 9.2M3
(CORR)SAS/ACCESS 9.2 (SAS/ACCESS Interface to Teradata)
SAS Analytics Accelerator 1.3 for Teradata
All SAS Analytics Accelerator 1.3 for Teradata in-database functions require Teradata 13.00.00.15 or higher.
Installation instructions for SAS Analytics Accelerator 1.3 for Teradata are published in the section “Configuring SAS Analytics Accelerator for Teradata” in the Configuration Guide for SAS 9.2 Foundation forUNIX Environments and the Configuration Guide for SAS 9.2 Foundation for Microsoft Windows.
In SAS 9.2M3, SAS/STAT 9.22, and SAS/ETS 9.22 in-database computing is available only for the Teradata
DBMS. Future releases might provide in-database computing for other DBMSs.
Teradata and the SAS System
In a conventional environment where in-database computing is not possible, a large amount of data is stored
in the Teradata database. When a SAS procedure executes, it must read the data through the SAS/ACCESS
engine. This movement of data from the database to the SAS workspace server causes a performance
penalty. In an environment where in-database computing is possible, such as that provided by SAS Analytics
Accelerator 1.3 for Teradata, when a comparable Base SAS, SAS/STAT, SAS/ETS, or SAS Enterprise Miner
in-database procedure executes, some of the procedure’s computations are performed within Teradata, and
only the results of those computations, rather than the raw data, are passed to the SAS workspace server.
The benefit is a reduction in data movement between the database and the SAS workspace server. The cost
SQL Generation F 5
of this reduction in data movement is the difference in processing time due to using Teradata versus the SAS
System to process the data. The trade-off in processing time versus data movement time depends on several
dynamic factors, including client and server workloads, network speed and workload, and data size. The net
benefit from using in-database procedures varies based on the particular environment. In general, the net
benefit increases as the number of rows in a database table increases.
SQL Generation
When performing in-database modeling, a SAS/STAT, SAS/ETS, or SAS Enterprise Miner procedure dynamically generates SQL code, which is based on the procedure options and statements. It then submits the
SQL code directly to the database. The code can be standard SQL that can be interpreted by any database,
or it can be tuned specifically for Teradata. The choice for the type of SQL code is determined by the complexity of the required analysis. The code can include SAS formats that are executed in Teradata. The query
returns result sets that are used by the in-database procedure to complete the analysis before supplying the
results to one of the following: the SAS output listing, the ODS listing, ODS Graphics, or output tables.
The end result is usually the same as when you use Base SAS tables rather than Teradata tables. However,
with in-database computing the relational database software is responsible for optimizing and executing the
query. When the table to be analyzed is very large, the in-database computing approach can result in significantly lower total elapsed processing time and in reduced data movement between the SAS System and the
Teradata database.
SAS In-Database Functions
When a SAS/STAT, SAS/ETS, or SAS Enterprise Miner procedure dynamically creates and submits SQL
code, this code includes references to new user-defined functions (UDFs), developed by SAS, which are
installed on the Teradata system. These UDFs are the key to performing advanced statistical computations efficiently in-database. For example, SAS has developed a Teradata UDF called SAS_TACORR for
computing uncorrected sum of squares and crossproducts (SSCP) matrices (also known as X’X matrices)
and related statistics in-database. The SAS/STAT procedures CANCORR, FACTOR, PRICOMP, REG,
and VARCLUS have been modified to take advantage of the SAS_TACORR UDF, which enables them
to perform advanced statistical computations on a DBMS table without downloading the entire table with
SAS/ACCESS software.
To better understand how the SAS_TACORR UDF enhances computing efficiency, consider the following.
The SSCP matrix is a condensed representation of the relationships between variables. The size of the
SSCP matrix is determined by the number of variables that are used by the procedure and does not depend
on the number of rows. For example, a data set that contains 100 numeric columns and 5 million rows of
data produces an SSCP matrix that contains, nominally, 100x100 cells. However, a SAS/STAT in-database
procedure needs only the lower triangle of 100x50 cells. Thus, the data that is transferred from Teradata to
SAS is reduced from 500,000,000 raw data cells to 5,000 SSCP data cells, which is a reduction of 99.999
percent. As the example illustrates, the benefit of computing the SSCP matrix in-database versus passing
the raw data to the SAS System increases greatly as the number of rows increases. Large data applications
benefit the most from in-database processing.
6 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
The SAS/STAT in-database procedure constructs SQL code that uses the SAS in-database function to create the SSCP matrix. This function is not intended for general-purpose SQL queries, but it is useful for
SAS/STAT in-database procedures. The elements of the SSCP are created and transferred to the SAS System. After the crossproducts have been transferred to the SAS System, the database is available to continue
processing queries while the SAS/STAT procedure continues the analysis.
SAS In-Database Procedures
Several SAS/STAT, SAS/ETS, SAS Enterprise Miner, and Base SAS procedures have been modified to
move critical data-intensive operations into the database. Such operations include basic summarization and
exploratory data analysis. These SAS procedures are commonly found in many SAS programs and represent
a large opportunity to improve efficiency when working with relational databases. Current changes are
optimized for the Teradata relational database. The modifications to the SAS/STAT, SAS/ETS, and SAS
Enterprise Miner in-database procedures, as well as the Base SAS CORR procedure, require the installation
of the SAS Analytics Accelerator 1.3 for Teradata and are documented here.
Base SAS In-Database Procedures
Base SAS currently offers in-database implementations of the CORR, FREQ, MEANS, RANK, and SUMMARY procedures. The Base SAS in-database procedures work differently from the SAS/STAT, SAS/ETS,
and SAS Enterprise Miner in-database procedures and do not require the installation of the SAS Analytics
Accelerator for Teradata (with the exception of the CORR procedure). The Base SAS in-database procedures are included in the Base SAS software and are documented in Base SAS Procedures Guide. However,
some notes about the options that affect the Base SAS in-database procedures are included in this document
for convenience.
SAS/STAT In-Database Procedures
In-database versions of the CANCORR, FACTOR, PRINCOMP, REG, SCORE, and VARCLUS procedures
are available in SAS/STAT 9.22. Each procedure generates SQL code which, except for the SCORE procedure, calls the SAS in-database function for SSCP creation. These procedures are commonly used in
exploratory data analysis and regression model building. The procedure syntax and output for in-database
computing is identical to the syntax and output for conventional SAS processing in most cases. In those
cases, existing procedure steps in SAS programs are able to run in-database without modification.
However, some issues with in-database processing can cause the output to differ because of numerical precision issues and BY processing issues. These issues are discussed in the section “Conditions That Prevent
In-Database Processing.” Together with the in-database Base SAS procedures, the SAS/STAT procedures
can be used for basic tasks that are often executed at the beginning of larger analytical tasks. Use of these
techniques can greatly reduce the total data transfer between relational databases and the SAS System.
SAS/STAT In-Database Procedures F 7
The CANCORR Procedure
The CANCORR procedure performs canonical correlation, partial canonical correlation, and canonical redundancy analysis.
Canonical correlation is a generalization of multiple correlation for analyzing the relationship between two
sets of variables. In multiple correlation, you examine the relationship between a linear combination of a
set of explanatory variables, X, and a single response variable, Y. In canonical correlation, you examine
the relationship between linear combinations of the set of X variables and linear combinations of a set of
Y variables. These linear combinations are called canonical variables or canonical variates. Either set of
variables can be considered explanatory or response variables, since the statistical model is symmetric in
the two sets of variables. Simple and multiple correlation are special cases of canonical correlation in which
one or both sets contain a single variable.
The SSCP matrix is computed in-database and is transferred to the SAS System for further processing. The
options in the CANCORR procedure that are affected by in-database processing are detailed in the section
“CANCORR Procedure Options Affected by In-Database Computing.”
The FACTOR Procedure
The FACTOR procedure performs a variety of common factor and component analyses and rotations. The
methods for factor extraction are principal component analysis, principal factor analysis, iterated principal
factor analysis, unweighted least squares factor analysis, maximum likelihood (canonical) factor analysis,
alpha factor analysis, image component analysis, and Harris component analysis. A variety of methods for
prior communality estimation are also available.
The SSCP matrix is computed in-database and is transferred to the SAS System for further processing. The
options in the FACTOR procedure that are affected by in-database processing are detailed in the section
“FACTOR Procedure Options Affected by In-Database Computing.”
The PRINCOMP Procedure
The PRINCOMP procedure computes a principal component transformation of the SSCP matrix into orthogonal components. The first component accounts for the maximum amount of variation; the second
accounts for the next largest amount of variation; and so on. This procedure is typically used in exploratory
data analysis and visualization. Scatter plots that are generated from the principal component dimensions
often reveal interesting relationships among the data points.
Principal component analysis (PCA) reduces the dimensionality of data for predictive modeling by replacing
the variables in the original data with fewer principal component terms in the final model. The SSCP matrix
is computed in-database and is transferred to the SAS System for further processing. The options in the
PRINCOMP procedure that are affected by in-database processing are detailed in the section “PRINCOMP
Procedure Options Affected by In-Database Computing.”
8 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
The REG Procedure
The REG procedure computes a model in which a dependent variable is modeled as a linear equation of
multiple independent variables by a least squares function. PROC REG has numerous model-fitting options,
which include many model selection variations and hypothesis tests. Regression analysis is often used to
identify a subset of independent variables that have unique and significant relationships with the dependent
variable. In this case, PROC REG is used for its exploratory analysis, in addition to its extensive model
fitting and reporting functions. You might want to use the REG procedure to compute candidate models in
the database. A full SSCP matrix is computed in Teradata and is transferred to the SAS System, in which
the model fitting occurs.
PROC REG includes an option that uses a CORR or SSCP matrix as input and creates a CORR or SSCP matrix as output. For both in-database processing and matrix input, any option that requires access to individual
rows of data is disabled. The disabled options include the general class of residual analysis, plot options,
and confidence intervals. Complete details of which options and statements are affected by in-database
processing are provided in the section “REG Procedure Options Affected by In-Database Computing.”
The SCORE Procedure
Many statistical procedures create output data sets (using the OUTEST= or OUTSTAT= options) that contain coefficients. The SCORE procedure can apply the coefficients in those data sets to a raw data set and
compute new variables that are generically called scores. Each new score variable is formed as a linear
combination of the raw data and the scoring coefficients. That is, for each observation in the raw data set,
PROC SCORE multiplies the value of a variable in the raw data set by the matching scoring coefficient
from the data set of scoring coefficients. This multiplication process is repeated for each variable in the
VAR statement. The resulting products are then summed to produce the value of the new score variable. In
other words, PROC SCORE performs a matrix multiplication on the two data sets.
For example, you can use the SCORE procedure to produce output in the form of a table in Teradata that
contains predicted values or residuals from a model that is estimated using the REG procedure in-database.
The SCORE procedure dynamically generates SQL code for the given model. PROC SCORE then submits
the SQL code to the database, which produces a Teradata table without having to extract any rows of data into
the SAS System. Details of which options in the SCORE procedure are affected by in-database processing
are provided in the section “SCORE Procedure Options Affected by In-Database Computing.”
PROC SCORE cannot be used for scoring nonlinear models.
The following SAS procedures produce output data sets that contain scoring coefficients that can be used
by PROC SCORE:
ACECLUS procedure
CALIS procedure
CANCORR procedure
CANDISC procedure
DISCRIM procedure
SAS/ETS In-Database Procedures F 9
FACTOR procedure
PRINCOMP procedure
TCALIS procedure
VARCLUS procedure
ORTHOREG procedure
QUANTREG procedure
REG procedure
ROBUSTREG procedure
You can also use a DATA step or the IML procedure to output any coefficients you want in a data set to use
with PROC SCORE.
The VARCLUS Procedure
The VARCLUS procedure groups variables into clusters that are based on their correlations. The full SSCP
matrix is created, and then clusters are chosen to maximize the variance that is associated with the first
principal component within the cluster. An iterative process assigns variables to clusters to maximize the
sum across clusters of the variance of the original variables, which is explained by the centroid cluster
measure.
Similar to principle components, variable clustering is another technique that is used for exploratory data
analysis and for variable reduction, which reduces the number of terms that are used in successive analyses.
The SSCP matrix is computed in-database and is then transferred to the SAS System for further processing.
Details of which options in the VARCLUS procedure are affected by in-database processing are provided in
the section “VARCLUS Procedure Options Affected by In-Database Computing.”
SAS/ETS In-Database Procedures
The TIMESERIES procedure analyzes time-stamped transactional data with respect to time and accumulates
the data into a time series format. The procedure can perform trend and seasonal analysis on the transactions.
After the transactional data are accumulated, time domain analysis and frequency domain analysis can be
performed on the accumulated time series.
Details of which options in the TIMESERIES procedure are affected by in-database processing are provided
in the section “TIMESERIES Procedure Options Affected by In-Database Computing.”
10 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
SAS Enterprise Miner In-Database Procedures
The DMDB Procedure
SAS Enterprise Miner architecture is based on the creation of a data mining database (DMDB) that is a
snapshot of the original data. PROC DMDB creates this DMDB from the input data source. It also compiles
and computes metadata information about the input data based on variable roles and stores it in a metadata
catalog. The DMDB and the associated metadata catalog facilitate subsequent data mining activities. They
are both designed for processing and storage efficiencies.
The DMINE Procedure
Many data mining databases have hundreds of potential model inputs. The DMINE procedure enables you
to quickly identify the input variables that are useful for predicting the target variables based on a linear
models framework. The procedure facilitates ordinary least squares or logistic regression methods.
The DMREG Procedure
The DMREG procedure enables you to fit both linear and logistic regression models. Linear regression
attempts to predict the value of a continuous target as a linear function of one or more independent inputs.
Logistic regression attempts to predict the probability that a categorical (binary, ordinal, or nominal) target
will acquire the event of interest as a function of one or more independent inputs. The procedure supports
forward, backward, and stepwise selection methods. It also enables you to score data sets or generate SAS
DATA step code to score a data set.
Limitations of In-Database Procedures
Some features of SAS procedures do not execute in-database. The ability to perform in-database processing
depends on the specific type of analysis that is provided by the statements and options of a procedure. These
limitations are considered reasonable for the use case of accessing and exploring data. After a subset of data
has been selected and a model form has been established, a data sample can be defined by using PROC SQL.
This subset of data can then be accessed directly by the SAS procedure to execute an analysis that includes
functions that do not execute in-database. Use of the SAS in-database procedures reduces the amount of
data that is transferred and makes use of the resources of the Teradata system.
Performance and Numerical Accuracy Issues with In-Database Computing F 11
Performance and Numerical Accuracy Issues with In-Database
Computing
Consider the following issues when performing in-database computations:
In-database computation can be either faster or slower than computation that uses SAS/ACCESS
software, depending on client performance, network performance, server performance, and the type
of computations involved.
In-database computation tends to be slow when the number of variables in the analysis is large or
when the number of BY groups is large.
Additional overhead is involved with in-database computation, and therefore more memory is required
on the client for in-database computation compared to out-of-database computation.
The SSCP matrix is computed with less accuracy in-database when the variables have small coeffi-
cients of variation.
Enabling and Controlling In-Database Computing
You control in-database computing with options in the LIBNAME statement and the OPTIONS statement.
SQLGENERATION=NONE | DBMS | ALL
specifies the type of in-database computing to be performed. The SQLGENERATION= option can
be specified as either a LIBNAME statement option or as a system option in an OPTIONS statement.
The value used for the SQLGENERATION= option depends on whether the option is specified in the
LIBNAME statement, an OPTIONS statement, or both as follows:
If the SQLGENERATION= option is specified in the LIBNAME statement for a data set, then
with respect to that data set, any value of the SQLGENERATION= option that is specified in
any OPTIONS statement is ignored.
If the SQLGENERATION= option is not specified in a LIBNAME statement for a data set, then
with respect to that data set, the value of the SQLGENERATION= option that is specified in the
most recent OPTIONS statement is used.
If the SQLGENERATION= option is not specified in either the LIBNAME statement for a data
set or in an OPTIONS statement, then by default with respect to that data set, SQLGENERATION=DBMS.
You can specify different values of the SQLGENERATION= option for the DATA= and OUT= data
sets by using different LIBNAME statements for the two data sets.
12 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
The values of the SQLGENERATION option are interpreted as follows:
NONEspecifies that no in-database computation be performed. In this case, a SAS/STAT
procedure uses SAS/ACCESS to copy the raw data from Teradata to the SAS Work
directory, and the procedure performs all computations within the SAS System.
DBMSspecifies that in-database computation be performed for data sets that are DBMS
tables if no system options, LIBNAME options, data set attributes, or procedure
options are incompatible with in-database computation. DBMS is the default
value, and it automatically enables in-database processing.
When the value is DBMS, the procedure uses in-database processing (when possible) and uses conventional SAS processing when the specific procedure statements
and options do not support in-database processing. For example, a REG procedure
statement that contains a residual option computes the SSCP matrix in-database,
but transfers the detail data to the SAS System to create residual analysis and output.
If the procedure attempts to perform a computation in-database but the SQL command fails, the procedure tries to perform that computation out-of-database.
ALLspecifies that in-database computation be performed whenever possible.
In-database computation is not applicable to data sets that contain summary statistics, such as OUTEST= and OUTSTAT= data sets, or DATA= data sets for which the TYPE= attribute is CORR, EST,
SSCP, FACTOR, and so on. (See the “Special SAS Data Sets” chapter in SAS/STAT User’s Guide for
the complete list of TYPE= attributes.) Because the data in these tables are already summaries, no
additional summarization can be performed in-database. The SQLGENERATION= option is ignored
with respect to those data sets for which in-database computation is not applicable.
The following system options, which are specified in an OPTIONS statement, also control in-database
computing for SAS/STAT, SAS/ETS, and SAS Enterprise Miner procedures:
MSGLEVEL=N | I
specifies the level of messages to be printed to the log. N is the default value.
Nspecifies that the following messages be printed in the log:
Ispecifies that everything printed by MSGLEVEL=N be printed plus the following
a confirmatory note that states that SQL is used for in-database computations,
when that is the case
error messages if anything goes wrong with the SQL commands submitted for
in-database computations
note that states whether SQL is used, when there are SQL error messages
notes:
note that explains why SQL is not used for in-database computations, if that
is the case (however, no note is printed when OPTIONS SQLGENERATION=NONE is specified)
note that explains that the TYPE= attribute is not stored in DBMS tables when
you try to create a special SAS data set (that is, a data set that has a non-blank
value of the TYPE= attribute) as a DBMS table
SQL_IP_TRACE=NOTE | SOURCE | ALL
specifies the level of information to be included in the trace output. You can specify more than one
value by enclosing the values in parentheses.
NOTEspecifies that the trace output include the following information:
all original messages from the DBMS (ordinarily, the SAS System intercepts
certain confusing messages from the DBMS and replaces them with messages
that can be more easily understood by SAS customers)
note that indicates whether SQL is used for in-database computations, even
when SQLGENERATION=NONE is specified. By default, a note is printed
when SQL is used, but no note is printed when SQL is not used. If you specify
SQL_IP_TRACE=NOTE, a note is always printed.
notes that state the name of the data set engine, the name of the DBMS
database, and so on
notes that explain what kinds of SAS passwords apply to the data set, because
the drivers used for in-database computation do not support SAS passwords
note that indicates that no system options or data set attributes prevent using
SQL for in-database computation, when that is the case (this note does not
rule out the possibility that some procedure options might prevent in-database
computation)
Enabling and Controlling In-Database Computing F 13
SOURCEspecifies that trace output include the major SQL commands that are submitted
for in-database computation, but not minor SQL commands such as those used to
obtain metadata.
ALLspecifies that trace output include the output for NOTE and SOURCE, plus all
SQL commands that are submitted (including those used to obtain metadata and to
verify that UDFs are published), plus additional details.
SQLMAPPUTTO=NONE | SAS_PUT
is not applicable to the SAS/STAT, SAS/ETS, or SAS Enterprise Miner procedures but is included
here for completeness.
The Base SAS procedures use implicit pass-through to send SQL commands to Teradata. With implicit pass-through, the SQL commands are written in the SAS dialect of SQL and then the SAS
System translates those commands to the Teradata dialect of SQL. The SAS dialect of SQL uses the
PUT function to format variables. The Teradata dialect of SQL uses the SAS_PUT UDF to format
variables. The SQLMAPPUTTO=SAS_PUT option specifies that the PUT function in SAS SQL be
translated (mapped) to the SAS_PUT function in Teradata SQL.
The SAS/STAT, SAS/ETS, and SAS Enterprise Miner procedures write SQL commands directly in the
Teradata dialect of SQL instead of using implicit pass-through. Therefore, this option has no effect
on the SAS/STAT, SAS/ETS, and SAS Enterprise Miner procedures. See the section “SQLMAPPUTTO= System Option” in SAS/ACCESS for Relational Databases: Reference.
FMTERR | NOFMTERR
specifies whether to refrain from using unpublished formats.
If a procedure needs to format a BY variable but that format has not been published (that is, not
installed on the Teradata server in SYSLIB or the current database), the procedure does not perform
14 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
in-database computation unless you specify NOFMTERR. NOFMTERR tells the procedure to refrain
from using unpublished formats and to use the unformatted values of the BY variables instead. If you
specify NOFMTERR and a Teradata error occurs when the procedure tests a format, the procedure
uses the unformatted values of the BY variables to define BY groups.
For each BY variable that has a format that cannot be used, a message is printed to that effect. That
message is an error if FMTERR is specified or a note if NOFMTERR is specified. In addition, a note
states whether SQL is used.
For information about publishing formats, see the section “Deploying and Using SAS Formats in
Teradata” on page 37 in this document or the section “Deploying and Using SAS Formats in Teradata”
in SAS/ACCESS 9.2 for Relational Databases: Reference.
DBFMTIGNORE | NODBFMTIGNORE
specifies which type of formats be used for DBMS columns.
Ordinarily, SAS formats control how data values are displayed rather than how data values are stored.
However, SAS/ACCESS software works differently from other SAS products with respect to formats.
NODBFMTIGNORE (the default) causes SAS/ACCESS software, when it creates a DBMS table, to
use the SAS formats assigned to SAS variables to decide which DBMS data types to assign to the
corresponding DBMS columns. For example, the following SAS statements create a DBMS table
named pi_table which contains a column named pi with a DBMS data type of decimal (3,2) with a
total of 3 digits with 2 decimal places:
data dbms.pi_table;
pi = 22/7;
format pi 3.2;
run;
The value that SAS/ACCESS software inserts in the column pi is 3.14 rather than 3.142857142 . ..
(that is, the actual numeric value of the column is truncated to the two decimal places that are specified
in the SAS format). Computations performed with the DBMS column pi suffer from loss of precision
and possibly numeric overflow.
If you specify OPTIONS DBFMTIGNORE, SAS/ACCESS software creates numeric DBMS columns
with the data type DOUBLE PRECISION. This data type is the same as the default data type that the
SAS System uses for numeric variables in ordinary SAS data sets.
The DBFMTIGNORE option does not apply to datetime or character formats.
See the section “Using SAS/ACCESS Software with Teradata: Some Cautionary Notes” for more details.
LIBNAME Statement Options for In-Database Computing
To perform in-database computations, the SAS/STAT, SAS/ETS, and SAS Enterprise Miner procedures
connect to the DBMS by using the credentials (username, password, and so on) in the LIBNAME statement for the DATA= data set. This connection is separate from any connection made by SAS/ACCESS
LIBNAME Statement Options for In-Database Computing F 15
software. Therefore, some SAS/ACCESS options such as the CONNECTION= option do not work with the
SAS/STAT, SAS/ETS, and SAS Enterprise Miner in-database procedures. Also, LIBNAME concatenation
cannot be used for in-database computation because the procedure cannot determine which credentials to
use for the connection.
Although the DATABASE= and SCHEMA= options are aliases as data set options, they are not aliases in
a LIBNAME statement. Table 1.2 describes the effects of the DATABASE= and SCHEMA= options in a
LIBNAME statement.
Table 1.2 DATABASE= and SCHEMA= Options for Teradata
Options
specified in
LIBNAME
statement
Neither
DATABASE=
nor SCHEMA=
DATABASE=
but not
SCHEMA=
SCHEMA= but
not
DATABASE=
Both
DATABASE=
and SCHEMA=
Option that
specifies name
of database
where the SAS
System
Option that
specifies name of
database where
OUT= table is
stored**
Option that specifies
name of database where
Teradata looks for
UDFs***
searches for
DATA= table*
USER=USER=USER=
DATABASE=DATABASE=DATABASE=
SCHEMA=SCHEMA=USER=
SCHEMA=SCHEMA=DATABASE=
* The DBMS
user for the
DATA= table
must have the
SELECT
privilege for the
named
database.
** The DBMS user
for the DATA=
table must have the
CREATE TABLE,
SELECT, and
INSERT privileges
for the named
database.
*** If Teradata does not
find the UDF in the
named database, then
Teradata looks in
SYSLIB. The DBMS
user for the DATA= table
must have the EXECUTE
FUNCTION privilege for
the database where the
UDF is found.
Table 1.3 describes whether SAS/ACCESS LIBNAME options for Teradata work correctly with the DATA=
16 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
and OUT= options for the SAS/STAT and SAS/ETS in-database procedures. The cell entries in the DATA=
and OUT= columns have the following values:
Yes—indicates that the option works for in-database computing
No—indicates that the option does not work for in-database computing
N/A (not applicable)—indicates that the option does not apply to in-database computing
Some cell entries in the DATA= and OUT= columns have superscripts that are intended to convey additional
information. The meanings of the superscripts are as follows:
A superscript of 1 (Yes1or No1) indicates an option that is recommended for statistical applications
and correctly affects in-database processing as expected.
A superscript of 2 (Yes2or No2) indicates an option that correctly affects in-database processing as
expected.
A superscript of 3 (Yes3or No3) indicates an option that causes the procedure to use SAS/ACCESS
for out-of-database computing.
A superscript of 4 (Yes4or No4) indicates an option that neither works for in-database computing nor
prevents in-database computing, but is unlikely to cause a serious error. See the Comment column for
more information.
Table 1.3 SAS/ACCESS LIBNAME Options for Teradata
LIBNAME optionDATA=OUT=Comments
ACCESS=READONLYYes
2
Yes
2
For an OUT= data set, you
correctly receive a
SAS/ACCESS error message.
In-database SAS/STAT and
SAS/ETS procedures use
separate connections from
SAS/ACCESS software and
therefore work as if
CONNECTION=UNIQUE is
specified.
LIBNAME Statement Options for In-Database Computing F 17
LIBNAME optionDATA=OUT=Comments
CONNECTION_GROUP=No
3
No
3
In-database SAS/STAT and
SAS/ETS procedures use
separate connections from
SAS/ACCESS software.
DBCOMMIT=N/AN/AFor out-of-database I/O.
DBCONINIT=No
3
No
3
In-database SAS/STAT and
SAS/ETS procedures use
separate connections from
SAS/ACCESS software.
DBCONTERM=No
3
No
3
In-database SAS/STAT and
SAS/ETS procedures use
separate connections from
SAS/ACCESS software.
DBCREATE_TABLE_OPTS=N/AYes
2
DBINDEX=N/AN/AFor out-of-database I/O.
DBLIBINIT=N/AN/AExecuted when LIBNAME
connection is made.
DBLIBTERM=N/AN/AExecuted when LIBNAME is
disconnected.
DBMSTEMP=YESNo
3
No
3
In-database SAS/STAT and
SAS/ETS procedures use
separate connections from
SAS/ACCESS software.
Temporary tables can be
accessed only within the
connection in which they are
created, so DBMSTEMP=YES
causes Teradata errors that say
the table does not exist. The
default value NO works.
DBPROMPT=N/AN/AExecuted when LIBNAME
connection is made.
DBSASLABEL=Yes
1
N/ADBSASLABEL=NONE is
recommended for correct
procedure output.
18 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
LIBNAME optionDATA=OUT=Comments
DBSLICEPARM=N/AN/AFor out-of-database I/O.
DEFER=NONo
4
No
4
In-database SAS/STAT and
SAS/ETS procedures use
separate connections from
SAS/ACCESS software and
therefore work as if
DEFER=YES is specified.
DIRECT_EXE=DELETEN/AN/A
DIRECT_SQL=N/AN/APROC SQL only.
ERRLIMIT=N/AN/AFor out-of-database I/O.
LOGDB=N/AN/AFor out-of-database I/O.
MODE=N/AN/AApplies to explicit pass-through
in PROC SQL.
MULTISTMT=N/AN/AFor out-of-database I/O.
MULTI_DATASRC_OPT=N/AN/AFor out-of-database I/O with
PROC SQL.
PASSWORD=Yes
2
Yes
2
PREFETCH=N/AN/AFor out-of-database I/O.
PRESERVE_COL_NAMES=NONo
PRESERVE_NAMES=NONo
PRESERVE_TAB_NAMES=NONo
READ_ISOLATION_LEVEL=No
READ_LOCK_TYPE=No
READ_MODE_WAIT=No
3
3
3
4
4
4
No
No
No
No
No
No
3
3
3
4
4
4
The default value YES works.
The default value YES works.
The default value YES works.
The option is ignored.
The option is ignored.
The option is ignored.
REREAD_EXPOSURE=N/AN/A
SERVER=Yes
2
Yes
2
Same as TDPID=.
SPOOL=N/AN/AFor out-of-database I/O.
Using SAS/ACCESS Software with Teradata: Some Cautionary Notes F 19
LIBNAME optionDATA=OUT=Comments
SQL_FUNCTIONS=N/AN/AFor PROC SQL.
SQL_FUNCTIONS_COPY=N/AN/AFor PROC SQL.
SQLGENERATION=Yes
2
Yes
2
If SQLGENERATION= is
specified in the LIBNAME
statement, any value specified in
an OPTIONS statement is
ignored.
TDPID=Yes
UPDATE_ISOLATION_LEVEL=No
UPDATE_LOCK_TYPE=No
UPDATE_MODE_WAIT=No
USER=Yes
2
4
4
4
2
Yes
No
No
No
Yes
2
4
4
4
2
Same as SERVER=.
The option is ignored.
The option is ignored.
The option is ignored.
UTILCONN_TRANSIENT=N/AN/AIn-database SAS/STAT and
SAS/ETS procedures use
separate connections from
SAS/ACCESS software.
Using SAS/ACCESS Software with Teradata: Some Cautionary
Notes
See the chapter “SAS/ACCESS Interface to Teradata” in SAS/ACCESS for Relational Databases: Reference
for the complete documentation regarding the use of SAS/ACCESS software with Teradata.
When a SAS/STAT or SAS/ETS procedure performs in-database computation, SAS/ACCESS soft-
ware is not used for downloading the rows of the DATA= data set from the DBMS or for uploading
the rows of the OUT= data set to the DBMS. Rather, SAS/ACCESS software is used as follows:
– When the SAS System parses the statements of a SAS/STAT or SAS/ETS procedure,
SAS/ACCESS software looks up the variable names in the DATA= data set. Or, when you do
not specify an explicit variable list, SAS/ACCESS software obtains the names of the variables
from the DBMS.
– SAS/ACCESS software tells the procedure which SAS data types and SAS formats are assigned
to the variables in both the DATA= and OUT= data sets when those data sets are DBMS tables.
20 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
– When a SAS/STAT or SAS/ETS procedure attempts to create an OUT= data set, SAS/ACCESS
software generates the CREATE TABLE command and sends it to the DBMS.
Because SAS/ACCESS software is used for these purposes, numerous SAS/ACCESS options affect
in-database computation as described in this and subsequent sections. In addition, SAS/ACCESS
options are important for using the DATA step or out-of-database procedures to create DBMS tables,
and these options can have important consequences for subsequent analyses that are performed using
in-database procedures.
In statistical applications, you should exercise extreme caution with the BULKLOAD=YES option
in the LIBNAME statement. The BULKLOAD=YES option omits duplicate rows. It is not unusual
for statistical data sets to have duplicate rows, and omitting them produces incorrect answers. This
behavior affects the DATA step and procedures that do not perform in-database computation. This
behavior does not affect OUT= data sets that are created in-database by the PRINCOMP or SCORE
procedures.
You should usually specify the following SAS/ACCESS options regardless of whether you are per-
forming in-database computations:
– Either specify OPTIONS NOLABEL or specify the DBSASLABEL=NONE option in a LIB-
NAME statement to prevent SAS/ACCESS software from assigning default labels and thus
adding spurious variable labels to procedure listings, ODS tables, and output SAS data sets.
(Teradata tables do not store SAS variable labels. Variable labels must be specified in the PROC
step using a LABEL or ATTRIB statement. However, when SAS/ACCESS software creates a
SAS data set from a DBMS table, it assigns the variable names as labels by default.)
– Specify DBFMTIGNORE in an OPTIONS statement to prevent SAS/ACCESS software from
assigning data types to output variables based on their SAS formats. Without the DBFMTIGNORE option, output data sets in the DBMS (including ODS tables) might suffer from loss of
precision or numeric overflow, causing loss of data. See the subsection “LIBNAME Statement
Data Conversions” in the section “Data Types for Teradata” in SAS/ACCESS 9.2 for RelationalDatabases: Reference.
When SAS/ACCESS software reads a Teradata table or creates an ordinary SAS data set from a
Teradata table, it assigns formats to the SAS variables based on the Teradata data types. These formats
are often inappropriate and can interfere with the proper operation of SAS/STAT, SAS/ETS, and Base
SAS procedures such as PRINT, PRINCOMP, and TRANSREG, especially for character variables.
When using such a data set, you might want to specify a FORMAT statement to remove the formats,
such as the following:
FORMAT _CHARACTER_;
If you use a RENAME= data set option, SAS/ACCESS software assigns the old variable names as
variable labels.
Most SAS data set options are not stored in Teradata tables. In particular, the TYPE= data set option is
not stored in Teradata tables. For example, if you run the CORR procedure to create a TYPE=CORR
data set as a Teradata table, the data set TYPE= is lost and you must specify the TYPE= data set
option every time you use that table.
Using SAS/ACCESS Software with Teradata: Some Cautionary Notes F 21
Teradata has nulls instead of missing values. SAS software has 28 kinds of numeric missing values,
but SAS/ACCESS software converts all of them to null when it creates a Teradata table from SAS
input. SAS character missing values are the same as blank strings, but Teradata nulls are different from
blank strings. When SAS/ACCESS software creates a Teradata table from SAS input, blank strings
are converted to nulls. Conversely, when a Teradata table is converted to a SAS data set, numeric
nulls are converted to ordinary SAS numeric missing values, and character nulls are converted to
blank strings.
Comparisons with Teradata nulls work differently from comparisons with SAS missing values. Tera-
data uses three-valued logic, so a comparison with a null produces an “unknown” logical value. In the
SAS System, numeric missing values are regarded as less than any nonmissing value, and character
missing values are (equal to) blank strings. Comparisons between two SAS missing values that the
SAS System considers equal might yield either a null or a true value on Teradata, depending on how
the SAS System translates the syntax.
When the SAS System copies a column from a DBMS table to a new DBMS table, the new DBMS
data type can differ from that of the original table. This is true for both in-database and out-of-database
processing and is a consequence of the manner in which SAS/ACCESS software handles DBMS data
types and SAS formats. See the subsection “LIBNAME Statement Data Conversions” in the section
“Data Types for Teradata” in SAS/ACCESS 9.2 for Relational Databases: Reference. The change in
data types happens in a DATA step with a SET or MERGE statement and in procedure statements that
copy variables from the DATA= data set to the OUT= data set.
You can exercise limited control over data types in output tables by using FORMAT or ATTRIB
statements to assign formats. For out-of-database processing, you can use the DBTYPE= data set
option to specify data types for output tables. However, the DBTYPE= data set option is not supported
for in-database computation by the SAS/STAT and SAS/ETS procedures.
SAS/ACCESS software does not change the following data types:
– BYTE
– CHAR
– DATE
– FLOAT (also known as REAL or DOUBLE PRECISION)
– TIME
– TIMESTAMP
22 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Table 1.4 shows how data types are changed when the SAS System copies a Teradata column.
TIME with TIMEZONECHAR(21)
TIME(0) with TIMEZONECHAR(14)
TIME(1) with TIMEZONECHAR(16)
TIME(2) with TIMEZONECHAR(17)
TIME(3) with TIMEZONECHAR(18)
TIME(4) with TIMEZONECHAR(19)
TIME(5) with TIMEZONECHAR(20)
TIME(6) with TIMEZONECHAR(21)
Timestamp Data Types
TIMESTAMP with TIMEZONECHAR(32)
TIMESTAMP(0) with TIMEZONECHAR(25)
TIMESTAMP(1) with TIMEZONECHAR(27)
TIMESTAMP(2) with TIMEZONECHAR(28)
TIMESTAMP(3) with TIMEZONECHAR(29)
TIMESTAMP(4) with TIMEZONECHAR(30)
TIMESTAMP(5) with TIMEZONECHAR(31)
TIMESTAMP(6) with TIMEZONECHAR(32)
Using SAS/ACCESS Software with Teradata: Some Cautionary Notes F 27
Original Data TypeNew Data Type
INTERVAL YEAR to MONTHCHAR(6)
INTERVAL YEAR(1) to MONTHCHAR(5)
INTERVAL YEAR(2) to MONTHCHAR(6)
INTERVAL YEAR(3) to MONTHCHAR(7)
INTERVAL YEAR(4) to MONTHCHAR(8)
INTERVAL MONTHCHAR(3)
INTERVAL MONTH(1)CHAR(2)
INTERVAL MONTH(2)CHAR(3)
INTERVAL MONTH(3)CHAR(4)
INTERVAL MONTH(4)CHAR(5)
INTERVAL DAYCHAR(3)
INTERVAL DAY(1)CHAR(2)
INTERVAL DAY(2)CHAR(3)
INTERVAL DAY(3)CHAR(4)
INTERVAL DAY(4)CHAR(5)
INTERVAL DAY to HOURCHAR(6)
INTERVAL DAY(1) to HOURCHAR(5)
INTERVAL DAY(2) to HOURCHAR(6)
INTERVAL DAY(3) to HOURCHAR(7)
INTERVAL DAY(4) to HOURCHAR(8)
INTERVAL DAY to MINUTECHAR(9)
INTERVAL DAY(1) to MINUTECHAR(8)
INTERVAL DAY(2) to MINUTECHAR(9)
INTERVAL DAY(3) to MINUTECHAR(10)
INTERVAL DAY(4) to MINUTECHAR(11)
INTERVAL DAY to SECONDCHAR(19)
INTERVAL DAY(1) to SECONDCHAR(18)
INTERVAL DAY(2) to SECONDCHAR(19)
INTERVAL DAY(3) to SECONDCHAR(20)
INTERVAL DAY(4) to SECONDCHAR(21)
INTERVAL HOURCHAR(3)
INTERVAL HOUR(1)CHAR(2)
INTERVAL HOUR(2)CHAR(3)
INTERVAL HOUR(3)CHAR(4)
INTERVAL HOUR(4)CHAR(5)
INTERVAL HOUR to MINUTECHAR(6)
INTERVAL HOUR(1) to MINUTECHAR(5)
INTERVAL HOUR(2) to MINUTECHAR(6)
INTERVAL HOUR(3) to MINUTECHAR(7)
INTERVAL HOUR(4) to MINUTECHAR(8)
INTERVAL HOUR to SECONDCHAR(16)
INTERVAL HOUR(1) to SECONDCHAR(15)
INTERVAL HOUR(2) to SECONDCHAR(16)
INTERVAL HOUR(3) to SECONDCHAR(17)
INTERVAL HOUR(4) to SECONDCHAR(18)
INTERVAL MINUTECHAR(3)
28 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Original Data TypeNew Data Type
INTERVAL MINUTE(1)CHAR(2)
INTERVAL MINUTE(2)CHAR(3)
INTERVAL MINUTE(3)CHAR(4)
INTERVAL MINUTE(4)CHAR(5)
INTERVAL MINUTE to SECONDCHAR(13)
INTERVAL MINUTE(1) to SECONDCHAR(12)
INTERVAL MINUTE(2) to SECONDCHAR(13)
INTERVAL MINUTE(3) to SECONDCHAR(14)
INTERVAL MINUTE(4) to SECONDCHAR(15)
INTERVAL SECONDCHAR(10)
INTERVAL SECOND(1)CHAR(9)
INTERVAL SECOND(2)CHAR(10)
INTERVAL SECOND(3)CHAR(11)
INTERVAL SECOND(4)CHAR(12)
Indeterminate Row Order in a DBMS Table
In an ordinary SAS data set, the observations are ordered in a consistent way. For example, if you run
the PRINT procedure several times, the observations are always printed in the same order. But in a DBMS
table, the order of the rows is not defined. If you run PROC PRINT several times on a DBMS table, the rows
might be printed in a different order every time. If you run a statistical analysis that depends on row order
in a DBMS table (such as the Durbin-Watson statistic in the REG procedure or tables with ORDER=DATA
in the FREQ procedure), the results might differ every time you run the analysis.
To get a consistent row order for a DBMS table, you can use the DBCONDITION= data set option to specify
an ORDER BY clause with one or more variables that defines a unique order for the rows. However, the
DBCONDITION= option works only for out-of-database processing. The DBCONDITION= option does
not work with in-database processing.
Following is an example that uses the DBCONDITION= option for out-of-database processing to control
the row order in PROC PRINT:
NOTE: There were 12 observations read from the data set TERA.test.
title "Order Rows by I and J";
proc print data=tera.test(dbcondition="order by i, j");
run;
30 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
The SAS System generates the following output:
Order Rows by I and J
Obsijx
111-0.04298
212-0.09999
313-0.24349
414-0.22226
5210.07353
6220.49937
723-1.52119
8240.79180
9310.57221
10320.17571
1133-1.44361
12340.44887
NOTE: There were 12 observations read from the data set TERA.test.
title "Order Rows by X";
proc print data=tera.test(dbcondition="order by x");
run;
The SAS System generates the following output:
Order Rows by X
Obsijx
123-1.52119
233-1.44361
313-0.24349
414-0.22226
512-0.09999
611-0.04298
7210.07353
8320.17571
9340.44887
10220.49937
11310.57221
12240.79180
NOTE: There were 12 observations read from the data set TERA.test.
proc delete data=tera.test;
run;
The SAS System generates the following note:
NOTE: Deleting TERA.test (memtype=DATA).
SAS/ACCESS Data Set Options for Teradata F 31
SAS/ACCESS Data Set Options for Teradata
Table 1.5 describes whether SAS/ACCESS data set options for Teradata work correctly with the DATA=
and OUT= SAS/STAT and SAS/ETS procedure options. The cell entries in the DATA= and OUT= columns
have the following values:
Yes—indicates that the option works for in-database computing
No—indicates that the option does not work for in-database computing
N/A (not applicable)—indicates that the option does not apply to in-database computing
Sometimes—see Comment column for details
Some cell entries in the DATA= and OUT= columns have superscripts that are intended to convey additional
information. The meanings of the superscripts are as follows:
A superscript of 1 (Yes1or No1) indicates an option that correctly affects in-database processing as
expected.
A superscript of 2 (Yes2or No2) indicates an option that works differently for in-database computing
compared to out-of-database computing.
A superscript of 3 (Yes3, No3, N/A3, or Sometimes3) indicates an option that causes the procedure to
use SAS/ACCESS for out-of-database computing.
A superscript of 4 (Yes4or No4) indicates an option that neither works for in-database computing nor
prevents in-database computing, but is unlikely to cause a serious error. See the Comment column for
more information.
A superscript of 5 (Yes5or No5) indicates an option that might cause incorrect answers or Teradata
syntax errors.
NOTE : As a data set option, DATABASE= is an alias for SCHEMA=, but as LIBNAME options,
DATABASE= and SCHEMA= are distinct options. The data set options DATABASE= and SCHEMA=
work the same way as the LIBNAME option SCHEMA=.
If you specify the same option as both a data set option and as a LIBNAME option, the value of the data set
option is used for that data set.
Table 1.5 SAS/ACCESS Data Set Options for Teradata
Data Set OptionDATA=OUT=Comments
BL_CONTROL=N/AN/AFor out-of-database I/O.
BL_DATAFILE=N/AN/AFor out-of-database I/O.
32 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
DATABASE= is an alias for
SCHEMA= as a data set option.
DATABASE= as a data set
option does not work like
DATABASE= as a LIBNAME
option.
DBCOMMIT=N/AN/AFor out-of-database I/O.
DBCONDITION=No
3
DBCREATE_TABLE_OPTS=N/AYes
No
3
1
DBFORCE=N/AN/AIn-database computation works
like DBFORCE=YES,
regardless of what value is
specified for DBFORCE=.
SAS/ACCESS software defaults
to DBFORCE=NO. However,
DBFORCE= would have an
effect on in-database
computation only if DBTYPE=
specified a Teradata data type
that resulted in truncation,
whereas specifying DBTYPE=
prevents in-database
computation.
DBINDEX=N/AN/AFor out-of-database I/O.
DBKEY=N/AN/AFor out-of-database I/O.
SAS/ACCESS Data Set Options for Teradata F 33
Data Set OptionDATA=OUT=Comments
DBLABEL=N/AN/ADBLABEL= applies only when
the DATA= data set is a SAS
data set with variable labels and
the OUT= data set is a DBMS
table. In-database computation
cannot be performed when the
DATA= data set is not a DBMS
table.
DBMASTER=N/AN/A
DBNULL=N/A
DBSASLABEL=Yes
3
1
3
No
N/ADBSASLABEL=NONE is
recommended for correct
procedure output, but it is
normally easier to specify this
in the LIBNAME statement.
DBSLICE=N/AN/AFor out-of-database I/O.
DBSLICEPARM=N/AN/AFor out-of-database I/O.
DBTYPE=N/A
3
Sometimes3DBTYPE= prevents in-database
computation because some
Teradata types such as DATE
and TIMESTAMP have not
been implemented.
ERRLIMIT=N/AN/AFor out-of-database I/O.
MBUFSIZE=N/AN/AFor out-of-database I/O.
ML_CHECKPOINT=N/AN/AFor out-of-database I/O.
ML_ERROR1=N/AN/AFor out-of-database I/O.
ML_ERROR2=N/AN/AFor out-of-database I/O.
ML_LOG=N/AN/AFor out-of-database I/O.
ML_RESTART=N/AN/AFor out-of-database I/O.
ML_WORK=N/AN/AFor out-of-database I/O.
MULTILOAD=N/AN/AFor out-of-database I/O.
34 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Data Set OptionDATA=OUT=Comments
MULTISTMT=N/AN/AFor out-of-database I/O.
NULLCHAR=N/A
NULLCHARVAL=N/A
PRESERVE_COL_NAMES=NONo
READ_ISOLATION_LEVEL=No
READ_LOCK_TYPE=No
READ_MODE_WAIT=No
SCHEMA=Yes
3
3
3
4
4
4
1
No
No
No
No
No
No
Yes
SET=N/AYes
3
3
3
4
4
4
1
2
The default value YES works.
The option is ignored.
The option is ignored.
The option is ignored.
If the DATA= table contains
duplicate rows, the procedure
attempts to insert duplicate rows
into the OUT= table, and the
results for in-database and
out-of-database processing
differ. With in-database
processing, duplicate rows are
discarded and do not cause
errors. With out-of-database
processing, duplicate rows
cause errors and a ROLLBACK
is issued. With
MODE=TERADATA, the
ROLLBACK produces an
empty or incomplete table. With
MODE=ANSI, the
ROLLBACK produces an
empty table.
SLEEP=N/AN/AFor out-of-database I/O.
TENACITY=N/AN/AFor out-of-database I/O.
UPDATE_ISOLATION_LEVEL=No
UPDATE_LOCK_TYPE=No
UPDATE_MODE_WAIT=No
4
4
4
No
No
No
4
4
4
The option is ignored.
The option is ignored.
The option is ignored.
Base SAS Data Set Options F 35
Base SAS Data Set Options
Table 1.6 describes whether the Base SAS data set options work with the DATA= and OUT= SAS/STAT and
SAS/ETS procedure options for in-database computing. The cell entries in the DATA= and OUT= columns
have the following values:
Yes—indicates that the option works for in-database computing
No—indicates that the option does not work for in-database computing
N/A (not applicable)—indicates that the option does not apply to in-database computing
Some cell entries in the DATA= and OUT= columns have superscripts that are intended to convey additional
information. The meanings of the superscripts are as follows:
A superscript of 1 (Yes1or No1) indicates an option that correctly affects in-database processing as
expected.
A superscript of 2 (Yes2, No2, or N/A2) indicates an option that causes SAS/ACCESS syntax errors
or warnings. When SAS/ACCESS software issues a warning message, the option is ignored by the
procedure.
A superscript of 3 (Yes3, No3, or N/A3) indicates an option that causes the procedure to use
SAS/ACCESS for out-of-database computing.
Table 1.6 Base SAS Data Set Options
Data set optionDATA=OUT=Comments
ALTER=No
2
No
2
SAS passwords are not supported on
DBMSs.
BUFNO=N/AN/AFor out-of-database I/O.
BUFSIZE=N/AN/AFor out-of-database I/O.
CNTLLEV=N/AN/AFor SAS data sets only.
COMPRESS=N/AN/AFor SAS data sets only.
DLDMGACTION=N/AN/AFor SAS data sets only.
DROP=Yes
ENCODING=No
1
2
No
No
3
2
36 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Data set optionDATA=OUT=Comments
ENCRYPT=No
2
No
2
Requires SAS passwords.
FILECLOSE=N/AN/AFor tape data sets only.
FIRSTOBS=No
3
No
3
DBMS tables have no inherent row order,
and FIRSTOBS= is defined in terms of row
order.
GENMAX=No
2
No
2
Neither the Teradata nor TSSQL drivers
support generation options that are explicitly
specified in the procedure statement, and the
procedure does not know whether a
generation number is explicit or implicit.
GENNUM=No
2
No
2
Neither the Teradata nor TSSQL drivers
support generation options that are explicitly
specified in the procedure statement, and the
procedure does not know whether a
generation number is explicit or implicit.
IDXNAME=N/AN/AFor out-of-database I/O.
IDXWHERE=N/AN/AFor out-of-database I/O.
IN=N/AN/AFor DATA step only.
INDEX=N/A
KEEP=Yes
LABEL=No
2
1
2
No
No
No
2
3
2
For SAS data sets only.
SAS data set attributes cannot be stored in
Teradata. However, LABEL= should work
on DATA=, just as TYPE= works on DATA=.
OBS=No
3
No
3
DBMS tables have no inherent row order,
and FIRSTOBS= is defined in terms of row
order.
OBSBUF=N/AN/AFor SAS views only.
OUTREP=N/AN/AFor SAS data sets only.
POINTOBS=N/AN/AFor SAS data sets only.
PW=No
2
No
2
SAS passwords are not supported on
DBMSs.
Deploying and Using SAS Formats in Teradata F 37
Data set optionDATA=OUT=Comments
PWREQ=No
2
No
2
SAS passwords are not supported on
DBMSs.
READ=No
2
No
2
SAS passwords are not supported on
DBMSs.
RENAME=No
3
No
3
SAS supervisor does not tell the procedure
what the old names are.
REPEMPTY=N/AN/AFor SAS data sets only.
REPLACE=N/AN/AFor SAS data sets only.
REUSE=N/AN/AFor SAS data sets only.
SORTEDBY=N/AN/AFor SAS data sets only.
SPILL=N/AN/AFor SAS views only.
TOBSNO=N/AN/AFor out-of-database I/O.
TYPE=No
3
No
3
SAS data set attributes cannot be stored in
Teradata, and most TYPE= values indicate
that the data set contains summary statistics.
WHERE=Yes
1
No
3
WHEREUP=N/AN/AFor updating a table, usually via
out-of-database I/O.
WRITE=No
2
No
2
SAS passwords are not supported on
DBMSs.
Deploying and Using SAS Formats in Teradata
Using SAS Formats
SAS formats are basically mapping functions that change an element of data from one format to another. For
example, some SAS formats change numeric values to various currency formats and date-and-time formats.
38 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
The SAS System supplies many formats. You can also use the SAS FORMAT procedure to define custom
formats that replace raw data values with formatted character values. For example, the following PROC
FORMAT code creates a custom format called $REGION that maps ZIP codes to geographic regions.
SAS programs frequently use both user-defined formats and formats that the SAS System supplies. Although they are referenced in numerous ways, using the PUT function in the SQL procedure is of particular
interest for SAS in-database processing.
The PUT function takes a format reference and a data item as input and returns a formatted value. For
example, this SQL procedure query uses the PUT function to summarize sales by region from a table of all
customers:
select put(zipcode,$region.) as region,
sum(sales) as sum_sales from sales.customers
group by region;
The SAS SQL processor knows how to process the PUT function. Currently, SAS/ACCESS Interface to
Teradata returns all rows of unformatted data in the SALES.CUSTOMERS table in the Teradata database to
the SAS System for processing.
The SAS in-database technology deploys (“publishes”) the PUT function implementation to Teradata as a
new function named SAS_PUT(). Similar to any other programming language function, the SAS_PUT()
function can take one or more input parameters and return an output value.
The SAS_PUT() function supports the use of SAS formats. You can specify the SAS_PUT() function in
SQL queries that the SAS System submits to Teradata in one of two ways:
implicitly by enabling the SAS System to automatically map PUT function calls to SAS_PUT() func-
tion calls
explicitly by using the SAS_PUT() function directly in your SAS program
If you used the SAS_PUT() function in the previous example, Teradata formats the ZIP code values with
the $REGION format and processes the GROUP BY clause using the formatted values.
By publishing the PUT function implementation to Teradata as the SAS_PUT() function to support you can
realize these advantages:
You can process the entire SQL query inside the database, which minimizes data transfer (I/O).
The SAS format processing leverages the scalable architecture of the DBMS.
The results are grouped by the formatted data and are extracted from the Teradata Enterprise Data
Warehouse (EDW).
How It Works F 39
Deploying SAS formats to execute inside a Teradata database can enhance performance and exploit Teradata
parallel processing. This is accomplished when you install the SAS/ACCESS Interface to Teradata software,
which consists of the following three components:
SAS/ACCESS Interface to Teradata engine—controls the exchange of information between the SAS
System and the Teradata EDW
SAS Formats Library for Teradata—contains a library of SAS formats and the SAS_PUT UDF that
maps the SAS PUT() function to the SAS_PUT function in Teradata
SAS Accelerator Publishing Agent—contains macros that enable you to publish the SAS formats and
user-defined formats to Teradata
How It Works
By using a SAS formats publishing macro, you can generate a SAS_PUT() function that enables you to
execute PUT function calls inside the Teradata EDW. You can reference the formats that the SAS System
supplies and most custom formats that you create using the FORMAT procedure.
The SAS formats publishing macro takes a SAS format catalog and publishes it to the Teradata EDW. Inside
the Teradata EDW, a SAS_PUT() function, which emulates the PUT function, is created and registered for
use in SQL queries.
40 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Figure 1.1 Process Flow Diagram
Here is the basic process flow:
1. Install the SAS 9.2 Formats Library for Teradata in the Teradata EDW. This library contains many of
the formats that are available in Base SAS software.
NOTE : This is a one-time installation process.
For more information, see the section “Formats That SAS Supplies in the Teradata EDW.”
2. If necessary, create your custom formats by using PROC FORMAT and create a permanent catalog
by using the LIBRARY= option.
For more information, see the section “User-Defined Formats in the Teradata EDW” in SAS/ACCESS
9.2 for Relational Databases: Reference, Second Edition and the FORMAT procedure in the Base
SAS Procedures Guide.
3. Start SAS 9.2 and run the %INDTD_PUBLISH_FORMATS macro. This macro creates the files that
are needed to build the SAS_PUT() function and publishes those files to the Teradata EDW.
The %INDTD_PUBLISH_FORMATS macro performs these tasks:
produces the set of .c and .h files that are necessary to build the SAS_PUT() function
Formats That SAS Supplies in the Teradata EDW F 41
produces a script of the Teradata commands that are necessary to register the SAS_PUT() func-
tion on the Teradata EDW
4. After the %INDTD_PUBLISH_FORMATS macro creates the script, SAS/ACCESS Interface to Teradata executes the script and publishes the files to the Teradata EDW.
5. Teradata compiles the .c and .h files and creates the SAS_PUT() function.
The SAS_PUT() function is available to use in any SQL expression and to use typically wherever you
would use Teradata built-in functions. For more information, see “Using the SAS_PUT() Function in
the Teradata EDW” in SAS/ACCESS 9.2 for Relational Databases: Reference, Second Edition.
NOTE : The SAS_PUT() function uses Latin-1 encoding. Any character that cannot be represented in
Latin-1 might cause unexpected or unsuccessful behavior.
Formats That SAS Supplies in the Teradata EDW
The SAS System supplies many formats for use in the SAS_PUT() function.
You must install the SAS 9.2 Formats Library for Teradata on the same machine where you have installed
SAS Foundation. This is a one-time installation process.
After you install the SAS 9.2 Formats Library and run the %INDTD_PUBLISH_FORMATS macro, the
SAS_PUT() function can call these formats.
For information about how to install and configure the SAS 9.2 Formats Library for Teradata, see the chapter
on post-installation configuration for the SAS Accelerator Publishing Agent software in the ConfigurationGuide for SAS 9.2 Foundation for your operating environment.
NOTE : The SAS Scoring Accelerator for Teradata also uses these libraries. For more information about this
product, see the SAS Scoring Accelerator for Teradata: User’s Guide.
User-Defined Formats in the Teradata EDW
You can use PROC FORMAT to create user-defined formats and store them in a format catalog. You can
then use the %INDTD_PUBLISH_FORMATS macro to export the user-defined format definitions to the
Teradata EDW where the SAS_PUT() function can reference them.
If you use the FMTCAT= option to specify a format catalog in the %INDTD_PUBLISH_FORMATS macro,
these restrictions and limitations apply:
Trailing blanks in PROC FORMAT labels are lost when publishing a picture format.
Avoid using PICTURE formats with the MULTILABEL option. You cannot successfully create a
CNTLOUT= data set when PICTURE formats are present. This is a known problem in PROC FORMAT.
42 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
If you are using a character set encoding other than Latin1, picture formats are not supported. The
picture format supports only Latin1 characters.
If you use the MULTILABEL option, only the first label that is found is returned. For more informa-
tion, see the PROC FORMAT MULTILABEL option in the Base SAS Procedures Guide.
The %INDTD_PUBLISH_FORMATS macro rejects a format unless the LANGUAGE= option is set
to English or is not specified.
Although the format catalog can contain informats, the %INDTD_PUBLISH_FORMATS macro ig-
nores the informats.
User-defined formats that include a format that SAS supplies are not supported.
Overview of the Publishing Process
The %INDTD_PUBLISH_FORMATS macro creates the files that are needed to build the SAS_PUT() function and publishes these files to the Teradata EDW.
The %INDTD_PUBLISH_FORMATS macro also publishes the formats that are included in the SAS 9.2
Formats Library for Teradata. This makes many formats that SAS supplies available inside Teradata.
In addition to SAS formats, you can also publish the PROC FORMAT definitions that are contained in a
single SAS format catalog by using the FMTCAT= option. The process of publishing a PROC FORMAT
catalog entry converts the range label pairs into embedded data in Teradata.
NOTE : If you specify more than one format catalog using the FMTCAT= option, the last format that you
specify is published.
The %INDTD_PUBLISH_FORMATS macro performs the following tasks:
creates .h and .c files, which are necessary to build the SAS_PUT() function.
produces a script of Teradata commands that are necessary to register the SAS_PUT() function in the
Teradata EDW.
uses SAS/ACCESS Interface to Teradata to execute the script and publish the files to the Teradata
EDW
For information about publishing formats, see the section “Deploying and Using SAS Formats in Teradata”
in SAS/ACCESS 9.2 for Relational Databases: Reference.
Using SAS Formats
To each variable in an ordinary SAS data set, you can assign a format, field width, and optional number of
decimal places. The SAS System stores the format information for each variable in the SAS data set along
BY Groups and In-Database Computing F 43
with other attributes such as variable labels. Once you have published the SAS formats and any user-defined
formats that you want to use, those formats can be used for in-database processing. However, a Teradata
table cannot explicitly store the name of the SAS format, field width, and number of decimal places for
each column. Instead, when SAS/ACCESS software opens a Teradata table, SAS/ACCESS assigns format
information to each column based on the Teradata data type of the column (see the section “Data Types for
Teradata” in SAS/ACCESS 9.2 for Relational Databases: Reference). If you want to use a format for any
column that differs from the format that SAS/ACCESS software assigns to that column, you must specify
the FORMAT statement in the same PROC step in which you want the format to be used.
BY Groups and In-Database Computing
By default, SAS/STAT and SAS/ETS procedures that perform in-database computation define BY groups
according to the formatted values of the BY variables. However, except for the $w. and $CHARw. formats,
which do not truncate the data values, SAS formats must first be “published” before they can be used for
in-database computing. If your SAS program uses PROC FORMAT to create one or more user-defined
formats, you need to publish those user-defined formats as well. The process of publishing SAS formats
is described in the section “Deploying and Using SAS Formats in Teradata” in this document. Additional
details are also available in the section “Deploying and Using SAS Formats in Teradata” in SAS/ACCESS
9.2 for Relational Databases: Reference.
As indicated previously in the section “Using SAS Formats,” after you have published the SAS formats
and any user-defined formats that you want to use, those formats can be used for in-database computing.
However, a Teradata table cannot explicitly store the name of the SAS format, field width, and number of
decimal places for each column. Instead, when SAS/ACCESS software opens a Teradata table, it assigns
format information to each column based on the Teradata data type of the column (see the section “Data
Types for Teradata” in SAS/ACCESS 9.2 for Relational Databases: Reference). If you want to use a format
for any column that differs from the format that SAS/ACCESS software assigns to that column, you must
specify the FORMAT statement in the same PROC step in which you want the format to be used.
By default, if the SAS formats are not published or if any specific format used for a BY variable is not
published, then the procedure prints error messages about the UDFs or specific formats that have not been
published. If you want the procedure to run anyway without using the SAS_PUT UDF for unpublished
formats, specify the following statement:
OPTIONS NOFMTERR;
This statement causes the SAS/STAT or SAS/ETS procedure to define BY groups according to the unformatted values of the BY variables when the formats of the BY variables have not been published.
You should also be aware of the following conditions regarding BY groups and in-database computing:
The DESCENDING option in the BY statement is supported for in-database computation.
The NOTSORTED option in the BY statement is not supported for in-database computation, because
the results with NOTSORTED depend on the order of the rows in a SAS data set, whereas the rows
in a DBMS table do not have any defined order.
44 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
For BY processing in-database, the SAS session encoding must be Latin-1. SAS_PUT uses the Latin-
1 encoding, and any character that cannot be represented in Latin-1 can cause problems.
SAS_PUT does not support Teradata columns with types of BYTE, VARBYTE, BLOB, GRAPHIC,
or VARGRAPHIC. If you try to use SAS_PUT with any of these types, you are likely to get a Teradata
error message saying, “Function sas_put does not exist”, even if SAS_PUT has been published.
There are also unusual situations in which the BY groups for in-database computation differ from the BY
groups for out-of-database computation. This arises from the fact that BY groups for out-of-database computation depend on the order of the observations, in addition to the formatted values. The in-database
behavior is a feature, not a defect. Here is an example:
title "Try To Use a Discontiguous Format for BY Groups";
options nolabel nodate nostimer;
proc print data=dbms.format_table;
where fmtname='PARITY';
run;
However, you cannot simply use the parity format to define BY groups based on odd and even numbers
when using ordinary SAS data sets, as opposed to DBMS tables. The SAS System requires each BY group
BY Groups and In-Database Computing F 45
to be a contiguous sequence of observations, so you would have to create a new variable that contains the
formatted value and then sort the data set by that new variable.
An out-of-database analysis of a DBMS table works the same way, because SAS/ACCESS software returns
the rows ordered by the unformatted values of the BY variable. On the other hand, suppose you perform
an in-database analysis. The rows in the DBMS table have no inherent order, so the BY groups are defined
solely by the formatted values of the BY variable. To illustrate, suppose you create a data set that contains
a variable named group with integer values from 1 to 7 and a variable named formatted_group with the
formatted values of group.
proc delete data=dbms.par_test;
data dbms.par_test;
retain random group value formatted_group;
do group = 1 to 7;
proc print data=dbms.par_test(dbcondition='order by "group"');
run;
The following analysis fails because the two groups defined by the formatted values are not contiguous
sequences of observations when sorted by the unformatted values. There are many error messages because
each BY group contains only one observation, which the PRINCOMP procedure considers to be an error.
title2 "Analysis 0: DATA out-of-db, OUT out-of-db, by group";
title3 "Should produce many error messages";
options sqlgeneration=none;
proc princomp data=dbms.par_test out=out;
var random value;
by group;
format group parity1.;
run;
To get an out-of-database analysis to work, you have to use the explicitly formatted variable, format-
ted_group as in the following statements. Having both variables in the BY statement is redundant, but
necessary, to get the variable group copied to the OUTSTAT= data set for comparisons.
title2 "Analysis 1: DATA out-of-db, OUT out-of-db, by formatted_group group";
title3 "This analysis is correct";
options sqlgeneration=none;
proc princomp data=dbms.par_test out=out1 outstat=stat1 cov;
by formatted_group group;
format group parity1.;
run;
Next, sort by the variable formatted_group first to get the comparisons to work:
proc sort data=stat1;
by formatted_group group _type_ _name_;
46 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
format _all_;
run;
proc print data=stat1;
run;
proc sort data=out1 out=sort1;
by formatted_group group value;
format _all_;
run;
proc print data=sort1;
run;
The in-database analysis works correctly because the BY groups are defined solely by the formatted values.
Having both variables in the BY statement is redundant, but necessary, to get the variable group copied to
the OUTSTAT= data set for comparisons.
title2 "Analysis 2: DATA in-db, OUT in-db, by group formatted_group";
title3 "This analysis is correct";
options sqlgeneration=dbms;
proc delete data=dbms.out2;
run;
When performing in-database computation with the SAS/STAT or SAS/ETS in-database procedures, all of
the variables in the DATA= data set must be confined to the following Teradata data types, regardless of
whether they are used in the analysis:
BYTEINT
SMALLINT
INTEGER
DECIMAL (ANSI NUMERIC)
FLOAT (ANSI REAL or DOUBLE PRECISION)
DATE
TIME
TIMESTAMP
CHARACTER
VARCHAR
LONG VARCHAR
CLOB
If the DATA= data set contains any variables of the following types, the procedure cannot perform indatabase computation, but performs out-of-database computation instead:
48 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
BYTE
VARBYTE
BLOB
INTERVAL
TIME WITH TIME ZONE
TIMESTAMP WITH TIME ZONE
The following Teradata data types are not supported by SAS/ACCESS software:
BIGINT
GRAPHIC
VARGRAPHIC
LONG VARGRAPHIC
Numeric Computations
SAS/STAT procedures that support in-database computing can perform numeric in-database computations
for the following Teradata data types, all of which are converted to FLOAT before being used in any arithmetic computation:
BYTEINT
SMALLINT
INTEGER
DECIMAL (ANSI NUMERIC)
FLOAT (ANSI REAL or DOUBLE PRECISION)
DATE
TIME
TIMESTAMP
The following Teradata data types cannot be used for numeric in-database computations because they are
character types:
CHARACTER
VARCHAR
LONG VARCHAR
CLOB
BY Processing
Columns with the following Teradata data types can be used as BY variables:
BYTEINT
SMALLINT
INTEGER
DECIMAL (ANSI NUMERIC)
BY Processing F 49
FLOAT (ANSI REAL OR DOUBLE PRECISION)
DATE
TIME
TIMESTAMP
CHARACTER
VARCHAR
LONG VARCHAR
CLOB
Conditions That Prevent In-Database Processing
This section describes conditions that are checked by the SAS/STAT and SAS/ETS procedures that support
in-database computing. Specific procedures might have other conditions that prevent in-database computing. One example is any option that requires all rows of a table to be downloaded without summarization.
All specific conditions are discussed in detail in the following sections that are devoted to the individual
SAS/STAT and SAS/ETS in-database procedures:
CANCORR Procedure Options Affected by In-Database Computing
FACTOR Procedure Options Affected by In-Database Computing
50 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
PRINCOMP Procedure Options Affected by In-Database Computing
REG Procedure Options and Statements Affected by In-Database Computing
SCORE Procedure Options Affected by In-Database Computing
TIMESERIES Procedure Options Affected by In-Database Computing
VARCLUS Procedure Options Affected by In-Database Computing
Options
In-database computation is not supported when any of the options described in Table 1.7 are specified.
Table 1.7 Options That Do Not Support In-Database Computation
OptionOption
Value
SQLGENERATIONNONExx
VALIDVARNAMEV6 or
UPCASE
ENCODINGAnything
other
than
Latin-1
when
there is a
BY
statement
OBSAnyxxDBMS tables have
Specified in
OPTIONS
Statement
xDBMS identifiers
xThe SAS_PUT
Specified in
LIBNAME
Statement
Specified as
Data Set
Option
Reason
are usually
case-sensitive.
UDF supports only
Latin-1.
no inherent row
order.
FIRSTOBS>1xxDBMS tables have
no inherent row
order.
RENAMEAnyx
DROP or KEEPAnyIn OUT=
Table 1.7 continued
Options F 51
OptionOption
Value
Specified in
OPTIONS
Statement
Specified in
LIBNAME
Statement
Specified as
Data Set
Option
Reason
WHEREAnyIn OUT=
READAnyxSAS passwords are
not supported in
SQL, which has a
very different
security system.
WRITEAnyxSAS passwords are
not supported in
SQL, which has a
very different
security system.
ALTERAnyxSAS passwords are
not supported in
SQL, which has a
very different
security system.
GENMAXAnyx
GENNUMAnyx
DBCONDITIONAnyx
DBTYPEAnyx
DBNULLAnyx
DBGEN_NAMESASxxCauses SQL
syntax errors.
PRESERVE_COL_NAMES NoxxCauses SQL
syntax errors.
PRESERVE_TAB_NAMES NoxxCauses SQL
syntax errors.
PRESERVE_NAMESNoxxCauses SQL
syntax errors.
52 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Table 1.7 continued
OptionOption
Value
CONNECTION_GROUPAnyxxThe in-database
DBMSTEMPYesxxThe in-database
Specified in
OPTIONS
Statement
Specified in
LIBNAME
Statement
Specified as
Data Set
Option
Reason
SAS/STAT and
SAS/ETS
procedures cannot
share connections
with
SAS/ACCESS
software.
SAS/STAT and
SAS/ETS
procedures cannot
share connections
with
SAS/ACCESS
software.
DBCONINITAnyxxThe in-database
SAS/STAT and
SAS/ETS
procedures cannot
share connections
with
SAS/ACCESS
software.
DBCONTERMAnyxxThe in-database
SAS/STAT and
SAS/ETS
procedures cannot
share connections
with
SAS/ACCESS
software.
LIBNAME Properties
In-database computation is not supported when either of the following conditions is true:
Data Set and Variable Properties F 53
The engine is not TERADATA.
The LIBNAME statement specifies a concatenated library.
Data Set and Variable Properties
In-database computation is not supported when any of the following conditions is true:
The data set is not a DBMS table.
The data set TYPE= attribute is not blank.
The data set is a DBMS table and has variables named _TYPE_ and _NAME_, but TYPE=CORR,
COV, or SSCP was not specified as a data set option.
Any variable in the OUT= data set has a HEXw. format.
The DATA= data set contains any variables with the following DBMS data types:
– BYTE
– VARBYTE
– BLOB
– INTERVAL
– TIME WITH TIME ZONE
– TIMESTAMP WITH TIME ZONE
– BIGINT
– GRAPHIC
– VARGRAPHIC
– LONG VARGRAPHIC
BY Processing
In-database computation is not supported when any of the following conditions is true:
The BY statement contains the NOTSORTED option.
Any BY variable has a length or format width that exceeds 256 characters, the maximum supported
by the SAS_PUT UDF.
The total length of the BY variables exceeds 2,048 bytes, the maximum supported by the
SAS_TACORR UDF.
54 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
The total formatted width of the BY variables exceeds 2,048 bytes, the maximum supported by the
SAS_TACORR UDF.
One or more variables cannot be formatted because the SAS_PUT UDF has not been published in
SYSLIB or in the current database and the NOFMTERR option was not specified.
Compatibility of the DATA= and OUT= Data Sets
In-database computation is not supported for the DATA= data set when in-database computation is not used
for the OUT= data set and there is a BY statement. This is because BY groups sometimes are not the same
for both in-database and out-of-database computations.
In-database computation is not supported for the OUT= data set when in-database computation is not used
for the DATA= data set.
In-database computation is not supported when the DATA= and OUT= data sets have different engines.
In-database computation is not supported when any variable has the following properties:
is in both the DATA= and OUT= data sets
does not have a DBMS character data type (CHAR, VARCHAR, CLOB)
has a DBMS data type in the DATA= data set that belongs to a different group than the DBMS type
of the corresponding variable in the OUT= data set, where the groups are as follows:
In most circumstances, if a procedure cannot perform in-database computation, it performs out-of-database
computation instead. However, if the DBMS type of a variable belongs to different groups in the DATA= and
OUT= data sets, the procedure issues an error message and quits. It might be possible to run the procedure
by using SQLGENERATION=NONE.
If OPTIONS SQL_IP_TRACE=NOTE or OPTIONS SQL_IP_TRACE=ALL is specified, the procedure
also issues a message if any variable has different DBMS types in the DATA= and OUT= data sets because
Teradata often has trouble converting data from one type to another.
Other Conditions
In-database computation is not supported when any of the following conditions is true:
A variable is used more than once in a VAR, PARTIAL, or MODEL statement in the PRINCOMP,
REG, or VARCLUS procedures.
Two or more variables in the OUT= data set have the same name in the PRINCOMP or SCORE
procedures.
Any UDF required for in-database computation cannot be found in SYSLIB or the current database.
In-Database Computing for the DATA= Data Set by the
CANCORR, FACTOR, PRINCOMP, REG, and VARCLUS
Procedures
The DATA= data set must be a Teradata table for in-database computing.
When the CANCORR, FACTOR, PRINCOMP, REG, or VARCLUS procedure performs in-database computation for the DATA= data set, the procedure generates an SQL query that computes the SSCP matrix.
The query is passed to the DBMS and executed in-database. The results of the query are then passed back
to the SAS System and stored in an ordinary SAS data set called Work._Mkxpx_. If there is a BY statement,
Work._Mkxpx_ contains results for all of the BY groups. You do not need to understand the contents of
Work._Mkxpx_. The contents are not documented and might change in future releases. Work._Mkxpx_ does
not become the default input data set (_LAST_) even if the procedure creates no other data sets.
The SQL query requires the SAS_TACORR and SAS_TOVB UDFs to be published on the DBMS server. If
these UDFs are absent, or if you do not have the privilege required for executing the UDFs, the SQL query
fails. Usually, the database administrator installs these UDFs and grants the necessary privileges to people
who want to use them.
56 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
ODS Tables for the CANCORR, FACTOR, PRINCOMP, REG, and
VARCLUS Procedures
ODS does not perform in-database processing. Most ODS tables contain statistics or other summaries of the
data, so in-database processing would not be applicable. Although the statistics in an ODS table might have
been computed using in-database processing, after the statistics are computed, there is nothing for ODS to
do in-database.
When ODS creates an output data set, ODS assumes that it is an ordinary SAS data set. You can ask ODS
to create tables on a DBMS by assigning a libref to the table in a LIBNAME statement and then using
that libref to declare a two-level name for the ODS table in an ODS OUTPUT statement. ODS then uses
SAS/ACCESS software to write the tables. Occasionally, ODS might fail to write a DBMS table for the
following reasons:
Some ODS tables are created with SAS formats assigned to some variables. SAS formats cannot be
stored in a DBMS table. However, SAS/ACCESS Interface to Teradata sets the Teradata data type
based on the SAS format. If SAS/ACCESS software assigns a Teradata data type with insufficient
range or precision, the ODS table might fail because of numerical overflow or loss of precision. To
avoid these problems, use OPTIONS DBFMTIGNORE to tell the SAS/ACCESS software to create
all numeric columns with the Teradata data type FLOAT, which is equivalent to the ANSI DOUBLE
PRECISION type.
In some cases, ODS attempts to update an ODS table that has already been created. SAS/ACCESS
software does not allow Teradata tables to be updated or replaced.
In-Database Computing for the OUT= Data Set by the CANCORR,
FACTOR, PRINCOMP, and SCORE Procedures
Both the OUT= and the DATA= data sets must be Teradata tables in order for in-database computation to
be used for the OUT= data set.
When the CANCORR, FACTOR, PRINCOMP, or SCORE procedure computes the OUT= data set indatabase, SAS/ACCESS software submits an SQL command to create an empty DBMS table. Then the
procedure submits an SQL command to insert data into the DBMS table. If there is a BY statement, a
separate SQL command is issued for each BY group.
The SAS/STAT procedures connect to the DBMS using the credentials (user, password, and so on) in the
LIBNAME statement for the DATA= data set. This connection is separate from any connection made by
SAS/ACCESS software. If in-database computations are also performed for the OUT= data set, the DBMS
user specified in the LIBNAME statement for the DATA= data set must have the INSERT privilege for the
database where the OUT= data set resides.
SAS/STAT Procedure Options, Statements, and Features Affected by In-Database Computing F 57
When the OUT= data set is computed in-database, the procedure does not print the usual note about the
number of observations and variables in the data set, because the procedure cannot determine precisely how
many observations are in the OUT= data set. Instead, the procedure prints a note that states how many
rows were inserted as reported by the SQL command. If OPTIONS SQL_IP_TRACE=ALL is specified, a
separate note is generated for each BY group, in addition to a note for the total of all the BY groups. The
numbers reported by SQL seem to be accurate, but it is difficult to tell whether they are always exact. One
possible situation that might cause a discrepancy is other users of the database inserting or deleting rows at
the same time the procedure is running so that the total number of rows that SQL inserts is not necessarily
the actual number of rows in the table.
In-database computation cannot be used for the DATA= data set if both of the following conditions exist:
There is a BY statement.
The OUT= data set is specified but is not computed in-database.
SAS/STAT Procedure Options, Statements, and Features
Affected by In-Database Computing
CANCORR Procedure Options, Statements, and Features Affected by
In-Database Computing
Table 1.8 shows the options for the CANCORR procedure that are affected by in-database computing.
Table 1.8 CANCORR Procedure Options Affected by In-Database Computing
OptionComment
DATA=The SSCP matrix can be computed in-database. The SSCP
matrix is then used to compute the correlation or covariance
matrix on the client.
OUT=Canonical variable scores can be computed in-database provided
that in-database computation is also used for the DATA= data set.
OUTSTAT=In-database computation is not applicable.
The PARTIAL statement is not currently supported for in-database computing.
58 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
FACTOR Procedure Options, Statements, and Features Affected by
In-Database Computing
Table 1.9 shows the options for the FACTOR procedure that are affected by in-database computing.
Table 1.9 FACTOR Procedure Options Affected by In-Database Computing
OptionComment
DATA=The SSCP matrix can be computed in-database. The SSCP
matrix is then used to compute the correlation or covariance
matrix on the client.
OUT=Estimated factor scores can be computed in-database provided
that in-database computation is also used for the DATA= data set.
OUTSTAT=In-database computation is not applicable.
The PARTIAL statement is not currently supported for in-database computing.
PRINCOMP Procedure Options, Statements, and Features Affected by
In-Database Computing
Table 1.10 shows the options for the PRINCOMP procedure that are affected by in-database computing.
Table 1.10 PRINCOMP Procedure Options Affected by In-Database Computing
OptionComment
DATA=The SSCP matrix can be computed in-database. The SSCP
matrix is then used to compute the correlation or covariance
matrix on the client.
OUT=Principal component scores can be computed in-database
provided that in-database computation is also used for the
DATA= data set.
OUTSTAT=In-database computation is not applicable.
REG Procedure Options, Statements, and Features Affected by In-Database Computing F 59
REG Procedure Options, Statements, and Features Affected by In-Database
Computing
Table 1.11 shows the options for the REG procedure that are affected by in-database computing.
Table 1.11 REG Procedure Options and Statements Affected by In-Database Computing
OptionComment
COVOUTIn-database computation is not applicable.
DATA=The SSCP matrix can be computed in-database.
EDFIn-database computation is not applicable.
OUTEST=In-database computation is not applicable.
OUTSEBIn-database computation is not applicable.
OUTSSCP=In-database computation is not applicable.
OUTSTBIn-database computation is not applicable.
OUTVIFIn-database computation is not applicable.
PCOMITIn-database computation is not applicable.
PLOTS=This option requires row-level access and therefore cannot be
used in-database.
PRESSThis option requires row-level access and therefore cannot be
used in-database.
RIDGEIn-database computation is not applicable.
RSQUAREIn-database computation is not applicable.
TABLEOUTIn-database computation is not applicable.
The following statements require row-level access and therefore cannot be used in-database:
OUTPUT
PAINT
PLOT
60 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
REWEIGHT
The following MODEL statement options require row-level access and therefore cannot be used in-database:
ACOV
CLI
CLM
INFLUENCE
LACKFIT
P
PARTIAL
PARTIALDATA
PRESS
R
SPEC
WHITE
The following MODEL statement options are never available with Teradata because they depend upon the
row order of the data, which is not guaranteed to be consistent:
DW
DWPROB
The following PRINT statement options require row-level access and therefore cannot be used in-database:
ACOV
CLI
CLM
DW
INFLUENCE
MODELDATA
P
PARTIAL
R
SPEC
SCORE Procedure Options, Statements, and Features Affected by In-Database Computing F 61
SCORE Procedure Options, Statements, and Features Affected by
In-Database Computing
Table 1.12 shows the options for the SCORE procedure that are affected by in-database computing.
Table 1.12 SCORE Procedure Options Affected by In-Database Computing
OptionComment
DATA=The DATA= data set can be used for computing the OUT= data
set in-database.
OUT=Scores can be computed in-database provided that the DATA=
data set is a table in the same DBMS.
SCORE=In-database computation is not applicable.
When performing in-database computation, the SCORE procedure does not know the number of observations in the DATA= data set. If the data set contains zero observations, PROC SCORE does not print the
usual note that says that there are no observations in the DATA= data set. Instead, SCORE prints a note that
says that SQL inserted zero rows in the OUT= data set.
VARCLUS Procedure Options, Statements, and Features Affected by
In-Database Computing
Table 1.13 shows the options for the VARCLUS procedure that are affected by in-database computing.
Table 1.13 VARCLUS Procedure Options Affected by In-Database Computing
OptionComment
DATA=The SSCP matrix can be computed in-database. The SSCP
matrix is then used to compute the correlation or covariance
matrix on the client.
OUTSTAT=In-database computation is not applicable.
OUTTREE=In-database computation is not applicable.
62 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
SAS/ETS Procedure Options, Statements, and Features Affected
by In-Database Computing
TIMESERIES Procedure Options, Statements, and Features Affected by
In-Database Computing
Table 1.14 shows the options for the TIMESERIES procedure that are affected by in-database computing.
Table 1.14 TIMESERIES Procedure Options Affected by In-Database Computing
OptionComment
ACCUMULATE= The time-stamped rows of the data table are accumulated to
periodic time series in-database. The accumulated time series
are then processed on the client.
Base SAS Procedure Options, Statements, and Features
Affected by In-Database Computing
CORR Procedure Options, Statements, and Features Affected by In-Database
Computing
The ID statement requires row-level access and therefore cannot be used in-database.
Table 1.15 shows the options for the CORR procedure that are affected by in-database computing.
FREQ Procedure Options, Statements, and Features Affected by In-Database Computing F 63
Table 1.15 CORR Procedure Options, Statements, and Features Affected by In-Database Computing
OptionComment
EXCLNPWGTIf in-database computation is used, the EXCLNPWGT option is
activated to exclude observations with nonpositive weights.
HOEFFDINGThis option requires row-level access and therefore cannot be
used in-database.
KENDALLThis option requires row-level access and therefore cannot be
used in-database.
SPEARMANThis option requires row-level access and therefore cannot be
used in-database.
OUTH=This option requires row-level access and therefore cannot be
used in-database.
OUTK=This option requires row-level access and therefore cannot be
used in-database.
OUTS=This option requires row-level access and therefore cannot be
used in-database.
PLOTSThis option requires row-level access and therefore cannot be
used in-database.
NOTSORTEDIf you specify the NOTSORTED option in the BY statement,
PROC CORR in-database computation ignores it and uses the
default ASCENDING order for BY variables.
FREQ Procedure Options, Statements, and Features Affected by In-Database
Computing
Table 1.16 shows the options for the FREQ procedure that are affected by in-database computing.
64 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Table 1.16 FREQ Procedure Options Affected by In-Database Computing
OptionComment
NOTSORTEDIf you specify the NOTSORTED option in the BY statement,
PROC FREQ in-database computation ignores it and uses the
default ASCENDING order for BY variables.
ORDER=DATAIf you specify the ORDER=DATA option for input data in a
DBMS table, PROC FREQ computation might produce different
results for separate runs of the same analysis. In addition to
determining the order of variable levels in crosstabulation table
displays, the ORDER= option can also affect the values of many
of the test statistics and measures that PROC FREQ computes.
SAS Enterprise Miner Procedure Options, Statements, and
Features Affected by In-Database Computing
DMDB Procedure Options, Statements, and Features Affected by In-Database
Computing
PROC DMDB executes in-database with the following considerations:
In SAS Enterprise Miner 6.2, if the SAS option SQLGENERATION=NONE is specified, then normal
SAS computation is performed; otherwise, in-database computation is performed.
If the number of VAR and CLASS terms is greater than 620, processing executes in SAS and raw data
are transferred from the database to SAS.
If a weight statement is specified, then normal SAS computation is performed.
For class variables, only Latin1 encoding is supported.
For in-database computing, all interval variables are cast as float numbers.
If a table exists in the database, it cannot be overwritten implicitly; it needs to be deleted explicitly.
Table 1.17 shows the options for the DMDB procedure that are affected by in-database computing.
DMINE Procedure Options, Statements, and Features Affected by In-Database Computing F 65
Table 1.17 DMDB Procedure Options, Statements, and Features Affected by In-Database Computing
OptionComment
MAXLEVEL=This option is enforced on a per AMP (Access Module Process)
basis.
OUT=SAS computation is used so that a DMDB data set can be
created in SAS.
DMINE Procedure Options, Statements, and Features Affected by In-Database
Computing
PROC DMINE executes in-database with the following considerations:
R-square computation is supported.
For stepwise forward variable selection, if the inputs are interval variables, the X’X matrix is com-
puted in-database.
Logistic regression for a binary target is not implemented in SAS Enterprise Miner 6.2 for in-database
computing.
If the input data are from DMDB output, in-database computing is not performed.
Missing values are dropped during computation; thus, the final results when in-database computing is
performed can be different from the results when in-database computing is not performed.
The maximum number of variables (including the target variable) allowed for in-database computing
is 2,048.
The status monitor is turned off when in-database computing is performed.
DMREG Procedure Options, Statements, and Features Affected by
In-Database Computing
PROC DMREG executes in-database with the following considerations:
Linear and logistic models are supported. Other link functions in the MODEL statement are not
supported.
The CLASS statement is supported.
The MODEL statement can contain only main effects and VAR interactions. Interactions composed
of CLASS effects disable in-database computing.
66 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
Model selection is supported using training (DATA=) statistics. Use of validation (VALIDATA=) or
test (TESTDATA=) statistics is not supported in-database.
The SCORE statement is not executed in-database. Data are transferred from the database to SAS.
The default solution technique for logistic regression models is quasi-Newton. Methods that require
Hessian or SSCP computations, such as Levenberg-Maquadt, are not computed in-database.
Miscellaneous Details
When SAS procedures perform out-of-database processing, SAS/ACCESS software runs SQL commands
on Teradata in ANSI mode by default. When SAS/STAT, SAS/ETS, and SAS Enterprise Miner procedures
perform in-database processing, the procedures run SQL commands on Teradata in Teradata mode. The
difference in mode causes some differences in behavior as shown in Table 1.18.
Table 1.18 Mode Differences for Data Insertion
ProceduresANSI ModeTeradata Mode
Trying to insert a duplicateTeradata errorDuplicate row is dropped
row when SET=YESand rollbackwithout error
Trying to insert a valueTeradata errorValue is truncated
that is too big for columnand rollbackand inserted without error
Example
A simple case that illustrates the use of SAS in-database procedures is provided in the following statements:
1 Starting with data that is already in Teradata, the first step is to issue a LIBNAME statement that uses
the Teradata SAS/ACCESS engine. If the Teradata LIBNAME engine has been correctly configured and
SAS Analytics Accelerator 1.3 for Teradata has been installed on the Teradata system, then successful
in-database computing is possible.
3 Now that you have the list of variables, you can write the SAS statements for the analysis. The first use
of SAS in-database procedures is in PROC FREQ and PROC MEANS. They are used to report summary
measures that help you determine how to use these variables in the analysis. Both procedures are enabled
for in-database processing. They dynamically generate SQL code that runs on Teradata and report only
the results. As a SAS user, you have not modified any code to enable in-database computing.
title3 'Frequencies of Class Variables';
proc freq data=tera.dmagecr;
table good_bad purpose;
run;
proc means data=tera.dmagecr ;
run;
4 You might continue the data exploration with a principal component analysis, which also runs in the
database. Suppose you want to create plots. You should use the ODS SELECT statement to list only
those plots that are compatible with in-database computing. (You can use the ODS TRACE option to find
the graphics elements that are available for any procedure.) After the unmodified SAS statements run, the
ODS graphics output is displayed in the SAS session.
VAR age amount checking coapp depends duration employed existcr
history housing installp job marital other property foreign
resident savings telephon target;
run;
5 Next, save a copy of the output table in the SAS System because the output table contains several mea-
sures, such as the correlation matrix, which can be used for further analysis.
title3 'Output from PCA';
data corr;
set pcastat(where=(_type_ eq "CORR"));
68 F Chapter 1: SAS Analytics Accelerator 1.3 for Teradata: Guide
run;
proc print data=corr noobs;
run;
6 Now suppose you want to run a regression analysis. By looking at the output of these procedures, you
can see that the dependent variable GOOD_BAD is a character variable. To use PROC REG, you must
transform the GOOD_BAD character variable to a numeric variable. Use PROC SQL to push SAS code
to the database for this operation. Pushing code to the database eliminates data movement. You need to
use the explicit pass-through syntax to force processing in-database. For the column named Foreign, you
must enclose the column name in quotation marks to prevent Foreign from being interpreted as a Teradata
keyword. You must be careful when formatting your SQL code for in-database processing.
Proc SQL noerrorstop;
connect to &dbms. (&connopt.);
execute (
create view tera.tdview_reg as select
age, amount, checking, coapp, depends, duration, employed, existcr,
'foreign', history, housing, installp, job, marital, other,
property, purpose, resident, savings, telephon,
case when good_bad = 'good' then 0 else 1 end as target
from tera.dmagecr) by &dbms;
execute (commit) by &dbms. ;
run;
7 Finally, you build a regression model. Even though the conversion of a binary character dependent vari-
able to a numeric variable for use in a linear regression model is not optimal, it is a common operation
for using the efficient REG procedure for model selection and exploratory modeling steps. SAS output
provides clues about how the factors are used in the model.
title3 'Linear Regression on raw data in teradata';
ods select SelectionSummary;
proc reg data=tera.tdview_reg;
model target = age amount checking coapp depends
duration employed existcr history housing installp
job marital other property foreign resident savings
telephon / selection = stepwise;
run;
quit;
Example F 69
8 Suppose you want to run several model statements, selecting different combinations of variables and
options. You can use the CORR matrix that you saved from the output of the PROC PRINCOMP step.
PROC CORR and PROC REG can also produce this matrix as output. Because you saved the matrix data
locally, you can execute any PROC REG step without accessing the raw data in Teradata.
title3 'Linear Regression on CORR from princomp';
ods trace on;
ods graphics on;
ods select SelectionSummary;
proc reg data= pcastat(type=corr) outset= est;
model target = age amount marital other property
foreign resident savings telephon;
run;
quit;
ods graphics off trace off;
ods html close;
9 Now suppose that you want to apply the model to a data set that contains new independent variable values
to produce a new table that contains predicted values. This process is called scoring. The OUTPUT
statement in PROC REG does not support in-database computing. However, the SCORE procedure does
support in-database computing, and produces scores for a variety of models. You use the OUTEST data
set that you saved locally in the PROC REG step to parameterize the model. PROC SCORE generates
and pushes the appropriate SQL code to the database. Both the input and output tables are in Teradata,
and no data are transferred to SAS.
var age amount marital other property foreign resident savings telephon;
run;
70
Subject Index
B
BULKLOAD=YES option
LIBNAME statement, 20
D
DBCONDITION= option
data set option, 28
DBFMTIGNORE option
OPTIONS statement, 20
DESCENDING option
BY statement, 43
M
missing values, 20
N
NOLABEL option
OPTIONS statement, 20
NOTSORTED option
BY statement, 43
R
RENAME=
data set option, 20
S
software licenses required, 3
T
TYPE=
data set option, 20
U
user-defined function (UDF), 5
72
Syntax Index
D
DATABASE= option
LIBNAME statement, 15
DBFMTIGNORE option
OPTIONS statement, 14
F
FMTERR option
OPTIONS statement, 13
M
MSGLEVEL= option
OPTIONS statement, 12
S
SCHEMA= option
LIBNAME statement, 15
SQL_IP_TRACE= option
OPTIONS statement, 13
SQLGENERATION= option
LIBNAME statement or OPTIONS statement, 11
SQLMAPPUTTO= option
OPTIONS statement, 13
Your Turn
We welcome your feedback.
If you have comments about this book, please send them to
yourturn@sas.com. Include the full title and page numbers (if applicable).
If you have comments about the software, please send them to
suggest@sas.com.
SAS® Publishing Delivers!
Whether you are new to the work force or an experienced professional, you need to distinguish yourself in this rapidly
changing and competitive job market. SAS® Publishing provides you with a wide range of resources to help you set
yourself apart. Visit us online at support.sas.com/bookstore.
SAS® Press
Need to learn the basics? Struggling with a programming problem? You’ll find the expert answers that you
need in example-rich books from SAS Press. Written by experienced SAS professionals from around the
world, SAS Press books deliver real-world insights on a broad range of topics for all skill levels.
support.sas.com/saspress
SAS® Documentation
To successfully implement applications using SAS software, companies in every industry and on every
continent all turn to the one source for accurate, timely, and reliable information: SAS documentation.
We currently produce the following types of reference documentation to improve your work experience:
• ReferencedocumentationdeliveredinHTMLandPDF– free on the Web.
• Hard-copybooks.
support.sas.com/publishing
SAS® Publishing News
Subscribe to SAS Publishing News to receive up-to-date information about all new SAS titles, author
podcasts, and new Web site features via e-mail. Complete instructions on how to subscribe, as well as
access to past issues, are available at our Web site.
support.sas.com/spn
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.