No part of this manual may be reproduced in
any form or by any means (including electronic storage and retrieval or translation
into a foreign language) without prior agreement and written consent from Agilent
Technologies, Inc. as governed by United
States and international copyright laws.
Edition
G4460-90064
Revision A0, January 2021
Printed in USA
Agilent Technologies, Inc.
5301 Stevens Creek Blvd.
Santa Clara, CA 95051
Patents
Portions of this product may be covered
under US patent 6571005 licensed from the
Regents of the University of California.
Technical Support
For US and Canada
Call (800) 227-9770 (option 3,4,2)
Or send an e-mail to:
informatics_support@agilent.com
For all other regions
Agilent’s world-wide Sales and Support
Center contact details for your location can
be obtained at
www.agilent.com/en/contact-us/page.
Warranty
The material contained in this document is provided “as is,” and is subject to being changed, without notice,
in future editions. Further, to the maximum extent permitted by applicable
law, Agilent disclaims all warranties,
either express or implied, with regard
to this manual and any information
contained herein, including but not
limited to the implied warranties of
merchantability and fitness for a particular purpose. Agilent shall not be
liable for errors or for incidental or
consequential damages in connection with the furnishing, use, or performance of this document or of any
information contained herein. Should
Agilent and the user have a separate
written agreement with warranty
terms covering the material in this
document that conflict with these
terms, the warranty terms in the separate agreement shall control.
Technology Licenses
The hardware and/or software described in
this document are furnished under a license
and may be used or copied only in accordance with the terms of such license.
Restricted Rights Legend
U.S. Government Restricted Rights. Software and technical data rights granted to
the federal government include only those
rights customarily provided to end user customers. Agilent provides this customary
commercial license in Software and technical data pursuant to FAR 12.211 (Technical
Data) and 12.212 (Computer Software) and,
for the Department of Defense, DFARS
252.227-7015 (Technical Data - Commercial
Items) and DFARS 227.7202-3 (Rights in
Commercial Computer Software or Computer Software Documentation).
Safety Notices
A CAUTION notice denotes a hazard. It calls attention to an operating procedure, practice, or the like
that, if not correctly performed or
adhered to, could result in damage
to the product or loss of important
data. Do not proceed beyond a
CAUTION notice until the indicated
conditions are fully understood and
met.
A WARNING notice denotes a
hazard. It calls attention to an
operating procedure, practice, or
the like that, if not correctly performed or adhered to, could result
in personal injury or death. Do not
proceed beyond a WARNING
notice until the indicated conditions are fully understood and
met.
2Feature Extraction Reference Guide
In This Guide…
This Reference Guide contains tables that list default
parameter values and results for Feature Extraction
analyses, and explanations of how Feature Extraction uses
its algorithms to calculate results.
1 Protocol Default Settings
This chapter includes tables that list the default parameter
values found in the protocols shipped with the software
(Agilent 2-color gene expression (GE), 1-color GE, CGH,
ChIP, miRNA and non-Agilent protocols).
2 QC Report Results
Learn how to read and interpret the QC Reports.
3 Text File Parameters and Results
This chapter contains a listing of parameters and results
within the text file produced after Feature Extraction.
4 XML (MAGE-ML) Results
Refer to this chapter to find the results contained in the
MAGE-ML files generated after Feature Extraction.
5 How Algorithms Calculate Results
Learn how Feature Extraction algorithms calculate the
results that help you interpret your gene expression (2-color
and 1-color), CGH, ChIP and miRNA experiments.
6 Command Line Feature Extraction
This chapter contains the commands and arguments to
integrate Feature Extraction into a completely automated
workflow.
Acknowledgments
Apache acknowledgment
Part of this software is based on the Xerces XML parser,
Copyright (c) 1999-2000 The Apache Software Foundation.
All Rights Reserved (www.apache.org).
JPEG acknowledgment
This software is based in part on the work of the
Independent JPEG Group. Copyright (c) 1991-1998, Thomas
G. Lane. All Rights Reserved.
Loess/Netlib acknowledgment
Part of this software is based on a Loess/Lowess algorithm
and implementation. The authors of Loess/Lowess are
Cleveland, Grosse and Shyu. Copyright (c) 1989, 1992 by
AT&T. Permission to use, copy, modify and distribute this
software for any purpose without fee is hereby granted,
provided that this entire notice is included in all copies of
any software which is or includes a copy or modification of
this software and in all copies of the supporting
documentation for such software.
THIS SOFTWARE IS BEING PROVIDED “AS IS”, WITHOUT
ANY EXPRESS OR IMPLIED WARRANTY. NEITHER THE
AUTHORS NOR AT&T MAKE ANY REPRESENTATION OR
WARRANTY OF ANY KIND CONCERNING THE
MERCHANTABILITY OF THIS SOFTWARE OR ITS FITNESS
FOR ANY PARTICULAR PURPOSE.
Stanford University School of Medicine acknowledgment
Non-Agilent microarray image courtesy of Dr. Roger Wagner,
Division of Cardiovascular Medicine, Stanford University
School of Medicine.
Ultimate Grid acknowledgment
This software contains material that is Copyright (c)
1994-1999 DUNDAS SOFTWARE LTD., All Rights Reserved.
LibTiff acknowledgement
Part of this software is based upon LibTIFF version 3.8.0.
Copyright (c) 1988-1997 Sam Leffler
Copyright (c) 1991-1997 Silicon Graphics, Inc.
Permission to use, copy, modify, distribute, and sell this
software and its documentation for any purpose is hereby
granted without fee, provided that (i) the above copyright
notices and this permission notice appear in all copies of
the software and related documentation, and (ii) the names
of Sam Leffler and Silicon Graphics may not be used in any
advertising or publicity relating to the software without the
specific, prior written permission of Sam Leffler and Silicon
Graphics.
THE SOFTWARE IS PROVIDED “AS-IS” AND WITHOUT
WARRANTY OF ANY KIND, EXPRESS, IMPLIED OR
OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY
WARRANTY OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.
IN NO EVENT SHALL SAM LEFFLER OR SILICON
GRAPHICS BE LIABLE FOR ANY SPECIAL, INCIDENTAL,
INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND,
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
OF USE, DATA OR PROFITS, WHETHER OR NOT ADVISED
OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY
OF LIABILITY, ARISING OUT OF OR IN CONNECTION WITH
THE USE OR PERFORMANCE OF THIS SOFTWARE.
Contents
1 Default Protocol Settings 13
Default Protocol Settings—an Introduction 14
Differences between CGH and gene expression microarrays 15
Hidden Settings 15
Spot finding of Four Corners 90
Outlier Stats 91
Spatial Distribution of All Outliers 91
Net Signal Statistics 93
Negative Control Stats 94
Plot of Background-Corrected Signals 95
Histogram of Signals Plot (1-color GE or CGH) 96
Local Background Inliers 97
Foreground Surface Fit 97
Multiplicative Surface Fit 99
Spatial Distribution of Significantly Up-Regulated and Down-Regulated
Features (Positive and Negative Log Ratios) 100
Plot of LogRatio vs. Log ProcessedSignal 101
Spatial Distribution of Median Signals for each Row and Column 102
Histogram of LogRatio plot 103
FULL Features Table 179
COMPACT Features Table 190
QC Features Table 195
MINIMAL Features Table 201
Other text result file annotations 205
4 MAGE-ML (XML) File Results 207
How Agilent output file formats are used by databases 208
MAGE-ML results 209
Differences between MAGE-ML and text result files 209
Full and Compact Output Packages 209
Tables for Full Output Package 210
Table for Compact Output Package 218
Helpful hints for transferring Agilent output files 222
XML output 222
TIFF Results 224
5 How Algorithms Calculate Results 225
Overview of Feature Extraction algorithms 226
Algorithms and functions they perform 226
Algorithms and results they produce 232
XDR Extraction Process 236
What is XDR scanning? 236
XDR Feature Extraction process 236
How the XDR algorithm works 238
Troubleshooting the XDR extraction 239
How each algorithm calculates a result 240
Place Grid 240
Optimize Grid Fit 243
Find Spots 243
Flag Outliers 250
Compute Bkgd, Bias and Error 256
Correct Dye Biases 276
Compute Ratios 280
Calculate Metrics 282
MicroRNA Analysis 285
Example calculations for feature 12519 of Agilent Human 22K image 292
Data from the FEPARAMS table 293
Data from the STATS Table 293
Data from the FEATURES Table 293
6 Command Line Feature Extraction 299
Commands 301
Command line syntax 301
Commands and arguments 302
Return Codes 307
Extraction Input 309
Extraction Results 314
Status information 314
Examples of status information 315
Error codes from XML file 317
Warning codes from XML file 321
Index 327
Agilent Feature Extraction 12.2
Reference Guide
1
Default Protocol Settings
Default Protocol Settings—an Introduction 14
Tables of Default Protocol Settings 16
Differences in Protocol Settings Based on Each Step 56
See the Feature Extraction 12.2
User Guide to learn the purpose of
all the parameters and settings and
how to modify them.
Agilent protocols are meant for use
with Agilent microarrays scanned
with an Agilent scanner. They are
intended for use with arrays that
use Agilent default lab procedures
(label, hybridization, wash, and
scanning methods). The
non-Agilent protocol is meant for
use with non-Agilent microarrays
that are scanned with an Agilent
scanner.
When a protocol is assigned to an extraction set, the
software loads a set of protocol parameter values and
settings that affect the process and results for Feature
Extraction.
Parameter values in the protocol depend on the microarray
type and your experiment. The following pages list the
default settings for each of the protocol templates shipped or
downloaded with the software. Each protocol template
represents a different microarray type. You can display these
settings and values when you open the Protocol Editor for
each of the protocol templates.
Default Protocol Settings—an Introduction
To learn more about changing the
default values for the protocols,
see the Feature Extraction 12.2
User Guide.
To learn about the naming of the
protocol templates, see the Feature
Extraction 12.2 User Guide.
Agilent provides new and updated
protocols on the eArray website. If
you set up an eArray login in
Feature Extraction, the software
can automatically download and
install protocol updates from
eArray. See the Feature Extraction
12.2 User Guide for more details.
This chapter presents tables that list the default
settings for each protocol. Parameter values depend on:
• microarray type
• lab protocol
• formats
• scanner used
Listed in the following table are the names of the
nonremovable protocols and where you can find the tables
that list their default values.
Table 1 Location of protocol template default settings
Protocol template name    Location in chapter
CGH_1201_Sep17            page 16
ChIP_1200_Jun14           page 24
GE1_1200_Jun14            page 31
GE2_1200_Dec17            page 37
GE2-NonAT_1100_Jul11      page 44
miRNA_1200_Jun14          page 49
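The template names in Table 1 appear to follow a regular shape. The sketch below splits a name into three parts, assuming the pattern is application, four-digit version, and a month-year tag separated by underscores. That reading is an assumption made for illustration only; the naming rules themselves are described in the Feature Extraction 12.2 User Guide. The names `ProtocolName` and `parse_protocol_name` are hypothetical.

```python
import re
from typing import NamedTuple


class ProtocolName(NamedTuple):
    application: str  # e.g. "CGH" or "GE2-NonAT"
    version: str      # e.g. "1201" (meaning assumed, not confirmed here)
    date_tag: str     # e.g. "Sep17" (meaning assumed, not confirmed here)


# Assumed pattern: <application>_<4 digits>_<Mon><2 digits>
_NAME_RE = re.compile(
    r"^(?P<application>[A-Za-z0-9-]+)_(?P<version>\d{4})_(?P<date_tag>[A-Z][a-z]{2}\d{2})$"
)


def parse_protocol_name(name: str) -> ProtocolName:
    """Split a protocol template name into its three assumed parts."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized protocol template name: {name!r}")
    return ProtocolName(**match.groupdict())


# Names and page references copied from Table 1.
TABLE_1 = {
    "CGH_1201_Sep17": 16,
    "ChIP_1200_Jun14": 24,
    "GE1_1200_Jun14": 31,
    "GE2_1200_Dec17": 37,
    "GE2-NonAT_1100_Jul11": 44,
    "miRNA_1200_Jun14": 49,
}

parsed = {name: parse_protocol_name(name) for name in TABLE_1}
```

A script that sorts extraction results by application could group on `parsed[name].application` without hard-coding each template name.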
Differences between CGH and gene expression microarrays
To see the differences in some
default settings between protocols,
go to “GE2_1200_Dec17” on
page 37.
CGH microarrays possess a different negative control
sequence scheme than the gene expression microarrays. The
gene expression microarrays have many replicate negative
control features using only one sequence. The CGH
microarrays have many sequences of negative controls that
span the range of sequence variability seen in the biological
probes used on the microarrays. This difference in the
control grid (especially the multiple sequences used for
negative controls) leads to a difference in protocol settings.
To create a protocol for a specific type of microarray, you
are required to use an Agilent-created protocol or
user-created protocol for the same type of microarray.
CAUTION
Protocol templates provide both visible and hidden settings whose
values are specific to the type or format of microarrays. Although you
can change the visible settings so that any two protocols of different
type appear identical, you cannot change the hidden settings that
distinguish these protocols from one another.
Hidden Settings
The “Tables of Default Protocol Settings” show only the
default visible parameter values for the steps of the protocol.
You can see the hidden parameters in the FEPARAMS table.
See “Parameters/options (FEPARAMS)” on page 129. Many of
these hidden parameters are image-processing ones that are
chosen using the “Automatically Determine” function.
Tables of Default Protocol Settings
CAUTION
These protocol settings may not be optimum for non-Agilent
microarrays or Agilent microarrays processed with non-Agilent
procedures. You determine the settings and values that are optimum
for your system.
CGH_1201_Sep17
This protocol is a CGH protocol for use with the
Oligonucleotide Array-Based CGH for Genomic DNA
Analysis (Enzymatic User Manual version 6.1 or higher, ULS
User Manual version 3.1 or higher).
Table 2 Default settings for CGH_1201_Sep17 protocol

Protocol step: Place Grid

  Array Format
    Default: Automatically Determine
    [Recognized formats: Single Density (11k, 22k), 25k, Double
    Density (44k), 95k, 185k, 185k 10 uM, 65-micron feature size
    (also with 10-micron scans), 30-micron feature size single pack
    and multi pack, and Third Party]
    For any format automatically determined or selected by you, the
    software uses the default Placement Method. Parameters that apply
    to specific formats appear only if that format is selected.

  Placement Method
    Default: Allow Some Distortion (All formats)
    Hidden if Array Format is set to Automatically Determine.

  Enable Background Peak Shifting
    Default: Set to False for all arrays except 30 microns single
    pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.

  Use central part of pack for slope and skew calculation?
    Default: Set to False for all arrays except 30 microns single
    pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.

  Use the correlation method to obtain origin X of subgrids
    Default: Set to False for all arrays except 30 microns single
    pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.

  Use Enhanced Gridding
    Default: True
    Apply the enhanced gridding feature released in Feature
    Extraction 12.1. The enhancements include a new iterative method
    for determining grid position, rotation, and skew, and several
    “fine” grid tuning methods that improve the calculation of
    rotation and skew. Enhanced gridding also uses both the
    foreground and background of the corner stencil patterns to
    improve identification of grid corners.
    Note: Results obtained with protocols that use enhanced gridding
    may vary slightly from results obtained with previous gridding
    algorithms (e.g., fewer gridding errors). Use appropriate
    validation processes when switching from previous CGH protocols
    to ones that use enhanced gridding.

Protocol step: Optimize Grid Fit

  Grid Format
    Default: Automatically Determine
    [Recognized formats: 65-micron feature size, 30-micron feature
    size, and Third Party]
    The parameters and values for optimizing the grid differ
    depending on the format.

  Iteratively Adjust Corners?
    Default: True (All Formats, except Third Party)
             False (Third Party)
    Hidden if Array Format is set to Automatically Determine.

  Adjustment Threshold
    Default: 0.300 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
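
The format-dependent defaults listed so far in Table 2 can be summarized as a small lookup. This is a sketch for readers who script around protocol settings, not a description of how the software stores them. The format labels, the function name `cgh_defaults`, and the use of a single format argument for both the Array Format and Grid Format rules are simplifying assumptions.

```python
from typing import Any, Dict, Optional

# Hypothetical labels; the software's internal identifiers are not given here.
THIRTY_MICRON = {"30-micron single pack", "30-micron multi pack"}
THIRD_PARTY = "Third Party"


def cgh_defaults(fmt: str) -> Dict[str, Any]:
    """Return the CGH_1201_Sep17 defaults from Table 2 for one format label."""
    is_30_micron = fmt in THIRTY_MICRON
    is_third_party = fmt == THIRD_PARTY
    # The table gives 0.300 for all formats except Third Party and states
    # no Third Party value, so None marks "not specified in the table".
    threshold: Optional[float] = None if is_third_party else 0.300
    return {
        "Placement Method": "Allow Some Distortion",
        "Enable Background Peak Shifting": is_30_micron,
        "Use central part of pack for slope and skew calculation?": is_30_micron,
        "Use the correlation method to obtain origin X of subgrids": is_30_micron,
        "Use Enhanced Gridding": True,
        "Iteratively Adjust Corners?": not is_third_party,
        "Adjustment Threshold": threshold,
    }


defaults_44k = cgh_defaults("Double Density (44k)")
defaults_30um = cgh_defaults("30-micron multi pack")
defaults_third_party = cgh_defaults(THIRD_PARTY)
```

Comparing two of these dictionaries key by key is a quick way to see which visible defaults change between formats.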
Table 2 Default settings for CGH_1201_Sep17 protocol (continued)
This enhancement allows for more
accurate placement of the center of
each spot by increasing the area
around the expected spot center in
which the algorithm looks for pixels
in the image that are attributable to
that spot. If the increased search
area captures pixels from
neighboring spots, then the
algorithm does not attribute those
pixels to the spot.
  Minimum Population
    Default: 10
  IQRatio
    Default: 1.42
  Background IQRatio
    Default: 1.42
  Use Qtest for Small Populations?
    Default: True
Report Population Outliers as Failed
in MAGEML file
change depending on the scanner
used for the image. See the
following for differences.
False
Note: Results obtained with
protocols that use enhanced spot
finding may vary slightly from
results obtained without enhanced
spot finding (e.g., fewer non-uniform
features). Use appropriate
validation processes when
switching to CGH protocols that
use enhanced spot finding.
False
Automatically Determine
Agilent scanner
  Automatically Compute OL Polynomial Terms
    Default: True
    Hidden if Array Format is set to Automatically Determine.
  Feature – (%CV)^2
    Default: 0.04000
  Red Poissonian Noise Term Multiplier
    Default: 5
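
To give a feel for the Minimum Population (10) and IQRatio (1.42) values listed in this table, the sketch below applies a generic interquartile-range rejection test. It assumes a value is a population outlier when it lies more than IQRatio times the interquartile range outside the quartiles, and that populations smaller than Minimum Population are not tested this way. Both points are assumptions for illustration; the actual calculation is covered in the Flag Outliers material of the chapter “How Algorithms Calculate Results”. The function name and sample data are hypothetical.

```python
import statistics
from typing import List, Sequence


def find_population_outliers(
    values: Sequence[float],
    iq_ratio: float = 1.42,    # IQRatio default from the table
    min_population: int = 10,  # Minimum Population default from the table
) -> List[float]:
    """Return values outside [Q1 - iq_ratio*IQR, Q3 + iq_ratio*IQR]."""
    if len(values) < min_population:
        # Assumption: small populations are handled by a different test
        # (the table lists "Use Qtest for Small Populations?"), so this
        # sketch reports nothing for them.
        return []
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low = q1 - iq_ratio * iqr
    high = q3 + iq_ratio * iqr
    return [v for v in values if v < low or v > high]


# Made-up signal values: ten similar readings and one extreme one.
signals = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 100]
outliers = find_population_outliers(signals)
too_few = find_population_outliers(signals[:5])
```

A larger IQRatio widens the accepted interval, so fewer values are rejected; a smaller one tightens it.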
Table 2 Default settings for CGH_1201_Sep17 protocol (continued)

Protocol step: Place Grid

  Array Format
    Default: Automatically Determine
    [Recognized formats: Single Density (11k, 22k), 25k, Double
    Density (44k), 95k, 185k, 185k 10 uM, 65-micron feature size
    (also with 10-micron scans), 30-micron feature size (single pack
    and multi pack) and Third Party]
    For any format automatically determined or selected by you, the
    software uses the default Placement Method. Parameters that apply
    to specific formats appear only if that format is selected.

  Placement Method
    Default: Allow Some Distortion (All formats)
    Hidden if Array Format is set to Automatically Determine.

  Enable Background Peak Shifting
    Default: Set to False for all arrays except 30 microns (single
    pack and multi pack), for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.

  Use central part of pack for slope and skew calculation?
    Default: Set to False for all arrays except 30 microns single
    pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.

  Use the correlation method to obtain origin X of subgrids
    Default: Set to False for all arrays except 30 microns single
    pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
Table 3 Default settings for ChIP_1200_Jun14 protocol (continued)

  Use Enhanced Gridding
    Default: False
    An enhanced automatic gridding algorithm was released in Feature
    Extraction 12.1 for use in CGH protocols only. Agilent has not
    validated the new algorithm in ChIP protocols.

Protocol step: Optimize Grid Fit

  Grid Format
    Default: Automatically Determine
    [Recognized formats: 65-micron feature size, 30-micron feature
    size, and Third Party]
    The parameters and values for optimizing the grid differ
    depending on the format.

  Iteratively Adjust Corners?
    Default: True (All Formats, except Third Party)
             False (Third Party)
    Hidden if Array Format is set to Automatically Determine.

  Adjustment Threshold
    Default: 0.300 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.

  Maximum Number of Iterations
    Default: 5 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.

  Found Spot Threshold
    Default: 0.200 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.

  Number of Corner Feature Side Dimension?
    Default: 20 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.

Protocol step: Find Spots

  Spot Format
    Default: Automatically Determine
    [Recognized formats: same as those listed above except 244k 10uM
    replaces 65-micron feature size 10-micron scans]
    Depending on the format selected by the software or by you, the
    default settings for this step change. See the following rows for
    the default values for finding spots.
Table 3 Default settings for ChIP_1200_Jun14 protocol (continued)