No part of this manual may be reproduced in
any form or by any means (including electronic storage and retrieval or translation
into a foreign language) without prior agreement and written consent from Agilent
Technologies, Inc. as governed by United
States and international copyright laws.
Edition
G4460-90064
Revision A0, January 2021
Printed in USA
Agilent Technologies, Inc.
5301 Stevens Creek Blvd.
Santa Clara, CA 95051
Patents
Portions of this product may be covered
under US patent 6571005 licensed from the
Regents of the University of California.
Technical Support
For US and Canada
Call (800) 227-9770 (option 3,4,2)
Or send an e-mail to:
informatics_support@agilent.com
For all other regions
Agilent’s world-wide Sales and Support
Center contact details for your location can
be obtained at
www.agilent.com/en/contact-us/page.
Warranty
The material contained in this document is provided “as is,” and is subject to being changed, without notice,
in future editions. Further, to the maximum extent permitted by applicable
law, Agilent disclaims all warranties,
either express or implied, with regard
to this manual and any information
contained herein, including but not
limited to the implied warranties of
merchantability and fitness for a particular purpose. Agilent shall not be
liable for errors or for incidental or
consequential damages in connection with the furnishing, use, or performance of this document or of any
information contained herein. Should
Agilent and the user have a separate
written agreement with warranty
terms covering the material in this
document that conflict with these
terms, the warranty terms in the separate agreement shall control.
Technology Licenses
The hardware and/or software described in
this document are furnished under a license
and may be used or copied only in accordance with the terms of such license.
Restricted Rights Legend
U.S. Government Restricted Rights. Software and technical data rights granted to
the federal government include only those
rights customarily provided to end user customers. Agilent provides this customary
commercial license in Software and technical data pursuant to FAR 12.211 (Technical
Data) and 12.212 (Computer Software) and,
for the Department of Defense, DFARS
252.227-7015 (Technical Data - Commercial
Items) and DFARS 227.7202-3 (Rights in
Commercial Computer Software or Computer Software Documentation).
Safety Notices
A CAUTION notice denotes a hazard. It calls attention to an operating procedure, practice, or the like
that, if not correctly performed or
adhered to, could result in damage
to the product or loss of important
data. Do not proceed beyond a
CAUTION notice until the indicated
conditions are fully understood and
met.
A WARNING notice denotes a
hazard. It calls attention to an
operating procedure, practice, or
the like that, if not correctly performed or adhered to, could result
in personal injury or death. Do not
proceed beyond a WARNING
notice until the indicated conditions are fully understood and
met.
Feature Extraction Reference Guide
In This Guide…
This Reference Guide contains tables that list default
parameter values and results for Feature Extraction
analyses, and explanations of how Feature Extraction uses
its algorithms to calculate results.
1 Protocol Default Settings
This chapter includes tables that list the default parameter
values found in the protocols shipped with the software
(Agilent 2-color gene expression (GE), 1-color GE, CGH,
ChIP, miRNA, and non-Agilent protocols).
2 QC Report Results
Learn how to read and interpret the QC Reports.
3 Text File Parameters and Results
This chapter contains a listing of parameters and results
within the text file produced after Feature Extraction.
4 XML (MAGE-ML) Results
Refer to this chapter to find the results contained in the
MAGE-ML files generated after Feature Extraction.
5 How Algorithms Calculate Results
Learn how Feature Extraction algorithms calculate the
results that help you interpret your gene expression (2-color
and 1-color), CGH, ChIP and miRNA experiments.
6 Command Line Feature Extraction
This chapter contains the commands and arguments to
integrate Feature Extraction into a completely automated
workflow.
Acknowledgments
Apache acknowledgment
Part of this software is based on the Xerces XML parser,
Copyright (c) 1999-2000 The Apache Software Foundation.
All Rights Reserved (www.apache.org).
JPEG acknowledgment
This software is based in part on the work of the
Independent JPEG Group. Copyright (c) 1991-1998, Thomas
G. Lane. All Rights Reserved.
Loess/Netlib acknowledgment
Part of this software is based on a Loess/Lowess algorithm
and implementation. The authors of Loess/Lowess are
Cleveland, Grosse and Shyu. Copyright (c) 1989, 1992 by
AT&T. Permission to use, copy, modify and distribute this
software for any purpose without fee is hereby granted,
provided that this entire notice is included in all copies of
any software which is or includes a copy or modification of
this software and in all copies of the supporting
documentation for such software.
THIS SOFTWARE IS BEING PROVIDED “AS IS”, WITHOUT
ANY EXPRESS OR IMPLIED WARRANTY. NEITHER THE
AUTHORS NOR AT&T MAKE ANY REPRESENTATION OR
WARRANTY OF ANY KIND CONCERNING THE
MERCHANTABILITY OF THIS SOFTWARE OR ITS FITNESS
FOR ANY PARTICULAR PURPOSE.
Stanford University School of Medicine acknowledgment
Non-Agilent microarray image courtesy of Dr. Roger Wagner,
Division of Cardiovascular Medicine, Stanford University
School of Medicine.
Ultimate Grid acknowledgment
This software contains material that is Copyright (c)
1994-1999 DUNDAS SOFTWARE LTD., All Rights Reserved.
LibTiff acknowledgement
Part of this software is based upon LibTIFF version 3.8.0.
Copyright (c) 1988-1997 Sam Leffler
Copyright (c) 1991-1997 Silicon Graphics, Inc.
Permission to use, copy, modify, distribute, and sell this
software and its documentation for any purpose is hereby
granted without fee, provided that (i) the above copyright
notices and this permission notice appear in all copies of
the software and related documentation, and (ii) the names
of Sam Leffler and Silicon Graphics may not be used in any
advertising or publicity relating to the software without the
specific, prior written permission of Sam Leffler and Silicon
Graphics.
THE SOFTWARE IS PROVIDED “AS-IS” AND WITHOUT
WARRANTY OF ANY KIND, EXPRESS, IMPLIED OR
OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY
WARRANTY OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.
IN NO EVENT SHALL SAM LEFFLER OR SILICON
GRAPHICS BE LIABLE FOR ANY SPECIAL, INCIDENTAL,
INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND,
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
OF USE, DATA OR PROFITS, WHETHER OR NOT ADVISED
OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY
OF LIABILITY, ARISING OUT OF OR IN CONNECTION WITH
THE USE OR PERFORMANCE OF THIS SOFTWARE.
Contents
1 Default Protocol Settings 13
Default Protocol Settings—an Introduction 14
Differences between CGH and gene expression microarrays 15
Hidden Settings 15
Spot finding of Four Corners 90
Outlier Stats 91
Spatial Distribution of All Outliers 91
Net Signal Statistics 93
Negative Control Stats 94
Plot of Background-Corrected Signals 95
Histogram of Signals Plot (1-color GE or CGH) 96
Local Background Inliers 97
Foreground Surface Fit 97
Multiplicative Surface Fit 99
Spatial Distribution of Significantly Up-Regulated and Down-Regulated
Features (Positive and Negative Log Ratios) 100
Plot of LogRatio vs. Log ProcessedSignal 101
Spatial Distribution of Median Signals for each Row and Column 102
Histogram of LogRatio plot 103
FULL Features Table 179
COMPACT Features Table 190
QC Features Table 195
MINIMAL Features Table 201
Other text result file annotations 205
4 MAGE-ML (XML) File Results 207
How Agilent output file formats are used by databases 208
MAGE-ML results 209
Differences between MAGE-ML and text result files 209
Full and Compact Output Packages 209
Tables for Full Output Package 210
Table for Compact Output Package 218
Helpful hints for transferring Agilent output files 222
XML output 222
TIFF Results 224
5 How Algorithms Calculate Results 225
Overview of Feature Extraction algorithms 226
Algorithms and functions they perform 226
Algorithms and results they produce 232
XDR Extraction Process 236
What is XDR scanning? 236
XDR Feature Extraction process 236
How the XDR algorithm works 238
Troubleshooting the XDR extraction 239
How each algorithm calculates a result 240
Place Grid 240
Optimize Grid Fit 243
Find Spots 243
Flag Outliers 250
Compute Bkgd, Bias and Error 256
Correct Dye Biases 276
Compute Ratios 280
Calculate Metrics 282
MicroRNA Analysis 285
Example calculations for feature 12519 of Agilent Human 22K image 292
Data from the FEPARAMS table 293
Data from the STATS Table 293
Data from the FEATURES Table 293
6 Command Line Feature Extraction 299
Commands 301
Command line syntax 301
Commands and arguments 302
Return Codes 307
Extraction Input 309
Extraction Results 314
Status information 314
Examples of status information 315
Error codes from XML file 317
Warning codes from XML file 321
Index 327
Agilent Feature Extraction 12.2
Reference Guide
1 Default Protocol Settings
Default Protocol Settings—an Introduction 14
Tables of Default Protocol Settings 16
Differences in Protocol Settings Based on Each Step 56
See the Feature Extraction 12.2
User Guide to learn the purpose of
all the parameters and settings and
how to modify them.
Agilent protocols are meant for use
with Agilent microarrays scanned
with an Agilent scanner. They are
intended for use with arrays that
use Agilent default lab procedures
(label, hybridization, wash, and
scanning methods). The
non-Agilent protocol is meant for
use with non-Agilent microarrays
that are scanned with an Agilent
scanner.
When a protocol is assigned to an extraction set, the
software loads a set of protocol parameter values and
settings that affect the process and results for Feature
Extraction.
Parameter values in the protocol depend on the microarray
type and your experiment. The following pages list the
default settings for each of the protocol templates shipped or
downloaded with the software. Each protocol template
represents a different microarray type. You can display these
settings and values when you open the Protocol Editor for
each of the protocol templates.
Default Protocol Settings—an Introduction
To learn more about changing the
default values for the protocols,
see the Feature Extraction 12.2
User Guide.
To learn about the naming of the
protocol templates, see the Feature
Extraction 12.2 User Guide.
Agilent provides new and updated
protocols on the eArray website. If
you set up an eArray login in
Feature Extraction, the software
can automatically download and
install protocol updates from
eArray. See the Feature Extraction
12.2 User Guide for more details.
This chapter presents tables of the default settings for each
protocol. Parameter values depend on:
• microarray type
• lab protocol
• formats
• scanner used
Listed in the following table are the names of the
nonremovable protocols and where you can find the tables
that list their default values.
Table 1 Location of protocol template default settings

Protocol Template name     Location in chapter
CGH_1201_Sep17             page 16
ChIP_1200_Jun14            page 24
GE1_1200_Jun14             page 31
GE2_1200_Dec17             page 37
GE2-NonAT_1100_Jul11       page 44
miRNA_1200_Jun14           page 49
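
The template names in Table 1 share a visible shape: a microarray type, a four-digit code, and a month-year tag, joined by underscores. The following is a minimal sketch that splits a name into those three parts. The pattern and the field names (`array_type`, `code`, `date_tag`) are assumptions inferred only from the six names above; the actual naming rules are described in the Feature Extraction 12.2 User Guide and may differ.

```python
import re
from typing import NamedTuple


class TemplateName(NamedTuple):
    """Hypothetical breakdown of a protocol template name."""
    array_type: str  # e.g. "CGH" or "GE2-NonAT"
    code: str        # four-digit code; its meaning is not interpreted here
    date_tag: str    # e.g. "Sep17"


# Assumed shape <type>_<4 digits>_<MonYY>, inferred from Table 1 only.
_TEMPLATE_RE = re.compile(
    r"^(?P<type>[A-Za-z0-9-]+)_(?P<code>\d{4})_(?P<tag>[A-Z][a-z]{2}\d{2})$"
)

# Page locations exactly as listed in Table 1.
TABLE_1_PAGES = {
    "CGH_1201_Sep17": 16,
    "ChIP_1200_Jun14": 24,
    "GE1_1200_Jun14": 31,
    "GE2_1200_Dec17": 37,
    "GE2-NonAT_1100_Jul11": 44,
    "miRNA_1200_Jun14": 49,
}


def parse_template_name(name: str) -> TemplateName:
    """Split a template name into its parts, or raise ValueError."""
    match = _TEMPLATE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized template name: {name!r}")
    return TemplateName(match["type"], match["code"], match["tag"])


if __name__ == "__main__":
    for template in TABLE_1_PAGES:
        print(parse_template_name(template), "page", TABLE_1_PAGES[template])
```

A script that post-processes extraction results could use such a helper to group output by microarray type without hard-coding every template name.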
Differences between CGH and gene expression microarrays
To see the differences in some
default settings between protocols,
go to “GE2_1200_Dec17” on
page 37.
CGH microarrays possess a different negative control
sequence scheme than the gene expression microarrays. The
gene expression microarrays have many replicate negative
control features using only one sequence. The CGH
microarrays have many sequences of negative controls that
span the range of sequence variability seen in the biological
probes used on the microarrays. This difference in the
control grid (especially the multiple sequences used for
negative controls) leads to a difference in protocol settings.
To create a protocol for a specific type of microarray, you
are required to use an Agilent-created protocol or
user-created protocol for the same type of microarray.
CAUTION
Protocol templates provide both visible and hidden settings whose
values are specific to the type or format of microarrays. Although you
can change the visible settings so that any two protocols of different
type appear identical, you cannot change the hidden settings that
distinguish these protocols from one another.
Hidden Settings
The “Tables of Default Protocol Settings” show only the
default visible parameter values for the steps of the protocol.
You can see the hidden parameters in the FEPARAMS table.
See “Parameters/options (FEPARAMS)” on page 129. Many of
these hidden parameters are image-processing ones that are
chosen using the “Automatically Determine” function.
Tables of Default Protocol Settings
CAUTION
These protocol settings may not be optimum for non-Agilent
microarrays or Agilent microarrays processed with non-Agilent
procedures. You determine the settings and values that are optimum
for your system.
CGH_1201_Sep17
This protocol is a CGH protocol for use with the
Oligonucleotide Array-Based CGH for Genomic DNA
Analysis (Enzymatic User Manual version 6.1 or higher, ULS
User Manual version 3.1 or higher).
Table 2 Default settings for CGH_1201_Sep17 protocol

Place Grid
  Array Format
    Automatically Determine
    [Recognized formats: Single Density (11k, 22k), 25k, Double Density (44k), 95k, 185k, 185k 10 uM, 65-micron feature size (also with 10-micron scans), 30-micron feature size single pack and multi pack, and Third Party]
    For any format automatically determined or selected by you, the software uses the default Placement Method.
    Parameters that apply to specific formats appear only if that format is selected.
  Placement Method
    Allow Some Distortion (All formats)
    Hidden if Array Format is set to Automatically Determine.
  Enable Background Peak Shifting
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use central part of pack for slope and skew calculation?
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use the correlation method to obtain origin X of subgrids
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use Enhanced Gridding
    True
    Apply the enhanced gridding feature released in Feature Extraction 12.1. The enhancements include a new iterative method for determining grid position, rotation, and skew, and several “fine” grid tuning methods that improve the calculation of rotation and skew. Enhanced gridding also uses both the foreground and background of the corner stencil patterns to improve identification of grid corners.
    Note: Results obtained with protocols that use enhanced gridding may vary slightly from results obtained with previous gridding algorithms (e.g., fewer gridding errors). Use appropriate validation processes when switching from previous CGH protocols to ones that use enhanced gridding.
Optimize Grid Fit
  Grid Format
    Automatically Determine
    [Recognized formats: 65-micron feature size, 30-micron feature size, and Third Party]
    The parameters and values for optimizing the grid differ depending on the format.
  Iteratively Adjust Corners?
    True (All Formats, except Third Party)
    False (Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Adjustment Threshold
    0.300 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
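
Several defaults in Table 2 depend only on the array format: three Place Grid options are True only for 30-micron arrays (single pack and multi pack), and the corner-adjustment defaults apply to every format except Third Party. The sketch below restates those rules as a lookup. The class and function names are hypothetical illustrations, not part of the Feature Extraction software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GridDefaults:
    """Format-dependent defaults restated from Table 2 (names are illustrative)."""
    enable_background_peak_shifting: bool
    use_central_part_of_pack: bool
    use_correlation_for_origin_x: bool
    iteratively_adjust_corners: bool
    adjustment_threshold: Optional[float]  # None where the table lists no value


def cgh_grid_defaults(is_30_micron: bool, is_third_party: bool) -> GridDefaults:
    """Return the Table 2 defaults for one array format."""
    return GridDefaults(
        # True only for 30-micron single pack and multi pack arrays.
        enable_background_peak_shifting=is_30_micron,
        use_central_part_of_pack=is_30_micron,
        use_correlation_for_origin_x=is_30_micron,
        # True for all formats except Third Party.
        iteratively_adjust_corners=not is_third_party,
        # 0.300 is listed for all formats except Third Party.
        adjustment_threshold=None if is_third_party else 0.300,
    )


if __name__ == "__main__":
    print(cgh_grid_defaults(is_30_micron=True, is_third_party=False))
    print(cgh_grid_defaults(is_30_micron=False, is_third_party=True))
```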
This enhancement allows for more
accurate placement of the center of
each spot by increasing the area
around the expected spot center in
which the algorithm looks for pixels
in the image that are attributable to
that spot. If the increased search
area captures pixels from
neighboring spots, then the
algorithm does not attribute those
pixels to the spot.
Minimum Population
    10
IQRatio
    1.42
Background IQRatio
    1.42
Use Qtest for Small Populations?
    True
Report Population Outliers as Failed
in MAGEML file
change depending on the scanner
used for the image. See the
following for differences.
False
Note: Results obtained with
protocols that use enhanced spot
finding may vary slightly from
results obtained without enhanced spot
finding (e.g., fewer non-uniform
features). Use appropriate
validation processes when
switching to CGH protocols that
use enhanced spot finding.
False
Automatically Determine
Agilent scanner
Automatically Compute OL Polynomial Terms
    True
    Hidden if Array Format is set to Automatically Determine.
Feature – (%CV)^2
    0.04000
Red Poissonian Noise Term Multiplier
    5
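
The table above lists Minimum Population (10) and IQRatio (1.42) among the outlier settings without saying how they are applied; that is covered in the algorithms chapter. As a rough illustration only, the sketch below assumes a box-plot style rule: a value is rejected when it lies more than IQRatio times the interquartile range outside the first or third quartile, and the rule is used only when the population has at least Minimum Population members. The function name and the quartile method are assumptions, and the Qtest used for small populations is not implemented here.

```python
import statistics
from typing import List, Sequence


def population_outliers(
    values: Sequence[float],
    iq_ratio: float = 1.42,       # default listed in the table
    min_population: int = 10,     # default listed in the table
) -> List[float]:
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    if len(values) < min_population:
        # Small populations would be handled by a different test (not sketched).
        return []
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low = q1 - iq_ratio * iqr
    high = q3 + iq_ratio * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    pixels = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 100]
    print(population_outliers(pixels))
```

The same function could be called with the Background IQRatio value for background populations, since the table lists the same default (1.42) for both.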
ChIP_1200_Jun14
Table 3 Default settings for ChIP_1200_Jun14 protocol

Place Grid
  Array Format
    Automatically Determine
    [Recognized formats: Single Density (11k, 22k), 25k, Double Density (44k), 95k, 185k, 185k 10 uM, 65-micron feature size (also with 10-micron scans), 30-micron feature size (single pack and multi pack) and Third Party]
    For any format automatically determined or selected by you, the software uses the default Placement Method.
    Parameters that apply to specific formats appear only if that format is selected.
  Placement Method
    Allow Some Distortion (All formats)
    Hidden if Array Format is set to Automatically Determine.
  Enable Background Peak Shifting
    Set to False for all arrays except 30 microns (single pack and multi pack), for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use central part of pack for slope and skew calculation?
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use the correlation method to obtain origin X of subgrids
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use Enhanced Gridding
    False
    An enhanced automatic gridding algorithm was released in Feature Extraction 12.1 for use in CGH protocols only. Agilent has not validated the new algorithm in ChIP protocols.
Optimize Grid Fit
  Grid Format
    Automatically Determine
    [Recognized formats: 65-micron feature size, 30-micron feature size, and Third Party]
    The parameters and values for optimizing the grid differ depending on the format.
  Iteratively Adjust Corners?
    True (All Formats, except Third Party)
    False (Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Adjustment Threshold
    0.300 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Maximum Number of Iterations
    5 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Found Spot Threshold
    0.200 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Number of Corner Feature Side Dimension?
    20 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
Find Spots
  Spot Format
    Automatically Determine
    [Recognized formats: same as those listed above except 244k 10uM replaces 65-micron feature size 10-micron scans]
    Depending on the format selected by the software or by you, the default settings for this step change. See the following rows for the default values for finding spots.
Table 3 Default settings for ChIP_1200_Jun14 protocol (continued)

  Recognized formats: 60 micron and 30 micron feature size, third party
  PValue for Differential Expression
    0.010000
  Percentile Value
    75.00
Generate Results
  Type of QC Report
    CGH_ChIP
  Generate Single Text File
    True
  JPEG Down Sample Factor
    4
GE1_1200_Jun14
This protocol is a 1-color gene expression protocol for use
with the One-Color Microarray-Based Gene Expression Analysis
(Quick Amp Labeling) (lab protocol v5.7 or higher,
publication number G4140-90040 or G4140-90041 for Tecan
HS Pro Hybridization).
Table 4 Default settings for GE1_1200_Jun14 protocol

Place Grid
  Array Format
    Automatically Determine
    [Recognized formats: Single Density (11k, 22k), 25k, Double Density (44k), 95k, 185k, 185k 10 uM, 65-micron feature size (also with 10-micron scans), 30-micron feature size (single pack and multi pack) and Third Party]
    For any format automatically determined or selected by you, the software uses the default Placement Method.
    Parameters that apply to specific formats appear only if that format is selected.
  Placement Method
    Allow Some Distortion (All formats)
    Hidden if Array Format is set to Automatically Determine.
  Enable Background Peak Shifting
    Set to False for all arrays except 30 microns (single pack and multi pack), for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use central part of pack for slope and skew calculation?
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use the correlation method to obtain origin X of subgrids
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use Enhanced Gridding
    False
    An enhanced automatic gridding algorithm was released in Feature Extraction 12.1 for use in CGH protocols only. Agilent has not validated the new algorithm in GE1 protocols.
Optimize Grid Fit
  Grid Format
    Automatically Determine
    [Recognized formats: 65-micron feature size, 30-micron feature size, and Third Party]
    The parameters and values for optimizing the grid differ depending on the format.
  Iteratively Adjust Corners?
    True (All Formats, except Third Party)
    False (Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Adjustment Threshold
    0.300 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Maximum Number of Iterations
    5 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Found Spot Threshold
    0.200 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Number of Corner Feature Side Dimension?
    20 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
Find Spots
  Spot Format
    Automatically Determine
    [Recognized formats: same as those listed above except 244k 10uM replaces 65-micron feature size 10-micron scans]
    Depending on the format selected by the software or by you, the default settings for this step change. See the following rows for the default values for finding spots.
GE2_1200_Dec17
Table 5 Default settings for GE2_1200_Dec17 protocol

Place Grid
  Array Format
    Automatically Determine
    [Recognized formats: Single Density (11k, 22k), 25k, Double Density (44k), 95k, 185k, 185k 10 uM, 65-micron feature size (also with 10-micron scans), 30-micron feature size (single pack and multi pack) and Third Party]
    For any format automatically determined or selected by you, the software uses the default Placement Method.
    Parameters that apply to specific formats appear only if that format is selected.
  Placement Method
    Allow Some Distortion (All formats)
    Hidden if Array Format is set to Automatically Determine.
  Enable Background Peak Shifting
    Set to False for all arrays except 30 microns (single pack and multi pack), for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use central part of pack for slope and skew calculation?
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use the correlation method to obtain origin X of subgrids
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use Enhanced Gridding
    False
    An enhanced automatic gridding algorithm was released in Feature Extraction 12.1 for use in CGH protocols only. Agilent has not validated the new algorithm in GE2 protocols.
Optimize Grid Fit
  Grid Format
    Automatically Determine
    [Recognized formats: 65-micron feature size, 30-micron feature size, and Third Party]
    The parameters and values for optimizing the grid differ depending on the format.
  Iteratively Adjust Corners?
    True (All Formats, except Third Party)
    False (Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Adjustment Threshold
    0.300 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Maximum Number of Iterations
    5 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Found Spot Threshold
    0.200 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Number of Corner Feature Side Dimension?
    20 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
Find Spots
  Spot Format
    Automatically Determine
    [Recognized formats: same as those listed above except 244k 10uM replaces 65-micron feature size 10-micron scans]
    Depending on the format selected by the software or by you, the default settings for this step change. See the following rows for the default values for finding spots.
GE2-NonAT_1100_Jul11
Table 6 Default settings for GE2-NonAT_1100_Jul11 protocol

Place Grid
  Array Format
    Automatically Determine
    [Recognized formats: Single Density (11k, 22k), 25k, Double Density (44k), 95k, 185k, 185k 10 uM, 65-micron feature size (also with 10-micron scans), 30-micron feature size (single pack and multi pack) and Third Party]
    For any format automatically determined or selected by you, the software uses the default Placement Method.
    Parameters that apply to specific formats appear only if that format is selected.
  Placement Method
    Allow Some Distortion
    Hidden if Array Format is set to Automatically Determine.
  Enable Background Peak Shifting
    Set to False for all arrays except 30 microns (single pack and multi pack), for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use central part of pack for slope and skew calculation?
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use the correlation method to obtain origin X of subgrids
    Set to False for all arrays except 30 microns single pack and multi pack, for which it is set to True.
    Hidden if Array Format is set to Automatically Determine.
  Use Enhanced Gridding
    False
    An enhanced automatic gridding algorithm was released in Feature Extraction 12.1 for use in CGH protocols only. Agilent has not validated the new algorithm in GE2 protocols.
Optimize Grid Fit
  Grid Format
    Automatically Determine
    [Recognized formats: 65-micron feature size, 30-micron feature size, and Third Party]
    The parameters and values for optimizing the grid differ depending on the format.
  Iteratively Adjust Corners?
    True (All Formats, except Third Party)
    False (Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Adjustment Threshold
    0.300 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Maximum Number of Iterations
    5 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Found Spot Threshold
    0.200 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
  Number of Corner Feature Side Dimension?
    20 (All Formats, except Third Party)
    Hidden if Array Format is set to Automatically Determine.
Find Spots
  Spot Format
    Third Party
  Use the Nominal Diameter from the Grid Template
    True
  Spot Deviation Limit
    1.50
Table 6 Default settings for GE2-NonAT_1100_Jul11 protocol (continued)

  Signal Characteristics
    OnlyPositiveAndSignificantSignals
  Normalization Correction Method
    Lowess Only
  Max Number Ranked Probes
    8000
Compute Ratios
  Peg Log Ratio Value
    4.00
Calculate Metrics
  Spikein Target Used
    False
  Min Population for Replicate Stats?
    5
  PValue for Differential Expression
    0.010000
  Percentile Value
    75.00
Generate Results
  Generate Single Text File
    True
  JPEG Down Sample Factor
    4
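
The Compute Ratios step above lists a Peg Log Ratio Value of 4.00. This section does not describe how the value is used, so the following is only a sketch under a stated assumption: the log ratio is taken as log10(red / green) and is limited ("pegged") to the range from -4.00 to +4.00. The function name and the rejection of non-positive signals are illustrative choices, not documented behavior.

```python
import math


def pegged_log_ratio(red: float, green: float, peg: float = 4.00) -> float:
    """Return log10(red / green), clamped to [-peg, +peg] (assumed rule)."""
    if red <= 0 or green <= 0:
        # How the software treats non-positive signals is not covered here.
        raise ValueError("signals must be positive for a log ratio")
    log_ratio = math.log10(red / green)
    return max(-peg, min(peg, log_ratio))


if __name__ == "__main__":
    print(pegged_log_ratio(100.0, 10.0))  # within range, returned unchanged
    print(pegged_log_ratio(1e6, 1.0))     # beyond +peg, clamped
    print(pegged_log_ratio(1.0, 1e6))     # beyond -peg, clamped
```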
miRNA_1200_Jun14
This protocol is a miRNA protocol for use with miRNA
Microarray System with miRNA Complete Labeling and
Hyb Kit (lab protocol v2.0 or higher, publication number
G4170-90011).
Table 7Default settings for miRNA_1200_Jun14 protocol
Place GridArray FormatFor any format automatically
determined or selected by you, the
software uses the default
Placement Method.
Parameters that apply only to
specific formats appear only if that
format is selected.
Placement MethodHidden if Array Format is set to
Enable Background Peak ShiftingHidden if Array Format is set to
Use central part of pack for slope
and skew calculation?
Use the correlation method to
obtain origin X of subgrids
Automatically Determine
[Recognized formats: Single
Density (11k, 22k), 25k, Double
Density (44k), 95k, 185k, 185k 10
uM, 65-micron feature size (also
with 10-micron scans), 30-micron
feature size (single pack and multi
pack) and Third Party]
Automatically Determine.
Allow Some Distortion (All formats)
Automatically Determine.
Set to false for all arrays except 30
microns (single pack and multi
pack), for which it is set to true.
Hidden if Array Format is set to
Automatically Determine.
Set to False for all arrays except 30
microns single pack and multi pack,
for which it is set to True.
Hidden if Array Format is set to
Automatically Determine.
Set to False for all arrays except 30
microns single pack and multi pack,
for which it is set to True.
Feature Extraction Reference Guide49
1Default Protocol Settings
miRNA_1200_Jun14
Table 7Default settings for miRNA_1200_Jun14 protocol (continued)
Optimize Grid FitGrid FormatThe parameters and values for
An enhanced automatic gridding
algorithm was released in Feature
Extraction 12.1 for use in CGH
protocols only. Agilent has not
validated the new algorithm in
miRNA protocols.
optimizing the grid differ depending
on the format.
Iteratively Adjust Corners?Hidden if Array Format is set to
Adjustment ThresholdHidden if Array Format is set to
Maximum Number of IterationsHidden if Array Format is set to
False
Automatically Determine
[Recognized formats: 65-micron
feature size, 30-micron feature size,
and Third Party]
Automatically Determine.
True (All Formats, except Third
Party)
False (Third Party)
Automatically Determine.
0.300 (All Formats, except Third
Party)
Automatically Determine.
5 (All Formats, except Third Party)
Found Spot ThresholdHidden if Array Format is set to
Automatically Determine.
0.200 (All Formats, except Third
Party)
Number of Corner Feature Side
Dimension?
Find SpotsSpot FormatDepending on the format selected
by the software or by you, the
default settings for this step
change. See the following rows for
the default values for finding spots.
Hidden if Array Format is set to
Automatically Determine.
20 (All Formats, except Third Party)
Automatically Determine
[Recognized formats: same as
those listed above except 244k
10uM replaces 65-micron feature
size 10-micron scans]
Table 7. Default settings for miRNA_1200_Jun14 protocol (continued)
Min Population for Replicate Statistics: 5 (3 for CGH and ChIP)
Grid Test Format: Automatically Determine (Not applicable for GE2-NonAT)
PValue for Differential Expression: 0.010000 (All)
Percentile Value: 75.00 (All)
Generate Results
  Type of QC Report: Gene Expression for GE1 or GE2,
    Streamlined CGH for CGH, CGH_ChIP for ChIP, miRNA for miRNA
  Generate Single Text File: True (All)
  JPEG Down Sample Factor: 4 (All)
Agilent Feature Extraction 12.2
Reference Guide
2
QC Report Results
QC Reports 68
QC Report Headers 87
Feature Statistics 90
Histogram of LogRatio plot 103
QC Report Results in the FEPARAMS and Stats Tables 121
QC Metric Set Results 122
QC reports include statistical results to help you evaluate
the reproducibility and reliability of your single microarray
data. This chapter describes five types of QC report:
2-color Gene Expression, 1-color Gene Expression,
Streamlined CGH, CGH_ChIP, and microRNA (miRNA). It also
explains how each can help you interpret the performance of
your microarray system. Use plots and statistics from the report
to:
• Set up your own run charts of statistical values versus
time or experiment number to track performance of one
microarray compared to other microarrays
• Monitor upstream lab protocols, such as performance of
your hybridization/washing steps
• Monitor the effect of changing Feature Extraction protocol
parameters on the performance of your data analysis
If you incorporate a set of QC metrics in your extraction,
those results appear on the final page of the QC report as
an Evaluation Table.
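The run-chart idea in the list above can be sketched in a few lines. This is a minimal illustration, not part of Feature Extraction: the function names, the sample values, and the choice of limits at three sample standard deviations around the mean are all assumptions made for the example.

```python
from statistics import mean, stdev

def control_limits(history, n_sigma=3.0):
    """Return (center, lower, upper) limits computed from past metric values."""
    center = mean(history)
    spread = stdev(history)  # sample standard deviation
    return center, center - n_sigma * spread, center + n_sigma * spread

def flag_out_of_control(history, new_values, n_sigma=3.0):
    """List (index, value) pairs of new values that fall outside the limits."""
    _, lower, upper = control_limits(history, n_sigma)
    return [(i, v) for i, v in enumerate(new_values) if v < lower or v > upper]

# Hypothetical values of one QC statistic from earlier extractions.
baseline = [10.0, 12.0, 11.0, 9.0, 13.0]
# Values from later microarrays, in extraction order.
print(flag_out_of_control(baseline, [11.5, 20.0, 10.2]))
```

Plotting the same values against experiment number, with the three limit lines drawn in, gives the run chart itself.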
QC Reports
This section contains example QC Reports and points out
the different sections that appear on the reports.
NOTE
The reports in this section are examples. The actual contents of the
reports vary, depending on the protocol settings and QC metric set used.
2-color Gene Expression QC Report
This module shows you the organization of the 2-color gene
expression QC report. See the following figure and the
figures on the next pages for links to information on the QC
Report regions.
Figure 1. 2-color Gene Expression QC Report with Spike-ins (p1)
1 “QC Report Headers” on page 87
2 “Spot finding of Four Corners” on page 90
3 “Outlier Stats” on page 91
4 “Spatial Distribution of All Outliers” on page 91
5 “Net Signal Statistics” on page 93
6 “Plot of Background-Corrected Signals” on page 95
Figure 2. 2-color Gene Expression QC Report with Spike-ins (p2)
7 “Negative Control Stats” on page 94
8 “Spatial Distribution of Significantly Up-Regulated and Down-Regulated Features (Positive and Negative Log Ratios)” on page 100
9 “Local Background Inliers” on page 97
10 “Foreground Surface Fit” on page 97
11 “Plot of LogRatio vs. Log ProcessedSignal” on page 101
12 “Reproducibility Statistics (%CV Replicated Probes)” on page 104
13 “Microarray Uniformity (2-color only)” on page 106
14 “Sensitivity” on page 107
15 “Reproducibility plot for 2-color gene expression (spike-in probes)”
Figure 3. 2-color Gene Expression QC Report with Spike-ins (p3)
16 “2-color gene expression spike-in signal statistics” on page 111
17 “Spike-in Linearity Check for 2-color Gene Expression” on page 113
18 “QC Metric Set Results” on page 122
1-color Gene Expression QC Report
This module shows you the organization of the 1-color gene
expression QC report. See the following figure and the
figures on the next pages for links to information on each of
the QC Report regions.
Figure 4. 1-color Gene Expression QC Report with Spike-ins (p1)
1 “QC Report Headers” on page 87
2 “Spot finding of Four Corners” on page 90
3 “Outlier Stats” on page 91
4 “Spatial Distribution of All Outliers” on page 91
5 “Net Signal Statistics” on page 93
6 “Histogram of Signals Plot (1-color GE or CGH)” on page 96
Figure 5. 1-color Gene Expression QC Report with Spike-ins (p2)
7 “Negative Control Stats” on page 94
8 “Local Background Inliers” on page 97
9 “Foreground Surface Fit” on page 97
10 “Multiplicative Surface Fit” on page 99
11 “Reproducibility Statistics (%CV Replicated Probes)” on page 104
12 “1-color gene expression spike-in signal statistics” on page 112
13 “Spatial Distribution of Median Signals for each Row and Column” on page 102
Figure 6. 1-color Gene Expression QC Report with Spike-ins (p3)
14 “Reproducibility plot for 1-color gene expression (spike-in probes)” on page 109
15 “Spike-in Linearity Check for 1-color Gene Expression” on page 114
16 “QC Metric Set Results” on page 122
17 “Table of Values for Concentration-Response Plot (1-color only)” on page 115
Streamlined CGH QC Report
The streamlined CGH QC report provides QC metrics that
are relevant to the CGH application. All log plots use log base 2
(not 10).
Figure 7. Streamlined CGH QC Report (p1)
1 “QC Report Headers” on page 87
2 “Spot finding of Four Corners” on page 90
3 “Spatial Distribution of All Outliers” on page 91
4 “QC reports with metric sets added” on page 83
5 “Histogram of Signals Plot (1-color GE or CGH)” on page 96
6 “Outlier Stats” on page 91
Figure 8. Streamlined CGH QC Report (p2)
7 “Spatial Distribution of Significantly Up-Regulated and Down-Regulated Features (Positive and Negative Log Ratios)” on page 100
8 “Plot of Background-Corrected Signals” on page 95
CGH_ChIP QC Report
This report contains the same information as the 2-color
Gene Expression report, except that it omits the Array Uniformity
table and the spike-ins and adds a Histogram of LogRatio plot.
All log plots use log base 2 (not 10).
Figure 9. CGH_ChIP QC Report (p1)
1 “QC Report Headers” on page 87
2 “Spot finding of Four Corners” on page 90
3 “Outlier Stats” on page 91
4 “Spatial Distribution of All Outliers” on page 91
5 “Net Signal Statistics” on page 93
6 “Negative Control Stats” on page 94
7 “Plot of Background-Corrected Signals” on page 95
Figure 10. CGH_ChIP QC Report (p2)
8 “Local Background Inliers” on page 97
9 “Foreground Surface Fit” on page 97
10 “Reproducibility Statistics (%CV Replicated Probes)” on page 104
11 “Spatial Distribution of Significantly Up-Regulated and Down-Regulated Features (Positive and Negative Log Ratios)” on page 100
12 “QC reports with metric sets added” on page 83
13 “Plot of LogRatio vs. Log ProcessedSignal” on page 101
14 “Histogram of LogRatio plot” on page 103
MicroRNA (miRNA) QC Report
This module shows you the organization of the 1-color
miRNA QC report. See the following figure and the figures
on the next pages for links to information on each of the QC
Report regions.
Agilent miRNA microarrays are currently in development. Check
the Agilent website for the latest information.
Figure 11. MicroRNA (miRNA) QC Report (p1)
1 “QC Report Headers” on page 87
2 “Spot finding of Four Corners” on page 90
3 “Outlier Stats” on page 91
4 “Spatial Distribution of All Outliers” on page 91
5 “Net Signal Statistics” on page 93
6 “Negative Control Stats” on page 94
7 “Histogram of Signals Plot (1-color GE or CGH)” on page 96
Figure 12. MicroRNA (miRNA) QC Report (p2)
8 “Foreground Surface Fit” on page 97
9 “Reproducibility Statistics (%CV Replicated Probes)” on page 104
10 “Reproducibility plot for miRNA (non-control probes)” on page 110
11 “QC reports with metric sets added” on page 83
12 “Spatial Distribution of Median Signals for each Row and Column” on page 102
Non-Agilent GE2 QC Report
This report lists all of the same information as the 2-color
gene expression QC report but with no spike-ins.
Figure 13. Non-Agilent GE2 QC Report (p1)
1 “QC Report Headers” on page 87
2 “Spot finding of Four Corners” on page 90
3 “Outlier Stats” on page 91
4 “Spatial Distribution of All Outliers” on page 91
5 “Net Signal Statistics” on page 93
6 “Negative Control Stats” on page 94
7 “Plot of Background-Corrected Signals” on page 95
Figure 14. Non-Agilent GE2 QC Report (p2)
8 “Local Background Inliers” on page 97
9 “Foreground Surface Fit” on page 97
10 “Reproducibility Statistics (%CV Replicated Probes)” on page 104
11 “Microarray Uniformity (2-color only)” on page 106
12 “Spatial Distribution of Significantly Up-Regulated and Down-Regulated Features (Positive and Negative Log Ratios)” on page 100
13 “Plot of LogRatio vs. Log ProcessedSignal” on page 101
QC reports with metric sets added
When metric sets are associated with protocols, QC reports
are generated with an additional set of evaluation metrics.
Depending on the microarray type, some QC metric sets
come with thresholds (denoted by QCMT) and some come
without thresholds (denoted by QCM).
If thresholds are included in the metric set, the evaluation
tables in the QC report show which metrics are within
their threshold ranges and which have exceeded them.
Agilent has determined which of the FE Stats are good
metrics for following the processing of Agilent arrays. Most of
the metrics chosen are useful for determining whether there are
problems in the various laboratory steps (labeling,
hybridization, wash, and scan). The new “IsGoodGrid”
metric tracks the automatic grid-finding of Feature
Extraction. By examining data from many Agilent arrays
processed with Agilent wet-lab protocols, Agilent has found
thresholds that indicate whether the data is in the expected
range (“Good”) or out of the expected range (“Evaluate”).
For some applications (CGH, miRNA), an extra threshold
level, “Excellent”, is provided. More data has been screened
for these applications, which allows the metric thresholds to
be set to tighter limits that indicate excellent processing.
For applications that do not have a full set of thresholds
(for example, ChIP) or that have no “Excellent” thresholds
(for example, GE1 and GE2), data graded “Good” is suitable
for use. Excellent thresholds for those applications may
be provided in the future.
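The grading described above can be illustrated with a short sketch. It assumes each threshold is an inclusive (low, high) range and that the “Excellent” range, when present, lies inside the “Good” range. The function name and the numeric thresholds are hypothetical and are not taken from any Agilent metric set.

```python
def grade_metric(value, good_range, excellent_range=None):
    """Grade one metric value against inclusive (low, high) ranges."""
    def inside(rng):
        low, high = rng
        return low <= value <= high

    # The tighter "Excellent" range exists only for some applications.
    if excellent_range is not None and inside(excellent_range):
        return "Excellent"
    if inside(good_range):
        return "Good"
    return "Evaluate"

# Hypothetical thresholds, for illustration only.
print(grade_metric(0.15, good_range=(0.0, 0.30), excellent_range=(0.0, 0.20)))
print(grade_metric(0.25, good_range=(0.0, 0.30), excellent_range=(0.0, 0.20)))
print(grade_metric(0.45, good_range=(0.0, 0.30)))
```

A metric set without thresholds (QCM) would simply report the value with no grade.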
QC metric set results: default protocol settings
Figure 15 is an example of part of a QC report (the header
and the Evaluation Metrics table) generated from a 2-color
gene expression extraction to which the GE2 metric set with
thresholds had been added. In this extraction, the default
protocol settings were used. Note that all values for the
metrics are within the default threshold ranges.
Figure 15. Partial QC Report: Header and Evaluation Metrics with GE2
metric set with thresholds added, default protocol settings
QC metric set results: Spatial and Multiplicative Detrending Off
Figure 16 is an example of a QC report header and
Evaluation Metrics table generated from a 2-color gene
expression extraction to which the GE2 metric set with
thresholds had been added. In this extraction, spatial and
multiplicative detrending were turned off. Note that not all
values of the metrics are within the default thresholds.
Figure 16. QC Report Header and Evaluation Metrics with GE2 metric
set with thresholds added, detrending turned off
QC metric set results: miRNA spike-in analysis
Figure 17 is an example of a QC report header and
Evaluation Metrics table generated from a 1-color extraction
to which the miRNA metric set with thresholds had been added. In
this extraction, the default protocol settings were used. Note
that not all values of the metrics are within the default
thresholds. For details on how the miRNA spike-in statistics
and metrics are calculated, see “MicroRNA Analysis” on
page 285.
Figure 17. QC Report Header and Evaluation Metrics with miRNA metric
set with thresholds added, default protocol settings
QC Report Headers
2-color Gene Expression QC Report
The following Feature Extraction information is found in the
2-color gene expression QC Report header:
Date: Date and time that the QC Report was generated
Image: Name of the TIFF file that was extracted
Protocol: Name of the protocol used for the extraction
User Name: Name of the user who set up the extraction
Grid: Name of the grid template or grid file used
FE Version: Version of the Feature Extraction software used
Sample (red/green): Names of Cy5- and Cy3-labeled samples
DyeNorm List: Name of the dye normalization list
No of Probes in DyeNorm List: Number of probes in the designated
  dye normalization probe list
BG Method: Type of background subtraction method used
Background Detrend: Whether Spatial Detrend was turned on or off
  during the extraction
Multiplicative Detrend: Whether Multiplicative Detrend was turned
  on or off during the extraction
Dye Norm: Type of dye normalization method used
Linear DyeNorm Factor: Global dye normalization factor determined
  for the linear portion of the correction method
Additive Error: Additive portion of the error estimated in the
  Universal or Most Conservative error model (if
  AutoEstimateAddError was selected), or the values entered into
  the protocol (if AutoEstimateAddError was not selected). Note
  that the additive error that appears in the QC report header is
  the Additive Error value selected in the protocol multiplied by
  the linear dye norm factor.
Saturation Value: The signal intensity value above which the
  signal is considered saturated. This value appears only if it
  exceeds about 65,500. If it appears, the QC report is from an
  XDR image file.
1-color Gene Expression QC Report
This report lists all of the same header information as the
2-color gene expression report, except for Dye Norm and
Linear DyeNorm Factor, which are removed.
Streamlined CGH QC Report
The streamlined CGH QC report contains the same header
information as the 2-color gene expression QC report, except
for Linear DyeNorm Factor and Additive Error, which are
removed. Also, the information from the two fields “BG
Method” and “Background Detrend” has been collapsed
into the one field “BG Method”.
CGH_ChIP QC Report
All header information that appears in the 2-color gene
expression QC report is included in the CGH_ChIP report.
This report lists one additional metric, Derivative of Log Ratio Spread, in the header information.
Derivative of Log Ratio Spread: Measures the standard deviation
of the probe-to-probe difference of the log ratios. This metric
is used in CGH experiments where differences in the log ratios
are small on average. A smaller standard deviation here indicates
less noise in the biological signals.
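As a rough illustration of the description above, the spread of probe-to-probe log ratio differences can be sketched as follows. This is a literal reading of the one-sentence definition and not Agilent's implementation. It assumes the log ratios are already ordered by probe position and uses a plain sample standard deviation; the actual calculation may order, scale, or robustly estimate the spread differently. The function name and data are hypothetical.

```python
from statistics import stdev

def log_ratio_difference_spread(log_ratios):
    """Standard deviation of differences between neighboring log ratios.

    `log_ratios` is assumed to be ordered by probe position.
    """
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    return stdev(diffs)

# Hypothetical log ratios from a quiet and a noisy microarray.
quiet = [0.00, 0.02, -0.01, 0.01, 0.00, 0.03]
noisy = [0.00, 0.40, -0.35, 0.30, -0.20, 0.45]
print(log_ratio_difference_spread(quiet) < log_ratio_difference_spread(noisy))
```

Because the metric works on differences between neighbors, a genuine shift in log ratio over a long region changes only the few differences at the region's edges, so the value mostly reflects noise.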
MicroRNA (miRNA) QC Report
This header lists the same information as the 1-color gene
expression QC Report header. If the XDR function is turned
on, it also lists Saturation Values exceeding 65,500. Because
the dynamic range of the spot intensities on a miRNA
microarray may exceed that of a normal scan range, the miRNA
analysis on some microarrays can benefit from having the XDR
function turned on.
Non-Agilent 2-color gene expression QC Report
This header lists the same information as the 2-color gene
expression QC report header.
Feature Statistics
This section provides an explanation for each of the feature
statistics segments of the QC report and how these feature
statistics can help you assess the performance of your
microarray system.
Spot finding of Four Corners
By looking at the features in the four corners of the
microarray, you can decide if the spot centroids have been
located properly. If their locations are off-center in one or
more corners, you may have to run the extraction again with
a new grid.
Figure 18. QC Report: Spot Finding for Four Corners
Outlier Stats
If the QC Report shows a greater than expected number of
nonuniform or population outliers, check your
hybridization/wash step. Also, check the visual results (.shp
file) to see if the spot centroids are off-center. If the grid
was not placed correctly, a new grid is required.
Figure 19. QC Report: Outlier Stats
For 1-color reports, the number of outliers is reported for
the green channel only.
Spatial Distribution of All Outliers
The QC report shows two plots of the positions of all the
outliers, both population and nonuniformity outliers, across
the microarray. One plot is for the green channel, and the
other is for the red channel. SNP probes are included.
To distinguish the background population and nonuniform
outliers from one another, look at the color coding at the
bottom of the two plots.
For the 1-color report, only the green plot is shown.
Figure 20. QC Report: Number and Spatial Distribution of Outliers
The number (and percentage) of features that are feature
nonuniformity outliers in either the green or red channel is
shown under the plot. The 1-color report shows only the
percentage of green feature nonuniformity outliers.
Also, the number (and percentage) of genes that are
nonuniformity outliers in either channel is shown under the
plot. If a gene is represented by replicate features and at
least one of those features is not an outlier, the gene is not
counted as a gene outlier.
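The gene-level rule can be sketched as a small counting routine. This is one reading of the rule, assuming a gene is counted only when every one of its replicate features is flagged; the input layout and names are hypothetical.

```python
from collections import defaultdict

def count_gene_outliers(features):
    """Count genes whose replicate features are all outliers.

    `features` is an iterable of (gene_name, is_outlier) pairs.
    """
    flags_by_gene = defaultdict(list)
    for gene, is_outlier in features:
        flags_by_gene[gene].append(is_outlier)
    # A single non-outlier replicate keeps the gene out of the count.
    return sum(1 for flags in flags_by_gene.values() if all(flags))

features = [
    ("geneA", True), ("geneA", False),  # one good replicate: not counted
    ("geneB", True), ("geneB", True),   # all replicates flagged: counted
    ("geneC", False),
]
print(count_gene_outliers(features))
```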
Net Signal Statistics
Net signal is the mean signal minus the scanner offset. Net
signal is used so that these statistics are independent of the
scanner version.
Net signal statistics are an indication of the dynamic range
of the signal on a microarray for both non-control probes
and spike-in probes (not applicable for the CGH QC report). The
QC Report uses the range from the first percentile to the
99th percentile as an indicator of dynamic range for that
microarray. NetSignal is also a column in the FeatureData
output.
For example, in Figure 21 for non-control probes, the
dynamic range of the net signal intensity for the red channel
is from 42 to 6803. Half the probes have a net signal
intensity above the median of 97, and half have a net signal
intensity below it. The median (or 50th percentile) is
the middle value of the ranked distribution of signals.
Another indicator of signal range for the microarray is the
number of features that are saturated in the scanned image
(for example, NumSat).
Figure 21. QC Report: Net Signal Statistics
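The net signal summary described above can be approximated with a few lines of code. This is a sketch under stated assumptions: the function name and input values are hypothetical, and the percentile interpolation is the one provided by Python's `statistics.quantiles`, which may differ in detail from the method Feature Extraction uses.

```python
from statistics import quantiles

def net_signal_summary(mean_signals, scanner_offset):
    """Return the 1st, 50th, and 99th percentiles of net signal."""
    # Net signal = mean signal minus the scanner offset.
    net = [s - scanner_offset for s in mean_signals]
    # n=100 yields 99 cut points; index 0 is the 1st percentile.
    cuts = quantiles(net, n=100, method="inclusive")
    return {"p1": cuts[0], "median": cuts[49], "p99": cuts[98]}

# Hypothetical mean signals for 101 features and a scanner offset of 20.
signals = [20 + k for k in range(101)]
print(net_signal_summary(signals, scanner_offset=20))
```

The distance between the `p1` and `p99` entries corresponds to the dynamic range indicator shown in the report.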
Negative Control Stats
The Negative Control Stats table includes the average and
standard deviation of the net signals (mean signal minus
scanner offset) and the background-subtracted signals for
both the red and green channels in the negative controls.
These statistics exclude saturated features and feature
nonuniformity and population outliers, and they give a rough
estimate of the background noise on the microarray. SNP
probes are not included in these statistics.
Figure 22. QC Report: Negative Control Stats
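The filtering and averaging described for the negative controls can be sketched as below. The dictionary keys and the single combined outlier flag are assumptions made for the example; they do not correspond to actual FeatureData column names.

```python
from statistics import mean, stdev

def negative_control_stats(features, scanner_offset):
    """Average and SD of net signal over usable negative-control features.

    Each feature is a dict with the hypothetical keys 'mean_signal',
    'is_negative_control', 'is_saturated', and 'is_outlier'.
    """
    net = [
        f["mean_signal"] - scanner_offset
        for f in features
        if f["is_negative_control"]
        and not f["is_saturated"]
        and not f["is_outlier"]
    ]
    return mean(net), stdev(net)

def feature(signal, neg=True, sat=False, out=False):
    """Build one hypothetical feature record."""
    return {"mean_signal": signal, "is_negative_control": neg,
            "is_saturated": sat, "is_outlier": out}

data = [feature(30), feature(32), feature(34),
        feature(65535, sat=True), feature(500, out=True),
        feature(900, neg=False)]
print(negative_control_stats(data, scanner_offset=20))
```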
Plot of Background-Corrected Signals
Figure 23 is a plot of the log of the red
background-corrected signal versus the log of the green
background-corrected signal for non-control inlier features.
The linearity or curvature of this plot can indicate the
appropriateness of background method choices. The plot
should be linear.
The intersection of the red vertical and horizontal lines
shows the location of the median signal. The numbers along
the edge of the lines represent the location of the median
signal on the plot.
The values under the plot indicate the number of
non-control features that have a background-corrected signal
less than zero. SNP probes are not included.
Figure 23. QC Report: Plot of Background-Corrected Signals
Histogram of Signals Plot (1-color GE or CGH)
The purpose of this histogram is to show the level of signal
and the shape of the signal distribution. The histogram is a
line plot of the number of points in the intensity bins vs.
the log of the processed signal. SNP probes are not included.
Figure 24. 1-color QC Report: Histogram of Signals Plot
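The binning behind such a histogram can be sketched as follows. The bin width, the use of log base 10, and the handling of non-positive signals are assumptions for illustration (the CGH reports use log base 2); the function name is hypothetical.

```python
import math
from collections import Counter

def log_signal_histogram(processed_signals, bin_width=0.5):
    """Count features per bin of log10(processed signal).

    Returns a sorted list of (bin_lower_edge, count) pairs. Non-positive
    signals are skipped because their logarithm is undefined.
    """
    counts = Counter()
    for s in processed_signals:
        if s > 0:
            counts[math.floor(math.log10(s) / bin_width)] += 1
    return [(index * bin_width, n) for index, n in sorted(counts.items())]

# Hypothetical processed signals, including values that cannot be logged.
print(log_signal_histogram([20, 30, 50, 200, 5000, 0, -3]))
```

Joining the counts with a line, with the bin edges on the horizontal axis, gives the shape of the plot in the report.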
Local Background Inliers
With these numbers, you can see the mean signal
distribution for the local background regions (BGMeanSignal)
after outliers have been removed. This information can help
you detect hybridization/wash artifacts, and local background
can be a component of noise in the low signal range. SNP
probes are included.
Figure 25. QC Report: Local Background Inliers
Foreground Surface Fit
Spatial Detrend attempts to account for low signal
background that is present on the feature “foreground” and
varies across the microarray. SNP probes are not included.
See “Step 13. Perform background spatial detrending to fit a
surface” on page 258 of this guide for more information about
these calculations.
• A high RMS_Fit number can indicate gradients in the low
signal range before detrending.
• RMS_Resid indicates residual noise after detrending.
• AvgFit indicates how much signal is in the “foreground”.
A higher AvgFit number indicates that a larger amount of
signal was detected by the detrend algorithm and
removed.
This value may include the scanner offset, unless a
background method has been used before detrending. The
value may not include higher frequency background
signals. These higher frequency background signals are
best removed by using the Local Background Method
before the detrending algorithm.
Figure 26. QC Report: Foreground Surface Fit
Multiplicative Surface Fit
This value is the root mean square (RMS) of the surface fit
for the data. RMS × 100 is roughly the average percent
deviation from “flat” on the microarray. A multiplicative
trend means that there are regions of the microarray that
are brighter or dimmer than other regions. This trend is an
effect that multiplies signals; that is, a brighter signal is
more affected in absolute signal counts than a dimmer
signal. SNP probes are not included in the calculation of
multiplicative detrending.
See “Step 16. Determine the error in the signal calculation” on
page 268 of this guide for more information about these
calculations.
This option is turned on in GE1, GE2, and CGH protocols,
turned off in the miRNA protocol, and is not available for
non-Agilent protocols.
If the signal is improved through a multiplicative surface fit,
the RMS_Fit value appears as a fraction, as in the figure
shown.
Figure 27. QC Report: Multiplicative Surface Fit
What if multiplicative detrending does not work?
If, after multiplicative detrending, the median %CV for the
Processed Signal of the non-control probes is greater than
the median %CV for the BGSub Signal, Feature Extraction turns
off multiplicative detrending.
If multiplicative detrending did not result in better data, the
QC report shows an RMS_Fit = 0.0.
If there are no stats for non-control probes, Feature
Extraction looks at the spike-in control probes. If the %CVs
for these become worse, Feature Extraction removes
detrending.
If the option “Detrend on Replicates only” is chosen and if
there are not enough replicates for non-control or spike-in
control probes, Feature Extraction turns off multiplicative
detrending.
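The %CV comparison described above can be sketched as a simple decision function. This is an illustration of the comparison only, not the Feature Extraction implementation: the names are hypothetical, each inner list holds the replicate signals of one probe, signals are assumed to have a non-zero mean, and ties are treated as "keep detrending".

```python
from statistics import mean, median, stdev

def median_percent_cv(replicate_groups):
    """Median over probes of the %CV across each probe's replicates."""
    cvs = [100.0 * stdev(g) / mean(g) for g in replicate_groups if len(g) > 1]
    return median(cvs)

def keep_multiplicative_detrend(bgsub_groups, processed_groups):
    """Keep detrending unless it raised the median %CV."""
    return median_percent_cv(processed_groups) <= median_percent_cv(bgsub_groups)

# Hypothetical replicate signals for two probes, before and after detrending.
bgsub = [[90.0, 110.0], [180.0, 220.0]]
improved = [[98.0, 102.0], [195.0, 205.0]]
worsened = [[80.0, 120.0], [150.0, 250.0]]
print(keep_multiplicative_detrend(bgsub, improved))
print(keep_multiplicative_detrend(bgsub, worsened))
```

When the function returns False, the report would show RMS_Fit = 0.0, as described above.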
Spatial Distribution of Significantly Up-Regulated and
Down-Regulated Features (Positive and Negative Log
Ratios)
This plot displays the distribution of the significantly up- and
down-regulated features (up-regulated in red, down-regulated
in green).
Figure 28. QC Report: Spatial Distribution of Up- and Down-Regulated
Features
For the CGH QC Report, this plot is referred to as “Spatial
Distribution of the Positive and Negative Log Ratios”.
If the microarray contains more than 5000 features, the
software randomly selects 5000 data points. These points
include up-regulated and down-regulated features in the same
proportion as they are found on the actual microarray.
The threshold that is used to determine significance is set in
the protocol (QCMetrics_differentialExpressionPValue).
These are the same features shown as up- or
down-regulated in Figure 29.
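Proportion-preserving random selection of this kind can be sketched as follows. The sketch assumes the features have already been split into up-regulated and down-regulated lists and that sampling is done without replacement; the function name, the `seed` parameter, and the rounding rule are illustrative choices, not details from the software.

```python
import random

def sample_for_plot(up, down, max_points=5000, seed=None):
    """Randomly pick at most `max_points` features, keeping the up/down ratio."""
    total = len(up) + len(down)
    if total <= max_points:
        return list(up), list(down)
    rng = random.Random(seed)
    # Give each group a share of the points proportional to its size.
    n_up = round(max_points * len(up) / total)
    n_down = max_points - n_up
    return rng.sample(up, n_up), rng.sample(down, n_down)

# Hypothetical feature numbers: 9000 up-regulated, 3000 down-regulated.
up_features = list(range(9000))
down_features = list(range(9000, 12000))
picked_up, picked_down = sample_for_plot(up_features, down_features, seed=1)
print(len(picked_up), len(picked_down))
```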