The software described in this document is furnished under a license agreement. The software may be used
or copied only under the terms of the license agreement. No part of this manual may be photocopied or
reproduced in any form without prior written consent from The MathWorks, Inc.
FEDERAL ACQUISITION: This provision applies to all acquisitions of the Program and Documentation
by, for, or through the federal government of the United States. By accepting delivery of the Program
or Documentation, the government hereby agrees that this software or documentation qualifies as
commercial computer software or commercial computer software documentation as such terms are used
or defined in FAR 12.212, DFARS Part 227.72, and DFARS 252.227-7014. Accordingly, the terms and
conditions of this Agreement and only those rights specified in this Agreement, shall pertain to and govern
the use, modification, reproduction, release, performance, display, and disclosure of the Program and
Documentation by the federal government (or other entity acquiring for or through the federal government)
and shall supersede any conflicting contractual terms or conditions. If this License fails to meet the
government’s needs or is inconsistent in any respect with federal procurement law, the government agrees
to return the Program and Documentation, unused, to The MathWorks, Inc.
Trademarks
MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See
www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand
names may be trademarks or registered trademarks of their respective holders.
Patents
The MathWorks products are protected by one or more U.S. patents. Please see
www.mathworks.com/patents for more information.
Contents
Summary by Version ........................................ 1
Version 3.5 (R2010a) Bioinformatics Toolbox Software ...... 4
Version 3.4 (R2009b) Bioinformatics Toolbox Software ...... 8
Version 3.3 (R2009a) Bioinformatics Toolbox Software ...... 17
Version 3.2 (R2008b) Bioinformatics Toolbox Software ...... 20
Version 3.1 (R2008a) Bioinformatics Toolbox Software ...... 27
Version 3.0 (R2007b) Bioinformatics Toolbox Software ...... 34
Version 2.6 (R2007a+) Bioinformatics Toolbox Software ..... 39
Version 2.5 (R2007a) Bioinformatics Toolbox Software ...... 42
Version 2.4 (R2006b) Bioinformatics Toolbox Software ...... 50
Version 2.3 (R2006a+) Bioinformatics Toolbox Software ..... 54
Version 2.2.1 (R2006a) Bioinformatics Toolbox Software .... 57
Version 2.2 (R14SP3+) Bioinformatics Toolbox Software ..... 58
Version 2.1.1 (R14SP3) Bioinformatics Toolbox Software .... 60
Version 2.1 (R14SP2+) Bioinformatics Toolbox Software ..... 61
Version 2.0.1 (R14SP2) Bioinformatics Toolbox Software .... 63
Version 2.0 (R14SP1+) Bioinformatics Toolbox Software ..... 64
Version 1.1.1 (R14SP1) Bioinformatics Toolbox Software .... 68
Version 1.1 (R14) Bioinformatics Toolbox Software ......... 69
Version 1.0 (R13+) Bioinformatics Toolbox Software ........ 73
Compatibility Summary for Bioinformatics Toolbox Software . 76
Bioinformatics Toolbox™ Release Notes
Summary by Version
This table provides quick access to what’s new in each version. For
clarification, see “Using Release Notes” on page 2.
Version (Release)              New Features    Version Compatibility   Fixed Bugs and                 Related Documentation
                               and Changes     Considerations          Known Problems                 at Web Site
Latest Version V3.5 (R2010a)   Yes (Details)   Yes (Summary)           Bug Reports (includes fixes)   Printable Release Notes: PDF; Current product documentation
V3.4 (R2009b)                  Yes (Details)   Yes (Summary)           Bug Reports (includes fixes)   No
V3.3 (R2009a)                  Yes (Details)   Yes (Summary)           Bug Reports (includes fixes)   No
V3.2 (R2008b)                  Yes (Details)   Yes (Summary)           Bug Reports (includes fixes)   No
V3.1 (R2008a)                  Yes (Details)   Yes (Summary)           Bug Reports (includes fixes)   No
V3.0 (R2007b)                  Yes (Details)   Yes (Summary)           Bug Reports (includes fixes)   No
V2.6 (R2007a+)                 Yes (Details)   Yes (Summary)           Bug Reports (includes fixes)   No
V2.5 (R2007a)                  Yes (Details)   Yes (Summary)           Bug Reports (includes fixes)   No
V2.4 (R2006b)                  Yes (Details)   No                      Bug Reports (includes fixes)   No
V2.3 (R2006a+)                 Yes (Details)   No                      Bug Reports (includes fixes)   No
V2.2.1 (R2006a)                No              No                      Bug Reports (includes fixes)   No
Version (Release)              New Features    Version Compatibility   Fixed Bugs and                 Related Documentation
                               and Changes     Considerations          Known Problems                 at Web Site
V2.2 (R14SP3+)                 Yes (Details)   No                      Bug Reports (includes fixes)   No
V2.1.1 (R14SP3)                No              No                      Bug Reports (includes fixes)   No
V2.1 (R14SP2+)                 Yes (Details)   No                      Bug Reports (includes fixes)   No
V2.0.1 (R14SP2)                Yes (Details)   No                      Bug Reports (includes fixes)   No
V2.0 (R14SP1+)                 Yes (Details)   No                      No bug fixes                   No
V1.1.1 (R14SP1)                No              No                      No bug fixes                   No
V1.1 (R14)                     Yes (Details)   No                      No bug fixes                   No
V1.0 (R13+)                    Yes (Details)   No                      No bug fixes                   V1.0 product documentation
Using Release Notes
Use release notes when upgrading to a newer version to learn about:
• New features
• Changes
• Potential impact on your existing files and practices
Review the release notes for other MathWorks™ products required for this
product (for example, MATLAB® or Simulink®). Determine if enhancements,
bugs, or compatibility considerations in other products impact you.
If you are upgrading from a software version other than the most recent one,
review the current release notes and all interim versions. For example, when
you upgrade from V1.0 to V1.2, review the release notes for V1.1 and V1.2.
What Is in the Release Notes
New Features and Changes
• New functionality
• Changes to existing functionality
Version Compatibility Considerations
When a new feature or change introduces a reported incompatibility between
versions, the Compatibility Considerations subsection explains the
impact.
Compatibility issues reported after the product release appear under Bug
Reports at The MathWorks™ Web site. Bug fixes can sometimes result
in incompatibilities, so review the fixed bugs in Bug Reports for any
compatibility impact.
Fixed Bugs and Known Problems
The MathWorks offers a user-searchable Bug Reports database so you can
view Bug Reports. The development team updates this database at release
time and as more information becomes available. Bug Reports include
provisions for any known workarounds or file replacements. Information is
available for bugs existing in or fixed in Release 14SP2 or later. Information
is not available for all bugs in earlier releases.
Access Bug Reports using your MathWorks Account.
Version 3.5 (R2010a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.5 (R2010a):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (details labeled as Compatibility Considerations, below; see also Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: Printable Release Notes: PDF; Current product documentation
New and updated features in this version include:
• “Data Format and Database Functions” on page 4
• “Pairwise Sequence Alignment Functions” on page 5
• “Multiple Sequence Alignment Functions” on page 5
• “Phylogenetic Tree Tools and Methods” on page 6
• “BioIndexedFile Function, Object, Methods, and Properties” on page 6
• “BioRead Function, Object, Methods, and Properties” on page 6
• “BioMap Function, Object, Methods, and Properties” on page 7
• “Function Elements Being Removed” on page 7
Data Format and Database Functions
The following functions are new:
• saminfo — Return information about Sequence Alignment/Map (SAM) file.
• samread — Read data from Sequence Alignment/Map (SAM) file.
The following functions are updated:
• fastaread — Read data from FASTA file. Updated to allow trimming of
the headers in the output structure by addition of TrimHeaders property.
• fastqread — Read data from FASTQ file. Updated to allow trimming of
the headers in the output structure by addition of TrimHeaders property.
• phytreeread — Read phylogenetic tree file. Updated to return a second
output containing bootstrap values for tree nodes.
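To make the effect of header trimming concrete, the following is a minimal Python sketch rather than toolbox code. It assumes that trimming a header means keeping only the text before the first white-space character; that reading, and the names read_fasta and trim_headers, are assumptions made for this illustration only.

```python
def read_fasta(text, trim_headers=False):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            if trim_headers:
                # Assumption: keep only the text before the first white space.
                parts = header.split(None, 1)
                header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


sample = ">seq1 Homo sapiens example\nACGT\nTTGA\n>seq2\tanother entry\nGGCC\n"
full = read_fasta(sample)
trimmed = read_fasta(sample, trim_headers=True)
```

With trimming enabled, the headers reduce to short identifiers (here seq1 and seq2), which is convenient when headers are later used as keys or labels.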
Pairwise Sequence Alignment Functions
The following function is updated:
• fastaread — Read data from FASTA file. Updated to allow trimming of
the headers in the output structure by addition of TrimHeaders property.
Multiple Sequence Alignment Functions
The following functions are updated:
• fastaread — Read data from FASTA file. Updated to allow trimming of
the headers in the output structure by addition of TrimHeaders property.
• multialign — Align multiple sequences using progressive method.
Updated to include a new property, 'UseParallel', which lets you use
parfor-loops and compute in parallel mode.
• seqpdist — Calculate pairwise distance between sequences. Updated to
include a new property, 'UseParallel', which lets you use parfor-loops
and compute in parallel mode.
Compatibility Considerations
In Bioinformatics Toolbox™ Version 3.4 and earlier, the multialign and
seqpdist functions included 'JobManager' and 'WaitInQueue' property
name/property value pairs, which let you process in parallel, including
support for the MATLAB scheduler for clusters.
In Bioinformatics Toolbox Version 3.5, the multialign and seqpdist
functions do not include the 'JobManager' and 'WaitInQueue'
property name/property value pairs. Instead they include the 'UseParallel'
property name/property value pair, which lets you process in parallel,
including support for:
• Local workers for multicore machines
• The MATLAB scheduler for clusters
• Third-party schedulers for clusters
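The idea behind a single parallel switch can be sketched in Python; this is an illustration of the concept, not the toolbox implementation. Each pairwise distance is independent of the others, so the pairs can be handed to a pool of workers. The use_parallel flag, the function names, and the choice of a simple p-distance (fraction of mismatched positions) are all assumptions of this example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def p_distance(pair):
    """Fraction of positions at which two equal-length sequences differ."""
    a, b = pair
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(seqs, use_parallel=False, workers=4):
    """Return distances for all pairs (i < j), in combinations() order."""
    pairs = list(combinations(seqs, 2))
    if use_parallel:
        # Each pair is independent, so the work can be split across workers.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(p_distance, pairs))
    return [p_distance(p) for p in pairs]


seqs = ["ACGTACGT", "ACGTACGA", "TCGTACGA"]
serial = pairwise_distances(seqs)
parallel = pairwise_distances(seqs, use_parallel=True)
```

Both calls return the same values in the same order; only the scheduling differs. A thread pool keeps the sketch short, while a process pool or a cluster scheduler would be the usual choice when the per-pair work is CPU-bound.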
Phylogenetic Tree Tools and Methods
The following functions are updated:
• phytreeread — Read phylogenetic tree file. Updated to return a second
output containing bootstrap values for tree nodes.
• seqpdist — Calculate pairwise distance between sequences. Updated to
include a new property, 'UseParallel', which lets you use parfor-loops
and compute in parallel mode.
BioIndexedFile Function, Object, Methods, and Properties
Following is a new class for an object that lets you extract information from
large multi-entry text files.
• BioIndexedFile — Allow quick and efficient access to large text file with
nonuniform-size entries.
This class has properties and methods that are useful for accessing, reading,
and parsing data from a large source file.
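One common way to get quick access to entries of differing sizes is to scan the file once, record the byte offset of each entry, and then seek directly to the entry that is needed. The Python sketch below illustrates that general technique under the assumption that entries start with a marker line (here a FASTA-style ">"); it does not describe how the BioIndexedFile class is implemented, and build_index and read_entry are hypothetical names.

```python
import io


def build_index(handle, marker=b">"):
    """Record the byte offset at which each entry (marker line) starts."""
    offsets = []
    handle.seek(0)
    while True:
        pos = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(marker):
            offsets.append(pos)
    return offsets, pos  # pos is now the end-of-file offset


def read_entry(handle, offsets, end, i):
    """Read entry i by seeking to its offset; no other entry is parsed."""
    start = offsets[i]
    stop = offsets[i + 1] if i + 1 < len(offsets) else end
    handle.seek(start)
    return handle.read(stop - start).decode("ascii")


data = io.BytesIO(b">a\nACGT\n>b long entry\nACGTACGT\nGG\n>c\nT\n")
offsets, end = build_index(data)
second = read_entry(data, offsets, end, 1)
```

The index holds one integer per entry, so it stays small even when the source file is too large to load into memory.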
BioRead Function, Object, Methods, and Properties
Following is a new class for an object that contains data from short-read
sequences, including sequence headers, nucleotide sequences, and the quality
scores for the sequences.
• BioRead — Contain sequence and quality data.
This class has properties and methods that you can use to explore, access,
filter, and manipulate all or a subset of the data, before doing subsequent
analyses or sequence alignment and mapping.
BioMap Function, Object, Methods, and Properties
Following is a new class for an object that contains data from short-read
sequences, including sequence headers, read sequences, quality scores for
the sequences, and data about alignment and mapping to a single reference
sequence.
• BioMap — Contain sequence, quality, alignment, and mapping data.
This class has properties and methods that you can use to explore, access,
filter, and manipulate all or a subset of the data, before doing subsequent
analyses or viewing the data.
Function Elements Being Removed
Function Element Name: 'JobManager' property name/property value pair as input to multialign and seqpdist functions
What Happens When You Use This Function Element: Warns
Use This Instead: 'UseParallel' property name/property value pair as input to multialign and seqpdist functions
Compatibility Considerations: See the Compatibility Considerations subheading in “Multiple Sequence Alignment Functions” on page 5.
Function Element Name: 'WaitInQueue' property name/property value pair as input to multialign and seqpdist functions
What Happens When You Use This Function Element: Warns
Use This Instead: 'UseParallel' property name/property value pair as input to multialign and seqpdist functions
Compatibility Considerations: See the Compatibility Considerations subheading in “Multiple Sequence Alignment Functions” on page 5.
Version 3.4 (R2009b) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.4 (R2009b):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (details labeled as Compatibility Considerations, below; see also Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New and updated features in this version include:
• “Data Format and Database Functions” on page 9
• “Protein Analysis Functions” on page 9
• “Data Visualization Functions” on page 10
• “Sequence Statistics Functions” on page 10
• “Sequence Utility Functions” on page 10
• “Sequence Visualization Functions” on page 11
• “Pairwise Sequence Alignment Functions” on page 11
• “Multiple Sequence Alignment Functions” on page 11
• “Phylogenetic Tree Tools and Methods” on page 12
• “Clustergram Window” on page 13
• “Clustergram Methods and Properties” on page 13
• “HeatMap Object, Methods, and Properties” on page 14
• “DataMatrix Methods” on page 15
• “Microarray Functions, Objects, Methods, and Properties” on page 15
• “Mass Spectrometry Functions” on page 16
• “Demos for Sequence Analysis” on page 16
• “Demos for Microarray Analysis” on page 16
Data Format and Database Functions
Following are new functions:
• fastainfo — Return information about FASTA file.
• fastqinfo — Return information about FASTQ file.
• fastqread — Read data from FASTQ file.
• fastqwrite — Write to file using FASTQ format.
• sffinfo — Return information about SFF file.
• sffread — Read data from SFF file.
• tgspcinfo — Return information about SPC file.
• tgspcread — Read data from SPC file.
The following functions are updated:
• affyread — Read microarray data from Affymetrix® GeneChip® file.
Updated to read cell layout files (CLF) and background probe (BGP) files.
• multialignwrite — Write multiple alignment to file. Updated to write a
file in either ClustalW ALN format (default) or MSF format.
Protein Analysis Functions
Following is a new function:
• isotopicdist — Calculate high-resolution isotope mass distribution and
density function.
The following function is updated:
• cleave — Cleave amino acid sequence with enzyme. Updated to let you
specify an exception to the enzyme’s cleavage rule and to let you specify
a maximum number of missed cleavage sites. Also updated to return the
number of missed cleavage sites per peptide fragment.
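The three notions in the cleave update (a cleavage rule, an exception to it, and a limit on missed cleavage sites) can be illustrated with a short Python sketch. It is not the toolbox algorithm. As an assumed example it uses a trypsin-like rule that cuts after K or R unless the next residue is P, and the names cleavage_sites and digest are hypothetical.

```python
def cleavage_sites(seq, residues="KR", exception_next="P"):
    """Positions after which the enzyme cuts (1-based, end of each fragment)."""
    sites = []
    for i, aa in enumerate(seq[:-1]):
        if aa in residues and seq[i + 1] not in exception_next:
            sites.append(i + 1)
    return sites


def digest(seq, max_missed=0, **rule):
    """Return (fragment, missed_sites) pairs for up to max_missed missed cuts."""
    bounds = [0] + cleavage_sites(seq, **rule) + [len(seq)]
    fragments = []
    for start in range(len(bounds) - 1):
        last = min(start + 1 + max_missed, len(bounds) - 1)
        for stop in range(start + 1, last + 1):
            missed = stop - start - 1  # internal sites left uncut
            fragments.append((seq[bounds[start]:bounds[stop]], missed))
    return fragments


peptide = "AKRPGKLR"
complete = digest(peptide)
with_missed = digest(peptide, max_missed=1)
```

In this example the R at position 3 is followed by P, so the exception suppresses that cut. Allowing one missed cleavage adds the longer fragments AKRPGK and RPGKLR, each reported with a missed-site count of 1.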
Data Visualization Functions
The following functions are updated:
• microplateplot — Display visualization of microtiter plate. Display
updated so that first row of input matrix appears at the top and is labeled
row A. Updated to return the handle to the axes of the plot, which lets
you reverse the order of the rows or columns in the display. Updated to
include a new property, 'TextFontSize', which lets you control the font
size of text labels.
• multialignviewer — Display and interactively adjust multiple sequence
alignment. Updated to accept a list of names to label the sequences in the
Multiple Sequence Alignment Viewer window.
• showalignment — Display color-coded sequence alignment. Updated
to control the inclusion or exclusion of terminal gaps from the count of
matches and similar residues when displaying a pairwise alignment.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.3, the default layout for the plot returned
by microplateplot displayed the first row of the input matrix at the bottom.
In Bioinformatics Toolbox Version 3.4, the plot displays the first row of the
input matrix at the top.
Sequence Statistics Functions
The following function is updated:
• seqshowwords — Graphically display words in sequence. Updated to
search for multiple words in a sequence.
Sequence Utility Functions
The following functions are updated:
• cleave — Cleave amino acid sequence with enzyme. Updated to let you
specify an exception to the enzyme’s cleavage rule and to let you specify
a maximum number of missed cleavage sites. Also updated to return the
number of missed cleavage sites per peptide fragment.
• rebasecuts — Find restriction enzymes that cut nucleotide sequence.
Updated to use Version 904 of REBASE®, the Restriction Enzyme Database.
• restrict — Split nucleotide sequence at restriction site. Updated to use
Version 904 of REBASE, the Restriction Enzyme Database.
Sequence Visualization Functions
The following functions are updated:
• multialignviewer — Display and interactively adjust multiple sequence
alignment. Updated to accept a list of names to label the sequences in the
Multiple Sequence Alignment Viewer window.
• showalignment — Display color-coded sequence alignment. Updated
to control the inclusion or exclusion of terminal gaps from the count of
matches and similar residues when displaying a pairwise alignment.
Phylogenetic Tree Tools and Methods
The Phylogenetic Tree Tool includes the following updates:
• Includes two new circular print renderings: equal angle and equal daylight
• Updates to Tools menu, including commands to select specific branch and
leaf nodes based on different criteria, such as distance, common ancestors,
leaves only, and descendants.
Following is a new method:
• cluster — Validate clusters in phylogenetic tree.
The following method is updated:
• plot — Draw phylogenetic tree. Updated to include two new algorithms
for circular layouts: equal angle and equal daylight. Updated to let you
rotate circular trees from 0 through 360 degrees and to rotate leaf labels
of circular trees so that the text is aligned to the root node. Updated the
'LeafLabels' property so that it defaults to true for circular layouts and
to false for square and angular layouts.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.3, the 'LeafLabels' property defaulted
to true when the 'Type' property was 'square' or 'angular', and to false
when the 'Type' property was 'radial'.
In Bioinformatics Toolbox Version 3.4, the 'LeafLabels' property defaults
to false when the 'Type' property is 'square' or 'angular', and to true
when the 'Type' property is 'radial'.
Clustergram Window
The Clustergram window has two new toolbar buttons:
• Annotate button — Shows and hides intensity values for each area
of the heat map.
• Show Dendrogram button — Shows and hides the dendrograms.
Clustergram Methods and Properties
The following are new methods of a clustergram object:
• addTitle — Add title to clustergram.
• addXLabel — Label x-axis of clustergram.
• addYLabel — Label y-axis of clustergram.
• clusterGroup — Select cluster group.
The following properties of a clustergram object are renamed:
• ColumnMarker is now ColumnGroupMarker.
• Impute is now ImputeFun.
• Ratio is now DisplayRatio.
• RowMarker is now RowGroupMarker.
• SymmetricRange is now Symmetric.
Note: The former property names are still valid.
Following is a new property related to the display of dendrogram tree
diagrams in a clustergram object:
• ShowDendrogram
The following are new properties related to the display of row and column
labels of a clustergram object:
• RowLabels
• ColumnLabels
• RowLabelsLocation
• ColumnLabelsLocation
• RowLabelsColor
• ColumnLabelsColor
• LabelsWithMarkers
• RowLabelsRotate
• ColumnLabelsRotate
The following are new properties related to annotating data in a clustergram
object:
• Annotate
• AnnotColor
• AnnotPrecision
When using clustergram properties with the get and set methods, the
property names are now case sensitive.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.3, the property names of a clustergram
object were not case sensitive when used with the get and set methods.
In Bioinformatics Toolbox Version 3.4, property names of a clustergram object
are case sensitive.
HeatMap Object, Methods, and Properties
Following is a new object:
• HeatMap object — Object containing matrix and heat map display
properties.
The following are methods of a HeatMap object:
• addTitle — Add title to heat map.
• addXLabel — Label x-axis of heat map.
• addYLabel — Label y-axis of heat map.
• plot — Render heat map for object.
• view — Render heat map for object.
A HeatMap object includes many properties that control the creation of the
heat map, row and column labels, axes labels, title, and data annotation.
DataMatrix Methods
Following is a new method of a DataMatrix object:
• dmwrite — Write DataMatrix object to text file.
Microarray Functions, Objects, Methods, and Properties
Following are new functions to create objects containing data from a
microarray gene expression experiment:
• bioma.ExpressionSet — Contain data from microarray gene expression
experiment.
• bioma.data.ExptData — Contain expression data from microarray gene
expression experiment.
• bioma.data.MetaData — Contain sample or feature metadata from
microarray gene expression experiment.
• bioma.data.MIAME — Contain experiment information from microarray
gene expression experiment.
These objects have properties and methods that are useful for viewing and
analyzing the data or a subset of the data.
Mass Spectrometry Functions
Following are new functions:
• isotopicdist — Calculate high-resolution isotope mass distribution and
density function.
• tgspcinfo — Return information about SPC file.
• tgspcread — Read data from SPC file.
The following function is updated:
• mspeaks — Convert raw peak data to peak list (centroided data). Updated
to include a new property, 'Style', which lets you specify the style for
marking the peaks in the plot.
Demos for Sequence Analysis
Following are two new sequence analysis demos:
• Working with SFF Files from the 454 Genome Sequencer FLX System
• Working with Illumina/Solexa Next-Generation Sequencing Data
Demos for Microarray Analysis
Following are two new microarray analysis demos:
• Working with Objects for Microarray Experiment Data
• Analyzing Illumina Bead Summary Gene Expression Data
Version 3.3 (R2009a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.3 (R2009a):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (details labeled as Compatibility Considerations, below; see also Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New and updated features in this version include:
• “Data Visualization Functions” on page 17
• “Sequence Utility Functions” on page 17
• “Sequence Conversion Functions” on page 18
• “Bioanalytic and Mass Spectrometry Functions” on page 18
• “Microarray Functions” on page 18
• “Demo for Sequence Analysis” on page 19
Data Visualization Functions
Following is a new function:
• microplateplot — Display visualization of microtiter plate.
Sequence Utility Functions
The following functions are updated:
• rebasecuts — Find restriction enzymes that cut nucleotide sequence.
Updated to use Version 811 of REBASE, the Restriction Enzyme Database.
• restrict — Split nucleotide sequence at restriction site. Updated to use
Version 811 of REBASE, the Restriction Enzyme Database.
Sequence Conversion Functions
The following function is updated:
• nt2aa — Convert nucleotide sequence to amino acid sequence. Updated to
include a new property, 'ACGTOnly', to support ambiguous and unknown
nucleotide characters.
Bioanalytic and Mass Spectrometry Functions
The following functions are updated for use with data from any separation
technique, including mass spectrometry:
• msalign — Align peaks in signal to reference peaks.
• msbackadj — Correct baseline of signal with peaks.
• mslowess — Smooth signal with peaks using nonparametric method.
• msnorm — Normalize set of signals with peaks.
• mspeaks — Convert raw peak data to peak list (centroided data).
• msppresample — Resample signal with peaks while preserving peaks.
• msresample — Resample signal with peaks.
• mssgolay — Smooth signal with peaks using least-squares polynomial.
Microarray Functions
The following functions are updated:
• cghcbs — Perform circular binary segmentation (CBS) on array-based
comparative genomic hybridization (aCGH) data. Updated to include an
optional heuristic stopping rule to improve performance.
• ilmnbslookup — Look up Illumina® BeadStudio™ target (probe) sequence
and annotation information. Updated to read Illumina microRNA array
annotation files.
• ilmnbsread — Read gene expression data exported from Illumina
BeadStudio software. Updated to read Illumina microRNA array data files.
• mattest — Perform two-sample t-test to evaluate differential expression
of genes from two experimental conditions or phenotypes. Updated with
new property, 'VarType', which lets you specify equal or unequal (default)
variance for the test.
Compatibility Considerations
A compatibility consideration related to the mattest function was introduced
in Bioinformatics Toolbox Version 3.2, but not reported in the Release Notes
for Version 3.2 (R2008b). Specifically, in Bioinformatics Toolbox Version
3.1 and earlier, the mattest function used equal variance for the test. In
Bioinformatics Toolbox Version 3.2, the mattest function started using
unequal variance for the test.
Demo for Sequence Analysis
The following is a new sequence analysis demo:
Predicting Protein Secondary Structure Using a Neural Network
Version 3.2 (R2008b) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.2 (R2008b):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (details labeled as Compatibility Considerations, below; see also Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New and updated features in this version include:
• “Data Format and Database Functions” on page 20
• “Sequence Utility Functions” on page 22
• “Multiple Sequence Alignment Functions” on page 22
• “Gene Ontology Functions” on page 23
• “Protein Analysis Functions” on page 23
• “Mass Spectrometry Functions” on page 23
• “Microarray File Format Functions” on page 24
• “Microarray Functions” on page 25
• “DataMatrix Object” on page 25
• “DataMatrix Methods” on page 26
• “Demo for Visualization Tools” on page 26
• “Demo for Sequence Analysis” on page 26
• “Demos for Microarray Data Analysis” on page 26
Data Format and Database Functions
Following are new functions:
• affygcrma — Perform GC Robust Multi-array Average (GCRMA) procedure
on Affymetrix microarray probe-level data.
• affyrma — Perform Robust Multi-array Average (RMA) procedure on
Affymetrix microarray probe-level data.
• affysnpannotread — Read Affymetrix Mapping DNA array data from
CSV-formatted annotation file.
• geoseriesread — Read Gene Expression Omnibus (GEO) Series (GSE)
format data.
• multialignwrite — Write multiple alignment to file using ClustalW ALN
format.
• mzcdfread — Read mass spectrometry data from netCDF file.
The following functions are updated:
• affyread — Read microarray data from Affymetrix GeneChip file. Updated
so that Probes field in the return structure is now a single, which reduces
memory usage.
• celintensityread — Read probe intensities from Affymetrix CEL files.
Updated so that PMIntensities and MMIntensities fields in the return
structure are now singles, which reduces memory usage.
• geosoftread — Read Gene Expression Omnibus (GEO) SOFT format data.
Updated to support Platform (GPL) records.
• getgeodata — Retrieve Gene Expression Omnibus (GEO) format data.
Updated to support Platform (GPL) and Series (GSE) records.
• goannotread — Read annotations from Gene Ontology annotated file.
Updated to include two new properties, 'Fields' and 'Aspect', which let
you read a subset of the data in the annotated file.
filter and read a subset of the data. Updated with a 'Verbose' property to
control the progress display while reading the file.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.1 and earlier, the Probes field, in the
structure returned by affyread, and the PMIntensities and MMIntensities
fields, in the structure returned by celintensityread, were doubles. In
Bioinformatics Toolbox Version 3.2, these fields are singles.
Sequence Utility Functions
Following is a new function:
• cleavelookup — Find cleavage rule for enzyme or compound.
The following functions are updated:
• blastncbi — Create remote NCBI BLAST report request ID or link to
NCBI BLAST report. Updated to include a 'GapCosts' property, which lets
you specify penalties for both opening and extending gaps, and an 'Entrez'
property, which lets you limit searches using Entrez query syntax.
• cleave — Cleave amino acid sequence with enzyme. Includes a new input
argument that specifies the name of an enzyme or compound for which a
cleavage rule is specified in the literature.
• rebasecuts — Find restriction enzymes that cut nucleotide sequence.
Updated to use Version 806 of REBASE, the Restriction Enzyme Database.
• restrict — Split nucleotide sequence at restriction site. Updated to use
Version 806 of REBASE, the Restriction Enzyme Database.
• seqlogo — Display sequence logo for nucleotide or amino acid sequences.
Updated to return a figure handle to the sequence logo.
Multiple Sequence Alignment Functions
Following is a new function:
• multialignwrite — Write multiple alignment to file using ClustalW ALN
format.
faster and without running out of memory. Updated with three new
properties, … filter and read a subset of the data. Updated with a 'Verbose'
property to control the progress display while reading the file.
Microarray File Format Functions
Following are new functions:
• affygcrma — Perform GC Robust Multi-array Average (GCRMA) procedure
on Affymetrix microarray probe-level data.
• affyrma — Perform Robust Multi-array Average (RMA) procedure on
Affymetrix microarray probe-level data.
• affysnpannotread — Read Affymetrix Mapping DNA array data from
CSV-formatted annotation file.
• geoseriesread — Read Gene Expression Omnibus (GEO) Series (GSE)
format data.
The following functions are updated:
• affyread — Read microarray data from Affymetrix GeneChip file. Updated
so that Probes field in the return structure is now a single, which reduces
memory usage.
• celintensityread — Read probe intensities from Affymetrix CEL files.
Updated so that PMIntensities and MMIntensities fields in the return
structure are now singles, which reduces memory usage.
• geosoftread — Read Gene Expression Omnibus (GEO) SOFT format data.
Updated to support Platform (GPL) records.
• getgeodata — Retrieve Gene Expression Omnibus (GEO) format data.
Updated to support Platform (GPL) and Series (GSE) records.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.1 and earlier, the Probes field, in the
structure returned by affyread, and the PMIntensities and MMIntensities
fields, in the structure returned by celintensityread, were doubles. In
Bioinformatics Toolbox Version 3.2, these fields are singles.
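The memory saving from the double-to-single change can be illustrated with a small sketch. It is written in Python rather than MATLAB and uses made-up intensity values; it only shows that single-precision storage takes half the bytes of double-precision storage, and says nothing about how affyread or celintensityread are implemented.

```python
from array import array

# Hypothetical probe intensities; real CEL files hold millions of values.
intensities = [120.5, 98.25, 1503.0, 77.125]

as_double = array("d", intensities)  # double precision, 8 bytes per value
as_single = array("f", intensities)  # single precision, 4 bytes per value

bytes_double = as_double.itemsize * len(as_double)
bytes_single = as_single.itemsize * len(as_single)

print("double storage:", bytes_double, "bytes")
print("single storage:", bytes_single, "bytes")
```

Code that depends on double-precision arithmetic on these fields may need an explicit conversion back to double after reading.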
Microarray Functions
Following are new functions:
•
affysnpintensitysplit — Split Affymetrix SNP probe intensity
information for alleles A and B.
•
affygcrma — Perform GC Robust Multi-array Average (GCRMA) procedure
on Affymetrix microarray probe-level data.
•
affyrma — Perform Robust Multi-array Average (RMA) procedure on
Affymetrix microarray probe-level data.
•
DataMatrix — Create DataMatrix object.
The following functions are updated:
•
ilmnbslookup — Look up Illumina BeadStudio target (probe) sequence and
annotation information. Updated to support BGX and TXT annotation files.
•
mattest — Perform two-sample t-test to evaluate differential expression
of genes from two experimental conditions or phenotypes. Updated to use
unequal variance instead of equal variance for the test.
•
probesetlookup — Look up information for Affymetrix probe set. Updated
to accept multiple probe set IDs/names or gene IDs.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.1 and earlier, the mattest function
used equal variance for the test. In Bioinformatics Toolbox Version 3.2, the
mattest function uses unequal variance for the test.
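To see why this change can alter results, the sketch below compares an equal-variance (pooled) two-sample t statistic with an unequal-variance (Welch) one. It is a plain Python illustration of the two textbook formulas, not the mattest source code; the function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nx + ny - 2


def welch_t(x, y):
    """Unequal-variance t statistic with Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Hypothetical expression values for one gene under two conditions.
condition_a = [1.0, 2.0, 3.0, 4.0]
condition_b = [2.0, 4.0, 6.0, 8.0]

print(pooled_t(condition_a, condition_b))
print(welch_t(condition_a, condition_b))
```

With equal group sizes the two statistics coincide, but the Welch test uses fewer degrees of freedom when the variances differ, so p-values can change between versions.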
DataMatrix Object
Following is a new object:
• DataMatrix object — Data structure encapsulating data and metadata
from microarray experiment so that it can be indexed by gene or probe
identifiers and by sample identifiers.
DataMatrix Methods
There are many methods that let you create, index into, modify, create
subsets, sort, perform operations on, analyze, and plot a DataMatrix object.
Demo for Visualization Tools
The Visualizing the Three-Dimensional Structure of a Molecule demo is
updated to use the new pdbsuperpose function.
Demo for Sequence Analysis
The following is a new sequence analysis demo:
• Analyzing the Human Distal Gut Microbiome
Demos for Microarray Data Analysis
Following is a new microarray data analysis demo:
• Working with GEO Series Data
The Exploring Gene Expression Data demo is updated to use the new
DataMatrix object.
The Analyzing Affymetrix SNP Arrays for DNA Copy Number Variants
demo is updated to use two new functions: affysnpannotread and
affysnpintensitysplit.
The Preprocessing Affymetrix Microarray Data at the Probe Level demo is
updated to use two new functions: affygcrma and affyrma.
Version 3.1 (R2008a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.1 (R2008a):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (details labeled as Compatibility
Considerations, below; see also Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New and updated features in this version include:
• “Data Format and Database Functions”
• “Sequence Utility Functions”
• “Pairwise Sequence Alignment Functions”
• “Phylogenetic Tree Tools Function”
• “Protein Analysis Functions”
• “Microarray File Format Functions”
• “Microarray Functions”
• “Object”
• “Clustergram Methods”
• “Demo for Sequence Analysis”
• “Demo for Microarray Data Analysis”
• “Demo for Visualization Tools”
• “Demos for Mass Spectrometry Data Analysis”
Data Format and Database Functions
Following is a new function:
• ilmnbsread — Read microarray data exported from Illumina BeadStudio
software.
The following functions are updated:
• celintensityread — Read probe intensities from Affymetrix CEL files.
Updated output structure to include a new field, GroupNumbers, which
contains group numbers of probes.
• fastawrite — Write to file using FASTA format. Updated such that if
you specify an existing file, new data is appended to the file instead of
overwriting it.
• getgenbank — Retrieve sequence information from GenBank® database.
Updated such that if you use the 'ToFile' property and specify an
existing file, new data is appended to the file instead of overwriting it.
Updated to allow you to access a partial sequence by adding new property
'PartialSeq'.
• getgenpept — Retrieve sequence information from GenPept database.
Updated such that if you use the 'ToFile' property and specify an
existing file, new data is appended to the file instead of overwriting it.
Updated to allow you to access a partial sequence by adding new property
'PartialSeq'.
• getgeodata — Retrieve Gene Expression Omnibus (GEO) SOFT format
data. Updated to retrieve both Sample (GSM) and Data Set (GDS) data.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.0 and earlier, when writing to files using
the fastawrite function or the getgenbank or getgenpept functions with the
'ToFile' property, if you specified an existing file, the file was overwritten.
In Bioinformatics Toolbox Version 3.1, if you specify an existing file, new data
is appended to the file instead of overwriting it.
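The difference between the two behaviors is the difference between opening a file in write mode and in append mode. The sketch below shows this in Python with a hypothetical write_fasta helper and a temporary file; it illustrates the concept only and is not how fastawrite, getgenbank, or getgenpept are implemented.

```python
import os
import tempfile


def write_fasta(path, header, sequence, append=True):
    """Write one FASTA record. append=True mimics the Version 3.1 behavior;
    append=False mimics the Version 3.0 overwrite behavior."""
    mode = "a" if append else "w"
    with open(path, mode) as handle:
        handle.write(f">{header}\n{sequence}\n")


path = os.path.join(tempfile.mkdtemp(), "example.fasta")

write_fasta(path, "seq1", "ACGT")
write_fasta(path, "seq2", "TTGA")  # appended: the file now holds two records
with open(path) as handle:
    appended = handle.read()

write_fasta(path, "seq3", "GGCC", append=False)  # overwritten: one record left
with open(path) as handle:
    overwritten = handle.read()

print(appended)
print(overwritten)
```

Scripts that relied on the old overwrite behavior can delete the existing file before writing to it.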
Sequence Utility Functions
The following functions are updated:
• evalrasmolscript — Send RasMol script commands to Molecule Viewer
window. Updated to use Version 11.4 of the Jmol molecule viewer.
• molviewer — Display and manipulate 3-D molecule structure. Updated
to use Version 11.4 of the Jmol molecule viewer.
• ramachandran — Draw Ramachandran plot for Protein Data Bank (PDB)
data. Updated to handle PDB files with multiple chains and models
by adding three properties: 'Chain', 'Plot', and 'Model'. Updated
Ramachandran plot to mark glycine residues and display reference regions
by adding three properties: 'Glycine', 'Regions', and 'RegionDef'.
Updated Ramachandran plot to display amino acid information in ToolTip.
Updated to easily determine the names and sequence positions of amino
acids corresponding to torsion angles by creating an output structure.
• rebasecuts — Find restriction enzymes that cut nucleotide sequence.
Updated to use Version 710 of REBASE, the Restriction Enzyme Database.
• restrict — Split nucleotide sequence at restriction site. Updated to use
Version 710 of REBASE, the Restriction Enzyme Database.
Pairwise Sequence Alignment Functions
The following functions are updated:
•
nwalign — Globally align two sequences using Needleman-Wunsch
algorithm. Updated to improve pairwise sequence performance.
•
swalign — Locally align two sequences using Smith-Waterman algorithm.
Updated to improve pairwise sequence performance.
Phylogenetic Tree Tools Function
The following function is updated:
• dnds — Estimate synonymous and nonsynonymous substitution rates.
Updated by adding 'AdjustStops' property to control whether stop codons
are excluded from calculations.
Protein Analysis Functions
The following functions are updated:
• evalrasmolscript — Send RasMol script commands to Molecule Viewer
window. Updated to use Version 11.4 of the Jmol molecule viewer.
• molviewer — Display and manipulate 3-D molecule structure. Updated
to use Version 11.4 of the Jmol molecule viewer.
• ramachandran — Draw Ramachandran plot for Protein Data Bank (PDB)
data. Updated to handle PDB files with multiple chains and models
by adding three properties: 'Chain', 'Plot', and 'Model'. Updated
Ramachandran plot to mark glycine residues and display reference regions
by adding three properties: 'Glycine', 'Regions', and 'RegionDef'.
Updated Ramachandran plot to display amino acid information in ToolTip.
Updated to easily determine the names and sequence positions of amino
acids by creating an output structure.
Microarray File Format Functions
Following is a new function:
• ilmnbsread — Read microarray data exported from Illumina BeadStudio
software.
The following functions are updated:
• celintensityread — Read probe intensities from Affymetrix CEL files.
Updated output structure to include a new field, GroupNumbers, which
contains group numbers of probes.
• getgeodata — Retrieve Gene Expression Omnibus (GEO) SOFT format
data. Updated to retrieve both Sample (GSM) and Data Set (GDS) data.
Microarray Functions
Following are new functions:
• cghfreqplot — Display frequency of DNA copy number alterations across
multiple samples.
• ilmnbslookup — Look up Illumina BeadStudio target (probe) sequence
and annotation information.
• redbluecmap — Create red and blue color map.
The following functions are updated:
• clustergram — Compute hierarchical clustering, display dendrogram and
heat map, and create clustergram object.
Updated properties include:
- 'Linkage' — Can specify linkage method separately for rows and
columns.
- 'Dendrogram' — Can specify color threshold separately for rows and
columns.
Replaced properties include:
- 'Dimension' — Replaced by the 'Cluster' property, which lets you
cluster along the columns, rows, or both.
- 'Pdist' — Replaced by 'RowPdist' and 'ColumnPdist' properties.
New properties include:
- 'Standardize' — Specifies the dimension for standardizing the data.
- 'DisplayRange' — Specifies the display range of standardized values.
- 'LogTrans' — Controls the log2 transform of the data.
- 'Impute' — Specifies a function and properties to impute missing data.
- 'RowMarker' — Adds color and text marker to a group of rows.
- 'ColumnMarker' — Adds color and text marker to a group of columns.
The interactivity of the clustergram figure is enhanced with the following
features:
- Select a group of rows or columns and display the group number and
genes or samples within.
- Create a new clustergram of only a group of the data.
- Export data as a clustergram object or structure in the MATLAB
Workspace.
•
maboxplot — Create box plot for microarray data. Updated by adding
'BoxPlot' property, which lets you specify arguments to pass to the
boxplot function, which creates the box plot.
• mairplot — Create intensity versus ratio scatter plot of microarray data.
Updated by adding 'PlotOnly' property, which lets you display the scatter
plot without user interface components.
• mattest — Perform two-sample t-test to evaluate differential expression of
genes from two experimental conditions or phenotypes. Updated by adding
'Bootstrap' property to run bootstrap tests.
• mavolcanoplot — Create significance versus gene expression ratio (fold
change) scatter plot of microarray data. Updated by adding 'PlotOnly'
property, which lets you display the volcano plot without user interface
components.
• probesetvalues — Create table of Affymetrix probe set intensity values.
Updated by adding 'Background' property to control the background
correction.
• zonebackadj — Perform background adjustment on Affymetrix microarray
probe-level data using zone-based method. Updated to return a third
output containing the estimated background values for each probe.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.0 and earlier, the clustergram function
included 'Dimension' and 'Pdist' properties. In Bioinformatics Toolbox
Version 3.1, the 'Dimension' property is replaced by the 'Cluster' property,
and the 'Pdist' property is replaced by the 'RowPdist' and 'ColumnPdist'
properties.
Clustergram Methods
The following are new methods of a clustergram object:
• get — Retrieve information about clustergram object.
• plot — Render clustergram heat map and dendrograms for clustergram
object.
• set — Set property of clustergram object.
• view — View clustergram heat map and dendrograms for clustergram
object.
Demo for Sequence Analysis
The following is a new sequence analysis demo:
• Performing a Metagenomic Analysis of a Sargasso Sea Sample
Demo for Microarray Data Analysis
The following is a new microarray data analysis demo:
• Analyzing Affymetrix SNP Arrays for DNA Copy Number Variants
Demo for Visualization Tools
The following is a new visualization tool demo:
• Working with the Clustergram Function
Demos for Mass Spectrometry Data Analysis
• The Batch Processing of Spectra Using Distributed Computing demo is
updated to use the latest features of the Parallel Computing Toolbox™
version 3.3, and is now called Batch Processing of Spectra Using Sequential
and Parallel Computing.
• The Preprocessing Raw Mass Spectrometry Data demo is updated with
state-of-the-art examples for peak detection using wavelet denoising,
binning by hierarchical clustering, and binning by dynamic programming.
Version 3.0 (R2007b) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.0 (R2007b):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (details labeled as Compatibility
Considerations, below; see also Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New and updated features in this version include:
• “Data Format and Database Functions”
• “Microarray File Format Functions”
• “Microarray Functions”
• “Sequence Conversion, Utility, and Visualization Functions”
• “Mass Spectrometry Functions”
• “Statistical Learning Functions”
• “Gene Ontology Methods”
• “Demos for Microarray Data Analysis”
• “Demos for Sequence Analysis”
• “Demo for Graph Theory Analysis”
Data Format and Database Functions
Following are new functions:
• blastformat — Create local BLAST database.
• blastreadlocal — Read data from local BLAST report.
The following function is updated:
• probesetvalues — Create table of Affymetrix probe set intensity values.
Updated return matrix, which contains intensity values for probe-level
data, to include two new fields: GroupNumber and Direction. Updated to
return a second output containing the column names for the return matrix.
Sequence Conversion, Utility, and Visualization
Functions
Following are new functions:
•
blastlocal — Perform search on local BLAST database to create BLAST
report.
• rnaconvert — Convert secondary structure of RNA sequence between
bracket and matrix notations.
•
rnafold — Predict minimum free-energy secondary structure of RNA
sequence.
•
rnaplot — Draw secondary structure of RNA sequence.
Mass Spectrometry Functions
The following function is updated:
• mspalign — Align mass spectra from multiple peak lists from LC/MS or
GC/MS data set. Updated to include a new property, 'ShowEstimation',
which controls the display of an assessment plot relative to the estimation
method and the vector of common mass/charge (m/z) values.
Statistical Learning Functions
The following function is updated:
• svmsmoset — Create or edit Sequential Minimal Optimization (SMO)
options structure. Updated default values for the 'MaxIter' and
'KernelCacheLimit' properties. Changed the 'Display' property so that
when set to 'iter', a report displays every 500 iterations instead of 10.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.6 and earlier, the svmsmoset function
used a 'MaxIter' property with a default of 1500 and a 'KernelCacheLimit'
property with a default of 7500. In Bioinformatics Toolbox Version 3.0, the
defaults are 15000 and 5000, respectively. Also, when you set the 'Display'
property to 'iter', a report displays every 500 iterations instead of 10.
Gene Ontology Methods
The following methods of a gene ontology object are updated:
•
geneont.getancestors — Find terms that are ancestors of specified
Gene Ontology term. Updated to also return the number of times each
ancestor is found. Updated to include two new properties, 'Relationtype',
which specifies a relationship type to search for in the gene ontology, and
'Exclude', which controls excluding the original queried term(s) from the
output, unless the term was reached while searching the gene ontology.
•
geneont.getdescendants — Find terms that are descendants of
specified Gene Ontology term. Updated to also return the number of
times each descendant is found. Updated to include two new properties,
'Relationtype', which specifies a relationship type to search for in the
gene ontology, and
'Exclude', which controls excluding the original
queried term(s) from the output, unless the term was reached while
searching the gene ontology.
•
geneont.getrelatives — Find terms that are relatives of specified
Gene Ontology term. Updated to also return the number of times each
relative is found. Updated to include three new properties, 'Levels',
which specifies the number of levels up and down to search in the gene
ontology, 'Relationtype', which specifies a relationship type to search
for in the gene ontology, and 'Exclude', which controls excluding the
original queried term(s) from the output, unless the term was reached
while searching the gene ontology.
Demos for Microarray Data Analysis
The following are two new microarray data analysis demos:
• Detecting DNA Copy Number Alteration in Array-Based CGH Data
• Analyzing Array-Based CGH Data Using Bayesian Hidden Markov
Modeling
Demos for Sequence Analysis
The following are two new sequence analysis demos:
• Predicting and Visualizing the Secondary Structure of RNA Sequences
• Identifying Over-Represented Regulatory Motifs
The Investigating the Bird Flu Virus demo was updated to demonstrate how
to write KML-formatted files, which can be used by Google™ Earth to display
geospatial data.
Demo for Graph Theory Analysis
The following is a new graph theory demo:
• Working with Graph Theory Functions
Version 2.6 (R2007a+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.6 (Release 2007a+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (details labeled as Compatibility
Considerations, below; see also Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New and updated functions in this version include:
• “Data Formats and Databases Functions”
• “Microarray File Formats Functions”
• “Microarray Utility Functions”
• “Microarray Normalization and Filtering Functions”
• “Mass Spectrometry Functions”
• “Demos for Mass Spectrometry Functions”
Data Formats and Databases Functions
The following functions are updated:
•
affyread — Read microarray data from Affymetrix GeneChip file. Updated
to read Affymetrix files from expression, genotyping, or resequencing
assays on all platforms, except Solaris™.
•
celintensityread — Read probe intensities from Affymetrix CEL files.
Updated to read Affymetrix CEL and CDF files from expression or
genotyping assays on all platforms, except Solaris.
•
mzxmlread — Read mzXML file into MATLAB as structure. Updated to
read mzXML files that conform to the mzXML 2.1 specification or earlier
specifications.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.6, the structure returned by affyread
when reading a CHP file from an expression assay no longer contains a
ProbePairs field. The ProbePairs field still exists in the structure returned
by affyread when reading a CDF file.
Microarray File Formats Functions
The following functions are updated:
•
affyread — Read microarray data from Affymetrix GeneChip file. Updated
to read Affymetrix files from expression, genotyping, or resequencing
assays on all platforms, except Solaris.
•
celintensityread — Read probe intensities from Affymetrix CEL files.
Updated to read Affymetrix CEL and CDF files from expression or
genotyping assays on all platforms, except Solaris.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.6, the structure returned by affyread
when reading a CHP file from an expression assay no longer contains a
ProbePairs field. The ProbePairs field still exists in the structure returned
by affyread when reading a CDF file.
Microarray Utility Functions
The following function is updated:
• probesetplot — Plot Affymetrix probe set intensity values. Updated to
accept structures created from CEL and CDF files, instead of a structure
created from a CHP file.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.5 and earlier, the probesetplot function
accepted a structure created from a CHP file as input. Currently it requires
two structures: one created from a CEL file and one created from a CDF
library file. If you have any scripts that call the probesetplot function, you
need to update them to provide the correct input arguments.
Microarray Normalization and Filtering Functions
Following is a new function:
• zonebackadj — Perform background adjustment on Affymetrix microarray
probe-level data using zone-based method.
Mass Spectrometry Functions
The following function is updated:
• mzxmlread — Read mzXML file into MATLAB as structure. Updated to
read mzXML files that conform to the mzXML 2.1 specification or earlier
specifications.
Following is a new function you can use to calibrate and/or synchronize
multidimensional mass spectrometry data:
• samplealign — Align two data sets containing sequential observations
by introducing gaps.
Demos for Mass Spectrometry Functions
The following are two new mass spectrometry demos:
• Visualizing and Preprocessing Hyphenated Mass-Spectrometry Data Sets
for Metabolite and Protein/Peptide Profiling
• Differential Analysis of Complex Protein and Metabolic Mixtures Using
Liquid Chromatography/Mass Spectrometry (LC/MS)
Version 2.5 (R2007a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.5 (Release 2007a):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (details labeled as Compatibility
Considerations, below; see also Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New, updated, and deprecated functions in this version include:
• “Data Formats and Database Functions”
• “Demo for Data Formats and Database Functions”
• “Statistical Learning Functions”
• “Protein Analysis and Sequence Utilities Functions”
• “Demo for Protein Analysis and Sequence Utilities Functions”
• “Sequence Alignment Functions”
• “Demo for Sequence Alignment Functions”
• “Microarray File Formats Functions”
• “Microarray Normalization and Filtering Functions”
• “Demo for Microarray File Formats, Normalization, and Filtering
Functions”
• “Microarray Data Analysis and Visualization Functions”
• “Demo for Microarray Data Analysis and Visualization Functions”
• “Mass Spectrometry Functions”
• “Phylogenetic Tree Tools Functions”
• “Demos for Phylogenetic Tree Tools Functions”
• “Phylogenetic Tree Methods”
Data Formats and Database Functions
Following are new functions for reading and creating files:
• affyprobeseqread — Read data file containing probe sequence information
for Affymetrix GeneChip array.
• pdbwrite — Write to file using Protein Data Bank (PDB) format.
The following functions were updated:
• celintensityread — Read probe intensities from Affymetrix CEL files
(Windows® 32). Updated so that the order of columns (CEL files) in return
matrices PMIntensities and MMIntensities matches the order of CEL
files in the CELFiles input argument.
• pdbread — Read data from Protein Data Bank (PDB) file. Updated so
that the six fields containing coordinate information (Atom, AtomSD,
AnisotropicTemp, AnisotropicTempSD, Terminal, and HeterogenAtom)
are now subfields within the Model field of the MATLAB structure.
Updated to include a new property, ModelNum, which reads only the
specified model from a PDB-formatted text file.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the celintensityread
function ordered the columns (CEL files) of return matrices PMIntensities
and MMIntensities alphabetically.
In Bioinformatics Toolbox Version 2.4 and earlier, the pdbread function
stored coordinate information in six fields (Atom, AtomSD, AnisotropicTemp,
AnisotropicTempSD, Terminal, and HeterogenAtom) within the MATLAB
structure. These six fields are now subfields within the Model field of the
MATLAB structure.
Demo for Data Formats and Database Functions
The Accessing NCBI Entrez Databases with E-Utilities demo illustrates how
to programmatically search and retrieve data.
Statistical Learning Functions
Following are new functions:
• optimalleaforder — Determine optimal leaf ordering for hierarchical
binary cluster tree.
• svmsmoset — Create or edit Sequential Minimal Optimization (SMO)
options structure.
The following function was updated:
• svmtrain — Train support vector machine classifier. Updated to include
a new SMO method and a new property, SMO_Opts, which provides options
for the SMO method. The BoxConstraint property has changed, including
a new default value.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the svmtrain function used
a BoxConstraint property with a default of 1/sqrt(eps). In Bioinformatics
Toolbox Version 2.5, the default is 1, which can lead to slightly different results.
Protein Analysis and Sequence Utilities Functions
Following are new functions:
• evalrasmolscript — Send RasMol script commands to molecule viewer.
• molviewer — Display and manipulate 3-D molecule structure.
• proteinpropplot — Plot properties of amino acid sequence.
• seqinsertgaps — Insert gaps into nucleotide or amino acid sequence.
The following functions were updated:
• featuresparse — Parse features from GenBank, GenPept, or EMBL
data. Updated to include a new property, Sequence, which controls the
extraction, when possible, of the sequences.
• oligoprop — Calculate sequence properties of DNA oligonucleotide.
Updated to handle ambiguous N characters in a sequence.
The following function is obsolete:
• pdbplot — Plot 3-D protein structure. This function was replaced by the
molviewer function.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.5, the pdbplot function was replaced
by the molviewer function. If you have any scripts that call the pdbplot
function, you need to update them to call the molviewer function.
Demo for Protein Analysis and Sequence Utilities
Functions
The Visualizing the Three-dimensional Structure of a Molecule demo
illustrates the
molviewer function.
Sequence Alignment Functions
The following function was updated:
• seqpdist — Calculate pairwise distance between sequences. Updated to
assume that all input sequences are aligned if they have the same length,
regardless of the presence of gaps. If you know your input sequences are
not aligned, you can align them before passing them to seqpdist (for
example, using multialign), or set PairwiseAlignment to true when
using seqpdist.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the seqpdist function
assumed all input sequences were aligned if they had the same length and
at least one gap.
Demo for Sequence Alignment Functions
The Comparing Whole Genomes demo illustrates how to compare features of
organisms on a genomic evolution scale.
Microarray File Formats Functions
Following is a new function:
• affyprobeseqread — Read data file containing probe sequence information
for Affymetrix GeneChip array.
The following function was updated:
• celintensityread — Read probe intensities from Affymetrix CEL files
(Windows 32). Updated so that the order of columns (CEL files) in return
matrices PMIntensities and MMIntensities matches the order of CEL
files in the CELFiles input argument.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the celintensityread
function ordered the columns (CEL files) of return matrices PMIntensities
and MMIntensities alphabetically.
Microarray Normalization and Filtering Functions
Following are new functions:
•
affyprobeaffinities — Compute Affymetrix probe affinities from their
sequences and MM probe intensities.
•
gcrmabackadj — Perform GC Robust Multi-array Average (GCRMA)
background adjustment on Affymetrix microarray probe-level data using
sequence information.
•
gcrma — Perform GC Robust Multi-array Average (GCRMA) background
adjustment, quantile normalization, and median-polish summarization on
Affymetrix microarray probe-level data.
Demo for Microarray File Formats, Normalization,
and Filtering Functions
The Preprocessing Affymetrix Microarray Data at the Probe Level demo
illustrates the
and
Microarray Data Analysis and Visualization Functions
Following is a new function:
•
mafdr — Estimate false discovery rate (FDR) of differentially expressed
genes from two experimental conditions or phenotypes.
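As background on what a false discovery rate estimate involves, the sketch below implements the Benjamini-Hochberg adjustment, one standard FDR procedure. This is an assumption chosen for illustration: these notes do not state which procedure mafdr uses, and the function name and p-values here are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from a two-sample test.
p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(p))
```

Genes whose adjusted value falls below a chosen threshold are reported as differentially expressed, with the threshold bounding the expected fraction of false positives among them.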
The following function was updated:
• mattest — Perform two-tailed t-test to evaluate differential expression of
genes from two experimental conditions or phenotypes. Updated to include
a new property, Permute, which controls whether permutation tests are
run.
Demo for Microarray Data Analysis and Visualization
Functions
The Exploring Gene Expression Data demo illustrates the mattest and mafdr
functions.
Mass Spectrometry Functions
Following are new functions:
• msdotplot — Plot set of peak lists from LC/MS or GC/MS data set.
• mspalign — Align mass spectra from multiple peak lists from LC/MS or
GC/MS data set.
•
mspeaks — Convert raw mass spectrometry data to peak list (centroided
data).
•
msppresample — Resample mass spectrometry signal while preserving
peaks.
•
mzxml2peaks — Convert mzXML structure to peak list.
The following function was updated:
•
msheatmap — Create pseudocolor image of set of mass spectra. Updated
to handle LC/MS and GC/MS data.
Phylogenetic Tree Tools Functions
Following is a new function:
•
seqinsertgaps — Insert gaps into nucleotide or amino acid sequence.
The following functions were updated:
• dnds — Estimate synonymous and nonsynonymous substitution rates.
Updated to include two new properties, Verbose, which controls the
display of the codons considered in the computations and their amino acid
translations, and Window, which performs the calculations over a sliding
window.
• dndsml — Estimate synonymous and nonsynonymous substitution rates
using maximum likelihood method. Updated to include a new property,
Verbose, which controls the display of the codons considered in the
computations and their amino acid translations.
• seqpdist — Calculate pairwise distance between sequences. Updated to
assume that all input sequences are aligned if they have the same length,
regardless of the presence of gaps. If you know your input sequences are
not aligned, you can align them before passing them to seqpdist (for
example, using multialign), or set PairwiseAlignment to true when
using seqpdist.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the seqpdist function
assumed all input sequences were aligned if they had the same length and
at least one gap.
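The change in the alignment assumption can be expressed as a small predicate. The Python sketch below is a hypothetical restatement of the rule described above, not the seqpdist source; the function name, the require_gap flag, and the use of "-" as the gap character are assumptions for illustration.

```python
def treated_as_aligned(sequences, require_gap):
    """Return True if the inputs would be assumed to be already aligned.

    require_gap=True models the Version 2.4 rule (same length and at least
    one gap); require_gap=False models the Version 2.5 rule (same length).
    """
    same_length = len({len(s) for s in sequences}) == 1
    if not same_length:
        return False
    if require_gap:
        return any("-" in s for s in sequences)
    return True


ungapped = ["ACGT", "ACGA"]
gapped = ["AC-T", "ACGA"]

print(treated_as_aligned(ungapped, require_gap=True))
print(treated_as_aligned(ungapped, require_gap=False))
print(treated_as_aligned(gapped, require_gap=True))
```

Equal-length, gap-free sequences are the case where the two versions disagree, so scripts passing such inputs are the ones most likely to see different distances.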
Demos for Phylogenetic Tree Tools Functions
The following demos illustrate the nwalign, seqinsertgaps, dnds, and
multialign functions:
• Analyzing Synonymous and Nonsynonymous Substitution Rates
• Investigating the Bird Flu Virus
The Reconstructing the Origin and the Diffusion of the SARS Epidemic demo
presents an analysis of the SARS epidemic.
Phylogenetic Tree Methods
Following is a new method of a phytree object:
• reorder — Reorder leaves of phylogenetic tree.
Version 2.4 (R2006b) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.4 (Release 2006b):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (Summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New functions, obsoleted functions, and changes introduced in this version are
• “Data Formats and Database Functions”
• “Sequence Utilities Functions”
• “Sequence Visualization Functions”
• “Multiple Sequence Alignment Functions”
• “Microarray File Formats”
• “Microarray Data Analysis and Visualization Functions”
• “Graph Theory Functions”
• “Graph Visualization Methods”
• “Phylogenetic Tree Methods”
Data Formats and Database Functions
Following is a new function for getting data into the MATLAB environment:
• mzxmlread — Read mzXML file into the MATLAB software as structure.
The following functions were updated:
• celintensityread — Read probe intensities from Affymetrix CEL files (Windows 32). Updated to include a new property, Verbose, which controls the display of a progress report showing the name of each CEL file as it is read.
• fastaread — Read data from FASTA file. Updated to include a new property, Blockread, which controls reading a single entry or block of entries from a file.
• geosoftread — Read Gene Expression Omnibus (GEO) SOFT format data. Updated to read Data Set (GDS) files as well as Sample (GSM) files.
• getblast — BLAST report from NCBI Web site. Updated to include a new property, WaitTilReady, which pauses the MATLAB software and waits a specified time (minutes) for a report from the NCBI Web site.
• scfread — Read trace data from SCF file. Updated to include more output options.
Sequence Utilities Functions
Following is a new function for parsing sequence data:
• featuresparse — Parse features from GenBank, GenPept, or EMBL data.
Sequence Visualization Functions
The following function was updated:
• seqtool — Open tool to interactively explore biological sequences. Updated to download sequences from the EMBL database, interactively move the viewing frame in the Sequence Viewer by pressing and holding Ctrl while click-dragging, and export an amino acid translation as a FASTA file or to the MATLAB Workspace.
Multiple Sequence Alignment Functions
The following function was updated:
• multialignviewer — Open viewer for multiple sequence alignments. Updated to export consensus sequences.
Microarray File Formats
The following function was updated:
• celintensityread — Read probe intensities from Affymetrix CEL files (Windows 32). Updated to include a new property, Verbose, which controls the display of a progress report showing the name of each CEL file as it is read.
Microarray Data Analysis and Visualization Functions
The following functions were updated:
• clustergram — Create dendrogram and heat map. Updated to include a new property, OptimalLeafOrder, which enables or disables the optimal leaf ordering calculation, which determines the leaf order that maximizes the similarity between neighboring leaves.
• mairplot — Create intensity versus ratio scatter plot for microarray signals. Updated to include a new property, Type, which creates either an IR plot or MA plot, changing the plot axes to log scale, and adding plot interactive features such as displaying gene labels, changing factor lines, normalizing data, and exporting data.
• mapcaplot — Create Principal Component plot of expression profile data. Updated by adding an export feature.
• redgreencmap — Create red and green colormap. Updated to include a new property, Interpolation, which sets the method for color interpolation.
Graph Theory Functions
Following are new functions for applying basic graph theory algorithms to
sparse matrices:
• graphallshortestpaths — Find all shortest paths in graph.
• graphconncomp — Find strongly or weakly connected components in graph.
• graphisdag — Test for cycles in directed graph.
• graphisomorphism — Find isomorphism between two graphs.
• graphisspantree — Determine if tree is spanning tree.
• graphmaxflow — Calculate maximum flow and minimum cut in directed graph.
• graphminspantree — Find minimal spanning tree in graph.
• graphpred2path — Convert predecessor indices to paths.
• graphshortestpath — Solve shortest path problem in graph.
• graphtopoorder — Perform topological sort of directed acyclic graph.
• graphtraverse — Traverse graph by following adjacent nodes.
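These functions operate on a graph stored as a sparse matrix. As a rough illustration of that representation, the following plain MATLAB sketch performs a topological sort of a small directed acyclic graph using Kahn's algorithm. It does not call graphtopoorder or graphisdag and is not the toolbox implementation; the example graph, the edge convention, and all variable names are assumptions made for illustration.

```matlab
% Hypothetical directed acyclic graph stored as a sparse adjacency matrix.
% Assumed convention for this sketch: G(i,j) ~= 0 means an edge from node i to node j.
G = sparse([1 1 2 3], [2 3 4 4], 1, 4, 4);

inDegree = full(sum(G ~= 0, 1));   % number of incoming edges per node
ready = find(inDegree == 0);       % nodes with no unprocessed predecessors
order = [];                        % topological order, built up node by node

while ~isempty(ready)
    node = ready(1);               % take the next node that is ready
    ready(1) = [];
    order(end + 1) = node;
    successors = find(G(node, :)); % nodes reachable by one outgoing edge
    for s = successors
        inDegree(s) = inDegree(s) - 1;
        if inDegree(s) == 0
            ready(end + 1) = s;    % all predecessors of s are now placed
        end
    end
end

% If every node was placed in the order, the graph has no cycles.
isAcyclic = numel(order) == size(G, 1);
disp(order);
disp(isAcyclic);
```

Storing the graph as a sparse matrix keeps memory proportional to the number of edges, which is the reason a sparse adjacency matrix suits large, sparsely connected networks.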
Graph Visualization Methods
Following are new methods for applying basic graph theory algorithms to a biograph object:
• allshortestpaths — Find all shortest paths in biograph object.
• conncomp — Find strongly or weakly connected components in biograph object.
• getmatrix — Get connection matrix from biograph object.
• isdag — Test for cycles in biograph object.
• isomorphism — Find isomorphism between two biograph objects.
• isspantree — Determine if tree created from biograph object is spanning tree.
• maxflow — Calculate maximum flow and minimum cut in biograph object.
• minspantree — Find minimal spanning tree in biograph object.
• shortestpath — Solve shortest path problem in biograph object.
• topoorder — Perform topological sort of directed acyclic graph extracted from biograph object.
• traverse — Traverse biograph object by following adjacent nodes.
Phylogenetic Tree Methods
Following is a new method for the phytree object:
• getmatrix — Convert phytree object into a relationship matrix.
Version 2.3 (R2006a+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.3 (Release 2006a+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
New functions, obsoleted functions, and changes introduced in this version are
• “Data Formats and Databases Functions”
• “Sequence Utilities Functions”
• “Sequence Visualization Functions”
• “Statistical Learning Functions”
• “Microarray Functions”
• “Demo for Microarray Functions”
Data Formats and Databases Functions
The following functions are obsolete:
• getpir — Sequence data from PIR-PSD database. This function retrieved data from the PIR-PSD database. This database has been discontinued and this function no longer retrieves data. See http://pir.georgetown.edu/pirwww/dbinfo/nref.shtml for more details.
• pirread — Read data from Protein Information Resource (PIR) file. This function supported the data format of the PIR-PSD database. This database has been discontinued. See http://pir.georgetown.edu/pirwww/dbinfo/nref.shtml for more details.
Sequence Utilities Functions
The following function was updated to include five new databases, including refseq_rna, refseq_genomic, env_nt, refseq_protein, and env_nr:
• blastncbi — Generate remote BLAST request.
Sequence Visualization Functions
Following is a new function for visualizing sequence data:
• featuresmap — Draw linear or circular map of features from GenBank structure.
Statistical Learning Functions
The following function was updated to include three new properties, including RBF_Sigma, BoxConstraint, and Autoscale:
• svmtrain — Train support vector machine classifier.
Microarray Functions
The following function is supported on the Windows 32 platform only:
• affyread — Read microarray data from Affymetrix GeneChip file (Windows 32).
Following are new functions for preprocessing Affymetrix probe-level microarray data:
• celintensityread — Read probe intensities from Affymetrix CEL files (Windows 32).
• rmabackadj — Perform background adjustment on Affymetrix microarray probe-level data using Robust Multi-array Average (RMA) procedure.
• rmasummary — Calculate gene (probe set) expression values from Affymetrix microarray probe-level data using Robust Multi-array Average (RMA) procedure.
• affyinvarsetnorm — Perform rank invariant set normalization on probe intensities from multiple Affymetrix CEL or DAT files.
Following is a new function for two-color microarray normalization:
• mainvarsetnorm — Perform rank invariant set normalization on gene expression values from two experimental conditions or phenotypes.
Following are new functions for microarray differential expression analysis:
• mattest — Perform two-sample, two-tailed t-test to evaluate differential expression of genes from two experimental conditions or phenotypes.
• mavolcanoplot — Create significance versus gene expression ratio (fold change) scatter plot of microarray data.
Demo for Microarray Functions
New demo of the new microarray functions (Analyzing Affymetrix Microarray
Gene Expression Data).
Version 2.2.1 (R2006a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.2.1 (Release 2006a):
New Features and Changes: No
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
Version 2.2 (R14SP3+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.2 (Release 14SP3+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
New features and changes introduced in this version are
• “Multiple Sequence Alignment Viewer”
• “Microarray Functions for Agilent Software”
• “Gene Ontology Database Functions”
• “Demo for Gene Ontology Functions”
Multiple Sequence Alignment Viewer
• multialignviewer — Interactively view, explore alignments, and make
• dnds, dndsml — Estimate synonymous and nonsynonymous substitution rates.
• seqneighjoin — Reconstruct a phylogenetic tree with a Neighbor-joining method.
Phylogenetic Tree Methods
• getcanonical — Calculate the canonical form of a phylogenetic tree.
• getnewwickstr — Create a Newick formatted string.
• reroot — Change the root of a phylogenetic tree.
• subtree — Extract a subtree.
• weights — Calculate weights for a phylogenetic tree.
Microarray Functions
probesetplot — Plot values for an Affymetrix CHP file probe set.
Statistics Functions
rankfeatures — Renamed function. The previous name was sqtlfeatures.
Version 2.0.1 (R14SP2) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.0.1 (Release 14SP2):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
New features and changes introduced in this version are
• “Updated RBASE Table”
• “Expanded Bioperl Demonstration”
Updated RBASE Table
RBASE is the enzyme table that the function restrict uses to locate
sequence patterns.
Expanded Bioperl Demonstration
Example of calling the MATLAB software from Perl scripts now includes several examples of passing various types of data (both directly and by variant variable) back and forth between Perl and a MATLAB Automation Server. To view the demo, type bioperldemo.
Version 2.0 (R14SP1+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.0 (Release 14SP1+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: No bug fixes
Related Documentation at Web Site: No
New features and changes introduced in this version are
• “Mass Spectrometry Data Analysis”
• “Graph Visualization Object and Methods”
• “Statistical Learning”
• “Sequence Analysis”
• “Protein Analysis”
• “Microarray Analysis”
• “Updated Web Connectivity Function”
Mass Spectrometry Data Analysis
Following are new functions designed for preprocessing and classification of raw mass spectrometry data from SELDI-TOF and MALDI-TOF spectrometers.
• msresample — Resample with antialias filtering.
• msbackadj — Correct a baseline by estimation.
• msalign — Align a spectrum to a set of candidate peaks.
• msheatmap — Draw a heat map image for a set of spectra and check alignments.
• msnorm — Normalize a set of spectra.
• mslowess — Nonparametric smoothing using the Lowess method.
• mssgolay — Least-squares polynomial smoothing.
• msviewer — Plot a spectrum or a set of spectra.
Graph Visualization Object and Methods
New object and set of methods to view relationships between data with
interactive maps.
• biograph — Function to create a biograph object.
• dolayout — Calculate node and edge positions.
• getnodesbyid — Get handles to nodes.
• getedgesbynodeid — Get handles to edges.
• view — Render a graph in its viewer.
• getancestors — Find ancestors.
• getdescendants — Find descendants.
• getrelatives — Find neighbors.
Statistical Learning
New set of functions to classify data and identify features in the data.
• classperf — Evaluate the performance of a classifier.
• crossvalind — Cross-validation index generation.
• knnclassify — K-Nearest neighbor classifier.
• knnimpute — Impute missing data using the nearest neighbor method.
results in a Figure window. (This may cause problems on the Mac®.)
In Bioinformatics Toolbox 2.0 the functions seqlogo, seqshowwords, seqshoworfs, and showalignment use Java™ based figures. Currently on the Macintosh®, Java figures are not enabled by default. If you use these functions on a Macintosh, you should start the MATLAB software with
matlab -useJavaFigures
Protein Analysis
• pdbplot — Plots 3-D backbone structure of proteins in a PDB file.
Microarray Analysis
• quantilenorm — Quantile normalization.
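As a rough sketch of what quantile normalization does, the following plain MATLAB code forces every column of a small matrix to share the same distribution by replacing each value with the mean of the values that hold the same rank across columns. It does not call quantilenorm, it ignores ties and missing values, and the data matrix and variable names are made up for illustration.

```matlab
% Hypothetical data: rows are probes, columns are arrays.
X = [5 4 3; 2 1 4; 3 4 6; 4 2 8];

% Sort each column and remember where each value came from.
[sortedX, idx] = sort(X, 1);

% Mean across columns at each rank (smallest values, next smallest, ...).
rankMeans = mean(sortedX, 2);

% Put the rank means back in the original row positions of each column.
Xn = zeros(size(X));
for j = 1:size(X, 2)
    Xn(idx(:, j), j) = rankMeans;
end

disp(Xn);
```

After this step every column of Xn contains exactly the same set of values, only in a different order, which is the defining property of quantile-normalized data.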
New set of functions for working with Affymetrix GeneChip data sets.
• probelibraryinfo — Get library information for a probe.
• probesetlink — Show probe set information from NetAffx™.
• probesetlookup — Get gene information for a probe set.
• probesetplot — Plot probe set values.
• probesetvalues — Get probe set values from CEL and CDF information.
• manorm — Normalization by scaling and centering replaces the functions mamadnorm and mameannorm.
Note If you use mamadnorm or mameannorm in any of your personal M-files, please update your files with the new function manorm. These functions are now obsolete and may be removed from future releases of the Bioinformatics Toolbox software.
• affyread — Updated with output structures that have changed slightly. Some redundant fields have been removed from CDF and CHP structure. GIN database files are now supported. Version 4 of the Affymetrix GDAC File Access Runtime Libraries is provided.
• geosoftread — Now supports Gene Expression Omnibus Database records (GDS files).
• maimage — Now supports Affymetrix CEL data.
• maboxplot — Now supports Affymetrix CHP data.
Updated Web Connectivity Function
getgenbank — Now returns CDS information for a gene in a structure allowing direct access to the transcribed sequence.
Version 1.1.1 (R14SP1) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 1.1.1 (Release 14SP1):
New Features and Changes: No
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: No bug fixes
Related Documentation at Web Site: No
Version 1.1 (R14) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 1.1 (Release 14):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: No bug fixes
Related Documentation at Web Site: No
New features and changes introduced in this version are
• “Phylogenetic Analysis Functions”
• “Phylogenetic Tree Object and Methods”
• “Hidden Markov Model (HMM) Profiles”
• “BLAST Functions”
• “Microarray Functions”
• “Protein Analysis Function”
• “Sequence Alignment Functions”
• “New Demos”
Phylogenetic Analysis Functions
New functions for phylogenetic tree creation and analysis.
workspace and return a phytree object with data from the file. Data in the file uses the Newick (New Hampshire) format for describing trees.
• phytreewrite — Copy the contents of a phytree object from the MATLAB workspace to a file.
• phytreetool — Interactive GUI that allows you to view, edit, and explore phylogenetic tree data. This GUI allows branch pruning, reordering, renaming, and distance exploring. It can also open or save Newick-formatted files.
• seqlinkage — Construct a phylogenetic tree from pairwise distances.
• seqpdist — Calculate the pairwise distance between biological sequences.
Phylogenetic Tree Object and Methods
New object for manipulating phylogenetic tree data.
• phytree — Function to create a phytree object.
• get — Get property values from a phytree object.
• getbyname — Get node names from a phytree object.
• pdist — Calculate the patristic distances between pairs of leaf nodes.
• plot — Draw a phylogenetic tree object in a MATLAB Figure window as a phylogram, cladogram, or radial tree.
• prune — Remove nodes from a phylogenetic tree.
• select — Select branches and leaves from a phylogenetic tree using specified criteria.
• view — Open a phylogenetic tree in a phytreetool window.
Hidden Markov Model (HMM) Profiles
Updated Hidden Markov Model profile functions.
• The model structure that HMM functions use now includes loop and null transition probabilities. You can read null and loop probabilities from PFAM files using pfamhmmread, and, from PFAM Web databases, using gethmmprof.
• When the function hmmprofstruct builds an HMM model, the loop and null transition probabilities default to predefined values. If necessary, you can later modify the probabilities using the same function.
• hmmprofalign includes two new properties to control the scoring of flanking states and null transition probabilities. In addition, a third output argument with indices pointing to the respective symbols of the query sequence was added.
BLAST Functions
blastncbi, blastread, getblast — BLAST sequences and view results from
within the MATLAB software.
Microarray Functions
• imageneread — Read microarray data from an ImaGene® Results file.
• affyread — Read microarray data from Affymetrix GeneChip files.
• gprread — Read microarray data from GenePix® Results (GPR) files.
• mapcaplot — Create a Principal Component plot of expression profile data.
• clustergram — Updated function to do two way bi-clustering.
Protein Analysis Function
isoelectric — Estimate the isoelectric point (the pH at which the protein
has a net charge of zero) for an amino acid sequence and estimate the charge
for a given pH.
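The idea behind an isoelectric point estimate can be sketched in plain MATLAB code: sum the charge contributions of the ionizable groups with the Henderson-Hasselbalch relation, then search for the pH at which the net charge crosses zero. This sketch does not call isoelectric and is not the toolbox implementation. The peptide is hypothetical, and the pKa values are approximate, illustrative assumptions rather than the table the toolbox uses.

```matlab
% Hypothetical peptide (single-letter amino acid codes).
seq = 'MKKDEHR';

% Assumed, approximate pKa values for ionizable groups (illustrative only).
posRes = 'KRH';   posPK = [10.8 12.5 6.5];      % groups that carry +1 when protonated
negRes = 'DECY';  negPK = [3.9 4.1 8.5 10.1];   % groups that carry -1 when deprotonated
nTermPK = 8.6;    cTermPK = 3.6;                % terminal amino and carboxyl groups

% Count each ionizable residue in the sequence.
posCount = arrayfun(@(r) sum(seq == r), posRes);
negCount = arrayfun(@(r) sum(seq == r), negRes);

% Net charge at a given pH: protonated positive groups minus deprotonated negative groups.
netCharge = @(pH) sum([posCount 1] ./ (1 + 10.^(pH - [posPK nTermPK]))) ...
                - sum([negCount 1] ./ (1 + 10.^([negPK cTermPK] - pH)));

% Net charge decreases as pH rises, so bisection on [0, 14] finds the zero crossing.
lo = 0;
hi = 14;
for k = 1:60
    mid = (lo + hi) / 2;
    if netCharge(mid) > 0
        lo = mid;
    else
        hi = mid;
    end
end
pI = (lo + hi) / 2;

fprintf('Estimated pI: %.2f, charge at pH 7: %.2f\n', pI, netCharge(7));
```

The same netCharge handle also gives the second quantity mentioned above, the estimated charge at a chosen pH.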
Sequence Alignment Functions
• seqdisp — Formats sequence output for easy viewing.
• seqmatch — Find matches for every string in a library.
• seqdotplot — Updated function now returns a second output (the matrix of matches as a sparse matrix).
• aminolookup, baselookup — Updated functions to get IUB/IUPAC character codes, integer codes, and names for nucleotides and amino acids.
New Demos
• Bicluster demo — Demonstrates some of the options of the clustergram
function.
• Bioperl demo — Illustrates the interoperability between the MATLAB
software and Bioperl, passing arguments from the MATLAB software to
Perl scripts and pulling BLAST search data back to the MATLAB software.
• Phytree demo for Hominidae species — A phylogenetic tree is
constructed from mtDNA sequences for the Hominidae taxa (also known
as pongidae). This family embraces the gorillas, chimpanzees, orangutans,
and humans.
• Phytree demo for HIV/SIV — Analyzes the reconstruction of phylogenetic
trees from infected HIV/SIV organisms.
Version 1.0 (R13+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 1.0 (Release 13+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
New features and changes introduced in this version are
• “Introduction to Bioinformatics Toolbox”
Bioinformatics Toolbox Version 1.0 (Web Release R13SP1+) extends the
MATLAB software with basic sequence analysis and gene expression analysis
functions. Bioinformatics Toolbox is a collection of tools built on the MATLAB
numeric computing environment. The toolbox supports a wide range of
common sequence analysis and expression analysis tasks, from accessing
Web-based databases, to sequence alignment, to microarray normalization
and visualization.
Bioinformatics Toolbox is dependent on many functions from the Statistics
Toolbox™ software, including some functions available only in the latest
version of Statistics Toolbox 4.1. We recommend that you install the latest
version of the Statistics Toolbox software before running the Bioinformatics
Toolbox software.
Bioinformatics Toolbox 1.0 has more than 100 functions implemented using
M-files. For a complete list of functions, in the MATLAB Command Window,
type
help bioinfo
Databases and Data Formats
The toolbox provides functions to directly access many standard Web-based
databases such as GenBank, EMBL, PIR, and PDB. There are also functions
to read many standard file formats, including FASTA and PDB. For
microarray data, there are functions to read Affymetrix, GenePix, SPOT
format data, and a function to access data directly from the NCBI Gene
Expression Omnibus Web site.
Sequence Alignment
The toolbox has functions for pairwise sequence alignment and for hidden
Markov model sequence profile alignment, including efficient MATLAB
implementations of the Needleman-Wunsch and Smith-Waterman algorithms.
In addition to the alignment functions there are several tools for visualizing
sequence alignments. The toolbox provides many standard scoring matrices,
including the PAM and BLOSUM families.
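To give a feel for the dynamic programming behind global alignment, here is a minimal plain MATLAB sketch of the Needleman-Wunsch score recurrence. It is not the toolbox implementation and does not call any toolbox function. It assumes a simple match/mismatch scheme with a linear gap penalty instead of a PAM or BLOSUM scoring matrix, and the sequences and parameter values are hypothetical.

```matlab
% Hypothetical sequences and scoring parameters (assumptions for this sketch).
a = 'ACGT';
b = 'ACT';
matchScore = 1;
mismatchScore = -1;
gapPenalty = -1;

n = length(a);
m = length(b);

% F(i+1, j+1) holds the best score for aligning a(1:i) with b(1:j).
F = zeros(n + 1, m + 1);
F(:, 1) = (0:n)' * gapPenalty;   % a(1:i) aligned against gaps only
F(1, :) = (0:m) * gapPenalty;    % b(1:j) aligned against gaps only

for i = 1:n
    for j = 1:m
        if a(i) == b(j)
            s = matchScore;
        else
            s = mismatchScore;
        end
        F(i + 1, j + 1) = max([F(i, j) + s, ...            % align a(i) with b(j)
                               F(i, j + 1) + gapPenalty, ... % a(i) against a gap
                               F(i + 1, j) + gapPenalty]);   % b(j) against a gap
    end
end

score = F(end, end);   % optimal global alignment score
disp(score);
```

Smith-Waterman local alignment uses the same table with two changes: every cell is also compared against zero, and the result is the largest value anywhere in the table rather than the bottom-right cell.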
Sequence Utilities and Statistics
The toolbox contains many functions for working with sequences. There are
functions for converting DNA sequences to RNA or amino acid sequences;
there are functions that report various statistics about sequences, and
functions to search for patterns within the sequence; there are functions
for creating random sequences, and there are functions to perform in-silico
digestion of sequences with restriction enzymes and proteases.
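A few of these operations are simple enough to sketch in plain MATLAB code. The example below does not use any toolbox function; the sequence and variable names are hypothetical and only illustrate the kind of conversion and statistic described above.

```matlab
% Hypothetical DNA sequence (coding strand).
dna = 'ATGGCGTAA';

% DNA to RNA: replace thymine with uracil.
rna = strrep(dna, 'T', 'U');

% Reverse complement: swap A<->T and C<->G, then reverse the order.
comp = dna;
comp(dna == 'A') = 'T';
comp(dna == 'T') = 'A';
comp(dna == 'C') = 'G';
comp(dna == 'G') = 'C';
revComp = fliplr(comp);

% A simple sequence statistic: fraction of G and C bases.
gcContent = sum(dna == 'G' | dna == 'C') / length(dna);

fprintf('%s\n%s\nGC content: %.3f\n', rna, revComp, gcContent);
```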
Microarray Normalization and Visualization
The toolbox contains a number of functions for normalizing microarray
data including lowess normalization, global mean normalization, and
MAD normalization. The toolbox provides several functions for visualizing
microarray data, including spatial heat maps, box plots, loglog, and I-R
plots. The toolbox also uses functions from the Statistics Toolbox software to
perform cluster analysis and to visualize the results.
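Two of the normalization schemes named above can be sketched in a few lines of plain MATLAB code. This is one common formulation of each, written without toolbox functions, and it may differ in detail from what the toolbox functions compute. The intensity matrix is made up for illustration.

```matlab
% Hypothetical intensities: rows are spots, columns are channels or arrays.
X = [100 220; 200 380; 400 900; 300 500];
nRows = size(X, 1);

% Global mean normalization: divide each column by its mean,
% so every normalized column has a mean of 1.
Xmean = X ./ repmat(mean(X, 1), nRows, 1);

% MAD normalization: center each column on its median and scale by its
% median absolute deviation (MAD).
med = repmat(median(X, 1), nRows, 1);
madVal = repmat(median(abs(X - med), 1), nRows, 1);
Xmad = (X - med) ./ madVal;

disp(Xmean);
disp(Xmad);
```

Because the median and the MAD are less sensitive to a few extreme values than the mean, the second scheme is often preferred when a small number of very bright spots would otherwise dominate the scaling.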
Protein Structure Analysis
In addition to standard sequence analysis functions, there is also a graphical user interface (GUI), proteinplot, for visualizing properties of protein sequences.
Compatibility Summary for Bioinformatics Toolbox Software
This table summarizes new features and changes that might cause
incompatibilities when you upgrade from an earlier version, or when you
use files on multiple versions. Details are provided in the description of the
new feature or change.
Version (Release)
Latest Version V3.5 (R2010a)
V3.4 (R2009b): See the Compatibility Considerations subheading
V3.3 (R2009a): See the Compatibility Considerations subheading
V3.2 (R2008b): See the Compatibility Considerations subheading
V3.1 (R2008a): See the Compatibility Considerations subheading