Mathworks BIOINFORMATICS TOOLBOX RELEASE NOTE

Bioinformatics Toolbox™ Release Notes
How to Contact The MathWorks
www.mathworks.com (Web)
comp.soft-sys.matlab (Newsgroup)
www.mathworks.com/contact_TS.html (Technical Support)
bugs@mathworks.com (Bug reports)
doc@mathworks.com (Documentation error reports)
service@mathworks.com (Order status, license renewals, passcodes)
info@mathworks.com (Sales, pricing, and general information)
508-647-7000 (Phone)
508-647-7001 (Fax)
The MathWorks, Inc. 3 Apple Hill Drive Natick, MA 01760-2098
For contact information about worldwide offices, see the MathWorks Web site.
© COPYRIGHT 2003–2010 by The MathWorks, Inc.
The software described in this document is furnished under a license agreement. The software may be used or copied only under the terms of the license agreement. No part of this manual may be photocopied or reproduced in any form without prior written consent from The MathWorks, Inc.
FEDERAL ACQUISITION: This provision applies to all acquisitions of the Program and Documentation by, for, or through the federal government of the United States. By accepting delivery of the Program or Documentation, the government hereby agrees that this software or documentation qualifies as commercial computer software or commercial computer software documentation as such terms are used or defined in FAR 12.212, DFARS Part 227.72, and DFARS 252.227-7014. Accordingly, the terms and conditions of this Agreement and only those rights specified in this Agreement, shall pertain to and govern the use, modification, reproduction, release, performance, display, and disclosure of the Program and Documentation by the federal government (or other entity acquiring for or through the federal government) and shall supersede any conflicting contractual terms or conditions. If this License fails to meet the government’s needs or is inconsistent in any respect with federal procurement law, the government agrees to return the Program and Documentation, unused, to The MathWorks, Inc.
Trademarks
MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See
www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand
names may be trademarks or registered trademarks of their respective holders.
Patents
The MathWorks products are protected by one or more U.S. patents. Please see
www.mathworks.com/patents for more information.
Contents
Summary by Version
Version 3.5 (R2010a) Bioinformatics Toolbox Software
Version 3.4 (R2009b) Bioinformatics Toolbox Software
Version 3.3 (R2009a) Bioinformatics Toolbox Software
Version 3.2 (R2008b) Bioinformatics Toolbox Software
Version 3.1 (R2008a) Bioinformatics Toolbox Software
Version 3.0 (R2007b) Bioinformatics Toolbox Software
Version 2.6 (R2007a+) Bioinformatics Toolbox Software
Version 2.5 (R2007a) Bioinformatics Toolbox Software
Version 2.4 (R2006b) Bioinformatics Toolbox Software
Version 2.3 (R2006a+) Bioinformatics Toolbox Software
Version 2.2.1 (R2006a) Bioinformatics Toolbox Software
Version 2.2 (R14SP3+) Bioinformatics Toolbox Software
Version 2.1.1 (R14SP3) Bioinformatics Toolbox Software
Version 2.1 (R14SP2+) Bioinformatics Toolbox Software
Version 2.0.1 (R14SP2) Bioinformatics Toolbox Software
Version 2.0 (R14SP1+) Bioinformatics Toolbox Software
Version 1.1.1 (R14SP1) Bioinformatics Toolbox Software
Version 1.1 (R14) Bioinformatics Toolbox Software
Version 1.0 (R13+) Bioinformatics Toolbox Software
Compatibility Summary for Bioinformatics Toolbox Software
Summary by Version
This table provides quick access to what’s new in each version. For clarification, see “Using Release Notes”.

Version (Release) | New Features and Changes | Version Compatibility Considerations | Fixed Bugs and Known Problems | Related Documentation at Web Site
Latest Version V3.5 (R2010a) | Yes Details | Yes Summary | Bug Reports Includes fixes | Printable Release Notes: PDF; Current product documentation
V3.4 (R2009b) | Yes Details | Yes Summary | Bug Reports Includes fixes | No
V3.3 (R2009a) | Yes Details | Yes Summary | Bug Reports Includes fixes | No
V3.2 (R2008b) | Yes Details | Yes Summary | Bug Reports Includes fixes | No
V3.1 (R2008a) | Yes Details | Yes Summary | Bug Reports Includes fixes | No
V3.0 (R2007b) | Yes Details | Yes Summary | Bug Reports Includes fixes | No
V2.6 (R2007a+) | Yes Details | Yes Summary | Bug Reports Includes fixes | No
V2.5 (R2007a) | Yes Details | Yes Summary | Bug Reports Includes fixes | No
V2.4 (R2006b) | Yes Details | No | Bug Reports Includes fixes | No
V2.3 (R2006a+) | Yes Details | No | Bug Reports Includes fixes | No
V2.2.1 (R2006a) | No | No | Bug Reports Includes fixes | No
V2.2 (R14SP3+) | Yes Details | No | Bug Reports Includes fixes | No
V2.1.1 (R14SP3) | No | No | Bug Reports Includes fixes | No
V2.1 (R14SP2+) | Yes Details | No | Bug Reports Includes fixes | No
V2.0.1 (R14SP2) | Yes Details | No | Bug Reports Includes fixes | No
V2.0 (R14SP1+) | Yes Details | No | No bug fixes | No
V1.1.1 (R14SP1) | No | No | No bug fixes | No
V1.1 (R14) | Yes Details | No | No bug fixes | No
V1.0 (R13+) | Yes Details | No | No bug fixes | V1.0 product documentation
Using Release Notes
Use release notes when upgrading to a newer version to learn about:
New features
Changes
Potential impact on your existing files and practices
Review the release notes for other MathWorks™ products required for this product (for example, MATLAB® or Simulink®). Determine if enhancements, bugs, or compatibility considerations in other products impact you.
If you are upgrading from a software version other than the most recent one, review the current release notes and all interim versions. For example, when you upgrade from V1.0 to V1.2, review the release notes for V1.1 and V1.2.
What Is in the Release Notes
New Features and Changes
New functionality
Changes to existing functionality
Version Compatibility Considerations
When a new feature or change introduces a reported incompatibility between versions, the Compatibility Considerations subsection explains the impact.
Compatibility issues reported after the product release appear under Bug Reports at The MathWorks™ Web site. Bug fixes can sometimes result in incompatibilities, so review the fixed bugs in Bug Reports for any compatibility impact.
Fixed Bugs and Known Problems
The MathWorks offers a user-searchable Bug Reports database so you can view Bug Reports. The development team updates this database at release time and as more information becomes available. Bug Reports include provisions for any known workarounds or file replacements. Information is available for bugs existing in or fixed in Release 14SP2 or later. Information is not available for all bugs in earlier releases.
Access Bug Reports using your MathWorks Account.
Version 3.5 (R2010a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.5 (R2010a):

New Features and Changes | Version Compatibility Considerations | Fixed Bugs and Known Problems | Related Documentation at Web Site
Yes, details below | Yes. Details labeled as Compatibility Considerations, below. See also Summary. | Bug Reports Includes fixes | Printable Release Notes: PDF; Current product documentation

New and updated features in this version include:
“Data Format and Database Functions”
“Pairwise Sequence Alignment Functions”
“Multiple Sequence Alignment Functions”
“Phylogenetic Tree Tools and Methods”
“BioIndexedFile Function, Object, Methods, and Properties”
“BioRead Function, Object, Methods, and Properties”
“BioMap Function, Object, Methods, and Properties”
“Function Elements Being Removed”
Data Format and Database Functions
The following functions are new:
saminfo — Return information about Sequence Alignment/Map (SAM) file.
samread — Read data from Sequence Alignment/Map (SAM) file.
The following functions are updated:
fastaread — Read data from FASTA file. Updated to allow trimming of the headers in the output structure by addition of TrimHeaders property.
fastqread — Read data from FASTQ file. Updated to allow trimming of the headers in the output structure by addition of TrimHeaders property.
phytreeread — Read phylogenetic tree file. Updated to return a second output containing bootstrap values for tree nodes.
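The TrimHeaders property described above can be tried with a minimal sketch such as the following. The file name and header text are hypothetical, and the exact trimming rule is an assumption here (later fastaread documentation describes trimming at the first white space); check the reference page for your release.

```matlab
% Write a tiny FASTA file so the sketch is self-contained.
% The file name and header text are made up for illustration.
fastaFile = fullfile(tempdir, 'trimheaders_example.fasta');
fid = fopen(fastaFile, 'w');
fprintf(fid, '>seq1 example description text\nACGTACGTAC\n');
fclose(fid);

% Default behavior: the header is returned unchanged.
full = fastaread(fastaFile);

% With TrimHeaders set to true, the header is shortened
% (assumed: cut at the first white space).
trimmed = fastaread(fastaFile, 'TrimHeaders', true);

disp(full.Header)
disp(trimmed.Header)
```

Trimmed headers are convenient when the header is later used as a lookup key or a label, because database description text is dropped.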
Pairwise Sequence Alignment Functions
The following function is updated:
fastaread — Read data from FASTA file. Updated to allow trimming of the headers in the output structure by addition of TrimHeaders property.
Multiple Sequence Alignment Functions
The following functions are updated:
fastaread — Read data from FASTA file. Updated to allow trimming of the headers in the output structure by addition of TrimHeaders property.
multialign — Align multiple sequences using progressive method. Updated to include a new property, 'UseParallel', which lets you use parfor-loops and compute in parallel mode.
seqpdist — Calculate pairwise distance between sequences. Updated to include a new property, 'UseParallel', which lets you use parfor-loops and compute in parallel mode.
Compatibility Considerations
In Bioinformatics Toolbox™ Version 3.4 and earlier, the multialign and seqpdist functions included 'JobManager' and 'WaitInQueue' property name/property value pairs, which let you process in parallel, including support for the MATLAB scheduler for clusters.
In Bioinformatics Toolbox Version 3.5, the multialign and seqpdist functions do not include the 'JobManager' and 'WaitInQueue' property name/property value pairs. Instead they include the 'UseParallel' property name/property value pair, which lets you process in parallel, including support for:
Local workers for multicore machines
The MATLAB scheduler for clusters
Third-party schedulers for clusters
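The change described above might look as follows in existing code. This is a sketch, not a definitive migration recipe: the three short sequences are illustrative, the variable jm in the comment stands for a hypothetical job manager object, and running in parallel is assumed to require Parallel Computing Toolbox with an open pool of workers (without one, the calls are expected to run serially).

```matlab
% Illustrative input: a cell array of short nucleotide sequences.
seqs = {'CACGTAACATCTC', 'ACGACGTAACATCTTCT', 'AAACGTAACATCTCGC'};

% Version 3.4 and earlier style (now warns), shown only as a comment;
% jm is a hypothetical job manager object:
%   ma = multialign(seqs, 'JobManager', jm, 'WaitInQueue', true);

% Version 3.5 style: request parfor-based computation instead.
ma = multialign(seqs, 'UseParallel', true);

% The same property applies to seqpdist.
d = seqpdist(seqs, 'Alphabet', 'NT', 'UseParallel', true);

disp(ma)
disp(d)
```

Because 'UseParallel' only states that parallel execution is wanted, the same code can run on local workers or on a cluster, depending on how the worker pool is configured.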
Phylogenetic Tree Tools and Methods
The following functions are updated:
phytreeread — Read phylogenetic tree file. Updated to return a second output containing bootstrap values for tree nodes.
seqpdist — Calculate pairwise distance between sequences. Updated to include a new property, 'UseParallel', which lets you use parfor-loops and compute in parallel mode.
BioIndexedFile Function, Object, Methods, and Properties
Following is a new class for an object that lets you extract information from large multi-entry text files.
BioIndexedFile — Allow quick and efficient access to large text file with nonuniform-size entries.
This class has properties and methods that are useful for accessing, reading, and parsing data from a large source file.
BioRead Function, Object, Methods, and Properties
Following is a new class for an object that contains data from short-read sequences, including sequence headers, nucleotide sequences, and the quality scores for the sequences.
BioRead — Contain sequence and quality data.
This class has properties and methods that you can use to explore, access, filter, and manipulate all or a subset of the data, before doing subsequent analyses or sequence alignment and mapping.
BioMap Function, Object, Methods, and Properties
Following is a new class for an object that contains data from short-read sequences, including sequence headers, read sequences, quality scores for the sequences, and data about alignment and mapping to a single reference sequence.
BioMap — Contain sequence, quality, alignment, and mapping data.
This class has properties and methods that you can use to explore, access, filter, and manipulate all or a subset of the data, before doing subsequent analyses or viewing the data.
Function Elements Being Removed

Function Element Name | What Happens When You Use This Function Element | Use This Instead | Compatibility Considerations
'JobManager' property name/property value pair as input to multialign and seqpdist functions | Warns | 'UseParallel' property name/property value pair as input to multialign and seqpdist functions | See the Compatibility Considerations subheading in “Multiple Sequence Alignment Functions”.
'WaitInQueue' property name/property value pair as input to multialign and seqpdist functions | Warns | 'UseParallel' property name/property value pair as input to multialign and seqpdist functions | See the Compatibility Considerations subheading in “Multiple Sequence Alignment Functions”.
Version 3.4 (R2009b) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.4 (R2009b):

New Features and Changes | Version Compatibility Considerations | Fixed Bugs and Known Problems | Related Documentation at Web Site
Yes, details below | Yes. Details labeled as Compatibility Considerations, below. See also Summary. | Bug Reports Includes fixes | No

New and updated features in this version include:
“Data Format and Database Functions”
“Protein Analysis Functions”
“Data Visualization Functions”
“Sequence Statistics Functions”
“Sequence Utility Functions”
“Sequence Visualization Functions”
“Pairwise Sequence Alignment Functions”
“Multiple Sequence Alignment Functions”
“Phylogenetic Tree Tools and Methods”
“Clustergram Window”
“Clustergram Methods and Properties”
“HeatMap Object, Methods, and Properties”
“DataMatrix Methods”
“Microarray Functions, Objects, Methods, and Properties”
“Mass Spectrometry Functions”
“Demos for Sequence Analysis”
“Demos for Microarray Analysis”
Data Format and Database Functions
Following are new functions:
fastainfo — Return information about FASTA file.
fastqinfo — Return information about FASTQ file.
fastqread — Read data from FASTQ file.
fastqwrite — Write to file using FASTQ format.
sffinfo — Return information about SFF file.
sffread — Read data from SFF file.
tgspcinfo — Return information about SPC file.
tgspcread — Read data from SPC file.
The following functions are updated:
affyread — Read microarray data from Affymetrix® GeneChip® file. Updated to read cell layout files (CLF) and background probe (BGP) files.
multialignwrite — Write multiple alignment to file. Updated to write a
file in either ClustalW ALN format (default) or MSF format.
Protein Analysis Functions
Following is a new function:
isotopicdist — Calculate high-resolution isotope mass distribution and
density function.
The following function is updated:
cleave — Cleave amino acid sequence with enzyme. Updated to let you specify an exception to the enzyme’s cleavage rule and to let you specify a maximum number of missed cleavage sites. Also updated to return the number of missed cleavage sites per peptide fragment.
Data Visualization Functions
The following functions are updated:
microplateplot — Display visualization of microtiter plate. Display updated so that first row of input matrix appears at the top and is labeled row A. Updated to return the handle to the axes of the plot, which lets you reverse the order of the rows or columns in the display. Updated to include a new property, 'TextFontSize', which lets you control the font size of text labels.
multialignviewer — Display and interactively adjust multiple sequence alignment. Updated to accept a list of names to label the sequences in the Multiple Sequence Alignment Viewer window.
showalignment — Display color-coded sequence alignment. Updated to control the inclusion or exclusion of terminal gaps from the count of matches and similar residues when displaying a pairwise alignment.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.3, the default layout for the plot returned by microplateplot displayed the first row of the input matrix at the bottom.
In Bioinformatics Toolbox Version 3.4, the plot displays the first row of the input matrix at the top.
Sequence Statistics Functions
The following function is updated:
seqshowwords — Graphically display words in sequence. Updated to search for multiple words in a sequence.
Sequence Utility Functions
The following functions are updated:
cleave — Cleave amino acid sequence with enzyme. Updated to let you specify an exception to the enzyme’s cleavage rule and to let you specify a maximum number of missed cleavage sites. Also updated to return the number of missed cleavage sites per peptide fragment.
rebasecuts — Find restriction enzymes that cut nucleotide sequence. Updated to use Version 904 of REBASE®, the Restriction Enzyme Database.
restrict — Split nucleotide sequence at restriction site. Updated to use Version 904 of REBASE, the Restriction Enzyme Database.
Sequence Visualization Functions
The following functions are updated:
multialignviewer — Display and interactively adjust multiple sequence alignment. Updated to accept a list of names to label the sequences in the Multiple Sequence Alignment Viewer window.
showalignment — Display color-coded sequence alignment. Updated to control the inclusion or exclusion of terminal gaps from the count of matches and similar residues when displaying a pairwise alignment.
Pairwise Sequence Alignment Functions
Following is a new function:
localalign — Return local optimal and suboptimal alignments between
two sequences.
The following functions are updated:
multialignviewer — Display and interactively adjust multiple sequence alignment. Updated to accept a list of names to label the sequences in the Multiple Sequence Alignment Viewer window.
showalignment — Display color-coded sequence alignment. Updated to control the inclusion or exclusion of terminal gaps from the count of matches and similar residues when displaying a pairwise alignment.
Multiple Sequence Alignment Functions
The following functions are updated:
multialignviewer — Display and interactively adjust multiple sequence alignment. Updated to accept a list of names to label the sequences in the Multiple Sequence Alignment Viewer window.
multialignwrite — Write multiple alignment to file. Updated to write a
file in either ClustalW ALN format (default) or MSF format.
showalignment — Display color-coded sequence alignment. Updated
to control the inclusion or exclusion of terminal gaps from the count of matches and similar residues when displaying a pairwise alignment.
Phylogenetic Tree Tools and Methods
The Phylogenetic Tree Tool includes the following updates:
Includes two new circular print renderings: equal angle and equal daylight
Updates to Tools menu, including commands to select specific branch and
leaf nodes based on different criteria, such as distance, common ancestors, leaves only, and descendants.
Following is a new method:
cluster — Validate clusters in phylogenetic tree.
The following method is updated:
plot — Draw phylogenetic tree. Updated to include two new algorithms for circular layouts: equal angle and equal daylight. Updated to let you rotate circular trees from 0 through 360 degrees and to rotate leaf labels of circular trees so that the text is aligned to the root node. Updated the 'LeafLabels' property so that it defaults to true for circular layouts and to false for square and angular layouts.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.3, the 'LeafLabels' property defaulted to true when the 'Type' property was 'square' or 'angular', and to false when the 'Type' property was 'radial'.
In Bioinformatics Toolbox Version 3.4, the 'LeafLabels' property defaults to false when the 'Type' property is 'square' or 'angular', and to true when the 'Type' property is 'radial'.
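Code that relied on the old 'LeafLabels' defaults can avoid the change by setting the property explicitly. The following is a minimal sketch; it assumes that the sample tree file pf00002.tree used in the toolbox examples is on the MATLAB path.

```matlab
% Load a sample phylogenetic tree (assumed to ship with the toolbox).
tr = phytreeread('pf00002.tree');

% Reproduce the Version 3.3 defaults by stating them explicitly,
% so the result does not depend on the toolbox version.
plot(tr, 'Type', 'square', 'LeafLabels', true);   % labels shown
plot(tr, 'Type', 'radial', 'LeafLabels', false);  % labels hidden
```

Setting the property in every call makes plotting scripts behave the same way before and after the upgrade.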
Clustergram Window
The Clustergram window has two new toolbar buttons:
Annotate button — Shows and hides intensity values for each area of the heat map.
Show Dendrogram button — Shows and hides the dendrograms.
Clustergram Methods and Properties
The following are new methods of a clustergram object:
addTitle — Add title to clustergram.
addXLabel — Label x-axis of clustergram.
addYLabel — Label y-axis of clustergram.
clusterGroup — Select cluster group.
The following properties of a clustergram object are renamed:
ColumnMarker is now ColumnGroupMarker.
Impute is now ImputeFun.
Ratio is now DisplayRatio.
RowMarker is now RowGroupMarker.
SymmetricRange is now Symmetric.
Note The former property names are still valid.
Following is a new property related to the display of dendrogram tree diagrams in a clustergram object:
ShowDendrogram
The following are new properties related to the display of row and column labels of a clustergram object:
RowLabels
ColumnLabels
RowLabelsLocation
ColumnLabelsLocation
RowLabelsColor
ColumnLabelsColor
LabelsWithMarkers
RowLabelsRotate
ColumnLabelsRotate
The following are new properties related to annotating data in a clustergram object:
Annotate
AnnotColor
AnnotPrecision
When using clustergram properties with the get and set methods, the property names are now case sensitive.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.3, the property names of a clustergram object were not case sensitive when used with the get and set methods.
In Bioinformatics Toolbox Version 3.4, property names of a clustergram object are case sensitive.
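The case-sensitivity change can be illustrated with a short sketch. The random input matrix is made up for illustration, and the failure of the lowercase name is an assumption based on the note above, so that call is wrapped in try/catch to keep the example running either way.

```matlab
% Illustrative data: 8 rows (for example, genes) by 4 columns (samples).
data = rand(8, 4);
cgo = clustergram(data);

% Exact capitalization of the property name is expected to work.
ratio = get(cgo, 'DisplayRatio');
disp(ratio)

% A differently cased name is expected to fail in Version 3.4
% (assumption based on the release note above).
try
    get(cgo, 'displayratio');
catch err
    disp(err.message)
end
```

Note that DisplayRatio is the new name of the former Ratio property; according to the note above, the former names remain valid.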
HeatMap Object, Methods, and Properties
Following is a new object:
HeatMap object — Object containing matrix and heat map display properties.
The following are methods of a HeatMap object:
addTitle — Add title to heat map.
addXLabel — Label x-axis of heat map.
addYLabel — Label y-axis of heat map.
plot — Render heat map for object.
view — Render heat map for object.
A HeatMap object includes many properties that control the creation of the heat map, row and column labels, axes labels, title, and data annotation.
DataMatrix Methods
Following is a new method of a DataMatrix object:
dmwrite — Write DataMatrix object to text file.
Microarray Functions, Objects, Methods, and Properties
Following are new functions to create objects containing data from a microarray gene expression experiment:
bioma.ExpressionSet — Contain data from microarray gene expression
experiment.
bioma.data.ExptData — Contain expression data from microarray gene
expression experiment.
bioma.data.MetaData — Contain sample or feature metadata from microarray gene expression experiment.
bioma.data.MIAME — Contain experiment information from microarray gene expression experiment.
These objects have properties and methods that are useful for viewing and analyzing the data or a subset of the data.
Mass Spectrometry Functions
Following are new functions:
isotopicdist — Calculate high-resolution isotope mass distribution and
density function.
tgspcinfo — Return information about SPC file.
tgspcread — Read data from SPC file.
The following function is updated:
mspeaks — Convert raw peak data to peak list (centroided data). Updated to include a new property, 'Style', which lets you specify the style for marking the peaks in the plot.
Demos for Sequence Analysis
Following are two new sequence analysis demos:
Working with SFF Files from the 454 Genome Sequencer FLX System
Working with Illumina/Solexa Next-Generation Sequencing Data
Demos for Microarray Analysis
Following are two new microarray analysis demos:
Working with Objects for Microarray Experiment Data
Analyzing Illumina Bead Summary Gene Expression Data
Version 3.3 (R2009a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.3 (R2009a):

New Features and Changes | Version Compatibility Considerations | Fixed Bugs and Known Problems | Related Documentation at Web Site
Yes, details below | Yes. Details labeled as Compatibility Considerations, below. See also Summary. | Bug Reports Includes fixes | No

New and updated features in this version include:
“Data Visualization Functions”
“Sequence Utility Functions”
“Sequence Conversion Functions”
“Bioanalytic and Mass Spectrometry Functions”
“Microarray Functions”
“Demo for Sequence Analysis”
Data Visualization Functions
Following is a new function:
microplateplot — Display visualization of microtiter plate.
Sequence Utility Functions
The following functions are updated:
rebasecuts — Find restriction enzymes that cut nucleotide sequence.
Updated to use Version 811 of REBASE, the Restriction Enzyme Database.
restrict — Split nucleotide sequence at restriction site. Updated to use Version 811 of REBASE, the Restriction Enzyme Database.
Sequence Conversion Functions
The following function is updated:
nt2aa — Convert nucleotide sequence to amino acid sequence. Updated to include a new property, 'ACGTOnly', to support ambiguous and unknown nucleotide characters.
Bioanalytic and Mass Spectrometry Functions
The following functions are updated to use with data from any separation technique, including mass spectrometry:
msalign — Align peaks in signal to reference peaks.
msbackadj — Correct baseline of signal with peaks.
mslowess — Smooth signal with peaks using nonparametric method.
msnorm — Normalize set of signals with peaks.
mspeaks — Convert raw peak data to peak list (centroided data).
msppresample — Resample signal with peaks while preserving peaks.
msresample — Resample signal with peaks.
mssgolay — Smooth signal with peaks using least-squares polynomial.
Microarray Functions
The following functions are updated:
cghcbs — Perform circular binary segmentation (CBS) on array-based comparative genomic hybridization (aCGH) data. Updated to include an optional heuristic stopping rule to improve performance.
ilmnbslookup — Look up Illumina® BeadStudio™ target (probe) sequence and annotation information. Updated to read Illumina microRNA array annotation files.
ilmnbsread — Read gene expression data exported from Illumina BeadStudio software. Updated to read Illumina microRNA array data files.
mattest — Perform two-sample t-test to evaluate differential expression of genes from two experimental conditions or phenotypes. Updated with new property, 'VarType', which lets you specify equal or unequal (default) variance for the test.
Compatibility Considerations
A compatibility consideration related to the mattest function was introduced in Bioinformatics Toolbox Version 3.2, but not reported in the Release Notes for Version 3.2 (R2008b). Specifically, in Bioinformatics Toolbox Version 3.1 and earlier, the mattest function used equal variance for the test. In Bioinformatics Toolbox Version 3.2, the mattest function started using unequal variance for the test.
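To reproduce results obtained with Version 3.1 or earlier, the 'VarType' property can be set explicitly. The following sketch uses made-up random matrices in place of real expression data; rows stand for genes and columns for replicate samples.

```matlab
% Illustrative data only: 100 genes, 5 replicates per condition.
dataX = randn(100, 5);
dataY = randn(100, 5) + 0.5;

% Equal-variance test, matching the behavior of Version 3.1 and earlier.
pEqual = mattest(dataX, dataY, 'VarType', 'equal');

% Unequal-variance test, the default from Version 3.2 onward.
pUnequal = mattest(dataX, dataY, 'VarType', 'unequal');

% Compare how many genes fall below a 0.05 threshold in each case.
fprintf('equal: %d, unequal: %d\n', sum(pEqual < 0.05), sum(pUnequal < 0.05));
```

Stating the variance assumption in the call keeps an analysis reproducible across toolbox versions, whichever default is in effect.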
Demo for Sequence Analysis
The following is a new sequence analysis demo:
Predicting Protein Secondary Structure Using a Neural Network
Version 3.2 (R2008b) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.2 (R2008b):

New Features and Changes | Version Compatibility Considerations | Fixed Bugs and Known Problems | Related Documentation at Web Site
Yes, details below | Yes. Details labeled as Compatibility Considerations, below. See also Summary. | Bug Reports Includes fixes | No

New and updated features in this version include:
“Data Format and Database Functions”
“Sequence Utility Functions”
“Multiple Sequence Alignment Functions”
“Gene Ontology Functions”
“Protein Analysis Functions”
“Mass Spectrometry Functions”
“Microarray File Format Functions”
“Microarray Functions”
“DataMatrix Object”
“DataMatrix Methods”
“Demo for Visualization Tools”
“Demo for Sequence Analysis”
“Demos for Microarray Data Analysis”
Data Format and Database Functions
Following are new functions:
affygcrma — Perform GC Robust Multi-array Average (GCRMA) procedure
on Affymetrix microarray probe-level data.
affyrma — Perform Robust Multi-array Average (RMA) procedure on
Affymetrix microarray probe-level data.
affysnpannotread — Read Affymetrix Mapping DNA array data from
CSV-formatted annotation file.
geoseriesread — Read Gene Expression Omnibus (GEO) Series (GSE) format data.
multialignwrite — Write multiple-alignment to file using ClustalW ALN
format.
mzcdfread — Read mass spectrometry data from netCDF file.
The following functions are updated:
affyread — Read microarray data from Affymetrix GeneChip file. Updated so that Probes field in the return structure is now a single, which reduces memory usage.
celintensityread — Read probe intensities from Affymetrix CEL files. Updated so that PMIntensities and MMIntensities fields in the return structure are now singles, which reduces memory usage.
geosoftread — Read Gene Expression Omnibus (GEO) SOFT format data. Updated to support Platform (GPL) records.
getgeodata — Retrieve Gene Expression Omnibus (GEO) format data. Updated to support Platform (GPL) and Series (GSE) records.
goannotread — Read annotations from Gene Ontology annotated file. Updated to include two new properties, 'Fields' and 'Aspect', which let you read a subset of the data in the annotated file.
multialignread — Read multiple sequence alignment file. Updated to support PHYLIP (Phylogeny Inference Package) multiple-sequence alignment files.
mzxmlread — Read data from mzXML file. Improved to read larger files, faster and without running out of memory. Updated with three new properties, 'Levels', 'TimeRange', and 'ScanIndices', which let you filter and read a subset of the data. Updated with a 'Verbose' property to control the progress display while reading the file.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.1 and earlier, the Probes field, in the structure returned by affyread, and the PMIntensities and MMIntensities fields, in the structure returned by celintensityread, were doubles. In Bioinformatics Toolbox Version 3.2, these fields are singles.
Sequence Utility Functions
Following is a new function:
cleavelookup — Find cleavage rule for enzyme or compound.
The following functions are updated:
blastncbi — Create remote NCBI BLAST report request ID or link to NCBI BLAST report. Updated to include a 'GapCosts' property, which lets you specify penalties for both opening and extending gaps, and an 'Entrez' property, which lets you limit searches using Entrez query syntax.
cleave — Cleave amino acid sequence with enzyme. Includes a new input
argument that specifies the name of an enzyme or compound for which a cleavage rule is specified in the literature.
rebasecuts — Find restriction enzymes that cut nucleotide sequence.
Updated to use Version 806 of REBASE, the Restriction Enzyme Database.
restrict — Split nucleotide sequence at restriction site. Updated to use Version 806 of REBASE, the Restriction Enzyme Database.
seqlogo — Display sequence logo for nucleotide or amino acid sequences.
Updated to return a figure handle to the sequence logo.
Multiple Sequence Alignment Functions
Following is a new function:
multialignwrite — Write multiple alignment to file using ClustalW ALN format.
The following function is updated:
multialignread — Read multiple sequence alignment file. Updated
to support PHYLIP (Phylogeny Inference Package) multiple sequence alignment files.
Gene Ontology Functions
The following function is updated:
goannotread — Read annotations from Gene Ontology annotated file.
Updated to include two new properties, 'Fields' and 'Aspect', which let you read a subset of the data in the annotated file.
Protein Analysis Functions
Following are new functions:
cleavelookup — Find cleavage rule for enzyme or compound.
pdbsuperpose — Superpose 3-D structures of two proteins.
pdbtransform — Apply linear transformation to 3-D structure of molecule.
The following function is updated:
cleave — Cleave amino acid sequence with enzyme. Includes a new input
argument that specifies the name of an enzyme or compound for which a cleavage rule is specified in the literature.
Mass Spectrometry Functions
Following are new functions:
mzcdf2peaks — Convert mzCDF structure to peak list.
mzcdfinfo — Return information about netCDF file containing mass
spectrometry data.
mzcdfread — Read mass spectrometry data from netCDF file.
mzxmlinfo — Return information about mzXML file.
The following function is updated:
mzxmlread — Read data from mzXML file. Improved to read larger files, faster and without running out of memory. Updated with three new properties, 'Levels', 'TimeRange', and 'ScanIndices', which let you filter and read a subset of the data. Updated with a 'Verbose' property to control the progress display while reading the file.
Microarray File Format Functions
Following are new functions:
affygcrma — Perform GC Robust Multi-array Average (GCRMA) procedure on Affymetrix microarray probe-level data.
affyrma — Perform Robust Multi-array Average (RMA) procedure on Affymetrix microarray probe-level data.
affysnpannotread — Read Affymetrix Mapping DNA array data from CSV-formatted annotation file.
geoseriesread — Read Gene Expression Omnibus (GEO) Series (GSE) format data.
The following functions are updated:
affyread — Read microarray data from Affymetrix GeneChip file. Updated so that the Probes field in the return structure is now a single, which reduces memory usage.
celintensityread — Read probe intensities from Affymetrix CEL files. Updated so that PMIntensities and MMIntensities fields in the return structure are now singles, which reduces memory usage.
geosoftread — Read Gene Expression Omnibus (GEO) SOFT format data. Updated to support Platform (GPL) records.
getgeodata — Retrieve Gene Expression Omnibus (GEO) format data.
Updated to support Platform (GPL) and Series (GSE) records.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.1 and earlier, the Probes field, in the structure returned by affyread, and the PMIntensities and MMIntensities fields, in the structure returned by celintensityread, were doubles. In Bioinformatics Toolbox Version 3.2, these fields are singles.
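The memory effect of moving from doubles to singles can be illustrated outside MATLAB. The following is a minimal Python sketch, not toolbox code: it assumes a hypothetical list of probe intensities and uses the standard-library array module, where typecode 'd' stores 64-bit values and 'f' stores 32-bit values. It also shows the trade-off, which is that single precision cannot represent every double exactly.

```python
from array import array


def bytes_needed(values, typecode):
    """Return the buffer size in bytes when values are stored with typecode."""
    buf = array(typecode, values)
    return buf.itemsize * len(buf)


# Hypothetical probe intensities, used only for illustration.
intensities = [float(i) for i in range(1000)]

double_bytes = bytes_needed(intensities, "d")  # double precision storage
single_bytes = bytes_needed(intensities, "f")  # single precision storage
print("double:", double_bytes, "bytes; single:", single_bytes, "bytes")

# Precision trade-off: 0.1 stored as a single no longer equals the double 0.1.
round_tripped = array("f", [0.1])[0]
print("0.1 survives single storage unchanged:", round_tripped == 0.1)
```

Scripts that compare these fields for exact equality against double values, or that rely on double-precision accumulation, may therefore need an explicit conversion back to double.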
Microarray Functions
Following are new functions:
affysnpintensitysplit — Split Affymetrix SNP probe intensity
information for alleles A and B.
affygcrma — Perform GC Robust Multi-array Average (GCRMA) procedure on Affymetrix microarray probe-level data.
affyrma — Perform Robust Multi-array Average (RMA) procedure on
Affymetrix microarray probe-level data.
DataMatrix — Create DataMatrix object.
The following functions are updated:
ilmnbslookup — Look up Illumina BeadStudio target (probe) sequence and
annotation information. Updated to support BGX and TXT annotation files.
mattest — Perform two-sample t-test to evaluate differential expression
of genes from two experimental conditions or phenotypes. Updated to use unequal variance instead of equal variance for the test.
probesetlookup — Look up information for Affymetrix probe set. Updated
to accept multiple probe set IDs/names or gene IDs.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.1 and earlier, the mattest function used equal variance for the test. In Bioinformatics Toolbox Version 3.2, the
mattest function uses unequal variance for the test.
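To see why this change can alter results, the sketch below contrasts an equal-variance (pooled) t statistic with an unequal-variance one. It is a hedged illustration in Python, not the mattest implementation: the notes only say "unequal variance", and the sketch assumes the standard Welch formulation with the Welch-Satterthwaite degrees of freedom. The sample values are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances (pooled estimate)."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))


def welch_t(x, y):
    """Two-sample t statistic and degrees of freedom without assuming equal variances."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Hypothetical expression values for one gene under two conditions.
cond_a = [1.0, 2.0, 3.0]
cond_b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

t_pooled = pooled_t(cond_a, cond_b)
t_welch, df_welch = welch_t(cond_a, cond_b)
print("pooled t:", t_pooled)
print("Welch t:", t_welch, "df:", df_welch)
```

When the two groups have the same size the two statistics coincide and only the degrees of freedom differ. With unequal group sizes and variances, as in the example, both the statistic and the resulting p-value can change.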
DataMatrix Object
Following is a new object:
DataMatrix object — Data structure encapsulating data and metadata from microarray experiment so that it can be indexed by gene or probe identifiers and by sample identifiers.
DataMatrix Methods
There are many methods that let you create, index into, modify, create subsets, sort, perform operations on, analyze, and plot a DataMatrix object.
Demo for Visualization Tools
The Visualizing the Three-Dimensional Structure of a Molecule demo is updated to use the new pdbsuperpose function.
Demo for Sequence Analysis
The following is a new sequence analysis demo:
Analyzing the Human Distal Gut Microbiome
Demos for Microarray Data Analysis
Following is a new microarray data analysis demo:
Working with GEO Series Data
The Exploring Gene Expression Data demo is updated to use the new DataMatrix object.
The Analyzing Affymetrix SNP Arrays for DNA Copy Number Variants demo is updated to use two new functions: affysnpannotread and affysnpintensitysplit.
The Preprocessing Affymetrix Microarray Data at the Probe Level demo is updated to use two new functions: affygcrma and affyrma.
Version 3.1 (R2008a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.1 (R2008a):
New Features and Changes: Yes. Details below.
Version Compatibility Considerations: Yes. Details labeled as Compatibility Considerations, below. See also Summary.
Fixed Bugs and Known Problems: Bug Reports. Includes fixes.
Related Documentation at Web Site: No
New and updated features in this version include:
“Data Format and Database Functions” on page 27
“Sequence Utility Functions” on page 28
“Pairwise Sequence Alignment Functions” on page 29
“Phylogenetic Tree Tools Function” on page 29
“Protein Analysis Functions” on page 29
“Microarray File Format Functions” on page 30
“Microarray Functions” on page 30
“Object” on page 32
“Clustergram Methods” on page 32
“Demo for Sequence Analysis” on page 33
“Demo for Microarray Data Analysis” on page 33
“Demo for Visualization Tools” on page 33
“Demos for Mass Spectrometry Data Analysis” on page 33
Data Format and Database Functions
Following is a new function:
ilmnbsread — Read microarray data exported from Illumina BeadStudio
software.
The following functions are updated:
celintensityread — Read probe intensities from Affymetrix CEL files. Updated output structure to include a new field, GroupNumbers, which contains group numbers of probes.
fastawrite — Write to file using FASTA format. Updated such that if you specify an existing file, new data is appended to the file instead of overwriting it.
getgenbank — Retrieve sequence information from GenBank® database. Updated such that if you use the 'ToFile' property and specify an existing file, new data is appended to the file instead of overwriting it. Updated to allow you to access a partial sequence by adding new property 'PartialSeq'.
getgenpept — Retrieve sequence information from GenPept database. Updated such that if you use the 'ToFile' property and specify an existing file, new data is appended to the file instead of overwriting it. Updated to allow you to access a partial sequence by adding new property 'PartialSeq'.
getgeodata — Retrieve Gene Expression Omnibus (GEO) SOFT format data. Updated to retrieve both Sample (GSM) and Data Set (GDS) data.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.0 and earlier, when writing to files using the fastawrite function or the getgenbank or getgenpept functions with the 'ToFile' property, if you specified an existing file, the file was overwritten. In Bioinformatics Toolbox Version 3.1, if you specify an existing file, new data is appended to the file instead of overwriting it.
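The practical consequence of the overwrite-to-append change is that repeated writes to the same file name now accumulate records. The following Python sketch mimics the two behaviors with ordinary file modes. It is only an analogy under stated assumptions: write_fasta and count_records are hypothetical helpers, not toolbox functions, and the record layout is a bare-bones FASTA form.

```python
import os
import tempfile


def write_fasta(path, header, sequence, append=True):
    """Write one FASTA record; append=True mimics the Version 3.1 behavior."""
    mode = "a" if append else "w"
    with open(path, mode) as fh:
        fh.write(f">{header}\n{sequence}\n")


def count_records(path):
    """Count FASTA records by counting header lines."""
    with open(path) as fh:
        return sum(1 for line in fh if line.startswith(">"))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.fa")
    write_fasta(path, "seq1", "ACGT")
    write_fasta(path, "seq2", "GGCC")  # appended, as in Version 3.1
    appended_count = count_records(path)
    write_fasta(path, "seq3", "TTAA", append=False)  # overwritten, as in Version 3.0
    overwritten_count = count_records(path)

print("records after two appending writes:", appended_count)
print("records after one overwriting write:", overwritten_count)
```

A script that relied on the old behavior to start from a clean file on each run would need to delete the file first to avoid collecting duplicate records.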
Sequence Utility Functions
The following functions are updated:
evalrasmolscript — Send RasMol script commands to Molecule Viewer window. Updated to use Version 11.4 of the Jmol molecule viewer.
molviewer — Display and manipulate 3-D molecule structure. Updated
to use Version 11.4 of the Jmol molecule viewer.
ramachandran — Draw Ramachandran plot for Protein Data Bank (PDB) data. Updated to handle PDB files with multiple chains and models by adding three properties: 'Chain', 'Plot', and 'Model'. Updated Ramachandran plot to mark glycine residues and display reference regions by adding three properties: 'Glycine', 'Regions', and 'RegionDef'. Updated Ramachandran plot to display amino acid information in ToolTip. Updated to easily determine the names and sequence positions of amino acids corresponding to torsion angles by creating an output structure.
rebasecuts — Find restriction enzymes that cut nucleotide sequence. Updated to use Version 710 of REBASE, the Restriction Enzyme Database.
restrict — Split nucleotide sequence at restriction site. Updated to use Version 710 of REBASE, the Restriction Enzyme Database.
Pairwise Sequence Alignment Functions
The following functions are updated:
nwalign — Globally align two sequences using Needleman-Wunsch
algorithm. Updated to improve pairwise sequence performance.
swalign — Locally align two sequences using Smith-Waterman algorithm.
Updated to improve pairwise sequence performance.
Phylogenetic Tree Tools Function
The following function is updated:
dnds — Estimate synonymous and nonsynonymous substitution rates.
Updated by adding 'AdjustStops' property to control whether stop codons are excluded from calculations.
Protein Analysis Functions
The following functions are updated:
evalrasmolscript — Send RasMol script commands to Molecule Viewer window. Updated to use Version 11.4 of the Jmol molecule viewer.
molviewer — Display and manipulate 3-D molecule structure. Updated to use Version 11.4 of the Jmol molecule viewer.
ramachandran — Draw Ramachandran plot for Protein Data Bank (PDB) data. Updated to handle PDB files with multiple chains and models by adding three properties: 'Chain', 'Plot', and 'Model'. Updated Ramachandran plot to mark glycine residues and display reference regions by adding three properties: 'Glycine', 'Regions', and 'RegionDef'. Updated Ramachandran plot to display amino acid information in ToolTip. Updated to easily determine the names and sequence positions of amino acids by creating an output structure.
Microarray File Format Functions
Following is a new function:
ilmnbsread — Read microarray data exported from Illumina BeadStudio software.
The following functions are updated:
celintensityread — Read probe intensities from Affymetrix CEL files. Updated output structure to include a new field, GroupNumbers, which contains group numbers of probes.
getgeodata — Retrieve Gene Expression Omnibus (GEO) SOFT format data. Updated to retrieve both Sample (GSM) and Data Set (GDS) data.
Microarray Functions
Following are new functions:
affysnpquartets — Create table of SNP probe quartet results for Affymetrix probe set.
cghfreqplot — Display frequency of DNA copy number alterations across multiple samples.
ilmnbslookup — Look up Illumina BeadStudio target (probe) sequence and annotation information.
redbluecmap — Create red and blue color map.
The following functions are updated:
clustergram — Compute hierarchical clustering, display dendrogram and heat map, and create clustergram object.
Updated properties include:
- 'Linkage' — Can specify linkage method separately for rows and
columns.
- 'Dendrogram' — Can specify color threshold separately for rows and
columns.
Replaced properties include:
- 'Dimension' — Replaced by the 'Cluster' property, which lets you cluster along the columns, rows, or both.
- 'Pdist' — Replaced by 'RowPdist' and 'ColumnPdist' properties.
New properties include:
- 'Standardize' — Specifies the dimension for standardizing the data.
- 'DisplayRange' — Specifies the display range of standardized values.
- 'LogTrans' — Controls the log2 transform of the data.
- 'Impute' — Specifies a function and properties to impute missing data.
- 'RowMarker' — Adds color and text marker to a group of rows.
- 'ColumnMarker' — Adds color and text marker to a group of columns.
The interactivity of the clustergram figure is enhanced with the following features:
- Select a group of rows or columns and display the group number and
genes or samples within.
- Create a new clustergram of only a group of the data.
- Export data as a clustergram object or structure in the MATLAB
Workspace.
maboxplot — Create box plot for microarray data. Updated by adding 'BoxPlot' property, which lets you specify arguments to pass to the boxplot function, which creates the box plot.
mairplot — Create intensity versus ratio scatter plot of microarray data. Updated by adding 'PlotOnly' property, which lets you display the scatter plot without user interface components.
mattest — Perform two-sample t-test to evaluate differential expression of genes from two experimental conditions or phenotypes. Updated by adding 'Bootstrap' property to run bootstrap tests.
mavolcanoplot — Create significance versus gene expression ratio (fold change) scatter plot of microarray data. Updated by adding 'PlotOnly' property, which lets you display the volcano plot without user interface components.
probesetvalues — Create table of Affymetrix probe set intensity values. Updated by adding 'Background' property to control the background correction.
zonebackadj — Perform background adjustment on Affymetrix microarray probe-level data using zone-based method. Updated to return a third output containing the estimated background values for each probe.
Compatibility Considerations
In Bioinformatics Toolbox Version 3.0 and earlier, the clustergram function included 'Dimension' and 'Pdist' properties. In Bioinformatics Toolbox Version 3.1, the 'Dimension' property is replaced by the 'Cluster' property, and the 'Pdist' property is replaced by the 'RowPdist' and 'ColumnPdist' properties.
Object
Following is a new object:
clustergram object — Object containing hierarchical clustering analysis
data.
Clustergram Methods
The following are new methods of a clustergram object:
get — Retrieve information about clustergram object.
plot — Render clustergram heat map and dendrograms for clustergram
object.
set — Set property of clustergram object.
view — View clustergram heat map and dendrograms for clustergram
object.
Demo for Sequence Analysis
The following is a new sequence analysis demo:
Performing a Metagenomic Analysis of a Sargasso Sea Sample
Demo for Microarray Data Analysis
The following is a new microarray data analysis demo:
Analyzing Affymetrix SNP Arrays for DNA Copy Number Variants
Demo for Visualization Tools
The following is a new visualization tool demo:
Working with the Clustergram Function
Demos for Mass Spectrometry Data Analysis
The Batch Processing of Spectra Using Distributed Computing demo is updated to use the latest features of the Parallel Computing Toolbox™ version 3.3, and is now called Batch Processing of Spectra Using Sequential and Parallel Computing.
The Preprocessing Raw Mass Spectrometry Data demo is updated with state-of-the-art examples for peak detection using wavelet denoising, binning by hierarchical clustering, and binning by dynamic programming.
Version 3.0 (R2007b) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 3.0 (R2007b):
New Features and Changes: Yes. Details below.
Version Compatibility Considerations: Yes. Details labeled as Compatibility Considerations, below. See also Summary.
Fixed Bugs and Known Problems: Bug Reports. Includes fixes.
Related Documentation at Web Site: No
New and updated features in this version include:
“Data Format and Database Functions” on page 34
“Microarray File Format Functions” on page 35
“Microarray Functions” on page 35
“Sequence Conversion, Utility, and Visualization Functions” on page 35
“Mass Spectrometry Functions” on page 36
“Statistical Learning Functions” on page 36
“Gene Ontology Methods” on page 36
“Demos for Microarray Data Analysis” on page 37
“Demos for Sequence Analysis” on page 37
“Demo for Graph Theory Analysis” on page 38
Data Format and Database Functions
Following are new functions:
blastformat — Create local BLAST database.
blastreadlocal — Read data from local BLAST report.
cytobandread — Read cytogenetic banding information.
The following function was updated:
affyread — Read microarray data from Affymetrix GeneChip file. Updated the structure returned when reading a CDF library file. The structure contains three new subfields: GroupNumber, Direction, and GroupName.
Microarray File Format Functions
Following is a new function:
cytobandread — Read cytogenetic banding information.
The following function was updated:
affyread — Read microarray data from Affymetrix GeneChip file. Updated the structure returned when reading a CDF library file. The structure contains three new subfields: GroupNumber, Direction, and GroupName.
Microarray Functions
Following are new functions:
chromosomeplot — Plot chromosome ideogram with G-banding pattern.
cghcbs — Perform circular binary segmentation (CBS) on array-based comparative genomic hybridization (aCGH) data.
The following function is updated:
probesetvalues — Create table of Affymetrix probe set intensity values. Updated return matrix, which contains intensity values for probe-level data, to include two new fields: GroupNumber and Direction. Updated to return a second output containing the column names for the return matrix, which contains intensity values for probe-level data.
Sequence Conversion, Utility, and Visualization Functions
Following are new functions:
blastlocal — Perform search on local BLAST database to create BLAST
report.
rnaconvert — Convert secondary structure of RNA sequence between
bracket and matrix notations.
rnafold — Predict minimum free-energy secondary structure of RNA
sequence.
rnaplot — Draw secondary structure of RNA sequence.
Mass Spectrometry Functions
The following function is updated:
mspalign — Align mass spectra from multiple peak lists from LC/MS or GC/MS data set. Updated to include a new property, 'ShowEstimation', which controls the display of an assessment plot relative to the estimation method and the vector of common mass/charge (m/z) values.
Statistical Learning Functions
The following function is updated:
svmsmoset — Create or edit Sequential Minimal Optimization (SMO) options structure. Updated default values for the 'MaxIter' and 'KernelCacheLimit' properties. Changed the 'Display' property so that when set to 'iter', a report displays every 500 iterations instead of 10.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.6 and earlier, the svmsmoset function used a 'MaxIter' property with a default of 1500 and a 'KernelCacheLimit' property with a default of 7500. In Bioinformatics Toolbox Version 3.0, the defaults are 15000 and 5000, respectively. Also, when you set the 'Display' property to 'iter', a report displays every 500 iterations instead of 10.
Gene Ontology Methods
The following methods of a gene ontology object are updated:
geneont.getancestors — Find terms that are ancestors of specified Gene Ontology term. Updated to also return the number of times each ancestor is found. Updated to include two new properties, 'Relationtype', which specifies a relationship type to search for in the gene ontology, and 'Exclude', which controls excluding the original queried term(s) from the output, unless the term was reached while searching the gene ontology.
geneont.getdescendants — Find terms that are descendants of specified Gene Ontology term. Updated to also return the number of times each descendant is found. Updated to include two new properties, 'Relationtype', which specifies a relationship type to search for in the gene ontology, and 'Exclude', which controls excluding the original queried term(s) from the output, unless the term was reached while searching the gene ontology.
geneont.getrelatives — Find terms that are relatives of specified Gene Ontology term. Updated to also return the number of times each relative is found. Updated to include three new properties, 'Levels', which specifies the number of levels up and down to search in the gene ontology, 'Relationtype', which specifies a relationship type to search for in the gene ontology, and 'Exclude', which controls excluding the original queried term(s) from the output, unless the term was reached while searching the gene ontology.
Demos for Microarray Data Analysis
The following are two new microarray data analysis demos:
Detecting DNA Copy Number Alteration in Array-Based CGH Data
Analyzing Array-Based CGH Data Using Bayesian Hidden Markov
Modeling
Demos for Sequence Analysis
The following are two new sequence analysis demos:
Predicting and Visualizing the Secondary Structure of RNA Sequences
Identifying Over-Represented Regulatory Motifs
The Investigating the Bird Flu Virus demo was updated to demonstrate how to write KML-formatted files, which can be used by Google™ Earth to display geospatial data.
Demo for Graph Theory Analysis
The following is a new graph theory demo:
Working with Graph Theory Functions
Version 2.6 (R2007a+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.6 (Release 2007a+):
New Features and Changes: Yes. Details below.
Version Compatibility Considerations: Yes. Details labeled as Compatibility Considerations, below. See also Summary.
Fixed Bugs and Known Problems: Bug Reports. Includes fixes.
Related Documentation at Web Site: No
New and updated functions in this version include:
“Data Formats and Databases Functions” on page 39
“Microarray File Formats Functions” on page 40
“Microarray Utility Functions” on page 40
“Microarray Normalization and Filtering Functions” on page 41
“Mass Spectrometry Functions” on page 41
“Demos for Mass Spectrometry Functions” on page 41
Data Formats and Databases Functions
The following functions are updated:
affyread — Read microarray data from Affymetrix GeneChip file. Updated
to read Affymetrix files from expression, genotyping, or resequencing assays on all platforms, except Solaris™.
celintensityread — Read probe intensities from Affymetrix CEL files.
Updated to read Affymetrix CEL and CDF files from expression or genotyping assays on all platforms, except Solaris.
mzxmlread — Read mzXML file into MATLAB as structure. Updated to
read mzXML files that conform to the mzXML 2.1 specification or earlier specifications.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.6, the structure returned by affyread when reading a CHP file from an expression assay no longer contains a ProbePairs field. The ProbePairs field still exists in the structure returned by affyread when reading a CDF file.
Microarray File Formats Functions
The following functions are updated:
affyread — Read microarray data from Affymetrix GeneChip file. Updated
to read Affymetrix files from expression, genotyping, or resequencing assays on all platforms, except Solaris.
celintensityread — Read probe intensities from Affymetrix CEL files.
Updated to read Affymetrix CEL and CDF files from expression or genotyping assays on all platforms, except Solaris.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.6, the structure returned by affyread when reading a CHP file from an expression assay no longer contains a ProbePairs field. The ProbePairs field still exists in the structure returned by affyread when reading a CDF file.
Microarray Utility Functions
The following function is updated:
probesetplot — Plot Affymetrix probe set intensity values. Updated to accept structures created from CEL and CDF files, instead of a structure created from a CHP file.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.5 and earlier, the probesetplot function accepted a structure created from a CHP file as input. Currently it requires two structures: one created from a CEL file and one created from a CDF library file. If you have any scripts that call the probesetplot function, you need to update them to provide the correct input arguments.
Microarray Normalization and Filtering Functions
Following is a new function:
zonebackadj — Perform background adjustment on Affymetrix microarray probe-level data using zone-based method.
Mass Spectrometry Functions
The following function is updated:
mzxmlread — Read mzXML file into MATLAB as structure. Updated to read mzXML files that conform to the mzXML 2.1 specification or earlier specifications.
Following is a new function you can use to calibrate and/or synchronize multidimensional mass spectrometry data:
samplealign — Align two data sets containing sequential observations by introducing gaps.
Demos for Mass Spectrometry Functions
The following are two new mass spectrometry demos:
Visualizing and Preprocessing Hyphenated Mass-Spectrometry Data Sets for Metabolite and Protein/Peptide Profiling
Differential Analysis of Complex Protein and Metabolic Mixtures Using Liquid Chromatography/Mass Spectrometry (LC/MS)
Version 2.5 (R2007a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.5 (Release 2007a):
New Features and Changes: Yes. Details below.
Version Compatibility Considerations: Yes. Details labeled as Compatibility Considerations, below. See also Summary.
Fixed Bugs and Known Problems: Bug Reports. Includes fixes.
Related Documentation at Web Site: No
New, updated, and deprecated functions in this version include:
“Data Formats and Database Functions” on page 43
“Demo for Data Formats and Database Functions” on page 43
“Statistical Learning Functions” on page 44
“Protein Analysis and Sequence Utilities Functions” on page 44
“Demo for Protein Analysis and Sequence Utilities Functions” on page 45
“Sequence Alignment Functions” on page 45
“Demo for Sequence Alignment Functions” on page 46
“Microarray File Formats Functions” on page 46
“Microarray Normalization and Filtering Functions” on page 46
“Demo for Microarray File Formats, Normalization, and Filtering Functions” on page 47
“Microarray Data Analysis and Visualization Functions” on page 47
“Demo for Microarray Data Analysis and Visualization Functions” on page 47
“Mass Spectrometry Functions” on page 47
“Phylogenetic Tree Tools Functions” on page 48
“Demos for Phylogenetic Tree Tools Functions” on page 48
“Phylogenetic Tree Methods” on page 49
Data Formats and Database Functions
Following are new functions for reading and creating files:
affyprobeseqread — Read data file containing probe sequence information for Affymetrix GeneChip array.
pdbwrite — Write to file using Protein Data Bank (PDB) format.
The following functions were updated:
celintensityread — Read probe intensities from Affymetrix CEL files (Windows® 32). Updated so that the order of columns (CEL files) in return matrices PMIntensities and MMIntensities matches the order of CEL files in the CELFiles input argument.
pdbread — Read data from Protein Data Bank (PDB) file. Updated so that the six fields containing coordinate information (Atom, AtomSD, AnisotropicTemp, AnisotropicTempSD, Terminal, and HeterogenAtom) are now subfields within the Model field of the MATLAB structure. Updated to include a new property, ModelNum, which reads only the specified model from a PDB-formatted text file.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the celintensityread function ordered the columns (CEL files) of return matrices PMIntensities and MMIntensities alphabetically.
In Bioinformatics Toolbox Version 2.4 and earlier, the pdbread function stored coordinate information in six fields (Atom, AtomSD, AnisotropicTemp, AnisotropicTempSD, Terminal, and HeterogenAtom) within the MATLAB structure. These six fields are now subfields within the Model field of the MATLAB structure.
Demo for Data Formats and Database Functions
The Accessing NCBI Entrez Databases with E-Utilities demo illustrates how to programmatically search and retrieve data.
Statistical Learning Functions
Following are new functions:
optimalleaforder — Determine optimal leaf ordering for hierarchical binary cluster tree.
svmsmoset — Create or edit Sequential Minimal Optimization (SMO) options structure.
The following function was updated:
svmtrain — Train support vector machine classifier. Updated to include a new SMO method and a new property, SMO_Opts, which provides options for the SMO method. The BoxConstraint property has changed, including a new default value.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the svmtrain function used a BoxConstraint property with a default of 1/eps. In Bioinformatics Toolbox Version 2.5, the default is 1, which can lead to slightly different results.
Protein Analysis and Sequence Utilities Functions
Following are new functions:
evalrasmolscript — Send RasMol script commands to molecule viewer.
molviewer — Display and manipulate 3-D molecule structure.
proteinpropplot — Plot properties of amino acid sequence.
seqinsertgaps — Insert gaps into nucleotide or amino acid sequence.
The following functions were updated:
featuresparse — Parse features from GenBank, GenPept, or EMBL data. Updated to include a new property, Sequence, which controls the extraction, when possible, of the sequences.
oligoprop — Calculate sequence properties of DNA oligonucleotide. Updated to handle ambiguous N characters in a sequence.
The following function is obsolete:
pdbplot — Plot 3-D protein structure. This function was replaced by the molviewer function.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.5, the pdbplot function was replaced by the molviewer function. If you have any scripts that call the pdbplot function, you need to update them to call the molviewer function.
Demo for Protein Analysis and Sequence Utilities Functions
The Visualizing the Three-dimensional Structure of a Molecule demo illustrates the
molviewer function.
Sequence Alignment Functions
The following function was updated:
seqpdist — Calculate pairwise distance between sequences. Updated to assume that all input sequences are aligned if they have the same length, regardless of the presence of gaps. If you know your input sequences are not aligned, you can align them before passing them to seqpdist (for example, using multialign), or set PairwiseAlignment to true when using seqpdist.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the seqpdist function assumed all input sequences were aligned if they had the same length and at least one gap.
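The change in the seqpdist alignment assumption can be sketched in a few lines. The Python below is an illustration of the rule as described here, not toolbox code: the helper name treated_as_aligned, the version strings, and the use of "-" as the gap character are assumptions made for the example.

```python
def treated_as_aligned(seqs, version):
    """Return True if the input would be assumed to be already aligned.

    Hypothetical helper that mirrors the rule described in these notes.
    version is "2.5" for the new rule, anything else for 2.4 and earlier.
    """
    same_length = len({len(s) for s in seqs}) == 1
    if version == "2.5":
        # Version 2.5: equal length is enough, with or without gaps.
        return same_length
    # Version 2.4 and earlier: equal length and at least one gap.
    has_gap = any("-" in s for s in seqs)  # "-" as gap symbol is an assumption
    return same_length and has_gap


ungapped = ["ACGT", "ACGA", "TCGT"]
gapped = ["AC-T", "ACGA", "TCGT"]

print(treated_as_aligned(ungapped, "2.4"))  # old rule
print(treated_as_aligned(ungapped, "2.5"))  # new rule
```

Equal-length sequences without gaps are the case whose treatment changed, which is why such inputs may need PairwiseAlignment set to true after upgrading.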
Demo for Sequence Alignment Functions
The Comparing Whole Genomes demo illustrates how to compare features of organisms on a genomic evolution scale.
Microarray File Formats Functions
Following is a new function:
affyprobeseqread — Read data file containing probe sequence information for Affymetrix GeneChip array.
The following function was updated:
celintensityread — Read probe intensities from Affymetrix CEL files (Windows 32). Updated so that the order of columns (CEL files) in return matrices PMIntensities and MMIntensities matches the order of CEL files in the CELFiles input argument.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the celintensityread function ordered the columns (CEL files) of return matrices PMIntensities and MMIntensities alphabetically.
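Code that indexes the returned matrices by column position is the code most likely to be affected. The following Python sketch only illustrates the two orderings described above; the function name column_order, the version strings, and the file names are hypothetical, and plain string sorting stands in for "alphabetically".

```python
def column_order(cel_files, version):
    """Return the order in which CEL files map to matrix columns.

    Hypothetical helper: "2.5" keeps the caller's order, earlier
    versions are modeled as a plain alphabetical sort.
    """
    if version == "2.5":
        return list(cel_files)
    return sorted(cel_files)


files = ["tumor.CEL", "control.CEL"]  # made-up file names
print(column_order(files, "2.4"))
print(column_order(files, "2.5"))
```

If a script assumed that column 1 was the alphabetically first file, it should now look up the column from the position of the file in the CELFiles argument instead.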
Microarray Normalization and Filtering Functions
Following are new functions:
affyprobeaffinities — Compute Affymetrix probe affinities from their
sequences and MM probe intensities.
gcrmabackadj — Perform GC Robust Multi-array Average (GCRMA)
background adjustment on Affymetrix microarray probe-level data using sequence information.
gcrma — Perform GC Robust Multi-array Average (GCRMA) background
adjustment, quantile normalization, and median-polish summarization on Affymetrix microarray probe-level data.
Demo for Microarray File Formats, Normalization, and Filtering Functions
The Preprocessing Affymetrix Microarray Data at the Probe Level demo illustrates the affyprobeseqread, affyprobeaffinities, gcrmabackadj, and gcrma functions.
Microarray Data Analysis and Visualization Functions
Following is a new function:
mafdr — Estimate false discovery rate (FDR) of differentially expressed
genes from two experimental conditions or phenotypes.
The following function was updated:
mattest — Perform two-tailed t-test to evaluate differential expression of genes from two experimental conditions or phenotypes. Updated to include a new property, Permute, which controls whether permutation tests are run.
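As background on false discovery rate estimation from a list of p-values (such as those a t-test produces), here is a minimal Python sketch of the Benjamini-Hochberg adjustment. This is a well-known procedure used for illustration; it is not a description of how mafdr computes its estimate, and the function name bh_adjust is hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values
    # never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Genes whose adjusted value falls below a chosen threshold would be reported as differentially expressed at that false discovery rate.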
Demo for Microarray Data Analysis and Visualization Functions
The Exploring Gene Expression Data demo illustrates the mattest and mafdr functions.
Mass Spectrometry Functions
Following are new functions:
msdotplot — Plot set of peak lists from LC/MS or GC/MS data set.
mspalign — Align mass spectra from multiple peak lists from LC/MS or GC/MS data set.
mspeaks — Convert raw mass spectrometry data to peak list (centroided
data).
msppresample — Resample mass spectrometry signal while preserving
peaks.
mzxml2peaks — Convert mzXML structure to peak list.
The following function was updated:
msheatmap — Create pseudocolor image of set of mass spectra. Updated
to handle LC/MS and GC/MS data.
Phylogenetic Tree Tools Functions
Following is a new function:
seqinsertgaps — Insert gaps into nucleotide or amino acid sequence.
The following functions were updated:
dnds — Estimate synonymous and nonsynonymous substitution rates. Updated to include two new properties, Verbose, which controls the display of the codons considered in the computations and their amino acid translations, and Window, which performs the calculations over a sliding window.
dndsml — Estimate synonymous and nonsynonymous substitution rates using maximum likelihood method. Updated to include a new property, Verbose, which controls the display of the codons considered in the computations and their amino acid translations.
seqpdist — Calculate pairwise distance between sequences. Updated to assume that all input sequences are aligned if they have the same length, regardless of the presence of gaps. If you know your input sequences are not aligned, you can align them before passing them to seqpdist (for example, using multialign), or set PairwiseAlignment to true when using seqpdist.
Compatibility Considerations
In Bioinformatics Toolbox Version 2.4 and earlier, the seqpdist function assumed all input sequences were aligned if they had the same length and at least one gap.
Demos for Phylogenetic Tree Tools Functions
The following demos illustrate the nwalign, seqinsertgaps, dnds, and multialign functions:
Analyzing Synonymous and Nonsynonymous Substitution Rates
Investigating the Bird Flu Virus
The Reconstructing the Origin and the Diffusion of the SARS Epidemic demo presents an analysis of the SARS epidemic.
Phylogenetic Tree Methods
Following is a new method of a phytree object:
reorder — Reorder leaves of phylogenetic tree.
Version 2.4 (R2006b) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.4 (Release 2006b):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: Yes (summary)
Fixed Bugs and Known Problems: Bug Reports (includes fixes)
Related Documentation at Web Site: No
New functions, obsoleted functions, and changes introduced in this version are:
“Data Formats and Database Functions”
“Sequence Utilities Functions”
“Sequence Visualization Functions”
“Multiple Sequence Alignment Functions”
“Microarray File Formats”
“Microarray Data Analysis and Visualization Functions”
“Graph Theory Functions”
“Graph Visualization Methods”
“Phylogenetic Tree Methods”
Data Formats and Database Functions
Following is a new function for getting data into the MATLAB environment:
mzxmlread — Read mzXML file into the MATLAB software as structure.
The following functions were updated:
celintensityread — Read probe intensities from Affymetrix CEL files (Windows 32). Updated to include a new property, Verbose, which controls the display of a progress report showing the name of each CEL file as it is read.
fastaread — Read data from FASTA file. Updated to include a new property, Blockread, which controls reading a single entry or block of entries from a file.
geosoftread — Read Gene Expression Omnibus (GEO) SOFT format data. Updated to read Data Set (GDS) files as well as Sample (GSM) files.
getblast — BLAST report from NCBI Web site. Updated to include a new property, WaitTilReady, which pauses the MATLAB software and waits a specified time (minutes) for a report from the NCBI Web site.
scfread — Read trace data from SCF file. Updated to include more output options.
Sequence Utilities Functions
Following is a new function for parsing sequence data:
featuresparse — Parse features from GenBank, GenPept, or EMBL data.
Sequence Visualization Functions
The following function was updated:
seqtool — Open tool to interactively explore biological sequences. Updated
to download sequences from the EMBL database, interactively move the viewing frame in the Sequence Viewer by pressing and holding Ctrl while click-dragging, and export an amino acid translation as a FASTA file or to the MATLAB Workspace.
Multiple Sequence Alignment Functions
The following function was updated:
multialignviewer — Open viewer for multiple sequence alignments.
Updated to export consensus sequences.
Microarray File Formats
The following function was updated:
celintensityread — Read probe intensities from Affymetrix CEL files (Windows 32). Updated to include a new property, Verbose, which controls the display of a progress report showing the name of each CEL file as it is read.
Microarray Data Analysis and Visualization Functions
The following functions were updated:
clustergram — Create dendrogram and heat map. Updated to include a new property, OptimalLeafOrder, which enables or disables the optimal leaf ordering calculation, which determines the leaf order that maximizes the similarity between neighboring leaves.
mairplot — Create intensity versus ratio scatter plot for microarray signals. Updated by including a new property, Type, which creates either an IR plot or MA plot, changing the plot axes to log scale, and adding interactive plot features such as displaying gene labels, changing factor lines, normalizing data, and exporting data.
mapcaplot — Create Principal Component plot of expression profile data. Updated by adding an export feature.
redgreencmap — Create red and green colormap. Updated to include a new property, Interpolation, which sets the method for color interpolation.
Graph Theory Functions
Following are new functions for applying basic graph theory algorithms to sparse matrices:
graphallshortestpaths — Find all shortest paths in graph.
graphconncomp — Find strongly or weakly connected components in graph.
graphisdag — Test for cycles in directed graph.
graphisomorphism — Find isomorphism between two graphs.
graphisspantree — Determine if tree is spanning tree.
graphmaxflow — Calculate maximum flow and minimum cut in directed
graph.
graphminspantree — Find minimal spanning tree in graph.
graphpred2path — Convert predecessor indices to paths.
graphshortestpath — Solve shortest path problem in graph.
graphtopoorder — Perform topological sort of directed acyclic graph.
graphtraverse — Traverse graph by following adjacent nodes.
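To make the kind of problem these functions address concrete, here is a small Python sketch of a shortest-path search over a sparse directed graph, stored as a dictionary of (from, to) pairs with weights. It uses Dijkstra's algorithm with nonnegative weights as an assumption; it is not the toolbox implementation, and the function name shortest_path and the example graph are made up. The predecessor map it builds is the same idea that a predecessor-to-path conversion works from.

```python
import heapq


def shortest_path(edges, source, target):
    """Dijkstra's algorithm on a sparse directed graph.

    edges maps (u, v) to a nonnegative weight. Returns (distance, path);
    distance is inf and path is empty when target is unreachable.
    """
    adjacency = {}
    for (u, v), w in edges.items():
        adjacency.setdefault(u, []).append((v, w))

    dist = {source: 0.0}
    pred = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == target:
            break
        for v, w in adjacency.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return float("inf"), []
    # Convert predecessor indices to a path by walking back to the source.
    path = [target]
    while path[-1] != source:
        path.append(pred[path[-1]])
    path.reverse()
    return dist[target], path


graph = {(1, 2): 1.0, (2, 3): 2.0, (1, 3): 5.0, (3, 4): 1.0}
print(shortest_path(graph, 1, 4))
```

Storing only the nonzero entries, as the dictionary does here, is what makes a sparse representation practical for large networks.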
Graph Visualization Methods
Following are new methods for applying basic graph theory algorithms to a biograph object:
allshortestpaths — Find all shortest paths in biograph object.
conncomp — Find strongly or weakly connected components in biograph
object.
getmatrix — Get connection matrix from biograph object.
isdag — Test for cycles in biograph object.
isomorphism — Find isomorphism between two biograph objects.
isspantree — Determine if tree created from biograph object is spanning tree.
maxflow — Calculate maximum flow and minimum cut in biograph object.
minspantree — Find minimal spanning tree in biograph object.
shortestpath — Solve shortest path problem in biograph object.
topoorder — Perform topological sort of directed acyclic graph extracted
from biograph object.
traverse — Traverse biograph object by following adjacent nodes.
Phylogenetic Tree Methods
Following is a new method for the phytree object:
getmatrix — Convert phytree object into a relationship matrix.
Version 2.3 (R2006a+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.3 (Release 2006a+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
New functions, obsoleted functions, and changes introduced in this version are:
“Data Formats and Databases Functions”
“Sequence Utilities Functions”
“Sequence Visualization Functions”
“Statistical Learning Functions”
“Microarray Functions”
“Demo for Microarray Functions”
Data Formats and Databases Functions
The following functions are obsolete:
getpir — Sequence data from PIR-PSD database. This function retrieved data from the PIR-PSD database. This database has been discontinued and this function no longer retrieves data. See http://pir.georgetown.edu/pirwww/dbinfo/nref.shtml for more details.
pirread — Read data from Protein Information Resource (PIR) file. This function supported the data format of the PIR-PSD database. This database has been discontinued. See http://pir.georgetown.edu/pirwww/dbinfo/nref.shtml for more details.
Sequence Utilities Functions
The following function was updated to include five new databases, refseq_rna, refseq_genomic, env_nt, refseq_protein, and env_nr:
blastncbi — Generate remote BLAST request.
Sequence Visualization Functions
Following is a new function for visualizing sequence data:
featuresmap — Draw linear or circular map of features from GenBank structure.
Statistical Learning Functions
The following function was updated to include three new properties, RBF_Sigma, BoxConstraint, and Autoscale:
svmtrain — Train support vector machine classifier.
Microarray Functions
The following function is supported on the Windows 32 platform only:
affyread — Read microarray data from Affymetrix GeneChip file (Windows 32).
Following are new functions for preprocessing Affymetrix probe-level microarray data:
celintensityread — Read probe intensities from Affymetrix CEL files (Windows 32).
rmabackadj — Perform background adjustment on Affymetrix microarray probe-level data using Robust Multi-array Average (RMA) procedure.
rmasummary — Calculate gene (probe set) expression values from Affymetrix microarray probe-level data using Robust Multi-array Average (RMA) procedure.
affyinvarsetnorm — Perform rank invariant set normalization on probe intensities from multiple Affymetrix CEL or DAT files.
Following is a new function for two-color microarray normalization:
mainvarsetnorm — Perform rank invariant set normalization on gene expression values from two experimental conditions or phenotypes.
Following are new functions for microarray differential expression analysis:
mattest — Perform two-sample, two-tailed t-test to evaluate differential expression of genes from two experimental conditions or phenotypes.
mavolcanoplot — Create significance versus gene expression ratio (fold
change) scatter plot of microarray data.
Demo for Microarray Functions
New demo of the new microarray functions (Analyzing Affymetrix Microarray Gene Expression Data).
Version 2.2.1 (R2006a) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.2.1 (Release 2006a):
New Features and Changes: No
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
Version 2.2 (R14SP3+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.2 (Release 14SP3+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
New features and changes introduced in this version are:
“Multiple Sequence Alignment Viewer”
“Microarray Functions for Agilent Software”
“Gene Ontology Database Functions”
“Demo for Gene Ontology Functions”
Multiple Sequence Alignment Viewer
multialignviewer — Interactively view, explore alignments, and make manual modifications.
Microarray Functions for Agilent Software
agferead — Read an Agilent® Feature Extraction Software file.
magetfield — Utility function to extract data from a microarray.
Gene Ontology Database Functions
geneont.geneont — Import the Gene Ontology database from the Web.
geneont.getancestors, geneont.getdescendants, geneont.getrelatives — Get a subset of the ontology.
goannotread — Parse Gene Ontology Annotated files.
num2goid — Convert numbers to Gene Ontology IDs.
Demo for Gene Ontology Functions
New demo for the new Gene Ontology functions (geneontologydemo) and working with whole genomes (biomemorymapdemo).
Version 2.1.1 (R14SP3) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.1.1 (Release 14SP3):
New Features and Changes: No
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
Version 2.1 (R14SP2+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.1 (Release 14SP2+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
New features and changes introduced in this version are:
“Sequence Alignment Functions”
“Sequence Statistics Functions”
“Sequence Utilities Functions”
“Phylogenetic Tree Functions”
“Phylogenetic Tree Methods”
“Microarray Functions”
“Statistics Functions”
Sequence Alignment Functions
multialign — Align multiple sequences using a progressive method with Distributed Computing Toolbox™ support.
multialignread — Read multiple sequence alignment file.
profalign — Align two profiles using Needleman-Wunsch global
alignment.
showalignment — Updated to show multiply aligned sequences.
seqpdist — Updated to calculate pairwise distances between observations
with Distributed Computing Toolbox support.
Sequence Statistics Functions
codonbias — Calculate codon frequency for each amino acid in a DNA
sequence.
cpgisland — Locate CpG islands in a DNA sequence.
Sequence Utilities Functions
rebasecuts — Find restriction enzymes that cut a nucleotide sequence.
seqtool — Graphical User Interface (GUI) for single sequence analysis.
Phylogenetic Tree Functions
dnds, dndsml — Estimate synonymous and nonsynonymous substitution rates.
seqneighjoin — Reconstruct a phylogenetic tree with a Neighbor-joining
method.
Phylogenetic Tree Methods
getcanonical — Calculate the canonical form of a phylogenetic tree.
getnewickstr — Create a Newick-formatted string.
reroot — Change the root of a phylogenetic tree.
subtree — Extract a subtree.
weights — Calculate weights for a phylogenetic tree.
Microarray Functions
probesetplot — Plot values for an Affymetrix CHP file probe set.
Statistics Functions
rankfeatures — Renamed function. The previous name was sqtlfeatures.
Version 2.0.1 (R14SP2) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.0.1 (Release 14SP2):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: Bug Reports
Related Documentation at Web Site: No
New features and changes introduced in this version are:
“Updated RBASE Table”
“Expanded Bioperl Demonstration”
Updated RBASE Table
RBASE is the enzyme table that the function restrict uses to locate sequence patterns.
Expanded Bioperl Demonstration
The example of calling the MATLAB software from Perl scripts now includes several examples of passing various types of data (both directly and by variant variable) back and forth between Perl and a MATLAB Automation Server. To view the demo, type
bioperldemo.
Version 2.0 (R14SP1+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 2.0 (Release 14SP1+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: No bug fixes
Related Documentation at Web Site: No
New features and changes introduced in this version are:
“Mass Spectrometry Data Analysis”
“Graph Visualization Object and Methods”
“Statistical Learning”
“Sequence Analysis”
“Protein Analysis”
“Microarray Analysis”
“Updated Web Connectivity Function”
Mass Spectrometry Data Analysis
Following are new functions designed for preprocessing and classification of raw mass spectrometry data from SELDI-TOF and MALDI-TOF spectrometers.
msresample — Resample with antialias filtering.
msbackadj — Correct a baseline by estimation.
msalign — Align a spectrum to a set of candidate peaks.
msheatmap — Draw a heat map image for a set of spectra and check
alignments.
msnorm — Normalize a set of spectra.
mslowess — Nonparametric smoothing using the Lowess method.
mssgolay — Least-squares polynomial smoothing.
msviewer — Plot a spectrum or a set of spectra.
Graph Visualization Object and Methods
New object and set of methods to view relationships between data with interactive maps.
biograph — Function to create a biograph object.
dolayout — Calculate node and edge positions.
getnodesbyid — Get handles to nodes.
getedgesbynodeid — Get handles to edges.
view — Render a graph in its viewer.
getancestors — Find ancestors.
getdescendants — Find descendants.
getrelatives — Find neighbors.
Statistical Learning
New set of functions to classify data and identify features in the data.
classperf — Evaluate the performance of a classifier.
crossvalind — Cross-validation index generation.
knnclassify — K-Nearest neighbor classifier.
knnimpute — Impute missing data using the nearest neighbor method.
randfeatures — Randomized subset feature selection.
sqtlfeatures — Sequential forward feature selection. This function will be renamed to rankfeatures in Version 2.1.
svmclassify — Classify using a support vector machine classifier.
svmtrain — Train a support vector machine classifier.
Sequence Analysis
New functions for analysis and visualization of multiple sequences.
seqconsensus — Computes the consensus sequence for a set of sequences.
seqlogo — Displays sequence logos for DNA and protein sequences.
seqprofile — Computes the sequence profile of a multiple alignment.
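For readers new to these terms, a sequence profile records how often each symbol occurs at each column of a multiple alignment, and a consensus sequence summarizes each column with one symbol. The Python sketch below shows the simplest version of both ideas (raw counts and a majority vote with alphabetical tie-breaking). The function names and inputs are hypothetical, and the toolbox functions may offer other counting and scoring options that are not modeled here.

```python
from collections import Counter


def seq_profile(aligned):
    """Count symbols per column of equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(s[i] for s in aligned) for i in range(length)]


def consensus(aligned):
    """Majority symbol per column; ties go to the alphabetically first symbol."""
    columns = seq_profile(aligned)
    # sorted(col) fixes the tie-break order before max() picks the top count.
    return "".join(max(sorted(col), key=col.get) for col in columns)


alignment = ["ACGT", "ACGA", "TCGT"]
print(seq_profile(alignment)[0])
print(consensus(alignment))
```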
Updated functions.
palindromes — Updated to allow for gaps in the palindrome.
seqshoworfs, seqshowwords, showalignment — Updated to display the results in a Figure window. (This may cause problems on the Mac®.)
In Bioinformatics Toolbox 2.0 the functions seqlogo, seqshowwords, seqshoworfs, and showalignment use Java™ based figures. Currently on the Macintosh®, Java figures are not enabled by default. If you use these functions on a Macintosh, you should start the MATLAB software with
matlab -useJavaFigures
Protein Analysis
pdbplot — Plots 3-D backbone structure of proteins in a PDB file.
Microarray Analysis
quantilenorm — Quantile normalization.
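Quantile normalization forces every array (column) to share the same distribution of values. A minimal Python sketch of the standard idea follows: sort each column, average the values at each rank across columns, then give every entry the average for its rank. This is an illustration only, not the quantilenorm implementation; the function name is hypothetical, and ties are simply broken by position.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length columns.

    Ties within a column are broken by position, a simplification.
    """
    n = len(columns[0])
    if any(len(c) != n for c in columns):
        raise ValueError("all columns must have the same length")

    # Reference distribution: mean of the k-th smallest values across columns.
    sorted_cols = [sorted(c) for c in columns]
    reference = [sum(sc[k] for sc in sorted_cols) / len(columns) for k in range(n)]

    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices by rank
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(arrays))
```

After normalization each column contains the same set of values, only arranged according to that column's original ranking.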
New set of functions for working with Affymetrix GeneChip data sets.
probelibraryinfo — Get library information for a probe.
probesetlink — Show probe set information from NetAffx™.
probesetlookup — Get gene information for a probe set.
probesetplot — Plot probe set values.
probesetvalues — Get probe set values from CEL and CDF information.
manorm — Normalization by scaling and centering. Replaces the functions mamadnorm and mameannorm.
affyread — Updated with output structures that have changed slightly.
Some redundant fields have been removed from CDF and CHP structure. GIN database files are now supported. Version 4 of the Affymetrix GDAC File Access Runtime Libraries is provided.
Note If you use mamadnorm or mameannorm in any of your personal M-files, please update your files to use the new function manorm. The mamadnorm and mameannorm functions are now obsolete and may be removed from future releases of the Bioinformatics Toolbox software.
geosoftread — Now supports Gene Expression Omnibus Database records (GDS files).
maimage — Now supports Affymetrix CEL data.
maboxplot — Now supports Affymetrix CHP data.
Updated Web Connectivity Function
getgenbank — Now returns CDS information for a gene in a structure
allowing direct access to the transcribed sequence.
Version 1.1.1 (R14SP1) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 1.1.1 (Release 14SP1):
New Features and Changes: No
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: No bug fixes
Related Documentation at Web Site: No
Version 1.1 (R14) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 1.1 (Release 14):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: No bug fixes
Related Documentation at Web Site: No
New features and changes introduced in this version are:
“Phylogenetic Analysis Functions”
“Phylogenetic Tree Object and Methods”
“Hidden Markov Model (HMM) Profiles”
“BLAST Functions”
“Microarray Functions”
“Protein Analysis Function”
“Sequence Alignment Functions”
“New Demos”
Phylogenetic Analysis Functions
New functions for phylogenetic tree creation and analysis.
phytreeread — Read a Newick-formatted tree file into the MATLAB workspace and return a phytree object with data from the file. Data in the file uses the Newick (New Hampshire) format for describing trees.
phytreewrite — Copy the contents of a phytree object from the MATLAB workspace to a file.
phytreetool — Interactive GUI that allows you to view, edit, and explore phylogenetic tree data. This GUI allows branch pruning, reordering, renaming, and distance exploring. It can also open or save Newick-formatted files.
seqlinkage — Construct a phylogenetic tree from pairwise distances.
seqpdist — Calculate the pairwise distance between biological sequences.
Phylogenetic Tree Object and Methods
New object for manipulating phylogenetic tree data.
phytree — Function to create a phytree object.
get — Get property values from a phytree object.
getbyname — Get node names from a phytree object.
pdist — Calculate the patristic distances between pairs of leaf nodes.
plot — Draw a phylogenetic tree object in a MATLAB Figure window as a phylogram, cladogram, or radial tree.
prune — Remove nodes from a phylogenetic tree.
select — Select branches and leaves from a phylogenetic tree using specified criteria.
view — Open a phylogenetic tree in a phytreetool window.
Hidden Markov Model (HMM) Profiles
Updated Hidden Markov Model profile functions.
The model structure that HMM functions use now includes loop and null transition probabilities. You can read null and loop probabilities from PFAM files using pfamhmmread, and, from PFAM Web databases, using gethmmprof.
When the function hmmprofstruct builds an HMM model, the loop and null transition probabilities default to predefined values. If necessary, you can later modify the probabilities using the same function.
hmmprofalign includes two new properties to control the scoring of flanking states and null transition probabilities. In addition, a third output argument with indices pointing to the respective symbols of the query sequence was added.
BLAST Functions
blastncbi, blastread, getblast — BLAST sequences and view results from within the MATLAB software.
Microarray Functions
imageneread — Read microarray data from an ImaGene® Results file.
affyread — Read microarray data from Affymetrix GeneChip files.
gprread — Read microarray data from GenePix® Results (GPR) files.
mapcaplot — Create a Principal Component plot of expression profile data.
clustergram — Updated function to do two way bi-clustering.
Protein Analysis Function
isoelectric — Estimate the isoelectric point (the pH at which the protein
has a net charge of zero) for an amino acid sequence and estimate the charge for a given pH.
Sequence Alignment Functions
seqdisp — Formats sequence output for easy viewing.
seqmatch — Find matches for every string in a library.
seqdotplot — Updated function now returns a second output (the matrix of matches as a sparse matrix).
aminolookup, baselookup — Updated functions to get IUB/IUPAC character codes, integer codes, and names for nucleotides and amino acids.
New Demos
Bicluster demo — Demonstrates some of the options of the clustergram
function.
Bioperl demo — Illustrates the interoperability between the MATLAB
software and Bioperl, passing arguments from the MATLAB software to Perl scripts and pulling BLAST search data back to the MATLAB software.
Phytree demo for Hominidae species — A phylogenetic tree is constructed from mtDNA sequences for the Hominidae taxa (also known as Pongidae). This family embraces the gorillas, chimpanzees, orangutans, and humans.
Phytree demo for HIV/SIV — Analyzes the reconstruction of phylogenetic
trees from infected HIV/SIV organisms.
Version 1.0 (R13+) Bioinformatics Toolbox Software
This table summarizes what’s new in Version 1.0 (Release 13+):
New Features and Changes: Yes (details below)
Version Compatibility Considerations: No
Fixed Bugs and Known Problems: No bug fixes
Related Documentation at Web Site: V1.0 product documentation
New features and changes introduced in this version are:
“Introduction to Bioinformatics Toolbox”
“Databases and Data Formats”
“Sequence Alignment”
“Sequence Utilities and Statistics”
“Microarray Normalization and Visualization”
“Protein Structure Analysis”
Introduction to Bioinformatics Toolbox
Bioinformatics Toolbox Version 1.0 (Web Release R13SP1+) extends the MATLAB software with basic sequence analysis and gene expression analysis functions. Bioinformatics Toolbox is a collection of tools built on the MATLAB numeric computing environment. The toolbox supports a wide range of common sequence analysis and expression analysis tasks, from accessing Web-based databases, to sequence alignment, to microarray normalization and visualization.
Bioinformatics Toolbox is dependent on many functions from the Statistics Toolbox™ software, including some functions available only in the latest version of Statistics Toolbox 4.1. We recommend that you install the latest version of the Statistics Toolbox software before running the Bioinformatics Toolbox software.
Bioinformatics Toolbox 1.0 has more than 100 functions implemented using M-files. For a complete list of functions, in the MATLAB Command Window, type
help bioinfo
Databases and Data Formats
The toolbox provides functions to directly access many standard Web-based databases such as GenBank, EMBL, PIR, and PDB. There are also functions to read many standard file formats, including FASTA and PDB. For microarray data, there are functions to read Affymetrix, GenePix, and SPOT format data, and a function to access data directly from the NCBI Gene Expression Omnibus Web site.
Sequence Alignment
The toolbox has functions for pairwise sequence alignment and for hidden Markov model sequence profile alignment, including efficient MATLAB implementations of the Needleman-Wunsch and Smith-Waterman algorithms. In addition to the alignment functions there are several tools for visualizing sequence alignments. The toolbox provides many standard scoring matrices, including the PAM and BLOSUM families.
Sequence Utilities and Statistics
The toolbox contains many functions for working with sequences. There are functions for converting DNA sequences to RNA or amino acid sequences; there are functions that report various statistics about sequences, and functions to search for patterns within the sequence; there are functions for creating random sequences, and there are functions to perform in-silico digestion of sequences with restriction enzymes and proteases.
Microarray Normalization and Visualization
The toolbox contains a number of functions for normalizing microarray data including lowess normalization, global mean normalization, and MAD normalization. The toolbox provides several functions for visualizing microarray data, including spatial heat maps, box plots, loglog, and I-R plots. The toolbox also uses functions from the Statistics Toolbox software to perform cluster analysis and to visualize the results.
Version 1.0 (R13+) Bioinformatics Toolbox™ Software
Protein Structure Analysis
In addition to standard sequence analysis functions, there is also a graphical user interface (GUI), proteinplot, for visualizing properties of protein sequences.
Compatibility Summary for Bioinformatics Toolbox Software
This table summarizes new features and changes that might cause incompatibilities when you upgrade from an earlier version, or when you use files on multiple versions. Details are provided in the description of the new feature or change.
Version (Release)               New Features and Changes with Version Compatibility Impact

Latest Version V3.5 (R2010a)    See “Function Elements Being Removed” on page 7.

V3.4 (R2009b)                   See the Compatibility Considerations subheadings for these changes:
                                “Data Visualization Functions” on page 10
                                “Phylogenetic Tree Tools and Methods” on page 12
                                “Clustergram Methods and Properties” on page 13

V3.3 (R2009a)                   See the Compatibility Considerations subheading for this change:
                                “Microarray Functions” on page 18

V3.2 (R2008b)                   See the Compatibility Considerations subheadings for these changes:
                                “Data Format and Database Functions” on page 20
                                “Microarray File Format Functions” on page 24
                                “Microarray Functions” on page 25

V3.1 (R2008a)                   See the Compatibility Considerations subheadings for these changes:
                                “Data Format and Database Functions” on page 27
                                “Microarray Functions” on page 30

V3.0 (R2007b)                   See the Compatibility Considerations subheading for this change:
                                “Statistical Learning Functions” on page 36

V2.6 (R2007a+)                  See the Compatibility Considerations subheadings for these changes:
                                “Data Formats and Databases Functions” on page 39
                                “Microarray File Formats Functions” on page 40
                                “Microarray Utility Functions” on page 40

V2.5 (R2007a)                   See the Compatibility Considerations subheadings for these changes:
                                “Data Formats and Database Functions” on page 43
                                “Statistical Learning Functions” on page 44
                                “Protein Analysis and Sequence Utilities Functions” on page 44
                                “Sequence Alignment Functions” on page 45
                                “Microarray File Formats Functions” on page 46
                                “Phylogenetic Tree Tools Functions” on page 48

V2.4 (R2006b)                   None
V2.3 (R2006a+)                  None
V2.2.1 (R2006a)                 None
V2.2 (R14SP3+)                  None
V2.1.1 (R14SP3)                 None
V2.1 (R14SP2+)                  None
V2.0.1 (R14SP2)                 None
V2.0 (R14SP1+)                  None
V1.1.1 (R14SP1)                 None
V1.1 (R14)                      None
V1.0 (R13+)                     None