Vizxlabs GENESIFTER 2005 User Manual

User’s Guide
GeneSifter Overview
• Login
• Upload Tools
• Pairwise Analysis
• Create Projects
For more information about a feature, see the corresponding page in the User’s Guide.
Upload Tools
Upload microarray data files.
Scatter Plot
Interactive scatter plot provides visualization of the entire array data set and identification of individual genes.
Ontology Report
Summarize ontology terms for a gene list and assess the biological significance of the genes within the list.
User Login
Access your secure account from any computer (PC or Mac) with Internet access.
Create New Project
Create user-defined projects with two or more groups. See next page for project analysis options.
Pairwise Analysis
Define two groups and apply normalization, statistical analysis and quality metrics to create lists of differentially expressed genes.
One-Click Gene Summary™
Provides a synopsis of the most current information available for the genes on your array. It includes information from UniGene, LocusLink, Gene Ontology terms and more.
Export Results
Export data and gene annotation to Excel.
GeneSifter Overview
• Project Analysis
• Filtering
• Function Navigation
• Pattern Navigation
• Clustering
Filtering
Apply fold change cutoffs, statistical analysis and quality metrics to create lists of differentially expressed genes.
Clustering
Identify patterns of gene expression with unsupervised clustering functions.
Ontology Report
Summarize ontology terms for a gene list and assess the biological significance of the genes within the list.
Project Analysis
Project analysis functions allow analysis across all conditions in a project.
Pattern Navigation
Define and identify patterns of gene expression with supervised clustering.
Function Navigation
Rapidly identify and group genes based on function using Gene Ontology terms.
Note: Ontology Report, Cluster Samples and the One-Click Gene Summary are available for all types of project analysis.
Cluster Samples
Use hierarchical clustering to determine the relationship of samples based on a gene list.
One-Click Gene Summary™
Includes information from UniGene, LocusLink, Gene Ontology terms and more.
GeneSifter
Introduction and Login
Welcome to GeneSifter, the web-based microarray data management and analysis system, which relies on VizX Labs’ BIOME™ bioinformatics software engine. This document gives an overview of some of the features available in GeneSifter. To get started from www.genesifter.net, please follow these steps:
1. Select the Login button from the top right corner.
2. Enter your user name and password in their respective prompts and click on the Login button.
3. A successful login should show a screen with the control panel on the left and the most recent announcements concerning GeneSifter.
GeneSifter Support and Sales
E-mail: support@genesifter.net
Toll-free: 1-877-WEB-GENE
Direct: 206-283-4363
GeneSifter
Online Help
GeneSifter provides page-specific online help.
1. Click on the help icon to access page-specific help documents. The help icon can be found in the upper right corner of most pages.
2. Clicking on the help icon will open a new browser window which will list the help available for that particular page. Select the document you wish to view.
Uploading Data
Upload Tools
1. In order to upload data, select Upload Tools from the control panel on the left.
2. GeneSifter offers four tools to load data. The tool you use will depend on the format and origin of the data being loaded. QuickLoad Wizard, Batch Upload, FlexLoad Wizard and Advanced Upload Methods are described in the following pages.
Using QuickLoad Wizard
Use the QuickLoad Wizard to load your data into GeneSifter. Supported platforms for this tool include:
• Affymetrix (native CHP files, or CHP files saved as tab-delimited text)
• CodeLink
• Pathways™ 2 & 4
• Spot-On
• Mergen arrays scanned with GenePix® (all other GenePix files may be loaded using FlexLoad)
Data files may be archived (zipped) prior to upload.
1. Select Upload Tools from the control panel on the left.
2. Select Run QuickLoad Wizard from the Upload Tools page. A new window will guide you through the upload process.
3. Select the array manufacturer or the image analysis software used and then click the Next button.
4. Select your array from the list of available arrays. If your array is not listed, please contact scientific support for information on adding the array. After you have selected your array, click Next.
Note for Affymetrix users: Auto Column Detection should work for data from MAS 5 and GCOS.
5. Now you will enter information about the sample (referred to as “Target”) that was hybridized to the array. If you have already entered information about the target, select it from the Select Target pull-down menu; otherwise, enter your target using Create New Target.
6. If you create a new target, you will need to select an appropriate “Condition” from the pull-down menu. If the condition you need is not listed, enter it using Create New Condition.
7. Select Next when you have entered all needed information.
8. If your array has two channels, repeat steps 5, 6, and 7 for the second channel.
9. Select Browse and find your data file on your local computer. Select the file and then click the Next button to upload the file to GeneSifter.
10. You will now see a summary of the information you have provided. You can enter a description for the experiment(s) being uploaded. Select Save Data to save the data in your GeneSifter account.
11. When the data is successfully uploaded, you will see Success! as the Last Upload status. Common reasons for failure include: data not in the correct format, or selecting the wrong array at step 4.
12. After saving, you can either load more data by selecting Next or exit the QuickLoad Wizard by selecting Done.
Affymetrix Manual Column Detection
The Upload Wizard will automatically detect the proper data columns for text files generated from MAS 4, MAS 5 and GCOS. If auto detection fails, you can use Manual Column Detection.
1. Affymetrix MAS 5 data format. This is an example of a file from the U34B GeneChip®. Formats may vary due to differences in export from MAS 5. Generally the first column will contain the probeset ID. This identifies each gene on the array. The two columns that are needed by GeneSifter for analysis are the columns containing the signal and detection value for that probe set. In this example, column B, which is labeled Signal, contains the derived signal value and column C, labeled Detection, contains the quality value. There may be additional columns present in a data file and the heading for the signal and quality values may be different than what is presented here. In general the signal column will either be labeled Signal or will have the word Signal at the end of the column name. This column can contain both positive and negative numbers. The quality column will generally be labeled Detection or will have the word Detection at the end of the name. This column will contain the letters A, M and P.
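As an illustration of the column layout described above, the following Python sketch locates the signal and detection columns in a tab-delimited MAS 5 style export by matching header names that end in Signal or Detection. It is not GeneSifter's own detection logic. The function names, the assumption that the probeset ID is in the first column, and the sample rows (invented IDs and values) are hypothetical.

```python
import csv
import io

# Detection calls described in this guide: Absent, Marginal, Present.
VALID_CALLS = {"A", "M", "P"}


def detect_columns(header):
    """Return (probeset_idx, signal_idx, detection_idx) as zero-based indexes.

    Assumes the probeset ID is in the first column and that the signal and
    detection headers end with the words "Signal" and "Detection".
    """
    signal_idx = None
    detection_idx = None
    for i, name in enumerate(header):
        label = name.strip().lower()
        if signal_idx is None and label.endswith("signal"):
            signal_idx = i
        elif detection_idx is None and label.endswith("detection"):
            detection_idx = i
    if signal_idx is None or detection_idx is None:
        raise ValueError("Signal or Detection column not found; enter columns manually")
    return 0, signal_idx, detection_idx


def read_mas5(text):
    """Parse tab-delimited text into (probeset_id, signal, detection_call) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    probe_idx, sig_idx, det_idx = detect_columns(header)
    rows = []
    for row in reader:
        if not row:
            continue  # skip empty lines
        call = row[det_idx].strip()
        if call not in VALID_CALLS:
            raise ValueError(f"unexpected detection call: {call!r}")
        # Signal values may be positive or negative.
        rows.append((row[probe_idx], float(row[sig_idx]), call))
    return rows


# Invented sample data, for illustration only.
SAMPLE = (
    "Probe Set Name\tExp1_Signal\tExp1_Detection\n"
    "probe_0001_at\t152.3\tP\n"
    "probe_0002_at\t-4.1\tA\n"
)

if __name__ == "__main__":
    for record in read_mas5(SAMPLE):
        print(record)
```

If the headers in your export do not follow this naming pattern, the sketch raises an error, which corresponds to the case where Manual Column Detection is needed.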
2. Select Run QuickLoad Wizard as usual (see preceding description). Select Manual from the pull-down menu.
3. Enter information about columns in the data file. In the sample file the probeset ID is in the first column (column A), the signal is in column B and the detection call value is in column C, so A would be entered for Probeset Column, B would be entered for Signal Column and C would be entered for Detection Call Column. In the sample file the data begins on the second line of the file so 2 is entered for Data starts on line.
4. Select Next and continue as usual for QuickLoad Wizard. The setting will be saved and used to correctly upload the data.
Using Batch Upload
Use Batch Upload to load multiple data sets stored in a spreadsheet as a tab-delimited text file. Note: See step 7 for a description of the required file format.
1. Select Upload Tools from the control panel on the left.
2. Select Run Batch Upload.
3. Enter a name and description for the array you are uploading. The pull-down menu has options for the type of data being loaded including:
Use Affymetrix Probeset IDs – use if the first column of your file contains Affymetrix Probeset IDs instead of GenBank accession numbers.
This File is a GEO Data Set – use if you downloaded data from GEO as a GEO dataset.
Use CodeLink Quality Values – use if your file has the CodeLink flags G, M, L.
4. Browse your computer to find the file containing your data. Use the Select Array pull-down if you have previously loaded data from this array and you wish to add to that data set.
5. Select Upload Data.
6. Once the data has been uploaded, you can either exit Batch Upload by selecting Done or select Upload More Data.
7. The file to be loaded must be a tab-delimited text file (txt). GeneSifter does not accept Excel spreadsheets (xls).
The first column (Column A in the figure) should contain an identifier for that gene. Accepted identifiers are:
• Accession Number
• IMAGE Clone ID
• Affymetrix Probeset IDs
The second column (Column B) is for an internal identifier. This is left to the discretion of the user and can be left blank.
The next 2 columns contain the intensity (Column C) and quality values (Column D) for that gene in the first experiment to be loaded. Additional experiments are added in the same way (one column for intensity, one for quality).
The first row must contain this information:
A: This cell can be empty.
B: This cell can be empty.
C: Target name for Experiment #1.
D: Condition for Experiment #1.
E: Target name for Experiment #2.
F: Condition name for Experiment #2, etc.
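The layout described in step 7 can be sketched in Python as follows. This is only an illustration of the row and column arrangement, not a GeneSifter tool. The function names, target and condition names, gene IDs and values are all invented for the example.

```python
import csv
import io


def build_batch_rows(experiments, genes):
    """Arrange data in the Batch Upload layout described in step 7.

    experiments: list of (target_name, condition_name), one per experiment.
    genes: list of (gene_id, internal_id, values), where values holds one
           (intensity, quality) pair per experiment, in the same order.
    """
    # First row: cells A and B empty, then target and condition per experiment.
    header = ["", ""]
    for target, condition in experiments:
        header += [target, condition]
    rows = [header]
    for gene_id, internal_id, values in genes:
        if len(values) != len(experiments):
            raise ValueError(f"{gene_id}: expected {len(experiments)} value pairs")
        row = [gene_id, internal_id]  # column A: identifier, column B: internal ID
        for intensity, quality in values:
            row += [str(intensity), quality]  # one intensity and one quality column
        rows.append(row)
    return rows


def to_tab_delimited(rows):
    """Serialize rows as tab-delimited text (the format step 7 requires)."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical targets, conditions, identifiers and values.
    rows = build_batch_rows(
        [("Target1", "Control"), ("Target2", "Treated")],
        [("ID_0001", "", [(120.5, "P"), (98.0, "A")])],
    )
    print(to_tab_delimited(rows), end="")
```

Writing the result to a file with a .txt extension gives a plain tab-delimited file rather than an Excel workbook.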
8. File format for Batch Upload using housekeeping genes. The format is the same with one exception: the third column must be labeled HKG (housekeeping genes). The genes that are designated as housekeeping should be marked with an x in this column. The intensities and quality values should follow as stated for Batch Upload without housekeeping genes.
In this example the genes in rows 7 and 9 (TRIM9 and GOLGB1) have been designated as housekeeping genes.
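A minimal Python sketch of the housekeeping variant follows. It takes rows already arranged in the standard Batch Upload layout and inserts the HKG column in the third position, marking the chosen genes with an x. The function name and the identifiers in the example are hypothetical, and the sketch assumes genes are matched on the identifier in the first column.

```python
def add_hkg_column(rows, housekeeping_ids):
    """Insert an HKG column as the third column of Batch Upload rows.

    rows: list of lists; rows[0] is the header row, later rows are genes.
    housekeeping_ids: set of first-column identifiers to mark with "x".
    """
    result = []
    for index, row in enumerate(rows):
        if index == 0:
            marker = "HKG"  # label for the third column
        elif row[0] in housekeeping_ids:
            marker = "x"  # designated housekeeping gene
        else:
            marker = ""
        # Intensity and quality columns follow the HKG column unchanged.
        result.append(row[:2] + [marker] + row[2:])
    return result


if __name__ == "__main__":
    # Invented identifiers and values, for illustration only.
    rows = [
        ["", "", "Target1", "Control"],
        ["ID_A", "gene one", "10", "P"],
        ["ID_B", "gene two", "20", "P"],
    ]
    for row in add_hkg_column(rows, {"ID_A"}):
        print("\t".join(row))
```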
Using FlexLoad Wizard
Use the FlexLoad Wizard to load data in GeneSifter if the array you are using is not included in the QuickLoad Wizard. Familiarity with the layout of your files is advised before going any further. You will be asked:
• to provide information about the file structure, e.g. what column describes absolute intensity, background intensity, etc.
• how you want the data transformed, e.g. preserve channel intensities or express as a ratio of the two channels.
• how you want the data normalized, e.g. LOWESS
1. Select Upload Tools from the control panel on the left.
2. Select Run FlexLoad Wizard.
1. The Protocol Title is the name given to a protocol (i.e. the settings for a specific type of file to be uploaded). You can select a protocol you have already generated or create a new one.
2. If creating a new protocol, replace “Untitled Protocol” with a Protocol Title.
3. Enter an optional Description for the protocol.
4. Click on Create New to begin the creation of a new protocol.
5. If you previously loaded the array into your account, select it from the menu list. Alternatively, if you are creating a new protocol, enter the name of the array in the Create New Array field.
6. Select the number of Channels.
7. Enter the number of files you will be uploading (the maximum allowed at any one time is 30).
8. If you know that the genes are all listed in the same order in every file (experiment) then select Same Order.
If you select Unique IDs, FlexLoad will not assume identical gene order in each file, but instead will utilize a supplied unique identifier. If Unique IDs is selected, every ID for that array must be unique and there cannot be any blank data rows. Ideally, you should use a GenBank Accession Number or an Image Clone ID as the Unique ID to assist in populating the One-Click Gene Summary.
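Before choosing between Same Order and Unique IDs, it can help to check your files against the two conditions described in step 8. The Python sketch below is one possible pre-upload check, not part of GeneSifter. The function names are hypothetical, and it assumes you have already read the ID column of each file into a list of strings.

```python
from collections import Counter


def check_unique_ids(ids):
    """Return (blank_rows, duplicates) for one file's ID column.

    blank_rows: 1-based positions of blank entries.
    duplicates: sorted list of IDs that occur more than once.
    Both lists must be empty for the Unique IDs option as described in step 8.
    """
    blank_rows = [n for n, value in enumerate(ids, start=1) if not value.strip()]
    counts = Counter(value.strip() for value in ids if value.strip())
    duplicates = sorted(key for key, count in counts.items() if count > 1)
    return blank_rows, duplicates


def same_order(id_lists):
    """True when every file lists the same IDs in the same order."""
    first = id_lists[0]
    return all(ids == first for ids in id_lists[1:])


if __name__ == "__main__":
    # Invented IDs, for illustration only.
    print(check_unique_ids(["A1", "B2", "", "A1"]))  # blank at row 3, A1 repeated
    print(same_order([["A1", "B2"], ["B2", "A1"]]))  # order differs between files
```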
If your data has two channels, from Step 6:
9. Select how you want your data represented:
Intensities
Data for each channel will be stored separately in GeneSifter.
Ratios
Generates a ratio of the intensities of the red and green channels. GeneSifter only saves the ratio to your account.
10. Provide the column number or letter that contains the gene ID.
11. Identify the type of gene ID by selecting either Auto Detect, Accession Number, IMAGE Clone ID, or Other. If Accession Number or IMAGE Clone ID is selected and the data file contains other identifiers or blank rows, errors may occur. In general, Auto Detect should be used.
12. Optionally, indicate the column number for gene annotation, if available.
13. Indicate the row number where the intensity data begins. Do not include the column headings.
14. Provide the column numbers that contain the respective data. For example, if your data file contains the cy3 intensity in column 8 and column 10 contains the background, enter 8 for Col 1 and 10 for Col 2 to specify these values for the cy3 intensities.
Op refers to the operation to perform between the two values. Op may be used to subtract background, or take the ratio of foreground/background for quality control purposes. It is not necessary to enter any value for Op, or Col 2.
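The role of Col 1, Col 2 and Op can be illustrated with a short Python sketch. It is a simplified model of the two operations mentioned above (background subtraction and foreground/background ratio), not FlexLoad's implementation. The function name and the "-" and "/" symbols used to select the operation are assumptions made for the example.

```python
def apply_op(col1, col2=None, op=None):
    """Combine the Col 1 value with the optional Col 2 value.

    op "-" subtracts Col 2 from Col 1 (e.g. background subtraction).
    op "/" divides Col 1 by Col 2 (e.g. foreground/background ratio).
    With no Op or no Col 2, the Col 1 value is used unchanged.
    """
    if op is None or col2 is None:
        return col1
    if op == "-":
        return col1 - col2
    if op == "/":
        if col2 == 0:
            raise ValueError("Col 2 value is zero; ratio is undefined")
        return col1 / col2
    raise ValueError(f"unsupported operation: {op!r}")


if __name__ == "__main__":
    # Invented values: foreground intensity 500, background 120.
    print(apply_op(500.0, 120.0, "-"))  # background-subtracted intensity
    print(apply_op(500.0))              # Col 1 only, no operation
```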
This window appears only if you chose Ratios in Step 9.
15. Perform LOWESS Normalization on the data.
16. Select how you want to normalize the data.
17. Select the method for calculating the ratio of intensities. Per file basis allows you to take into account any dye-swap experiments you may have.
18. If your targets already exist, you can select them from the pull-down menu. If you need to create new targets, use Advanced Settings.
19. Select Browse to upload the file(s). If you were uploading more files, additional rows would be present. In this screen, two files are being uploaded.
20. For two-color arrays, if Per file basis was selected in step 17, indicate whether you want the ratios to be cy5/cy3 (5/3) or cy3/cy5 (3/5).
21. A summary of parameters that have been previously selected can be viewed.
22. Select Advanced Settings if you need to enter the target and condition information, or to change the number of files being loaded.
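To show why a per-file ratio direction matters for dye-swap experiments (steps 17 and 20), here is a small Python sketch. It is a simplified illustration rather than GeneSifter's calculation. The function name and intensity values are invented, and the "5/3" and "3/5" strings simply mirror the labels used in step 20.

```python
def intensity_ratio(cy5, cy3, direction="5/3"):
    """Return the channel ratio in the requested direction.

    direction "5/3" gives cy5/cy3; "3/5" gives cy3/cy5.
    """
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive to form a ratio")
    if direction == "5/3":
        return cy5 / cy3
    if direction == "3/5":
        return cy3 / cy5
    raise ValueError(f"unknown direction: {direction!r}")


if __name__ == "__main__":
    # File 1: sample labeled with cy5, reference with cy3.
    print(intensity_ratio(cy5=800.0, cy3=200.0, direction="5/3"))
    # File 2 (dye swap): sample labeled with cy3, reference with cy5.
    # Choosing 3/5 for this file keeps the ratio as sample/reference.
    print(intensity_ratio(cy5=200.0, cy3=800.0, direction="3/5"))
```

In this example both files yield the same sample/reference ratio once the direction is set per file.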
23. Upon selecting Advanced Settings, you can change the number of files to be loaded.
24. You also have the ability to “Create New Targets/Conditions”. Enter target and condition information for each file to be loaded in the top portion of the screen. Otherwise, select “Use Pre-existing Targets”.
25. You can save the output as a text file formatted for Batch Upload by selecting Save as File or load it directly into GeneSifter by selecting Upload Files.
Advanced Upload Methods
Robust multi-array average (RMA) is a method for deriving expression measurements from the probe level data contained in an Affymetrix CEL file.
GeneSifter users have the option to perform RMA or GC-RMA during the upload of CEL files. The normalized data is saved in the user’s account for further analysis.
1. Locate the Affymetrix .CEL files you want to load on your local computer.
2. All the files to be uploaded and transformed using RMA need to be compressed into a single ZIP file for upload.
3. Select Upload Tools from the Control Panel.
4. Click on Run Advanced Upload Methods to begin.
5. Select the normalization method. Select the type of Affymetrix array used in your experiments. If your array is not listed, please contact scientific support (support@genesifter.net) for help loading your array format.
6. Click the Next button to continue.
7. Click Browse to locate the .zip archive containing the CEL files on your local disk. Please note that all the files to be loaded need to be contained within a single .zip file.
8. Choose how you want to create targets and conditions for the loaded files.
9. Click Next to continue.
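Any archiving tool can create the single .zip file required for this upload. As one option, the Python sketch below collects the .CEL files from a folder into one archive using the standard library. The function name and the folder and archive names in the usage example are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_cel_files(folder, archive_path):
    """Compress every .CEL file in folder into a single ZIP archive.

    Returns the list of file names that were added. Files are stored at the
    top level of the archive, without their folder path.
    """
    folder = Path(folder)
    cel_files = sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == ".cel"
    )
    if not cel_files:
        raise FileNotFoundError(f"no .CEL files found in {folder}")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in cel_files:
            archive.write(path, arcname=path.name)
    return [path.name for path in cel_files]


if __name__ == "__main__":
    # Hypothetical folder and archive names.
    added = zip_cel_files("cel_data", "cel_upload.zip")
    print(f"Added {len(added)} CEL files to cel_upload.zip")
```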