Difference between revisions of "Dramir36 Week 7"

From LMU BioDB 2019
(Notes/Methods/Results: added strain and time points that are used)
(moved answer to time point and strain used)

Revision as of 16:21, 15 October 2019


Purpose

  • to conduct the "analyze" step of the data life cycle for a DNA microarray dataset.
  • to develop an intuition about what different p-value cut-offs mean.
  • to keep a detailed electronic laboratory notebook to facilitate reproducible research.
  • to revisit the "Deception at Duke" case with new insights because you have analyzed your own dataset.

Background

This is a list of steps required to analyze DNA microarray data.

  1. Quantitate the fluorescence signal in each spot
  2. Calculate the ratio of red/green fluorescence
  3. Log2 transform the ratios
    • Steps 1-3 have been performed for you by the GenePix Pro software (which runs the microarray scanner).
  4. Normalize the ratios on each microarray slide
  5. Normalize the ratios for a set of slides in an experiment
  6. Perform statistical analysis on the ratios
  7. Compare individual genes with known data
    • Steps 6-7 are performed in Microsoft Excel
  8. Pattern finding algorithms (clustering)
  9. Map onto biological pathways
    • We will use software called STEM for the clustering and mapping
  10. Identifying regulatory transcription factors responsible for observed changes in gene expression
  11. Dynamical systems modeling of the gene regulatory network (GRNmap)
  12. Viewing modeling results in GRNsight
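Steps 2 and 3 of the list above (the red/green ratio and its log2 transform) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `log2_ratio` and the intensity values are made up for the example, and in this assignment those steps were already done by the GenePix Pro software.

```python
import math


def log2_ratio(red, green):
    """Return log2(red / green) for one spot (steps 2 and 3 above)."""
    if red <= 0 or green <= 0:
        raise ValueError("fluorescence intensities must be positive")
    return math.log2(red / green)


# Hypothetical intensities, not taken from the dHAP4 dataset.
print(log2_ratio(2000.0, 1000.0))  # twice as much red as green -> 1.0
print(log2_ratio(500.0, 1000.0))   # half as much red as green -> -1.0
print(log2_ratio(1000.0, 1000.0))  # equal signal -> 0.0
```

A positive value means more red than green signal, a negative value means less, and zero means no change, which is why the later statistics ask whether a value is different from zero.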

Notes/Methods/Results

The strain that will be analyzed is dHAP4. There are four replicates for the 15, 30, and 60 minute time points, but only three replicates for the 90 and 120 minute time points.
  • T test: is this gene expression change significantly different than zero at a time point?
    • p < 0.05: there is less than a 5% probability that you would have seen a change at least this big by chance.
  • ANOVA: is the gene expression significantly different than zero at any time point?
  • A gene with values below 0.25 should be considered to have no change in expression
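To make the T test question above concrete, here is a minimal sketch of a one-sample t test against zero for the replicates of one gene at one time point. It is not the Excel procedure from the assignment: the function names and the replicate values are hypothetical, and instead of computing an exact p-value it compares the t statistic to the standard two-tailed critical values for p = 0.05 (3.182 for four replicates, 4.303 for three).

```python
import math
import statistics

# Two-tailed critical t values at p = 0.05, keyed by degrees of freedom (n - 1).
# Only the two cases needed here: 4 replicates (df = 3) and 3 replicates (df = 2).
T_CRIT_05 = {2: 4.303, 3: 3.182}


def t_statistic(values):
    """One-sample t statistic testing whether the mean differs from zero."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return mean / (sd / math.sqrt(n))


def significant_at_05(values):
    """True if |t| exceeds the two-tailed critical value for p = 0.05."""
    df = len(values) - 1
    return abs(t_statistic(values)) > T_CRIT_05[df]


# Hypothetical log2 fold changes, not taken from the dHAP4 dataset.
consistent_change = [1.2, 0.9, 1.1, 1.0]   # four replicates, all near 1
noisy_no_change = [0.3, -0.1, 0.2]         # three replicates scattered around 0

print(significant_at_05(consistent_change))  # True
print(significant_at_05(noisy_no_change))    # False
```

With only three replicates the critical value is higher, so the 90 and 120 minute time points need a larger or more consistent change to reach the same cut-off.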

Experimental Design and Getting Ready

The data used in this exercise is publicly available at the NCBI GEO database in record GSE83656.

  • Begin by downloading the Excel file for dHAP4, found in the "Data/Files" section of this page
  • In the Excel spreadsheet, there is a worksheet labeled "Master_Sheet_dHAP4"
    • In this worksheet, each row contains the data for one gene (one spot on the microarray).
    • The first column contains the "MasterIndex", which numbers all of the rows sequentially in the worksheet so that we can always use it to sort the genes into the order they were in when we started.
    • The second column (labeled "ID") contains the Systematic Name (gene identifier) from the Saccharomyces Genome Database.
    • The third column contains the Standard Name for each of the genes.
    • Each subsequent column contains the log2 ratio of the red/green fluorescence from each microarray hybridized in the experiment (steps 1-5 above having been performed for you already), for each strain starting with wild type and proceeding in alphabetical order by strain deletion.
    • Each of the column headings from the data begins with the experiment name ("wt" for wild type S. cerevisiae data, "dCIN5" for the Δcin5 data, etc.). "LogFC" stands for "Log2 Fold Change" which is the Log2 red/green ratio. The timepoints are designated as "t" followed by a number in minutes. Replicates are numbered as "-0", "-1", "-2", etc. after the timepoint.
      • The timepoints are t15, t30, t60 (cold shock at 13°C) and t90 and t120 (cold shock at 13°C followed by 30 or 60 minutes of recovery at 30°C).
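The column heading convention described above can be checked with a short Python sketch that splits a heading into strain, time point, and replicate, and then counts replicates per time point. The exact heading format is an assumption: the underscores in `dHAP4_LogFC_t15-0` and the function names are hypothetical, so the pattern may need adjusting to match the headings actually in the worksheet.

```python
import re
from collections import Counter

# Assumed heading shape: <strain>_LogFC_t<minutes>-<replicate>
# (the underscore separators are a guess, not confirmed by the worksheet).
HEADER = re.compile(r"^(?P<strain>[A-Za-z0-9]+)_LogFC_t(?P<minutes>\d+)-(?P<rep>\d+)$")


def parse_header(header):
    """Split one column heading into (strain, minutes, replicate)."""
    match = HEADER.match(header)
    if match is None:
        raise ValueError(f"unrecognized column heading: {header!r}")
    return match.group("strain"), int(match.group("minutes")), int(match.group("rep"))


def replicates_per_timepoint(headers, strain):
    """Count how many replicate columns each time point has for one strain."""
    counts = Counter()
    for header in headers:
        name, minutes, _ = parse_header(header)
        if name == strain:
            counts[minutes] += 1
    return dict(counts)


# Hypothetical headings built to match the replicate counts recorded in the notes.
headers = [
    f"dHAP4_LogFC_t{minutes}-{rep}"
    for minutes, n_reps in [(15, 4), (30, 4), (60, 4), (90, 3), (120, 3)]
    for rep in range(n_reps)
]

print(parse_header("dHAP4_LogFC_t15-0"))
print(replicates_per_timepoint(headers, "dHAP4"))
```

Counting columns this way is a quick cross-check on the replicate numbers recorded by hand for each time point.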

Data/Files

Conclusion

Acknowledgments

  • Copied purpose, methods, and procedure from Week 7 assignment page to individual journal and modified steps to relate to the dHAP4 data

References