Johnllopez Week 10
From LMU BioDB 2017
Revision as of 20:05, 6 November 2017 by Johnllopez616 (Uploaded my GO and Gene tables, powerpoint, spreadsheets, and .txt files)
Electronic Lab Notebook
Preparing My Microarray Data File for Loading into STEM
- I started this portion by downloading the following spreadsheets. I added a new worksheet named dSWI4_stem, selected the values from dSWI4_ANOVA, and copied them into the new worksheet.
- I modified the new worksheet further by renaming the "Master_Index" column header to "SPOT" and the "ID" column header to "Gene Symbol", and by deleting the "Standard_Name" column.
- I filtered the B-H corrected p-value column to show only values greater than 0.05, then deleted all of the filtered rows while leaving the header row in place. Once I removed the filter, every gene remaining in the data set had a B-H corrected p-value of < 0.05. The result was 2794 genes remaining.
- I then deleted all of the columns except for the Average Log Fold Change columns for each timepoint, and renamed those columns with just the time and units. These columns are what STEM analyzes across the timepoints later on.
- In addition, to avoid complications with the STEM software, I replaced every #DIV/0! error value with a blank string. There were 40 replacements made.
- I saved the spreadsheet as usual, then saved it as a .txt file, which you can see here.
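The spreadsheet steps above can also be sketched in code. This is only an illustration, not the procedure actually used for this notebook: the column names `BH_p`, `Avg_LogFC_t15`, and `Avg_LogFC_t30`, the helper function names, and the two sample rows are hypothetical stand-ins, and a tab-delimited layout is assumed for the .txt file.

```python
import csv
import io

P_CUTOFF = 0.05  # keep genes with a B-H corrected p-value below this


def prepare_stem_rows(rows, p_column, keep_columns, renames):
    """Keep significant genes and only the columns wanted in the STEM file."""
    table = [[renames.get(col, col) for col in keep_columns]]
    for row in rows:
        if float(row[p_column]) >= P_CUTOFF:
            continue  # drop genes that are not significant
        # Replace spreadsheet division errors with an empty cell.
        table.append(["" if row[col] == "#DIV/0!" else row[col]
                      for col in keep_columns])
    return table


def to_tab_delimited(table):
    """Serialize the table as tab-delimited text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(table)
    return buffer.getvalue()


# Toy input with made-up values; the column names are assumptions.
SAMPLE = (
    "Master_Index\tID\tStandard_Name\tBH_p\tAvg_LogFC_t15\tAvg_LogFC_t30\n"
    "1\tYAL001C\tTFC3\t0.01\t0.52\t#DIV/0!\n"
    "2\tYAL002W\tVPS8\t0.20\t-0.10\t0.05\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
stem_rows = prepare_stem_rows(
    reader,
    p_column="BH_p",
    keep_columns=["Master_Index", "ID", "Avg_LogFC_t15", "Avg_LogFC_t30"],
    renames={"Master_Index": "SPOT", "ID": "Gene Symbol",
             "Avg_LogFC_t15": "15m", "Avg_LogFC_t30": "30m"},
)
stem_text = to_tab_delimited(stem_rows)
print(stem_text)
```

In this toy input only the first gene passes the cutoff, and its #DIV/0! value is written out as an empty cell.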
Downloading and Extracting STEM Software / Running STEM
- I downloaded the STEM software from the following link, extracted the file, and clicked on the .jar program within it: http://www.cs.cmu.edu/~jernst/stem/
- Before I ran the software using the .txt file, I changed several settings. For the expression data info, I uploaded the .txt file, selected "no normalization" and "spot ID's included in file".
- In the gene info section, I selected "SGD" for the Gene Annotation Source, "no cross references", and "no gene locations". This ensured that the data would only come from SGD and be specialized for yeast.
- Finally, before executing the file, I made sure the clustering method was "STEM Clustering Method".
Viewing and Saving STEM Results
- After changing the Interface Options to say "X-axis scale should be based on real time", I took a screenshot of the "All STEM Profiles(1)" window and placed it into the PowerPoint given in the next step.
- The following PowerPoint contains screenshots of each of the individual colored profile boxes. A colored box means that the profile has a statistically significant number of genes assigned to it.
- This .zip file contains the gene tables for each individual significant profile.
- This .zip file contains the gene ontology term tables for each individual significant profile.
Analyzing and Interpreting STEM Results
- I chose profile 36 to answer the following questions. I found 36 to be the most interesting because of the drastic expression changes at 30, 90, and 120 minutes, where the expression change goes from positive, to negative, then to positive again.
- 55 genes belong to this profile.
- 30.5 genes were expected to belong to this profile.
- The p-value for the enrichment of genes is 3.5E-5, or 0.000035.
- After filtering the Gene Ontology terms associated with profile 36 to have a p-value > 0.05, I found 88 terms.
- After filtering the GO terms associated with profile 36 to have a corrected p-value > 0.05, I found 133 terms.
- I then selected the following terms from my filtered list: "regulation of metabolic process", "catalytic activity, acting on RNA", "hydrolase activity, acting on ester bonds", "organelle part", "transcription, DNA-templated", and "response to chemical".
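As a quick check on the numbers reported above for profile 36, the ratio of assigned to expected genes can be computed directly. This is a minimal sketch: the function name `fold_enrichment` is a hypothetical helper, and the ratio is simple arithmetic on the two reported counts rather than a value read from the STEM output.

```python
def fold_enrichment(assigned, expected):
    """Ratio of genes assigned to a profile to genes expected by chance."""
    return assigned / expected


# Values reported for profile 36 in this notebook.
profile_36 = {"assigned": 55, "expected": 30.5, "p_value": 3.5e-5}

ratio = fold_enrichment(profile_36["assigned"], profile_36["expected"])
print(f"{ratio:.2f}x")                  # prints 1.80x
print(f"{profile_36['p_value']:.6f}")   # prints 0.000035
```

Under this reading, profile 36 holds roughly 1.8 times as many genes as expected by chance.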
GO Definitions
- Regulation of Metabolic Process: Any process that modulates the frequency, rate or extent of the chemical reactions and pathways within a cell or an organism.
- Catalytic Activity, Acting on RNA: Catalytic activity that acts to modify RNA.
- Hydrolase Activity, Acting on Ester Bonds: Catalysis of the hydrolysis of any ester bond.
- Organelle Part: Any constituent part of an organelle, an organized structure of distinctive morphology and function. Includes constituent parts of the nucleus, mitochondria, plastids, vacuoles, vesicles, ribosomes and the cytoskeleton, but excludes the plasma membrane.
- Transcription, DNA-templated: The cellular synthesis of RNA on a template of DNA.
- Response to Chemical: Any process that results in a change in state or activity of a cell or an organism (in terms of movement, secretion, enzyme production, gene expression, etc.) as a result of a chemical stimulus.
Summary
Acknowledgements and References
Acknowledgements
References
- http://amigo.geneontology.org/amigo/term/GO:0019222
- http://amigo.geneontology.org/amigo/term/GO:0006351
- http://amigo.geneontology.org/amigo/term/GO:0044422
- http://amigo.geneontology.org/amigo/term/GO:0016788
- http://amigo.geneontology.org/amigo/term/GO:0140098
- http://amigo.geneontology.org/amigo/term/GO:0042221