Ntesfaio Week 9
Purpose
The purpose of this week's assignment is to continue the Week 8 analysis of a microarray dataset and to build on the electronic notebook.
Methods
Viewing and Saving STEM Results
powerpoint with all screenshots
- A new window will open called "All STEM Profiles (1)".
- Each box corresponds to a model expression profile. Colored profiles have a statistically significant number of genes assigned; they are arranged in order from most to least significant p value. Profiles with the same color belong to the same cluster of profiles. The number in each box is simply an ID number for the profile.
- Click on the button that says "Interface Options...". At the bottom of the Interface Options window that appears below where it says "X-axis scale should be:", click on the radio button that says "Based on real time". Then close the Interface Options window.
- Take a screenshot of this window (on a PC, simultaneously press the Alt and PrintScreen buttons to save the view in the active window to the clipboard) and paste it into a PowerPoint presentation to save your figures.
- Click on each of the SIGNIFICANT profiles (the colored ones) to open a window showing a more detailed plot containing all of the genes in that profile.
- Take a screenshot of each of the individual profile windows and save the images in your PowerPoint presentation.
- At the bottom of each profile window, there are two yellow buttons "Profile Gene Table" and "Profile GO Table". For each of the profiles, click on the "Profile Gene Table" button to see the list of genes belonging to the profile. In the window that appears, click on the "Save Table" button and save the file to your desktop. Make your filename descriptive of the contents, e.g. "wt_profile#_genelist.txt", where you replace the number symbol with the actual profile number.
- Upload these files to the wiki and link to them on your individual journal page. (Note that it will be easier to zip all the files together and upload them as one file).
- For each of the significant profiles, click on the "Profile GO Table" to see the list of Gene Ontology terms belonging to the profile. In the window that appears, click on the "Save Table" button and save the file to your desktop. Make your filename descriptive of the contents, e.g. "wt_profile#_GOlist.txt", where you use "wt", "dGLN3", etc. to indicate the dataset and where you replace the number symbol with the actual profile number. At this point you have saved all of the primary data from the STEM software and it's time to interpret the results!
- Upload these files to the wiki and link to them on your individual journal page. (Note that it will be easier to zip all the files together and upload them as one file).
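The zipping step mentioned above can also be scripted. The following is a minimal sketch, assuming the saved tables follow the "wt_profile#_genelist.txt" / "wt_profile#_GOlist.txt" naming convention from the steps above. The function name `zip_stem_tables`, the archive name, and the glob pattern are hypothetical, and the demonstration uses placeholder files in a temporary folder rather than real STEM output.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_stem_tables(folder, archive_name, pattern="*_profile*_*list.txt"):
    """Bundle every saved table in `folder` that matches `pattern` into one zip."""
    folder = Path(folder)
    archive_path = folder / archive_name
    # Sort so the archive order is stable from run to run.
    tables = sorted(folder.glob(pattern))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for table in tables:
            # arcname drops the folder part so the zip holds bare file names.
            archive.write(table, arcname=table.name)
    return archive_path


# Demonstration in a throwaway folder with placeholder files (not real STEM output).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("wt_profile45_genelist.txt", "wt_profile45_GOlist.txt"):
        (Path(tmp) / name).write_text("placeholder\n")
    archive_path = zip_stem_tables(tmp, "wt_STEM_tables.zip")
    with zipfile.ZipFile(archive_path) as archive:
        archived_names = sorted(archive.namelist())
    print(archived_names)
```

To use it on real files, call `zip_stem_tables` with the folder where the tables were saved (for example the desktop) and upload the resulting zip to the wiki.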
Analyzing and Interpreting STEM results
Select one of the profiles you saved in the previous step for further interpretation of the data. I suggest that you choose one that has a pattern of up- or down-regulated genes at the cold shock timepoints. Each member of your group should choose a different profile. Answer the following:
I chose to analyze STEM profile 45.
Why did you select this profile? In other words, why was it interesting to you?
I selected this profile because it showed an incline, then stayed steady for a while before declining.
How many genes belong to this profile?
354.0
How many genes were expected to belong to this profile?
44.3 genes were expected
What is the p value for the enrichment of genes in this profile?
p value of 2.9E-201 (significant)
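As a quick check on how far the observed count is from the expected count, the two numbers reported above can be turned into a fold enrichment. This is only an arithmetic sketch: the function name `fold_enrichment` is hypothetical, and the ratio is a descriptive summary, not the way STEM computes its p value.

```python
def fold_enrichment(observed, expected):
    """Ratio of genes assigned to a profile to the number expected by chance."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return observed / expected


# Values reported above for profile 45.
profile_45 = fold_enrichment(354.0, 44.3)
print(f"{profile_45:.2f}")  # prints 7.99
```

Roughly eight times as many genes were assigned to this profile as expected, which is consistent with the very small p value.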
- Bear in mind that we just finished computing p values to determine whether each individual gene had a significant change in gene expression at each time point. This p value determines whether the number of genes that show this particular expression profile across the time points is significantly more than expected.
- Open the GO list file you saved for this profile in Excel. This list shows all of the Gene Ontology terms that are associated with genes that fit this profile. Select the third row and then choose from the menu Data > Filter > Autofilter. Filter on the "p-value" column to show only GO terms that have a p value of < 0.05.
How many GO terms are associated with this profile at p < 0.05?
64
- The GO list also has a column called "Corrected p-value". This correction is needed because the software has performed thousands of significance tests. Filter on the "Corrected p-value" column to show only GO terms that have a corrected p value of < 0.05.
How many GO terms are associated with this profile with a corrected p value < 0.05?
30
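The two Excel filters above can be reproduced in a short script. The sketch below assumes the saved GO list is tab-delimited, that a few preamble lines may precede the header row, and that some cells may be written as an upper bound such as "<0.001". The column names "p-value" and "Corrected p-value" come from the steps above; the "Term" column, the function names, and the sample rows are made up for illustration and are not real STEM output.

```python
import csv
import io


def parse_p(text):
    """Convert a p-value cell to float; a leading '<' is treated as an upper bound."""
    return float(text.strip().lstrip("<"))


def filter_go_terms(table_text, column, cutoff=0.05):
    """Return the rows (as dicts) whose value in `column` is below `cutoff`."""
    lines = table_text.splitlines()
    # Skip any preamble: the header is taken to be the first line naming the column.
    header_index = next(
        i for i, line in enumerate(lines) if column in line.split("\t")
    )
    reader = csv.DictReader(
        io.StringIO("\n".join(lines[header_index:])), delimiter="\t"
    )
    return [row for row in reader if parse_p(row[column]) < cutoff]


# Illustrative table only; the terms and numbers are invented.
sample = "\n".join([
    "Illustrative preamble line",
    "",
    "Term\tp-value\tCorrected p-value",
    "TERM_A\t1.0E-6\t<0.001",
    "TERM_B\t0.03\t0.40",
    "TERM_C\t0.20\t1.0",
])

raw_hits = filter_go_terms(sample, "p-value")
corrected_hits = filter_go_terms(sample, "Corrected p-value")
print(len(raw_hits), len(corrected_hits))
```

On a real file, read its contents with `open(path).read()` and pass the text in place of `sample`. The lengths of the two returned lists correspond to the two counts asked for above.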
Select the top 6 Gene Ontology terms from your filtered list (either p < 0.05 or corrected p < 0.05).
Top 6 for corrected p < 0.05 are:
GO:0005730
GO:0005355
GO:1904659
GO:0006351
GO:0015761
GO:0015755
Note whether the same GO terms are showing up in multiple clusters.
- Look up the definitions for each of the terms at http://geneontology.org. In your research presentation, you will discuss the biological interpretation of these GO terms.
In other words, why does the cell react to cold shock by changing the expression of genes associated with these GO terms? Also, what does this have to do with the transcription factor being deleted (for the groups working with deletion strain data)?
- To easily look up the definitions, go to http://geneontology.org.
- Copy and paste the GO ID (e.g. GO:0044848) into the search field on the left of the page.
- On the results page, click on the button that says "Link to detailed information about <term>" (in this case, "biological phase"). The definition will be on the next results page.
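Looking up six terms by hand is manageable, but the lookup links can also be generated. The sketch below checks that each ID has the standard form (the prefix "GO:" followed by seven digits) and builds a term-page link for it. The AmiGO URL pattern used here is an assumption that should be verified in a browser, and the function name `go_term_url` is hypothetical.

```python
import re

# A GO identifier is "GO:" followed by exactly seven digits.
GO_ID_PATTERN = re.compile(r"GO:\d{7}")

# Assumed term-page URL pattern; confirm that it resolves before relying on it.
AMIGO_TERM_URL = "http://amigo.geneontology.org/amigo/term/{go_id}"


def go_term_url(go_id):
    """Validate a GO ID and return the assumed term-page URL for it."""
    go_id = go_id.strip()
    if not GO_ID_PATTERN.fullmatch(go_id):
        raise ValueError(f"not a valid GO ID: {go_id!r}")
    return AMIGO_TERM_URL.format(go_id=go_id)


# The six terms selected above.
top_terms = [
    "GO:0005730",
    "GO:0005355",
    "GO:1904659",
    "GO:0006351",
    "GO:0015761",
    "GO:0015755",
]

urls = [go_term_url(term) for term in top_terms]
for url in urls:
    print(url)
```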
GO:0005730
Name: nucleolus
Definition: A small, dense body one or more of which are present in the nucleus of eukaryotic cells. It is rich in RNA and protein, is not bounded by a limiting membrane, and is not seen during mitosis. Its prime function is the transcription of the nucleolar DNA into 45S ribosomal-precursor RNA, the processing of this RNA into 5.8S, 18S, and 28S components of ribosomal RNA, and the association of these components with 5S RNA and proteins synthesized outside the nucleolus. This association results in the formation of ribonucleoprotein precursors; these pass into the cytoplasm and mature into the 40S and 60S subunits of the ribosome.
GO:0005355
Name: glucose transmembrane transporter activity
Definition: Enables the transfer of the hexose monosaccharide glucose from one side of a membrane to the other.
GO:1904659
Name: glucose transmembrane transport
Definition: The process in which glucose is transported across a membrane.
GO:0006351
Name: transcription, DNA-templated
Definition: The cellular synthesis of RNA on a template of DNA.
GO:0015761
Name: mannose transmembrane transport
Definition: The process in which mannose is transported across a lipid bilayer, from one side of a membrane to the other. Mannose is the aldohexose manno-hexose, the C-2 epimer of glucose. The D-(+)-form is widely distributed in mannans and hemicelluloses and is of major importance in the core oligosaccharide of N-linked oligosaccharides of glycoproteins.
GO:0015755
Name: fructose transmembrane transport
Definition: The directed movement of fructose into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. Fructose exists in a open chain form or as a ring compound. D-fructose is the sweetest of the sugars and is found free in a large number of fruits and honey.
Using YEASTRACT to Infer which Transcription Factors Regulate a Cluster of Genes (Tuesday, October 29)
How many transcription factors are green or "significant"?
22
Are CIN5, GLN3, and/or HAP4 on the list? If so, what is their "% in user set", "% in YEASTRACT", and "p value".
Cin5p is 13.51% in user set and 2.61% in scerevisiae. The p value is 9.74 x 10^-13.
Gln3p is 38.79% in user set and 5.60% in scerevisiae. The p value is 1.92 x 10^-13.
Hap4p is 15.80% in user set and 5.04% in scerevisiae. The p value is 5.58 x 10^-14.
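One way to compare the three transcription factors is to divide each "% in user set" by the corresponding background percentage (the second percentage reported for each factor above). The sketch below does only that arithmetic on the reported values. The ratio is a descriptive summary chosen here for illustration and is not a statistic that YEASTRACT reports; the variable names are hypothetical.

```python
# (% in user set, background %) as reported above for each transcription factor.
yeastract_rows = {
    "Cin5p": (13.51, 2.61),
    "Gln3p": (38.79, 5.60),
    "Hap4p": (15.80, 5.04),
}

# Over-representation ratio: how much more common targets are in the user set.
ratios = {
    tf: round(user_pct / background_pct, 2)
    for tf, (user_pct, background_pct) in yeastract_rows.items()
}

# Rank from most to least over-represented.
ranked = sorted(ratios, key=ratios.get, reverse=True)

for tf in ranked:
    print(f"{tf}: {ratios[tf]:.2f}x")
```

By this measure Gln3p targets are the most over-represented in the cluster (about 6.9 times the background), followed by Cin5p (about 5.2 times) and Hap4p (about 3.1 times).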
Visualizing Your Gene Regulatory Networks with GRNsight
References
Excel
STEM: Short Time-series Expression Miner. Retrieved on October 28, 2019 from http://www.cs.cmu.edu/~jernst/stem/
Google Powerpoint Slides
Data and Files
powerpoint with all screenshots
Acknowledgements
My homework partners this week are Aby (User:Ymesfin) and David (User:Dramir36). We sat together in class to go over the assignment.
Methods were copied from Week 9