Ntesfaio Week 9

From LMU BioDB 2019
Revision as of 14:47, 29 October 2019 by Ntesfaio (talk | contribs) (Methods: added new section)

Purpose

The purpose of this week's assignment is to continue the microarray dataset analysis begun in Week 8 and to build on the electronic notebook.

Methods

Viewing and Saving Stem Results

NtesfaioStem profiles pic corrected.jpg

powerpoint with all screenshots

dHAP4 gene list

dHAP4 GO list

  • A new window will open called "All STEM Profiles (1)".
  • Each box corresponds to a model expression profile. Colored profiles have a statistically significant number of genes assigned; they are arranged in order from most to least significant p value. Profiles with the same color belong to the same cluster of profiles. The number in each box is simply an ID number for the profile.
  • Click on the button that says "Interface Options...". At the bottom of the Interface Options window that appears below where it says "X-axis scale should be:", click on the radio button that says "Based on real time". Then close the Interface Options window.

Take a screenshot of this window (on a PC, simultaneously press the Alt and PrintScreen buttons to save the view in the active window to the clipboard) and paste it into a PowerPoint presentation to save your figures.

  • Click on each of the SIGNIFICANT profiles (the colored ones) to open a window showing a more detailed plot containing all of the genes in that profile.
  • Take a screenshot of each of the individual profile windows and save the images in your PowerPoint presentation.
  • At the bottom of each profile window, there are two yellow buttons "Profile Gene Table" and "Profile GO Table". For each of the profiles, click on the "Profile Gene Table" button to see the list of genes belonging to the profile. In the window that appears, click on the "Save Table" button and save the file to your desktop. Make your filename descriptive of the contents, e.g. "wt_profile#_genelist.txt", where you replace the number symbol with the actual profile number.
  • Upload these files to the wiki and link to them on your individual journal page. (Note that it will be easier to zip all the files together and upload them as one file).
  • For each of the significant profiles, click on the "Profile GO Table" to see the list of Gene Ontology terms belonging to the profile. In the window that appears, click on the "Save Table" button and save the file to your desktop. Make your filename descriptive of the contents, e.g. "wt_profile#_GOlist.txt", where you use "wt", "dGLN3", etc. to indicate the dataset and where you replace the number symbol with the actual profile number. At this point you have saved all of the primary data from the STEM software and it's time to interpret the results!
  • Upload these files to the wiki and link to them on your individual journal page. (Note that it will be easier to zip all the files together and upload them as one file).
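The zipping step mentioned above could be scripted instead of done by hand. This is only a minimal sketch: the function name `zip_tables`, the archive name, and the file name pattern are my own assumptions based on the naming scheme suggested in the protocol, not something STEM provides.

```python
import zipfile
from pathlib import Path


def zip_tables(folder, archive_name, pattern="*_profile*_*list.txt"):
    """Bundle every saved table in `folder` that matches `pattern` into one zip file.

    The default pattern assumes names such as "dHAP4_profile45_genelist.txt"
    (a hypothetical example following the protocol's naming suggestion).
    Returns the names of the files that were added.
    """
    folder = Path(folder)
    files = sorted(folder.glob(pattern))
    with zipfile.ZipFile(folder / archive_name, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            # Store only the file name so the archive has no desktop path in it.
            zf.write(path, arcname=path.name)
    return [path.name for path in files]
```

For example, `zip_tables("Desktop/stem_results", "dHAP4_stem_tables.zip")` would collect the gene list and GO list tables in that (hypothetical) folder into a single file for upload.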

Analyzing and Interpreting STEM results

Select one of the profiles you saved in the previous step for further interpretation of the data. I suggest that you choose one that has a pattern of up- or down-regulated genes at the cold shock timepoints. Each member of your group should choose a different profile. Answer the following:

I chose to analyze STEM profile 45.

Why did you select this profile? In other words, why was it interesting to you?

I selected this profile because it showed an incline, then stayed steady for a while before declining.

How many genes belong to this profile?

354 genes

How many genes were expected to belong to this profile?

44.3 genes were expected

What is the p value for the enrichment of genes in this profile?

p value of 2.9E-201 (significant)
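To put the observed and expected gene counts side by side, a quick ratio can be computed. This is only a back-of-the-envelope sketch using the two numbers reported above; the function name `fold_enrichment` is my own, and this ratio is not how STEM computes its p value.

```python
def fold_enrichment(observed, expected):
    """Return how many times more genes were assigned than expected."""
    return observed / expected


# Numbers reported for profile 45 in this notebook.
ratio = fold_enrichment(354, 44.3)
print(f"{ratio:.1f}-fold more genes than expected")  # prints "8.0-fold more genes than expected"
```

Roughly eight times as many genes were assigned to the profile as expected, which fits with the very small p value.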

  • Bear in mind that we just finished computing p values to determine whether each individual gene had a significant change in gene expression at each time point. This p value determines whether the number of genes that show this particular expression profile across the time points is significantly more than expected.
  • Open the GO list file you saved for this profile in Excel. This list shows all of the Gene Ontology terms that are associated with genes that fit this profile. Select the third row and then choose from the menu Data > Filter > Autofilter. Filter on the "p-value" column to show only GO terms that have a p value of < 0.05.
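The Excel filter described above could also be done in a short script. This is a sketch under assumptions: I assume the saved GO table is tab-delimited with a header row containing a "p-value" column, that any extra rows above the header can be skipped with `skip_rows`, and that a cell such as "<0.001" can be read as 0.001. The function names are my own.

```python
import csv
import io


def parse_p(text):
    """Convert a p-value cell to a float; a cell like '<0.001' is read as 0.001."""
    return float(text.strip().lstrip("<"))


def filter_go_terms(table_text, column="p-value", cutoff=0.05, skip_rows=0):
    """Return the rows of a tab-delimited GO table whose `column` is below `cutoff`.

    `skip_rows` is the number of lines above the header row (an assumption;
    check the saved file to see where the header actually is).
    """
    lines = table_text.splitlines()[skip_rows:]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    return [row for row in reader if parse_p(row[column]) < cutoff]
```

Calling `len(filter_go_terms(text))` on the contents of the saved file would give the count of GO terms at p < 0.05, and passing `column="Corrected p-value"` would apply the same filter to the corrected column.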

How many GO terms are associated with this profile at p < 0.05?

64

  • The GO list also has a column called "Corrected p-value". This correction is needed because the software has performed thousands of significance tests. Filter on the "Corrected p-value" column to show only GO terms that have a corrected p value of < 0.05.
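To see why a correction is needed when thousands of tests are run, here is a small illustration. It uses the Bonferroni correction as a generic example only; I am not claiming this is the correction STEM applies, and the number of tests below is made up for illustration.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Number of tests expected to pass the cutoff by chance alone."""
    return n_tests * alpha


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capping the result at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


# With a hypothetical 1000 tests at a 0.05 cutoff, about 50 would look
# "significant" even if nothing real were going on.
print(expected_false_positives(1000))
```

After correction, only terms with very small uncorrected p values stay below 0.05, which is why the corrected list is shorter than the uncorrected one.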

How many GO terms are associated with this profile with a corrected p value < 0.05?

30

Select the top 6 Gene Ontology terms from your filtered list (either p < 0.05 or corrected p < 0.05).

Top 6 for corrected p < 0.05 are:

GO:0005730

GO:0005355

GO:1904659

GO:0006351

GO:0015761

GO:0015755


Note whether the same GO terms are showing up in multiple clusters.
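Checking for GO terms that show up in more than one cluster can be done by hand, or with a small helper like the one below. This is a sketch only: the function name `shared_terms` is my own, and the profile and term labels in the comment are placeholders, not real results.

```python
from collections import defaultdict


def shared_terms(terms_by_profile):
    """Map each GO term found in two or more profiles to the profiles containing it.

    `terms_by_profile` is a dict such as
    {"profile_A": {"TERM_1", "TERM_2"}, "profile_B": {"TERM_2"}}
    (placeholder labels, not data from this notebook).
    """
    profiles_by_term = defaultdict(set)
    for profile, terms in terms_by_profile.items():
        for term in terms:
            profiles_by_term[term].add(profile)
    # Keep only the terms that appear in more than one profile.
    return {
        term: sorted(profiles)
        for term, profiles in profiles_by_term.items()
        if len(profiles) > 1
    }
```

The sets of GO IDs for each profile would come from the filtered GO list files saved earlier.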

  • Look up the definitions for each of the terms at http://geneontology.org. In your research presentation, you will discuss the biological interpretation of these GO terms.

In other words, why does the cell react to cold shock by changing the expression of genes associated with these GO terms? Also, what does this have to do with the transcription factor being deleted (for the groups working with deletion strain data)?

  • Copy and paste the GO ID (e.g. GO:0044848) into the search field on the left of the page.

In the results page, click on the button that says "Link to detailed information about <term>" (in this case, "biological phase"). The definition will be on the next results page.

GO:0005730

Name: nucleolus

Definition: A small, dense body one or more of which are present in the nucleus of eukaryotic cells. It is rich in RNA and protein, is not bounded by a limiting membrane, and is not seen during mitosis. Its prime function is the transcription of the nucleolar DNA into 45S ribosomal-precursor RNA, the processing of this RNA into 5.8S, 18S, and 28S components of ribosomal RNA, and the association of these components with 5S RNA and proteins synthesized outside the nucleolus. This association results in the formation of ribonucleoprotein precursors; these pass into the cytoplasm and mature into the 40S and 60S subunits of the ribosome.

GO:0005355

Name: glucose transmembrane transporter activity

Definition: Enables the transfer of the hexose monosaccharide glucose from one side of a membrane to the other.

GO:1904659

Name: glucose transmembrane transport

Definition: The process in which glucose is transported across a membrane.

GO:0006351

Name: transcription, DNA-templated

Definition: The cellular synthesis of RNA on a template of DNA.

GO:0015761

Name: mannose transmembrane transport

Definition: The process in which mannose is transported across a lipid bilayer, from one side of a membrane to the other. Mannose is the aldohexose manno-hexose, the C-2 epimer of glucose. The D-(+)-form is widely distributed in mannans and hemicelluloses and is of major importance in the core oligosaccharide of N-linked oligosaccharides of glycoproteins.

GO:0015755

Name: fructose transmembrane transport

Definition: The directed movement of fructose into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. Fructose exists in a open chain form or as a ring compound. D-fructose is the sweetest of the sugars and is found free in a large number of fruits and honey.

Using YEASTRACT to Infer which Transcription Factors Regulate a Cluster of Genes (Tuesday, October 29)

How many transcription factors are green or "significant"?

22

Are CIN5, GLN3, and/or HAP4 on the list? If so, what is their "% in user set", "% in YEASTRACT", and "p value".

Cin5p is 13.51% in user set and 2.61% in S. cerevisiae. The p value is 9.74 x 10^-13.

Gln3p is 38.79% in user set and 5.60% in S. cerevisiae. The p value is 1.92 x 10^-13.

Hap4p is 15.80% in user set and 5.04% in S. cerevisiae. The p value is 5.58 x 10^-14.
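One way to compare the three transcription factors is to divide the percentage in the user set by the genome-wide percentage. This is only a rough sketch using the percentages reported above; the function name `enrichment_ratios` is my own, and this ratio is not the statistic YEASTRACT uses to compute its p value.

```python
def enrichment_ratios(percentages):
    """Return (% in user set) / (% genome-wide) for each transcription factor."""
    return {tf: user / genome for tf, (user, genome) in percentages.items()}


# (% in user set, % in S. cerevisiae) as reported in this notebook.
tf_percentages = {
    "Cin5p": (13.51, 2.61),
    "Gln3p": (38.79, 5.60),
    "Hap4p": (15.80, 5.04),
}

for tf, ratio in enrichment_ratios(tf_percentages).items():
    print(f"{tf}: {ratio:.1f}-fold")
# Cin5p: 5.2-fold
# Gln3p: 6.9-fold
# Hap4p: 3.1-fold
```

By this measure, targets of all three factors are over-represented in the cluster compared with the genome as a whole.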

Visualizing Your Gene Regulatory Networks with GRNsight

References

Excel

STEM: Short Time-series Expression Miner. Retrieved on October 28, 2019 from http://www.cs.cmu.edu/~jernst/stem/

Google Powerpoint Slides

Data and Files

NtesfaioStem profiles pic corrected.jpg

powerpoint with all screenshots

dHAP4 gene list

dHAP4 GO list

Excel Spreadsheet

Text File of dHAP4 set

Acknowledgements

My homework partners this week are Aby User:Ymesfin and David User:Dramir36. We sat together in class to go over the assignment.

Methods were copied from Week 9

Conclusion
