Dmadere Week 9
Purpose
The purpose of this analysis was to understand how gene regulatory networks relate to the genes in Profile 22 and to learn how to interpret gene clusters.
Methods/Results
Viewing and Saving STEM Results
- A new window opened called "All STEM Profiles (1)". Each box corresponds to a model expression profile. Colored profiles have a statistically significant number of genes assigned; they are arranged in order from most to least significant p value. Profiles with the same color belong to the same cluster of profiles. The number in each box is simply an ID number for the profile.
- Clicked on the button that says "Interface Options...". At the bottom of the Interface Options window that appears below where it says "X-axis scale should be:", clicked on the radio button that says "Based on real time". Then closed the Interface Options window.
- Took a screenshot of this window (on a PC, simultaneously pressed the Alt and PrintScreen buttons to save the view in the active window to the clipboard) and pasted it into a PowerPoint presentation to save the figures.
- Clicked on each of the SIGNIFICANT profiles (the colored ones) to open a window showing a more detailed plot containing all of the genes in that profile.
- Took a screenshot of each of the individual profile windows and saved the images in the PowerPoint presentation.
- At the bottom of each profile window, there are two yellow buttons, "Profile Gene Table" and "Profile GO Table". For each of the profiles, clicked on the "Profile Gene Table" button to see the list of genes belonging to the profile. In the window that appeared, clicked on the "Save Table" button and saved the file to the desktop. Made the filename descriptive of the contents, e.g. "wt_profile#_genelist.txt", where the number symbol is replaced with the actual profile number.
- For each of the significant profiles, clicked on the "Profile GO Table" button to see the list of Gene Ontology terms belonging to the profile. In the window that appeared, clicked on the "Save Table" button and saved the file to the desktop. Made the filename descriptive of the contents, e.g. "wt_profile#_GOlist.txt", where "wt", "dGLN3", etc. indicate the dataset and the number symbol is replaced with the actual profile number. At this point, all of the primary data from the STEM software had been saved and were ready to be interpreted.
Analyzing and Interpreting STEM Results
- Selected one of the profiles saved in the previous step for further interpretation of the data. Each member of the group selected their own significant profile to analyze.
- Why did you select Profile #22? In other words, why was it interesting to you?
- - I selected Profile #22 because the cluster appeared very different from the rest of the clusters. Unlike the other clusters, this profile started out as a straight line, increased, and then slowly decreased. I wanted to analyze why this was occurring and get a better understanding of why this profile exhibits these qualities. When I clicked on the profile, I noticed that each colored line representing a gene started from 0 and stayed close to it until around the 60 min time point, where it started to increase, and then decreased at the 90 min time point, which I also wanted to analyze further.
- How many genes belong to this profile?
- - There are 179 genes assigned to Profile 22.
- How many genes were expected to belong to this profile?
- - There were 22.9 genes expected in this profile.
- What is the p value for the enrichment of genes in this profile?
- - The p-value for the enrichment of genes in Profile 22 is 3.4E-98 and it is significant.
- Opened the GO list file saved for Profile 22 in Excel. Filtered data by p-value < 0.05. Then filtered data by "Corrected P-Value" column to show the corrected p-values of < 0.05.
- How many GO terms are associated with this profile at p < 0.05?
- - There are 25 GO terms associated with this profile at p < 0.05.
- How many GO terms are associated with this profile with a corrected p value < 0.05?
- - There are 3 GO terms associated with Profile 22 with a corrected p-value < 0.05.
- Selected the top 6 Gene Ontology terms from filtered list (p < 0.05 not corrected) and noted whether the same GO terms are showing up with multiple clusters.
- Looked up the definitions for each of the terms at the Gene Ontology Database. In the research presentation, I will discuss the biological interpretation of these GO terms: why does the cell react to cold shock by changing the expression of genes associated with these GO terms, and what does this have to do with the transcription factor being deleted (for the groups working with deletion strain data)?
- - Cellular response to oxidative stress (GO:0034599): Any process that results in a change in state or activity of a cell (in terms of movement, secretion, enzyme production, gene expression, etc.) as a result of oxidative stress, a state often resulting from exposure to high levels of reactive oxygen species, e.g. superoxide anions, hydrogen peroxide (H2O2), and hydroxyl radicals. Source: GOC:mah
- - Cellular oxidant detoxification (GO:0098869): Any process carried out at the cellular level that reduces or removes the toxicity of superoxide radicals or hydrogen peroxide. Source: GOC:vw, GOC:dos
- - Cytoplasm (GO:0005737): All of the contents of a cell excluding the plasma membrane and nucleus, but including other subcellular structures. Source: ISBN:0198547684
- - Oxidation-reduction process (GO:0055114): A metabolic process that results in the removal or addition of one or more electrons to or from a substance, with or without the concomitant removal or addition of a proton or protons. Source: GOC:rph, GOC:jh2, GOC:ecd, GOC:jid, GOC:dhl, GOC:mlg
- - Endocytosis (GO:0006897): A vesicle-mediated transport process in which cells take up external materials or membrane constituents by the invagination of a small region of the plasma membrane to form a new membrane-bounded vesicle. Source: ISBN:0716731363, ISBN:0198506732, GOC:mah
- - Actin Cortical Patch (GO:0030479): An endocytic patch that consists of an actin-containing structure found at the plasma membrane in cells; formed of networks of branched actin filaments that lie just beneath the plasma membrane and assemble, move, and disassemble rapidly. An example of this is the actin cortical patch found in Saccharomyces cerevisiae. Source: GOC:vw, PMID:16959963, GOC:mah, ISBN:0879693568, ISBN:0879693649
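The enrichment and GO filtering steps above were done by hand in Excel; the following is only a minimal Python sketch of the same idea. It assumes the saved GO table is tab-delimited and has columns named "p-value" and "Corrected p-value", and that a corrected value may be written with a leading "<" (for example "<0.001"); those column names, the helper names, and the sample rows are illustrative assumptions and not the actual Profile 22 data.

```python
import csv
import io

def fold_enrichment(assigned, expected):
    """Ratio of genes assigned to a profile over genes expected by chance."""
    return assigned / expected

def parse_p(cell):
    """Convert a p-value cell to a float; a leading '<' is dropped,
    so '<0.001' is treated as its upper bound, 0.001."""
    return float(cell.strip().lstrip("<"))

def filter_go_terms(table_text, column, cutoff=0.05):
    """Return the rows whose value in `column` is below `cutoff`."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return [row for row in reader if parse_p(row[column]) < cutoff]

# Made-up rows for illustration only (not the real GO list).
sample_table = (
    "Category ID\tCategory Name\tp-value\tCorrected p-value\n"
    "TERM_A\texample term A\t1.0E-5\t<0.001\n"
    "TERM_B\texample term B\t0.03\t0.42\n"
    "TERM_C\texample term C\t0.20\t1.00\n"
)

uncorrected = filter_go_terms(sample_table, "p-value")
corrected = filter_go_terms(sample_table, "Corrected p-value")
print(len(uncorrected), "terms at p < 0.05")
print(len(corrected), "terms at corrected p < 0.05")

# Profile 22: 179 genes assigned versus 22.9 expected.
print(round(fold_enrichment(179, 22.9), 1))
```

Counting the rows returned for each column corresponds to the two GO term counts asked for above, and the last line expresses the assigned versus expected gene counts for Profile 22 as a single ratio.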
Used YEASTRACT to Infer which Transcription Factors Regulate a Cluster of Genes
- Opened the gene list in Excel for one of the significant profiles from the STEM analysis. Chose a cluster with a clear cold shock/recovery up/down or down/up pattern (Profile #22).
- Copied the list of gene IDs to the clipboard.
- Launched a web browser and went to the YEASTRACT database.
- On the left panel of the window, clicked on the link to Rank by TF.
- Pasted the list of genes from the cluster into the box labeled ORFs/Genes.
- Checked the box for Check for all TFs.
- Accepted the defaults for the Regulations Filter (Documented, DNA binding plus expression evidence)
- Didn't apply a filter for "Filter Documented Regulations by environmental condition".
- Ranked genes by TF using: The % of genes in the list and in YEASTRACT regulated by each TF.
- Clicked on the Search button.
- Questions:
- In the results window, p values colored green are considered "significant", the ones colored yellow are considered "borderline significant", and the ones colored pink are considered "not significant".
- How many transcription factors are green or "significant"?
- - There are 31 transcription factors that are significant.
- Are CIN5, GLN3, and/or HAP4 on the list? If so, what are their "% in user set", "% in YEASTRACT", and "p value"?
- - CIN5: User Set - 21.47%, YEASTRACT - 1.74%, P-value - 0.996
- - GLN3: User Set - 29.38%, YEASTRACT - 2.16%, P-value - 0.880
- - HAP4: User Set - 23.16%, YEASTRACT - 3.76%, P-value - 0.0019
- Used YEASTRACT to assist with creating gene regulatory network
- Selected "significant" transcription factors to run the model. Used 17 transcription factors and included GLN3, HAP4, and CIN5 because they are the transcription factors being studied in class and are relevant to this profile.
- Copied and pasted the list of transcription factors identified from the Gene Regulation Matrix into both the "Transcription factors" field and the "Target ORF/Genes" field.
- Used the "Regulations Filter" option of "Documented", "Only DNA binding evidence".
- Clicked on "Generate" and saved Regulation Matrix to Desktop.
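The "% in user set" and "% in YEASTRACT" values reported above can be read as two overlap percentages. The sketch below shows one plausible way such percentages are computed, assuming "% in user set" is the share of the pasted gene list regulated by a transcription factor and "% in YEASTRACT" is the share of that factor's documented targets found in the list. That interpretation, the function name, and the placeholder gene names are assumptions for illustration; this is not YEASTRACT's own code.

```python
def tf_overlap_percentages(user_genes, tf_targets):
    """Return (% of the user list regulated by the TF,
    % of the TF's documented targets that are in the user list)."""
    user = set(user_genes)
    targets = set(tf_targets)
    shared = user & targets
    pct_user_set = 100 * len(shared) / len(user)
    pct_database = 100 * len(shared) / len(targets)
    return round(pct_user_set, 2), round(pct_database, 2)

# Placeholder gene names, not real ORFs from Profile 22.
user_list = ["G1", "G2", "G3", "G4"]
documented_targets = ["G1", "G2", "G5", "G6", "G7", "G8", "G9", "G10"]

print(tf_overlap_percentages(user_list, documented_targets))
```

Under this reading, a transcription factor with many documented targets can cover a sizeable share of the user list while that overlap is only a small share of its total targets, which is the pattern seen for CIN5, GLN3, and HAP4 above.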
Visualized Gene Regulatory Network with GRNsight
- Properly formatted output files from YEASTRACT.
- Opened file in Excel. Selected entire Column A. Went to "Data" tab and selected "Text to columns". Selected "Delimited" and clicked "Next". In the next window, selected "Semicolon", and clicked "Next". In the next window, left data format at "General", and clicked "Finish". Adjacency matrix created.
- Saved file in Excel format.
- Transposed matrix onto new worksheet and labeled it "Network". Selected "Paste Special" then "Transpose".
- Deleted the "p" from each of the gene names in the columns. Adjusted the case of the labels to make them all upper case.
- In cell A1, copied and pasted the text "rows genes affected/cols genes controlling".
- Alphabetized the gene labels both across the top and down the side.
- - Selected the area of the entire adjacency matrix.
- - Clicked the Data tab and clicked the custom sort button.
- - Sorted Column A alphabetically, excluding the header row.
- - Sorted row 1 from left to right, excluding cell A1.
- Named worksheet containing organized adjacency matrix "network" and saved.
- Visualized what these gene regulatory networks looked like with the GRNsight software.
- Went to GRNsight home page.
- Selected menu item File > Open and selected the regulation matrix .xlsx file containing the formatted "network" worksheet. If the file was formatted properly, GRNsight would automatically create a graph of the network. The next steps were to click the "Grid Layout" button to arrange the nodes in a grid and to paste a screenshot of the results into PowerPoint.
- An error appeared when loading the data from the Excel sheet. Repeated the process three times and was not able to carry out the remaining instructions.
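The manual Excel formatting steps above (split on semicolons, transpose, strip the "p", upper-case, alphabetize) can also be sketched in Python. This is a minimal sketch under assumptions: it assumes the YEASTRACT file is semicolon-delimited with regulators in the rows (labels such as "Cin5p") and target genes in the columns, and the function names and the tiny example matrix are hypothetical. It only builds the table in memory; the result would still need to be placed in the "network" worksheet of an .xlsx file.

```python
import csv
import io

HEADER_CELL = "rows genes affected/cols genes controlling"

def strip_protein_suffix(label):
    # Assumption: regulator labels look like "Cin5p"; drop only a trailing "p".
    return label[:-1] if label.endswith("p") else label

def format_regulation_matrix(raw_text):
    """Split, transpose, relabel, and sort a semicolon-delimited matrix."""
    rows = [[cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(raw_text), delimiter=";") if row]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("rows have different lengths; check the delimiter")
    # Transpose so rows are genes affected and columns are genes controlling.
    transposed = [list(column) for column in zip(*rows)]
    col_labels = [strip_protein_suffix(c).upper() for c in transposed[0][1:]]
    row_labels = [row[0].upper() for row in transposed[1:]]
    body = [row[1:] for row in transposed[1:]]
    # Alphabetize rows and columns while keeping each value with its labels.
    row_order = sorted(range(len(row_labels)), key=row_labels.__getitem__)
    col_order = sorted(range(len(col_labels)), key=col_labels.__getitem__)
    matrix = [[HEADER_CELL] + [col_labels[j] for j in col_order]]
    for i in row_order:
        matrix.append([row_labels[i]] + [body[i][j] for j in col_order])
    return matrix

def labels_match(matrix):
    """True when the row labels and column labels are identical and in order."""
    return matrix[0][1:] == [row[0] for row in matrix[1:]]

# Hypothetical 3x3 example, not the real regulation matrix.
example = ";CIN5;GLN3;HAP4\nCin5p;0;1;0\nHap4p;1;0;0\nGln3p;0;0;1\n"
network = format_regulation_matrix(example)
for row in network:
    print(";".join(row))
print("labels match:", labels_match(network))
```

Sorting by index rather than sorting the label lists on their own keeps every 0/1 entry attached to its row and column. The `labels_match` check is one possible thing to inspect when a formatted file fails to load, although the source of the error reported above was not determined.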
Data & Files
DM_dCIN5 Profile GOlist Tables
DM_dCIN5 STEM Results Presentation
Conclusion
The purpose of this experiment was to analyze dCIN5's gene profiles and interpret the data from a gene cluster. STEM displayed the different gene profiles from the dCIN5 data to show the effect of the surrounding genes on expression. From there, significant profiles were chosen for further analysis (Profile #22 in this case). The data were then analyzed with YEASTRACT to determine which transcription factors regulate the cluster of genes. However, the resulting data could not be loaded into GRNsight, so I was unable to view the gene regulatory network.
Acknowledgements
- I worked with my homework group Ivy, Mihir, and Emma this week in class to complete this assignment. We talked about the assignment in class and texted about it as well if anyone needed help.
- I used the .zip file compression instructions from the Week 4 Biological databases instructions.
- "Except for what is noted above, this individual journal entry was completed by me and not copied from another source."
- Dmadere (talk) 21:57, 30 October 2019 (PDT)
References
- Gene Ontology Resource. (n.d.). Retrieved October 30, 2019, from http://geneontology.org/.
- Methodology as provided and edited from the assignment page, step-by-step instructions, and assignment updates as listed by LMU BioDB 2019. (2019). Week 9. Retrieved October 30, 2019, from https://xmlpipedb.cs.lmu.edu/biodb/fall2019/index.php/Week_9
- S.cerevisiae. (n.d.). Retrieved October 30, 2019, from http://www.yeastract.com/.
- Short Time-series Expression Miner (STEM). (n.d.). Retrieved October 30, 2019, from http://www.cs.cmu.edu/~jernst/stem/.