Difference between revisions of "Ymesfin Week 10"

From LMU BioDB 2019

Latest revision as of 23:11, 6 November 2019

Purpose

The purpose of this assignment was to create a detailed electronic lab notebook documenting the statistical analysis of a DNA microarray dataset, demonstrate our understanding of p-value cut-offs, and display the relationships within the network of transcription factors in Saccharomyces cerevisiae. This week, students created a GRNmap input workbook to further understand the relationships between the genes in Saccharomyces cerevisiae and to become familiar with creating queries and modeling.

Methods

Creating the GRNmap Input Workbook

  1. Opened a new Excel workbook.

production_rates sheet

  1. Named the first sheet "production_rates" and labeled the first and second columns "id" and "production_rate", respectively.
  2. Downloaded Production Rates Database.
  3. Imported list of genes to a new table in the database. Clicked on the "External Data" tab and selected the Excel icon with the "up" arrow on it.
  4. Clicked the "Browse" button and selected Excel file containing network used to upload to GRNsight.
  5. Made sure the button next to "Import the source data into a new table in the current database" was selected and clicked "OK".
  6. In the next window, selected the "network" worksheet, if it wasn't automatically selected. Clicked "Next".
  7. In the next window, made sure the "First Row Contains Column Headings" was checked. Clicked "Next".
  8. In the next window, changed the "Field Name" to "id". Clicked "Next".
  9. In the next window, selected the button for "Choose my own primary key." and chose the "id" field from the drop down next to it. Clicked "Next".
  10. In the next field, made sure it said "Import to Table: network". Clicked Finish.
  11. Clicked "Close".
  12. Went to the "Create" tab. Clicked on the icon for "Query Design".
  13. In the window that appeared, clicked on the "network" table and clicked "Add". Clicked on the "production_rates" table and clicked "Add". Clicked "Close".
  14. Clicked on the word "id" in the network table and dragged mouse to the "standard_name" field in the "production_rates" table, and released.
  15. Right-clicked on the line between those words and selected "Join Properties" from the menu that appeared. Selected Option "2: Include ALL records from 'network' and only those records from 'production_rates' where the joined fields are equal." Clicked "OK".
  16. Clicked on the "id" word in the "network" table and dragged it to the bottom of the screen to the first column next to the word "Field" and released.
  17. Clicked on the "production_rate" field in the "production_rates" table and dragged it to the bottom of the screen to the second column next to the word "Field" and released.
  18. Right-clicked anywhere in the gray area near the two tables. In the menu that appeared, selected "Query Type > Make Table Query...".
  19. In the window that appeared, named the table "production_rates_1". Made sure that "Current Database" was selected and clicked "OK".
  20. Went to the "Query Tools: Menus" tab. Clicked on the exclamation point icon. A window appeared stating how many rows would be pasted into a new table. Clicked "Yes".
  21. The new "production_rates_1" table appeared in the list at the left. Double-clicked on that table name to open it.
  22. Copied the data in this table and pasted it back into Excel workbook.
    • Where production rates were missing, the value 0.1980 was substituted.
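
The join and make-table steps above can be sketched outside of Access as well. The following is a minimal illustration using Python's built-in sqlite3 module, not the procedure actually followed in this notebook. The function name build_rate_table and the example rates are hypothetical; only the table names, field names, and the substitute value 0.1980 come from the steps above.

```python
import sqlite3

DEFAULT_PRODUCTION_RATE = 0.1980  # substitute value stated in the notebook


def build_rate_table(network_ids, known_rates, default_rate):
    """Keep every network gene, attach its rate if known, else default_rate.

    network_ids: gene ids from the "network" sheet (hypothetical input).
    known_rates: dict mapping standard_name -> production_rate (hypothetical).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE network (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE production_rates (standard_name TEXT, production_rate REAL)"
    )
    conn.executemany("INSERT INTO network VALUES (?)", [(g,) for g in network_ids])
    conn.executemany(
        "INSERT INTO production_rates VALUES (?, ?)", list(known_rates.items())
    )
    # Counterpart of join option 2 plus the make-table query:
    # a LEFT JOIN keeps ALL records from "network".
    conn.execute(
        """
        CREATE TABLE production_rates_1 AS
        SELECT network.id AS id,
               production_rates.production_rate AS production_rate
        FROM network
        LEFT JOIN production_rates
               ON network.id = production_rates.standard_name
        """
    )
    # Genes with no match come back as NULL; fill them with the default.
    conn.execute(
        "UPDATE production_rates_1 SET production_rate = ? "
        "WHERE production_rate IS NULL",
        (default_rate,),
    )
    rows = conn.execute(
        "SELECT id, production_rate FROM production_rates_1 ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

Calling build_rate_table(["CIN5", "GLN3", "HAP4"], {"CIN5": 0.5, "HAP4": 0.25}, DEFAULT_PRODUCTION_RATE) would return one row per network gene, with GLN3 receiving the default because it has no entry in the rates table.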

degradation_rates sheet

  1. Added new sheet called "degradation_rates" and labeled the first two columns (from left to right) "id" and "degradation_rate".
  2. Executed a query similar to the one used for the "production_rates" sheet, substituting the appropriate "degradation_rates" table in the query.
    • Substituted the value 0.0990 for the missing degradation rates.

Expression Data Sheets for Individual Yeast Strains

  1. Added 4 sheets for wt, dGLN3, dHAP4, and dCIN5.
    • Each sheet was given a unique name that followed the convention "STRAIN_log2_expression", where the word "STRAIN" is replaced by the strain designation.
  2. First column in each sheet was labeled "id".
  3. The next series of columns were labeled with the timepoints at which the data were collected, without any units. For example, the 15 minute timepoint had a column header "15". Replicate data for the same timepoint were in columns immediately next to each other and had the same column headers. For example, three replicates of the 15 minute timepoint had "15", "15", "15" as the column headers.
    • If data were provided for multiple strains, each strain had data for the same timepoints, although the number of replicates could vary.
    • The data for the 15, 30, and 60 minute timepoints, but not the 90 or 120 minute timepoints, were included.
    • The data used was contained in the database used to obtain the production and degradation rates.
  4. A query similar to the one used for the "production_rates" sheet was executed for each strain's expression data to import the data into the corresponding Excel sheet.
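
The sheet-naming and column-header conventions described above can be sketched as follows. This is only an illustration: the helper names expression_sheet_name and expression_headers are hypothetical, and the replicate counts in the usage note are made up, since the notebook does not state how many replicates each timepoint had.

```python
def expression_sheet_name(strain):
    """Return the sheet name for a strain, e.g. "wt" -> "wt_log2_expression"."""
    return f"{strain}_log2_expression"


def expression_headers(replicates_per_timepoint):
    """Build the header row for one expression sheet.

    replicates_per_timepoint: list of (timepoint_in_minutes, replicate_count)
    pairs, in the order the columns should appear. Each timepoint label is
    repeated once per replicate, with no units, after the leading "id" column.
    """
    headers = ["id"]
    for timepoint, count in replicates_per_timepoint:
        headers.extend([str(timepoint)] * count)
    return headers
```

For example, expression_headers([(15, 3), (30, 2), (60, 1)]) yields "id" followed by three "15" columns, two "30" columns, and one "60" column.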

network sheet

  1. Added a new sheet labeled "network".
  2. The network derived from the YEASTRACT database for the Week 9 assignment was copied and pasted into this sheet directly.

network_weights sheet

  1. Added a new sheet labeled "network_weights".
  2. Copied the content of the "network" sheet to this sheet.

optimization_parameters sheet

  1. A new sheet was added and labeled "optimization_parameters".
  2. The first two columns (from left to right) were titled "optimization_parameter" and "value".
  3. This worksheet was copied from the sample workbook.
    • Row 15, "Strain", was modified to list the strain designations for which corresponding STRAIN_log2_expression sheets were present in the workbook.

threshold_b sheet

  1. Added new sheet labeled "threshold_b".
  2. Labeled the first column "id" and listed the standard names for the genes in the model in the same order as in the other sheets.
  3. The second column was labeled "threshold_b" and contained the initial guesses of 0 for all the cells.
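
As a small sketch of the threshold_b layout, the rows of the sheet can be generated from the gene list, and the requirement that genes appear in the same order as in the other sheets can be checked programmatically. Both helper names below are hypothetical and are not part of GRNmap.

```python
def threshold_b_rows(gene_ids, initial_guess=0):
    """Return the threshold_b sheet as rows: a header, then one row per gene."""
    return [("id", "threshold_b")] + [(gene, initial_guess) for gene in gene_ids]


def same_gene_order(*id_columns):
    """Return True if every id column lists the same genes in the same order."""
    first = list(id_columns[0])
    return all(list(column) == first for column in id_columns[1:])
```

A check such as same_gene_order(production_ids, degradation_ids, threshold_ids) could be run before loading the workbook to catch a sheet whose genes were pasted in a different order.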

Dynamical Systems Modeling of Gene Regulatory Network

  1. The GRNmap v1.10 code was downloaded from the GRNmap Downloads page and MATLAB R2014b was launched.
  2. GRNmodel.m was opened and run in MATLAB R2014b.
  3. The GRNmap input workbook was selected for the program.
  4. The output .xlsx and .mat files were saved in the same folder as the input workbook, along with .jpg files containing the optimization diagnostic and individual expression plots.

Data/Files

GRNmap Input Sheet

MATLAB Results

Conclusion

In this study, yeast cells (Saccharomyces cerevisiae) were exposed to cold shock to monitor how their gene expression levels responded to the change in environmental temperature. The ANOVA suggests that approximately 40.1% of the collected data showed statistically significant changes, with p values less than 0.05. However, given the large number of data entries used in the study, it is reasonable to assume that some of these results are due to chance even though their p values were less than 0.05. Nonetheless, 4.5% of the data had p values less than 0.0001, substantiating the significance of at least some of the data. Thus, the dataset suggests that there is a relationship between the expression levels of certain genes of Saccharomyces cerevisiae and the cold shock treatment. Of the most significant STEM profiles, Profile 48 was used for further analysis. GRNsight was used to visualize the relationships among the transcription factors associated with Profile 48. Most of the transcription factors were connected to one another, forming a network. An input workbook for the GRNmap modeling software was created in Excel and run in MATLAB to model the regulatory effects of the genes on each other.
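
The reasoning about chance results can be made concrete with a rough calculation. If no gene's expression truly changed, p values would be spread evenly between 0 and 1, so the fraction expected to fall below a cut-off is simply the cut-off itself. The sketch below compares the observed fractions reported above with that expectation. It is a simplified illustration under that assumption, not part of the analysis performed, and the function name is hypothetical.

```python
def enrichment_over_chance(observed_fraction, p_cutoff):
    """Ratio of the observed significant fraction to the fraction expected
    by chance alone (which equals p_cutoff when no gene truly changes)."""
    return observed_fraction / p_cutoff


# Values reported in the conclusion above.
ratio_at_005 = enrichment_over_chance(0.401, 0.05)     # about 8-fold
ratio_at_00001 = enrichment_over_chance(0.045, 0.0001)  # about 450-fold
```

Under this assumption, the 40.1% observed at p < 0.05 is roughly 8 times the 5% expected by chance, and the 4.5% observed at p < 0.0001 is roughly 450 times the 0.01% expected by chance, which is consistent with the claim that at least some of the changes are real.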

Acknowledgements

Dr. Kam Dahlquist; Professor

Naomi Tesfaiohannes; Homework Partner

David Ramirez; Homework Partner

Except for what is noted above, this individual journal entry was completed by me and not copied from another source. Ymesfin (talk) 23:10, 6 November 2019 (PST)

References

LMU BioDB 2019. (2019). Week 7. Retrieved October 16, 2019, from https://xmlpipedb.cs.lmu.edu/biodb/fall2019/index.php/Week_7

LMU BioDB 2019. (2019). Week 8. Retrieved October 23, 2019, from https://xmlpipedb.cs.lmu.edu/biodb/fall2019/index.php/Week_8

LMU BioDB 2019. (2019). Week 9. Retrieved October 26, 2019, from https://xmlpipedb.cs.lmu.edu/biodb/fall2019/index.php/Week_9

LMU BioDB 2019. (2019). Week 10. Retrieved October 26, 2019, from https://xmlpipedb.cs.lmu.edu/biodb/fall2019/index.php/Week_10

The Gene Ontology Resource. (2019). Retrieved October 26, 2019, from http://geneontology.org

Yeastract. (2019). Retrieved October 29, 2019, from http://www.yeastract.com/index.php

Yeastract Gene Regulation Matrix. (2019). Retrieved October 29, 2019, from http://www.yeastract.com/formregmatrix.php

GRNsight. (2019). Retrieved October 29, 2019, from https://dondi.github.io/GRNsight/

MATLAB R2014b. (2014). Retrieved November 5, 2019.