Ymesfin Week 9

From LMU BioDB 2019
Revision as of 16:07, 30 October 2019 by Ymesfin (talk | contribs) (Results: added matrix)

Purpose

The purpose of this assignment was to keep a detailed electronic lab notebook while statistically analyzing a DNA microarray dataset, and to demonstrate our understanding of p-value cut-offs.

Week 7 Assignment

Week 8 Assignment

Week 9 Assignment

Methods

  1. The data was first downloaded from the wiki
    • The strain analyzed was Delta-Hap4 (dHAP4).
    • The data was obtained from the following file: Ymesfin_BIOL367_F19_microarray-data_dHAP4(7846).xlsx
    • The file had timepoints at 15, 30, 60, 90, and 120 minutes. The 15, 30, and 60 minute timepoints each had 4 replicates, while the 90 and 120 minute timepoints each had 3 replicates. There were a total of 6189 samples in this dataset.
  2. Performed ANOVA statistical analysis
    1. Calculated the average log fold change at each timepoint for each data point.
    2. Calculated the sum of squares for the entire dataset.
    3. Created the column headers dHAP4_ss_(TIME) for each timepoint.
    4. Calculated the sum of squares for each timepoint.
    5. Calculated the F statistic for each data point.
    6. Calculated the p value for each data point.
    7. Performed the Bonferroni p value correction for each data point.
    8. Performed the Benjamini & Hochberg p value correction for each data point.
    9. Performed a sanity check.
  3. Clustering and GO Term Enrichment with STEM Software
    1. The data was copied onto a new Excel sheet
    2. Column A was renamed "SPOT", Column B was renamed "Gene Symbol", and Column C was deleted
    3. All the data entries with BH p-values > 0.05 were deleted
    4. All of the data columns except for the Average Log Fold change columns for each timepoint were deleted
    5. The data columns were renamed with just the time and units (for example, 15m, 30m, etc.)
    6. The data was saved as a tab-delimited text document
    7. The STEM Software, Gene Ontology, and yeast GO annotations were downloaded
    8. The STEM Software was run using the dHAP4 text data
    9. The Profile GO and Profile Gene Tables were saved from the STEM results
    10. Profile 48 from the STEM results was selected for further analysis
      • Why did you select this profile? In other words, why was it interesting to you?
        • I chose Profile 48 because it was the third most significant profile and the expression of its genes appeared parabolic over time.
      • How many genes belong to this profile?
        • 256 genes are associated with this profile.
      • How many genes were expected to belong to this profile?
        • 32.6 genes were expected to be associated with this profile.
      • What is the p-value for the enrichment of genes in this profile?
        • The p-value of enrichment for this profile is 1.8E-141.
      • How many GO terms are associated with this profile at p < 0.05?
        • 35 terms associated with this profile have a p-value < 0.05.
      • How many GO terms are associated with this profile with a corrected p value < 0.05?
        • Only one term associated with this profile had a corrected p value < 0.05.
    11. The definitions of the top 6 terms with p-values <0.05 were searched on http://geneontology.org
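
The statistical steps in part 2 above can be sketched in code. This is a minimal sketch rather than the spreadsheet formulas actually used, which are not recorded on this page. It assumes that the F statistic compares the sum of squares under a null hypothesis of zero mean log fold change with the summed within-timepoint sums of squares, and that the Benjamini & Hochberg correction is the standard step-up adjustment. The function names f_statistic, bonferroni, and benjamini_hochberg are hypothetical. The p value itself would come from an F distribution with (k, n - k) degrees of freedom and is not reimplemented here.

```python
def f_statistic(groups):
    """F statistic for one gene (assumed formula, see lead-in).

    groups: one list of replicate log fold changes per timepoint.
    """
    n = sum(len(g) for g in groups)   # total number of measurements
    k = len(groups)                   # number of timepoints
    # Sum of squares under the null hypothesis (all means equal zero).
    ss_null = sum(x * x for g in groups for x in g)
    # Sum of squares within each timepoint, added over all timepoints.
    ss_full = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ss_full += sum((x - mean) ** 2 for x in g)
    # Undefined (division by zero) if replicates are identical everywhere.
    return ((n - k) / k) * (ss_null - ss_full) / ss_full


def bonferroni(pvals):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Standard step-up adjusted p values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For the dHAP4 data, each call to f_statistic would receive five lists (one per timepoint, with 4, 4, 4, 3, and 3 replicates), and the two correction functions would receive the full list of unadjusted p values. Data points whose Benjamini & Hochberg value exceeds 0.05 are the ones removed before running STEM.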

How many transcription factors are green or "significant"?

  • 9 transcription factors are significant.

Are CIN5, GLN3, and/or HAP4 on the list? If so, what are their "% in user set", "% in YEASTRACT", and "p value"?

  • CIN5: 16.47%, p value = 0.999999881503972
  • GLN3: 35.29%, p value = 0.276481708725189
  • HAP4: 14.51%, p value = 0.597297513212687

Results

Data and Files

dHAP4 Data Sheet

dHAP4 Stem Data

dHAP4 P-Values and Stem Results

dHAP4 Gene List

dHAP4 GO List

dHAP4 Slides

dHAP4 profile 48 Regulation Matrix

Conclusion

Acknowledgements

Dr. Kam Dahlquist; Professor

Naomi Tesfaiohannes; Homework Partner

David Ramirez; Homework Partner

Except for what is noted above, this individual journal entry was completed by me and not copied from another source.

References

LMU BioDB 2019. (2019). Week 7. Retrieved October 16, 2019, from https://xmlpipedb.cs.lmu.edu/biodb/fall2019/index.php/Week_7

LMU BioDB 2019. (2019). Week 8. Retrieved October 23, 2019, from https://xmlpipedb.cs.lmu.edu/biodb/fall2019/index.php/Week_8

LMU BioDB 2019. (2019). Week 9. Retrieved October 26, 2019, from https://xmlpipedb.cs.lmu.edu/biodb/fall2019/index.php/Week_9

The Gene Ontology Resource. (2019). Retrieved October 26, 2019, from http://geneontology.org