Difference between revisions of "Ymesfin Week 9"

From LMU BioDB 2019
(Data and Files: added ppt)
(Methods: added methods)

Revision as of 15:30, 26 October 2019

Purpose

Methods

  1. The data was first downloaded from the wiki.
    • The strain analyzed was Delta-Hap4 (dHAP4).
    • The data was obtained from the following file: Ymesfin_BIOL367_F19_microarray-data_dHAP4(7846).xlsx
    • The file had timepoints at 15, 30, 60, 90, and 120 minutes. The 15, 30, and 60 minute timepoints had 4 replicates each, while the 90 and 120 minute timepoints had 3 replicates each. There were a total of 6189 samples in this dataset.
  2. Performed ANOVA statistical analysis
    1. Calculated the average results for each data point.
    2. Calculated the sum of squares for the entire dataset.
    3. Created the column headers dHAP4_ss_(TIME) for each timepoint.
    4. Calculated the sum of squares for each timepoint.
    5. Calculated the F statistic (Fstat) for each data point.
    6. Calculated the p-value for each data point.
    7. Performed the Bonferroni p-value correction for each data point.
    8. Performed the Benjamini & Hochberg p-value correction for each data point.
    9. Performed a sanity check.
  3. Clustering and GO Term Enrichment with STEM Software
    1. The data was copied onto a new Excel sheet.
    2. Column A was renamed to "SPOT", Column B was renamed to "Gene Symbol", and Column C was deleted.
    3. All the data entries with BH p-values > 0.05 were deleted.
    4. All of the data columns except for the average log fold change columns for each timepoint were deleted.
    5. The data columns were renamed with just the time and units (for example, 15m, 30m, etc.).
    6. The data was saved as a tab-delimited text document.
    7. The STEM software, Gene Ontology, and yeast GO annotations were downloaded.
    8. The STEM software was run using the dHAP4 text data.
    9. The Profile GO and Profile Gene Tables were saved from the STEM results.
    10. Profile 48 from the STEM results was selected for further analysis.
      • Why did you select this profile? In other words, why was it interesting to you?
        • I chose Profile 48 because it was the third most significant profile and the expression of its genes appeared parabolic.
      • How many genes belong to this profile?
        • 256 genes are associated with this profile.
      • How many genes were expected to belong to this profile?
        • 32.6 genes were expected to be associated with this profile.
      • What is the p-value for the enrichment of genes in this profile?
        • The p-value of enrichment for this profile is 1.8E-141.
      • How many GO terms are associated with this profile at p < 0.05?
        • 35 terms associated with this profile have a p-value < 0.05.
      • How many GO terms are associated with this profile with a corrected p-value < 0.05?
        • Only one term associated with this profile had a corrected p-value < 0.05.
    11. The definitions of the top 6 terms with p-values < 0.05 were searched on http://geneontology.org
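
The ANOVA and p-value correction steps in step 2 were done in a spreadsheet. As a rough illustration only, they could be sketched in Python as below. The function names are hypothetical, and the form of the F statistic is an assumption: it tests the null hypothesis that the average log fold change is zero at every timepoint, which matches a total sum of squares and per-timepoint sums of squares but is not confirmed by the notes above. The Benjamini & Hochberg function uses the standard step-up adjustment, which may differ slightly from the spreadsheet version. The p-value itself is not computed here; it would come from an F distribution in a statistics library.

```python
def f_statistic(replicates_by_timepoint):
    """F statistic for one gene (assumed null: mean log fold change is 0
    at every timepoint).

    replicates_by_timepoint: one list of log fold changes per timepoint.
    """
    values = [v for reps in replicates_by_timepoint for v in reps]
    n = len(values)                   # total number of measurements
    k = len(replicates_by_timepoint)  # number of timepoints
    # Sum of squares of the entire dataset (deviations from zero).
    ss_null = sum(v * v for v in values)
    # Sum of squared deviations from each timepoint's own average.
    ss_full = 0.0
    for reps in replicates_by_timepoint:
        mean = sum(reps) / len(reps)
        ss_full += sum((v - mean) ** 2 for v in reps)
    return ((n - k) / k) * (ss_null - ss_full) / ss_full


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Standard Benjamini & Hochberg adjusted p-values, in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For this dataset each gene would have 18 values (4 + 4 + 4 + 3 + 3) across 5 timepoints, so under the assumption above the F statistic would be compared against an F distribution with 5 and 13 degrees of freedom.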

Results

Data and Files

dHAP4 Data Sheet

dHAP4 Stem Data

dHAP4 P-Values and Stem Results

dHAP4 Gene List

dHAP4 GO List

Conclusion

Acknowledgements

References