Rlegaspi Week 15
Revision as of 22:15, 12 December 2015
Shewanella oneidensis
Our Gene Database Testing Report
Group Paper - File:Final Report 20151218 2 HMH.docx
Group Members
- Coder: Mary Alverson
- GenMAPP User & Project Manager: Ron Legaspi
- Quality Assurance: Josh Kuroda
- GenMAPP User: Emily Simso
Important Links
Our Files
Our Deliverables
Gene Database Project Links
Overview | Deliverables | Reference Format | Guilds | Project Manager | GenMAPP User | Quality Assurance | Coder
Teams | Heavy Metal HaterZ | The Class Whoopers | GÉNialOMICS | Oregon Trail Survivors
Individual Journal Entries
- Mary Alverson: Week 11 | Week 12 | Week 14 | Week 15
- Emily Simso: Week 11 | Week 12 | Week 14 | Week 15
- Ron Legaspi: Week 11 | Week 12 | Week 14 | Week 15
- Josh Kuroda: Week 11 | Week 12 | Week 14 | Week 15
Goals for Week 15
Data Preparation and Statistical Analysis for GenMAPP
- Consult with Dr. Dahlquist on how to process the data (normalization, statistics)
- Perform the statistical analysis in Excel.
- Format the gene expression data for import into GenMAPP.
Note: These goals are similar to the goals from Week 14.
Summary of Progress and Procedure
Compiling Raw Data and Statistical Analysis
December 8, 2015
- Calculated averages from the split data.
- Found that the data set contains a total of 5408 genes.
- Calculated the biological average for each time point.
- Calculated the AverageLogRatio comparing C5, C20, and C60 to C0, and F5, F20, and F60 to C60.
- Subtracted rather than divided, because the values are in log space.
- Performed a t test (TTest) on the above comparisons to get the Pvalue.
- Applied the Bonferroni correction to the p values.
- Applied the Benjamini & Hochberg correction to the p values.
- The Excel file resulting from these procedures was uploaded to XMLPipeDB: File:StatisticalAnalysis Shewanella RARL 20151207 HMH.xlsx
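The steps above were carried out in Excel. As a rough cross-check, the log-ratio and correction steps can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the Benjamini & Hochberg routine is the standard step-up form (which may differ in detail from the worksheet formulas actually used), and the t test itself is not reproduced because the journal does not state its exact settings.

```python
def avg_log_ratio(treatment_logs, control_logs):
    """Average log ratio of two groups of log-transformed values.

    In log space a ratio becomes a difference: log(a/b) = log(a) - log(b),
    so the group means are subtracted, not divided.
    """
    mean_treatment = sum(treatment_logs) / len(treatment_logs)
    mean_control = sum(control_logs) / len(control_logs)
    return mean_treatment - mean_control


def bonferroni(pvalues):
    """Multiply each p value by the number of tests, capped at 1."""
    n = len(pvalues)
    return [min(1.0, p * n) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Standard Benjamini & Hochberg step-up adjustment (assumed form).

    Each p value is scaled by n / rank (rank 1 = smallest p), and the
    adjusted values are forced to be non-decreasing with rank.
    """
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down to the smallest.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

With 5408 genes, the Bonferroni multiplier is 5408, which is why so few genes stay below 0.05 after that correction, while the Benjamini & Hochberg adjustment is less strict.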
December 10, 2015
Prepared the compiled raw data for GenMAPP and created a .txt file.
File:StatisticalAnalysis Shewanella RARL 20151210 HMH.xlsx
.txt file: File:CompiledRawDataforGenMAPP Shewanella RARL 20151210 HMH.txt
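The export itself was done from Excel. For illustration only, writing a table of results to a text file could look like the sketch below. It assumes the .txt file is tab-delimited; the column names and the gene ID are hypothetical placeholders, not the actual columns of the uploaded file.

```python
import csv
import io


def write_tab_delimited(header, rows, handle):
    """Write a header row and data rows as tab-delimited text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# Hypothetical columns and a placeholder gene ID, for illustration only.
header = ["ID", "AvgLogRatio", "Pvalue"]
rows = [["GENE_A", 0.31, 0.02]]

buffer = io.StringIO()  # stands in for open("output.txt", "w", newline="")
write_tab_delimited(header, rows, buffer)
print(buffer.getvalue())
```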
Sanity Check
Importance of the Sanity Check (from the DNA Microarray Analysis Activity): In summary, the p value cut-off should not be thought of as some magical number at which data becomes "significant". Instead, it is a moveable confidence level. If we want to be very confident of our data, we use a small p value cut-off. If we are OK with being less confident about a gene expression change and want to include more genes in our analysis, we can use a larger p value cut-off. For the GenMAPP analysis below, we will use an average log fold change cut-off of greater than 0.25 or less than -0.25 and an unadjusted p value cut-off of p < 0.05, because we want to include several hundred genes in our analysis. (Note: The "AvgLogRatio" tells us the size of the gene expression change and in which direction. Positive values are increases relative to the control; negative values are decreases relative to the control.)
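The filters described here were applied in Excel. The same criteria can be sketched in Python as follows. This is an illustration under assumptions: the function names are hypothetical, the gene values are made up, and the log ratios are assumed to be base 2 (the journal does not state the base).

```python
def count_passing(genes, p_cutoff=0.05, ratio_cutoff=0.25):
    """Count genes passing the p value and AvgLogRatio cut-offs.

    `genes` is a list of (avg_log_ratio, pvalue) pairs.
    Returns (increased, decreased) counts.
    """
    increased = sum(1 for ratio, p in genes
                    if p < p_cutoff and ratio > ratio_cutoff)
    decreased = sum(1 for ratio, p in genes
                    if p < p_cutoff and ratio < -ratio_cutoff)
    return increased, decreased


def fold_change(log2_ratio):
    """Convert a base-2 log ratio back to a linear fold change (assumes log2)."""
    return 2 ** log2_ratio


# Made-up example values, not taken from the data set.
example = [(0.40, 0.01), (0.10, 0.03), (-0.30, 0.04), (-0.50, 0.20)]
print(count_passing(example))
print(round(fold_change(0.25), 2))
```

Under the base-2 assumption, a log ratio of 0.25 corresponds to a fold change of about 1.19, which is in line with the "about 20%" figure quoted in the questions below.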
- C5 and C0
- How many genes have p value < 0.05? and what is the percentage (out of 5408)?
- 344 genes, 6.36%
- What about p < 0.01? and what is the percentage (out of 5408)?
- 94 genes, 1.74%
- What about p < 0.001? and what is the percentage (out of 5408)?
- 18 genes, 0.33%
- What about p < 0.0001? and what is the percentage (out of 5408)?
- 5 genes, 0.09%
- How many genes are p < 0.05 for the Bonferroni-corrected p value? and what is the percentage (out of 5408)?
- 2 genes, 0.04%
- How many genes are p < 0.05 for the Benjamini and Hochberg-corrected p value? and what is the percentage (out of 5408)?
- 2 genes, 0.04%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change greater than zero. How many are there? (and %)
- 180 genes, 3.33%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change less than zero. How many are there? (and %)
- 164 genes, 3.03%
- What about an average log fold change of > 0.25 and p < 0.05? (and %)
- 161 genes, 2.98%
- Or an average log fold change of < -0.25 and p < 0.05? (and %) (These are more realistic values for the fold change cut-offs because it represents about a 20% fold change which is about the level of detection of this technology.)
- 149 genes, 2.76%
- C20 and C0
- How many genes have p value < 0.05? and what is the percentage (out of 5408)?
- 868 genes, 16.05%
- What about p < 0.01? and what is the percentage (out of 5408)?
- 342 genes, 6.32%
- What about p < 0.001? and what is the percentage (out of 5408)?
- 79 genes, 1.46%
- What about p < 0.0001? and what is the percentage (out of 5408)?
- 14 genes, 0.26%
- How many genes are p < 0.05 for the Bonferroni-corrected p value? and what is the percentage (out of 5408)?
- 1 gene, 0.02%
- How many genes are p < 0.05 for the Benjamini and Hochberg-corrected p value? and what is the percentage (out of 5408)?
- 34 genes, 0.63%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change greater than zero. How many are there? (and %)
- 452 genes, 8.36%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change less than zero. How many are there? (and %)
- 416 genes, 7.69%
- What about an average log fold change of > 0.25 and p < 0.05? (and %)
- 437 genes, 8.08%
- Or an average log fold change of < -0.25 and p < 0.05? (and %) (These are more realistic values for the fold change cut-offs because it represents about a 20% fold change which is about the level of detection of this technology.)
- 405 genes, 7.49%
- C60 and C0
- How many genes have p value < 0.05? and what is the percentage (out of 5408)?
- 1017 genes, 18.81%
- What about p < 0.01? and what is the percentage (out of 5408)?
- 471 genes, 8.71%
- What about p < 0.001? and what is the percentage (out of 5408)?
- 163 genes, 3.01%
- What about p < 0.0001? and what is the percentage (out of 5408)?
- 53 genes, 0.98%
- How many genes are p < 0.05 for the Bonferroni-corrected p value? and what is the percentage (out of 5408)?
- 13 genes, 0.24%
- How many genes are p < 0.05 for the Benjamini and Hochberg-corrected p value? and what is the percentage (out of 5408)?
- 229 genes, 4.23%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change greater than zero. How many are there? (and %)
- 487 genes, 9.01%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change less than zero. How many are there? (and %)
- 530 genes, 9.80%
- What about an average log fold change of > 0.25 and p < 0.05? (and %)
- 475 genes, 8.78%
- Or an average log fold change of < -0.25 and p < 0.05? (and %) (These are more realistic values for the fold change cut-offs because it represents about a 20% fold change which is about the level of detection of this technology.)
- 513 genes, 9.49%
- F5 and C60
- How many genes have p value < 0.05? and what is the percentage (out of 5408)?
- 969 genes, 17.92%
- What about p < 0.01? and what is the percentage (out of 5408)?
- 315 genes, 5.82%
- What about p < 0.001? and what is the percentage (out of 5408)?
- 40 genes, 0.74%
- What about p < 0.0001? and what is the percentage (out of 5408)?
- 7 genes, 0.13%
- How many genes are p < 0.05 for the Bonferroni-corrected p value? and what is the percentage (out of 5408)?
- 1 gene, 0.02%
- How many genes are p < 0.05 for the Benjamini and Hochberg-corrected p value? and what is the percentage (out of 5408)?
- 4 genes, 0.07%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change greater than zero. How many are there? (and %)
- 479 genes, 8.86%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change less than zero. How many are there? (and %)
- 490 genes, 9.06%
- What about an average log fold change of > 0.25 and p < 0.05? (and %)
- 441 genes, 8.15%
- Or an average log fold change of < -0.25 and p < 0.05? (and %) (These are more realistic values for the fold change cut-offs because it represents about a 20% fold change which is about the level of detection of this technology.)
- 431 genes, 7.97%
- F20 and C60
- How many genes have p value < 0.05? and what is the percentage (out of 5408)?
- 1838 genes, 33.99%
- What about p < 0.01? and what is the percentage (out of 5408)?
- 892 genes, 16.49%
- What about p < 0.001? and what is the percentage (out of 5408)?
- 239 genes, 4.42%
- What about p < 0.0001? and what is the percentage (out of 5408)?
- 54 genes, 1.00%
- How many genes are p < 0.05 for the Bonferroni-corrected p value? and what is the percentage (out of 5408)?
- 10 genes, 0.18%
- How many genes are p < 0.05 for the Benjamini and Hochberg-corrected p value? and what is the percentage (out of 5408)?
- 707 genes, 13.07%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change greater than zero. How many are there? (and %)
- 826 genes, 15.27%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change less than zero. How many are there? (and %)
- 1012 genes, 18.71%
- What about an average log fold change of > 0.25 and p < 0.05? (and %)
- 788 genes, 14.57%
- Or an average log fold change of < -0.25 and p < 0.05? (and %) (These are more realistic values for the fold change cut-offs because it represents about a 20% fold change which is about the level of detection of this technology.)
- 963 genes, 17.81%
- F60 and C60
- How many genes have p value < 0.05? and what is the percentage (out of 5408)?
- 2070 genes, 38.28%
- What about p < 0.01? and what is the percentage (out of 5408)?
- 1140 genes, 21.08%
- What about p < 0.001? and what is the percentage (out of 5408)?
- 387 genes, 7.16%
- What about p < 0.0001? and what is the percentage (out of 5408)?
- 120 genes, 2.22%
- How many genes are p < 0.05 for the Bonferroni-corrected p value? and what is the percentage (out of 5408)?
- 33 genes, 0.61%
- How many genes are p < 0.05 for the Benjamini and Hochberg-corrected p value? and what is the percentage (out of 5408)?
- 1193 genes, 22.06%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change greater than zero. How many are there? (and %)
- 870 genes, 16.09%
- Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "AvgLogRatio" column to show all genes with an average log fold change less than zero. How many are there? (and %)
- 1200 genes, 22.19%
- What about an average log fold change of > 0.25 and p < 0.05? (and %)
- 828 genes, 15.31%
- Or an average log fold change of < -0.25 and p < 0.05? (and %) (These are more realistic values for the fold change cut-offs because it represents about a 20% fold change which is about the level of detection of this technology.)
- 1146 genes, 21.19%
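The percentages in the sanity check are each count divided by the 5408 genes in the data set. A small Python sketch of that arithmetic is given below, with a consistency check that the increased and decreased counts add up to the total at p < 0.05. The function name is hypothetical, and rounding to two decimal places is an assumption based on how most values are reported.

```python
TOTAL_GENES = 5408  # total number of genes reported in this journal


def percent_of_total(count, total=TOTAL_GENES):
    """Percentage of genes, rounded to two decimal places."""
    return round(100 * count / total, 2)


# C5 vs. C0: 344 genes at p < 0.05, split into 180 increased and 164 decreased.
print(percent_of_total(344))   # 6.36
print(180 + 164 == 344)        # True

# F60 vs. C60: 2070 genes at p < 0.05, split into 870 increased and 1200 decreased.
print(percent_of_total(2070))  # 38.28
print(870 + 1200 == 2070)      # True
```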
External Links
Ron Legaspi
BIOL 367, Fall 2015
Assignment Links
- Week 1 Assignment
- Week 2 Assignment
- Week 3 Assignment
- Week 4 Assignment
- Week 5 Assignment
- Week 6 Assignment
- Week 7 Assignment
- Week 8 Assignment
- Week 9 Assignment
- Week 10 Assignment
- Week 11 Assignment
- Week 12 Assignment
- Week 14 Assignment
- Week 15 Assignment
Individual Weekly Journals
- Individual Journal Week 1 - This is my User Page
- Individual Journal Week 2
- Individual Journal Week 3
- Individual Journal Week 4
- Individual Journal Week 5
- Individual Journal Week 6
- Individual Journal Week 7
- Individual Journal Week 8
- Individual Journal Week 9
- Individual Journal Week 10
- Individual Journal Week 11
- Individual Journal Week 12
- Individual Journal Week 14
- Individual Journal Week 15