Difference between revisions of "Lenaolufson Week 15"

From LMU BioDB 2015
(added the electronic notebook for formatting the file for GenMAPP)
(12/8/15: added in protocol for sanity check)
Line 1: Line 1:
 
==12/8/15==
 
*It was now time for me to prepare my file for GenMAPP, and I did so by following the [http://www.openwetware.org/wiki/BIOL398-01/S10:Sample_Microarray_Analysis_Vibrio_cholerae ''Vibrio cholerae'' instructions found here].
+
*It was now time for me to prepare my file for GenMAPP, and I did so by following the [http://www.openwetware.org/wiki/BIOL398-01/S10:Sample_Microarray_Analysis_Vibrio_cholerae ''Vibrio cholerae'' instructions found here].
 
* I inserted a new worksheet and named it "forGenMAPP".
 
 
* I went back to the "statistics" worksheet and Selected All and Copied.
 
Line 14: Line 14:
 
** [[File:Bpertussis CompiledRawData MS2015-3.xlsx]]
 
 
** [[File:Bpertussis CompiledRawData MS2015-3.txt]]
 
 +
* Then it was time to perform a sanity check, which I did by following the [http://www.openwetware.org/wiki/BIOL398-01/S10:Sample_Microarray_Analysis_Vibrio_cholerae ''Vibrio cholerae'' instructions found here].
 +
* I opened my spreadsheet and went to the "forGenMAPP" tab.
 +
* I clicked on cell A1 and selected the menu item Data > Filter > Autofilter. Little drop-down arrows appeared at the top of each column. This enabled me to filter the data according to criteria I set.
 +
* I clicked on the drop-down arrow on my "Pvalue" column. I selected "Custom". In the window that appeared, I set a criterion that filtered my data so that the Pvalue was less than 0.05.
 +
**p-value less than 0.05: 1923/3552, 54%
 +
**p-value less than 0.01: 1028/3552, 29%
 +
**p-value less than 0.001: 242/3552, 7%
 +
**p-value less than 0.0001: 40/3552, 1%
 +
**p < 0.05 for the Bonferroni-corrected p value: 9/3552, 0.3%
 +
**p < 0.05 for the Benjamini and Hochberg-corrected p value: 1365/3552, 38%
 +
*Keeping the (unadjusted) "Pvalue" filter at p < 0.05, I filtered the "Avg_ABC_Samples" column to show all genes with an average log fold change greater than zero.
 +
**964/3552, 27%
 +
*Keeping the (unadjusted) "Pvalue" filter at p < 0.05, I filtered the "Avg_ABC_Samples" column to show all genes with an average log fold change less than zero. 
 +
**959/3552, 27%
 +
*I then filtered for an average log fold change of > 0.25 and p < 0.05.
 +
**874/3552, 25%
 +
*I then filtered for an average log fold change of < -0.25 and p < 0.05.
 +
**848/3552, 24%
 +
* Finally, I counted the genes meeting either fold change cut-off (greater than 0.25 or less than -0.25) along with the unadjusted p value cut-off of p < 0.05.
 +
**1722/3552, 48%
  
*p-value less than 0.05: 1923/3552
 
*p-value less than 0.01: 1028/3552
 
*p-value less than 0.001: 242/3552
 
*p-value less than 0.0001: 40/3552
 
*p < 0.05 for the Bonferroni-corrected p value: 9/3552
 
*p < 0.05 for the Benjamini and Hochberg-corrected p value: 1365/3552 [[File:Gene ontology results.png]]
 
*Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "Avg_LogFC_all" column to show all genes with an average log fold change greater than zero. How many are there?
 
**964/3552
 
*Keeping the (unadjusted) "Pvalue" filter at p < 0.05, filter the "Avg_LogFC_all" column to show all genes with an average log fold change less than zero. How many are there?
 
**959/3552
 
*What about an average log fold change of > 0.25 and p < 0.05?
 
**874/3552
 
*Or an average log fold change of < -0.25 and p < 0.05?
 
**848/3552
 
* the fold change cut-off of greater than 0.25 or less than -0.25 and the unadjusted p value cut off of p < 0.05?
 
**1722/3552
 
 
*numbers of errors found when running the .txt file in the GenMAPP database: 342 errors
 
 
*23 replacements for the #DIV/0!
 

Revision as of 21:29, 10 December 2015

12/8/15

  • It was now time for me to prepare my file for GenMAPP, and I did so by following the Vibrio cholerae instructions found here.
  • I inserted a new worksheet and named it "forGenMAPP".
  • I went back to the "statistics" worksheet and Selected All and Copied.
  • I went to my new sheet and clicked on cell A1 and selected Paste Special, clicked on the Values radio button, and clicked OK.
    • I then deleted the ID columns besides the far left one in column A, and I deleted the second MasterIndex column because it was unnecessary.
    • I added a "1" before all of the titles of columns D through I so that none of the columns would have the same names due to the replicates.
  • I selected Columns V through Y (all the fold changes). I selected the menu item Format > Cells. Under the number tab, I selected 2 decimal places. I clicked OK.
  • I selected all the columns containing p values. I selected the menu item Format > Cells. Under the number tab, I selected 4 decimal places. I clicked OK.
  • I deleted the left-most Bonferroni p value column, preserving the one that showed the result of my "if" statement.
  • I inserted a column to the right of the "ID" column. I typed the header "SystemCode" into the top cell of this column. I filled the entire column (each cell) with the letter "N".
  • I selected the menu item File > Save As, and chose "Text (Tab-delimited) (*.txt)" from the file type drop-down menu.
  • After preparing it for GenMAPP, here are the .xlsx and .txt files:
    • File:Bpertussis CompiledRawData MS2015-3.xlsx
    • File:Bpertussis CompiledRawData MS2015-3.txt
  • Then it was time to perform a sanity check, which I did by following the Vibrio cholerae instructions found here.
  • I opened my spreadsheet and went to the "forGenMAPP" tab.
  • I clicked on cell A1 and selected the menu item Data > Filter > Autofilter. Little drop-down arrows appeared at the top of each column. This enabled me to filter the data according to criteria I set.
  • I clicked on the drop-down arrow on my "Pvalue" column. I selected "Custom". In the window that appeared, I set a criterion that filtered my data so that the Pvalue was less than 0.05.
    • p-value less than 0.05: 1923/3552, 54%
    • p-value less than 0.01: 1028/3552, 29%
    • p-value less than 0.001: 242/3552, 7%
    • p-value less than 0.0001: 40/3552, 1%
    • p < 0.05 for the Bonferroni-corrected p value: 9/3552, 0.3%
    • p < 0.05 for the Benjamini and Hochberg-corrected p value: 1365/3552, 38%
  • Keeping the (unadjusted) "Pvalue" filter at p < 0.05, I filtered the "Avg_ABC_Samples" column to show all genes with an average log fold change greater than zero.
    • 964/3552, 27%
  • Keeping the (unadjusted) "Pvalue" filter at p < 0.05, I filtered the "Avg_ABC_Samples" column to show all genes with an average log fold change less than zero.
    • 959/3552, 27%
  • I then filtered for an average log fold change of > 0.25 and p < 0.05.
    • 874/3552, 25%
  • I then filtered for an average log fold change of < -0.25 and p < 0.05.
    • 848/3552, 24%
  • Finally, I counted the genes meeting either fold change cut-off (greater than 0.25 or less than -0.25) along with the unadjusted p value cut-off of p < 0.05.
    • 1722/3552, 48%
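The sanity-check counts above were made by hand with Excel's Autofilter. As a cross-check, the same filters could be sketched in Python. This is only an illustration under stated assumptions: the column names "Pvalue" and "Avg_ABC_Samples" come from the notebook, but the function names, the gene IDs, and the five toy rows are hypothetical and are not the real data.

```python
def count_hits(rows, p_cutoff=0.05, fc_cutoff=0.0):
    """Count genes passing the p-value filter, split by fold change direction.

    Returns (up, down): genes with average log fold change above fc_cutoff
    and genes with average log fold change below -fc_cutoff.
    """
    passing = [r for r in rows if r["Pvalue"] < p_cutoff]
    up = sum(1 for r in passing if r["Avg_ABC_Samples"] > fc_cutoff)
    down = sum(1 for r in passing if r["Avg_ABC_Samples"] < -fc_cutoff)
    return up, down


def percent(count, total):
    """Express a count as a whole-number percentage of the total."""
    return round(100 * count / total)


# Hypothetical toy rows standing in for the "forGenMAPP" worksheet.
toy_rows = [
    {"ID": "g1", "Pvalue": 0.001, "Avg_ABC_Samples": 0.80},
    {"ID": "g2", "Pvalue": 0.030, "Avg_ABC_Samples": 0.10},
    {"ID": "g3", "Pvalue": 0.040, "Avg_ABC_Samples": -0.50},
    {"ID": "g4", "Pvalue": 0.200, "Avg_ABC_Samples": 1.20},
    {"ID": "g5", "Pvalue": 0.049, "Avg_ABC_Samples": -0.05},
]

up_any, down_any = count_hits(toy_rows)                      # fold change > 0 or < 0
up_strict, down_strict = count_hits(toy_rows, fc_cutoff=0.25)  # > 0.25 or < -0.25
print(up_any, down_any, up_strict, down_strict)
print(percent(1923, 3552))  # percentage for the p < 0.05 count recorded above
```

The up and down counts at a given cut-off should add up to the combined count, which matches the numbers recorded above (964 + 959 = 1923 and 874 + 848 = 1722).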
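The GenMAPP file preparation steps earlier in this entry (rounding fold changes to 2 decimal places and p values to 4, adding a "SystemCode" column filled with "N" to the right of "ID", and saving as tab-delimited text) could also be sketched in Python. This is a rough sketch and not the procedure I actually used, which was done in Excel. The function names, the gene ID "BP0001", and the example values are hypothetical.

```python
import csv
import io


def to_genmapp_rows(rows, fold_change_cols, p_value_cols):
    """Format values and add a SystemCode column right after the ID column."""
    formatted = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            if key in fold_change_cols:
                value = f"{float(value):.2f}"   # fold changes: 2 decimal places
            elif key in p_value_cols:
                value = f"{float(value):.4f}"   # p values: 4 decimal places
            new_row[key] = value
            if key == "ID":
                new_row["SystemCode"] = "N"     # same letter in every row
        formatted.append(new_row)
    return formatted


def write_tab_delimited(rows, handle):
    """Write the rows as tab-delimited text with a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=list(rows[0].keys()), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical single-gene example, written to an in-memory buffer.
example = [{"ID": "BP0001", "Avg_ABC_Samples": "0.81234", "Pvalue": "0.000123"}]
formatted = to_genmapp_rows(example, {"Avg_ABC_Samples"}, {"Pvalue"})
buffer = io.StringIO()
write_tab_delimited(formatted, buffer)
text = buffer.getvalue()
print(text)
```

To produce a real file, an open file handle (for example `open("out.txt", "w", newline="")`, with a file name of your choosing) could be passed in place of the in-memory buffer.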