Difference between revisions of "Kmeilak Week 8"

Revision as of 04:03, 16 October 2013

Overview of Microarray Data Analysis

Electronic Lab Notebook

10/10/13

Downloaded Merrell Compiled Raw Data file from Sample Microarray Analysis for Vibrio cholerae page
Saved as Merrell_Compiled_Raw_Data_Vibrio_KM_20131010.xls
Opened file in Excel; created second worksheet and named it scaled_centered
Copied all data from compiled_raw_data worksheet into scaled_centered worksheet
Inserted two rows underneath header row (ID, A1, etc)
Calculated average and standard deviation for each column {i.e. =AVERAGE(B4:B5224); =STDEV(B4:B5224)} by typing each function into the appropriately labeled row and copying and pasting the formulas across all columns.
Calculated the scaled and centered values by subtracting the column average from each value in that column and dividing by the column standard deviation {i.e. =(B4-B$2)/B$3}
Inserted a new worksheet and named it "statistics".
Copied and pasted all of scaled_centered worksheet into statistics worksheet (note: did paste special values only).
Added three new columns: "Avg_LogFC_A", "Avg_LogFC_B", "Avg_LogFC_C"
Computed average log fold change {i.e. =AVERAGE(B2:E2)} for all patients
Computed average of averages of three patients in new column titled "Avg_LogFC_all"
Created a new column titled "Tstat" in order to run a T test using the following equation {=AVERAGE(N2:P2)/(STDEV(N2:P2)/SQRT(number of replicates))}. The T test was run in order to see which, if any, of the scaled and centered average log ratios are significantly different from 0 (no change).
Created a new column titled "Pvalue". Calculated P value using the following equation {=TDIST(ABS(R2),degrees of freedom,2)}
Created a new worksheet titled "forGENMAPP".
Copied and pasted everything in "statistics" worksheet into "forGENMAPP" worksheet (note: did paste special values only).
Selected all fold changes and formatted cells under number tab to 2 decimal places.
Columns R and S were set to 4 decimal places in the same manner
Columns N through S were cut and inserted next to column B
Deleted rows "Average" and "StDev"
Added "SystemCode" column to the right of "ID" column and put "N" as value for all rows.
Saved as Tab-delimited Text file.
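
The spreadsheet calculations above (scaling and centering, the T statistic, and the P value) can also be sketched in Python as a cross-check. This is only an illustration, not the procedure actually used: the function names and the sample numbers are made up, and the P value function assumes exactly three replicates (patients A, B, and C), i.e. 2 degrees of freedom, where the t distribution has a simple closed form.

```python
import math
import statistics


def scale_and_center(column):
    """Subtract the column average and divide by the column standard deviation.

    Mirrors the spreadsheet formula =(B4-B$2)/B$3 applied down one column.
    """
    avg = statistics.mean(column)
    sd = statistics.stdev(column)  # sample SD (n - 1), like Excel's STDEV
    return [(value - avg) / sd for value in column]


def t_statistic(values):
    """One-sample t statistic against 0: AVERAGE / (STDEV / SQRT(n))."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


def two_tailed_p_df2(t):
    """Two-tailed P value for a t distribution with exactly 2 degrees of freedom.

    Assumption: 3 replicates, so df = 2. For any other df this closed form
    does not apply and a statistics library should be used instead.
    """
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)


# Hypothetical per-patient average log fold changes for one gene
avg_logfc = [1.0, 2.0, 3.0]
tstat = t_statistic(avg_logfc)
pvalue = two_tailed_p_df2(tstat)
print(f"Tstat = {tstat:.4f}, Pvalue = {pvalue:.4f}")
```

With 2 degrees of freedom, two_tailed_p_df2 should agree with =TDIST(ABS(t),2,2) in Excel, so a few rows of the "statistics" worksheet can be spot-checked this way.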

10/15/13

Opened

Top 10 Gene Ontology terms

macromolecule metabolic process
cellular macromolecule metabolic process
macromolecule biosynthesis process
biopolymer metabolic process
cell projection organization
branched chain family amino acid metabolic process
amino acid metabolic process
cellular amino acid and derivative metabolic process
cellular nitrogen compound metabolic process
cellular amine metabolic process

Questions

1.



Files
