Difference between revisions of "Taur.vil Week 8"
From LMU BioDB 2013
(adding digital files)
(→Part One Document Files:: finished notebook recording for last week)
Revision as of 16:44, 15 October 2013
Uploaded Files:
Part One Document Files:
:[[Media:Vilgalys_2013_10_10_VibrioFileOne.xls|excel]]
Digital Notebook:
- Downloaded original data (Merrell_Compiled_Raw_Data_Vibrio.xls) from http://www.openwetware.org/wiki/BIOL398-01/S10:Sample_Microarray_Analysis_Vibrio_cholerae
- Observed that the data collected had already been log transformed (there were negative numbers)
- This meant we could begin at the normalization step.
- Created a new sheet in Excel, titled scaled_centered.
- In scaled_centered, inserted two new rows and calculated the average and standard deviation for each replicate using the Excel functions AVERAGE and STDEV.
- Created a new column for each of the samples, relabeling them with _sc (for scaled centered) after the name. Filled these columns with the scaled centered values, calculated as (the raw value minus the average for the sample (row 2)) divided by the standard deviation for the sample (row 3).
- Created a new worksheet called statistics and copied the ID column into the new worksheet.
- Pasted values only for the scaled and centered columns.
- Deleted the rows for average and standard deviation
- Inserted columns to the right of the data for the average log fold change (FC) of each patient and calculated the value by taking the average of the three technical replicates.
- Calculated the t-stat for each gene in a new column by dividing the average of the three biological replicates by (the standard deviation of the biological replicates divided by the square root of the sample size, which was three).
- Calculated the p-value in a new column by using Excel's TDIST function, where the three entries were the absolute value of the t-stat, 2 (degrees of freedom), and 2 (tails).
- Took an average FC for each of the three biological replicates.
- Copied into a new page titled forGenMAPP and inserted column 2 (System Code) where N was entered for each row.
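The scaling and centering step above can be sketched outside of Excel as follows. This is only a minimal illustration, not the spreadsheet itself: the function name `scale_and_center` and the sample numbers are made up for the example, and `statistics.stdev` is used because, like Excel's STDEV, it computes the sample standard deviation.

```python
from statistics import mean, stdev


def scale_and_center(values):
    """Return (value - average) / standard deviation for one sample column."""
    avg = mean(values)   # plays the role of the AVERAGE row
    sd = stdev(values)   # sample standard deviation, like Excel's STDEV
    return [(x - avg) / sd for x in values]


# Hypothetical log ratios for one sample column (not taken from the Vibrio file).
column = [-1.2, 0.4, 0.9, -0.3, 1.7]
scaled = scale_and_center(column)
print(scaled)
```

After this step each column has an average of zero and a standard deviation of one, which is what makes the samples comparable to each other.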
Comparing to zero, which is the null hypothesis of no change:
1) magnitude of the change observed
2) variation and number of replicates
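The t-stat and p-value steps in the notebook can be sketched in the same way. This is an assumption-laden illustration rather than the exact spreadsheet: the function names and the three fold-change values are hypothetical, and the p-value uses the closed-form two-tailed formula for a t-distribution with 2 degrees of freedom, which should correspond to TDIST(|t|, 2, 2) but does not apply to any other degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def t_stat(replicates):
    """One-sample t-stat against zero: average / (standard deviation / sqrt(n))."""
    n = len(replicates)
    return mean(replicates) / (stdev(replicates) / sqrt(n))


def two_tailed_p_df2(t):
    """Two-tailed p-value for a t-distribution with exactly 2 degrees of freedom.

    Closed form valid only for df = 2: p = 1 - |t| / sqrt(2 + t^2).
    """
    return 1 - abs(t) / sqrt(2 + t * t)


# Hypothetical average log fold changes for three biological replicates of one gene.
fold_changes = [1.0, 2.0, 3.0]
t = t_stat(fold_changes)
p = two_tailed_p_df2(t)
print(t, p)
```

A larger average change makes the t-stat larger, while more variation between replicates or fewer replicates makes it smaller, which matches the two points in the note above.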