Difference between revisions of "Kzebrows Week 14"
From LMU BioDB 2015
(→Electronic Lab Notebook: Dr. Dahlquist's corrections for statistical analysis prep)
(→Electronic Lab Notebook: finished notebook entry for Dr. Dahlquist's corrections)
Revision as of 00:01, 2 December 2015
Electronic Lab Notebook
First, I downloaded the most recent version of the file from the OTS Files page. I renamed all columns by replacing LR with LogFC. I re-named Sheet 1 "CompiledRawData" and copied all of the data from it and pasted it into Sheet 2, which I re-named "MasterSheet". Next I:
- deleted all ID columns except for column A, which I re-named "ID"
- inserted a column to the left of column B and re-named it "MasterIndex"
- Typed "1" in cell B2 and "2" in cell B3, then selected both cells. I double-clicked the + sign (the fill handle) to fill the entire column with the numbers 1-4208.
- Selected the data and sorted A-->Z on the "ID" column
- Deleted rows that have an "Empty" or "Blank####" ID. This left me with 3,926 rows of data (3,927 rows minus 1 header row).
- Sorted by MasterIndex column to put IDs back in original order, smallest to largest.
- Replaced "Error" with nothing and got 585 replacements.
- Copied and pasted the data into Sheet 3, which I re-named "ScalingCentering".
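The cleanup steps above can also be sketched outside of Excel. This is a minimal illustration only, not what was actually done for the assignment: the row data, the gene IDs, and the column name `LogFC_t15` are made up, and the function `clean_rows` is a hypothetical helper. It assumes "Error" fills an entire cell, and it numbers the rows before filtering so the original order is kept without the sort/re-sort step.

```python
import re

# Hypothetical miniature of the MasterSheet: one dict per row.
# IDs and the LogFC_t15 column name are invented for illustration.
rows = [
    {"ID": "GENE_A", "LogFC_t15": "0.52"},
    {"ID": "Empty", "LogFC_t15": "0.01"},
    {"ID": "GENE_B", "LogFC_t15": "Error"},
    {"ID": "Blank0042", "LogFC_t15": "-0.10"},
    {"ID": "GENE_C", "LogFC_t15": "-1.30"},
]


def clean_rows(rows):
    """Add a MasterIndex, drop Empty/Blank#### IDs, and blank out "Error" cells."""
    # Matches exactly "Empty" or "Blank" followed by digits.
    blank_id = re.compile(r"^(Empty|Blank\d+)$")
    cleaned = []
    replacements = 0
    # MasterIndex records the original position, starting at 1.
    for index, row in enumerate(rows, start=1):
        if blank_id.match(row["ID"]):
            continue  # skip spots that are not real genes
        new_row = {"ID": row["ID"], "MasterIndex": index}
        for key, value in row.items():
            if key == "ID":
                continue
            if value == "Error":
                value = ""  # replace "Error" with nothing
                replacements += 1
            new_row[key] = value
        cleaned.append(new_row)
    return cleaned, replacements


cleaned, n_replaced = clean_rows(rows)
print(len(cleaned), "rows kept;", n_replaced, "Error cells blanked")
```

Counting the kept rows and the replacements gives the same kind of check as the 3,926 rows and 585 replacements noted above.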
The next portion of the assignment was done by following the instructions in Sample Microarray Analysis Vibrio cholerae.
Normalizing the log ratios for the set of slides in the experiment
- I inserted a new worksheet and named it Scaled_Centered
- I copied all data from the MasterSheet and pasted it into cell A1 of the Scaled_Centered sheet
- I inserted two rows in between the top row of headers and first data row. I named cell A2 "Average" and named cell A3 "StdDev".
- In cell C2 I typed =AVERAGE(C4:C3929) and in cell C3 I typed =STDEV(C4:C3929). I pressed enter and copied these two formulas across the rest of the columns through column AL.
- I then copied the column headings for all data columns and pasted them to the right of the last column. Using the copy/paste tool I renamed each column with "_Scaled_Centered" at the end.
- In cell AM4 I typed =(C4-C$2)/C$3 indicating that I wanted data in cell C4 to have the average subtracted from it and then to divide it by the standard deviation. I used the "$" sign to indicate that I did not want the average and standard deviation values to change even when the equation was pasted for the entire column of genes.
- I then copied that equation down the entire column by clicking on the original cell and double-clicking on the black plus sign. I repeated this for each column of the data.
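The scaling and centering formula =(C4-C$2)/C$3 can be illustrated with a short Python sketch. The function name `scale_and_center` and the sample numbers are hypothetical, and the sketch assumes the column has no blank cells. `statistics.stdev` computes the sample standard deviation, which is what Excel's STDEV computes as well.

```python
from statistics import mean, stdev


def scale_and_center(values):
    """Subtract the column average from each value, then divide by the column standard deviation."""
    avg = mean(values)   # plays the role of the "Average" row (C$2)
    sd = stdev(values)   # plays the role of the "StdDev" row (C$3), sample standard deviation
    return [(v - avg) / sd for v in values]


# Made-up log ratios for one column, chosen so the result is easy to check by hand.
column = [2.0, 4.0, 6.0]
scaled = scale_and_center(column)
print(scaled)  # → [-1.0, 0.0, 1.0]
```

After this step each column has an average of 0 and a standard deviation of 1, which is why the same average and standard deviation are reused for every gene in the column.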
Perform statistical analysis on the ratios
- I inserted a new worksheet and named it "Statistics"
- I copied and pasted the ID column from the ScalingCentering worksheet into the first column of the new worksheet
- I copied all Scaled_Centered columns from the ScalingCentering worksheet and pasted the values into the new sheet starting at cell B1
- Deleted the "Average" and "StdDev" rows