Kzebrows Week 14
From LMU BioDB 2015
Electronic Lab Notebook
First, I downloaded the most recent version of the file from the OTS Files page. I renamed all columns by replacing LR with LogFC. I renamed Sheet 1 "CompiledRawData", copied all of its data, and pasted it into Sheet 2, which I renamed "MasterSheet". Then I did the following:
- I deleted all ID columns except for column A, which I re-named "ID"
- I inserted a column to the left of column B and re-named it "MasterIndex"
- I typed "1" in cell B2 and "2" in cell B3 and selected both cells. I double-clicked on the + sign to fill the entire column with the numbers 1-4208.
- Selected the data and sorted A-->Z on the "ID" column
- Deleted rows that have an "Empty" or "Blank####" ID. This left me with 3,926 data rows (3,927 rows minus 1 header row).
- Sorted by MasterIndex column to put IDs back in original order, smallest to largest.
- Replaced "Error" with nothing and got 585 replacements.
- Copied and pasted the data into Sheet 3, which I renamed "ScalingCentering".
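The cleanup steps above can also be sketched in code. This is only an illustration in Python, not part of the assignment: the sample rows are made up, `clean_rows` is a hypothetical helper, and it assumes "Blank####" means the word "Blank" followed by digits. Because the rows are filtered in place, their original order is kept, so the sort on "ID" and the re-sort on "MasterIndex" are not needed here.

```python
import re

# Hypothetical miniature of the compiled sheet: (ID, LogFC value) pairs.
rows = [
    ("VC0001", "0.52"),
    ("Empty", "0.10"),
    ("VC0002", "Error"),
    ("Blank0001", "-0.31"),
    ("VC0003", "-1.20"),
]


def clean_rows(rows):
    """Mirror the spreadsheet cleanup on a list of (ID, value) tuples."""
    cleaned = []
    # enumerate(start=1) plays the role of the MasterIndex column.
    for master_index, (gene_id, value) in enumerate(rows, start=1):
        # Drop spots whose ID is "Empty" or "Blank" followed by digits (assumed pattern).
        if gene_id == "Empty" or re.fullmatch(r"Blank\d+", gene_id):
            continue
        # Replace "Error" with nothing, i.e. treat it as a missing value.
        number = None if value == "Error" else float(value)
        cleaned.append((master_index, gene_id, number))
    return cleaned


cleaned = clean_rows(rows)
```

Each tuple in `cleaned` holds the original MasterIndex, the ID, and either a number or `None` where the sheet had "Error".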
The next portion of the assignment was done by following the instructions in Sample Microarray Analysis Vibrio cholerae.
Normalizing the log ratios for the set of slides in the experiment
- I inserted a new worksheet and named it Scaled_Centered
- I copied all data from the MasterSheet and pasted it into cell A1 of the Scaled_Centered sheet
- I inserted two rows in between the top row of headers and first data row. I named cell A2 "Average" and named cell A3 "StdDev".
- In cell C2 I typed =AVERAGE(C4:C3929) and in cell C3 I typed =STDEV(C4:C3929). I pressed Enter and copied both formulas across the rest of the columns through column AL.
- I then copied the column headings for all data columns and pasted them to the right of the last column. Using copy and paste, I added "_Scaled_Centered" to the end of each new column name.
- In cell AM4 I typed =(C4-C$2)/C$3, which subtracts the column average from the value in cell C4 and then divides the result by the column standard deviation. I used the "$" sign so that the references to the average and standard deviation cells would not change when the formula was pasted down the entire column of genes.
- I then filled that formula down the entire column by clicking on the original cell and double-clicking on the black plus sign. I copied and pasted this formula for each column of the data.
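As a cross-check on the spreadsheet formulas, the same scaling and centering can be sketched in Python. This is a minimal illustration, not the method used in the assignment: `scale_and_center` is a hypothetical helper, the example values are made up, and leaving missing values as `None` is an assumption of this sketch. `statistics.stdev` computes the sample standard deviation, the same quantity as Excel's STDEV.

```python
from statistics import mean, stdev


def scale_and_center(column):
    """Return (x - average) / standard deviation for each value; None stays None."""
    # Like AVERAGE and STDEV, ignore empty cells when computing the statistics.
    present = [x for x in column if x is not None]
    avg = mean(present)
    sd = stdev(present)  # sample standard deviation
    return [None if x is None else (x - avg) / sd for x in column]


# Made-up log fold changes for one column, with one missing value.
example = [0.5, -1.0, None, 2.0, 0.5]
scaled = scale_and_center(example)
```

After this step, the non-missing values in each column have an average of 0 and a standard deviation of 1, which makes it easy to verify the spreadsheet result.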
Perform statistical analysis on the ratios
- I inserted a new worksheet and named it "Statistics"
- I copied and pasted the ID column from the ScalingCentering worksheet into the first column of the new worksheet
- I copied all Scaled_Centered columns from the ScalingCentering worksheet and pasted the values into the new sheet starting at cell B1
- Deleted the "Average" and "StdDev" rows