Lenaolufson Week 14
From LMU BioDB 2015
Revision as of 02:21, 3 December 2015 by Lenaolufson (Talk | contribs) (added in the protocol followed in class for the data)
12/1/15
- I followed the protocol that Dr. Dahlquist left on my talk page to edit the Excel data sheet. The protocol is as follows:
- I renamed Sheet1 to "CompiledRawData".
- I renamed my column headings as follows:
- I called my leftmost column "ID" instead of Code.
- For the data columns, I removed the "(635/532)" from each header. For example, I named one column "LogRatio_SampleA_Cy3-Cy5".
- Once I had renamed the columns, I did all further manipulations in a different sheet. I copied and pasted all of the data into Sheet2, which I renamed "DyeSwap".
- I created a "MasterIndex" column as follows. I inserted a new column to the right of the "ID" column and named it "MasterIndex". In this column I created a numerical index of genes so that I can always sort them back into the same order that they started out in.
- I typed a "1" in cell B2 and a "2" in cell B3.
- I selected both cells. I hovered my mouse over the bottom-right corner of the selection until it made a thin black + sign. I double-clicked on the + sign to fill the entire column with a series of numbers from 1 to 8448 (the number of spots on the microarray).
- Then, I selected all of the data and sorted it A-->Z on the "ID" column.
- I deleted all of the rows that had an ID of "_". The number of records after deleting the "_" rows: 7104.
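The indexing, sorting, and filtering steps above can be sketched in Python instead of Excel. This is only an illustration under stated assumptions: the records, gene names, and function names are hypothetical, and Python's sort is case-sensitive while Excel's A-->Z sort is not.

```python
# Hypothetical records standing in for the spreadsheet rows (the real sheet has 8448 spots).
records = [
    {"ID": "geneB", "LogRatio": 0.5},
    {"ID": "_", "LogRatio": 0.1},
    {"ID": "geneA", "LogRatio": -1.2},
    {"ID": "_", "LogRatio": 0.0},
]

def add_master_index(rows):
    """Attach a 1-based MasterIndex that records each row's original position."""
    return [{**row, "MasterIndex": i} for i, row in enumerate(rows, start=1)]

def sort_and_drop_blanks(rows, blank="_"):
    """Sort rows by ID, then remove the rows whose ID is the blank marker."""
    by_id = sorted(rows, key=lambda row: row["ID"])
    return [row for row in by_id if row["ID"] != blank]

indexed = add_master_index(records)
kept = sort_and_drop_blanks(indexed)

# Sorting on MasterIndex restores the original spot order for the surviving rows.
restored = sorted(kept, key=lambda row: row["MasterIndex"])
```

Because the index is added before any sorting, the rows that remain can always be put back in their starting order, which is the purpose of the "MasterIndex" column.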
- Then I swapped the dye orientation so that all of the samples were Cy5/Cy3.
- I inserted a column to the right of each column that needed to be swapped. I gave the new column the same name as the original, but added "_swapped" to the header to show that I had swapped the dye orientation.
- Then, I typed a formula in the column: =C2*(-1). I copied and pasted the formula to the entire column.
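Multiplying by -1 swaps the dye orientation because log(Cy3/Cy5) = -log(Cy5/Cy3). A minimal Python sketch of that step follows; the sample values, the intensities, and the choice of log base 2 are assumptions made for illustration (the identity holds in any base).

```python
import math

def swap_dye(log_ratios):
    """Negate each log ratio, the same operation as the =C2*(-1) formula."""
    return [-value for value in log_ratios]

# Hypothetical Cy3/Cy5 log ratios for one sample column.
cy3_over_cy5 = [1.0, -0.5, 2.0]
cy5_over_cy3 = swap_dye(cy3_over_cy5)

# Check the identity on hypothetical raw intensities.
cy3, cy5 = 800.0, 200.0
identity_holds = math.isclose(-math.log2(cy3 / cy5), math.log2(cy5 / cy3))
```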
- I created a new worksheet that I named "MasterSheet". Using Paste special > Paste values, I copied over the ID, MasterIndex, and data columns that were all in the Cy5/Cy3 orientation (the original ones and the ones I had just swapped).
- This was then the starting point for the normalization and statistics. I copied and pasted the data from this sheet into a new worksheet, which I renamed "ScalingCentering".
- In this new sheet, I performed the scaling and centering according to the Vibrio cholerae instructions found here.
- When I computed the average and standard deviation of the log ratios, all of the values came out much too high to make sense for the data. When I looked at the data with Dr. Dahlquist, we found that some of the values in the raw data were extremely large, such as 100000.
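A short Python sketch shows how a single very large value can distort these statistics. It assumes that scaling and centering means subtracting a column's average and dividing by its sample standard deviation; the linked instructions are not reproduced here, so treat that as an assumption. The numbers are made up apart from the 100000 mentioned above.

```python
import statistics

def scale_and_center(values):
    """Subtract the average and divide by the sample standard deviation (assumed procedure)."""
    average = statistics.mean(values)
    spread = statistics.stdev(values)
    return [(value - average) / spread for value in values]

# Hypothetical log ratios for one column, with and without an extreme raw value.
clean = [-1.0, -0.5, 0.0, 0.5, 1.0]
with_outlier = clean + [100000.0]

clean_average = statistics.mean(clean)
outlier_average = statistics.mean(with_outlier)
outlier_spread = statistics.stdev(with_outlier)

# After scaling and centering, the clean column has average 0 and standard deviation 1.
scaled = scale_and_center(clean)
```

With the extreme value included, both the average and the standard deviation are in the tens of thousands, far outside the range expected for log ratios, which matches the problem described above.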
- At this point, I posted my spreadsheet and e-mailed Dr. Dahlquist the link to it.
- Data set after following the protocol given by Dr. Dahlquist: