Difference between revisions of "Laurmagee: Week 8"
From LMU BioDB 2013
(→Part 1) |
(Stuff) |
||
Line 1: | Line 1: | ||
===Part 1=== | ===Part 1=== | ||
− | * First download the following data file: [[File:Merrell_Compiled_Raw_Data_Vibrio.xls]] | + | *First download the following data file: [[File:Merrell_Compiled_Raw_Data_Vibrio.xls]] |
− | + | *Insert a new spreadsheet in the label it "scaled_centered". | |
− | Insert a new | + | *Select all the data in your previous worksheet and copy it into the new spreadsheet |
− | + | *Insert two rows in between the top row of headers and the first data row. | |
− | Insert two rows in between the top row of headers and the first data row. | + | *In cell A2, type "Average" and in cell A3, type "StdDev". |
− | In cell A2, type "Average" and in cell A3, type "StdDev". | + | *Compute the Average log ratio for each chip (each column of data). In cell B2, type the following equation: =AVERAGE(B4:B5224) |
− | + | *Compute the Standard Deviation of the log ratios on each chip (each column of data). In cell B3, type the following equation: =STDEV(B4:B5224) | |
− | =AVERAGE(B4:B5224) | + | *Drag these lines into the rest of the columns, because Excel will automatically change the equation to match your specific column. |
− | + | *Copy the column headings for all of your data columns and then paste them to the right of the last data column so that you have a second set of headers above blank colums of cells. Edit the names of the columns so that they now read: A1_scaled_centered, A2_scaled_centered, etc. | |
− | + | *In cell N4, type the following equation: =(B4-B$2)/B$3 The dollar signs are important because we will be using the same mean (B2) and standard deviation (B3) in all the succeeding columns values as well. | |
− | =STDEV(B4:B5224) | + | *Copy and paste the scaling and centering equation for each of the columns of data with the "_scaled_centered" column header. Be sure that your equation is correct for the column you are calculating (i.e. change the values to match the cells with the correct average and standard deviation). |
− | + | *Insert a new worksheet and name it "statistics". | |
− | + | *Go back to the "scaling_centering" worksheet and copy the first column ("ID"). And paste these values into the first column of your new spreadsheet. | |
− | + | *Go back to the "scaling_centering" worksheet and copy the columns that are designated "_scaled_centered". Click on the B1 cell and select "Paste Special" from the Edit menu. A window will open: click on the radio button for "Values" and click OK. | |
− | Copy the column headings for all of your data columns and then paste them to the right of the last data column so that you have a second set of headers above blank colums of cells. Edit the names of the columns so that they now read: A1_scaled_centered, A2_scaled_centered, etc. | + | *Go to a new column on the right of your worksheet. Type the header "Avg_LogFC_A", "Avg_LogFC_B", and "Avg_LogFC_C" into the top cell of the next three columns. |
− | In cell N4, type the following equation: | + | *Compute the average log fold change for the replicates for each patient by typing the equation: =AVERAGE(B2:E2) into cell N2. Copy this equation and paste it into the rest of the column. |
− | =(B4-B$2)/B$3 | + | *Create the equation for patients B and C and paste it into their respective columns. |
− | + | *Now you will compute the average of the averages. Type the header "Avg_LogFC_all" into the first cell in the next empty column. | |
− | + | *Create the equation that will compute the average of the three previous averages you calculated and paste it into this entire column. | |
− | Copy and paste the scaling and centering equation for each of the columns of data with the "_scaled_centered" column header. Be sure that your equation is correct for the column you are calculating. | + | *Insert a new column labeled "Tstat". |
− | + | *Enter the equation: =AVERAGE(N2:P2)/(STDEV(N2:P2)/SQRT(3)) and paste it into all rows in that column. | |
− | + | *Label the top cell in the next column "Pvalue" and enter the equation: =TDIST(ABS(R2),2,2) paste it into all rows in that column. | |
− | + | *Insert a new worksheet and name it "forGenMAPP". | |
− | Insert a new worksheet and name it "statistics". | + | *Go back to the "statistics" worksheet and Select All and Copy. |
− | Go back to the "scaling_centering" worksheet and copy the first column ("ID"). | + | *Go to your new sheet and click on cell A1 and select Paste Special, click on the Values radio button, and click OK. |
− | + | *Select Columns B through Q (all the fold changes). Select the menu item Format > Cells. Under the number tab, select 2 decimal places. Click OK. | |
− | Go back to the "scaling_centering" worksheet and copy the columns that are designated "_scaled_centered". | + | *Select Columns R and S. Select the menu item Format > Cells. Under the number tab, select 4 decimal places. Click OK. |
− | + | *Select Columns N through S and Cut. Select Column B by left-clicking on the "B" at the top of the column. Then right-click on the Column B header and select "Insert Cut Cells". This will insert the data without writing over your existing columns. | |
− | Go to a new column on the right of your worksheet. Type the header "Avg_LogFC_A", "Avg_LogFC_B", and "Avg_LogFC_C" into the top cell of the next three columns. | + | *Insert a column to the right of the "ID" column. Type the header "SystemCode" into the top cell of this column. Fill the entire column (each cell) with the letter "N". |
− | Compute the average log fold change for the replicates for each patient by typing the equation: | + | *Select the menu item File > Save As, and choose "Text (Tab-delimited) (*.txt)" from the file type drop-down menu. |
− | =AVERAGE(B2:E2) | + | |
− | into cell N2. Copy this equation and paste it into the rest of the column. | + | |
− | Create the equation for patients B and C and paste it into their respective columns. | + | |
− | Now you will compute the average of the averages. Type the header "Avg_LogFC_all" into the first cell in the next empty column. Create the equation that will compute the average of the three previous averages you calculated and paste it into this entire column. | + | |
− | Insert a new column | + | |
− | =AVERAGE(N2:P2)/(STDEV(N2:P2)/SQRT( | + | |
− | + | ||
− | Label the top cell in the next column "Pvalue" | + | |
− | =TDIST(ABS(R2), | + | |
− | + | ||
− | Insert a new worksheet and name it "forGenMAPP". | + | |
− | Go back to the "statistics" worksheet and Select All and Copy. | + | |
− | Go to your new sheet and click on cell A1 and select Paste Special, click on the Values radio button, and click OK | + | |
− | Select Columns B through Q (all the fold changes). Select the menu item Format > Cells. Under the number tab, select 2 decimal places. Click OK. | + | |
− | Select Columns R and S. Select the menu item Format > Cells. Under the number tab, select 4 decimal places. Click OK. | + | |
− | Select Columns N through S and Cut. Select Column B by left-clicking on the "B" at the top of the column. Then right-click on the Column B header and select "Insert Cut Cells". This will insert the data without writing over your existing columns. | + | |
− | + | ||
− | Insert a column to the right of the "ID" column. Type the header "SystemCode" into the top cell of this column. Fill the entire column (each cell) with the letter "N". | + | |
− | Select the menu item File > Save As, and choose "Text (Tab-delimited) (*.txt)" from the file type drop-down menu | + | |
− | + | ||
*[[File:Merrell_Compiled_Raw_Data_Vibrio_LM.xml]] | *[[File:Merrell_Compiled_Raw_Data_Vibrio_LM.xml]] |
Revision as of 06:52, 18 October 2013
Part 1
- First download the following data file: File:Merrell Compiled Raw Data Vibrio.xls
- Insert a new spreadsheet in the label it "scaled_centered".
- Select all the data in your previous worksheet and copy it into the new spreadsheet
- Insert two rows in between the top row of headers and the first data row.
- In cell A2, type "Average" and in cell A3, type "StdDev".
- Compute the Average log ratio for each chip (each column of data). In cell B2, type the following equation: =AVERAGE(B4:B5224)
- Compute the Standard Deviation of the log ratios on each chip (each column of data). In cell B3, type the following equation: =STDEV(B4:B5224)
- Drag these lines into the rest of the columns, because Excel will automatically change the equation to match your specific column.
- Copy the column headings for all of your data columns and then paste them to the right of the last data column so that you have a second set of headers above blank colums of cells. Edit the names of the columns so that they now read: A1_scaled_centered, A2_scaled_centered, etc.
- In cell N4, type the following equation: =(B4-B$2)/B$3 The dollar signs are important because we will be using the same mean (B2) and standard deviation (B3) in all the succeeding columns values as well.
- Copy and paste the scaling and centering equation for each of the columns of data with the "_scaled_centered" column header. Be sure that your equation is correct for the column you are calculating (i.e. change the values to match the cells with the correct average and standard deviation).
- Insert a new worksheet and name it "statistics".
- Go back to the "scaling_centering" worksheet and copy the first column ("ID"). And paste these values into the first column of your new spreadsheet.
- Go back to the "scaling_centering" worksheet and copy the columns that are designated "_scaled_centered". Click on the B1 cell and select "Paste Special" from the Edit menu. A window will open: click on the radio button for "Values" and click OK.
- Go to a new column on the right of your worksheet. Type the header "Avg_LogFC_A", "Avg_LogFC_B", and "Avg_LogFC_C" into the top cell of the next three columns.
- Compute the average log fold change for the replicates for each patient by typing the equation: =AVERAGE(B2:E2) into cell N2. Copy this equation and paste it into the rest of the column.
- Create the equation for patients B and C and paste it into their respective columns.
- Now you will compute the average of the averages. Type the header "Avg_LogFC_all" into the first cell in the next empty column.
- Create the equation that will compute the average of the three previous averages you calculated and paste it into this entire column.
- Insert a new column labeled "Tstat".
- Enter the equation: =AVERAGE(N2:P2)/(STDEV(N2:P2)/SQRT(3)) and paste it into all rows in that column.
- Label the top cell in the next column "Pvalue" and enter the equation: =TDIST(ABS(R2),2,2) paste it into all rows in that column.
- Insert a new worksheet and name it "forGenMAPP".
- Go back to the "statistics" worksheet and Select All and Copy.
- Go to your new sheet and click on cell A1 and select Paste Special, click on the Values radio button, and click OK.
- Select Columns B through Q (all the fold changes). Select the menu item Format > Cells. Under the number tab, select 2 decimal places. Click OK.
- Select Columns R and S. Select the menu item Format > Cells. Under the number tab, select 4 decimal places. Click OK.
- Select Columns N through S and Cut. Select Column B by left-clicking on the "B" at the top of the column. Then right-click on the Column B header and select "Insert Cut Cells". This will insert the data without writing over your existing columns.
- Insert a column to the right of the "ID" column. Type the header "SystemCode" into the top cell of this column. Fill the entire column (each cell) with the letter "N".
- Select the menu item File > Save As, and choose "Text (Tab-delimited) (*.txt)" from the file type drop-down menu.