Difference between revisions of "Data Analysts Week 14"
Revision as of 12:57, 25 April 2024
Continuing Milestone 3
Hailey Ivanson helped Katie and me with the Bonferroni and B-H values.
- We used the formulas =IF(CHP_Bonferroni_p-value>1,1,CHP_Bonferroni_p-value) and =IF(Control_Bonferroni_p-value>1,1,Control_Bonferroni_p-value) to replace any Bonferroni-corrected p value greater than 1 with 1.
- We inserted two new worksheets named "CHP_ANOVA_B-H" and "Control_ANOVA_B-H".
- We copied and pasted the "MasterIndex", "ID", and "Standard Name" columns from our previous worksheet into the first three columns of the new worksheet.
- We copied our unadjusted p values from our ANOVA worksheet and used Paste special > Paste values to paste them into Column D.
- We selected all of columns A, B, C, and D and sorted them by the unadjusted p value in ascending order, smallest to largest.
- We typed "Rank" in cell E1, "1" in cell E2, and "2" in cell E3. Then we selected both cells E2 and E3 and double-clicked on the plus sign to fill the column with a series of numbers from 1 to 4697.
- Hailey Ivanson assisted us in calculating the Benjamini and Hochberg p value correction. We typed "CHP_B-H_p-value" as the header of column F, entered the equation =(D2*4697)/E2 in cell F2, and copied that equation to the entire column. We repeated this for the control.
- We then typed "CHP_B-H_p-value" into cell G1.
- In cell G2, we used the equation =IF(F2>1,1,F2) and copied that equation to the entire column.
- We selected columns A through G and sorted them in ascending order.
- We copied column G and used Paste special > Paste values to paste it into the next column of our ANOVA sheet.
- We zipped and uploaded the .xlsx file.
- We performed a sanity check by selecting row 1, choosing Data > Filter > Autofilter, and filtering for p values less than 0.05.
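The Benjamini and Hochberg steps above (sort, rank, multiply by the number of genes, divide by the rank, cap at 1) can be sketched in Python. This is a minimal sketch that only mirrors the worksheet formulas; the function name `bh_spreadsheet` is hypothetical and not part of the protocol. It does not apply the extra monotonicity step that statistics libraries usually include, so its output may differ from theirs.

```python
def bh_spreadsheet(p_values):
    """Mirror the worksheet steps: sort, rank, p * n / rank, cap at 1.

    Returns the corrected values in the same order as the input,
    which corresponds to re-sorting the worksheet afterwards.
    """
    n = len(p_values)  # 4697 in the worksheet
    # Indices ordered by p value, smallest to largest (the column D sort).
    order = sorted(range(n), key=lambda i: p_values[i])
    corrected = [0.0] * n
    for rank, i in enumerate(order, start=1):
        raw = p_values[i] * n / rank   # spreadsheet: =(D2*4697)/E2
        corrected[i] = min(raw, 1.0)   # spreadsheet: =IF(F2>1,1,F2)
    return corrected


if __name__ == "__main__":
    # Made-up p values for illustration only, not data from the worksheet.
    print(bh_spreadsheet([0.01, 0.04, 0.03, 0.5]))
```

Returning the values in input order avoids the second sort step that the worksheet needs before the column can be pasted back into the ANOVA sheet.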
Sanity Check Results:
- CONTROL
How many genes have p < 0.05 for the Benjamini and Hochberg-corrected p value, and what is the percentage (out of 4697)?
p < 0.05: 3699
p < 0.01: 3219
p < 0.001: 2558
p < 0.0001: 1921
p < 0.00001: 1325
- CHP
How many genes have p < 0.05 for the Benjamini and Hochberg-corrected p value, and what is the percentage (out of 4697)?
p < 0.05: 2863
p < 0.01: 2403
p < 0.001: 1884
p < 0.0001: 1435; 30.55%
p < 0.00001: 1076; 22.91%
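The counts and percentages in the sanity check can be reproduced outside the spreadsheet with a short Python sketch. The function names `count_below` and `percent_of_total` are hypothetical, and the total of 4697 genes is taken from the entry above.

```python
def count_below(p_values, threshold):
    """Count how many p values fall strictly below the threshold."""
    return sum(1 for p in p_values if p < threshold)


def percent_of_total(count, total=4697):
    """Express a gene count as a percentage of all genes, to 2 decimals."""
    return round(100.0 * count / total, 2)


if __name__ == "__main__":
    # Two of the CHP counts reported above.
    print(percent_of_total(1435))
    print(percent_of_total(1076))
```

Using a strict `<` comparison matches the "less than 0.05" filter used in the sanity check.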
Milestone 4
- We inserted a new worksheet and named it "CHP_stem".
- We selected all of the data from the "CHP_ANOVA" worksheet and used Paste special > Paste values to paste it into the "CHP_stem" worksheet.
- "Master_Index" was renamed to "SPOT". Column B named "ID" was renamed to "Gene Symbol". We deleted the column named "Standard_Name".
- We filtered the data on the B-H corrected p value to be > 0.05.
- We selected all of the rows except for the header row and deleted the rows by right-clicking and choosing "Delete Row" from the context menu. Then we undid this filter.
- We deleted all of the data columns except for the Average Log Fold change columns for each timepoint.
- We renamed the data columns with just the time and units.
- We clicked "Replace all" to remove the #DIV/0! errors.
- We saved this spreadsheet as Text (Tab-delimited) (*.txt).
- We downloaded the stem.zip file and selected "Extract all" from the menu, creating a folder called stem.
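The worksheet preparation above (drop rows with a B-H corrected p value above 0.05, keep and rename selected columns, clear the #DIV/0! errors, save as tab-delimited text) can be sketched in Python. This is a sketch under assumptions: the function names are hypothetical, the data column names in the example are invented for illustration, and #DIV/0! cells are assumed to be replaced with empty cells.

```python
import csv
import io


def prepare_stem_rows(rows, p_column, keep_columns, threshold=0.05):
    """Filter and rename rows the way the worksheet steps describe.

    rows: list of dicts, one per gene.
    p_column: name of the B-H corrected p value column.
    keep_columns: mapping of old column name -> new column name.
    """
    prepared = []
    for row in rows:
        # Rows with a corrected p value above the threshold are deleted.
        if float(row[p_column]) > threshold:
            continue
        new_row = {}
        for old_name, new_name in keep_columns.items():
            value = row[old_name]
            # Assumption: "#DIV/0!" errors are replaced with an empty cell.
            new_row[new_name] = "" if value == "#DIV/0!" else value
        prepared.append(new_row)
    return prepared


def to_tab_delimited(rows, fieldnames):
    """Serialize rows as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented example rows; "BH_p" and "AvgLogFC_t15" are hypothetical names.
    rows = [
        {"Master_Index": "1", "ID": "GENE1", "BH_p": "0.01", "AvgLogFC_t15": "0.5"},
        {"Master_Index": "2", "ID": "GENE2", "BH_p": "0.20", "AvgLogFC_t15": "1.2"},
    ]
    columns = {"Master_Index": "SPOT", "ID": "Gene Symbol", "AvgLogFC_t15": "15m"}
    kept = prepare_stem_rows(rows, "BH_p", columns)
    print(to_tab_delimited(kept, list(columns.values())))
```

The column mapping does the deleting and renaming in one step: any column not listed in it is left out of the output.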
Acknowledgements
This procedure was adapted from the Milestone 3 and 4 protocol on the Data Analysis page, linked here: Data Analysis. The procedure for Milestone 3 was also adapted from the steps outlined in the Week 9 assignment page. The procedure for Milestone 4 was also adapted from the steps outlined in the Week 10 assignment page. Our quality assurance, Hailey Ivanson, was a key part of completing this milestone, and her help was very valuable. Except for what is noted above, this individual journal entry was completed by Katie and Charlotte and not copied from another source.
Ckapla12 (talk) 14:28, 23 April 2024 (PDT)
References
LMU BioDB 2024. (2024). Week 14. Retrieved April 23, 2024 from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Week_14
LMU BioDB 2024. (2024). Week 9. Retrieved April 23, 2024 from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Week_9
LMU BioDB 2024. (2024). Week 10. Retrieved April 23, 2024 from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Week_10
LMU BioDB 2024. (2024). Data Analysis. Retrieved April 23, 2024 from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Data_Analysis