Difference between revisions of "Data Analysts Week 14"

Revision as of 12:57, 25 April 2024

Continuing Milestone 3

We used the formulas =IF(CHP_Bonferroni_p-value>1,1,CHP_Bonferroni_p-value) and =IF(Control_Bonferroni_p-value>1,1,Control_Bonferroni_p-value) to cap the Bonferroni-corrected p values at 1.
We inserted new worksheets named "CHP_ANOVA_B-H" and "Control_ANOVA_B-H".
We copied and pasted the "MasterIndex", "ID", and "Standard Name" columns from our previous worksheet into the first three columns of each new worksheet.
We copied our unadjusted p values from our ANOVA worksheet and used Paste special > Paste values to paste them into Column D.
We selected all of columns A, B, C, and D and sorted by ascending values, smallest to largest.
We typed "Rank" in cell E1, "1" in cell E2, and "2" in cell E3. Then we selected both cells E2 and E3 and double-clicked on the plus sign to fill the column with a series of numbers from 1 to 4697.
Hailey Ivanson assisted us in calculating the Benjamini and Hochberg p value correction. We typed "CHP_B-H_p-value" as the header of column F, entered the equation =(D2*4697)/E2 in cell F2, and copied that equation to the entire column. We repeated these steps for the control.
We then typed "CHP_B-H_p-value" into cell G1.
In cell G2: we used the equation =IF(F2>1,1,F2) and copied that equation to the entire column.
We selected columns A through G and sorted them in ascending order.
We copied column G and used Paste special > Paste values to paste it into the next column of our ANOVA sheet.
We zipped and uploaded the .xlsx file.
We performed a sanity check by selecting row 1, choosing Data > Filter > Autofilter, and filtering for p values less than 0.05.
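
The spreadsheet steps above can be sketched in Python. This is a minimal illustration under our own assumptions, not the protocol itself: the function names `cap_at_one` and `bh_spreadsheet` and the example p values are hypothetical. It mirrors the worksheet formulas =(D2*4697)/E2 and =IF(F2>1,1,F2), with the number of tests taken from the length of the input list.

```python
def cap_at_one(p):
    """Mirror of the spreadsheet formula =IF(p>1,1,p)."""
    return 1.0 if p > 1 else p


def bh_spreadsheet(p_values):
    """Benjamini and Hochberg correction as carried out in the worksheet.

    Each p value is multiplied by the number of tests, divided by its rank
    (1 = smallest p value), and capped at 1. Results are returned in the
    original input order, like re-sorting the sheet after the calculation.
    """
    m = len(p_values)
    # Indices sorted by p value, smallest to largest (the "sort ascending" step).
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    for rank, i in enumerate(order, start=1):
        adjusted[i] = cap_at_one(p_values[i] * m / rank)
    return adjusted


# Hypothetical p values for illustration only (not taken from our data).
example = [0.04, 0.001, 0.03, 0.5]
print(bh_spreadsheet(example))
```

Note that this sketch follows the worksheet formulas literally. Statistical packages usually add a further step that keeps the adjusted values from decreasing as the rank increases; that step is not part of the procedure described here.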

Sanity Check Results:

CONTROL

How many genes are p < 0.05 for the Benjamini and Hochberg-corrected p value, and what is the percentage (out of 4697)?
p < 0.05: 3699
p < 0.01: 3219
p < 0.001: 2558
p < 0.0001: 1921
p < 0.00001: 1325

CHP

How many genes are p < 0.05 for the Benjamini and Hochberg-corrected p value, and what is the percentage (out of 4697)?
p < 0.05: 2863
p < 0.01: 2403
p < 0.001: 1884
p < 0.0001: 1435 (30.55%)
p < 0.00001: 1076 (22.91%)
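
The counting and percentage steps of the sanity check can be reproduced with a short Python sketch. The helper names `count_below` and `percent_of_total` are our own illustration; the total of 4697 genes and the two CHP counts come from the results above.

```python
TOTAL_GENES = 4697  # number of genes in the dataset, as stated above


def count_below(p_values, threshold):
    """Count how many p values fall strictly below the threshold."""
    return sum(1 for p in p_values if p < threshold)


def percent_of_total(count, total=TOTAL_GENES):
    """Express a gene count as a percentage of all genes, to two decimals."""
    return round(100 * count / total, 2)


# Counts reported above for CHP at p < 0.0001 and p < 0.00001.
print(percent_of_total(1435))
print(percent_of_total(1076))
```

The same helper can be applied to the remaining counts to fill in their percentages.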

Milestone 4

We inserted a new worksheet and named it "CHP_stem".
We selected all of the data from the "CHP_ANOVA" worksheet and Paste special > paste values into the "CHP_stem" worksheet.
"Master_Index" was renamed to "SPOT". Column B named "ID" was renamed to "Gene Symbol". We deleted the column named "Standard_Name".
We filtered the data on the B-H corrected p value to be > 0.05.
We selected all of the rows except for the header row and deleted the rows by right-clicking and choosing "Delete Row" from the context menu. Then we undid this filter.
We deleted all of the data columns except for the Average Log Fold change columns for each timepoint.
We renamed the data columns with just the time and units.
We used Find and Replace and clicked "Replace all" to remove the #DIV/0! errors.
We saved this spreadsheet as Text (Tab-delimited) (*.txt).
We downloaded the stem.zip file and selected "Extract all" from the menu, creating a folder called stem.
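
The filtering and export steps for the stem worksheet can be sketched in Python as follows. This is a hedged illustration only: the function names, the p value column name "B-H_p-value", the time column names "15m" and "30m", and the example rows are all hypothetical and are not taken from our spreadsheet.

```python
import csv
import io


def prepare_stem_rows(rows, p_column, keep_columns, cutoff=0.05):
    """Drop rows with a B-H p value above the cutoff and keep chosen columns.

    Cells holding the spreadsheet error "#DIV/0!" are replaced with an
    empty string, like the "Replace all" step.
    """
    prepared = []
    for row in rows:
        if float(row[p_column]) > cutoff:
            continue  # same effect as filtering on > 0.05 and deleting the rows
        prepared.append(
            {col: ("" if row[col] == "#DIV/0!" else row[col]) for col in keep_columns}
        )
    return prepared


def write_tab_delimited(rows, columns, handle):
    """Write the rows as tab-delimited text with a header row."""
    writer = csv.DictWriter(handle, fieldnames=columns, delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical example rows (not our data).
rows = [
    {"SPOT": "1", "Gene Symbol": "GENE_A", "B-H_p-value": "0.01", "15m": "0.5", "30m": "#DIV/0!"},
    {"SPOT": "2", "Gene Symbol": "GENE_B", "B-H_p-value": "0.20", "15m": "1.2", "30m": "0.3"},
]
columns = ["SPOT", "Gene Symbol", "15m", "30m"]
kept = prepare_stem_rows(rows, "B-H_p-value", columns)

buffer = io.StringIO()  # stands in for a .txt file opened for writing
write_tab_delimited(kept, columns, buffer)
text = buffer.getvalue()
print(text)
```

To write an actual file instead of an in-memory buffer, the handle can be opened with `open(path, "w", newline="")`.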

Acknowledgements

This procedure was adapted from the Data Analysis page Milestone 3 and 4 protocol, linked here: Data Analysis. The procedure for Milestone 3 was also adapted from the steps outlined in the Week 9 assignment page, and the procedure for Milestone 4 was also adapted from the steps outlined in the Week 10 assignment page. Our quality assurance member, Hailey Ivanson, was a key part of completing this milestone, and her help was very valuable. Except for what is noted above, this individual journal entry was completed by Katie and Charlotte and not copied from another source.

Ckapla12 (talk) 14:28, 23 April 2024 (PDT)

References

LMU BioDB 2024. (2024). Week 14. Retrieved April 23, 2024 from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Week_14

LMU BioDB 2024. (2024). Week 9. Retrieved April 23, 2024 from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Week_9

LMU BioDB 2024. (2024). Week 10. Retrieved April 23, 2024 from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Week_10

LMU BioDB 2024. (2024). Data Analysis. Retrieved April 23, 2024 from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Data_Analysis



