Rlegaspi Week 14
From LMU BioDB 2015
Latest revision as of 07:39, 8 December 2015
Shewanella oneidensis
Our Gene Database Testing Report
Group Paper - File:Final Report 20151218 2 HMH.docx
Group Members
- Coder: Mary Alverson
- GenMAPP User & Project Manager: Ron Legaspi
- Quality Assurance: Josh Kuroda
- GenMAPP User: Emily Simso
Important Links
Our Files
Our Deliverables
Gene Database Project Links | |||||||
---|---|---|---|---|---|---|---|
Overview | Deliverables | Reference Format | Guilds | Project Manager | GenMAPP User | Quality Assurance | Coder |
Teams | Heavy Metal HaterZ | The Class Whoopers | GÉNialOMICS | Oregon Trail Survivors |
Individual Journal Entries | ||||
---|---|---|---|---|
Mary Alverson | Week 11 | Week 12 | Week 14 | Week 15 |
Emily Simso | Week 11 | Week 12 | Week 14 | Week 15 |
Ron Legaspi | Week 11 | Week 12 | Week 14 | Week 15 |
Josh Kuroda | Week 11 | Week 12 | Week 14 | Week 15 |
Goals for Week 14
Data Preparation and Statistical Analysis
- Create a Master Raw Data file that contains the IDs and columns of data required for further analysis.
- Consult with Dr. Dahlquist on how to process the data (normalization, statistics)
- Perform the statistical analysis in Excel.
- Format the gene expression data for import into GenMAPP.
Summary of Progress and Procedure
Compiling Raw Data and Statistical Analysis
December 1, 2015 through December 3, 2015
Referencing the Week 12 feedback provided by Dr. Dahlquist, I began compiling the raw data into a single Excel file:
- Created an Excel file named Raw Data Shewanella RARL 20151201
- Sheet 1 was titled CompiledRawData Sheet:
- Column 1 = Gene ID
- Column 2 = MasterIndex (numbered from 1 to 11520)
- The remaining columns contained the log data taken from the 0, 5, 20, and 60 time points, in that order
- 7 timepoints total (C0, C5, C20, C60, F5, F20, F60) with 4 replicates each; therefore, 28 total columns of data
- Created a MasterSheet and copied information from CompiledRawData Sheet into this new sheet
- Sorted the Gene IDs in alphabetical order (A-Z) and deleted the rows that contained an ID of Blank, blank, gDNA, NC-, or ORF, resulting in the deletion of 705 rows.
- Deleted the cells that contained the error message #NUM!, which resulted in the deletion of 2,118 cells.
- Deleted the cells that contained the error message #DIV/0!, which resulted in the deletion of 23 cells.
- Created a ScalingCentering Sheet
- Copied over data from the MasterSheet
- Added two rows right below the title row to represent the calculations for the Average and the Standard Deviation of each column
- For the scaled and centered columns of data, typed the equation
=(C4-C$2)/C$3
in the first cell under the scaled and centered column for replicate 1 at timepoint C0, where row 2 holds the column average and row 3 the column standard deviation. Then used Excel to copy this equation as a template to scale and center the rest of the data.
- Sent this file to Dr. Dahlquist to split the data to get rid of duplicates: File:Raw Data Shewanella RARL 20151201.xlsx
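The row filtering and the scaling and centering steps above can also be sketched outside of Excel. The following is a minimal Python illustration, not the procedure that was actually run: the row layout, the names `drop_control_rows` and `scale_and_center`, the exact-match (case-insensitive) treatment of the excluded IDs, and the use of the sample standard deviation are all assumptions made for the example. Deleted error cells are represented as `None`.

```python
from statistics import mean, stdev

# IDs removed in the MasterSheet step; matching them exactly and
# case-insensitively is an assumption of this sketch.
EXCLUDED_IDS = {"blank", "gdna", "nc-", "orf"}


def drop_control_rows(rows):
    """Keep only the rows whose Gene ID is not a blank or control label."""
    return [r for r in rows if r["gene_id"].strip().lower() not in EXCLUDED_IDS]


def scale_and_center(column):
    """Apply (x - average) / standard deviation to one column of log data.

    None stands for a deleted #NUM! or #DIV/0! cell and is passed through.
    The sample standard deviation is used here (an assumption).
    """
    present = [x for x in column if x is not None]
    mu = mean(present)
    sigma = stdev(present)
    return [None if x is None else (x - mu) / sigma for x in column]


# Hypothetical rows, invented for illustration only.
rows = [
    {"gene_id": "SO0001", "c0_rep1": 1.0},
    {"gene_id": "Blank", "c0_rep1": 0.2},
    {"gene_id": "SO0002", "c0_rep1": None},  # a deleted error cell
    {"gene_id": "gDNA", "c0_rep1": 5.0},
    {"gene_id": "SO0003", "c0_rep1": 3.0},
    {"gene_id": "SO0004", "c0_rep1": 2.0},
]

kept = drop_control_rows(rows)
scaled = scale_and_center([r["c0_rep1"] for r in kept])
```

Here `scale_and_center` plays the role of the formula =(C4-C$2)/C$3, with the average and standard deviation computed from the cells that remain in the column.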
December 3, 2015 through December 8, 2015
Dr. Dahlquist brought to our attention discrepancies between my data and that of my partner, Emily Simso; the compiled raw data therefore needed to be reviewed so that our Excel sheets would match and we could continue with the statistical analysis:
- Repeated the procedure used for File:Raw Data Shewanella RARL 20151201.xlsx, this time keeping Dr. Dahlquist's feedback in mind, and created a new Excel file called UpdatedCompiledRawData Shewanella RARL 20151201 HMH
- The correct set of timepoints was used in my previous Excel file, so no changes were needed there
- Ensured that I had 11520 Gene IDs; the last row, which had "Gene ID" as its label, was changed to the correct Gene ID, "SO4357".
- Once all the necessary changes were made and I had touched base with my partner Emily on Sunday and Monday, I uploaded the file for splitting by Dr. Dahlquist: File:UpdatedCompiledRawData Shewanella RARL 20151201 HMH.xlsx
- Split data was received and posted as a file on our team's file page by Dr. Dahlquist: File:UpdatedCompiledRawData Shewanella RARL 20151201 HMH forsplitting.xlsx
- Downloaded this file and copied the sheets of data into a new Excel file entitled StatisticalAnalysis Shewanella RARL 20151207 HMH
- Created a new sheet called Averages
- Averaged together the replicate data from the two spots that are now split, using the equation
=AVERAGE(C2,AG2)
under the column for C0 replicate 1
- Used Excel to copy this equation down the entire column and to fill in the corresponding equation for the other columns of averages for each replicate
- Created a new sheet called Statistics
- Copied and pasted values from the Averages sheet into this new sheet
- Computed the average of the biological replicates for each treatment; for C0, the biological average was calculated with the equation
=AVERAGE(C2:F2)
and the corresponding equation was used for every other timepoint.
- Calculated the average log ratios of C5/C0, C20/C0, C60/C0, F5/C60, F20/C60, and F60/C60
- Since the data are in log space, each ratio is a difference of averages; for C5/C0, I subtracted the C0 average from the C5 average.
- Performed a two-sample t test between C5 and C0, C20 and C0, C60 and C0, F5 and C60, and so on:
=TTEST(<range of cells containing the biological replicates for C0>, <range of cells containing the biological replicates for C5>, 2, 3)
- The arguments 2 and 3 select a two-tailed test on two samples with unequal variance. This returned the p value.
- Uploaded the file to the team's file page to be reviewed by Dr. Dahlquist, while performing a sanity check: File:StatisticalAnalysis Shewanella RARL 20151207 HMH.xlsx
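As a cross-check on the spreadsheet steps above, the averaging, log ratio, and t test calculations can be sketched in Python. This is only an illustration under stated assumptions: the function names and the example numbers are hypothetical, and the sketch stops at the t statistic and the Welch-Satterthwaite degrees of freedom rather than the p value that TTEST reports, since converting to a p value requires a t distribution that the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def average_spots(spot_a, spot_b):
    """Average the two duplicate spots, like =AVERAGE(C2,AG2).

    Missing values (None) are skipped, as AVERAGE skips empty cells.
    """
    present = [v for v in (spot_a, spot_b) if v is not None]
    return mean(present) if present else None


def log_ratio(treatment, control):
    """Average log ratio: in log space this is a difference of averages."""
    return mean(treatment) - mean(control)


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a two-sample t test
    that does not assume equal variances."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean a
    vb = variance(sample_b) / nb  # squared standard error of mean b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical log values for one gene, four biological replicates each.
c0 = [1.0, 2.0, 3.0, 4.0]
c5 = [3.0, 4.0, 5.0, 6.0]

ratio_c5_c0 = log_ratio(c5, c0)  # C5 average minus C0 average
t_stat, dof = welch_t(c5, c0)
```

Keeping the direction of subtraction explicit in `log_ratio(treatment, control)` guards against accidentally computing C0/C5 instead of C5/C0, which would flip the sign of every ratio.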
External Links
Ron Legaspi
BIOL 367, Fall 2015
Assignment Links
- Week 1 Assignment
- Week 2 Assignment
- Week 3 Assignment
- Week 4 Assignment
- Week 5 Assignment
- Week 6 Assignment
- Week 7 Assignment
- Week 8 Assignment
- Week 9 Assignment
- Week 10 Assignment
- Week 11 Assignment
- Week 12 Assignment
- Week 14 Assignment
- Week 15 Assignment
Individual Weekly Journals
- Individual Journal Week 1 - This is my User Page
- Individual Journal Week 2
- Individual Journal Week 3
- Individual Journal Week 4
- Individual Journal Week 5
- Individual Journal Week 6
- Individual Journal Week 7
- Individual Journal Week 8
- Individual Journal Week 9
- Individual Journal Week 10
- Individual Journal Week 11
- Individual Journal Week 12
- Individual Journal Week 14
- Individual Journal Week 15