Data Analysis
Revision as of 14:39, 18 November 2019 by Kdahlquist (talk | contribs) (→Guild Members: pasted in team names)
Final Project Links | |||||||
---|---|---|---|---|---|---|---|
Overview | Deliverables | Guilds | Project Manager | Quality Assurance | Data Analysis | Coder/Designer | |
Teams | FunGals | Sulfiknights | Skinny Genes |
The role of the Data Analyst is to apply the data analysis pipeline that you learned while analyzing the Dahlquist Lab microarray dataset to a different published yeast timecourse microarray dataset. The Data Analysts are the end-users of the project, ultimately determining whether the work of the coder/designer and quality assurance members is useful to them.
Guild Members
- Sulfiknights: Ivy, Marcus
- FunGals: Emma, Kaitlyn
- Skinny Genes: Aby, David
Milestones
The milestones do not necessarily correspond to particular days/weeks; instead they are sets of tasks grouped together.
Milestone 1: Annotated Bibliography
- The Data Analysts will work with their teams to develop an annotated bibliography of papers relating to their team's assigned paper.
Milestone 2: Journal Club Presentation
- The Data Analysts will work with their teams to create and deliver a Journal Club presentation about their team's assigned paper.
Milestone 3: Getting the data ready for analysis
- Download and examine the microarray dataset, comparing it to the samples and experiment described in your journal club article.
- Make a "sample-data relationship table" that lists all of the samples (microarray chips), noting the treatment, time point, and replicate number.
- Come up with consistent column headers that summarize this information
- For example, the Dahlquist Lab microarray data used strain_LogFC_timepoint-replicate number, as in wt_LogFC_t15-1.
- Organize the data in a worksheet in an Excel workbook so that:
- ID is in the first column
- Data columns are to the right, in increasing chronological order, using the column header pattern you created
- Replicates are grouped together
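The header pattern and column ordering described above can be checked with a short script before the worksheet is assembled in Excel. This is a minimal sketch, not part of the course protocol: the function names (`make_header`, `parse_header`, `order_columns`) are hypothetical, and it assumes that timepoints and replicate numbers are whole numbers and that strain or treatment labels contain no underscores.

```python
import re

# Matches headers such as "wt_LogFC_t15-1" (strain_LogFC_timepoint-replicate).
# Assumption: the strain label has no underscore; timepoint and replicate are integers.
HEADER_PATTERN = re.compile(r"^(?P<strain>[^_]+)_LogFC_t(?P<time>\d+)-(?P<rep>\d+)$")


def make_header(strain, timepoint, replicate):
    """Build a column header following the strain_LogFC_timepoint-replicate pattern."""
    return f"{strain}_LogFC_t{timepoint}-{replicate}"


def parse_header(header):
    """Split a header into (strain, timepoint, replicate), with numbers as ints."""
    match = HEADER_PATTERN.match(header)
    if match is None:
        raise ValueError(f"Header does not follow the pattern: {header!r}")
    return match["strain"], int(match["time"]), int(match["rep"])


def order_columns(headers, id_column="ID"):
    """Return headers with the ID column first, then data columns sorted by
    strain, then timepoint (numerically), then replicate number."""
    data_columns = [h for h in headers if h != id_column]
    data_columns.sort(key=parse_header)
    return [id_column] + data_columns


if __name__ == "__main__":
    unordered = ["wt_LogFC_t30-1", "ID", "wt_LogFC_t15-2", "wt_LogFC_t15-1"]
    print(order_columns(unordered))
```

Sorting on the parsed numbers rather than on the header text keeps, for example, t120 after t15 and t30, which a plain alphabetical sort would not.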
Milestone 4: ANOVA analysis
- Perform an ANOVA analysis of the data, as you did on Week 8 for the Dahlquist lab data.
- Note that you will need to adjust your formulas to take into account the different number of timepoints and replicates in your article's dataset.
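To see how the number of timepoints and replicates enters the calculation, the sketch below computes an F statistic for a single gene in Python. It assumes a within-strain test whose null hypothesis is that the mean log fold change is zero at every timepoint, compared against a model with a separate mean per timepoint; confirm that this matches the Week 8 protocol before relying on it. The function name `anova_f_statistic` and the input layout are hypothetical.

```python
def anova_f_statistic(replicates_by_timepoint):
    """Compute the F statistic and degrees of freedom for one gene.

    replicates_by_timepoint: one list of log fold changes per timepoint,
    with missing values already left out.
    Assumed null hypothesis: the mean log fold change is zero at every timepoint.
    """
    # Timepoints with no usable data do not contribute a group.
    groups = [g for g in replicates_by_timepoint if g]
    num_timepoints = len(groups)
    num_values = sum(len(g) for g in groups)

    # Sum of squares under the null hypothesis (every mean is zero).
    ss_null = sum(x * x for g in groups for x in g)

    # Sum of squares under the full model (one mean per timepoint).
    ss_full = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ss_full += sum((x - mean) ** 2 for x in g)

    df_numerator = num_timepoints
    df_denominator = num_values - num_timepoints
    if df_denominator <= 0 or ss_full == 0:
        raise ValueError("Not enough replicate variation to compute an F statistic")

    f_stat = ((ss_null - ss_full) / df_numerator) / (ss_full / df_denominator)
    return f_stat, df_numerator, df_denominator


if __name__ == "__main__":
    # Two timepoints with two replicates each (made-up numbers for illustration).
    print(anova_f_statistic([[1.0, 3.0], [2.0, 4.0]]))
```

Both degrees of freedom depend only on the counts: the numerator is the number of timepoints and the denominator is the number of data points minus the number of timepoints. These are the values that change in the spreadsheet formulas when a dataset has a different design. The p value then comes from the F distribution with those degrees of freedom.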
Milestone 5: Clustering with stem and YEASTRACT
- Cluster the data with stem, as you did on Week 9.
- Note that we will make some adjustments to the GO term analysis because stem was not providing GO term names.
- Use YEASTRACT to generate a candidate gene regulatory network as you did on Week 9.
Milestone 6: Create an input workbook for GRNmap using MS Access database
- Create an input workbook for GRNmap based on the Microsoft Access database that the Coder/Designer and QAs create, following the protocol from Week 10.
- Run GRNmap and interpret data.
- As the end-users of the Access database, the Data Analysts will provide feedback to the QAs and Coder/Designer about the usability of the database.