Difference between revisions of "Data Analysis"

From LMU BioDB 2019
(Milestone 3: Complete Microarray Data Analysis: add links to data)
(Milestone 3: Complete Microarray Data Analysis: start filling out steps)
Line 23:
  === Milestone 3: Complete Microarray Data Analysis ===

- * Download and examine the microarray dataset, comparing it to the samples and experiment described in your journal club article.
- ** [https://sgd-prod-upload.s3.amazonaws.com/S000204227/Barreto_2012_PMID_23039231.zip Barreto et al. (2012)]
- ** [https://sgd-prod-upload.s3.amazonaws.com/S000204415/Kitagawa_2002_PMID_12269742.zip Kitagawa et al. (2002)]
- ** [https://sgd-prod-upload.s3.amazonaws.com/S000204367/Thorsen_2007_PMID_17327492.zip Thorsen et al. (2007)]
- * Make a "sample-data relationship table" that lists all of the samples (microarray chips), noting the treatment, time point, and replicate number.
- ** Come up with consistent column headers that summarize this information
- *** For example, the Dahlquist Lab microarray data used strain_LogFC_timepoint-replicate number, as in wt_LogFC_t15-1.
- * Perform an ANOVA analysis of the data.
- * Cluster the data with stem.
- * Use YEASTRACT to generate a candidate gene regulatory network.
- * Create an input workbook for GRNmap based on a Microsoft Access database that the Coder/Designer and QA's make.
- * Run GRNmap and interpret data.
- * As the end-user of the Access database, the Data Analysts will provide feedback to the QAs and Coder/Designer about the usability of database.
+ # Download and examine the microarray dataset, comparing it to the samples and experiment described in your journal club article.
+ #* [https://sgd-prod-upload.s3.amazonaws.com/S000204227/Barreto_2012_PMID_23039231.zip Barreto et al. (2012)]
+ #* [https://sgd-prod-upload.s3.amazonaws.com/S000204415/Kitagawa_2002_PMID_12269742.zip Kitagawa et al. (2002)]
+ #* [https://sgd-prod-upload.s3.amazonaws.com/S000204367/Thorsen_2007_PMID_17327492.zip Thorsen et al. (2007)]
+ # Make a "sample-data relationship table" that lists all of the samples (microarray chips), noting the treatment, time point, and replicate number.
+ #* Come up with consistent column headers that summarize this information
+ #** For example, the Dahlquist Lab microarray data used strain_LogFC_timepoint-replicate number, as in wt_LogFC_t15-1.
+ # Organize the data in a worksheet in an Excel workbook so that:
+ #* ID is in the first column
+ #* Data columns are to the right, in increasing chronological order
+ #* Replicates are grouped together
+ # Perform an ANOVA analysis of the data.
+ # Cluster the data with stem.
+ # Use YEASTRACT to generate a candidate gene regulatory network.
+ # Create an input workbook for GRNmap based on a Microsoft Access database that the Coder/Designer and QA's make.
+ # Run GRNmap and interpret data.
+ # As the end-user of the Access database, the Data Analysts will provide feedback to the QAs and Coder/Designer about the usability of database.

  {{Final Project Links}}

  [[Category:Group Projects]]

Revision as of 15:30, 18 November 2019

Final Project Links
Overview Deliverables Guilds Project Manager Quality Assurance Data Analysis Coder/Designer
Teams FunGals Sulfiknights Skinny Genes

The Data Analysts apply the data analysis pipeline they learned on the Dahlquist Lab microarray dataset to a different published yeast timecourse microarray dataset. They are also the end-users of the project, ultimately determining whether the work of the Coder/Designer and Quality Assurance members is useful to them.

Guild Members

  • Ivy, Marcus
  • Emma, Kaitlyn
  • Aby, David

Milestones

The milestones do not necessarily correspond to particular weeks; instead they are sets of tasks grouped together.

Milestone 1: Annotated Bibliography

  • The Data Analysts will work with their teams to develop an annotated bibliography of papers relating to their team's assigned paper.

Milestone 2: Journal Club Presentation

  • The Data Analysts will work with their teams to create and deliver a Journal Club presentation about their team's assigned paper.

Milestone 3: Complete Microarray Data Analysis

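  1. Download and examine the microarray dataset, comparing it to the samples and experiment described in your journal club article.
    • Barreto et al. (2012): https://sgd-prod-upload.s3.amazonaws.com/S000204227/Barreto_2012_PMID_23039231.zip
    • Kitagawa et al. (2002): https://sgd-prod-upload.s3.amazonaws.com/S000204415/Kitagawa_2002_PMID_12269742.zip
    • Thorsen et al. (2007): https://sgd-prod-upload.s3.amazonaws.com/S000204367/Thorsen_2007_PMID_17327492.zip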
  2. Make a "sample-data relationship table" that lists all of the samples (microarray chips), noting the treatment, time point, and replicate number.
    • Come up with consistent column headers that summarize this information
      • For example, the Dahlquist Lab microarray data used strain_LogFC_timepoint-replicate number, as in wt_LogFC_t15-1.
  3. Organize the data in a worksheet in an Excel workbook so that:
    • ID is in the first column
    • Data columns are to the right, in increasing chronological order
    • Replicates are grouped together
  4. Perform an ANOVA analysis of the data.
  5. Cluster the data with STEM (Short Time-series Expression Miner).
  6. Use YEASTRACT to generate a candidate gene regulatory network.
  7. Create an input workbook for GRNmap based on a Microsoft Access database that the Coder/Designer and QAs create.
  8. Run GRNmap and interpret data.
  9. As the end-users of the Access database, the Data Analysts will provide feedback to the QAs and Coder/Designer about the usability of the database.
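
The column-header convention in step 2 and the worksheet layout in step 3 can be sketched in Python. This is only an illustration, not part of the course protocol: the names `Sample`, `make_header`, `parse_header`, and `order_columns` are hypothetical, and the sketch assumes that time points are whole numbers written after a `t` (as in `wt_LogFC_t15-1`) and that the ID column is literally named `ID`.

```python
import re
from typing import NamedTuple

# Assumed header pattern, modeled on the example "wt_LogFC_t15-1":
# strain, the literal "LogFC", a whole-number time point after "t",
# and a replicate number after the hyphen.
HEADER_RE = re.compile(r"^(?P<strain>[^_]+)_LogFC_t(?P<time>\d+)-(?P<rep>\d+)$")


class Sample(NamedTuple):
    strain: str
    timepoint: int   # numeric part of the time point, e.g. 15 for t15
    replicate: int


def make_header(sample: Sample) -> str:
    """Build a header in the strain_LogFC_timepoint-replicate style."""
    return f"{sample.strain}_LogFC_t{sample.timepoint}-{sample.replicate}"


def parse_header(header: str) -> Sample:
    """Recover strain, time point, and replicate number from a header."""
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"unrecognized column header: {header!r}")
    return Sample(match["strain"], int(match["time"]), int(match["rep"]))


def order_columns(headers: list[str], id_column: str = "ID") -> list[str]:
    """Put the ID column first, then sort data columns by strain,
    increasing time point, and replicate number."""
    data = [h for h in headers if h != id_column]
    # Sorting on the parsed tuple compares time points as integers,
    # so t15 comes before t120 (a plain string sort would reverse them).
    data.sort(key=parse_header)
    return [id_column] + data


headers = ["wt_LogFC_t30-1", "ID", "wt_LogFC_t15-2",
           "wt_LogFC_t120-1", "wt_LogFC_t15-1"]
print(order_columns(headers))
```

Sorting on the parsed values rather than the raw header text keeps replicates of the same time point next to each other and keeps the time points in chronological order. Datasets whose time points are not whole numbers would need a different pattern.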