Difference between revisions of "Data Analysis"
Revisions compared (both by Kdahlquist):
- Earlier revision: Guild Members: add guild member names
- Later revision: Milestone 5: Clustering with stem and YEASTRACT: instructions on how to use GO for enrichment
- 10 intermediate revisions by the same user not shown
Latest revision as of 15:28, 26 November 2019
Final Project Links | |||||||
---|---|---|---|---|---|---|---|
Overview | Deliverables | Guilds | Project Manager | Quality Assurance | Data Analysis | Coder/Designer | |
Teams | FunGals | Sulfiknights | Skinny Genes |
The role of the Data Analyst will be to apply the data analysis pipeline that you learned by analyzing the Dahlquist Lab microarray dataset to complete the analysis of a different published yeast timecourse microarray dataset. The Data Analysts are the end-users of the project, ultimately determining whether the work of the Coder/Designer and Quality Assurance members is useful to them.
Guild Members
- Sulfiknights: Ivy, Marcus
- FunGals: Emma, Kaitlyn
- Skinny Genes: Aby, David
Milestones
The milestones do not necessarily correspond to particular days/weeks; instead they are sets of tasks grouped together.
- Data Analysts and QAs who have a partner in their group can have a shared individual journal entry. Both students will be given the same grade and are expected to contribute equally to the electronic lab notebook.
Milestone 1: Annotated Bibliography
- The Data Analysts will work with their teams to develop an annotated bibliography of papers relating to their team's assigned paper.
Milestone 2: Journal Club Presentation
- The Data Analysts will work with their teams to create and deliver a Journal Club presentation about their team's assigned paper.
Milestone 3: Getting the data ready for analysis
- Download and examine the microarray dataset, comparing it to the samples and experiment described in your journal club article.
  - Skinny Genes: Barreto et al. (2012), https://sgd-prod-upload.s3.amazonaws.com/S000204227/Barreto_2012_PMID_23039231.zip
  - FunGals: Kitagawa et al. (2002), https://sgd-prod-upload.s3.amazonaws.com/S000204415/Kitagawa_2002_PMID_12269742.zip
  - Sulfiknights: Thorsen et al. (2007), https://sgd-prod-upload.s3.amazonaws.com/S000204367/Thorsen_2007_PMID_17327492.zip
- Along with the QAs, make a "sample-data relationship table" that lists all of the samples (microarray chips), noting the treatment, time point, and replicate number.
  - Come up with consistent column headers that summarize this information.
    - For example, the Dahlquist Lab microarray data used strain_LogFC_timepoint-replicate number, as in wt_LogFC_t15-1.
- Organize the data in a worksheet in an Excel workbook so that:
  - ID is in the first column
  - Data columns are to the right, in increasing chronological order, using the column header pattern you created
  - Replicates are grouped together
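The column layout described above can be sketched in Python as a check on your worksheet headers. This is only an illustration: it assumes headers that follow the Dahlquist Lab pattern (as in wt_LogFC_t15-1) with numeric time points, and the function names `make_header` and `order_columns` are hypothetical, not part of any course tool.

```python
import re

# Assumed header pattern: strain_LogFC_t<timepoint>-<replicate>, e.g. wt_LogFC_t15-1
HEADER_PATTERN = re.compile(r"^(?P<strain>[^_]+)_LogFC_t(?P<time>\d+)-(?P<rep>\d+)$")


def make_header(strain, timepoint, replicate):
    """Build one column header from its three parts (hypothetical helper)."""
    return f"{strain}_LogFC_t{timepoint}-{replicate}"


def sort_key(header):
    """Sort by strain, then numeric time point, then replicate number."""
    match = HEADER_PATTERN.match(header)
    if match is None:
        raise ValueError(f"Header does not follow the pattern: {header!r}")
    return (match["strain"], int(match["time"]), int(match["rep"]))


def order_columns(headers):
    """Put ID first, then data columns in chronological order, replicates grouped."""
    data_columns = [h for h in headers if h != "ID"]
    return ["ID"] + sorted(data_columns, key=sort_key)


if __name__ == "__main__":
    unordered = ["wt_LogFC_t30-1", "ID", "wt_LogFC_t15-2",
                 "wt_LogFC_t15-1", "wt_LogFC_t120-1"]
    print(order_columns(unordered))
```

Sorting on the numeric time point rather than on the header text keeps t120 after t30; a plain alphabetical sort would place it first.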
Milestone 4: ANOVA analysis
- Perform an ANOVA analysis of the data, as you did on Week 8 for the Dahlquist lab data.
  - Note that you will need to adjust your formulas to take into account the different number of timepoints and replicates in your article's dataset.
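To see why the number of timepoints and replicates matters, here is a minimal sketch of a standard one-way ANOVA F statistic for a single gene. It is a textbook formulation and may differ from the exact formulas used in the Week 8 protocol; the function name `one_way_anova` and the input layout (one list of replicate values per timepoint) are assumptions made for illustration.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for one gene.

    groups: one list of replicate values per timepoint; lists may differ
    in length, and missing values given as None are ignored.
    """
    # Drop missing values and any timepoint left with no data
    cleaned = [[v for v in g if v is not None] for g in groups]
    cleaned = [g for g in cleaned if g]

    k = len(cleaned)                    # number of timepoints with data
    n = sum(len(g) for g in cleaned)    # total number of data points
    if k < 2 or n <= k:
        raise ValueError("Need at least 2 timepoints and some replication")

    grand_mean = sum(sum(g) for g in cleaned) / n
    means = [sum(g) / len(g) for g in cleaned]

    # Variation of timepoint means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(cleaned, means))
    # Variation of replicates around their own timepoint mean
    ss_within = sum((v - m) ** 2 for g, m in zip(cleaned, means) for v in g)
    if ss_within == 0:
        raise ValueError("No within-timepoint variation; F is undefined")

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Two timepoints with 2 and 3 replicates (made-up numbers)
    print(one_way_anova([[1.0, 2.0], [3.0, 4.0, 5.0]]))
```

Both degrees of freedom depend on the dataset's shape, which is why formulas copied from the Dahlquist Lab workbook need their ranges and constants adjusted. Given F and the two degrees of freedom, a p value can be obtained in Excel with F.DIST.RT.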
Milestone 5: Clustering with stem and YEASTRACT
- Cluster the data with stem, as you did on Week 9.
  - Note that we will make some adjustments to the GO term analysis because stem was not providing GO term names. We are going to use the GO enrichment tool at GeneOntology.org instead.
  - Go to http://geneontology.org/.
  - For the cluster you want to analyze, open the gene list and copy the list of genes.
  - Paste the list of genes into the "GO Enrichment Analysis" box on the right-hand side of the GeneOntology.org page.
  - Select "Saccharomyces cerevisiae" from the species drop-down menu.
  - Click the "Launch" button.
  - Near the bottom of the results page, click on the button to Export "Table".
  - This will prompt you to save a .txt file that can be opened in Excel to view your results.
- Use YEASTRACT to generate a candidate gene regulatory network, as you did on Week 9.
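Before pasting a cluster's gene list into the enrichment box, it can help to tidy it up. The sketch below is a hypothetical helper (`clean_gene_list` is not part of stem or GeneOntology.org). It assumes the copied list is plain text with gene IDs separated by whitespace or commas, and it removes blanks and duplicates while keeping the original order.

```python
def clean_gene_list(raw_text):
    """Return one gene ID per line, with blanks and duplicates removed.

    Assumes IDs are separated by whitespace or commas; order is preserved.
    """
    seen = set()
    genes = []
    for token in raw_text.replace(",", " ").split():
        if token not in seen:
            seen.add(token)
            genes.append(token)
    return "\n".join(genes)


if __name__ == "__main__":
    # Example systematic names, used here only to show the input format
    copied = "YAL001C\nYBR002W \nYAL001C\n\nYCR003W"
    print(clean_gene_list(copied))
```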
Milestone 6: Create an input workbook for GRNmap using MS Access database
- Create an input workbook for GRNmap based on a Microsoft Access database that the Coder/Designer and QAs make, following the protocol in Week 10.
- Run GRNmap and interpret the data.
- As the end-users of the Access database, the Data Analysts will provide feedback to the QAs and Coder/Designer about the usability of the database.