Difference between revisions of "Week 5"

From LMU BioDB 2017
(Reflect: changed wording)
(Homework Partners: approved last database)
 
(26 intermediate revisions by 10 users not shown)
Line 1: Line 1:
{{Under_Construction}}
 
 
 
'''This journal entry is due on Tuesday, October 3, at 12:01 AM PDT.'''
 
  
Line 13: Line 11:
 
== Individual Journal Assignment ==
 
  
 +
For this week, both partners will contribute to the same journal entry in lieu of individual journal entries.
 
* Store this journal entry as "''username'' Week 5" (i.e., this is the text to place between the square brackets when you link to this page).
 
 
* Link from your user page to this Assignment page.
 
* Link to your journal entry from your user page.
+
* Link to your journal entry (the shared database page) from your user page.
* Link back from your journal entry to your user page.
+
* Link back from your shared database page to your user page.
 
* Don't forget to add the "Journal Entry" category to the end of your wiki page.
 
 
**'''''Note: You can easily fulfill all of these links by adding them to your template and then using your template on your journal entry.'''''
 
* For your assignment this week, you will keep an '''''electronic laboratory notebook''''' on your individual wiki page. An electronic laboratory notebook records all the manipulations you perform on the data and the answers to the questions throughout the protocol. Like a paper lab notebook found in a wet lab, it should contain enough information so that you or someone else could reproduce what you did using only the information from the notebook.
+
* For your assignment this week, both partners will contribute to an '''''electronic laboratory notebook''''' on your database page (see below). An electronic laboratory notebook records all the manipulations you perform on the data and the answers to the questions throughout the protocol. Like a paper lab notebook found in a wet lab, it should contain enough information so that you or someone else could reproduce what you did using only the information from the notebook.
** To be clear, on your individual wiki page, you will document your individual process in your electronic lab notebook.
+
** To be clear, for this week, you and your partner will share a single journal entry page, named after the database you will evaluate.
** From this week onward, please use the individual journal page for your electronic lab notebook instead of a separate notebook page.
 
  
 
=== Homework Partners ===
 
Line 27: Line 25:
 
For most weeks in the semester, you will be assigned a "homework partner" from a complementary discipline. You will be expected to consult with your partner, sharing your domain expertise, in order to complete the assignment. However, unless otherwise stated, each partner must submit his or her own work as the individual journal entry (direct copies of each other's work are not allowed). You must give the details of the interaction with your partner in the [[Week_1#Acknowledgments | Acknowledgments section]] of your journal assignment.  Homework partners for this week are:
 
  
* Eddie Azinge, Mary Balducci
+
* Eddie Azinge, Mary Balducci - [https://card.mcmaster.ca/ The Comprehensive Antibiotic Resistance Database] '''Approved!''' ''— [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 21:02, 1 October 2017 (PDT)''
* Eddie Bachoura, Emma Tyrnauer
+
* Eddie Bachoura, Emma Tyrnauer - [https://www.fludb.org/brc/home.spg?decorator=influenza Influenza Research Database] '''Approved!''' ''— [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 22:04, 3 October 2017 (PDT)''
* Dina Bashoura, Nicole Kalcic
+
* Dina Bashoura, Nicole Kalcic - [https://monarchinitiative.org/ The Monarch Initiative] '''Approved!''' ''— [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 01:40, 2 October 2017 (PDT)''
* Blair Hamilton, Corinne Wong  
+
* Blair Hamilton, Corinne Wong - [https://www.animalgenome.org/cgi-bin/QTLdb/index Animal QTL Database] '''Approved!''' ''— [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 21:02, 1 October 2017 (PDT)''
* Hayden Hinsch, Simon Wroblewski
+
* Hayden Hinsch, Simon Wroblewski - [http://cgdb.biocuckoo.org/index.php Circadian Gene Database] '''Approved!''' ''— [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 14:32, 29 September 2017 (PDT)''
* Arash Lari, Antonio Porras
+
* Arash Lari, Antonio Porras - [https://img.jgi.doe.gov/cgi-bin/vr/main.cgi IMG/VR: a database of cultured and uncultured DNA viruses and retroviruses] '''Approved!''' ''— [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 21:03, 1 October 2017 (PDT)''
* Quinn Lanners, John Lopez
+
* Quinn Lanners, John Lopez - miRPathDB '''Approved!'''  ''— [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 16:15, 28 September 2017 (PDT)''
* Zach Van Ysseldyk, Katie Wright
+
* Zach Van Ysseldyk, Katie Wright - [http://ctdbase.org/ Comparative Toxicogenomics Database] '''Approved!''' ''— [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 14:32, 29 September 2017 (PDT)''
  
 
== NAR Database Evaluation and Presentation ==
 
Line 50: Line 48:
 
*** You may not choose a database from NCBI, EBI, or the DNA Data Bank of Japan.  You may not choose Ensembl, UniProt, SGD, or any other major model organism database.  The intent for this exercise is to pick something that is not one of the "major" databases.
 
 
*** Sign up for your database by editing this page next to your and your partner's names.  Dr. Dahlquist must approve all database choices.
 
<!--
 
=== Database Wiki Page ===
 
  
For your assignment, create a new wiki page to profile your database.  There will be one page per group; both partners will contribute to the same page.
+
=== Database Evaluation ===
* Link to your database page from the [[Class Journal Week 5]] page.  These pages will be a resource for the class as we move forward with this unit of the course. 
 
* Link to your database page from your user page.
 
* Link from your database page to the [[Class Journal Week 5]] page.
 
* Link from your database page to your user pages.
 
  
Read the article about the database from the ''Nucleic Acids Research'' journal and then go online to the database itself.  When you answer the questions below, provide a hyperlink to the page that you got the information from.
+
For your assignment, create a new wiki page to profile your database.  For this week, there will be one page per set of partners; both partners will contribute content and notes for their electronic lab notebook to the same page; you do not need to have separate individual journal entries for this week.
# What database did you access? (link to the home page of the database)
+
* Use the name of the database as the name of your page.
# What is the purpose of the database?
+
 
# What biological information does it contain?
+
Read the article about the database from the ''Nucleic Acids Research'' journal and then go online to the database itself.  In keeping with Academic Honesty and citation practices, when you answer the questions below, provide a hyperlink to the page that you got the information from.  There should be at least one hyperlink per answer.
# What species are covered in the database? 
+
* '''General information about the database'''
# What biological questions can it be used to answer? 
+
*# What is the name of the database? (link to the home page)
# What type (or types) of database is it (sequence, structure, model organism, or specialty [what?]; primary or “meta”; curated electronically, manually [in-house], manually [community])?
+
*# What type (or types) of database is it?
# What individual or organization maintains the database?
+
*## What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
# What is their funding source(s)?   
+
*## What type of data source does it have?
# Is there a license agreement or any restrictions on access to the database?
+
*##* primary versus secondary ("meta")
# How often is the database updated? When was the last update?
+
*##* curated versus non-curated
# Are there links to other databases?   
+
*##* electronic versus human curation
# Can the information be downloaded?
+
*##* in-house staff versus community curation
#* In what file formats?
+
*# What individual or organization maintains the database?
# Evaluate the “user-friendliness” of the database.  
+
*#* public versus private
#* Is the Web site well-organized?   
+
*#* large national or multinational entity or small lab group
#* Does it have a help section or tutorial?   
+
*# What is their funding source(s)?   
#* Run a sample query.  Do the results make sense?
+
* '''Scientific quality of the database'''
 +
*# Does the content appear to completely cover its content domain?
 +
*#* How many records does the database contain?
 +
*#* What claims do the database owners make about coverage in the corresponding paper? 
 +
*# What species are covered in the database?
 +
*# Is the database content useful? I.e., what biological questions can it be used to answer?
 +
*# Is the database content timely?
 +
*#* Is there a need in the scientific community for such a database at this time?
 +
*#* Is the content covered by other databases already?
 +
*#* When did the database first go online?
 +
*#* How often is the database updated?
 +
*#* When was the last update?
 +
* '''General utility of the database to the scientific community'''
 +
*# Are there links to other databases?  Which ones?
 +
*# Is it convenient to browse the data?
 +
*# Is it convenient to download the data?
 +
*#* In what file formats are the data provided?
 +
*#* Are they standard or non-standard formats? 
 +
*# Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?  
 +
*#* Is the website well-organized?   
 +
*#* Does it have a help section or tutorial?   
 +
*#* Are the search options sensible?
 +
*#* Run a sample query.  Do the results make sense?
 +
*# Access:  Is there a license agreement or any restrictions on access to the database?
 +
* '''Summary judgment'''
 +
*# Would you direct a colleague unfamiliar with the field to use it?
 +
*# Is this a professional or hobby database?
  
 
==== Some Definitions ====
 
Line 84: Line 102:
 
** In-house is when the human works for the database organization.
 
 
** Community is when the database allows members of the scientific community that don't work for the database organization to add information to the record.
 
-->
+
 
 
=== PowerPoint Presentation ===
 
  
Line 98: Line 116:
 
** Visuals/slides
 
 
** Speaking style/delivery
 
* '''''Your PowerPoint slides must be uploaded and linked to on your individual journal entries by the journal deadline of 12:01 AM on Tuesday, October 3, even if your presentation is on Thursday.'''''
+
* '''''Your PowerPoint slides must be uploaded and linked to on your database wiki page by the journal deadline of 12:01 AM on Tuesday, October 3, even if your presentation is on Thursday.'''''
 
** You can update your slides before your presentation, but we will be grading the ones you upload by the deadline.
 
 
* Finally, your presentation will also be evaluated by your fellow classmates (anonymously) who will answer the following questions:
 

Latest revision as of 05:04, 4 October 2017

This journal entry is due on Tuesday, October 3, at 12:01 AM PDT.

Objectives

The purpose of this assignment is:

  • to deeply explore and perform a critical review of an existing biological database.
  • to communicate your findings in an effective oral presentation.
  • to perform a self-assessment of your scientific data literacy skills.

Individual Journal Assignment

For this week, both partners will contribute to the same journal entry in lieu of individual journal entries.

  • Store this journal entry as "username Week 5" (i.e., this is the text to place between the square brackets when you link to this page).
  • Link from your user page to this Assignment page.
  • Link to your journal entry (the shared database page) from your user page.
  • Link back from your shared database page to your user page.
  • Don't forget to add the "Journal Entry" category to the end of your wiki page.
    • Note: You can easily fulfill all of these links by adding them to your template and then using your template on your journal entry.
  • For your assignment this week, both partners will contribute to an electronic laboratory notebook on your database page (see below). An electronic laboratory notebook records all the manipulations you perform on the data and the answers to the questions throughout the protocol. Like a paper lab notebook found in a wet lab, it should contain enough information so that you or someone else could reproduce what you did using only the information from the notebook.
    • To be clear, for this week, you and your partner will share a single journal entry page, named after the database you will evaluate.

Homework Partners

For most weeks in the semester, you will be assigned a "homework partner" from a complementary discipline. You will be expected to consult with your partner, sharing your domain expertise, in order to complete the assignment. However, unless otherwise stated, each partner must submit his or her own work as the individual journal entry (direct copies of each other's work are not allowed). You must give the details of the interaction with your partner in the Acknowledgments section of your journal assignment. Homework partners for this week are:

NAR Database Evaluation and Presentation

Each year, the journal Nucleic Acids Research (NAR) devotes the first issue in January to biological databases. The Week 4 Assignment introduced you to four "gold standard" biological databases. In this assignment you will use what you learned to evaluate a different biological database. Collectively, through presentations, you will gain experience with the breadth and depth of biological databases available on the Web:

Database Evaluation

For your assignment, create a new wiki page to profile your database. For this week, there will be one page per set of partners; both partners will contribute content and notes for their electronic lab notebook to the same page; you do not need to have separate individual journal entries for this week.

  • Use the name of the database as the name of your page.

Read the article about the database from the Nucleic Acids Research journal and then go online to the database itself. In keeping with Academic Honesty and citation practices, when you answer the questions below, provide a hyperlink to the page that you got the information from. There should be at least one hyperlink per answer.

  • General information about the database
    1. What is the name of the database? (link to the home page)
    2. What type (or types) of database is it?
      1. What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
      2. What type of data source does it have?
        • primary versus secondary ("meta")
        • curated versus non-curated
        • electronic versus human curation
        • in-house staff versus community curation
    3. What individual or organization maintains the database?
      • public versus private
      • large national or multinational entity or small lab group
    4. What is their funding source(s)?
  • Scientific quality of the database
    1. Does the content appear to completely cover its content domain?
      • How many records does the database contain?
      • What claims do the database owners make about coverage in the corresponding paper?
    2. What species are covered in the database?
    3. Is the database content useful? I.e., what biological questions can it be used to answer?
    4. Is the database content timely?
      • Is there a need in the scientific community for such a database at this time?
      • Is the content covered by other databases already?
      • When did the database first go online?
      • How often is the database updated?
      • When was the last update?
  • General utility of the database to the scientific community
    1. Are there links to other databases? Which ones?
    2. Is it convenient to browse the data?
    3. Is it convenient to download the data?
      • In what file formats are the data provided?
      • Are they standard or non-standard formats?
    4. Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
      • Is the website well-organized?
      • Does it have a help section or tutorial?
      • Are the search options sensible?
      • Run a sample query. Do the results make sense?
    5. Access: Is there a license agreement or any restrictions on access to the database?
  • Summary judgment
    1. Would you direct a colleague unfamiliar with the field to use it?
    2. Is this a professional or hobby database?

Some Definitions

  • Electronic curation occurs when someone writes a program to add information to a database record from another database.
  • Manual curation occurs when a human reviews the information being added to a record to validate it as true.
    • In-house is when the human works for the database organization.
    • Community is when the database allows members of the scientific community that don't work for the database organization to add information to the record.
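
As a rough illustration of the definitions above, the sketch below mimics electronic curation using two in-memory dictionaries as stand-ins for databases, followed by a manual review step. Every name and record here (`source_db`, `target_db`, `electronic_curation`, `manual_review`, the gene IDs) is hypothetical and not taken from any real database; it is only meant to make the distinction concrete.

```python
from copy import deepcopy

# Hypothetical stand-ins for two databases, keyed by a made-up gene ID.
source_db = {
    "GENE1": {"function": "transcription factor"},
    "GENE2": {"function": "kinase"},
}
target_db = {
    "GENE1": {"name": "alpha"},
    "GENE3": {"name": "gamma"},
}


def electronic_curation(target, source, field):
    """Copy `field` from matching source records into copies of the target
    records, tagging each copied value as electronically curated."""
    curated = deepcopy(target)
    for record_id, record in curated.items():
        if record_id in source and field in source[record_id]:
            record[field] = source[record_id][field]
            record["evidence"] = "electronic"
    return curated


def manual_review(records, record_id, reviewer):
    """Mark one record as validated by a human curator.
    `reviewer` might be an in-house or a community curator."""
    reviewed = deepcopy(records)
    reviewed[record_id]["evidence"] = "manual"
    reviewed[record_id]["reviewed_by"] = reviewer
    return reviewed


curated = electronic_curation(target_db, source_db, "function")
reviewed = manual_review(curated, "GENE1", "in-house curator")
```

In this sketch, only records present in both dictionaries are annotated, and the `evidence` tag records whether a program or a person last vouched for the record.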

PowerPoint Presentation

Each pair will prepare and give a 12-15 minute PowerPoint presentation based on their assigned database in class on Tuesday, October 3 or Thursday, October 5.

  • You will need to prepare ~12-15 slides (assume 1 slide per minute of presentation).
    • Please follow the Presentation Guidelines for how to format your slides.
    • You may give a live demo of the database if you wish, but practice carefully so that you can do the presentation in 15 minutes.
      • Alternatively, you may choose to show screenshots instead of a live demo.
  • You need to present the information you gathered about your database that you listed in your review above, but organized as a presentation.
  • Your presentation (both the slides and the oral presentation) will be evaluated by the instructors using the guidelines shown here in four areas:
    • Content and message
    • Organization
    • Visuals/slides
    • Speaking style/delivery
  • Your PowerPoint slides must be uploaded and linked to on your database wiki page by the journal deadline of 12:01 AM on Tuesday, October 3, even if your presentation is on Thursday.
    • You can update your slides before your presentation, but we will be grading the ones you upload by the deadline.
  • Finally, your presentation will also be evaluated by your fellow classmates (anonymously) who will answer the following questions:
    1. What is the speakers’ take-home message? (One short sentence)
    2. What is the best point about the presentation’s organization? What needs improvement? Give one specific example for each.
    3. What is the best point about the presentation’s visuals (slides)? What needs improvement? Give one specific example for each.
    4. What is the best point about the presentation’s delivery (speaking style)? What needs improvement? For each presenter, give one specific example of each.

Shared Journal Assignment

  • Store your journal entry in the shared Class Journal Week 5 page. If this page does not exist yet, go ahead and create it (congratulations on getting in first 👏🏼).
  • Link to your journal entry from your user page.
  • Link back from the journal entry to your user page.
    • NOTE: You can easily fulfill the links part of these instructions by adding them to your template and using the template on your user page.
  • Sign your portion of the journal with the standard wiki signature shortcut (~~~~).
  • Add the "Journal Entry" and "Shared" categories to the end of the wiki page (if someone has not already done so).

Reflect

A set of core competencies for scientific data literacy is listed in the section below. Answer the following questions on the shared Class Journal Week 5 page:

  1. Which of these core competencies are you most skilled with (or which is most familiar to you)? Where and how did you gain the skills/become familiar?
  2. Which of these core competencies do you want to know more about? Why?

Scientific Data Literacy Core Competencies

  1. Databases and Data Formats
    • Understand how to query relational databases, and be familiar with data types and formats for the discipline.
  2. Discovery and Acquisition of Data
    • Locate and utilize disciplinary data repositories, and identify appropriate data sources.
  3. Data Management and Organization
    • Understand the lifecycle of data, and use data management plans to track subsets of processed data.
  4. Data Conversion and Interoperability
    • Migrate data from one format to another, and understand the benefits of standard data formats.
  5. Quality Assurance
    • Use metadata and screening procedures to recognize artifacts, incompleteness, or corruption of data sets.
  6. Metadata
    • Interpret metadata from external sources, and annotate data so it can be used by external users.
  7. Data Curation and Re-use
    • Recognize the role of curation throughout the data lifecycle and its value in the effective reuse of data.
  8. Cultures of Practice
    • Know the practices, values, and norms of the discipline as they relate to managing, sharing, and curating data.
  9. Data Preservation
    • Understand the technology, resource, and organizational components of preserving data.
  10. Data Analysis
    • Understand the basic analysis tools of the discipline, including workflow management tools.
  11. Data Visualization
    • Use the visualization tools of the discipline, and understand the advantages of the different types of visualization.
  12. Ethics, including citation of data
    • Understand intellectual property, privacy, and the ethos of the discipline around sharing and citing data.