Monarch Initiative Week 4
Revision as of 18:52, 6 February 2024
To User Page: User: Asandle1, User: Kmill104
To Assignment Page: Week 4
https://academic.oup.com/nar/article/52/D1/D938/7449493
Database Evaluation
Andrew doing 1 and 3, Katie 2 and 4.
For your assignment, create a new wiki page to profile your database. For this week, there will be one page per set of partners; both partners will contribute content and notes for their electronic lab notebook to the same page; you do not need to have separate individual journal entries for this week.
- The name of your page should be "Database name Week 4".
Read the article about the database from the Nucleic Acids Research journal and then go online to the database itself. In keeping with Academic Honesty and citation practices, when you answer the questions below, provide a hyperlink to the page that you got the information from. There should be at least one hyperlink per answer.
General information about the database
- What is the name of the database? Monarch Initiative. (Source: Monarch Initiative Front Page)
- What type (or types) of database is it? The Monarch Initiative integrates gene, disease, and phenotype data. The database combines knowledge from across sources to reveal how they are connected. The database intends to show how these connections can tell us the causes and mechanisms of human disease. (Source: Nucleic Acids Research Article)
- What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?]) Each node contains a general description of a disorder, disease, phenotypic feature, or gene. There is often a section for additional names if there are other ways to refer to it. Each node then has several association links, organized in a table, that show how the node is related to the other types of data on the website. The amount of association information varies from node to node. Gene nodes list association counts with phenotypic features, interactions, pathways, orthologs, molecular functions, cellular components, biological processes, causal diseases, and correlated diseases. Disease nodes list association counts only with phenotypic features, causal genes, and correlated genes. For gene nodes, the sequence can be found by clicking a link that takes the user to another database. (Source: Monarch Initiative Explore Page)
- What type of data source does it have?
- primary versus secondary ("meta")?
- curated versus non-curated?
- if curated, is it electronic versus human curation?
- if human curation, is it in-house staff versus community curation?
- What individual or organization maintains the database?
- public versus private
- large national or multinational entity or small lab group
- What is their funding source(s)?
Scientific quality of the database
- Does the content appear to completely cover its content domain?
- How many records does the database contain?
It is not completely clear how many records the database contains; this could be better communicated. That said, opening the search bar shows 845,539 results, which may be the total number of entries in the database. (Source: Search Page)
- What claims do the database owners make about coverage in the corresponding paper?
- What species are covered in the database? (If it is a very long list, summarize.)
The database covers and connects "phenotypes to genotypes across species". (Source: About)
- Is the database content useful? I.e., what biological questions can it be used to answer?
It is useful, although how useful is difficult to determine without a particular use case to test it against. The database merges data from different scientific research fields onto one platform to inform research and improve data organization, so that more informed clinical decisions can be made. (Source: About)
- Is the database content timely?
It is unclear. The banner at the bottom of the page shows that the site itself was updated in 2024 and that the tool is a work in progress, but they provide no information about the currency of the underlying data, other than that they have an API (or, more accurately, multiple APIs) pulling data from other databases. The number of source databases mentioned in their research paper differs from the number on the site page that lists them. The list of databases they mention can be found on their documentation page here: https://monarch-initiative.github.io/monarch-documentation/#standards-documentation
- Is there a need in the scientific community for such a database at this time?
- Is the content covered by other databases already?
- How current is the database?
- When did the database first go online?
- How often is the database updated?
- When was the last update?
They are not very forthcoming or clear about which species/model organisms they are using other than humans.
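Because the record count above was read off the search page by hand, one way to make it repeatable would be to ask a search API for its total directly. The following is only a minimal sketch: the base URL is a placeholder, and the `q` and `limit` parameters and the `total` field in the JSON response are assumptions for illustration, not the documented Monarch API. The sample response body simply reuses the number seen on the search page.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint (an assumption); consult the Monarch documentation
# for the real search URL before using this against the live service.
ASSUMED_SEARCH_URL = "https://example.org/api/search"


def build_search_url(query: str, limit: int = 1,
                     base_url: str = ASSUMED_SEARCH_URL) -> str:
    """Build a search URL; the parameter names 'q' and 'limit' are assumed."""
    return f"{base_url}?{urlencode({'q': query, 'limit': limit})}"


def total_from_response(body: str) -> int:
    """Read the result count from a JSON body with an assumed 'total' field."""
    payload = json.loads(body)
    return int(payload["total"])


# Hypothetical response body, using the count observed on the search page.
sample_body = '{"total": 845539, "items": []}'

print(build_search_url("marfan syndrome"))
print(total_from_response(sample_body))
```

Recording the count together with the date it was retrieved would also give a rough, independent check on how often the data change.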
General utility of the database to the scientific community
- Are there links to other databases? Which ones?
- Is it convenient to browse the data?
- Is it convenient to download the data?
- In what file formats are the data provided?
- What type of files, indicated by the file extension (e.g., .txt, .xml, etc.)?
- Are they standard or non-standard formats? (i.e., are they following an approved standard for that type of data)?
- Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
- Is the website well-organized?
- Does it have a help section or tutorial?
- Are the search options sensible?
- Run a sample query. Do the results make sense?
- Access: Is there a license agreement or any restrictions on access to the database?
Summary judgment
- Would you direct a colleague unfamiliar with the field to use it?
- Is this a professional or "hobby" database? The "hobby" analogy means that it was that person's hobby to make the database. It could mean that it is limited in scope, done by one or a few persons, or seems amateur.
Some Definitions
- Electronic curation occurs when someone writes a program to add information to a database record from another database.
- Manual curation occurs when a human reviews the information being added to a record to validate it as true.
- In-house is when the human works for the database organization.
- Community is when the database allows members of the scientific community that don't work for the database organization to add information to the record.