Difference between revisions of "IMG/VR Week 5"
Revision as of 16:38, 28 September 2019
Contents
- 1 General information about the database
- 1.1 1. What is the name of the database? (link to the home page)
- 1.2 2. What type (or types) of database is it?
- 1.2.1 1)What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
- 1.2.2 2)What type of data source does it have? primary versus secondary ("meta")
- 1.2.3 Curated versus non-curated? If curated, is it electronic versus human curation? If human curation, is it in-house staff versus community curation?
- 1.3 3.What individual or organization maintains the database?
- 1.4 4. What is their funding source(s)?
- 2 Scientific quality of the database
- 2.1 1. Does the content appear to completely cover its content domain?
- 2.2 2. What species are covered in the database? (If it is a very long list, summarize.)
- 2.3 3. Is the database content useful? I.e., what biological questions can it be used to answer?
- 2.4 4.Is the database content timely?
- 2.5 5.How current is the database?
- 3 General utility of the database to the scientific community
- 3.1 1. Are there links to other databases? Which ones?
- 3.2 2. Is it convenient to browse the data?
- 3.3 3. Is it convenient to download the data?
- 3.4 4. Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
- 3.5 5.Access: Is there a license agreement or any restrictions on access to the database?
- 4 Summary judgment
- 5 Acknowledgments
- 6 References
General information about the database
1. What is the name of the database? (link to the home page)
IMG/VR [[1]]
2. What type (or types) of database is it?
1)What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
IMG/VR is a database of viral DNA sequences for analysis. The site organizes these sequences according to the part of the human body the virus infects, the ecosystem the virus inhabits, and the host associated with the virus.
2)What type of data source does it have? primary versus secondary ("meta")
IMG/VR offers secondary source data, as the genomic information of this database was collected from outside sources that sequenced the DNA.
Curated versus non-curated? If curated, is it electronic versus human curation? If human curation, is it in-house staff versus community curation?
3.What individual or organization maintains the database?
IMG/M is a public database maintained by The Regents of the University of California.
4. What is their funding source(s)?
Since it is a public database, IMG/VR is publicly funded by the state of California.
Scientific quality of the database
1. Does the content appear to completely cover its content domain?
Content domain: “annotation, analysis, and distribution” of genome and microbiome datasets [[2]]
How many records does the database contain?
8389 cultivated reference viruses
What claims do the database owners make about coverage in the corresponding paper?
Datasets from GenBank are processed through the IMG submission system [[3]] and the IMG annotation pipeline before being integrated into the IMG data warehouse.
2. What species are covered in the database? (If it is a very long list, summarize.)
Viruses
3. Is the database content useful? I.e., what biological questions can it be used to answer?
The database supports comparative analysis across genome datasets. With thousands of datasets and millions of genes, its analytic tools can show how species are similar and how they differ. It also displays individual characteristics of each genome, which makes specific comparisons possible.
4.Is the database content timely?
Is there a need in the scientific community for such a database at this time?
Yes. The scientific community needs public access to the genomes of different species. Such access can lead to more discoveries about how these organisms are related to one another and could lead to groundbreaking research if a model organism is discovered.
Is the content covered by other databases already?
5.How current is the database?
When did the database first go online?
The database first went online in 2016.
How often is the database updated?
It is updated on a quarterly basis. [[4]]
When was the last update?
The last update was in September 2019.
General utility of the database to the scientific community
1. Are there links to other databases? Which ones?
2. Is it convenient to browse the data?
3. Is it convenient to download the data?
- In what file formats are the data provided?
- What type of files, indicated by the file extension (e.g., .txt, .xml, etc.)?
- Are they standard or non-standard formats (i.e., do they follow an approved standard for that type of data)?
4. Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
- Is the website well-organized?
- Does it have a help section or tutorial?
- Are the search options sensible?
- Run a sample query. Do the results make sense?
5.Access: Is there a license agreement or any restrictions on access to the database?
Summary judgment
1. Would you direct a colleague unfamiliar with the field to use it?
I would not recommend this database to an unfamiliar colleague. While it is useful for analyzing and comparing DNA sequences, the database contains a large amount of niche information and pages that are hard to understand, particularly for someone with little experience in bioinformatics and general biology. We had difficulty navigating the website because certain pages were not functioning, and we had to rely on the “Help” tab to understand how to use it.
2.Is this a professional or "hobby" database? The "hobby" analogy means that it was that person's hobby to make the database. It could mean that it is limited in scope, done by one or a few persons, and seems amateur.
IMG/M seems to be a professional database that offers genomic information to scientists across the world. It encourages researchers to analyze, distribute, and annotate the information it provides and requests that the website be properly cited, which suggests that it is meant for use in scientific writing.