IMG/VR Week 5
Revision as of 16:37, 28 September 2019
Contents
- 1 General information about the database
- 2 Scientific quality of the database
- 2.1 1. Does the content appear to completely cover its content domain?
- 2.2 2. What species are covered in the database? (If it is a very long list, summarize.)
- 2.3 3. Is the database content useful? I.e., what biological questions can it be used to answer?
- 2.4 4. Is the database content timely?
- 2.5 5. How current is the database?
- 3 General utility of the database to the scientific community
- 3.1 1. Are there links to other databases? Which ones?
- 3.2 2. Is it convenient to browse the data?
- 3.3 3. Is it convenient to download the data?
- 3.4 4. Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
- 3.5 5. Access: Is there a license agreement or any restrictions on access to the database?
- 4 Summary judgment
- 5 Acknowledgments
- 6 References
General information about the database
1. What is the name of the database? (link to the home page)
IMG/VR [[1]]
2. What type (or types) of database is it?
1) What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
IMG/VR is a database of sequenced viral DNA, provided for analysis. The site organizes these sequences according to the part of the human body the virus infects, the ecosystem the virus inhabits, and the host associated with the virus.
2) What type of data source does it have? primary versus secondary ("meta")
IMG/VR offers secondary-source data: the genomic information in the database was collected from outside sources that sequenced the DNA.
curated versus non-curated?
if curated, is it electronic versus human curation? if human curation, is it in-house staff versus community curation?
3. What individual or organization maintains the database?
IMG/VR is a public database maintained by The Regents of the University of California.
4. What is their funding source(s)?
Since it is a public database, IMG/VR is publicly funded by the state of California.
Scientific quality of the database
1. Does the content appear to completely cover its content domain?
Content domain: “annotation, analysis, and distribution” of genome and microbiome datasets [[2]]
How many records does the database contain?
8389 cultivated reference viruses
What claims do the database owners make about coverage in the corresponding paper?
Data from GenBank are processed through the IMG submission system [[3]] and the IMG annotation pipeline before being integrated into the IMG data warehouse.
2. What species are covered in the database? (If it is a very long list, summarize.)
Viruses
3. Is the database content useful? I.e., what biological questions can it be used to answer?
It can be used to answer comparative-analysis questions across different genome datasets. With thousands of datasets and millions of genes, its analytic tools can show how species are similar and how they differ. The database also exposes specific characteristics of each genome that can serve as the basis for comparison.
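As a rough illustration of what "comparing characteristics of genomes" can mean in practice, the sketch below computes two simple measures for a pair of DNA sequences: GC content and the Jaccard similarity of their k-mer sets. This is not how IMG/VR's own tools work; the function names and the toy sequences are hypothetical and are used only to make the idea concrete.

```python
def kmers(seq, k=3):
    """Return the set of all length-k substrings (k-mers) of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of the k-mer sets of two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka and not kb:
        # Two sequences too short to have any k-mers are treated as identical.
        return 1.0
    return len(ka & kb) / len(ka | kb)


def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from IMG/VR.
    virus_a = "ATGCGTACGTTAGC"
    virus_b = "ATGCGTTCGTTAGC"
    print("GC content A:", gc_content(virus_a))
    print("GC content B:", gc_content(virus_b))
    print("3-mer Jaccard similarity:", jaccard(virus_a, virus_b))
```

A similarity near 1.0 means the two sequences share most of their k-mers, while a value near 0.0 means they share few. Real comparative tools use far more sophisticated methods, such as alignment and gene-level annotation.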
4. Is the database content timely?
Is there a need in the scientific community for such a database at this time?
Yes. The scientific community needs public access to the genomes of different species. Such access can reveal more about how these organisms are related to one another and could lead to groundbreaking research if a new model organism is identified.
Is the content covered by other databases already?
5. How current is the database?
When did the database first go online?
The database first went online in 2016.
How often is the database updated?
It is updated on a quarterly basis. [[4]]
When was the last update?
The last update was in September 2019.
General utility of the database to the scientific community
1. Are there links to other databases? Which ones?
2. Is it convenient to browse the data?
3. Is it convenient to download the data?
- In what file formats are the data provided?
- What type of files, indicated by the file extension (e.g., .txt, .xml, etc.)?
- Are they standard or non-standard formats? (i.e., are they following an approved standard for that type of data)?
- Is the website well-organized?
- Does it have a help section or tutorial?
- Are the search options sensible?
- Run a sample query. Do the results make sense?