QLanners Week 14

From LMU BioDB 2017
Revision as of 23:48, 30 November 2017 by Kdahlquist (talk | contribs) (from NCBI)

We used the gene HSF1, which encodes a transcription factor, to determine which fields should be pulled from each database.

General info we want about each gene:

  • Gene ID from each database
  • Description/Function (Ensembl)
  • DNA Sequence (Ensembl)
  • Protein Sequence (UniProt)
  • Locus tag (NCBI)
  • Also Known As (NCBI)
  • Consensus Sequence (JASPAR)
  • Regulation (SGD)
  • Interaction (SGD)
  • Similar Proteins (UniProt)
  • Gene Ontology (SGD - see if we can find it on UniProt)

We decided that from JASPAR we will pull:

  • Gene ID (this will be the matrix ID)
  • Sequence Logo
  • Frequency Matrix
  • Class and family
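
The general list above names a consensus sequence from JASPAR, while this list pulls the frequency matrix. As a rough sketch of how the two relate, the Python below takes the most frequent base at each position of a matrix. The counts in `TOY_MATRIX` are made up for illustration and are not the JASPAR matrix for HSF1; the function name `consensus` and the simple majority rule are also assumptions, and JASPAR itself may report ambiguity codes that this sketch does not handle.

```python
# Toy position frequency matrix: counts of each base at each motif position.
# ASSUMPTION: these counts are invented for illustration, not real HSF1 data.
TOY_MATRIX = {
    "A": [8, 0, 1, 7],
    "C": [1, 0, 8, 1],
    "G": [0, 9, 0, 1],
    "T": [1, 1, 1, 1],
}


def consensus(matrix):
    """Return the most frequent base at each position.

    Ties are broken in A, C, G, T order because max() keeps the
    first maximal item it sees.
    """
    bases = ["A", "C", "G", "T"]
    length = len(matrix["A"])
    return "".join(
        max(bases, key=lambda base: matrix[base][i]) for i in range(length)
    )


if __name__ == "__main__":
    print(consensus(TOY_MATRIX))
```

Storing the matrix as one list of counts per base keeps it close to how a frequency matrix is usually displayed, with one row per base and one column per position.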

Breakdown of what we want from all other databases:
NCBI:

  • Gene ID
  • Locus Tag
  • Also Known As
  • Also get RefSeq IDs for chromosome, mRNA, and protein.


Ensembl:

  • Gene ID
  • Description/Function
  • DNA Sequence
  • Also get the chromosomal location and the "About this gene" information


UniProt:

  • Gene ID (Note that for UniProt, it will be a protein ID, and that there are two different ones that you need to get. Kdahlquist (talk) 15:36, 30 November 2017 (PST))
  • Protein Sequence
  • Similar Proteins
  • Protein Type/Name
  • Please get the species from UniProt as well, even though we know it is going to be yeast.


SGD:

  • Gene ID
    • Standard Name, e.g., HSF1
    • Systematic Name, e.g., YGL073W
    • SGD ID, e.g., S000003041
  • Regulation
  • Interaction
  • Gene Ontology
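
The per-database lists above could be written down as a simple lookup table so that later code can check which database supplies which field. This is only a sketch: the names `FIELDS_BY_DATABASE`, `SGD_EXAMPLE_IDS`, and `databases_providing` are hypothetical, the field labels are copied from the lists on this page, and no database is actually queried.

```python
# Hypothetical plan: database name -> fields we decided to pull.
# Labels are copied from the lists on this page; nothing is fetched here.
FIELDS_BY_DATABASE = {
    "JASPAR": ["Gene ID (matrix ID)", "Sequence Logo", "Frequency Matrix",
               "Class", "Family"],
    "NCBI": ["Gene ID", "Locus Tag", "Also Known As",
             "RefSeq IDs (chromosome, mRNA, protein)"],
    "Ensembl": ["Gene ID", "Description/Function", "DNA Sequence",
                "Chromosomal Location"],
    "UniProt": ["Gene ID (protein IDs)", "Protein Sequence",
                "Similar Proteins", "Protein Type/Name", "Species"],
    "SGD": ["Standard Name", "Systematic Name", "SGD ID",
            "Regulation", "Interaction", "Gene Ontology"],
}

# The three SGD identifiers for the example gene, as given on this page.
SGD_EXAMPLE_IDS = {
    "Standard Name": "HSF1",
    "Systematic Name": "YGL073W",
    "SGD ID": "S000003041",
}


def databases_providing(field_name):
    """Return databases whose plan has a field containing field_name.

    Matching is a case-insensitive substring test, so "sequence"
    matches both "DNA Sequence" and "Protein Sequence".
    """
    needle = field_name.lower()
    return [
        database
        for database, fields in FIELDS_BY_DATABASE.items()
        if any(needle in field.lower() for field in fields)
    ]


if __name__ == "__main__":
    print(databases_providing("Protein Sequence"))
    print(databases_providing("Locus Tag"))
```

A plain dictionary is used because the plan is small and still changing; if the field list settles, a more structured schema might be a better fit.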