Difference between revisions of "KSherbina Week 3"

From LMU BioDB 2013
(Reading Frames: Corrected commands to translate -1, -2, and -3 reading frames.)
(XMLPipeDB Match Practice: Italicized the last mention of ATG in question 3.)
 
Latest revision as of 06:18, 13 September 2013

Katrina Sherbina


The Genetic Code, by Computer

Complement of a Strand

To find the complementary strand of the sequence in the file prokaryote.txt, the following pipeline was run:

cat prokaryote.txt | sed "y/atcg/tagc/"
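
As a quick check of what the y command does, here is a minimal sketch that wraps the same sed expression in a function and applies it to a short made-up sequence. The function name complement and the sample string are assumptions for illustration and are not taken from prokaryote.txt.

```shell
# Hypothetical helper: wraps the sed expression used above.
complement() {
  # y/atcg/tagc/ transliterates character by character:
  # a->t, t->a, c->g, g->c. The order of the bases is unchanged.
  sed "y/atcg/tagc/"
}

# Made-up sample sequence, used only to show the mapping.
echo "atgcca" | complement
```

Note that the output is the complement read in the same left-to-right order as the input; it is not reversed, which is why the negative reading frames below add rev.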

Reading Frames

The following pipelines were run to determine the sequence of amino acids encoded by

  • the +1 reading frame:
cat prokaryote.txt | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
  • the +2 reading frame:
cat prokaryote.txt | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
  • the +3 reading frame:
cat prokaryote.txt | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
  • the -1 reading frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
  • the -2 reading frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
  • the -3 reading frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
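
The frame-shifting steps can be tried on a short sequence without genetic-code.sed. The sketch below is an illustration under stated assumptions: the helper names codons and revcomp and the sample sequence are hypothetical, and the final translation step is left out because genetic-code.sed is not reproduced here.

```shell
# Hypothetical sample sequence, not taken from prokaryote.txt.
seq="atggcctaa"

# Split into codons (a space after every three bases), then t -> u for mRNA.
codons() {
  sed "s/.../& /g" | sed "s/t/u/g"
}

# Complement, then reverse, so the opposite strand is read 5' to 3'.
revcomp() {
  sed "y/atcg/tagc/" | rev
}

echo "$seq" | codons                          # +1 frame
echo "$seq" | sed "s/^.//" | codons           # +2 frame: drop the first base
echo "$seq" | sed "s/^..//" | codons          # +3 frame: drop the first two bases
echo "$seq" | revcomp | codons                # -1 frame
echo "$seq" | revcomp | sed "s/^.//" | codons # -2 frame
```

For this sample the +1 frame splits into the codons aug gcc uaa, and the -1 frame into uua ggc cau. Leftover bases at the end of a shifted frame (fewer than three) are simply printed as an incomplete group.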

XMLPipeDB Match Practice

  1. There are two unique matches: go:0009165 and go:0009165. The pattern go:009165 appears twice and the pattern go:009165 appears once in the file. These patterns are Gene Ontology (GO) term identifiers, which describe the biological process or function in which the product of a gene is involved.
  2. There are two unique matches: "james k.d." and "james a.a.". The pattern "james k.d." appears 8238 times while the pattern "james a.a." appears only once in the file. These patterns probably designate the names of the scientists who determined the genetic sequence of P. falciparum.
  3. According to Match, the pattern ATG appears 165 times in the file. However, according to grep, the pattern ATG appears 162 times. The discrepancy makes sense because grep searches one line at a time, while Match searches the whole text as one continuous stream. Three occurrences of ATG may each have been split across a line break, between the end of one line and the start of the next. Match was able to find them, whereas grep was not.
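
The line-break explanation in item 3 can be reproduced on a tiny made-up input. This is a sketch only: it does not use XMLPipeDB Match or the actual data file, the function names count_by_line and count_joined are hypothetical, and joining lines with tr merely stands in for a search that ignores line boundaries.

```shell
# Count occurrences of a pattern the way grep sees the text: one line at a time.
count_by_line() {
  grep -o "$1" | wc -l | tr -d ' '
}

# Count occurrences after removing newlines, so matches may span line breaks.
count_joined() {
  tr -d '\n' | grep -o "$1" | wc -l | tr -d ' '
}

# Made-up sample: one ATG sits inside line 1, a second ATG is split
# as "A" at the end of line 1 and "TG" at the start of line 2.
printf 'ccATGccA\nTGcc\n' | count_by_line "ATG"
printf 'ccATGccA\nTGcc\n' | count_joined "ATG"
```

On this sample the line-by-line count is 1 and the joined count is 2, which is the same kind of gap as the 162 versus 165 reported above.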

Ksherbina (talk) 23:17, 12 September 2013 (PDT)
