Difference between revisions of "Kevinmcgee Week 3"

From LMU BioDB 2013
(Added homework)
(Added XML match homework)
 

Latest revision as of 18:40, 12 September 2013


Genetic Code By Computer

Complementary Strand

cat seq_file | sed "y/actg/tgac/"
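A quick way to check the complement mapping is to run it on a short sequence whose complement is easy to verify by eye. The sketch below assumes a made-up sample sequence (not taken from seq_file) and also shows tr producing the same result as sed's y command.

```shell
# Hypothetical sample sequence; any lowercase DNA string works.
seq="agcggtatac"

# Complement with sed's y command: a<->t and c<->g.
complement=$(echo "$seq" | sed "y/actg/tgac/")

# The same character mapping done with tr, for comparison.
complement_tr=$(echo "$seq" | tr "actg" "tgac")

echo "$complement"
echo "$complement_tr"
```

Both lines should print the same string, with every a swapped for t and every c swapped for g.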

Reading Frame Codes

+1: cat seq_file | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
+2: cat seq_file | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
+3: cat seq_file | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
-1: cat seq_file | sed "y/actg/tgac/" | rev | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
-2: cat seq_file | sed "y/actg/tgac/" | rev | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
-3: cat seq_file | sed "y/actg/tgac/" | rev | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
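The six pipelines above differ only in whether the strand is reverse-complemented and in how many leading bases are dropped. A minimal sketch of that pattern follows, assuming a made-up sample sequence in place of seq_file and a hypothetical helper named frame. It stops at the codon-splitting step and leaves out the final sed -f genetic-code.sed translation, since that file is not reproduced here.

```shell
# Hypothetical sample sequence standing in for the contents of seq_file.
seq="agcggtatac"

# frame STRAND OFFSET: print the RNA codons for one reading frame.
# STRAND is + or -; OFFSET is the number of leading bases to drop (0, 1 or 2).
frame() {
    strand=$1
    offset=$2
    if [ "$strand" = "-" ]; then
        # Minus-strand frames read the reverse complement.
        s=$(echo "$seq" | sed "y/actg/tgac/" | rev)
    else
        s=$seq
    fi
    # Drop the offset bases, split into triplets, and convert t to u.
    echo "$s" | cut -c "$((offset + 1))-" | sed "s/.../& /g" | sed "s/t/u/g"
}

# Print all six frames, labelled +1..+3 and -1..-3.
for strand in + -; do
    for offset in 0 1 2; do
        printf '%s%s: %s\n' "$strand" "$((offset + 1))" "$(frame "$strand" "$offset")"
    done
done
```

Piping the output of frame into sed -f genetic-code.sed would complete the translation in the same way as the commands above.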

XMLPipeDB Match Practice

  1. What Match command tallies the occurrences of the pattern GO:000916. in the 493.P_falciparum.xml file?
  • 2 unique matches
  • The first match occurred twice; the second match occurred once.
  • There appear to be multiple references to this pattern in the document. It appears to refer to a specific alignment on a protein sequence in 493.P_falciparum.
  2. What Match command tallies the occurrences of the pattern \"James.*\" in the 493.P_falciparum.xml file?
  • 2 unique matches
  • The first match occurred 8238 times; the second match occurred 1 time.
  • When I ran the command grep "\"James.*\"", the output showed that it is a person's name. Given the title and the copyright notice in the file, I believe it is safe to assume that James K.D. is the name of an author of a scientific article.
  3. Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
  • Match returned 830,101 matches.
  • grep and wc gave me 502,401 lines, 502,401 words, and 35,671,048 characters.
  • It makes sense that grep and wc reported fewer matches: grep prints each matching line only once, so wc counts the lines that contain ATG rather than every occurrence, and multiple matches on the same line are counted once.
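The gap between the two counts can be reproduced on a tiny input. The sketch below uses a made-up three-line sample, not the chromosome file, and assumes a grep that supports the -o option (GNU and BSD grep both do), which prints each match on its own line so that wc -l counts occurrences instead of matching lines.

```shell
# Hypothetical three-line sample standing in for the chromosome file.
sample=$(printf 'ATGCCATGA\nCCCGGG\nTTATGATGATG\n')

# Lines that contain ATG at least once (what grep piped to wc -l counts).
line_count=$(printf '%s\n' "$sample" | grep "ATG" | wc -l)

# Every ATG occurrence, one per output line, thanks to -o.
match_count=$(printf '%s\n' "$sample" | grep -o "ATG" | wc -l)

echo "lines: $line_count, matches: $match_count"
```

Here two lines contain ATG, but the pattern occurs five times in total, which mirrors the way the per-line count on the real file falls short of the Match tally.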

Kevinmcgee (talk) 11:14, 12 September 2013 (PDT)
