Difference between revisions of "Stephen Louie Week 3"

From LMU BioDB 2013
(Added "complement of a strand")
(Added corrections to #3 of XMLPipeDB practice)
 
(8 intermediate revisions by one user not shown)
[[Category:Journal Entry]]

Latest revision as of 04:50, 20 September 2013

Where's your stuff?

The icon changed from a note to a webpage

Complement of a Strand

The command I used to get the complement strand was:

cat prokaryote.txt | sed "y/atcg/tagc/"
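As a small check of what the y command does here, the sketch below runs the same sed expression on a short made-up strand. The sequence atgcca is a hypothetical stand-in, not the contents of prokaryote.txt.

```shell
# Hypothetical 6-base strand used only for illustration.
strand="atgcca"

# y/atcg/tagc/ maps each base to its partner: a->t, t->a, c->g, g->c.
complement=$(echo "$strand" | sed "y/atcg/tagc/")

echo "strand:     $strand"
echo "complement: $complement"
```

Because y swaps single characters in one pass, an a that becomes a t is not converted back again, which is what would happen with two chained s commands.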

Reading Frames

The six sets of text processing commands I used were:

cat prokaryote.txt | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
cat prokaryote.txt | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
cat prokaryote.txt | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
cat prokaryote.txt | rev | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
cat prokaryote.txt | rev | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
cat prokaryote.txt | rev | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
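To see what the frame shift and codon split do before genetic-code.sed is applied, here is a minimal sketch on a made-up 9-base strand. The sequence is hypothetical, and the translation step is left out because genetic-code.sed is not reproduced on this page.

```shell
# Hypothetical strand standing in for the contents of prokaryote.txt.
seq="atgccatag"

# Frame +1: split into groups of three, then transcribe t -> u.
frame1=$(echo "$seq" | sed "s/.../& /g" | sed "s/t/u/g")

# Frame +2: delete the first base, then split and transcribe.
frame2=$(echo "$seq" | sed "s/^.//" | sed "s/.../& /g" | sed "s/t/u/g")

# Frame +3: delete the first two bases, then split and transcribe.
frame3=$(echo "$seq" | sed "s/^..//" | sed "s/.../& /g" | sed "s/t/u/g")

printf '%s\n' "+1: $frame1" "+2: $frame2" "+3: $frame3"
```

The g flag on "s/^.//g" in the commands above is harmless but not needed, since ^ can only match once per line. Leftover bases at the end of a frame stay as an incomplete group.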

Corrections (the last three commands above need the complement taken before reversing):

cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed

cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed

cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
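The difference between reversing alone and complementing before reversing can be sketched on a short made-up strand. The sequence is hypothetical, and the sketch stops at the codon split since genetic-code.sed is not shown here.

```shell
# Hypothetical strand standing in for the contents of prokaryote.txt.
seq="atgccatag"

# rev alone gives the same strand read backwards, not the other strand.
reversed_only=$(echo "$seq" | rev)

# Complement first, then reverse, to read the opposite strand 5' to 3'.
reverse_complement=$(echo "$seq" | sed "y/atcg/tagc/" | rev)

# Frame -1: codons of the reverse complement, transcribed t -> u.
frame_minus1=$(echo "$reverse_complement" | sed "s/.../& /g" | sed "s/t/u/g")

echo "reversed only:      $reversed_only"
echo "reverse complement: $reverse_complement"
echo "frame -1 codons:    $frame_minus1"
```

Since y and rev each act on single characters independently, complementing then reversing gives the same result as reversing then complementing.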

XMLPipeDB Match Practice

  1. There are two unique matches. GO:0009165 appears twice. GO:0009168 appears once. GO:000916 is most likely the start of a Gene Ontology (GO) term ID rather than a specific strain of bacterium.
  2. There are two unique matches. "James K.D." appears 8238 times. "James A.A." appears once. The pattern \"James.*\" probably matches the name of an author.
  3. Match gave 165 occurrences. grep/wc gave 162 occurrences. These answers make sense because Match and grep/wc count occurrences in different ways. Match reads the entire file as a single line, while grep/wc breaks the file into chunks (lines) and scans each one. grep/wc finds fewer occurrences because splitting the file into chunks can split a match in two, and that match is then lost. Since Match looks at the file as a single line, none of its matches are at risk of being split apart. Correction: Match found 830,101 occurrences. grep/wc gave 502,410 occurrences. The same explanation applies to the corrected counts.
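One way to see how line-based counting can come up short is the sketch below, which uses grep on a tiny made-up input. The input and the ATG pattern are hypothetical, Match itself is not run, and whether these effects fully account for the counts reported above is an assumption.

```shell
# Hypothetical three-line input: two matches on line 1, and one match
# that is split across the break between lines 2 and 3.
sample='ATGATG\nCCAT\nGCC\n'

# Lines containing the pattern (what grep piped to a line count reports).
matching_lines=$(printf "$sample" | grep -c "ATG")

# Individual matches, still searched one line at a time.
per_line_matches=$(printf "$sample" | grep -o "ATG" | wc -l | tr -d ' ')

# Individual matches after joining everything into a single line.
single_line_matches=$(printf "$sample" | tr -d '\n' | grep -o "ATG" | wc -l | tr -d ' ')

echo "matching lines:         $matching_lines"
echo "matches, line by line:  $per_line_matches"
echo "matches, as one line:   $single_line_matches"
```

The three numbers differ because a line with several matches counts once when lines are counted, and a match that straddles a line break is only found once the lines are joined.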

Slouie (talk) 11:18, 17 September 2013 (PDT)
