Difference between revisions of "KSherbina Week 3"
From LMU BioDB 2013
(Entered the commands to find the amino acid sequence encoded by the +1 and +2 reading frames.) |
(→XMLPipeDB Match Practice: Italicized the last mention of ATG in question 3.) |
<!--To find a pattern within text : grep "^A....T" hs_ref_GRCh37_chr19.fa-->
<!--XMLPipeDB Match Practice Task 1-->
<!--java -jar xmlpipedb-match-1.1.1.jar "GO:000916." < 493.P_falciparum.xml-->
:<!-- < : sending a file-->
<!--There are two unique matches. First one appears twice. Second one appears once.-->
Latest revision as of 06:18, 13 September 2013
The Genetic Code, by Computer
Complement of a Strand
To find the complementary strand of the sequence in the file prokaryote.txt, the following pipeline was run:
cat prokaryote.txt | sed "y/atcg/tagc/"
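As a quick check of the transliteration, the same sed "y" command can be tried on a short sequence. This is only a sketch: the helper name complement and the sample string atgcca are made up for illustration and are not taken from prokaryote.txt. The output is the complement written in the same left-to-right order as the input; piping it through rev gives the reverse complement.

```shell
# Hypothetical helper: complement a lowercase DNA sequence read from stdin.
complement() {
  sed "y/atcg/tagc/"
}

# Made-up sample sequence, used only for illustration.
printf 'atgcca\n' | complement          # prints tacggt
printf 'atgcca\n' | complement | rev    # prints tggcat
```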
Reading Frames
The following pipelines were run to determine the sequence of amino acids encoded by
- the +1 reading frame:
cat prokaryote.txt | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
- the +2 reading frame:
cat prokaryote.txt | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
- the +3 reading frame:
cat prokaryote.txt | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
- the -1 reading frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
- the -2 reading frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
- the -3 reading frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
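The six pipelines above differ only in whether the strand is reverse-complemented and in how many leading bases are dropped. A minimal sketch of that pattern follows, assuming a hypothetical helper named codons and a made-up sample sequence. It stops at the codon-splitting step and leaves out the final translation, since the contents of genetic-code.sed are not shown here.

```shell
# Hypothetical helper: split the sequence on stdin into RNA codons.
# $1 is the reading frame: 1, 2, 3 (given strand) or -1, -2, -3 (reverse complement).
codons() {
  frame=$1
  skip=$(( ${frame#-} - 1 ))      # bases to drop from the start: 0, 1 or 2
  if [ "$frame" -lt 0 ]; then
    sed "y/atcg/tagc/" | rev      # reverse complement for negative frames
  else
    cat                           # positive frames use the strand as given
  fi | cut -c "$(( skip + 1 ))-" | sed "s/.../& /g" | sed "s/t/u/g"
}

# Made-up sample sequence, used only for illustration.
printf 'atgcgtaa\n' | codons 1     # prints aug cgu aa
printf 'atgcgtaa\n' | codons 2     # prints ugc gua a
printf 'atgcgtaa\n' | codons -1    # prints uua cgc au
```

Appending sed -f genetic-code.sed to any of these calls would complete the translation in the same way as the pipelines above.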
XMLPipeDB Match Practice
- There are two unique matches: go:0009165 and go:0009165. The pattern go:0009165 appears twice and the pattern go:0009165 appears once in the file. These patterns represent gene ontology terms, which describe the biological function that the product of a gene is involved in.
- There are two unique matches: "james k.d." and "james a.a.". The pattern "james k.d." appears 8238 times while the pattern "james a.a." appears only once in the file. These patterns probably designate the names of the scientists who determined the genetic sequence of P. falciparum.
- According to Match, the pattern ATG appears 165 times in the file, whereas grep reports 162. The discrepancy makes sense because grep searches one line at a time, while Match searches all of the text as one continuous stream. The three additional matches were probably ATG patterns split across two consecutive lines, which Match could find but grep could not.
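The line-by-line effect described in the last item can be reproduced on a small made-up input. This sketch assumes occurrences are counted with grep -o piped to wc -l, which is not necessarily how the original counts were obtained; the helper name sample and its contents are invented for illustration.

```shell
# Made-up two-line input: one ATG sits within the first line, another is
# split across the line break (AT ends line 1, G starts line 2).
sample() {
  printf 'ccATGccAT\nGcc\n'
}

# Line by line: grep -o prints each match on its own line, wc -l counts them.
sample | grep -o "ATG" | wc -l                # counts 1

# Continuous text: tr -d removes the newlines before grep sees the input.
sample | tr -d '\n' | grep -o "ATG" | wc -l   # counts 2
```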
Ksherbina (talk) 23:17, 12 September 2013 (PDT)