Difference between revisions of "Vkuehn Week 3"
From LMU BioDB 2013
[[user: Vkuehn|Viktoria Kuehn]]
[[Template:Vkuehn]]
[[Category: Journal Entry]]
Latest revision as of 15:47, 26 September 2013
Individual Journal Assignment Week 3
Complement of a Strand
cat prokaryote.txt | sed "y/atcg/tagc/"
Reading Frames:
+1 Reading Frame:
cat prokaryote.txt | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
+2 Reading Frame:
cat prokaryote.txt | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
+3 Reading Frame:
cat prokaryote.txt | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
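To see what the three forward frames look like before translation, here is a minimal sketch on a made-up nine-base sequence. The variable names and the sample sequence are assumptions for illustration only, and the final genetic-code.sed step is left out because that file is not reproduced here.

```shell
# Hypothetical sample sequence standing in for prokaryote.txt.
seq="atgccaagt"

# +1 frame: split into triplets from the first base, then convert t to u.
frame_p1=$(echo "$seq" | sed "s/.../& /g" | sed "s/t/u/g")

# +2 frame: drop the first base, then split and convert.
frame_p2=$(echo "$seq" | sed "s/^.//" | sed "s/.../& /g" | sed "s/t/u/g")

# +3 frame: drop the first two bases, then split and convert.
frame_p3=$(echo "$seq" | sed "s/^..//" | sed "s/.../& /g" | sed "s/t/u/g")

echo "+1: $frame_p1"
echo "+2: $frame_p2"
echo "+3: $frame_p3"
```

Leftover bases at the end of a frame (one or two characters) stay untranslated because they do not form a complete codon.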
Reverse Reading Frames:
-1 Reading Frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed
-2 Reading Frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/t/u/g" | sed "s/^.//g" | sed "s/.../& /g" | sed -f genetic-code.sed
-3 Reading Frame:
cat prokaryote.txt | sed "y/atcg/tagc/" | rev | sed "s/t/u/g" | sed "s/^..//g" | sed "s/.../& /g" | sed -f genetic-code.sed
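The reverse frames can be checked by hand on a short sequence. This is a sketch only: the sample sequence and variable names are made up, the translation step with genetic-code.sed is omitted, and it assumes the rev utility is installed. The offset for the -2 and -3 frames is anchored with ^ so that only the leading bases are dropped.

```shell
# Hypothetical sample sequence standing in for prokaryote.txt.
seq="atgccaagt"

# Reverse complement as mRNA: complement each base, reverse, then t to u.
revcomp=$(echo "$seq" | sed "y/atcg/tagc/" | rev | sed "s/t/u/g")

# -1 frame: triplets starting at the first base of the reverse complement.
frame_m1=$(echo "$revcomp" | sed "s/.../& /g")

# -2 frame: drop one leading base first.
frame_m2=$(echo "$revcomp" | sed "s/^.//" | sed "s/.../& /g")

# -3 frame: drop two leading bases first.
frame_m3=$(echo "$revcomp" | sed "s/^..//" | sed "s/.../& /g")

echo "reverse complement: $revcomp"
echo "-1: $frame_m1"
echo "-2: $frame_m2"
echo "-3: $frame_m3"
```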
XMLPipeDB Match Practice:
What Match command tallies the occurrences of the pattern GO:000916. in the 493.P_falciparum.xml file?
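- Match reported 2 unique matches.
- One match appears once, and the other appears twice.
- I think these are Gene Ontology (GO) term identifiers used to annotate the proteins in the file; the trailing . in the pattern stands for any single character, which is why two different IDs match.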
What Match command tallies the occurrences of the pattern \"James.*\" in the 493.P_falciparum.xml file?
- Match reported 2 unique matches.
- One match appears 8238 times, and the other appears once.
- I think these matches are names, probably the authors credited with the protein and gene sequence records; the more frequent one may simply be referenced on many more entries.
Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
- Match reported 1 unique match, which appeared 830101 times.
- grep piped to wc reported 502410 lines, 502410 words, and 35671048 characters.
- I think the numbers differ because Match counts every occurrence of ATG, while grep prints each line that contains ATG at least once, so wc counts the matching lines rather than the matches. The word count equals the line count because the sequence lines contain no spaces, and the character count is the total length of those lines.
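The difference between counting lines and counting occurrences can be reproduced on a tiny sample. This is a sketch under stated assumptions: the three-line sample is made up, and it relies on the -o option of grep (available in GNU and BSD grep), which prints each match on its own line. It does not claim to reproduce exactly how Match counts.

```shell
# Hypothetical three-line sample standing in for the chromosome file.
sample=$(printf 'ATGCCATGA\nCCCGGG\nTTATGTT\n')

# Lines that contain ATG at least once (what grep piped to wc -l reports).
line_count=$(printf '%s\n' "$sample" | grep "ATG" | wc -l | tr -d ' ')

# Every occurrence of ATG: -o emits one output line per match.
match_count=$(printf '%s\n' "$sample" | grep -o "ATG" | wc -l | tr -d ' ')

echo "lines: $line_count, occurrences: $match_count"
```

Here the first line contains ATG twice, so the line count (2) is lower than the occurrence count (3), the same kind of gap seen between the grep and Match results above.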