Vpachec3 Week 3
The Genetic Code, by Computer
cat prokaryote.txt | sed "y/atcg/tagc/"
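A quick way to sanity-check the complement command is to run the same sed mapping on a short made-up sequence. This is only a sketch: the function name `complement` and the test sequence are illustrative assumptions, not part of the assignment files.

```shell
# Hypothetical helper: complement a lowercase DNA string using the
# same character mapping as the command above (a<->t, c<->g).
complement() {
  printf '%s\n' "$1" | sed 'y/atcg/tagc/'
}

complement "atgcc"   # prints tacgg
```

Because `y` swaps characters one for one, the output always has the same length as the input.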
Reading Frames
+1
cat prokaryote.txt | sed "s/..$//g" | sed "y/t/u/" | sed "s/.../& /g" | sed -f genetic-code.sed
S T I F Q - V R W P K K T I L N L K R C L I P C S A Y N P A A S S A G G I L
+2
cat prokaryote.txt | sed "s/^.//g" | sed "s/.$//g" | sed "y/t/u/" | sed "s/.../& /g" | sed -f genetic-code.sed
L L Y F N R Y D G Q R R Q Y - T - N V A - Y H V P R I T Q P P V P L A A F -
+3
cat prokaryote.txt | sed "s/^..//g" | sed "y/t/u/" | sed "s/.../& /g" | sed -f genetic-code.sed
Y Y I S I G T M A K E D N I E L E T L P N T M F R V - P S R Q F R W R H F N
-1
cat prokaryote.txt | sed "y/tagc/aucg/" | rev | sed "s/.../& /g" | sed "s/..$//g" | sed -f genetic-code.sed
V K M P P A E L A A G L Y A E H G I R Q R F K F N I V F F G H R T Y - N I V
-2
cat prokaryote.txt | sed "y/tagc/aucg/" | rev | sed "s/^.//g" | sed "s/.../& /g" | sed -f genetic-code.sed
L K C R Q R N W R L G Y T R N M V L G N V S S S I L S S L A I V P I E I - -
-3
cat prokaryote.txt | sed "y/tagc/aucg/" | rev | sed "s/^..//g" | sed "s/.../& /g" | sed -f genetic-code.sed
N A A S G T G G W V I R G T W Y - A T F Q V Q Y C L L W P S Y L L K Y S R
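The six commands above all follow one recipe: pick the strand, drop zero, one, or two leading bases, trim to a multiple of three, and split into codons. The sketch below shows that recipe on a toy sequence. The helper names `frame_codons` and `revcomp_rna` and the sequence `atgcatgc` are assumptions made up for illustration, and the final `sed -f genetic-code.sed` translation step is left out so that the example runs without that file.

```shell
# Hypothetical helper: print the codons of reading frame $2 (1, 2, or 3)
# for the lowercase sequence $1. Assumes at least three bases remain
# after the leading bases are dropped.
frame_codons() {
  seq=$(printf '%s' "$1" | cut -c "$2"-)   # drop the bases before this frame
  keep=$(( ${#seq} - ${#seq} % 3 ))        # longest length divisible by 3
  printf '%s\n' "$seq" | cut -c 1-"$keep" | sed 'y/t/u/' | sed 's/.../& /g' | sed 's/ $//'
}

# Hypothetical helper: reverse complement written as RNA, using the
# same y/tagc/aucg/ mapping followed by rev as the minus frames above.
revcomp_rna() {
  printf '%s\n' "$1" | sed 'y/tagc/aucg/' | rev
}

frame_codons atgcatgc 1                    # aug cau
frame_codons atgcatgc 2                    # ugc aug
frame_codons atgcatgc 3                    # gca ugc
frame_codons "$(revcomp_rna atgcatgc)" 2   # cau gca  (frame -2)
```

Frame -2 drops one leading base of the reverse complement and frame -3 drops two, mirroring +2 and +3 on the forward strand.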
XMLPipeDB Match Practice
What Match command tallies the occurrences of the pattern GO:000[567] in the 493.P_falciparum.xml file? How many unique matches are there?
java -jar xmlpipedb-match-1.1.1.jar "GO:000[567]" < 493.P_falciparum.xml
There were 3 unique matches.
How many times does each unique match appear?
go:0007: 113
go:0006: 1100
go:0005: 1371
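A tally like this can probably be cross-checked with standard tools. The sketch below runs on a small made-up file rather than 493.P_falciparum.xml, so the numbers will not match the real counts. It assumes a grep that supports `-o` (GNU and BSD grep both do). Unlike the Match output above, which is shown in lowercase, grep keeps the original case.

```shell
# Toy input, invented for illustration only.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<dbReference type="GO" id="GO:0005622"/>
<dbReference type="GO" id="GO:0006412"/>
<dbReference type="GO" id="GO:0005737"/>
<dbReference type="GO" id="GO:0016020"/>
EOF

# -o prints each match on its own line; sort + uniq -c counts the repeats;
# awk rearranges "count match" into "match: count".
tally=$(grep -o 'GO:000[567]' "$tmp" | sort | uniq -c | awk '{print $2 ": " $1}')
echo "$tally"
rm -f "$tmp"
```

On the toy file this prints `GO:0005: 2` and `GO:0006: 1`. The last line of the file does not match because its ID starts with GO:001.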
Try to find one such occurrence “in situ” within that file. Look at the neighboring content around that occurrence.
Example: <dbReference type="GO" id="GO:0005622">
grep "GO:000[567]" 493.P_falciparum.xml | more
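To see the neighboring content without paging through every match, grep can print the lines around a hit. This is a sketch on a made-up snippet, not the real file. The `-B` and `-A` context options are assumed to be available, as they are in GNU and BSD grep.

```shell
# Toy snippet shaped like a database entry, invented for illustration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<entry>
<name>EXAMPLE</name>
<dbReference type="GO" id="GO:0005622">
<property type="term" value="C:intracellular"/>
</dbReference>
</entry>
EOF

# -B 1 and -A 1 show one line before and after each match;
# head keeps only the first match with its two neighbors.
context=$(grep -B 1 -A 1 'GO:000[567]' "$tmp" | head -n 3)
echo "$context"
rm -f "$tmp"
```

On the real file the same grep options could be pointed at 493.P_falciparum.xml, keeping `head` so the output stays short.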
Based on where you find this occurrence, what kind of information does this pattern represent?
A Gene Ontology (GO) term ID that annotates a gene.
What Match command tallies the occurrences of the pattern \"Yu.*\" in the 493.P_falciparum.xml file?
java -jar xmlpipedb-match-1.1.1.jar \"Yu.*\" < 493.P_falciparum.xml
How many unique matches are there?
3
How many times does each unique match appear?
"yu b.": 1
"yu k.": 228
"yu m.": 1
What information do you think this pattern represents?
I think that Yu is a last name and the letter following it is the first letter of the first name.
Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
What answer does Match give you?
java -jar xmlpipedb-match-1.1.1.jar ATG < hs_ref_GRCh37_chr19.fa
atg: 830101
Total unique matches: 1
What answer does grep + wc give you?
502410 502410 35671048
Explain why the counts are different. (Hint: Make sure you understand what exactly is being counted by each approach.)
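I think the counts differ because the two approaches count different things. Match counts every individual occurrence of ATG, while grep prints each line that contains ATG at least once and wc then reports the lines, words, and characters of that output. A line with several ATGs is counted once by grep + wc but several times by Match. I recall going over what wc meant in class and how to get the different counts.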
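The difference between counting lines and counting occurrences can be seen on a tiny made-up file. This is a sketch using only grep and wc, not Match itself, and it assumes a grep that supports `-o`.

```shell
# Toy sequence file, invented for illustration: the first line has ATG twice.
tmp=$(mktemp)
printf 'ATGATG\nCCCATG\nGGGGGG\n' > "$tmp"

# Lines that contain ATG at least once (what grep + wc reports as lines).
lines=$(grep ATG "$tmp" | wc -l | tr -d ' ')

# Every occurrence of ATG, one per output line.
hits=$(grep -o ATG "$tmp" | wc -l | tr -d ' ')

echo "lines=$lines hits=$hits"   # prints lines=2 hits=3
rm -f "$tmp"
```

The first number corresponds to the grep + wc style of count and the second to counting each match separately.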
Electronic Lab Book
- Go to the magnifying glass symbol at the top of the computer screen and type in 'Terminal'.
- Click on Terminal, type in: ssh my dot cs dot lmu dot edu and press Enter.
- Type in the password and press Enter.
- Then type: cd ~dondi (I personally took a while to figure this out).