Vpachec3 Week 3
Latest revision as of 07:10, 22 September 2015
The Genetic Code, by Computer
cat prokaryote.txt | sed "y/atcg/tagc/"
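As a quick check, the same `y` command can be tried on a short made-up sequence (not taken from prokaryote.txt) to see that each base is swapped for its complement one character at a time:

```shell
# Hypothetical 8-base test sequence, not taken from prokaryote.txt
seq="atgcatta"

# y/atcg/tagc/ maps a->t, t->a, c->g, g->c in a single pass
complement=$(echo "$seq" | sed "y/atcg/tagc/")
echo "$complement"
```

For this input the command prints tacgtaat.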
Reading Frames
+1
cat prokaryote.txt | sed "s/..$//g" | sed "y/t/u/" | sed "s/.../& /g" | sed -f genetic-code.sed
S T I F Q - V R W P K K T I L N L K R C L I P C S A Y N P A A S S A G G I L
+2
cat prokaryote.txt | sed "s/^.//g" | sed "s/.$//g" | sed "y/t/u/" | sed "s/.../& /g" | sed -f genetic-code.sed
L L Y F N R Y D G Q R R Q Y - T - N V A - Y H V P R I T Q P P V P L A A F -
+3
cat prokaryote.txt | sed "s/^..//g" | sed "y/t/u/" | sed "s/.../& /g" | sed -f genetic-code.sed
Y Y I S I G T M A K E D N I E L E T L P N T M F R V - P S R Q F R W R H F N
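The three forward frames differ only in how many leading bases are dropped before the sequence is split into codons. A minimal sketch of that idea follows, assuming a single-line lowercase sequence. The function name `codons_forward` and the test sequence are made up for illustration, and the translation step with genetic-code.sed is left out.

```shell
# Hypothetical helper: print the codons of frame +1, +2, or +3 for a DNA string.
# Assumes lowercase input on one line; translation via genetic-code.sed is omitted.
codons_forward() {
  frame=$1                 # 1, 2, or 3
  dna=$2
  start=$frame             # frame +N starts reading at base N
  # Keep bases from position $start onward, transcribe t -> u, split into triplets
  echo "$dna" | cut -c "${start}-" | sed "y/t/u/" | sed "s/.../& /g"
}

codons_forward 1 "atgcattag"
codons_forward 2 "atgcattag"
codons_forward 3 "atgcattag"
```

Unlike the commands above, this sketch leaves any incomplete codon at the end of the line in place instead of trimming it.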
-1
cat prokaryote.txt | sed "y/tagc/aucg/" | rev | sed "s/.../& /g" | sed "s/..$//g" | sed -f genetic-code.sed
V K M P P A E L A A G L Y A E H G I R Q R F K F N I V F F G H R T Y - N I V
-2
cat prokaryote.txt | sed "y/tagc/aucg/" | rev | sed "s/^.//g" | sed "s/.../& /g" | sed -f genetic-code.sed
L K C R Q R N W R L G Y T R N M V L G N V S S S I L S S L A I V P I E I - -
-3
cat prokaryote.txt | sed "y/tagc/aucg/" | rev | sed "s/^..//g" | sed "s/.../& /g" | sed -f genetic-code.sed
N A A S G T G G W V I R G T W Y - A T F Q V Q Y C L L W P S Y L L K Y S R
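The reverse frames all start from the reverse complement of the strand. The sketch below tries the complement-and-reverse step on a short made-up sequence (not taken from prokaryote.txt), assuming `rev` is available as it is in the commands above:

```shell
# Hypothetical 9-base test sequence, not taken from prokaryote.txt
seq="atgcattag"

# Complement and transcribe in one step (t->a, a->u, g->c, c->g), then reverse
revcomp_rna=$(echo "$seq" | sed "y/tagc/aucg/" | rev)
echo "$revcomp_rna"

# Frame -1 codons: split the reverse complement into triplets
echo "$revcomp_rna" | sed "s/.../& /g"
```

Dropping one or two leading bases from `revcomp_rna` before the split would give frames -2 and -3.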
XMLPipeDB Match Practice
What Match command tallies the occurrences of the pattern GO:000[567] in the 493.P_falciparum.xml file?
How many unique matches are there?
There were 3 unique matches.
How many times does each unique match appear?
go:0007: 113
go:0006: 1100
go:0005: 1371
Try to find one such occurrence “in situ” within that file. Look at the neighboring content around that occurrence.
Example: <dbReference type="GO" id="GO:0005622">
grep "GO:000[567]" 493.P_falciparum.xml | more
Based on where you find this occurrence, what kind of information does this pattern represent?
The pattern represents a Gene Ontology (GO) term ID associated with a gene.
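A similar tally can be approximated with standard tools instead of Match. This is only a sketch run on a few made-up sample lines, not on the real file, and it assumes a grep that supports `-o`. Apart from GO:0005622, the IDs are invented for illustration.

```shell
# Hypothetical sample lines standing in for 493.P_falciparum.xml
sample='<dbReference type="GO" id="GO:0005622">
<dbReference type="GO" id="GO:0006412">
<dbReference type="GO" id="GO:0005737">
<dbReference type="GO" id="GO:0007165">'

# -o prints each match on its own line; sort + uniq -c tallies identical matches
tally=$(printf '%s\n' "$sample" | grep -o "GO:000[567]" | sort | uniq -c)
echo "$tally"
```

Each output line is a count followed by the matched text, so the number of lines is the number of unique matches.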
What Match command tallies the occurrences of the pattern \"Yu.*\" in the 493.P_falciparum.xml file?
java -jar xmlpipedb-match-1.1.1.jar \"Yu.*\" < 493.P_falciparum.xml
How many unique matches are there?
3
How many times does each unique match appear?
"yu b.": 1
"yu k.": 228
"yu m.": 1
What information do you think this pattern represents?
I think that Yu is a last name and the letter following it is the first initial of the first name.
Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
What answer does Match give you?
java -jar xmlpipedb-match-1.1.1.jar ATG < hs_ref_GRCh37_chr19.fa
atg: 830101
Total unique matches: 1
What answer does grep + wc give you?
502410 502410 35671048
Explain why the counts are different. (Hint: Make sure you understand what exactly is being counted by each approach.)
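I think wc is reporting the number of lines, words, and characters in grep's output, which is what we went over in class. grep prints each matching line only once, so 502410 is the number of lines that contain ATG at least once, while Match counts every individual occurrence of the pattern (830101). A line with several ATGs counts once for grep + wc but several times for Match.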
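The difference between counting matching lines and counting individual matches can be seen on a small made-up sample. This is a sketch that assumes a grep supporting `-o`, and the three sample lines are invented, not taken from hs_ref_GRCh37_chr19.fa.

```shell
# Hypothetical 3-line sample standing in for hs_ref_GRCh37_chr19.fa
sample='ATGCCATGA
CCCGGGTTT
TTATGATGATGC'

# Lines that contain ATG at least once (what grep piped to wc -l counts)
line_count=$(printf '%s\n' "$sample" | grep "ATG" | wc -l | tr -d ' ')

# Every individual occurrence of ATG (-o prints one match per output line)
match_count=$(printf '%s\n' "$sample" | grep -o "ATG" | wc -l | tr -d ' ')

echo "lines with ATG: $line_count"
echo "occurrences of ATG: $match_count"
```

Here two lines contain ATG but the pattern occurs five times, which is the same kind of gap seen between the grep + wc and Match results.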
Electronic Lab Book
- Click the magnifying glass symbol at the top of the computer screen and type in 'Terminal'
- Open Terminal, type in: ssh my dot cs dot lmu dot edu and press Enter
- Type in your password and press Enter
- (This took me a while to figure out.) Then type: cd ~dondi/xmlpipedb/data to move into the data directory
- For each section of the assignment, there were different files to be accessed. See individual questions above.