Difference between revisions of "Jwoodlee Week 3"
Revision as of 18:34, 21 September 2015
Electronic Lab Notebook
Use ssh to log into my.cs.lmu.edu with your username, then enter your password.
Complement of a Strand
Locate the file prokaryote.txt in ~dondi/xmlpipedb/data, and enter the following command:
cat prokaryote.txt | sed "y/actg/tgac/"
This will yield prokaryote.txt’s complementary DNA strand.
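As a quick sanity check of the complement step, here is a minimal sketch that runs the same sed expression on a short made-up sequence instead of prokaryote.txt. The function name complement and the test sequence are my own, chosen only for illustration.

```shell
# Complement a lowercase DNA sequence read from standard input:
# a<->t and c<->g, using the same y command as above.
complement() {
  sed "y/actg/tgac/"
}

# Made-up 4-base sequence, not taken from prokaryote.txt.
echo "atgc" | complement   # prints: tacg
```

Note that the y command only swaps the letters it is given, so uppercase bases or other characters would pass through unchanged.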
Reading Frames
These commands are more complicated than the one for Complement of a Strand. This is essentially what I had to accomplish:
take the sequence file, replace the t’s with u’s, break up the sequence into groups of 3, use genetic-code.sed as the translation “chart”, and then eliminate any leftover nucleotides. For the other reading frames, I just delete the first one or two nucleotides.
After lots of googling, I came up with this basic outline in the terminal:
cat prokaryote.txt | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed
For different reading frames, insert sed "s/^.//g" or sed "s/^..//g" as the next stage of the pipeline after cat prokaryote.txt. To use a different DNA sequence, replace prokaryote.txt with the name of that file. The following commands are written out exactly and should return the correct output. Enter these commands into the terminal after navigating to nfs/home/dondi/xmlpipedb/data/.
+1 cat prokaryote.txt | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[acug]//g"
+2 cat prokaryote.txt | sed "s/^.//g" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[acug]//g"
+3 cat prokaryote.txt | sed "s/^..//g" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[acug]//g"
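To see what each stage of the pipeline does, the following sketch runs the same steps on a short made-up sequence. Since the contents of genetic-code.sed are not reproduced here, the five codon rules below are a hypothetical stand-in for it (including the choice of "-" for a stop codon), and translate_demo is a name I made up for illustration.

```shell
# Hypothetical stand-in for "sed -f genetic-code.sed": only five codons,
# just enough to translate the made-up sequence below.
translate_demo() {
  sed "s/t/u/g" |          # DNA letters to RNA letters
    sed "s/.../& /g" |     # split into codons, one space after each
    sed -e "s/aug/M/g" -e "s/gcu/A/g" -e "s/uaa/-/g" \
        -e "s/ugg/W/g" -e "s/cuu/L/g" |
    sed "s/[acug]//g"      # drop leftover bases that did not form a codon
}

# Frame +1 of a made-up 9-base sequence.
echo "atggcttaa" | translate_demo                  # prints: M A -

# Frame +2: drop the first base, then translate.
echo "atggcttaa" | sed "s/^.//" | translate_demo   # prints: W L
```

In the +2 case the last two bases ("aa") do not fill a codon, so the final sed removes them; each output line still ends with the space that followed the last codon.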
For the reverse frames (-1, -2, -3), I first took the complement of the DNA using sed "y/actg/tgac/" and then reversed it with rev, so that the complementary strand is translated from the proper side (5' --> 3').
-1 cat prokaryote.txt | sed "y/actg/tgac/" | rev | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[acug]//g"
-2 cat prokaryote.txt | sed "y/actg/tgac/" | rev | sed "s/t/u/g" | sed "s/^.//g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[acug]//g"
-3 cat prokaryote.txt | sed "y/actg/tgac/" | rev | sed "s/t/u/g" | sed "s/^..//g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[acug]//g"
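The complement-then-reverse step used by the -1, -2, and -3 commands can be checked on its own with a short made-up sequence. This is only a sketch: the function name revcomp and the test sequence are my own and do not come from prokaryote.txt.

```shell
# Reverse complement of a lowercase DNA sequence from standard input:
# complement each base, then reverse the line so it reads 5' to 3'.
revcomp() {
  sed "y/actg/tgac/" | rev
}

# Made-up 9-base sequence.
echo "atggcttaa" | revcomp   # prints: ttaagccat
```

Because rev reverses each line separately, this assumes the sequence sits on a single line, as the commands above also do.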
I checked the results with the ExPASy Translate tool.
XMLPipeDB Match Practice
For your convenience, the XMLPipeDB Match Utility (xmlpipedb-match-1.1.1.jar) has been installed in the ~dondi/xmlpipedb/data directory alongside the other practice files. Use this utility to answer the following questions:
- What Match command tallies the occurrences of the pattern GO:000[567] in the 493.P_falciparum.xml file?
  - How many unique matches are there?
  - How many times does each unique match appear?
- Try to find one such occurrence “in situ” within that file. Look at the neighboring content around that occurrence.
  - Describe how you did this.
  - Based on where you find this occurrence, what kind of information does this pattern represent?
- What Match command tallies the occurrences of the pattern \"Yu.*\" in the 493.P_falciparum.xml file?
  - How many unique matches are there?
  - How many times does each unique match appear?
  - What information do you think this pattern represents?
- Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
  - What answer does Match give you?
  - What answer does grep + wc give you?
  - Explain why the counts are different. (Hint: Make sure you understand what exactly is being counted by each approach.)
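Regarding the hint in the last question, one thing worth checking is what grep piped into wc actually counts. The sketch below uses three made-up lines rather than the chromosome file, and the two function names are my own. It only illustrates grep and wc behavior; which of the two numbers Match's count corresponds to is not established here.

```shell
# Count the lines that contain the pattern at least once.
count_matching_lines() {
  grep "$1" | wc -l | tr -d ' '
}

# Count every non-overlapping occurrence: grep -o prints each match
# on its own line, so wc -l then counts matches rather than lines.
count_occurrences() {
  grep -o "$1" | wc -l | tr -d ' '
}

# Made-up input: the first line contains ATG twice.
printf 'ATGATG\nCCC\nATG\n' | count_matching_lines ATG   # prints: 2
printf 'ATGATG\nCCC\nATG\n' | count_occurrences ATG      # prints: 3
```

The tr call only strips the padding that some versions of wc put in front of the number.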
BIOL 367, Fall 2015, User Page, Team Page