Difference between revisions of "Nanguiano Week 3"
(→Reading Frames: clarified the divided lines)
(added more commands, and added part 2 of the assignment)
Revision as of 22:26, 15 September 2015
The Genetic Code, by Computer
Connect to the my.cs.lmu.edu workstation as shown in class and do the following exercises from there.
To prepare for the assignment, I ran the following commands:
 ssh my.cs.lmu.edu -l nanguia1
 mkdir biodb
 cat >"sequence_file.txt"
 agcggtatac
 cd biodb
 mkdir week3
 mv ../sequence_file.txt week3
 cd ~dondi/xmlpipedb/data
 cp genetic-code.sed ~nanguia1/biodb/week3
 cp xmlpipedb-match-1.1.1.jar ~nanguia1/biodb/week3
 cp 493.P_falciparum.xml ~nanguia1/biodb/week3
 cp hs_ref_GRCh37_chr19.fa ~nanguia1/biodb/week3
 cd ~nanguia1/biodb/week3
Complement of a Strand
Write a sequence of piped text processing commands that, when given a nucleotide sequence, returns its complementary strand.
For a sequence_file.txt file containing the sequence "agcggtatac", the command and its output were as follows:
 cat sequence_file.txt | sed "y/atgc/tacg/"
 tcgccatatg
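The same substitution can be wrapped in a small shell function so that it reads from standard input and slots into longer pipelines. This is only a sketch: the function name complement is hypothetical, and it assumes lowercase input, as in the sample file.

```shell
#!/bin/sh
# complement: read a lowercase nucleotide sequence on stdin and
# write its complementary strand (a<->t, c<->g) to stdout.
# The function name is an illustrative assumption.
complement() {
  sed "y/atgc/tacg/"
}

# Complement of the sample sequence.
printf '%s\n' agcggtatac | complement

# Piping through rev as well gives the reverse complement,
# which is the starting point for the -1, -2 and -3 reading frames.
printf '%s\n' agcggtatac | complement | rev
```

With the sample sequence, the first command prints tcgccatatg and the second prints gtataccgct.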
Reading Frames
Write 6 sets of text processing commands that, when given a nucleotide sequence, returns the resulting amino acid sequence, one for each possible reading frame for the nucleotide sequence. You should have 6 different sets of commands, one for each possible reading frame.
For a sequence_file.txt file containing the sequence "agcggtatac", the commands and their outputs were as follows:
+1
 cat sequence_file.txt | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"
 SGI
+2
 cat sequence_file.txt | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"
 AVY
+3
 cat sequence_file.txt | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"
 RY
The remaining three commands are split across two lines on this wiki because a single line was too long to display correctly. Each was actually entered as one line, without newlines.
-1
 cat sequence_file.txt | sed "y/acgt/tgca/" | rev |
 sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"
 VYR
-2
 cat sequence_file.txt | sed "y/acgt/tgca/" | rev | sed "s/^.//g" |
 sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"
 YTA
-3
 cat sequence_file.txt | sed "y/acgt/tgca/" | rev | sed "s/^..//g" |
 sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"
 IP
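The six pipelines differ only in the strand and the number of leading bases dropped, so they can be folded into one parameterized function. The following is a minimal sketch, not the commands actually submitted. The function name translate and the temporary codon table are hypothetical, and the table is a stand-in covering only the twelve codons that occur in the sample sequence, since the contents of genetic-code.sed are not shown on this page.

```shell
#!/bin/sh
# Stand-in for genetic-code.sed (assumption): one substitution per codon,
# limited to the codons that appear in the sample sequence "agcggtatac".
table=$(mktemp)
trap 'rm -f "$table"' EXIT
cat > "$table" <<'EOF'
s/agc/S/g
s/ggu/G/g
s/aua/I/g
s/gcg/A/g
s/gua/V/g
s/uac/Y/g
s/uau/Y/g
s/cgg/R/g
s/cgc/R/g
s/acc/T/g
s/gcu/A/g
s/ccg/P/g
EOF

# translate SEQUENCE STRAND OFFSET
#   STRAND is + or -, OFFSET is the number of leading bases to drop (0-2).
translate() {
  seq=$1
  if [ "$2" = "-" ]; then
    # Reverse complement for the minus strand.
    seq=$(printf '%s\n' "$seq" | sed "y/acgt/tgca/" | rev)
  fi
  printf '%s\n' "$seq" |
    cut -c "$(($3 + 1))-" |   # drop OFFSET leading bases
    sed "s/.../& /g" |        # split into codons
    sed "s/t/u/g" |           # DNA to RNA
    sed -f "$table" |         # codons to amino acids
    sed "s/ //g" |            # remove the spaces
    sed "s/[acgu]//g"         # drop leftover bases
}

# Print all six reading frames of the sample sequence.
for strand in + -; do
  for offset in 0 1 2; do
    printf '%s%d %s\n' "$strand" "$((offset + 1))" \
      "$(translate agcggtatac "$strand" "$offset")"
  done
done
```

For the sample sequence this prints the same six results as the individual commands: SGI, AVY, RY, VYR, YTA and IP. To use the full table, replace the stand-in file with the real genetic-code.sed.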
Check Your Work
Using the ExPASy Translate Tool (http://web.expasy.org/translate/), I entered my sample DNA sequence, "agcggtatac". The result was as follows:
[[File:NAW3TranslationTest.png]]
XMLPipeDB Match Practice
For your convenience, the XMLPipeDB Match Utility (xmlpipedb-match-1.1.1.jar) has been installed in the ~dondi/xmlpipedb/data directory alongside the other practice files. Use this utility to answer the following questions:
- What Match command tallies the occurrences of the pattern GO:000[567] in the 493.P_falciparum.xml file?
  - How many unique matches are there?
  - How many times does each unique match appear?
- Try to find one such occurrence “in situ” within that file. Look at the neighboring content around that occurrence.
  - Describe how you did this.
  - Based on where you find this occurrence, what kind of information does this pattern represent?
- What Match command tallies the occurrences of the pattern \"Yu.*\" in the 493.P_falciparum.xml file?
  - How many unique matches are there?
  - How many times does each unique match appear?
  - What information do you think this pattern represents?
- Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
  - What answer does Match give you?
  - What answer does grep + wc give you?
  - Explain why the counts are different. (Hint: Make sure you understand what exactly is being counted by each approach.)
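The hint in the last question can be illustrated with grep alone. The sketch below uses a small, made-up FASTA-style file (hypothetical data, not taken from hs_ref_GRCh37_chr19.fa) to show that grep piped into wc -l counts matching lines, while grep -o piped into wc -l counts individual occurrences. If Match tallies each occurrence of the pattern, as the wording of the questions suggests, its count would correspond to the second number rather than the first.

```shell
#!/bin/sh
# Hypothetical sample data: a header line plus three sequence lines.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' '>hypothetical_sequence' \
  'ATGCCATGAA' \
  'CCCGGGTTTA' \
  'TTATGATGAT' > "$sample"

# Number of LINES that contain ATG at least once.
lines=$(grep 'ATG' "$sample" | wc -l | tr -d ' ')

# Number of OCCURRENCES of ATG: -o prints each match on its own line.
hits=$(grep -o 'ATG' "$sample" | wc -l | tr -d ' ')

printf 'lines containing ATG: %s\n' "$lines"
printf 'occurrences of ATG:   %s\n' "$hits"
```

Here two lines contain ATG but the pattern occurs four times, so the two counts are 2 and 4. Neither approach sees an ATG that is split across a line break in the file.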
 
 
Links
 Nicole Anguiano
 BIOL 367, Fall 2015
