Difference between revisions of "Nanguiano Week 3"

From LMU BioDB 2015
(creating page and adding links)
 
(added initial steps for the assignment as well as assignment text)
Line 1: Line 1:
 
== The Genetic Code, by Computer ==
 
== The Genetic Code, by Computer ==
 +
=== The Genetic Code, by Computer ===
 +
 +
Connect to the ''my.cs.lmu.edu'' workstation as shown in class and do the following exercises from there.
 +
 +
For this exercise, I performed the following series of commands to prepare for the assignment.
 +
ssh my.cs.lmu.edu -l nanguia1 <!-- I also inputted my password -->
 +
mkdir biodb
 +
cat >"sequence_file.txt"
 +
agcggtatac <!-- then I pressed control d to complete the command -->
 +
cd biodb
 +
mkdir week3
 +
mv ~/sequence_file.txt week3
 +
cd ~dondi/xmlpipedb/data
 +
cp genetic-code.sed ~nanguia1/biodb/week3
 +
cd ~nanguia1/biodb/week3
 +
 +
==== Complement of a Strand ====
 +
 +
Write a sequence of piped text processing commands that, when given a nucleotide sequence, returns its complementary strand. In other words, fill in the question marks:
 +
 +
    cat ''sequence_file'' | '''?????'''
 +
    cat "sequence_file" |
 +
 +
For example, if ''sequence_file'' contains:
 +
 +
    agcggtatac
 +
 +
Then your text processing commands should display:
 +
 +
    tcgccatatg
 +
 +
==== Reading Frames ====
 +
 +
Write ''6'' sets of text processing commands that, when given a nucleotide sequence, return the resulting amino acid sequence, one for each possible reading frame of the nucleotide sequence. In other words, fill in the question marks:
 +
 +
    cat ''sequence_file'' | '''?????'''
 +
 +
You should have 6 different sets of commands, one for each possible reading frame. For example, if ''sequence_file'' contains:
 +
 +
    agcggtatac
 +
 +
Then your text processing commands for 5’-3’ frame 1 should display:
 +
 +
    SGI
 +
 +
Your text processing commands for 5’-3’ frame 3 should display:
 +
 +
    RY
 +
 +
...and so on.
 +
 +
* '''Hint 1:''' The 6 sets of commands are very similar to each other.
 +
* '''Hint 2:''' Under the ''~dondi/xmlpipedb/data'' directory in the Keck lab, you will find a file called ''genetic-code.sed''. To save you some typing, this file has already been prepared with the correct sequence of '''sed''' commands for converting any base triplet into the corresponding amino acid. For example, this line in that file: <pre>s/ugc/C/g</pre> ...corresponds to a uracil-guanine-cytosine codon being translated into the amino acid cysteine (C). The trick is to figure out how to use this file to your advantage in the commands that you'll be forming.
 +
 +
==== Check Your Work ====
 +
 +
Fortunately, online tools are available for checking your work; we recommend the ExPASy Translate Tool, sponsored by the same people who run SwissProt. You’re free to use this tool to see if your text processing commands produce the same results.
  
 
== Links ==
 
== Links ==
 
{{Template:Nanguiano}}
 
{{Template:Nanguiano}}

Revision as of 01:30, 15 September 2015

The Genetic Code, by Computer

The Genetic Code, by Computer

Connect to the my.cs.lmu.edu workstation as shown in class and do the following exercises from there.

For this exercise, I performed the following series of commands to prepare for the assignment.

ssh my.cs.lmu.edu -l nanguia1 
mkdir biodb
cat >"sequence_file.txt" 
agcggtatac 
cd biodb 
mkdir week3
mv ~/sequence_file.txt week3
cd ~dondi/xmlpipedb/data
cp genetic-code.sed ~nanguia1/biodb/week3
cd ~nanguia1/biodb/week3

Complement of a Strand

Write a sequence of piped text processing commands that, when given a nucleotide sequence, returns its complementary strand. In other words, fill in the question marks:

   cat sequence_file | ?????
   cat "sequence_file" | 

For example, if sequence_file contains:

   agcggtatac

Then your text processing commands should display:

   tcgccatatg
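
One possible way to fill in the question marks is sed's `y` (transliterate) command, which swaps each base for its partner in a single pass. This is only a sketch, not necessarily the intended answer: it assumes a lowercase DNA sequence on one line, and it writes the example sequence to a temporary file (a stand-in for sequence_file) so that it can be run anywhere.

```shell
# Stand-in for sequence_file: a temporary file holding the example sequence.
seq_file=$(mktemp)
printf 'agcggtatac\n' > "$seq_file"

# y/atcg/tagc/ maps a->t, t->a, c->g, g->c, one character at a time.
complement=$(cat "$seq_file" | sed 'y/atcg/tagc/')
echo "$complement"

rm -f "$seq_file"
```

For the example sequence this prints tcgccatatg, matching the expected output above.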

Reading Frames

Write 6 sets of text processing commands that, when given a nucleotide sequence, return the resulting amino acid sequence, one for each possible reading frame of the nucleotide sequence. In other words, fill in the question marks:

   cat sequence_file | ?????

You should have 6 different sets of commands, one for each possible reading frame. For example, if sequence_file contains:

   agcggtatac

Then your text processing commands for 5’-3’ frame 1 should display:

   SGI

Your text processing commands for 5’-3’ frame 3 should display:

   RY

...and so on.

  • Hint 1: The 6 sets of commands are very similar to each other.
  • Hint 2: Under the ~dondi/xmlpipedb/data directory in the Keck lab, you will find a file called genetic-code.sed. To save you some typing, this file has already been prepared with the correct sequence of sed commands for converting any base triplets into the corresponding amino acid. For example, this line in that file:
    s/ugc/C/g
    ...corresponds to a uracil-guanine-cytosine codon being translated into the amino acid cysteine (C). The trick is to figure out how to use this file to your advantage in the commands that you'll be forming.
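
The hints suggest a pipeline along these lines: drop the leading bases to select the frame, convert t to u, split the sequence into triplets, apply the codon table with `sed -f`, and strip whatever is left untranslated. The sketch below is one way to do that under stated assumptions. The function name `translate_frame` is hypothetical, and because genetic-code.sed is only available in the Keck lab, the example builds a small stand-in table containing just the codons needed for the sample sequence. It assumes the real file uses the same `s/codon/Letter/g` form shown in Hint 2.

```shell
# Hypothetical stand-in for genetic-code.sed: only the codons that occur
# in frames 1-3 of the example sequence, in the same form as Hint 2.
code_file=$(mktemp)
cat > "$code_file" <<'EOF'
s/agc/S/g
s/ggu/G/g
s/aua/I/g
s/cgg/R/g
s/uau/Y/g
s/gcg/A/g
s/gua/V/g
s/uac/Y/g
EOF

# translate_frame FRAME CODE_FILE
# Reads a lowercase DNA sequence on stdin and prints the amino acids for
# the given 5'-3' reading frame (1, 2, or 3).
translate_frame() {
  cut -c "$1-" |          # start at base 1, 2, or 3
    sed 's/t/u/g' |       # DNA -> mRNA alphabet
    sed 's/.../& /g' |    # put a space after every complete triplet
    sed -f "$2" |         # codon -> amino acid letter
    sed 's/[acgu ]//g'    # drop leftover bases and the spaces
}

frame1=$(printf 'agcggtatac\n' | translate_frame 1 "$code_file")
frame3=$(printf 'agcggtatac\n' | translate_frame 3 "$code_file")
echo "$frame1"
echo "$frame3"

rm -f "$code_file"
```

With the example sequence, frame 1 gives SGI and frame 3 gives RY, as in the assignment text. Splitting into space-separated triplets before applying the table keeps a substitution from matching across a codon boundary. The three 3'-5' frames should follow the same pattern if the sequence is first reversed and complemented (for example with `rev | sed 'y/atcg/tagc/'`, assuming `rev` is installed) before being piped into the same steps.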

Check Your Work

Fortunately, online tools are available for checking your work; we recommend the ExPASy Translate Tool, sponsored by the same people who run SwissProt. You’re free to use this tool to see if your text processing commands produce the same results.

Links

Nicole Anguiano
BIOL 367, Fall 2015
