Difference between revisions of "Kzebrows Week 3"

From LMU BioDB 2015

Revision as of 23:09, 21 September 2015

Complement of a Strand

I decided to use the E. coli file for practice in the first part of the assignment. My first instinct was to replace the letters with a series of sed substitutions (s commands): A with T, T with A, C with G, and G with C. However, I realized that each substitution runs over the whole sequence in turn, so a later rule would also change the letters that an earlier rule had just produced (every A turned into T would be turned back into A along with the original T's), defeating the purpose of the command. I then remembered that I can use sed "y/<original characters>/<new characters>/" to replace every character in one pass.
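
As a quick illustration of that difference, here is a minimal sketch on a made-up four-letter sequence (the value of seq is only an example, not taken from the assignment file):

```shell
# Example input, chosen for illustration only.
seq="atcg"

# Chained substitutions: each rule sees the output of the previous one,
# so letters that were already swapped get changed again.
chained=$(echo "$seq" | sed "s/a/t/g" | sed "s/t/a/g" | sed "s/c/g/g" | sed "s/g/c/g")

# Transliteration: every character is mapped exactly once.
translit=$(echo "$seq" | sed "y/atcg/tagc/")

echo "chained:  $chained"   # aacc (wrong)
echo "translit: $translit"  # tagc (the complement)
```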

I opened the prokaryote file using cat infA-E.coli-K12.txt, which printed the DNA sequence of the mRNA-like strand. Because this strand is read from 5' to 3', its complement comes out written from 3' to 5'. I piped the sequence into the sed rule listing the characters I wanted to replace, which gave me the complementary strand. The complete command was

cat infA-E.coli-K12.txt | sed "y/atcg/tagc/"

Reading Frames

First I opened the file and replaced all of the T's with U's using

sed "s/t/u/g" 

This gave me the mRNA sequence that the DNA would be transcribed into. It was still one long string of letters, so I used

sed "s/.../& /g" 

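to insert a space after every three letters, separating the sequence into codons. Here ... matches any three characters and & stands for the text that was matched, so each codon is rewritten as itself followed by a space.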
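
The two steps so far can be tried on a short sequence. This is only a sketch: the value of seq is an invented example, deliberately 11 letters long to show what happens to bases left over at the end.

```shell
# Invented example sequence (11 bases, so the last codon is incomplete).
seq="atggccaaaga"

# Replace t with u, then insert a space after every complete group of three.
codons=$(echo "$seq" | sed "s/t/u/g" | sed "s/.../& /g")

echo "$codons"   # aug gcc aaa ga
```

The final "ga" is not followed by a space because ... only matches complete groups of three characters.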

Then I looked at the file genetic-code.sed, which lists each codon together with the one-letter code of its amino acid. To apply that information to the sequence from infA-E.coli-K12.txt, the file has to be passed to sed with the -f option as the last step of the pipeline. The final string of commands for the +1 reading frame then looks like this:

cat infA-E.coli-K12.txt | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed
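
To see the whole pipeline work without the course files, here is a hedged sketch. It uses a tiny stand-in rule file with only four codons; the contents of the real genetic-code.sed are not shown here, so the rule format and the "-" for a stop codon are assumptions, and seq is an invented example.

```shell
# Hypothetical miniature stand-in for genetic-code.sed (assumed format:
# one sed substitution per codon; the real file is not reproduced here).
rules=$(mktemp)
cat > "$rules" <<'EOF'
s/aug/M/g
s/gcc/A/g
s/aaa/K/g
s/uaa/-/g
EOF

# Invented example sequence: four complete codons.
seq="atggccaaataa"

# Same three steps as above: t -> u, split into codons, apply the rule file.
protein=$(echo "$seq" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f "$rules")

echo "$protein"   # M A K - (followed by a trailing space)

rm -f "$rules"
```

Because the codons are separated by spaces before the rule file runs, a rule can never match across the boundary between two neighboring codons.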