Jkuroda Week 3
Complement of a Strand
To get the complement, each nucleotide has to be replaced with its complementary base (a with t, t with a, c with g, g with c), so I used sed's y command, which maps characters one-for-one: "y/atcg/tagc/".
cat sequence file | sed "y/atcg/tagc/"
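As a quick check of the substitution, here is a minimal sketch that runs the same y command on a short made-up sequence instead of the sequence file. The function name complement and the test sequence are illustrative assumptions.

```shell
# Wrap the one-for-one base substitution so it can be reused in a pipeline.
complement() {
  sed "y/atcg/tagc/"
}

# Hypothetical 10-base sequence, not taken from the assignment data.
printf 'agcggtatac\n' | complement   # prints tcgccatatg
```

Note that y only maps the lowercase characters listed, so uppercase bases would pass through unchanged.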
Reading Frames
For the +1 reading frame, I first replaced the t's with u's and then tried to let the genetic-code.sed file do the rest. This did not work: sed applies the rules in that file one after another, and each rule replaced its codon wherever it appeared in the line, not just at triplet boundaries. I was left with a messy line of lonely nucleotides with the amino acid abbreviations between them. I realized that I could solve this by separating each base triplet with a space before translating. That fixed the reading frame, but a couple of stray nucleotides remained at the end of the line, so I added a final sed command to remove any leftover nucleotides.
+1 cat sequence file | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[aucg]//g"
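The effect of the spacing step can be sketched on a short made-up sequence. Because genetic-code.sed itself is not reproduced here, the example writes a tiny stand-in rule file with four codons. The assumption that the real file consists of global substitution rules of this form, the stand-in contents, and the use of - for the stop codon are all hypothetical.

```shell
# Stand-in for genetic-code.sed (assumed format: one global rule per codon).
rules=$(mktemp)
trap 'rm -f "$rules"' EXIT
cat > "$rules" <<'EOF'
s/gcc/A/g
s/aug/M/g
s/ccg/P/g
s/uaa/-/g
EOF

# Without spacing: rules match codons that straddle the reading frame.
translate_unspaced() {
  sed "s/t/u/g" | sed -f "$rules"
}

# With spacing: triplets are isolated first, leftover bases removed last.
translate_spaced() {
  sed "s/t/u/g" | sed "s/.../& /g" | sed -f "$rules" | sed "s/[aucg]//g"
}

printf 'atgccgtaac\n' | translate_unspaced   # prints auAg-c
printf 'atgccgtaac\n' | translate_spaced     # prints "M P - " (with a trailing space)
```

In the unspaced run, the gcc rule fires on bases 3 to 5, which belong to two different codons in the +1 frame, and that is what produces the mix of letters and stray bases.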
For the +2 reading frame, the process is the same with one addition: a sed command that deletes the first character of the sequence before it is split into triplets.
+2 cat sequence file | sed "s/^.//" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[aucg]//g"
Similarly, for the +3 reading frame, I deleted the first two characters from the beginning of the sequence.
+3 cat sequence file | sed "s/^..//" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[aucg]//g"
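The three forward frames differ only in how many leading bases are dropped before splitting, which the following sketch makes explicit. The helper name split_frame and the sample sequence are illustrative assumptions, and the translation step is left out so the example does not depend on genetic-code.sed.

```shell
# Split stdin into triplets for forward frame 1, 2, or 3 (passed as $1).
split_frame() {
  case "$1" in
    1) cat ;;               # frame +1: keep every base
    2) sed "s/^.//" ;;      # frame +2: drop the first base
    3) sed "s/^..//" ;;     # frame +3: drop the first two bases
  esac | sed "s/.../& /g"
}

for frame in 1 2 3; do
  printf 'atgccgtaac\n' | split_frame "$frame"
done
# prints (the second line ends with a trailing space):
# atg ccg taa c
# tgc cgt aac
# gcc gta ac
```

Any group shorter than three bases at the end of a line is the stray remainder that the final sed "s/[aucg]//g" step removes after translation.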
For the next three reading frames, which are read from the opposite strand, I remembered that there was a handy rev command for reversing the characters in a sequence. Reversing alone is not enough, though: the opposite strand is the reverse complement, so I placed rev and the complement command from the first section before the usual sequence of commands. Since rev reads from the pipe, it does not need the file name again.
-1 cat sequence file | rev | sed "y/atcg/tagc/" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[aucg]//g"
With the reverse complement in place, the rest of the commands match the previous reading frames: the first character is deleted for -2, and the first two characters are deleted for -3.
-2 cat sequence file | rev | sed "y/atcg/tagc/" | sed "s/^.//" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[aucg]//g"
-3 cat sequence file | rev | sed "y/atcg/tagc/" | sed "s/^..//" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed | sed "s/[aucg]//g"
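The reverse-strand frames are conventionally read from the reverse complement of the sequence, which can be sketched by chaining rev with the complement substitution from the first section. The function name reverse_complement and the sample sequence are illustrative assumptions.

```shell
# Reverse the line, then swap each base for its complement.
reverse_complement() {
  rev | sed "y/atcg/tagc/"
}

# Hypothetical 10-base sequence, not taken from the assignment data.
printf 'atgccgtaac\n' | reverse_complement   # prints gttacggcat
```

From that output, the -1, -2, and -3 frames follow the same steps as +1, +2, and +3: drop zero, one, or two leading bases, then split into triplets and translate.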