Week 3 E-notes Eyanosch

From LMU BioDB 2015
Revision as of 06:17, 22 September 2015 by Eyanosch (Talk | contribs) (writing notes on how the hw was attempted, ended up getting a little stuck but luckily my week partners work helped.)


To produce the complementary strand's nucleotide sequence:

cat sequence_file | sed "y/atgc/tacg/"

This follows the rule y/<original characters>/<new characters>/, where each original character is replaced by the new character in the same position.

If sequence_file contains agcggtatac, the output would be tcgccatatg.

  • Basically, I wanted the computer to read the file and replace each individual letter with its complementary base: A with T (and vice versa), and C with G (and vice versa).
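As a quick check of the command above, here is a minimal sketch that can be run anywhere. It assumes a throwaway sequence_file created on the spot with the example sequence from these notes; the variable name complement is only for illustration.

```shell
# Create a sample sequence_file (assumption: one lowercase sequence per line).
printf 'agcggtatac\n' > sequence_file

# y/<original characters>/<new characters>/ swaps a<->t and g<->c,
# one character at a time.
complement=$(cat sequence_file | sed "y/atgc/tacg/")
echo "$complement"   # prints tcgccatatg
```

Note that y only touches the characters listed, so uppercase letters would pass through unchanged unless they are added to both lists.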

Write 6 sets of text processing commands that, when given a nucleotide sequence, return the resulting amino acid sequence, one for each possible reading frame of the nucleotide sequence.

When looking at this problem, there are a few things that need to be done.

The nucleotide sequence must first be read in and converted into RNA. This can be done by replacing the T's with U's.
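A minimal sketch of the T-to-U step, assuming the sequence is lowercase as in the earlier example (the variable name rna is only for illustration):

```shell
# s/t/u/g replaces every t on the line with u; without the trailing g,
# only the first t would be changed.
rna=$(printf 'agcggtatac\n' | sed "s/t/u/g")
echo "$rna"   # prints agcgguauac
```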

Then the nucleotide sequence must be broken into its component codons, probably starting with the +1 reading frame.
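One possible way to split the sequence into codons with sed is sketched below. This is an assumption about how the step could be done, not necessarily the approach expected for the assignment. The idea is to match three characters at a time and put a space after each group; the +2 and +3 frames drop one or two leading characters first.

```shell
rna="agcgguauac"   # the example sequence after replacing t with u

# "..." matches any three characters; "& " writes the match back plus a space.
frame1=$(printf '%s\n' "$rna" | sed "s/.../& /g")
# Delete the first character (or first two) to shift the reading frame.
frame2=$(printf '%s\n' "$rna" | sed "s/^.//"  | sed "s/.../& /g")
frame3=$(printf '%s\n' "$rna" | sed "s/^..//" | sed "s/.../& /g")

echo "$frame1"   # prints agc ggu aua c
echo "$frame2"   # prints gcg gua uac (followed by a trailing space)
echo "$frame3"   # prints cgg uau ac
```

Leftover characters at the end (the lone c in the +1 frame) do not form a full codon and would need to be dropped or ignored later. The three remaining frames are not shown here.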

Next, the codons must be read and converted into their specific amino acids. For this we need Dondi's ~dondi/xmlpipedb/data directory, in which genetic-code.sed has the conversions already written. I'm not entirely sure how to invoke the command, so I took a look at my partner Brandon's page for help; this is the part that I have been stuck on. The code written there matches the form sed -f <file with rules>.
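To illustrate how sed -f <file with rules> works, here is a small sketch using a made-up rules file. The file toy-code.sed and its three rules are hypothetical and only cover the codons in the example; I have not reproduced the contents of genetic-code.sed, which may be formatted differently.

```shell
# Hypothetical rules file with one substitution per line
# (agc = Ser, ggu = Gly, aua = Ile in the standard genetic code).
cat > toy-code.sed <<'EOF'
s/agc/S/g
s/ggu/G/g
s/aua/I/g
EOF

# sed -f reads its commands from the file and applies each one to the input.
protein=$(printf 'agc ggu aua c\n' | sed -f toy-code.sed)
echo "$protein"   # prints S G I c
```

On the class server the same form would presumably be sed -f ~dondi/xmlpipedb/data/genetic-code.sed at the end of the pipeline, though that is untested here.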