Lenaolufson Week 3

From LMU BioDB 2015
Revision as of 04:13, 22 September 2015 by Lenaolufson (Talk | contribs) (filled out first parts of journal page, but need to go back and look to see how to do the numbering correctly)

The Genetic Code, by Computer

Complement of a Strand

Write a sequence of piped text processing commands that, when given a nucleotide sequence, returns its complementary strand. In other words, fill in the question marks:

cat sequence_file.txt | sed "y/atgc/tacg/"

tcgccatatg
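To check the complement command without relying on sequence_file.txt, the same sed expression can be wrapped in a small function and fed the sequence directly. This is only a sketch: the function name complement is made up for illustration, and it assumes lowercase input, as in the sequence used on this page.

```shell
# Hypothetical helper: complement a lowercase DNA sequence read from standard input.
complement() {
  sed "y/atgc/tacg/"
}

# Pipe the example sequence in directly instead of reading it from a file.
echo "agcggtatac" | complement
# prints tcgccatatg
```

Note that this gives the complement only; the negative reading frames also need the strand reversed with rev.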

Reading Frames

Write 6 sets of text processing commands that, when given a nucleotide sequence, return the resulting amino acid sequence, one set for each possible reading frame of the nucleotide sequence. In other words, fill in the question marks:

The sequence used was "agcggtatac"

  • +1

cat sequence_file.txt | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"

SGI

  • +2

cat sequence_file.txt | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"

AVY

  • +3

cat sequence_file.txt | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"

RY

  • -1

cat sequence_file.txt | sed "y/acgt/tgca/" | rev | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"

VYR

  • -2

cat sequence_file.txt | sed "y/acgt/tgca/" | rev | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"

YTA

  • -3

cat sequence_file.txt | sed "y/acgt/tgca/" | rev | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed | sed "s/ //g" | sed "s/[acgu]//g"

IP
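The six pipelines differ only in whether the strand is reverse-complemented first and in how many leading bases are dropped. A rough sketch of that shared structure is below. The function name codons and its arguments are made up for illustration, and it stops before the genetic-code.sed step so that it runs without that file; it prints the RNA codons of each frame rather than the amino acids.

```shell
# Hypothetical helper: print the complete codons of one reading frame as RNA.
# $1 = lowercase DNA sequence, $2 = frame (1, 2, 3, -1, -2 or -3).
codons() {
  dna=$1
  frame=$2
  case $frame in
    -*)
      # Negative frames read the reverse complement of the strand.
      dna=$(echo "$dna" | sed "y/acgt/tgca/" | rev)
      frame=${frame#-}
      ;;
  esac
  # Start at base number $frame, split into triplets, convert t to u,
  # then drop a trailing incomplete codon and any trailing space.
  echo "$dna" | cut -c "$frame"- | sed "s/.../& /g" | sed "s/t/u/g" |
    sed "s/ [acgu]\{1,2\}$//; s/ $//"
}

for f in 1 2 3 -1 -2 -3; do
  printf '%s: %s\n' "$f" "$(codons agcggtatac "$f")"
done
```

For the sequence used here, frame +1 comes out as "agc ggu aua", which lines up with the SGI result above. Piping the output of codons through sed -f genetic-code.sed and then sed "s/ //g" should give the amino acid sequence, assuming genetic-code.sed maps each lowercase RNA codon to its one-letter amino acid code.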

XMLPipeDB Match Practice

For your convenience, the XMLPipeDB Match Utility (xmlpipedb-match-1.1.1.jar) has been installed in the ~dondi/xmlpipedb/data directory alongside the other practice files. Use this utility to answer the following questions:

  1. What Match command tallies the occurrences of the pattern GO:000[567] in the 493.P_falciparum.xml file?
    • How many unique matches are there?
    • How many times does each unique match appear?
  2. Try to find one such occurrence “in situ” within that file. Look at the neighboring content around that occurrence.
    • Describe how you did this.
    • Based on where you find this occurrence, what kind of information does this pattern represent?
  3. What Match command tallies the occurrences of the pattern \"Yu.*\" in the 493.P_falciparum.xml file?
    • How many unique matches are there?
    • How many times does each unique match appear?
    • What information do you think this pattern represents?
  4. Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
    • What answer does Match give you?
    • What answer does grep + wc give you?
    • Explain why the counts are different. (Hint: Make sure you understand what exactly is being counted by each approach.)
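For the hint in the last question, it helps to see what grep piped into wc actually counts. The sketch below uses a made-up two-line sample rather than the chromosome file, and the function names count_lines and count_matches are hypothetical. Plain grep prints each matching line once, so wc -l counts lines; grep -o prints each match on its own line, so wc -l counts occurrences.

```shell
# Made-up sample: the first line contains ATG twice, the second once.
sample=$(printf 'ATGCCATGA\nTTATGCC\n')

# Hypothetical helper: number of lines containing the pattern at least once.
count_lines() {
  grep "$1" | wc -l
}

# Hypothetical helper: number of individual matches (-o prints one per line).
count_matches() {
  grep -o "$1" | wc -l
}

echo "$sample" | count_lines ATG    # 2 lines contain ATG
echo "$sample" | count_matches ATG  # 3 occurrences of ATG
```

If Match tallies every occurrence (an assumption to check against its actual output), its count would be closer to the grep -o figure than to the plain grep figure. Neither grep form finds an ATG that is split across a line break, which can happen in a FASTA file because the sequence is wrapped over many lines.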