Emilysimso Week 4

From LMU BioDB 2015
Revision as of 22:12, 26 September 2015 by Emilysimso (Answered question 2)

Question 1

  • Used grep "[ct]at[at]at" infA-E.coli-K12.txt to highlight the -10 box
  • Used grep "tt[gt]ac[at]" infA-E.coli-K12.txt to highlight the -35 box
  • Used sed "s/cattat/<minus10box>&<\/minus10box>/g" infA-E.coli-K12.txt | sed "s/tttact/<minus35box>&<\/minus35box>/g" to label the -10 box and the -35 box
  • Added sed -r "s/<\/minus10box>.{11}/<tss>&<\/tss>/g"
    • This did not work: & stands for the whole match, so the <tss> tag opened before </minus10box> and wrapped the closing tag along with the 11 bases
  • Used sed -r "s/<\/minus10box>.{5}/&<\/tss>/g" | sed "s/<\/tss>/<tss>c&/g" instead to add the tss markers
  • Added grep "gagg" to find the rbs
  • Added sed "s/gagg/<rbs>&<\/rbs>/g" to wrap gagg in rbs tags
  • Used grep "atg" to find possible start codons
  • Added sed -r "s/<\/rbs>.{8}/&<\/start_codon>/g" | sed "s/<\/start_codon>/<start_codon>atg&/g" to mark the start codon (atg)
  • Possible stop codons: taa, tag, or tga
  • Used sed "s/.../ & /g" infA-E.coli-K12.txt | grep "taa" | grep "tag" | grep "tga" to find possible stop codons
    • tga is the only possible stop codon
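  • Added sed "1s/tga/<stop_codon>&<\/stop_codon>/g", which labeled every tga on line 1
    • Looked for the first one after the start codon, which is the third tga on the line
  • Changed it to sed "1s/tga/<stop_codon>&<\/stop_codon>/3" to label only that occurrence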
  • Used sed "s/aaaaggt/<terminator>&/g" to mark the first part of the terminator
  • Used grep "gcctttt" infA-E.coli-K12.txt to find the rest of the hairpin
    • Looked at the next four bases: they were tatt
  • Used sed "s/gcctttttatt/&<\/terminator>/g" to mark the end of the terminator
  • Final command: sed "s/cattat/<minus10box>&<\/minus10box>/g" infA-E.coli-K12.txt | sed "s/tttact/<minus35box>&<\/minus35box>/g" | sed -r "s/<\/minus10box>.{5}/&<\/tss>/g" | sed "s/<\/tss>/<tss>c&/g" | sed "s/gagg/<rbs>&<\/rbs>/g" | sed -r "s/<\/rbs>.{8}/&<\/start_codon>/g" | sed "s/<\/start_codon>/<start_codon>atg&/g" | sed "1s/tga/<stop_codon>&<\/stop_codon>/3" | sed "s/aaaaggt/<terminator>&/g" | sed "s/gcctttttatt/&<\/terminator>/g"
  • Final result: ttttcaccacaagaatgaatgttttcggcacatttctccccagagtgttataattgcggtcgcagagttggttacgctcattaccccgctgccgataaggaatttttcgcgtcaggtaacgcccatcgtttatctcaccgctcccttatacgttgcgcttttggtgcggcttagccgtgtgttttcggagtaatgtgccgaacctgtttgttgcgatttagcgcgcaaatc<minus35box>tttact</minus35box>tatttacagaacttcgg<minus10box>cattat</minus10box>cttgc<tss>c</tss>cggttcaaattacggtagtgatacccca<rbs>gagg</rbs>attagatg<start_codon>atg</start_codon>gccaaagaagacaatat<stop_codon>tga</stop_codon>aatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaacggtcacgtggttactgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtgactgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgctgattgttttaccgcctgatgggcgaagagaaagaacgagt<terminator>aaaaggtcggtttaaccggcctttttatt</terminator>ttat
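
Two details of the steps above may be worth checking on a small example. First, the replacements that write c and atg into the replacement text add those bases to the sequence instead of wrapping the ones already there (in the final result, atg appears both just before and inside the start codon tags). Second, chaining three grep calls keeps only lines that contain all three codons, and splitting the file into triplets from its first base does not follow the reading frame set by the start codon. The sketch below shows one possible way around both, assuming GNU or BSD sed with -E; the toy sequence in seq is made up for illustration and is not taken from infA-E.coli-K12.txt.

```shell
#!/usr/bin/env bash
# Hypothetical toy sequence (NOT the infA file):
# aa | cattat (-10 box) | cttgc | c (tss) | ggtt | gagg (rbs) | attag | atg | gcc aaa taa tt
seq="aacattatcttgccggttgaggattagatggccaaataatt"

# Wrap existing bases in place with capture groups instead of inserting
# literal letters. In the tss step, \1 is the closing -10 tag plus five
# bases and \2 is the single base that becomes the tss.
tagged=$(printf '%s\n' "$seq" |
  sed -E 's/cattat/<minus10box>&<\/minus10box>/' |
  sed -E 's/(<\/minus10box>.{5})(.)/\1<tss>\2<\/tss>/' |
  sed -E 's/gagg/<rbs>&<\/rbs>/' |
  sed -E 's/(<\/rbs>.{5})(atg)/\1<start_codon>\2<\/start_codon>/')
printf '%s\n' "$tagged"

# Sanity check: removing every tag should give back the original bases.
stripped=$(printf '%s\n' "$tagged" | sed -E 's/<[^>]*>//g')

# In-frame stop codon search: keep the text from the start codon onward,
# cut it into codons with fold, and report the number of the first codon
# that is exactly taa, tag, or tga.
cds=$(printf '%s\n' "$seq" | sed -E 's/.*gagg.{5}(atg.*)/\1/')
stop_index=$(printf '%s\n' "$cds" | fold -w 3 |
  grep -n -m 1 -E '^(taa|tag|tga)$' | cut -d: -f1)
printf 'first in-frame stop codon is codon %s\n' "$stop_index"
```

On the toy sequence the tag-stripped output equals the input, so no bases were added, and the first in-frame stop is reported as codon 4 (taa). Running the same in-frame check on the real file could confirm or revise which tga ends the gene.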

Question 2

  • Took the strand from the tss to the end of the terminator
  • Used sed "y/atcg/uagc/" to swap each base for its complement, writing u in place of t
  • Resulting sequence: ggccaaguuuaaugccaucacuauggggucuccuaaucuacuaccgguuucuucuguuauaacuuuacguuccauggcaagaacuuugcaacggauuaugguacaaggcgcaucucaaucuuuugccagugcaccaaugacguguguagaggccauuuuacgcguuuuugauguaggcguaggacugcccgcuguuucacugacaacuugacuggggcaugcuggacucguuuccggcguaacagaaggcaucagcgacuaacaaaauggcggacuacccgcuucucuuucuugcucauuuuccagccaaauuggccggaaaaauaa
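
To see what y/atcg/uagc/ does compared with a plain t-to-u substitution, here is a small sketch on a made-up 12-base fragment (the value of coding is hypothetical and not taken from infA-E.coli-K12.txt). If the file holds the coding strand, which the atg and gagg labels in Question 1 suggest, the mRNA would read like that strand with u in place of t, while the y command gives its complement instead.

```shell
#!/usr/bin/env bash
# Hypothetical coding-strand fragment, for illustration only.
coding="atggccaaataa"

# y/// maps characters one-to-one: a->u, t->a, c->g, g->c.
# The result is the RNA complement of the input, read in the same direction.
complement=$(printf '%s\n' "$coding" | sed 'y/atcg/uagc/')

# s/t/u/g keeps the strand as it is and only rewrites t as u.
mrna_like=$(printf '%s\n' "$coding" | sed 's/t/u/g')

printf 'complement: %s\n' "$complement"
printf 't to u:     %s\n' "$mrna_like"
```

The second form keeps the start codon readable as aug, which makes it easy to check against the labels from Question 1.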