|
|
| | | |
| I opened up terminal, and used the ssh command to get into dondi's directory: ~dondi/xmlpipedb/data. In there I got access to infA-E.coli-K12.txt which is the nucleotide sequence I will be using for this assignment. | | I opened up terminal, and used the ssh command to get into dondi's directory: ~dondi/xmlpipedb/data. In there I got access to infA-E.coli-K12.txt which is the nucleotide sequence I will be using for this assignment. |
− | # Modify the gene sequence string so that it highlights or “tags” the special sequences within this gene, as follows (ellipses indicate bases in the sequence; note the spaces before the start tag and after the end tag):
| + | * Modify the gene sequence string so that it highlights or “tags” the special sequences within this gene, as follows (ellipses indicate bases in the sequence; note the spaces before the start tag and after the end tag): |
| * -35 box of the promoter | | * -35 box of the promoter |
| **As shown in class, I used the sed command to tag the first occurrence of the -35 box in the sequence: | | **As shown in class, I used the sed command to tag the first occurrence of the -35 box in the sequence: |
|
|
| ** The stop codon requires that I find one of three possible three-character sequences. At first I tried using brackets: "t[ag][ag]", but I soon found out that this matched too many results. There are only three stop codons, and the brackets match four unique codons (taa, tag, tga, and tgg). So into the wiki I went, and realized I could use a vertical bar to separate the three codons and search for exactly those. The problem, however, was that this did not work. After being stumped for a while I realized that before I piped to that command I needed to break up the line into sets of 3, just like I did in the week 3 assignment. As a result I got this command: <code> cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>&<\/minus35box>\n/1" | sed -r "2s/^.{17}/&\n/g" | sed -r "3s/[ct]at[at]at/<minus10box>&<\/minus10box>\n/1" | sed -r "4s/^.{5}/&\n/g" | sed "5s/^./<tss>&<\/tss>\n/g" | sed "6s/gagg/<rbs>&<\/rbs>\n/1" | sed "7s/atg/<start_codon>&<\/start_codon>\n/1" | sed "8s/.../& /g"| sed -r "8s/tag|tga|taa/<stop_codon>&<\/stop_codon>/1"</code> | | ** The stop codon requires that I find one of three possible three-character sequences. At first I tried using brackets: "t[ag][ag]", but I soon found out that this matched too many results. There are only three stop codons, and the brackets match four unique codons (taa, tag, tga, and tgg). So into the wiki I went, and realized I could use a vertical bar to separate the three codons and search for exactly those. The problem, however, was that this did not work. After being stumped for a while I realized that before I piped to that command I needed to break up the line into sets of 3, just like I did in the week 3 assignment. 
As a result I got this command: <code> cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>&<\/minus35box>\n/1" | sed -r "2s/^.{17}/&\n/g" | sed -r "3s/[ct]at[at]at/<minus10box>&<\/minus10box>\n/1" | sed -r "4s/^.{5}/&\n/g" | sed "5s/^./<tss>&<\/tss>\n/g" | sed "6s/gagg/<rbs>&<\/rbs>\n/1" | sed "7s/atg/<start_codon>&<\/start_codon>\n/1" | sed "8s/.../& /g"| sed -r "8s/tag|tga|taa/<stop_codon>&<\/stop_codon>/1"</code> |
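To illustrate why the reading frame matters for the stop codon, here is a minimal sketch on a made-up 15-base sequence (not the infA gene; the variable names are only for illustration). It counts what the bracket pattern accepts, then compares tagging the raw string against tagging after the string has been split into codons.

```shell
# Hypothetical 15-base toy sequence (NOT the infA gene): atg ctg acc gta taa
seq="atgctgaccgtataa"

# The bracket pattern t[ag][ag] accepts four codons, one more than the three stops.
bracket_hits=$(printf '%s\n' taa tag tga tgg | grep -c '^t[ag][ag]$')
alt_hits=$(printf '%s\n' taa tag tga tgg | grep -Ec '^(taa|tag|tga)$')

# Unframed search: the first hit is an out-of-frame "tga" that straddles two codons.
unframed=$(printf '%s\n' "$seq" | sed -E 's/tag|tga|taa/<stop_codon>&<\/stop_codon>/')

# Framed search: split into codons first, tag the first stop, then strip the spaces.
framed=$(printf '%s\n' "$seq" \
  | sed -E 's/.../& /g' \
  | sed -E 's/tag|tga|taa/<stop_codon>&<\/stop_codon>/' \
  | sed 's/ //g')

echo "bracket matches: $bracket_hits, alternation matches: $alt_hits"
echo "unframed: $unframed"
echo "framed:   $framed"
```

Once the spaces are in place, a three-letter pattern can only match a whole codon, because any match that straddles two codons would have to include a space.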
| *terminator | | *terminator |
− | ** The first part of the terminator hairpin is: <code> aaaaggt </code>, which means, abiding by the rules of the terminator provided to us, that the first half bonds with <code> gcctttt </code>. So now the trick is to grab the correct terminator sequence. I ended up breaking the terminator command into two different commands. I used one to insert the first tag, and the second one to insert the second tag. I did this because I wasn't sure how long the sequence would be between the two hairpin sequences. This is what I got to capture the terminator sequence: <code> cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>&<\/minus35box>\n/1" | sed -r "2s/^.{17}/&\n/g" | sed -r "3s/[ct]at[at]at/<minus10box>&<\/minus10box>\n/1" | sed -r "4s/^.{5}/&\n/g" | sed "5s/^./<tss>&<\/tss>\n/g" | sed "6s/gagg/<rbs>&<\/rbs>\n/1" | sed "7s/atg/<start_codon>&<\/start_codon>\n/1" | sed "8s/.../& /g"| sed -r "8s/tag|tga|taa/<stop_codon>&<\/stop_codon>/1" | sed "8s/ //g" | sed "8s/aaaaggt/<terminator>&/g" | sed -r "8s/gcctttt..../&<\/terminator>/g" </code> | + | ** The first part of the terminator hairpin is: <code>aaaaggt</code>, which means, abiding by the rules of the terminator provided to us, that the first half bonds with <code> gcctttt </code>. So now the trick is to grab the correct terminator sequence. I ended up breaking the terminator command into two different commands. I used one to insert the first tag, and the second one to insert the second tag. I did this because I wasn't sure how long the sequence would be between the two hairpin sequences. 
This is what I got to capture the terminator sequence: <code> cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>&<\/minus35box>\n/1" | sed -r "2s/^.{17}/&\n/g" | sed -r "3s/[ct]at[at]at/<minus10box>&<\/minus10box>\n/1" | sed -r "4s/^.{5}/&\n/g" | sed "5s/^./<tss>&<\/tss>\n/g" | sed "6s/gagg/<rbs>&<\/rbs>\n/1" | sed "7s/atg/<start_codon>&<\/start_codon>\n/1" | sed "8s/.../& /g"| sed -r "8s/tag|tga|taa/<stop_codon>&<\/stop_codon>/1" | sed "8s/ //g" | sed "8s/aaaaggt/<terminator>&/g" | sed -r "8s/gcctttt..../&<\/terminator>/g" </code> |
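The two-step terminator tagging can be tried on its own with a short fragment. The sketch below uses the last 41 bases of the sequence as they appear in the tagged output on this page, with the tags removed. The one-step variant is not something I used; it is shown only as a possible alternative, on the assumption that the second hairpin half occurs just once after the first half.

```shell
# Last 41 bases of the sequence, copied from the tagged output with tags removed.
frag="gaacgagtaaaaggtcggtttaaccggcctttttattttat"

# Two-step version: open the tag before the first hairpin half, then close it
# after the second half plus the four bases that follow it.
two_step=$(printf '%s\n' "$frag" \
  | sed 's/aaaaggt/<terminator>&/' \
  | sed -E 's/gcctttt..../&<\/terminator>/')

# One-step variant (an assumption, not the command used above): .* spans the
# loop of unknown length. It is greedy, so it runs to the LAST "gcctttt".
one_step=$(printf '%s\n' "$frag" \
  | sed -E 's/aaaaggt.*gcctttt..../<terminator>&<\/terminator>/')

echo "$two_step"
echo "$one_step"
```

Both versions give the same result on this fragment. The two-step form avoids having to think about how far a greedy `.*` would reach on a longer line.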
| * And so, finally, it is all marked up. However, I'm not quite done yet: I need to get rid of all the newlines I created. To do this I used this command: <code>sed ':a;N;$!ba;s/\n//g'</code> (from the wiki). The final output is as follows. | | * And so, finally, it is all marked up. However, I'm not quite done yet: I need to get rid of all the newlines I created. To do this I used this command: <code>sed ':a;N;$!ba;s/\n//g'</code> (from the wiki). The final output is as follows. |
− | #*(a)<code> cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>&<\/minus35box>\n/1" | sed -r "2s/^.{17}/&\n/g" | sed -r "3s/[ct]at[at]at/<minus10box>&<\/minus10box>\n/1" | sed -r "4s/^.{5}/&\n/g" | sed "5s/^./<tss>&<\/tss>\n/g" | sed "6s/gagg/<rbs>&<\/rbs>\n/1" | sed "7s/atg/<start_codon>&<\/start_codon>\n/1" | sed "8s/.../& /g"| sed -r "8s/tag|tga|taa/<stop_codon>&<\/stop_codon>/1" | sed "8s/ //g" | sed "8s/aaaaggt/<terminator>&/g" | sed -r "8s/gcctttt..../&<\/terminator>/g" | sed ':a;N;$!ba;s/\n//g' </code>
| + | *(a) |
− | #*(b)
| + | |
| | | |
| + | ttttcaccacaagaatgaatgttttcggcacatttctccccagagtgttataattgcggtcgcagagttggttacgctcattaccccgctgccgataaggaatttttcgcgtcaggtaacgcccatcgtttatctcaccgctcccttatacgttgcgcttttggtgcggcttagccgtgtgttttcg |
| + | gagtaatgtgccgaacctgtttgttgcgatttagcgcgcaaatc<minus35box>tttact</minus35box>tatttacagaacttcgg<minus10box>cattat</minus10box>cttgc<tss>c</tss>ggttcaaattacggtagtgatacccca |
| + | <rbs>gagg</rbs>attag<start_codon>atg</start_codon>gccaaagaagacaatattgaaatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaacggtcacgtggttac |
| + | tgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtgactgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgc<stop_codon>tga</stop_codon> |
| + | ttgttttaccgcctgatgggcgaagagaaagaacgagt<terminator>aaaaggtcggtttaaccggcctttttatt</terminator>ttat |
| | | |
− | ttttcaccacaagaatgaatgttttcggcacatttctccccagagtgttataattgcggtcgcagagttggttacgctcattaccccgctgccgataaggaatttttcgcgtcaggtaacgcccatcgtttatctcaccgctcccttatacgttgcgcttttggtgcggcttagccgtgtgttttcggagtaatgtgccgaacctgtttgttgcgatttagcgcgcaaatc<minus35box>tttact</minus35box>tatttacagaacttcgg<minus10box>cattat</minus10box>cttgc<tss>c</tss>ggttcaaattacggtagtgatacccca<rbs>gagg</rbs>attag<start_codon>atg</start_codon>gccaaagaagacaatattgaaatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaacggtcacgtggttactgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtgactgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgc<stop_codon>tga</stop_codon>ttgttttaccgcctgatgggcgaagagaaagaacgagt<terminator>aaaaggtcggtttaaccggcctttttatt</terminator>ttat
| + | *(b)And the final command is as follows: |
| | | |
| + | <code> cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>&<\/minus35box>\n/1" | sed -r "2s/^.{17}/&\n/g" | sed -r "3s/[ct]at[at]at/<minus10box>&<\/minus10box>\n/1" | sed -r "4s/^.{5}/&\n/g" | sed "5s/^./<tss>&<\/tss>\n/g" | sed "6s/gagg/<rbs>&<\/rbs>\n/1" | sed "7s/atg/<start_codon>&<\/start_codon>\n/1" | sed "8s/.../& /g"| sed -r "8s/tag|tga|taa/<stop_codon>&<\/stop_codon>/1" | sed "8s/ //g" | sed "8s/aaaaggt/<terminator>&/g" | sed -r "8s/gcctttt..../&<\/terminator>/g" | sed ':a;N;$!ba;s/\n//g' </code> |
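The last stage of the final command, the newline-joining one-liner, is the least readable part, so here is a small sketch of what it does on three made-up lines. It assumes GNU sed, which accepts the label and branch commands separated by semicolons. The `tr` line is an alternative I did not use, shown only for comparison.

```shell
# Three short made-up lines standing in for the multi-line tagged output.
# :a      defines a label named "a"
# N       appends the next input line to the pattern space
# $!ba    if this is not the last line, branch back to label "a"
# s/\n//g once everything is in the pattern space, delete the embedded newlines
joined=$(printf 'aaa\nccc\nggg\n' | sed ':a;N;$!ba;s/\n//g')

# Alternative (not the command used above): tr deletes every newline directly.
joined_tr=$(printf 'aaa\nccc\nggg\n' | tr -d '\n')

echo "$joined"
echo "$joined_tr"
```

The sed form keeps the final newline at the end of the output, while `tr -d '\n'` removes that one too. Inside `$( )` the difference disappears, because command substitution strips trailing newlines.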
| | | |
− | # What is the ''exact'' mRNA sequence that is transcribed from this gene?
| |
− | #*(a)
| |
− | #*(b)
| |
− | # What is the amino acid sequence that is translated from this mRNA?
| |
− | #*(a)
| |
− | #*(b)
| |
| | | |
− | ==== Supplementary Information ====
| + | * What is the ''exact'' mRNA sequence that is transcribed from this gene? |
| + | **To get the mRNA sequence, I need the sequence from the transcription start site through the terminator. I found it easiest to break the output into new lines at the markup tags already there, and then pick which lines to transcribe. sed can delete lines by number, for example: <code> sed "2,4D"</code>. Using this trick, I deleted all the unnecessary lines; every nucleotide that remains gets transcribed into mRNA. I was going to write a separate sed command for each tag, but two commands are enough to put every tag on its own line: <code> sed "s/>/&\n/g" | sed "s/</\n&/g"</code>. Then I delete the tag lines and the sequences outside the transcript, remove the extra newlines, and transcribe by replacing every t with u. Here is the sequence followed by the command. |
| + | **(a) |
| + | cgguucaaauuacgguagugauaccccagaggauuagauggccaaagaagacaauauugaaaugcaagguaccguucuug |
| + | aaacguugccuaauaccauguuccgcguagaguuagaaaacggucacgugguuacugcacacaucuccgguaaaaugcgca |
| + | aaaacuacauccgcauccugacgggcgacaaagugacuguugaacugaccccguacgaccugagcaaaggccgcauugu |
| + | cuuccguagucgcugauuguuuuaccgccugaugggcgaagagaaagaacgaguaaaaggucgguuuaaccggccuuuuuauu |
| + | **(b) |
| + | <code>cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>&<\/minus35box>\n/1" |
| + | | sed -r "2s/^.{17}/&\n/g" | sed -r "3s/[ct]at[at]at/<minus10box>&<\/minus10box>\n/1" | sed -r "4s/^.{5}/&\n/g" |
| + | | sed "5s/^./<tss>&<\/tss>\n/g" | sed "6s/gagg/<rbs>&<\/rbs>\n/1" | sed "7s/atg/<start_codon>&<\/start_codon>\n/1" |
| + | | sed "8s/.../& /g"| sed -r "8s/tag|tga|taa/<stop_codon>&<\/stop_codon>/1" | sed "8s/ //g" |
| + | | sed "8s/aaaaggt/<terminator>&/g" | sed -r "8s/gcctttt..../&<\/terminator>/g" | sed ':a;N;$!ba;s/\n//g' |
| + | | sed "s/>/&\n/g" | sed "s/</\n&/g" | sed "1,10D;12D;14D;16D;18D;20D;22D;24D;26D;28D;29D" |
| + | | sed ':a;N;$!ba;s/\n//g' | sed "s/t/u/g" </code> |
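The split, delete, join, and transcribe steps at the end of this command can be followed on a much smaller string. The sketch below uses a made-up tagged sequence with only two tags (not the real gene), and it assumes GNU sed, which treats `\n` in the replacement as a newline. The line numbers passed to the delete step apply to this toy input only.

```shell
# Made-up tagged string (NOT the infA gene), with only two of the tags.
tagged="aa<tss>c</tss>gg<terminator>ttat</terminator>cc"

# Step 1: put every tag on its own line (newline after each ">", before each "<").
split=$(printf '%s\n' "$tagged" | sed 's/>/&\n/g' | sed 's/</\n&/g')
printf '%s\n' "$split"   # nine lines: aa, <tss>, c, </tss>, gg, <terminator>, ttat, </terminator>, cc

# Step 2: delete the tag lines and the untranscribed flanks (lines 1, 2, 4, 6, 8, 9),
# rejoin what is left, and only then replace t with u.
mrna=$(printf '%s\n' "$split" \
  | sed '1,2d;4d;6d;8,9d' \
  | tr -d '\n' \
  | sed 's/t/u/g')

echo "$mrna"
```

The order matters: the tags themselves contain the letter t, so the t-to-u replacement has to come after the tag lines are deleted.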
| | | |