Mpetredi Week 4
From LMU BioDB 2013

- Modify the gene sequence string so that it highlights or “tags” the special sequences within this gene, as follows (ellipses indicate bases in the sequence; note the spaces before the start tag and after the end tag):
- -35 box of the promoter
- tttact
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10box>/ <\/minus35box> &/g" | sed "s/tt[gt]ac[at] <\/minus35box>/ <minus35box> &/g"
 
- -10 box of the promoter
- cattat
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g"
 
- transcription start site
- atg
- cat infA-E.coli-K12.txt | sed "2s/atg/<tss>&<\/tss> /1"
 
- ribosome binding site
- gagg
- cat infA-E.coli-K12.txt | sed "s/gagg/ <rbs>&<\/rbs> /g"
 
- start codon
- atg
- cat infA-E.coli-K12.txt | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>/&\n/g" | sed "2s/atg/ <start_codon>&<\/start_codon> /1"
 
- stop codon
- tag
- cat infA-E.coli-K12.txt | sed "s/tag/ <stop_codon>&<\/stop_codon> /3" | sed "s/tga/ <stop_codon>&<\/stop_codon> /3" | sed "s/taa/ <stop_codon>&<\/stop_codon> /3"
 
- terminator
- aaaaggtcggtttaaccggcctttttatt
- cat infA-E.coli-K12.txt | sed "s/aaaaggt...........gcctttt..../ <terminator>&<\/terminator> /g"
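
The -35 box command above only works once the -10 box tag is already in the text, because it counts 17 bases back from " <minus10box>". A minimal sketch of that chaining on a made-up toy sequence (not the infA file); the function name `tag_promoter` and the toy bases are assumptions for illustration, and `sed -r` assumes GNU sed:

```shell
# Toy sequence (made up): a ttgaca box, a 17-base spacer, then a cattat box.
spacer="aaaaacccccgggggcc"            # 5 a + 5 c + 5 g + 2 c = 17 bases
toy="ccttgaca${spacer}cattatgg"

# Hypothetical helper: tag the -10 box first, then use its start tag as the
# anchor for closing and opening the -35 box.
tag_promoter() {
  sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" |
    sed -r "s/.{17} <minus10box>/ <\/minus35box> &/g" |
    sed "s/tt[gt]ac[at] <\/minus35box>/ <minus35box> &/g"
}

echo "$toy" | tag_promoter
# prints: cc <minus35box> ttgaca </minus35box> aaaaacccccgggggcc <minus10box>cattat</minus10box> gg
```

Run without the first sed, the other two find nothing to match and the toy sequence comes out unchanged.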
 
 
- What is the exact mRNA sequence that is transcribed from this gene?
- aaaagugguguucuuacuuacaaaagccguguaaagaggggucucacaauauuaacgccagcgucucaaccaaugcgaguaauggggcgacggcuauuccuuaaaaagcgcaguccauugcggguagcaaauagaguggcgagggaauaugcaacgcgaaaaccacgccgaaucggcacacaaaagccucauuacacggcuuggacaaacaacgcuaaaucgcgcguuuagaaaugaauaaaugucuugaagccguaauagaacggccaaguuuaaugccaucacuauggggucuccuaaucuaccgguuucuucuguuauaacuuuacguuccauggcaagaacuuugcaacggauuaugguacaaggcgcaucucaaucuuuugccagugcaccaaugacguguguagaggccauuuuacgcguuuuugauguaggcguaggacugcccgcuguuucacugacaacuugacuggggcaugcuggacucguuuccggcguaacagaaggcaucagcgacuaacaaaauggcggacuacccgcuucucuuucuugcucauuuuccagccaaauuggccggaaaaauaaaaua
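
As a rough sketch of how a DNA string can be turned into an mRNA string with sed, here are two variants on a short made-up strand (not the infA file). The function names are hypothetical. Which variant applies depends on which strand the input holds, and the sketch does not settle that: a coding strand only needs u in place of t, while a template strand (assumed here to be written so that it pairs base by base with the mRNA) needs every base complemented.

```shell
# Coding strand -> mRNA: same bases, with u written in place of t.
coding_to_mrna() {
  sed "s/t/u/g"
}

# Template strand -> mRNA: complement each base in one pass with y
# (a->u, c->g, g->c, t->a).
template_to_mrna() {
  sed "y/acgt/ugca/"
}

echo "atggccaaatag" | coding_to_mrna     # prints: auggccaaauag
echo "taccggtttatc" | template_to_mrna   # prints: auggccaaauag
```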
 
- What is the amino acid sequence that is translated from this mRNA?
- Met A K E D N I E Met Q G T V L E T L P N T Met F R V E L E N G H V V T A H I S G K Met R K N Y I R I L T G D K V T V E L T P Y D L S K G R I V F R S R Stop
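
A translation like the one above can be approximated with sed by splitting the mRNA into codons and substituting each codon. The sketch below is deliberately incomplete: it maps only the four codons that occur in a made-up toy mRNA, and the function name `translate_toy` is hypothetical. A real run would need all 64 codons.

```shell
# Hypothetical helper: split into 3-base codons, then replace each codon
# with its amino acid letter. Only four codons are covered here.
translate_toy() {
  sed "s/.../& /g" |
    sed -e "s/aug/Met/g" \
        -e "s/gcc/A/g" \
        -e "s/aaa/K/g" \
        -e "s/uag/Stop/g" \
        -e 's/ $//'
}

echo "auggccaaauag" | translate_toy   # prints: Met A K Stop
```

The reading frame is fixed by where the input starts, so the string passed in should begin at the start codon.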
 
- All commands in one string
cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10box>/ <\/minus35box> &/g" | sed "s/tt[gt]ac[at] <\/minus35box>/ <minus35box> &/g" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>/&\n/g" | sed "2s/atg/ <start_codon>&<\/start_codon> /1" | sed "s/.../ \n /3" | sed "s/aaaaggt...........gcctttt..../ <terminator>&<\/terminator> /g" | sed "2s/atg/<tss>&<\/tss> /1" | sed "s/tag/ <stop_codon>&<\/stop_codon> /3" | sed "s/tga/ <stop_codon>&<\/stop_codon> /3" | sed "s/taa/ <stop_codon>&<\/stop_codon> /3"
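
The start codon step in the combined command depends on the line break inserted after </rbs>: each sed in the pipe reads its input afresh, so the address 2 in "2s/atg/.../1" refers to the text after the ribosome binding site. A minimal sketch on a made-up toy sequence; the function name `tag_start_codon` is hypothetical, and `\n` in the replacement assumes GNU sed:

```shell
# Hypothetical helper: tag the RBS, break the line after it, then tag the
# first atg on the new second line only.
tag_start_codon() {
  sed "s/gagg/ <rbs>&<\/rbs> /g" |
    sed "s/<\/rbs>/&\n/g" |
    sed "2s/atg/ <start_codon>&<\/start_codon> /1"
}

# Toy input with one atg before the RBS and two after it.
echo "ccatgcgaggacatggcatgtt" | tag_start_codon
# prints two lines:
# ccatgc <rbs>gagg</rbs>
#  ac <start_codon>atg</start_codon> gcatgtt
```

The atg upstream of the RBS stays untagged because it sits on line 1, and the second atg downstream is skipped because of the /1 flag.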

