Difference between revisions of "Mpetredi Week 4"
From LMU BioDB 2013
(Added notes about last command line string)
(changed stop codon command line sequence)
Revision as of 04:05, 20 September 2013
- Modify the gene sequence string so that it highlights or “tags” the special sequences within this gene, as follows (ellipses indicate bases in the sequence; note the spaces before the start tag and after the end tag):
- -35 and -10 box of the promoter
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <\/minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g"
- transcription start site
- cat infA-E.coli-K12.txt | sed "2s/atg/<tss>&<\/tss> /1"
- ribosome binding site
- cat infA-E.coli-K12.txt | sed "s/gagg/ <rbs>&<\/rbs> /g"
- start codon
- cat infA-E.coli-K12.txt | sed "2s/atg/ <start_codon>&<\/start_codon> /g"
- stop codon
- cat infA-E.coli-K12.txt | sed "s/tag/ <stop_codon>&<\/stop_codon> /3" | sed "s/tga/ <stop_codon>&<\/stop_codon> /3" | sed "s/taa/ <stop_codon>&<\/stop_codon> /3"
- terminator
- cat infA-E.coli-K12.txt | sed "s/aaaaggt...........gcctttt..../ <terminator>&<\/terminator> /g"
- What is the exact mRNA sequence that is transcribed from this gene?
- What is the amino acid sequence that is translated from this mRNA?
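The tagging commands in the list above rely on two sed features: & in the replacement stands for whatever the pattern matched, and a number after the final slash selects which occurrence on the line is replaced. The sketch below shows both on a short made-up sequence (the variable seq and its contents are hypothetical and are not taken from infA-E.coli-K12.txt).

```shell
# Hypothetical toy sequence with three "atg" occurrences; NOT the real infA gene.
seq="ccatgaaatgtttatgc"

# "&" re-inserts the matched text, so the tags wrap the match itself.
# The trailing 1 or 3 picks the first or third occurrence on the line.
first=$(printf '%s\n' "$seq" | sed "s/atg/ <start_codon>&<\/start_codon> /1")
third=$(printf '%s\n' "$seq" | sed "s/atg/ <start_codon>&<\/start_codon> /3")

printf '%s\n' "$first"
printf '%s\n' "$third"
```

An occurrence number only counts matches of that one pattern, so each chained sed for tag, tga and taa counts its own codon independently of the others.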
NOTE: These answers are not final.
- All commands in one string
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <\/minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>/&\n/g" | sed "2s/atg/ <start_codon>&<\/start_codon> /1" | sed "2s/t[ag][ag]/ <stop_codon>&<\/stop_codon> /2" | sed "s/aaaaggt...........gcctttt..../ <terminator>&<\/terminator> /g" | sed "2s/atg/<tss>&<\/tss> /1"
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <\/minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "2s/atg/<tss>&<\/tss> /1" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>/&\n/g" | sed "2s/atg/ <start_codon>&<\/start_codon> \n /1" | sed "3s/.../ & /g" | sed "6s/tag|taa|tga/ <stop_codon>&<\/stop_codon> /2" | sed "s/aaaaggt...........gcctttt..../ <terminator>&<\/terminator> /g"
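- Note: This moves everything after ATG onto its own line and splits it into groups of three, but the stop codon is not tagged and the first two letters end up separated from the other groups of three. Likely causes: the start codon replacement ends in " \n ", so the new line begins with a space, and "3s/.../ & /g" counts that space as the first character of the first group, which leaves only two letters in it. The pattern "tag|taa|tga" is read literally because that sed has no -r option (without -r, GNU sed needs \| for alternation), and the 6s address only has an effect if the output really has a sixth line.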
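One possible way to get in-frame codon groups and a tagged stop codon is sketched below on a made-up sequence. It is a sketch and not a final answer: seq is hypothetical, the line address 2 is correct only for this toy input, and the \n in the replacement assumes GNU sed, as the commands above already do. GNU sed accepts -E as well as -r for extended regular expressions.

```shell
# Hypothetical toy sequence: gg + atg + aaa ccc taa + gg; NOT the real infA gene.
seq="ggatgaaaccctaagg"

# Step 1: tag the first atg and break the line right after the closing tag,
#         with no space after \n so the reading frame starts at the first base.
# Step 2: on the new line 2, put a space after every complete group of three.
# Step 3: with -E, "|" means alternation; with no g flag, only the first
#         in-frame stop codon on line 2 is tagged.
out=$(printf '%s\n' "$seq" |
  sed "s/atg/ <start_codon>&<\/start_codon>\n/1" |
  sed "2s/.../& /g" |
  sed -E "2s/tag|taa|tga/<stop_codon>&<\/stop_codon>/")

printf '%s\n' "$out"
```

Because the codons are separated by spaces before the stop codon search runs, a three-letter match cannot straddle two codons, so the first match is the first stop codon in the reading frame.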