Class Notes 2
From LMU BioDB 2013
- Consensus sequence: brackets [] add an either/or search to the pattern, matching any one of the characters inside them
- e.g.: [ct]at[at]at will use either c or t in the first bracket, and a or t in the second bracket
- This matches the following sequences:
- cataat
- tataat
- cattat
- tattat
- The other half of the hairpin aaaaggt is gcctttt
- Look at it this way (second half written backwards so the two halves line up):
- aaaaggt
- ttttccg
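A quick way to check the bracket pattern, as a minimal sketch. The list of candidates is made up for illustration (the four sequences listed above plus two that should not match), and grep -x is used here instead of sed so that only whole-line matches are kept.

```shell
# Hypothetical candidates: the four expected hexamers plus two non-matches.
candidates="cataat tataat cattat tattat gataat catcat"

# Print one candidate per line (unquoted on purpose so the shell splits on spaces),
# keep only lines that match the whole pattern, then join them onto one line.
matches=$(printf '%s\n' $candidates | grep -x '[ct]at[at]at' | paste -sd' ' -)

echo "$matches"
```

Only the four sequences from the list should come back; gataat fails on the first bracket and catcat fails on the second.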
Commands
- \/ goes to the end of a command
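- Typing HERE!!! inside the double quotes turned the command into:
cat infA-E.coli-K12.txt | sed "s/[ct]at[at]at/ HEREcat infA-E.coli-K12.txt ! /g"
- What happened?
- !! stands for the last command that you typed, so it was pasted into the middle of the new command
- Proper way to type the command: cat infA-E.coli-K12.txt | sed "s/[ct]at[at]at/ HERE\!\!\! /g"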
- Find the -10 and -35 boxes: cat infA-E.coli-K12.txt | sed "s/[ct]at[at]at/ -10 HERE\!\!\! /g" | sed "s/tt[gt]ac[at]/ -35 HEEERE\!\!\! /g"
- To get the true answer, the -35 and -10 boxes must be 17 characters apart (use periods, one per character)
- cat infA-E.coli-K12.txt | sed "s/[ct]at[at]at/ <minus10box>&<\/minus10box> /g" | sed "s/tt[gt]ac[at]/ <minus35box>&<\/minus35box> /g"
- proper way to type command to find all available -10 and -35 boxes
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed "s/................. <minus10/<minus35box> &/g"
- character{#} repeats the preceding character # times, e.g. .{17} matches any 17 characters. Must use "sed -r" on the sed command that contains it to activate this.
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<minus35box> &/g"
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g"
- CORRECT ONE TO USE
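To see how the three steps fit together, here is a minimal sketch on a made-up sequence (the pieces below are hypothetical, not taken from infA). In the replacement, & stands for whatever the pattern matched, which is how the tags get wrapped around the matched text. The second step in this sketch anchors on the opening <minus10box> tag, since that is the tag the first step puts a space in front of.

```shell
# Hypothetical pieces of a made-up promoter (not taken from infA).
box35="ttgaca"               # matches tt[gt]ac[at]
spacer="gcgcgcgcgcgcgcgcg"   # 17 characters between the two boxes
box10="cattat"               # matches cat[at]at
seq="gg${box35}${spacer}${box10}gg"

# Step 1: wrap the -10 box in tags (& is the matched text).
# Step 2: count back 17 characters from the -10 tag and drop a closing -35 tag there.
# Step 3: if a -35 hexamer sits right before that closing tag, add the opening tag.
tagged=$(printf '%s\n' "$seq" \
  | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" \
  | sed -r "s/.{17} <minus10/<\/minus35box> &/g" \
  | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g")

echo "$tagged"
```

If the spacer were any length other than 17, the closing tag would land in the wrong place and the third step would not find a hexamer in front of it, so no opening -35 tag would be added.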
Finding GAGG
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "s/gagg/ <rbs>&<\/rbs> /g"
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/atg/ HEEEERE /g"
- Finds all ATG start sites, but doesn't indicate the one after the ribosome binding site
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>.&atg/ HEEEERE /g"
- Still doesn't quite indicate the proper ATG
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>/&\n/g"
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>/&\n/g" | sed "2s/atg/ HEEEEERE /g"
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>/&\n/g" | sed "2s/atg/ HEEEEERE /1"
- Replaced the final g with 1 so that only the first match on the line is replaced
- cat infA-E.coli-K12.txt | sed "s/cat[at]at/ <minus10box>&<\/minus10box> /g" | sed -r "s/.{17} <minus10/<\/minus35box> &/g" | sed "s/tt[gt]ac[at]<\/minus35box>/ <minus35box>&/g" | sed "s/gagg/ <rbs>&<\/rbs> /g" | sed "s/<\/rbs>/&\n/g" | sed "2s/atg/ <start_codon>&<\/start_codon> /1"
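The trick in the last few commands (break the line after </rbs>, then only touch line 2) can be tried on its own. This is a minimal sketch on a made-up sequence, not on infA, and it assumes GNU sed, which understands \n in the replacement. The sequence has one atg before the gagg and two after it.

```shell
# Hypothetical sequence: atg before the RBS, then gagg, then two atg's after it.
seq="atgccgaggtacaaatgaaatgc"

# Step 1: tag the ribosome binding site.
# Step 2: put a line break right after </rbs>, so everything downstream is line 2.
# Step 3: on line 2 only, tag the first atg (the trailing 1 means first match only).
tagged=$(printf '%s\n' "$seq" \
  | sed "s/gagg/ <rbs>&<\/rbs> /g" \
  | sed "s/<\/rbs>/&\n/g" \
  | sed "2s/atg/ <start_codon>&<\/start_codon> /1")

echo "$tagged"
```

The atg before the RBS stays untagged because it is on line 1, and the second atg on line 2 stays untagged because of the 1 at the end.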