Difference between revisions of "Vpachec3 Week 4"
Revision as of 02:09, 27 September 2015
My lab partner, Nicole, was a big help in working through the Week 4 homework with me. Here is how far we got:
vpachec3@ab201:/nfs/home/dondi/xmlpipedb/data$ cat infA-E.coli-K12.txt |sed "s/tt[gt]ac[at]/ <minus35box>& <\/minus35box> /1"|sed "s/[ct]at[at]at/ <minus10box>& <\/minus10box> /2"
This is what the command gave us:
ttttcaccacaagaatgaatgttttcggcacatttctccccagagtgttataattgcggtcgcagagttggttacgctcattaccccgctgccgataaggaatttttcgcgtcaggtaacgcccatcgtttatctcaccgctcccttatacgttgcgcttttggtgcggcttagccgtgtgttttcggagtaatgtgccgaacctgtttgttgcgatttagcgcgcaaatc <minus35box>tttact </minus35box> tatttacagaacttcgg <minus10box>cattat </minus10box> cttgccggttcaaattacggtagtgataccccagaggattagatggccaaagaagacaatattgaaatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaacggtcacgtggttactgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtgactgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgctgattgttttaccgcctgatgggcgaagagaaagaacgagtaaaaggtcggtttaaccggcctttttattttat
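To make the two sed features used above easier to see, here is a minimal sketch on a made-up toy sequence (the string and the <box> tag name are hypothetical, not part of the homework). It shows that & in the replacement stands for whatever the pattern matched, and that a number at the end of the s command picks which match on the line gets changed.

```shell
# Toy input (hypothetical): the motif "tata" appears three times, separated by "g".
seq="tatagtatagtata"

# "&" is the text that matched, so the match is kept and the tags go around it.
# The trailing 2 means: only change the second match on the line.
tagged=$(echo "$seq" | sed "s/tata/<box>&<\/box>/2")

echo "$tagged"
# prints: tatag<box>tata</box>gtata
```

This is why the real command can use /1 for the -35 box and /2 for the -10 box: the number chooses which occurrence of the pattern gets tagged.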
Right before we were stopped to bring it back into a larger group discussion, Nicole taught me that \n would break it into two lines. We didn't get to apply it to the command line just yet.
Transcription Start Site
Now trying this on my own. I used \n to break the line as a first step toward figuring out how to add the transcription start site. I wanted to break the information into two lines, so I used this command:
cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>& <\/minus35box> /1" |sed "s/[ct]at[at]at/ <minus10box>& <\/minus10box> /2"| sed "s/ <minus10box>/& \n/g"
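That command breaks the line right after the opening <minus10box> tag. However, I wanted to break the line after the whole minus 10 box, so I modified the command to match the closing tag instead: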
cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>& <\/minus35box> /1" |sed "s/[ct]at[at]at/ <minus10box>& <\/minus10box> /2"| sed "s/ <\/minus10box> /&\n/g"
This command gave me:
ttttcaccacaagaatgaatgttttcggcacatttctccccagagtgttataattgcggtcgcagagttggttacgctcattaccccgctgccgataaggaatttttcgcgtcaggtaacgcccatcgtttatctcaccgctcccttatacgttgcgcttttggtgcggcttagccgtgtgttttcggagtaatgtgccgaacctgtttgttgcgatttagcgcgcaaatc <minus35box>tttact </minus35box> tatttacagaacttcgg <minus10box>cattat </minus10box>
cttgccggttcaaattacggtagtgataccccagaggattagatggccaaagaagacaatattgaaatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaacggtcacgtggttactgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtgactgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgctgattgttttaccgcctgatgggcgaagagaaagaacgagtaaaaggtcggtttaaccggcctttttattttat
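The line break matters for the next step, so here is a small sketch of what it does, using a made-up toy line (the text and the <box> tag are hypothetical). It assumes GNU sed, which turns \n in the replacement into a real newline; other sed versions may not. Once one sed has inserted the newline, the next sed in the pipeline reads the output as two separate lines.

```shell
# Toy input (hypothetical): one line that contains a closing tag.
line="aaa </box> ccc"

# Assumes GNU sed: "\n" in the replacement becomes a newline after the matched tag.
split=$(echo "$line" | sed "s/ <\/box> /&\n/")

# A second sed reading that output sees two lines; -n with 2p prints only line 2.
second=$(echo "$line" | sed "s/ <\/box> /&\n/" | sed -n "2p")

echo "$split"
echo "second line: $second"
```

Because everything after the box now sits on its own line, a later command can be limited to that line only.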
Breaking the sequence into two lines makes it easier to insert the transcription start site, because we were told: The transcription start site is located at the 12th nucleotide after the first nucleotide of the -10 box.
This means that I could count nucleotides to find where the transcription start site goes and use commands that I have used before to insert the tag. I used:
cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>& <\/minus35box> /1" |sed "s/[ct]at[at]at/ <minus10box>& <\/minus10box> /2"| sed "s/ <\/minus10box> /&\n/g"| sed "2s/cc/ <tts> /1"
However, this was problematic because it replaced the nucleotides rather than putting the tag in front of them. Therefore, the command needed to be revised.
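The difference between replacing and tagging can be seen on a short made-up fragment (the fragment is hypothetical, and the tag is spelled <tss> here). Without & the matched nucleotides are thrown away; with & they are kept inside the tags.

```shell
# Toy fragment (hypothetical), not the real sequence.
seq="cttgccgg"

# No "&": the matched "cc" is replaced by the tag, so the nucleotides are lost.
replaced=$(echo "$seq" | sed "s/cc/ <tss> /1")

# With "&": the matched "cc" is kept and the tags are placed around it.
wrapped=$(echo "$seq" | sed "s/cc/ <tss>&<\/tss> /1")

echo "$replaced"   # prints: cttg <tss> gg
echo "$wrapped"    # prints: cttg <tss>cc</tss> gg
```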
New command:
cat infA-E.coli-K12.txt | sed "s/tt[gt]ac[at]/ <minus35box>& <\/minus35box> /1" |sed "s/[ct]at[at]at/ <minus10box>& <\/minus10box> /2"| sed "s/ <\/minus10box> /&\n/g"| sed "2s/c/ <tss>& <\/tss> /5"
This command did exactly what I needed:
ttttcaccacaagaatgaatgttttcggcacatttctccccagagtgttataattgcggtcgcagagttggttacgctcattaccccgctgccgataaggaatttttcgcgtcaggtaacgcccatcgtttatctcaccgctcccttatacgttgcgcttttggtgcggcttagccgtgtgttttcggagtaatgtgccgaacctgtttgttgcgatttagcgcgcaaatc <minus35box>tttact </minus35box> tatttacagaacttcgg <minus10box>cattat </minus10box>
cttgccggttcaaatta <tss>c </tss>ggtagtgataccccagaggattagatggccaaagaagacaatattgaaatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaacggtcacgtggttactgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtgactgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgctgattgttttaccgcctgatgggcgaagagaaagaacgagtaaaaggtcggtttaaccggcctttttattttat
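The last part of the command, 2s/c/.../5, combines two things: the 2 in front limits the substitution to the second line, and the 5 at the end picks the fifth c on that line. A minimal sketch on made-up input (the two toy lines and the square brackets are hypothetical stand-ins for the sequence and the tags):

```shell
# Toy input (hypothetical): line 1 is all "c", line 2 alternates "a" and "c".
out=$(printf 'ccccc\nacacacacacac\n' | sed "2s/c/[&]/5")

echo "$out"
# prints:
# ccccc
# acacacaca[c]ac
```

Line 1 is left alone even though it has five c's, because of the 2 address. Note that counting occurrences of one letter only lands on the intended nucleotide for this particular sequence; a different sequence would need a different number.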