Kzebrows Week 4

From LMU BioDB 2015
Revision as of 22:06, 27 September 2015 by Kzebrows (Finding the -35 box and -10 box.)

Transcription and Translation "Taken to the Next Level"

I began this assignment by opening Terminal on my laptop and entering

ssh kzebrows@my.cs.lmu.edu 

followed by my password to log into the LMU CMSI database. As I usually do, I entered the following commands to move into Dr. Dionisio's directory, list the files in it, and display the appropriate file for this assignment:

cd ~dondi/xmlpipedb/data && ls && cat infA-E.coli-K12.txt

This displayed the contents of the E. coli file, which is the nucleotide sequence. To complete this assignment I frequently used this page as a resource.

I began by using grep to find the potential -35 box and -10 box because grep highlights the searched pattern in red. I simply entered

cat infA-E.coli-K12.txt | grep "tt[gt]ac[at]"

which gave me two possible answers for the -35 box, tttact and tttaca, both of which fit the pattern. Now it was a matter of finding out which one was the correct one. I also searched for the -10 box using

cat infA-E.coli-K12.txt | grep "[ct]at[at]at"

which also revealed two potential sites, tataat and cattat. To work out which sequences were the correct ones I needed to see both sets of matches together, but grep doesn't do this, so I used sed instead. I entered the sed commands as a pipe and added three spaces on either side of each occurrence of the consensus sequences (both -35 and -10) in the file to make them more visible. This is done with sed "s/<pattern>/   &   /g", where <pattern> is what I wish to find, "&" stands for the text that matched, and the spaces on either side of the "&" are what I wished to add around each match (instructions found here). The pipe looked like this:

cat infA-E.coli-K12.txt | sed "s/[ct]at[at]at/   &   /g" | sed "s/tt[gt]ac[at]/   &   /g"

This made it clear that it was the first -35 box option, tttact, and the second -10 box option, cattat, that I was looking for in this gene.
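
The searches above can be tried on a small scale with the sketch below. It assumes a made-up sequence stored in a shell variable called sample (a hypothetical stand-in, not the real infA data) and uses grep's -o option, which prints only the matching text, before applying the same sed spacing.

```shell
# Hypothetical sequence for illustration only; this is NOT the real infA data.
sample="ggcatttactgcaacgggtataatccgcattatggtttacagg"

# grep -o prints each match on its own line instead of the whole input line.
minus35=$(printf '%s\n' "$sample" | grep -o "tt[gt]ac[at]")
minus10=$(printf '%s\n' "$sample" | grep -o "[ct]at[at]at")

# Same sed pipe as above: "&" is the matched text, padded with three spaces on each side.
spaced=$(printf '%s\n' "$sample" \
  | sed "s/[ct]at[at]at/   &   /g" \
  | sed "s/tt[gt]ac[at]/   &   /g")

echo "Possible -35 boxes:"
echo "$minus35"
echo "Possible -10 boxes:"
echo "$minus10"
echo "Spaced sequence:"
echo "$spaced"
```

On this made-up sequence the -35 search finds tttact and tttaca, the -10 search finds tataat and cattat, and the last line prints all four set apart by spaces within the full sequence. The two substitutions could also be passed to a single sed call with two -e options instead of two piped sed commands.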