Mpetredi Week 3
From LMU BioDB 2013
Mitchell Petredis
Complement of a Strand
cat infA-E.coli-K12.txt | sed "y/atcg/tagc/"
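As a quick sanity check, the same y command can be tried on a short sequence before running it on the full file. This is only a sketch: the function name complement and the input sequence are made up for illustration and are not part of the assignment files.

```shell
# Hypothetical helper: complement a lowercase DNA sequence read from stdin.
# sed's y command maps each character in the first set to the character
# at the same position in the second set (a->t, t->a, c->g, g->c).
complement() {
  sed "y/atcg/tagc/"
}

# Made-up input, not taken from infA-E.coli-K12.txt.
echo "atgcca" | complement
# prints: tacggt
```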
Reading Frames
Positive (Forward)
- cat infA-E.coli-K12.txt | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed
- cat infA-E.coli-K12.txt | sed "s/^.//g" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed
- cat infA-E.coli-K12.txt | sed "s/^..//g" | sed "s/t/u/g" | sed "s/.../& /g" | sed -f genetic-code.sed
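The three forward frames differ only in how many leading bases are dropped, so they can be sketched as one loop. The helper name frame_codons and the short input sequence are assumptions for illustration, and cut is swapped in for the "s/^.//g" step so that the frame number can be passed as an argument. The translation step is left out here because it needs genetic-code.sed.

```shell
# Hypothetical helper: print the RNA codons of forward reading frame $1
# (1, 2, or 3) for a lowercase DNA sequence read from stdin.
frame_codons() {
  # cut -cN- keeps characters N through the end of the line,
  # so frame 1 keeps everything and frame 3 drops the first two bases.
  cut -c"$1"- | sed "s/t/u/g" | sed "s/.../& /g"
}

# Made-up input sequence, not taken from infA-E.coli-K12.txt.
for f in 1 2 3; do
  echo "atggccaagg" | frame_codons "$f"
done
# frame 1 prints: aug gcc aag g
```

Piping each line through sed -f genetic-code.sed afterwards would give the translated frames, as in the commands above.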
Negative (Reverse)
- cat infA-E.coli-K12.txt | sed "y/atcg/uagc/" | rev | sed "s/.../& /g" | sed -f genetic-code.sed
- cat infA-E.coli-K12.txt | sed "y/atcg/uagc/" | rev | sed "s/^.//g" | sed "s/.../& /g" | sed -f genetic-code.sed
- cat infA-E.coli-K12.txt | sed "y/atcg/uagc/" | rev | sed "s/^..//g" | sed "s/.../& /g" | sed -f genetic-code.sed
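The reverse frames can be sketched the same way. In this sketch the leading bases are trimmed after rev, so that the dropped bases come from the start of the reversed strand. Trimming before rev would remove bases from its end instead and leave the frame unchanged. The helper name revcomp_frame_codons and the input sequence are assumptions for illustration, and cut stands in for the sed trimming step.

```shell
# Hypothetical helper: print the RNA codons of reverse reading frame $1
# (1, 2, or 3) for a lowercase DNA sequence read from stdin.
revcomp_frame_codons() {
  # 1. complement the strand and switch to RNA in one step (a->u)
  # 2. reverse it so it reads 5' to 3'
  # 3. drop $1 - 1 leading bases to select the frame
  # 4. split into codons
  sed "y/atcg/uagc/" | rev | cut -c"$1"- | sed "s/.../& /g"
}

# Made-up input sequence, not taken from infA-E.coli-K12.txt.
for f in 1 2 3; do
  echo "atggccaagg" | revcomp_frame_codons "$f"
done
# frame 1 prints: ccu ugg cca u
```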
XMLPipeDB Match Practice
- java -jar xmlpipedb-match-1.1.1.jar "GO:000916." < 493.P_falciparum.xml
- There are 2 unique matches.
- The first unique match appears twice and the second appears once.
- java -jar xmlpipedb-match-1.1.1.jar "\"James.*\"" < 493.P_falciparum.xml
- There are 2 unique matches.
- The first match (james k.d.) occurs 8238 times, and the second (james a.a.) occurs once.
- "James" may refer to a person who sequenced all or a portion of the P. falciparum genome.
- Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
- Match ATG results in 1 unique match appearing 830101 times
- grep and wc result in 502410 lines, 502410 words, and 35671048 characters
- These answers make sense because Match counts every individual occurrence of ATG, while grep returns each line that contains ATG at least once, so wc counts lines rather than occurrences. A line containing ATG several times is counted only once by grep | wc, which is why its line count (502410) is lower than the Match count (830101).
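One way to see how a line count and an occurrence count can differ is to run both on a tiny sample. This is a sketch using made-up data, not the chromosome 19 file. The sample function is hypothetical, and grep -o is used here as a stand-in for a tool that counts each match separately. It is not a claim about how Match works internally.

```shell
# Made-up three-line sample; the second line contains ATG twice.
sample() {
  printf 'CCATGA\nATGGATG\nCCCCCC\n'
}

# Number of lines that contain ATG at least once.
sample | grep "ATG" | wc -l
# prints: 2

# Number of individual occurrences: grep -o prints each match on its own line.
sample | grep -o "ATG" | wc -l
# prints: 3
```

Neither command finds an ATG that is split across a line break, so both counts could still miss some occurrences in a multi-line FASTA file.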
Mpetredi (talk) 21:53, 12 September 2013 (PDT)Mitchell Petredis