Ajvree Week 3

From LMU BioDB 2013
Revision as of 03:51, 13 September 2013 by Ajvree (Talk | contribs)


Week 3 Individual Assignment

Notes:

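sed review
& = "repeat what you found": in the replacement half of s/pattern/replacement/, & stands for the text the pattern matched, e.g. a replacement of /Wisconsin is still better than &/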
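A minimal sketch of & in action. The input word Minnesota is a made-up example, not something from the assignment:

```shell
# Hypothetical input line; & in the replacement repeats whatever the pattern matched.
result=$(printf 'Minnesota\n' | sed "s/Minnesota/Wisconsin is still better than &/")
echo "$result"
```

Because & is filled in by sed, the same replacement works for any pattern without retyping the matched text.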

Shortcuts-

  • cd to change directories, ls to list the files in a directory
  • up and down arrows to scroll through command history, or type history to list it; !number reruns that command
  • CTRL R for reverse search: type part of a past command to recall it
  • tab to complete a file name
  • grep: text finder, looks for a pattern in a file: grep "ACTG" filename
  • grep is case sensitive
  • A followed by T with multiple things in between:
  • . = "wildcard" (any one character): "A......T"
  • indicate beginning of line: ^ "^A......T"
  • end of line: $ "A......T$"
  • pipe the previous command into wc (previous command | wc) to count the lines, words, and characters of its output
  • general form: command | command
  • wc: word count
  • with no file name, wc reads what you type: enter lines, then CTRL D
    1. output: # lines, # words, # characters
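The grep patterns and the pipe into wc from the list above can be tried on a small practice file. This is only a sketch: the four sequences are made up, and the -i flag (ignore case) is an addition that is not in the notes.

```shell
# Throwaway practice file; these four sequences are made up for illustration.
seqs=$(mktemp)
printf 'ACCGGTTT\nGGACCCCCCTG\nacggtttt\nTTTACGGGGGT\n' > "$seqs"

# A, any six characters, then T, anywhere on the line (case sensitive).
anywhere=$(grep "A......T" "$seqs" | wc -l | tr -d ' ')

# Same pattern, but only at the beginning of a line.
at_start=$(grep "^A......T" "$seqs" | wc -l | tr -d ' ')

# Same pattern, but only at the end of a line.
at_end=$(grep "A......T$" "$seqs" | wc -l | tr -d ' ')

# -i turns off case sensitivity, so the lowercase line is found too.
any_case=$(grep -i "A......T" "$seqs" | wc -l | tr -d ' ')

echo "anywhere=$anywhere start=$at_start end=$at_end any_case=$any_case"
rm -f "$seqs"
```

The tr -d ' ' step only strips the padding some versions of wc print in front of the number.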

To use XMLPipeDB Match, enter java -jar xmlpipedb-match-1.1.1.jar FIRST, then the pattern in quotes; to give the file, insert a < sign in front of the file name
java -jar xmlpipedb-match-1.1.1.jar "A......T" < hs_ref_GRCh37_chr19.fa

1) "What Match command..."
-2 unique matches
-2,1
-what does info represent?

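2) double quote w/in a double quote: put a backslash in front of each inner quote, "\"James.*\""; asterisk = zero or more of whatever comes before it, so .* = any run of characters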
-unique 2
-2,1
-what info?

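Reading frames
-break into triplets: sed "s/.../& /g"
-change t to u: sed "s/t/u/g"
-convert into genetic code: sed -f genetic-code.sed (USE -f to read the sed commands from a file); the file holds lines like s/cgu/R/g and s/aug/M/g
-drop between 0-2 characters from the start to shift the frame: s/^.//g drops one
-3' to 5' strand: reverse the sequence with rev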



Reading Frames

Write 6 sets of text processing commands that, when given a nucleotide sequence, return the resulting amino acid sequence, one for each possible reading frame for the nucleotide sequence. In other words, fill in the question marks:

+1:
cat sequence_file | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
+2:
cat sequence_file | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
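A hedged sketch of how the frame shifting could look for the remaining frames. The sequence and the helper name codons are made up for illustration, and the final sed -f genetic-code.sed step is left off because that file is not reproduced on this page. For the reverse strand, the sketch assumes the sequence has to be complemented as well as reversed; the complement step (sed "y/atcg/tagc/") is an assumption on top of the rev note above.

```shell
# Hypothetical lowercase DNA sequence (not from the assignment).
seq="agcggtatac"

# Hypothetical helper: split stdin into triplets, then change t to u.
codons() {
  sed "s/.../& /g" | sed "s/t/u/g"
}

# Forward frames: drop 0, 1, or 2 characters from the start before splitting.
frame1=$(printf '%s\n' "$seq" | codons)
frame2=$(printf '%s\n' "$seq" | sed "s/^.//" | codons)
frame3=$(printf '%s\n' "$seq" | sed "s/^..//" | codons)

# Reverse strand (assumption): reverse the text, then swap each base for its complement.
revcomp=$(printf '%s\n' "$seq" | rev | sed "y/atcg/tagc/")
frame_m1=$(printf '%s\n' "$revcomp" | codons)

printf '%s\n' "+1: $frame1" "+2: $frame2" "+3: $frame3" "-1: $frame_m1"
```

Frames -2 and -3 would follow the same pattern as +2 and +3, applied to the reversed and complemented sequence. Any one or two leftover characters at the end stay untranslated.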

XMLPipeDB Match Practice

For your convenience, the XMLPipeDB Match Utility (xmlpipedb-match-1.1.1.jar) has been installed in the ~dondi/xmlpipedb/data directory alongside the other practice files. Use this utility to answer the following questions:

1. What Match command tallies the occurrences of the pattern GO:000916. in the 493.P_falciparum.xml file?
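java -jar xmlpipedb-match-1.1.1.jar "GO:000916." < 493.P_falciparum.xml
The java -jar command runs XMLPipeDB Match, which tallies the occurrences of the pattern in the file given after the < sign.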
How many unique matches are there?
-2
How many times does each unique match appear?
-2,1
What information do you think the pattern GO:000916. represents?
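I'm not entirely sure, but it looks like a type of identification tag for a protein, most likely a Gene Ontology (GO) term ID; the . at the end of the pattern lets the last character be anything.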
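As a cross-check on a tally like this, a similar count can be sketched with grep, sort, and uniq instead of Match. This is a swapped-in technique, not the Match utility itself, and the sample lines below are made up rather than taken from 493.P_falciparum.xml.

```shell
# Hypothetical stand-in for the XML file; these lines are made up for illustration.
sample=$(mktemp)
printf '%s\n' \
  '<dbReference type="GO" id="GO:0009165"/>' \
  '<dbReference type="GO" id="GO:0009166"/>' \
  '<dbReference type="GO" id="GO:0009165"/>' \
  '<dbReference type="GO" id="GO:0016021"/>' > "$sample"

# -o prints each match on its own line; sort | uniq -c counts identical matches.
tally=$(grep -o "GO:000916." "$sample" | sort | uniq -c)
echo "$tally"

# Number of distinct matches.
unique_count=$(grep -o "GO:000916." "$sample" | sort -u | wc -l | tr -d ' ')
echo "unique matches: $unique_count"
rm -f "$sample"
```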

2. What Match command tallies the occurrences of the pattern \"James.*\" in the 493.P_falciparum.xml file?
How many unique matches are there?
-2
How many times does each unique match appear?
-2,1
What information do you think the pattern \"James.*\" represents?

3. Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
What answer does Match give you?
What answer does grep/wc give you?
Do the answers make sense? Explain your response.
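One thing worth checking when comparing the two answers: grep prints whole lines, so grep piped into wc -l counts lines that contain ATG, not individual ATGs. The sketch below shows the gap on a made-up three-line file; it assumes Match counts every occurrence, and the -o flag is an addition that is not in the notes.

```shell
# Made-up sequence lines; the first line contains ATG twice.
fasta=$(mktemp)
printf 'ATGCCATGAA\nCCCCATGTTT\nGGGGGGGGGG\n' > "$fasta"

# Lines containing ATG at least once.
line_count=$(grep "ATG" "$fasta" | wc -l | tr -d ' ')

# Every occurrence of ATG: -o prints each match on its own line.
match_count=$(grep -o "ATG" "$fasta" | wc -l | tr -d ' ')

echo "lines with ATG: $line_count"
echo "ATG occurrences: $match_count"
rm -f "$fasta"
```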

Ajvree (talk) 08:48, 12 September 2013 (PDT)
User Page
