Ajvree Week 3
Week 3 Individual Assignment
Notes:
Shortcuts-
- cd to change directories, ls to list a directory's contents
 - up and down arrows to scroll through command history, or type history to list it; !number reruns the command with that number
 - CTRL-R for reverse search: type part of a past command to recall it
 - tab to complete a file name
 
- grep: text finder; looks for a pattern in a file: grep "ACTG" filename
 - A followed by T with six characters in between: "A......T"
 - . = "wildcard" (matches any single character)
 - beginning of line: ^ as in "^A......T"
 - end of line: $ as in "A......T$"
 - pipe the previous grep command into wc (command | wc) to count what it matched
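
A quick way to check the grep patterns above is to try them on a tiny made-up file. This is only a sketch: sample.txt and its contents are hypothetical, not part of the assignment data.

```shell
# Build a small sample file (hypothetical data, not from the assignment).
printf 'ACCGGTTT\nGGACCGGTTT\nACCGGTTTGG\nCCCC\n' > sample.txt

# A, then any six characters, then T, anywhere in the line.
grep "A......T" sample.txt

# Same pattern, but only at the beginning of a line.
grep "^A......T" sample.txt

# Same pattern, but only at the end of a line.
grep "A......T$" sample.txt

# Count the matching lines by piping grep's output into wc.
grep "A......T" sample.txt | wc -l
```

The unanchored pattern matches the first three lines, ^ keeps only the lines that start with the match, and $ keeps only the lines that end with it.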
 
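- command | command: the output of the first command becomes the input of the second
 - wc: word count
 - run wc by itself, type some lines, then press CTRL-D to end the input
 - output is # lines, # words, # characters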
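
To see the three wc numbers without typing input by hand, the text can be piped in. A minimal sketch, assuming a made-up two-line input:

```shell
# wc prints three numbers: lines, words, characters.
printf 'one two\nthree\n' | wc

# A flag restricts the output to a single count.
printf 'one two\nthree\n' | wc -l   # lines
printf 'one two\nthree\n' | wc -w   # words
printf 'one two\nthree\n' | wc -c   # bytes (same as characters for plain ASCII)
```

For this input the counts are 2 lines, 3 words, and 14 characters (the newlines count as characters).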
 
 
To use XMLPipeDB Match, enter java -jar xmlpipedb-match-1.1.1.jar FIRST, then the pattern
to give it a file, put a < sign in front of the file name
java -jar xmlpipedb-match-1.1.1.jar "A......T" < hs_ref_GRCh37_chr19.fa
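1) "What Match command..."
- 2 unique matches
- 2, 1 (times each match appears)
- what does the info represent?
2)
double quote within a double quote: "\"James.*\""
asterisk = zero or more of the preceding character
- 2 unique matches
- 2, 1 (times each match appears)
- what info?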
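
The escaped-quote pattern can be tried without the Match jar by using grep -o with sort and uniq -c as a rough stand-in for a tally. This is an assumption-laden sketch: people.txt and its contents are made up, and Match's own output format may differ.

```shell
# Hypothetical sample lines, NOT taken from 493.P_falciparum.xml.
printf '%s\n' \
  '<author name="James A"/>' \
  '<author name="James B"/>' \
  '<author name="James A"/>' \
  '<author name="James A"/>' > people.txt

# \" inside the outer double quotes passes a literal " to grep.
# -o prints only the matched text; sort | uniq -c tallies each unique match.
grep -o "\"James.*\"" people.txt | sort | uniq -c
```

On this made-up file the tally shows two unique matches, one appearing 3 times and the other once.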
Reading Frames
Write 6 sets of text processing commands that, when given a nucleotide sequence, return the resulting amino acid sequence, one for each possible reading frame for the nucleotide sequence. In other words, fill in the question marks:
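
One possible starting point, sketched with sed, rev, and tr: split the sequence into codons for each of the six frames. The sequence below is a made-up example, and the final codon-to-amino-acid step is left out because it needs a genetic code table that is not given in these notes.

```shell
seq="agcggtatac"   # hypothetical example sequence

# Frames +1, +2, +3: drop 0, 1, or 2 leading bases, then split into codons.
echo "$seq" | sed "s/.../& /g"
echo "$seq" | sed "s/^.//" | sed "s/.../& /g"
echo "$seq" | sed "s/^..//" | sed "s/.../& /g"

# Frames -1, -2, -3: reverse the sequence and complement each base first.
revcomp="$(echo "$seq" | rev | tr "acgt" "tgca")"
echo "$revcomp" | sed "s/.../& /g"
echo "$revcomp" | sed "s/^.//" | sed "s/.../& /g"
echo "$revcomp" | sed "s/^..//" | sed "s/.../& /g"
```

Each output line is one reading frame with a space after every three bases; leftover bases at the end do not form a full codon.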
XMLPipeDB Match Practice
For your convenience, the XMLPipeDB Match Utility (xmlpipedb-match-1.1.1.jar) has been installed in the ~dondi/xmlpipedb/data directory alongside the other practice files. Use this utility to answer the following questions:
   What Match command tallies the occurrences of the pattern GO:000916. in the 493.P_falciparum.xml file?
       How many unique matches are there?
       How many times does each unique match appear?
       What information do you think the pattern GO:000916. represents? 
   What Match command tallies the occurrences of the pattern \"James.*\" in the 493.P_falciparum.xml file?
       How many unique matches are there?
       How many times does each unique match appear?
       What information do you think the pattern \"James.*\" represents? 
   Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
       What answer does Match give you?
       What answer does grep/wc give you?
       Do the answers make sense? Explain your response.
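
One thing worth checking before comparing the two answers: grep works line by line, so grep piped into wc counts lines that contain the pattern, not individual occurrences. A small sketch on a made-up file (tiny.fa is hypothetical; whether Match counts exactly like grep -o is an assumption):

```shell
# Hypothetical 3-line file, not the chromosome 19 data.
printf 'ATGATG\nCCATGC\nGGGG\n' > tiny.fa

# Lines that contain ATG at least once.
grep "ATG" tiny.fa | wc -l

# Every individual ATG occurrence (-o prints each match on its own line).
grep -o "ATG" tiny.fa | wc -l
```

Here the first count is 2 and the second is 3, because the first line holds two copies of ATG. Neither command sees a pattern that is split across a line break.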