Difference between revisions of "HDelgadi Week 3"
[[User:HDelgadi|HDelgadi]] ([[User talk:HDelgadi|talk]]) 23:46, 12 September 2013 (PDT)
Latest revision as of 06:46, 13 September 2013
Complementary Strand
cat seq_file | sed "y/tagc/atcg/"
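As a small illustration of the command above, here is the same `sed` transliteration wrapped in a function and run on a made-up five-base sequence. The function name `complement` and the sample sequence are assumptions for this sketch. It expects lowercase bases, as the original command does.

```shell
# Complement each base of a lowercase DNA sequence read from stdin.
# y/atgc/tacg/ maps a->t, t->a, g->c, c->g one character at a time.
complement() {
  sed "y/atgc/tacg/"
}

# Hypothetical sample input, not taken from the assignment data.
printf 'atgcc\n' | complement
```

The output is the complementary strand in the same left-to-right order as the input; piping the input through `rev` first gives the reverse complement used for the negative reading frames.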
6 Different Reading Frames
+1 Reading Frame
- cat sequence_file | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
+2 Reading Frame
- cat sequence_file | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
+3 Reading Frame
- cat sequence_file | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
-1 Reading Frame
- rev sequence_file | sed "y/atgc/tacg/" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
-2 Reading Frame
- rev sequence_file | sed "y/atgc/tacg/" | sed "s/^.//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
-3 Reading Frame
- rev sequence_file | sed "y/atgc/tacg/" | sed "s/^..//g" | sed "s/.../& /g" | sed "s/t/u/g" | sed -f genetic-code.sed
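The six commands above differ only in the strand and in how many leading bases are dropped, so they can be sketched as one helper called in a loop. The function name `codons`, its two arguments, and the sample sequence are assumptions for illustration. The sketch uses `cut -c` in place of the `sed "s/^.//g"` step and stops before the `sed -f genetic-code.sed` translation, since that file is not reproduced here.

```shell
# Split one reading frame into space-separated RNA codons.
# $1 = strand (+ or -), $2 = number of leading bases to skip (0, 1 or 2).
# The lowercase DNA sequence is read from stdin.
codons() {
  strand=$1
  skip=$2
  if [ "$strand" = "-" ]; then
    rev | sed "y/atgc/tacg/"      # reverse complement for the minus strand
  else
    cat                           # plus strand is used as given
  fi |
    cut -c "$((skip + 1))-" |     # drop the first $skip bases
    sed "s/.../& /g" |            # put a space after every triplet
    sed "s/t/u/g"                 # DNA -> RNA
}

# Hypothetical nine-base sequence, printed in all six frames.
for strand in + -; do
  for skip in 0 1 2; do
    printf '%s%d: ' "$strand" "$((skip + 1))"
    printf 'atggcctaa\n' | codons "$strand" "$skip"
  done
done
```

Appending `| sed -f genetic-code.sed` to each call would then turn the codons into amino acids, as in the original commands.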
XMLPipeDB Match Practice
- What Match command tallies the occurrences of the pattern GO:000916. in the 493.P_falciparum.xml file?
The command: java -jar xmlpipedb-match-1.1.1.jar "GO:000916." < 493.P_falciparum.xml
- How many unique matches are there?
There are two unique matches.
- How many times does each unique match appear?
go:0009165 appears twice and go:0009168 appears once.
- What information do you think the pattern GO:000916. represents?
This pattern most likely matches Gene Ontology (GO) term identifiers, which annotate the proteins of the organism Plasmodium falciparum with their biological roles.
- What Match command tallies the occurrences of the pattern \"James.*\" in the 493.P_falciparum.xml file?
The command: java -jar xmlpipedb-match-1.1.1.jar "\"James.*\"" < 493.P_falciparum.xml
- How many unique matches are there?
There are two unique matches.
- How many times does each unique match appear?
"james k.d." appears 8238 times and "james a.a." appears just once.
- What information do you think the pattern \"James.*\" represents?
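This pattern most likely matches quoted names beginning with "James", such as the names of people credited in the protein records, rather than the proteins themselves.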
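For comparison, a tally of unique matches like the ones reported above can be approximated with standard tools. This is only a sketch and not how Match itself works: the function name `tally` and the sample file contents are assumptions, and the lowercasing step is there only because the matches above are reported in lowercase.

```shell
# Tally each distinct match of a pattern, ignoring case.
# $1 = pattern (basic regular expression), $2 = file to search
tally() {
  grep -io "$1" "$2" |   # print every match on its own line
    tr 'A-Z' 'a-z' |     # fold case so identical matches group together
    sort |
    uniq -c              # count each distinct match
}

# Hypothetical two-line sample, not the real XML file.
sample=$(mktemp)
printf 'GO:0009165 x GO:0009168\nGO:0009165\n' > "$sample"
tally "GO:000916." "$sample"
```

Each output line is a count followed by the matched text, one line per unique match.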
- Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing. What answer does Match give you?
Match gives me 830101 for the occurrences of the pattern ATG in the file.
- What answer does grep/wc give you?
Grep/wc gave me 502410 (lines), 502410 (words), and 35671048 (characters).
- Do the answers make sense? Explain your response.
The answers make sense. grep prints each line that contains ATG only once, no matter how many times the pattern occurs on that line, so wc counts matching lines rather than individual occurrences. Match counts every occurrence, so its total (830101) is expected to be higher than the grep/wc count (502410).
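The difference between counting matching lines and counting occurrences can be reproduced on a tiny made-up file. The file contents and variable names below are assumptions for illustration; `grep -o` prints each match on its own line, so piping it to `wc -l` counts occurrences instead of lines.

```shell
# Hypothetical three-line sample: two ATGs on line 1, one on line 2, none on line 3.
sample=$(mktemp)
printf 'ATGATG\nCCATGC\nTTTTTT\n' > "$sample"

# Number of lines containing ATG at least once.
lines=$(grep -c "ATG" "$sample")

# Number of individual ATG occurrences.
hits=$(grep -o "ATG" "$sample" | wc -l | tr -d ' ')

echo "matching lines: $lines, occurrences: $hits"
```

Here the first count is 2 and the second is 3, the same kind of gap seen between the grep/wc and Match results above.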