Difference between revisions of "Week 3"
Revision as of 23:20, 6 September 2015
This journal entry is due on Tuesday, September 22, at midnight PDT. (Thursday night/Friday morning)
Overview
The purpose of this assignment is:
- To give you some hands-on practice time with a command-line interface.
- To show you an example of how a manual task can be automated.
- To reinforce the material from the previous week.
These readings/resources will be of direct help in completing the assignment:
- Where's my Stuff?
- Introduction to the Command Line
- Using the XMLPipeDB Match Utility
- http://explainshell.com/: type in a command and it displays a visual explanation of what the command does (to the best of its ability)
Individual Journal Assignment
- Store this journal entry as "username Week 3" (i.e., this is the text to place between the square brackets when you link to this page).
- Link from your user page to this Assignment page.
- Link to your journal entry from your user page.
- Link back from your journal entry to your user page.
- Don't forget to add the "Journal Entry" category to the end of your wiki page.
- Note: you can easily fulfill all of these links by adding them to your template and then using your template on your journal entry.
- For your assignment this week, you will keep an electronic laboratory notebook on your individual wiki page. An electronic laboratory notebook records all the manipulations you perform on the data and the answers to the questions throughout the protocol. Like a paper lab notebook found in a wet lab, it should contain enough information so that you or someone else could reproduce what you did using only the information from the notebook.
Homework Partners
Not yet assigned.
The Genetic Code, by Computer
Connect to the my.cs.lmu.edu workstation as shown in class and do the following exercises from there.
For these exercises, two files are available in the Keck lab system for practice; of course, you can always make your own sequences up. The practice files are ~dondi/xmlpipedb/data/prokaryote.txt and ~dondi/xmlpipedb/data/infA-E.coli-K12.txt.
Complement of a Strand
Write a sequence of piped text processing commands that, when given a nucleotide sequence, returns its complementary strand. In other words, fill in the question marks:
cat sequence_file | ?????
For example, if sequence_file contains:
agcggtatac
Then your text processing commands should display:
tcgccatatg
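The general shape of such a pipeline can be sketched with tr, which substitutes characters one for one. The sketch below is only an illustration, not the complement itself: it rewrites t as u, assuming the sequence is supplied through a hypothetical shell variable rather than sequence_file.

```shell
# Hypothetical input: the example sequence from this assignment,
# held in a variable instead of sequence_file.
sequence="agcggtatac"

# tr maps each character in the first set to the character at the same
# position in the second set. Here only t is rewritten (to u).
rna=$(echo "$sequence" | tr t u)
echo "$rna"
```

Listing more characters in each set extends the same mapping to more than one letter at a time.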
Reading Frames
Write 6 sets of text processing commands that, when given a nucleotide sequence, return the resulting amino acid sequence, one set for each possible reading frame of the nucleotide sequence. In other words, fill in the question marks:
cat sequence_file | ?????
You should have 6 different sets of commands, one for each possible reading frame. For example, if sequence_file contains:
agcggtatac
Then your text processing commands for 5’-3’ frame 1 should display:
SGI
Your text processing commands for 5’-3’ frame 3 should display:
RY
...and so on.
- Hint 1: The 6 sets of commands are very similar to each other.
- Hint 2: Under the ~dondi/xmlpipedb/data directory in the Keck lab, you will find a file called genetic-code.sed. To save you some typing, this file has already been prepared with the correct sequence of sed commands for converting any base triplets into the corresponding amino acid. For example, this line in that file:
s/ugc/C/g
...corresponds to a uracil-guanine-cytosine codon being translated into the cysteine amino acid (C). The trick is to figure out how to use this file to your advantage in the commands that you'll be forming.
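As a rough illustration of how a file of sed commands can be applied in a pipeline, the sketch below builds a tiny stand-in script and runs it with sed -f. The two-rule file and the pre-spaced input are assumptions made for the example; the real genetic-code.sed is much longer, and getting a sequence into triplets is left to you.

```shell
# Hypothetical two-rule script standing in for genetic-code.sed.
rules=$(mktemp)
printf 's/ugc/C/g\ns/aug/M/g\n' > "$rules"

# sed -f reads its editing commands from the named file and applies
# each one, in order, to every input line.
protein=$(echo "aug ugc" | sed -f "$rules")
echo "$protein"

# Clean up the temporary script file.
rm -f "$rules"
```

Note that the rules are written with u rather than t, so the input has to be in that form before the rules can match.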
Check Your Work
Fortunately, online tools are available for checking your work; we recommend the ExPASy Translate Tool, sponsored by the same people who run SwissProt. You’re free to use this tool to see if your text processing commands produce the same results.
XMLPipeDB Match Practice
For your convenience, the XMLPipeDB Match Utility (xmlpipedb-match-1.1.1.jar) has been installed in the ~dondi/xmlpipedb/data directory alongside the other practice files. Use this utility to answer the following questions:
- What Match command tallies the occurrences of the pattern GO:000[567] in the 493.P_falciparum.xml file?
  - How many unique matches are there?
  - How many times does each unique match appear?
  - Try to find one such occurrence “in situ” within that file. Look at the neighboring content around that occurrence.
  - Describe how you did this.
  - Based on where you find this occurrence, what kind of information does this pattern represent?
- What Match command tallies the occurrences of the pattern \"Yu.*\" in the 493.P_falciparum.xml file?
  - How many unique matches are there?
  - How many times does each unique match appear?
  - What information do you think this pattern represents?
- Use Match to count the occurrences of the pattern ATG in the hs_ref_GRCh37_chr19.fa file (this may take a while). Then, use grep and wc to do the same thing.
  - What answer does Match give you?
  - What answer does grep + wc give you?
  - Explain why the counts are different. (Hint: Make sure you understand what exactly is being counted by each approach.)
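One way to examine what a grep and wc pipeline counts is to run it on a tiny input whose contents you already know. The sketch below uses a made-up three-line sample instead of the chromosome file, and it assumes a grep that supports the -o option (GNU and BSD grep both do).

```shell
# Hypothetical three-line input; the second line contains ATG twice.
sample=$(printf 'ATGCC\nATGATG\nCCGTA\n')

# grep prints each matching line once, so wc -l counts matching lines.
# tr -d ' ' strips the padding some wc implementations add.
line_count=$(printf '%s\n' "$sample" | grep ATG | wc -l | tr -d ' ')

# grep -o prints every match on its own line, so wc -l counts matches.
match_count=$(printf '%s\n' "$sample" | grep -o ATG | wc -l | tr -d ' ')

echo "lines: $line_count, matches: $match_count"
```

On this sample the two pipelines report 2 and 3 respectively. Trying the same kind of small, known input with Match is a quick way to see which of these it corresponds to, if either.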
Shared Journal Assignment
- Store your journal entry in the shared Class Journal Week 3 page. If this page does not exist yet, go ahead and create it (congratulations on getting in first :) )
- Link to your journal entry from your user page.
- Link back from the journal entry to your user page.
- NOTE: you can easily fulfill the links part of these instructions by adding them to your template and using the template on your user page.
- Sign your portion of the journal with the standard wiki signature shortcut (~~~~).
- Add the "Journal Entry" and "Shared" categories to the end of the wiki page (if someone has not already done so).
Read
- Ford, Paul. “What is Code?” Business Week, June 11, 2015. This is a long article—but quite worthwhile. If you can read it in one sitting, go right ahead; we will focus on specific parts at various points in the semester.
This week focuses on the first two sections of this article: “The Man in the Taupe Blazer” and “Let’s Begin.”
Reflect
- Pull out a quote from the first two sections of “What is Code?” that you think directly relates to what you experienced in the individual portion of this assignment. Explain why this quote is particularly resonant for you.
- What are your thoughts on gender issues in computer science? How different/similar do you think the situation is in biology? Feel free to speak from a particular lens (biology major, computer science major, LMU student, etc.).
- What do you think you need in order to grow more comfortable, confident, and effective with the command line?