Difference between revisions of "Week 4"

From LMU BioDB 2013

Latest revision as of 15:30, 17 September 2013

This journal entry is due on Friday, September 20, at midnight PDT. (Thursday night/Friday morning)

Shared Journal Assignment

  • Store your journal entry in the shared Class Journal Week 4 page. If this page does not exist yet, go ahead and create it (congratulations on getting in first :) )
  • Link to your journal entry from your user page.
  • Link back from the journal entry to your user page.
    • NOTE: you can easily fulfill the links part of these instructions by adding them to your template and using the template on your user page.
  • Sign your portion of the journal with the standard wiki signature shortcut (~~~~).
  • Add the "Journal Entry" and "Shared" categories to the end of the wiki page (if someone has not already done so).

Read (or Re-Read)

Reflect

After completing the readings above and activities below, answer the following questions:

  1. What was the most beneficial aspect of working collaboratively on this assignment?
  2. What was the most challenging aspect of working collaboratively on this assignment?
  3. Would it have been easier to do this work by hand/pencil? Why or why not?
  4. What strategies did you use to check the correctness of your work?

Individual Journal Assignment

  • Store this journal entry as "username Week 4" (i.e., this is the text to place between the square brackets when you link to this page).
  • Link from your user page to this Assignment page.
  • Link to your journal entry from your user page.
  • Link back from your journal entry to your user page.
  • Don't forget to add the "Journal Entry" category to the end of your wiki page.
    • Note: you can easily fulfill all of these links by adding them to your template and then using your template on your journal entry.

Connect to the Keck lab my.cs.lmu.edu server as described in the Keck Lab Workstation Guidelines page of this wiki and do the following exercises from there.

Group Assignments, Individual Submissions

To help ease the learning curve, we have grouped the class into pairs and one trio. Please sit together in the lab this week, and work with each other for at least one additional session outside the classroom. However, continue to submit on an individual basis for both the individual and shared journal entries.

  • Alina—Kevin Meilak
  • Hilda—Dillon—Katrina
  • Kevin McGee—Kurt
  • Lauren—Viktoria
  • Miles—Gabriel
  • Mitchell—Lena
  • Stephen—Tauras

Transcription and Translation “Taken to the Next Level”

This computer exercise examines gene expression at a much more detailed level than before, requiring knowledge in both the biological aspects of the process and the translation of these steps into computer text-processing equivalents.

The following sequence represents a real gene, called infA and found in E. coli K12. As you might have guessed, it’s stored as infA-E.coli-K12.txt in ~dondi/xmlpipedb/data.

ttttcaccacaagaatgaatgttttcggcacatttctccccagagtgttataattgcggtcgcagagttggttacgc
tcattaccccgctgccgataaggaatttttcgcgtcaggtaacgcccatcgtttatctcaccgctcccttatacgtt
gcgcttttggtgcggcttagccgtgtgttttcggagtaatgtgccgaacctgtttgttgcgatttagcgcgcaaatc
tttacttatttacagaacttcggcattatcttgccggttcaaattacggtagtgataccccagaggattagatggcc
aaagaagacaatattgaaatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaa
cggtcacgtggttactgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtga
ctgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgctgattgttttaccgcctgatg
ggcgaagagaaagaacgagtaaaaggtcggtttaaccggcctttttattttat

For each of the following questions pertaining to this gene, provide (a) the actual answer, and (b) the sequence of text-processing commands that calculates this answer. Specific information about how these sequences can be identified is included after the list of questions.

  1. Modify the gene sequence string so that it highlights or “tags” the special sequences within this gene, as follows (ellipses indicate bases in the sequence; note the spaces before the start tag and after the end tag):
    • -35 box of the promoter
      ... <minus35box>...</minus35box> ...
    • -10 box of the promoter
      ... <minus10box>...</minus10box> ...
    • transcription start site
      ... <tss>...</tss> ...
    • ribosome binding site
      ... <rbs>...</rbs> ...
    • start codon
      ... <start_codon>...</start_codon> ...
    • stop codon
      ... <stop_codon>...</stop_codon> ...
    • terminator
      ... <terminator>...</terminator> ...
  2. What is the exact mRNA sequence that is transcribed from this gene?
  3. What is the amino acid sequence that is translated from this mRNA?
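For questions 2 and 3, one possible pair of building blocks is sketched below on a made-up 12-base coding region, not on infA itself. It swaps every t for u and then puts a space after every three bases, in the spirit of the Week 3 reading-frame exercise. The function names to_mrna and split_codons are hypothetical, and the sketch assumes the region to be transcribed has already been isolated.

```shell
# Hypothetical helpers; the input below is an invented sequence, not infA.

# Transcribe: every t in the coding-strand text becomes u.
to_mrna() {
  sed 's/t/u/g'
}

# Enforce a reading frame: put a space after every three bases,
# then drop the trailing space left after the last codon.
split_codons() {
  sed 's/.../& /g' | sed 's/ $//'
}

echo "atggccaaataa" | to_mrna | split_codons
# prints: aug gcc aaa uaa
```

Both helpers read standard input and write standard output, so they can be chained with the tagging steps into one pipeline.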

Supplementary Information

As a sample answer for the first question, Week 2’s paper handout sequence would have been marked as follows (it had everything but the -35 box; line breaks are included only for clarity):

tctac <minus10box>tatatt</minus10box> tcaat <tss>a</tss> ttccu <rbs>aggaggt</rbs> ttgacct
<start_codon>atg</start_codon> attgaacttgaaacgttgcc <stop_codon>taa</stop_codon> taccatgttccgcgtataaccca
<terminator>gccgccagttccgctggcggcatttt</terminator> aac

Note: The commands needed to generate the output above will be similar, but not exactly the same as the ones needed for infA.

Base your commands on the following hints/guidelines about the gene, plus your own knowledge learned from the past few weeks:

  • The consensus sequence for the -10 site is [ct]at[at]at.
  • The consensus sequence for the -35 site is tt[gt]ac[at].
  • The ideal number of base pairs between the -35 and -10 box is 17, counting from the first nucleotide after the end of the -35 sequence up to the last nucleotide before the -10 sequence.
  • The transcription start site is located at the 12th nucleotide after the first nucleotide of the -10 box.
  • A consensus sequence for the ribosome binding site is gagg.
  • The first half of the terminator “hairpin” is aaaaggt, where the u in the mRNA binds with a g instead of the usual a.
  • The terminator includes 4 more nucleotides after the hairpin completes.
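The first three hints can be combined into a single sed substitution. The sketch below is a minimal illustration on an invented sequence, not the answer for infA: it assumes the two boxes are separated by exactly 17 bases, which is only the ideal spacing, so a real gene may need a different count. The function name tag_promoter and the test sequence are hypothetical.

```shell
# Hypothetical helper; the test sequence is invented for illustration.
# Tags a -35 box and a -10 box that are separated by exactly 17 bases.
# \(...\) captures a group, .\{17\} matches any 17 characters, and | is
# used as the s-command delimiter so that </...> needs no escaping.
tag_promoter() {
  sed 's|\(tt[gt]ac[at]\)\(.\{17\}\)\([ct]at[at]at\)| <minus35box>\1</minus35box> \2 <minus10box>\3</minus10box> |'
}

# Made-up input: 2 bases, a -35 box, 17 spacer bases (5+5+5+2), a -10 box, 2 bases.
seq="gg""ttgaca""ccccc""ggggg""ccccc""gg""tataat""cc"

echo "$seq" | tag_promoter
# prints: gg <minus35box>ttgaca</minus35box> cccccgggggcccccgg <minus10box>tataat</minus10box> cc
```

Requiring both boxes in one pattern, with the spacer between them, keeps sed from tagging a stray match of either consensus elsewhere in the sequence.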

Computer Tips

  • Remember that sed is line-based, and that you can add and count lines to get certain things done, say strictly before or after a certain point.
  • Don't forget how you enforced reading frames in Week 3.
  • If you do add lines or spaces to get the job done, make sure to clean up after yourself by removing them from the final answer.
  • This exercise is difficult enough that you might be thinking to yourself, “I’d rather do this by hand!” This sentiment is understandable, but when you find yourself feeling this way, consider the following:
    • Part of the difficulty is learning these things for the first time. Once you’ve gotten the hang of it, there’s no way that doing things by hand will be faster.
    • Consider trying to do this over and over, for multiple genes, with lots of potential variations. Doing this by hand not only takes longer at this point, but risks errors that a computer won’t make (once the correct commands have been determined).
  • Form your commands so that they can be strung together into a single pipeline of processing directives in the end. In other words, once you’ve figured out how to do each step, no human intervention should be needed to perform everything from beginning to end.
  • You will need the More Text Processing Features wiki page to complete this assignment. The How to Read XML Files wiki page gives you an idea of why the requested output was formatted the way it was.
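The line-based tips above can be sketched as a small pipeline. This example, on an invented sequence, tags the first atg that comes after the ribosome binding site consensus gagg while skipping an earlier atg. It does so by adding a line break, working on line 2 only, and then removing the break again. The function name tag_start_after_rbs is hypothetical, and treating the first atg after gagg as the start codon is an assumption made for illustration.

```shell
# Hypothetical helper on an invented sequence, not infA.
tag_start_after_rbs() {
  # 1. Break the line right after the first gagg (a backslash followed
  #    by a newline in the replacement inserts a line break).
  # 2. Tag the first atg on line 2 only.
  # 3. Clean up: N pulls line 2 into the pattern space and the
  #    substitution deletes the newline that step 1 added.
  sed 's/gagg/&\
/' |
    sed '2s|atg| <start_codon>&</start_codon> |' |
    sed 'N;s/\n//'
}

echo "ccatgccgaggttcatgaaa" | tag_start_after_rbs
# prints: ccatgccgaggttc <start_codon>atg</start_codon> aaa
```

Because every step reads standard input and writes standard output, the whole sequence runs from beginning to end without manual intervention, and the temporary line break is gone from the final answer.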