Difference between revisions of "Troque Week 11"

From LMU BioDB 2015
m (Preparation for Journal Club on Your Species: Added another phrase to part 1)
(Preparation for Journal Club on Your Species: Finished methods)
Line 23: Line 23:
 
#** ''' '''
 
#** ''' '''
 
#* What were the methods used in the study?
 
#* What were the methods used in the study?
#** ''' ''Shigella flexneri'' strain 301 (abbreviated Sf301), which we sequenced, was isolated from a patient with severe acute clinical manifestations of shigellosis in the Changping District, Beijing, in 1984, and has since been used as a reference strain for ''S. flexneri'' in China. The strain was routinely grown at 37°C overnight on tryptic soy agar containing 0.01% Congo red. Red colonies were inoculated into tryptic soy broth and grown to stationary phase at 37°C for isolating plasmid and chromosomal DNAs. The plasmid and the chromosomal libraries were separately constructed using pBluescript KS(–) (Stratagene) as vectors. Approximately 48 000 clones were sequenced from both ends using the big‐dye kit (ABI) and ABI377 or ABI3700 automated sequencers, giving rise to 10 times coverage of the genome. Sequences were assembled initially using the phred/phrap program (12) when the sequence coverage was ∼4‐fold over the estimated size of the genome. The program was run with optimized parameters and the quality score was set to ≥20. Further assembly was carried out repeatedly using the same program when more sequences were obtained. When 100 500 sequences were assembled into 318 contigs, the Consed program was used for sequence finishing (13). Gaps among contigs were closed either by primer walking on selected clones, which were identified by analysis on the forward and the reversed links between contigs using the perl/Tk algorithm, or by sequencing the DNA amplicons generated by polymerase chain reaction (PCR). Glimmer 2.0, a program that searches for protein coding regions, was used to identify those ORFs possessing more than 30 consecutive codons (14). Overlapping and closely clustered ORFs were manually inspected. Predicted polypeptide sequences were used to search the non‐redundant protein database with BLASTP, and the clusters of orthologous groups of proteins (COGs) database was used to identify families to which predicted proteins are related (15). 
Mobile elements and repetitive sequences were identified using pair‐wise comparison. tRNA sequences were identified by the program tRNAscan‐SE (16). Repetitive regions were defined as those that have at least 200 bp with the significance of e–10 by BLASTN against the Sf301 genome itself and known IS databases. Sequence annotation and graphs of the circular and linear genomic maps were prepared using a newly developed Perl‐Script tool kit (available at ftp://ftp.chgb.org.cn/pub/). Whole genomic comparison with E. coli K12 MG1655 (accession no. U00096) and O157 EDL933 (accession no. AE005174) was performed using the GenomeComp program (J.Yang, J.Wang, Q.Jin, Y.Shen, Z.Yao and R.Chen, manuscript in preparation). The accession numbers for Sf301 chromosome and plasmid pCP301 are AE005674 and AF386526, respectively, in GenBank.'''
+
#** ''' The research group's methods relied largely on automated genome-sequencing software for base-calling and assembly, for identifying open reading frames (ORFs), and for comparing the genome of the ''Shigella flexneri'' strain under study with related genomes. This particular strain, Sf301, was originally isolated from a patient with an acute case of shigellosis in the Changping District of Beijing in 1984. The strain was grown overnight at 37 degrees Celsius on tryptic soy agar containing 0.01% Congo red dye, and red colonies were then grown in tryptic soy broth to provide plasmid and chromosomal DNA. Shotgun sequencing, which involves randomly breaking DNA into small pieces, sequencing them, and then reassembling them by looking at overlapping regions, relied on the ''phred''/''phrap'' software for base-calling and assembly with a quality score cutoff of at least 20, which reduced the need for manual handling of the sequence reads. Once the reads had been assembled into 318 contigs, a program called ''consed'' was used for sequence finishing. Open reading frames longer than 30 codons were identified with the Glimmer 2.0 software, but overlapping and closely clustered ORFs were still inspected manually. BLASTP searches against the non-redundant protein database and the COGs database were used to identify families of related proteins. Whole-genome comparison with ''E. coli'' K12 MG1655 and O157 EDL933 was then carried out using the GenomeComp software. The resulting sequences are available in GenBank under accession numbers AE005674 (chromosome) and AF386526 (plasmid pCP301).'''
 
#* Briefly state the result shown in each of the figures and tables.
 
#* Briefly state the result shown in each of the figures and tables.
 
#** ''' '''
 
#** ''' '''

Revision as of 07:02, 13 November 2015



Preparation for Journal Club on Your Species

Your team will split into two halves for journal club presentations that will take place in class on Tuesday, November 17 and Tuesday, November 24. The Coder and Quality Assurance person will present the genome paper for your species and the GenMAPP Users will present the microarray paper for your species. You will decide within your team who will present on which day. Please edit the schedule on the Main Page to show who is presenting on which day.

In preparation for your journal club presentation, you will each individually complete the following assignment on your individual journal page.

  1. Make a list of at least 10 biological terms for which you did not know the definitions when you first read the article. Define each of the terms. You can use the glossary in any molecular biology, cell biology, or genetics textbook as a source for definitions, or you can use one of many available online biological dictionaries. Cite your sources for the definitions by providing the proper citation (for a book) or the URL to the page with the definition for online sources. Each definition must have its own URL citation.
  2. Write an outline of the article. The length should be a minimum of the equivalent of 2 pages of standard 8 1/2 by 11 inch paper (you can use the "Print Preview" option in your browser to see the length). Your outline can be in any form you choose, but you should utilize the wiki syntax of headers and either numbered or bulleted lists to create it. The text of the outline does not have to be complete sentences, but it should answer the questions listed below and have enough information so that others can follow it. However, your outline should be in YOUR OWN WORDS, not copied straight from the article.
    • What is the importance or significance of this work (i.e., your species)?
    • What were the methods used in the study?
      • The research group's methods relied largely on automated genome-sequencing software for base-calling and assembly, for identifying open reading frames (ORFs), and for comparing the genome of the Shigella flexneri strain under study with related genomes. This particular strain, Sf301, was originally isolated from a patient with an acute case of shigellosis in the Changping District of Beijing in 1984. The strain was grown overnight at 37 degrees Celsius on tryptic soy agar containing 0.01% Congo red dye, and red colonies were then grown in tryptic soy broth to provide plasmid and chromosomal DNA. Shotgun sequencing, which involves randomly breaking DNA into small pieces, sequencing them, and then reassembling them by looking at overlapping regions, relied on the phred/phrap software for base-calling and assembly with a quality score cutoff of at least 20, which reduced the need for manual handling of the sequence reads. Once the reads had been assembled into 318 contigs, a program called consed was used for sequence finishing. Open reading frames longer than 30 codons were identified with the Glimmer 2.0 software, but overlapping and closely clustered ORFs were still inspected manually. BLASTP searches against the non-redundant protein database and the COGs database were used to identify families of related proteins. Whole-genome comparison with E. coli K12 MG1655 and O157 EDL933 was then carried out using the GenomeComp software. The resulting sequences are available in GenBank under accession numbers AE005674 (chromosome) and AF386526 (plasmid pCP301).
    • Briefly state the result shown in each of the figures and tables.
    • How do the results of this study compare to the results of previous studies (see Discussion)?
    • For the genome paper (Coder and QA only): in addition to the journal article, please find and review the Model Organism Database (MOD) for your species similarly to what you did to review your assigned database for the NAR assignment. In particular, make sure to answer the following:
      • To find our database, we first searched for our model organism in UniProt. I typed the phrase "shigella flexneri 2a 301" into the search bar at the top, since this is the organism we are studying. Once the results appeared, I copied one of the gene names to the clipboard, googled "shigella flexneri genome database", and pasted the gene name into some of the databases returned by the Google search. Some of the viable databases that I found can be located here and here.
      1. What types of data can be found in the database (sequence, structures, annotations, etc.)? Is it a primary or “meta” database? Is it curated electronically, manually [in-house], or manually [community]?
      2. What individual or organization maintains the database?
      3. What is their funding source(s)?
      4. Is there a license agreement or any restrictions on access to the database?
      5. How often is the database updated?
      6. Are there links to other databases?
      7. Can the information be downloaded?
        • In what file formats?
      8. Evaluate the “user-friendliness” of the database.
        • Is the Web site well-organized?
        • Does it have a help section or tutorial?
        • Run a sample query. Do the results make sense?
      9. What is the format (regular expression) of the main type of gene ID for this species (the "ordered locus name" ID)? (for example, for Vibrio cholerae it was VC#### or VC_####).
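
The Vibrio cholerae example in the last question (VC#### or VC_####) can be written as a regular expression. The following is a minimal Python sketch, assuming that each "#" stands for exactly one decimal digit and that the prefix is uppercase; the function name is_vc_locus is made up for illustration, and the pattern for Shigella flexneri still has to be worked out from the database itself.

```python
import re

# Assumption: each "#" in the assignment's example is one decimal digit,
# so VC#### / VC_#### means "VC", an optional underscore, then four digits.
VC_LOCUS = re.compile(r"VC_?\d{4}")


def is_vc_locus(gene_id: str) -> bool:
    """Return True if gene_id has the form VC#### or VC_####."""
    # fullmatch requires the whole string to fit the pattern,
    # so extra letters or missing digits are rejected.
    return VC_LOCUS.fullmatch(gene_id) is not None


if __name__ == "__main__":
    for candidate in ["VC0001", "VC_0001", "VC001", "VCA0001"]:
        print(candidate, is_vc_locus(candidate))
```

Using fullmatch rather than search keeps IDs with extra characters, such as a longer prefix or a fifth digit, from being counted as matches.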
