GENialOMICS

From LMU BioDB 2015
Revision as of 06:02, 8 December 2015 by Kwyllie (Talk | contribs) (Another syntax fix.)





Week 10


Week 11

Individual Goals and Progress

Weekly Goals and Progress
Anu Varshneya Brandon Litvak Veronica Pacheco Kevin Wyllie
Goals
  • Complete journal club individual assignment
  • Create and practice journal club presentation with Brandon
  • Find a MOD
  • Create project timeline with soft deadlines for each person/milestone
  • Complete milestone 0: Working Environment Setup
  • Complete milestone 1: Version Control Setup
  • Begin milestone 2: “Developer Rig” Setup and Initial As-Is Build
  • Reformat Home Page with Dr. Dahlquist's recommendations
  • Perform an initial import/export cycle (with Anu)
  • Figure out a file management system (with Anu)
  • Characterize regular expression patterns for ID detection
  • Further explore the found MOD and review it
  • Complete Journal Club presentation on the genome paper

Work with Kevin Wyllie

  • Understand experimental design
  • Understand sample-data relationship
    • raw.zip and .sdrf
    • Construct sample-data diagram
  • Develop compiled raw data file

Work with Veronica Pacheco

  • Understand experimental design
  • Understand sample-data relationship
    • raw.zip and .sdrf
    • Construct sample-data diagram
  • Develop compiled raw data file
Progress
  • Found a possible model organism database (with Brandon Litvak)
  • The journal club presentation and outline were prepared; the MOD was examined and reviewed
  • Completed journal club individual assignment
  • Created and practiced journal club presentation with Brandon
  • Found a MOD
  • Created project timeline with soft deadlines for each person/milestone
  • Completed milestone 0: Working Environment Setup
  • Reformatted Home Page with Dr. Dahlquist's recommendations
  • Journal Club Presentation File: Genome Presentation
  • Found a possible model organism database.
    • The more recent version of the MOD was found
  • The regular expression patterns for J2315 were determined
  • Preparation was done for the Genome Paper Presentation
    • Completed outline of the genome paper
  • File management system was determined
  • Understood the experimental design
  • Made chart for microarray experiment
  • Finished PowerPoint on microarray paper
  • Vpachec3 Week 11
Created methods diagram (media:KWVPMethoddiagram.jpg).
  • File:B. Cenopacia.pptx
  • Understood the experimental design
  • Made chart for microarray experiment
  • Finished PowerPoint on microarray paper
Individual Journal Pages

File Management System

  • Files used in weekly projects will be renamed as follows: XXXX_GEN_(Initials)(Week#)_yyyymmdd, where "XXXX" is the original filename. If multiple versions of the same file (with identical filenames) are used on the same day, a positive integer in parentheses, starting from 1, will be appended to each additional version (e.g., XXXX_GEN_BL11_yyyymmdd, XXXX_GEN_BL11_yyyymmdd(1), and XXXX_GEN_BL11_yyyymmdd(2) for three versions of the same file uploaded by Brandon Litvak during Week 11).
  • Files will be uploaded to the weekly progress table in a file row with a clear label, under the group member who created or used them.
  • All original unmodified files will be saved and will also be uploaded, together, as a compressed zip with the filename: ORIG_GEN_(Initials)(Week#); the compressed zip containing all original files will be the last entry in the row designated for the files, with the label "ORIGINAL FILES"
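
The naming rules above can be sketched in a few lines of Python. This is only an illustration, not a tool the group used: the function names `gen_filename` and `orig_zip_name` are hypothetical, and the choices to keep the file extension at the very end of the new name and to give the archive a `.zip` extension are assumptions, since the rules do not say where extensions go.

```python
from datetime import date
from pathlib import Path


def gen_filename(original: str, initials: str, week: int, day: date, version: int = 0) -> str:
    """Build XXXX_GEN_(Initials)(Week#)_yyyymmdd, adding "(n)" for same-day repeats."""
    if version < 0:
        raise ValueError("version must be 0 (first copy) or a positive integer")
    path = Path(original)
    # Assumption: the original extension is moved to the end of the new name.
    name = f"{path.stem}_GEN_{initials}{week}_{day:%Y%m%d}"
    if version > 0:
        # Additional versions on the same day get (1), (2), ...
        name += f"({version})"
    return name + path.suffix


def orig_zip_name(initials: str, week: int) -> str:
    """Name of the archive that holds all original, unmodified files."""
    # Assumption: the archive carries a .zip extension.
    return f"ORIG_GEN_{initials}{week}.zip"


if __name__ == "__main__":
    day = date(2015, 11, 10)
    for version in range(3):
        print(gen_filename("raw_data.xlsx", "BL", 11, day, version))
    print(orig_zip_name("BL", 11))
```

Run as a script, this prints the three versioned names for a hypothetical file uploaded by Brandon Litvak in Week 11, followed by the name of the matching originals archive.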

Other Progress

Journal Club Presentations

Genome Paper Presentation Week 11
Microarray Paper Presentation Week 12

Week 12

Individual Goals and Progress

Weekly Goals and Progress
Anu Varshneya Brandon Litvak Veronica Pacheco Kevin Wyllie
Goals
  • Complete milestone 0: Working Environment Setup
    • Set up the development machine (my laptop) with all required software for coding.
  • Complete milestone 1: Version Control Setup
    • Set up a branch specific to our project and clone necessary code from GitHub onto the development machine.
  • Complete milestone 2: “Developer Rig” Setup and Initial As-Is Build
    • Confirm that all core software for developing, building, and testing the prototype version of GenMAPP Builder is on the development machine.
    • Set up Eclipse and the Java project workspace.
    • Run initial build.
  • Complete Milestone 1: Initial Database Export
  • Create a Gene Database testing report for the initial export
  • Further explore the various ID systems; verify previous findings
  • Create expressions for Match/PGSQL that will assist in evaluating the quality of any exported databases

Work with Kevin Wyllie

  • Read the microarray paper to understand the experiment.
  • Create a table or list that shows the correspondence between the samples in the experiment and the files you have downloaded.
  • Determine how many biological or technical replicates, and which samples were labeled with Cy3 or Cy5.
  • Create a Master Raw Data file that contains the IDs and columns of data required for further analysis.
  • Consult with Dr. Dahlquist on how to process the data (normalization, statistics).

Work with Veronica Pacheco

  • Read the microarray paper to understand the experiment.
  • Create a table or list that shows the correspondence between the samples in the experiment and the files you have downloaded.
  • Determine how many biological or technical replicates, and which samples were labeled with Cy3 or Cy5.
  • Create a Master Raw Data file that contains the IDs and columns of data required for further analysis.
  • Consult with Dr. Dahlquist on how to process the data (normalization, statistics).
Progress
  • Completed milestone 0: Working Environment Setup
  • Completed milestone 1: Version Control Setup
  • Completed milestone 2: “Developer Rig” Setup and Initial As-Is Build
  • Began milestone 3: Species Profile Creation
    • Need to consult with Brandon and/or Dr. Dionisio before continuing.
  • Completed Milestone 1: Initial Database Export
  • Completed Milestone 2: ID Pattern Definition and Verification (will be revisited for future work with modified forms of GenMAPP Builder)
    • Should talk to Anu about further steps involving TallyEngine/GenMAPP Builder (organize more exports)
  • Compiled data into one Excel spreadsheet.
  • Centered data and began statistical analysis (stopped at T statistic).
  • Compiled data into one Excel spreadsheet.
  • Centered data and began statistical analysis (stopped at T statistic).
Individual Journal Pages
Files Used/Created

Media:Raw_compiled_data_KW20151119.xlsx

Media:Raw_compiled_data_KW20151119.xlsx
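
The centering and T statistic steps noted in this week's progress were done in Excel, and the exact protocol is not recorded here. As a rough sketch only, assuming the data are log ratios, that centering means subtracting each array's mean from its values, and that the T statistic is a one-sample test of a gene's mean log ratio against zero, the arithmetic looks like this in Python (the function names are hypothetical):

```python
from math import sqrt
from statistics import mean, stdev


def center(column):
    """Subtract the column (array) mean so the centered values average to zero."""
    m = mean(column)
    return [x - m for x in column]


def t_statistic(values):
    """One-sample t statistic for one gene's replicate values against zero.

    Assumes at least two replicates that are not all identical; identical
    values give a standard deviation of zero and the division fails.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicate values")
    return mean(values) / (stdev(values) / sqrt(n))


if __name__ == "__main__":
    # Made-up numbers for illustration, not values from the dataset.
    print(center([1.0, 2.0, 3.0]))
    print(t_statistic([1.0, 2.0, 3.0]))
```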

Other Progress

Week 14

Individual Goals and Progress

Weekly Goals and Progress
Anu Varshneya Brandon Litvak Veronica Pacheco Kevin Wyllie
Goals
  • Finish second build.
  • Analyze results from previous build with Brandon Litvak and determine modifications that need to be made to code.
  • Begin modifying code to collect gene names from "ORF" instead of "OrderedLocusTags"
  • Start writing README and scientific paper (parts of deliverables).
  • Perform database export on the second build and other builds of the customized GenMAPP Builder
  • Determine, with Anu, what modifications must be made to GenMAPP Builder/TallyEngine
  • Find out why most of the data/gene-names were not captured in the "OrderedLocusNames" table of the PSQL database for Export 1
  • Perform testing report on any builds created for the week.

Work with Kevin Wyllie

  • Perform the statistical analysis in Excel.
  • Format the gene expression data for import into GenMAPP.
  • Import data into GenMAPP, create ColorSets, and run MAPPFinder.
  • Document and take notes on test runs with GenMAPP.
  • Use the EX.txt file to help the Coder/Quality Assurance team members to validate the .gdb.
  • Do a journal club outline of the paper so that you can use it in the Discussion section of your group report and your final presentation.

  • Create a .mapp file showing one pathway that is changed in your data.

Work with Veronica Pacheco

  • Perform the statistical analysis in Excel.
  • Format the gene expression data for import into GenMAPP.
  • Import data into GenMAPP, create ColorSets, and run MAPPFinder.
  • Document and take notes on test runs with GenMAPP.
  • Use the EX.txt file to help the Coder/Quality Assurance team members to validate the .gdb.
  • Do a journal club outline of the paper so that you can use it in the Discussion section of your group report and your final presentation.

  • Create a .mapp file showing one pathway that is changed in your data.

Progress
  • Finished second build.
  • Analyzed results from previous build with Brandon Litvak and determined modifications that need to be made to code.
  • Began modifying code to collect gene names from "ORF" instead of "OrderedLocusTags"
  • Finished third build.
  • Customized TallyEngine to collect counts for ORF IDs.
  • Finished fourth build.

Work with Kevin Wyllie

  • Performed the statistical analysis in Excel.
  • Formatted the gene expression data for import into GenMAPP.
  • Imported data into GenMAPP, created ColorSets, and ran MAPPFinder.
  • Documented and took screenshots on test runs with GenMAPP.
  • Sent the EX.txt file to Brandon and also uploaded it to the wiki

Work with Veronica Pacheco

  • Performed the statistical analysis in Excel.
  • Formatted the gene expression data for import into GenMAPP.
  • Imported data into GenMAPP, created ColorSets, and ran MAPPFinder.
  • Documented and took screenshots on test runs with GenMAPP.
  • Sent the EX.txt file to Brandon and also uploaded it to the wiki
Individual Journal Pages
Files Used/Created

Build 2 - with customized species profile:

Anuvarsh (talk) 15:22, 1 December 2015 (PST)

Build 3 - picking up gene names from ORF instead of ordered locus

Anuvarsh (talk) 14:58, 3 December 2015 (PST)

Build 4 - customized TallyEngine

Anuvarsh (talk) 15:18, 3 December 2015 (PST)

see Kevin's section -->

Other Progress

Reflections

Anu Varshneya

  • What worked?
    • In general, I think our group worked very well together! I think we are all motivated to get this project done well, and are communicating well with each other regarding our progress.
  • What didn't work?
    • I think for the most part we did a great job. The only idea I have moving forward is a little more planning for how we will attack the writing and presentation portion of the project. I am not concerned about us getting it done on time and with good quality, just that we create a plan of attack soon so that everyone is on the same page. :)
  • What will I do next to fix what didn't work?
    • Though nothing really failed, I think we will just talk tomorrow about how we want to approach the writing and the presentation and set up some group work times.

Kevin Wyllie

  1. What worked?
    • Our initial GenMAPP import worked! 284 errors, which, out of 7251, doesn't sound so bad to me!
  2. What didn't work?
    • Maybe this isn't actually an example of something not working, but our calculated fold changes were quite different (much lower in magnitude) from those reported in Van Acker et al.'s paper. However, they had the same directions and generally showed the same relative trends (i.e., the relatively higher fold changes in the paper were among the higher ones in our data). Also, very few of the genes they considered significant (with their super-lenient criteria, which result in 30% of the genes showing significant changes) were significant by our criteria.
  3. What will I do next to fix what didn't work?
    • We just need to triple/quadruple check that our data processing protocol is legitimate. Other than that, there's not much we can do in terms of fold changes. For statistical significance, we should potentially consider raising our BH P-value threshold above 0.05, as currently we're only considering about 8% of the genes to show a significant change. But maybe this is not too low a number.
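
To make the threshold question concrete, here is a minimal sketch of the standard Benjamini-Hochberg (BH) adjustment and of how the share of genes called significant depends on the cutoff. It is written in Python purely for illustration; the group's analysis was done in Excel, and the function names and example p-values below are hypothetical, not taken from the dataset.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never
    # increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fraction_significant(p_values, threshold=0.05):
    """Share of genes whose BH-adjusted p-value is at or below the threshold."""
    if not p_values:
        raise ValueError("need at least one p-value")
    adjusted = benjamini_hochberg(p_values)
    return sum(p <= threshold for p in adjusted) / len(adjusted)


if __name__ == "__main__":
    # Made-up unadjusted p-values for four genes.
    example = [0.01, 0.04, 0.03, 0.5]
    print(benjamini_hochberg(example))
    print(fraction_significant(example, threshold=0.05))
    print(fraction_significant(example, threshold=0.10))
```

Raising the threshold can only keep or increase the share of genes called significant, so comparing the output at a few cutoffs is a quick way to see how sensitive the 8% figure is to the choice of 0.05.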

Deliverables

Group Deliverables

  • GenMAPP Gene Database for assigned species (.gdb)
  • ReadMe file to accompany the Gene Database (.pdf)
  • Gene Database Testing Report for final submitted Gene Database (print from wiki to .pdf file)
  • Processed and analyzed DNA microarray dataset (.xls or .xlsx)
  • Data file used for import into GenMAPP (.txt or .csv)
  • GenMAPP Expression Dataset file (.gex)
  • Exceptions file of data imported into GenMAPP (.EX.txt)
  • Raw MAPPFinder results files (-GO.txt)
  • .gmf file
  • Filtered MAPPFinder Results (.xls or .xlsx)
  • Sample MAPP file of a relevant biological pathway for your species (.mapp)
  • Group Report describing the creation of the Gene Database and the biological analysis of the data (.doc, .docx, or .pdf)
  • PowerPoint presentation (.ppt, .pptx, or .pdf, given on Tuesday, December 15)

Burkholderia cenocepacia Genome Paper

Holden, M. T. G., Seth-Smith, H. M. B., Crossman, L. C., Sebaihia, M., Bentley, S. D., Cerdeño-Tárraga, A. M., … Parkhill, J. (2009). The Genome of Burkholderia cenocepacia J2315, an Epidemic Pathogen of Cystic Fibrosis Patients. Journal of Bacteriology, 191(1), 261–277. http://doi.org/10.1128/JB.01230-08

  • The link to the abstract from PubMed. [1]
  • The link to the full text of the article in PubMedCentral. [2]
  • The link to the full text of the article (HTML format) from the publisher web site. [3]
  • The link to the full PDF version of the article from the publisher web site. [4]
  • Who owns the rights to the article? American Society for Microbiology
    • Does the journal own the copyright? Yes
    • Do the authors own the copyright? No
    • Do the authors own the rights under a Creative Commons license? No
    • Is the article available “Open Access”? Yes
  • What organization is the publisher of the article? What type of organization is it (commercial, for-profit publisher; scientific society; respected open access organization like Public Library of Science or BioMedCentral; or predatory open access organization; see the list of Open Access Scholarly Publishers Association members)? The American Society for Microbiology, which is a scientific society.
  • Is this article available in print or online only? It is available both in print and online.
  • Has LMU paid a subscription or other fee for your access to this article? I first looked at this article through Web of Science, which LMU does pay for, but accessing it through PubMed, PubMed Central, and the publisher website was free.
  • How many articles does this article cite? It has 150 cited references.
  • How many articles cite this article? It is cited 128 times.
  • Based on the titles and abstracts of the papers, what type of research directions have been taken now that the genome for that organism has been sequenced? A lot of the papers revolved around antibiotic resistance and therapeutic strategies.


Microarray Paper

Van Acker, H., Sass, A., Bazzini, S., De Roy, K., Udine, C., Messiaen, T., ... & Coenye, T. (2013). Biofilm-grown Burkholderia cepacia complex cells survive antibiotic treatment by avoiding production of reactive oxygen species. PLoS One, 8(3), e58943.

  • This article is suitable for your project. Kdahlquist (talk) 10:17, 10 November 2015 (PST)
  • The link to the abstract from PubMed: http://www.ncbi.nlm.nih.gov/pubmed/?term=Biofilm-Grown+Burkholderia+cepacia+Complex+Cells+Survive+Antibiotic+Treatment+by+Avoiding+Production+of+Reactive+Oxygen+Species
  • The link to the full text of the article in PubMedCentral: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3596321/
  • The link to the full text of the article (HTML format) from the publisher web site: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0058943
    • Cannot find HTML format on publisher web site.
  • The link to the full PDF version of the article from the publisher web site: http://www.plosone.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pone.0058943&representation=PDF
  • Who owns the rights to the article? Authors of the article: Heleen Van Acker, Andrea Sass, Silvia Bazzini, Karen De Roy, Claudia Udine, Thomas Messiaen, Giovanna Riccardi, Nico Boon, Hans J. Nelis, Eshwar Mahenthiralingam, Tom Coenye
  • Does the journal own the copyright? No.
  • Do the authors own the copyright? Yes.
  • Do the authors own the rights under a Creative Commons license? Yes.
  • Is the article available “Open Access”? Yes.
  • What organization is the publisher of the article? What type of organization is it? Public Library of Science, Professional OA Publisher, Member of Open Access Scholarly Publishers Association
  • Is this article available in print or online only? Available in print and online.
  • Has LMU paid a subscription or other fee for your access to this article? No.
  • Where do the microarray data reside? https://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-3532/?keywords=&organism=Burkholderia+cenocepacia&exptype%5B%5D=%22rna+assay%22&exptype%5B%5D=%22array+assay%22&array=
  • What experiment was performed? What was the "treatment" and what was the "control" in the experiment? The experiment hoped to test whether persister cells are present in Burkholderia cepacia complex (Bcc) biofilms, what the molecular basis of antimicrobial tolerance in Bcc persisters is, and how persisters can be eradicated from Bcc biofilms. Burkholderia cenocepacia biofilms were treated with 1024 µg/ml of tobramycin in the treatment group. The control group did not receive any tobramycin.
  • Were replicate experiments of the "treatment" and "control" conditions conducted? Were these biological or technical replicates? How many of each? The control had 5 biological replicates with 2 technical replicates, and the treatment had 3 biological replicates with 2 technical replicates.
  • How many articles does this article cite? This article has 34 cited references.
  • How many articles cite this article? This article is cited 17 times in All Databases and 17 times in the Web of Science Core Collection.
  • Based on the titles and abstracts of the papers, what type of research directions have been taken now that the genome for that organism has been sequenced? Most of the articles are related to antimicrobial therapy, tolerance, and resistance.