<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Bklein7</id>
		<title>LMU BioDB 2015 - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="https://xmlpipedb.lmucs.io/biodb/fall2015/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Bklein7"/>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php/Special:Contributions/Bklein7"/>
		<updated>2026-05-04T02:08:45Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.25.1</generator>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_groupreport_cw20151218.pdf&amp;diff=8206</id>
		<title>File:Bpertussis groupreport cw20151218.pdf</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_groupreport_cw20151218.pdf&amp;diff=8206"/>
				<updated>2015-12-21T20:29:02Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:Bpertussis groupreport cw20151218.pdf&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;group report for bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_groupreport_cw20151218.pdf&amp;diff=8205</id>
		<title>File:Bpertussis groupreport cw20151218.pdf</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_groupreport_cw20151218.pdf&amp;diff=8205"/>
				<updated>2015-12-21T20:24:58Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:Bpertussis groupreport cw20151218.pdf&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;group report for bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=The_Class_Whoopers&amp;diff=8204</id>
		<title>The Class Whoopers</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=The_Class_Whoopers&amp;diff=8204"/>
				<updated>2015-12-21T18:59:51Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: added specific reflection questions and responses that we overlooked&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Team Information &amp;amp; Links =&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;br /&gt;
&lt;br /&gt;
= Deliverables =&lt;br /&gt;
[[Bordetella Pertussis GenMAPP Analysis Deliverables]]&lt;br /&gt;
&lt;br /&gt;
==Presentation Download Links==&lt;br /&gt;
*Journal Club&lt;br /&gt;
** Genome Paper: [[File:Genomepaper_cw20151116.pdf]]&lt;br /&gt;
** Microarray Paper: [[File: Microarray_Journal_Club_Presentation.pdf]]&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Final Project&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**[[File:Bpertussis_findings_powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==File Naming Protocol==&lt;br /&gt;
All file types generated in this project will receive their own unique names composed of two key parts:&lt;br /&gt;
#Description&lt;br /&gt;
#*This will contain a brief, file-specific description of what content the file contains.&lt;br /&gt;
#*Descriptions for different versions of the same file will remain consistent.&lt;br /&gt;
#Identifier Tag&lt;br /&gt;
#*This tag will be listed as a suffix in the following form: &amp;quot;_cwYYYYMMDD&amp;quot;&lt;br /&gt;
#**cw- team name abbreviation&lt;br /&gt;
#**YYYYMMDD- date the file was created in the form year/month/day&lt;br /&gt;
&lt;br /&gt;
Additionally, the following file naming best practices will be observed when creating descriptions for new files:&lt;br /&gt;
*Our species will be referred to consistently as &amp;quot;bpertussis&amp;quot;.&lt;br /&gt;
*Spaces will be written as underscores.&lt;br /&gt;
*No capitalization will be used.&lt;br /&gt;
*No special characters will be used.&lt;br /&gt;
*If sequential numbering systems are used, leading zeros will be included for clarity.&lt;br /&gt;
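&lt;br /&gt;
The naming rules above could be checked automatically. The following is a minimal sketch, not part of the team's protocol: the function name is_valid_name and the regular expression are illustrative assumptions. In particular, it assumes that hyphens are tolerated in descriptions (as in bpertussis-std_cw20151210.zip) and that a file extension follows the identifier tag.

```python
import re
from datetime import datetime

# Hypothetical pattern: a lowercase description (letters, digits, and
# single underscores or hyphens as separators), then the "_cw" team tag,
# an 8-digit date, and a lowercase file extension.
NAME_PATTERN = re.compile(
    r"^([a-z0-9]+(?:[_-][a-z0-9]+)*)_cw([0-9]{8})\.([a-z0-9]+)$"
)


def is_valid_name(filename):
    """Return True if filename matches the assumed naming pattern."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return False
    try:
        # Reject tags such as 20151340 that are not real calendar dates.
        datetime.strptime(match.group(2), "%Y%m%d")
    except ValueError:
        return False
    return True
```

Under these assumptions, a name containing capital letters, spaces, or an impossible date is rejected, while names such as bpertussis_groupreport_cw20151218.pdf are accepted.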
&lt;br /&gt;
=Weekly Updates=&lt;br /&gt;
==Week 15==&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Assignment due date:&amp;#039;&amp;#039;&amp;#039; Midnight Tuesday, December 15&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Coder:&amp;#039;&amp;#039;&amp;#039; Adjust the GenMAPP Builder code to account for the one EnsemblBacteria reference ID that was missing in our last export; conduct a new import-export cycle to create the (hopefully) final .gdb file; begin characterizing the exported .gdb file in a Gene Database Testing Report; customize the GenMAPP Builder TallyEngine to account for any changes made.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Quality Assurance:&amp;#039;&amp;#039;&amp;#039; Reconfigure the TallyEngine configuration with the Coder to accommodate gene IDs that were not exported the previous time. Test the revised database by running TallyEngine count, XMLPipeDB Match, and PostgreSQL. Locate any remaining missing gene IDs.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;GenMAPP User:&amp;#039;&amp;#039;&amp;#039; Import data into GenMAPP, create ColorSets, and run MAPPFinder. Document and take notes on test runs with GenMAPP. Use the EX.txt file to help the Coder/Quality Assurance team members to validate the .gdb. Create a .mapp file showing one pathway that is changed in your data.&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Brandon (Coder and Project Manager):&amp;#039;&amp;#039;&amp;#039; I began this week by customizing the GenMAPP Builder TallyEngine to report ORF counts for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; (see [[Bklein7_Week_15]]). After this, I worked with [[User:Msaeedi23|Mahrad]] to identify the 1 gene ID that was missing in the .gdb file [[File:bpertussis-std_cw20151203.zip]]. I found that this gene was a necessary EnsemblBacteria reference ID and edited the GenMAPP Builder code with the help of [[User:Dondi|Dr. Dionisio]] to include this ID in our next export (see [[Bklein7_Week_15]]). I conducted a complete import-export cycle on 12/10/2015 to create the .gdb file [[File:bpertussis-std_cw20151210.zip]]. I then characterized this export, authoring sections 1-5.2 of its testing report: [[Gene_Database_Testing_Report-_cw20151210]]. During our Sunday meeting, I worked with [[User:Lenaolufson|Lena]] to use this new gene database in GenMAPP. During our Monday meeting, I worked on our PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]].&lt;br /&gt;
*** [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 22:31, 14 December 2015 (PST)&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Mahrad (Quality Assurance):&amp;#039;&amp;#039;&amp;#039; I worked closely with the coder [[User: Bklein7|Brandon]] to re-customize TallyEngine to include the 11 missing ORF genes. The specific customizations and the results that followed are detailed in my [[Msaeedi23 Week 15| Week 15 Journal Entry]]. Having located the missing gene IDs, Brandon went into Eclipse to code for them to be included in the export. Following this, we tested our revised gene database to make sure these missing IDs were actually exported. We ran TallyEngine count, which gave a total of 3446 gene IDs, demonstrating that the IDs were now exported. Then we ran XMLPipeDB Match, which reported a total of 3447 gene IDs, one additional. Finally, we ran PostgreSQL, which gave a total of 3446 gene IDs. We found that gene &amp;quot;BP3167A&amp;quot; was in the original XML file but not accounted for in the exported file. With further investigation, we concluded that &amp;quot;BP3167A&amp;quot; is a reference ID from EnsemblBacteria and corresponds to the same ID as &amp;quot;BP3167.1&amp;quot;, which was exported.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Lena (GenMAPP User):&amp;#039;&amp;#039;&amp;#039; I imported the data into GenMAPP and then created color sets in order to run MAPPFinder. I obtained the ontology results and did some background research on how the top results related to the microarray article. I then used KEGG pathways for my organism to create two separate MAPPs, one for the ribosome and one for the nitrogen cycle.&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**This week, our group used class work sessions to coordinate our work:&lt;br /&gt;
***Tuesday, December 8, 2:40 - 4:00&lt;br /&gt;
***Thursday, December 10, 2:40 - 4:00&lt;br /&gt;
**In addition, we scheduled meetings outside of class to work on the final PowerPoint Presentation and deliverables for our project:&lt;br /&gt;
***Sunday, December 13, 7:00 PM - 1:00 AM&lt;br /&gt;
***Monday, December 14, 2:00 PM - 11:00 PM&lt;br /&gt;
&lt;br /&gt;
==Week 14==&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Assignment due date:&amp;#039;&amp;#039;&amp;#039; Midnight Tuesday, December 8&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Coder:&amp;#039;&amp;#039;&amp;#039; Create the custom species profile for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039;, run an export using the customized version of GenMAPP Builder, add further customizations to the custom species profile as appear necessary, and run a second export using the further customized version of GenMAPP Builder.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Quality Assurance:&amp;#039;&amp;#039;&amp;#039; Identify gene IDs that are missing in the first custom export, work with the coder to classify these IDs, configure the Tally Engine, and complete a gene database testing report for the second custom export.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;GenMAPP User:&amp;#039;&amp;#039;&amp;#039; Complete the statistical analysis of the data, format the data for import into GenMAPP, and coordinate with the coder/QA to import this data into GenMAPP using the custom gene database.&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress &amp;amp; Reflection&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Brandon (Coder and Project Manager):&amp;#039;&amp;#039;&amp;#039; This week, I focused on creating and customizing the species profile for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; in GenMAPP Builder, the details of which can be found in my [[Bklein7 Week 14| Week 14 Journal Entry]]. I documented the first export I conducted using a custom &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile here: [[Gene Database Testing Report- cw20151201]]. I demonstrated that the custom species information implemented in this export worked as intended, but Mahrad and I identified 11 ORF genes that failed to export. I updated the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile to account for these ORF genes and conducted a new export, detailed here: [[Gene Database Testing Report- cw20151203]]. Mahrad analyzed the exported .gdb file. In addition to this, I kept tabs on my fellow group members to keep us on track to accomplish our long-term project goals in a timely manner.&lt;br /&gt;
***What worked?&lt;br /&gt;
****Thus far, we have exported two versions of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database that have been created using modified versions of GenMAPP Builder. Both custom exports worked as intended. The first one simply created the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class. However, we identified 11 ORF genes conforming to the unique patterns &amp;quot;BP####A&amp;quot; and &amp;quot;BP####B&amp;quot; that warranted inclusion in the gene database. Exporting ORF gene IDs is a common issue other custom classes appear to have had, so implementing this fix was very straightforward in practice.&lt;br /&gt;
***What didn&amp;#039;t work?&lt;br /&gt;
****Although all of the changes we implemented to GenMAPP Builder worked as intended, we have yet to produce a comprehensive gene database for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039;. The most recent export included 11 ORF genes that we thought encompassed the only IDs with the patterns &amp;quot;BP####A&amp;quot; and &amp;quot;BP####B&amp;quot;. However, we found that there is one more relevant gene ID in the UniProt XML file that conforms to the pattern &amp;quot;BP####A&amp;quot; and was not imported. We will have to find a way to export this ID as well.&lt;br /&gt;
***What will I do next to fix what didn&amp;#039;t work?&lt;br /&gt;
****Next, I will confer with Drs. Dahlquist and Dionisio to come up with a strategy for isolating the one missing EnsemblBacteria reference ID and exporting it into our final gene database. After this is done, I will characterize the database for completeness and work on further modifying the TallyEngine. Hopefully, these steps will generate a complete gene database so that we can transition to working on our final deliverables.&lt;br /&gt;
*** [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 13:39, 7 December 2015 (PST)&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Mahrad (Quality Assurance):&amp;#039;&amp;#039;&amp;#039; This week, as QA, I worked directly with Brandon on the initial data exports. The work is summarized here: [[Msaeedi23 Week 14| Week 14 Journal Entry]]. Next, we meticulously characterized regular expression patterns to detect discrepancies in extracting the data from the original samples. In the following week, I will focus on customizing the tally configuration for our species, which may take some time and coding assistance from Brandon. Once the Tally Engine has been configured for our species, Lena can proceed with GenMAPP processing. Week 14 reflection:&lt;br /&gt;
# What worked?&lt;br /&gt;
#*We were able to use the various counting systems to detect the total number of gene IDs that were imported into our gdb file. Through our investigation, Brandon and I came to find four specific missing IDs. &lt;br /&gt;
# What didn&amp;#039;t work?&lt;br /&gt;
#*We detected four IDs that were missing from our gdb file. We were able to pinpoint the specific missing IDs, and the code will now have to be changed to incorporate them into our database.&lt;br /&gt;
# What will I do next to fix what didn&amp;#039;t work?&lt;br /&gt;
#*Work more closely with Brandon to ensure the Tally Engine is configured properly and that we can properly import and obtain confirmation that all the gene IDs were imported successfully. &lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Lena (GenMAPP User):&amp;#039;&amp;#039;&amp;#039; This week, I made progress on performing the statistical analysis of the data to prepare it for GenMAPP. I posted my progress for each of the class working sessions on my [[Lenaolufson Week 14| Week 14 Journal Entry]], updating the Excel data sheets after each session. Dr. Dahlquist helped me figure out a problem with the original raw data that was causing the values to be very skewed. I then sent her my updated data sheet, and she was able to use a program to separate the duplicates of the chips. After she sent me back the data with the sorted values, I performed the statistical analysis on the data. The most recent version of the file can be found on my Week 14 journal entry linked previously.&lt;br /&gt;
[[User:Lenaolufson|Lenaolufson]] ([[User talk:Lenaolufson|talk]]) 19:54, 7 December 2015 (PST)&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**This week, our group used class work sessions to coordinate our work:&lt;br /&gt;
***Tuesday, December 1, 2:40 - 4:00&lt;br /&gt;
***Thursday, December 3, 2:40 - 4:00&lt;br /&gt;
*** Monday, December 7, 10:30 - 12 am&lt;br /&gt;
&lt;br /&gt;
==Week 12==&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Assignment due date:&amp;#039;&amp;#039;&amp;#039; Midnight Tuesday, November 24&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Coder:&amp;#039;&amp;#039;&amp;#039; Set up a GitHub repository clone of the XMLPipeDB project on your development device, the development rig, and the initial as-is build for gmbuilder. Complete an import-export cycle in association with QA.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Quality Assurance:&amp;#039;&amp;#039;&amp;#039; Complete an import-export cycle for the 1st &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database. Complete a Gene Database Testing Report for this export.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;GenMAPP Users:&amp;#039;&amp;#039;&amp;#039; Create a Master Raw Data file that contains the IDs and columns of data required for further analysis. Consult with Dr. Dahlquist on how to process the data (normalization, statistics).&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Brandon (Quality Assurance and Interim Coder):&amp;#039;&amp;#039;&amp;#039; This week, I focused on completing an import-export cycle for our first &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database- [[File:Bpertussis-std cw20151119.zip]]. With my QA hat, I imported the appropriate data, exported the gene database, and discussed the gene database creation &amp;amp; counting protocol here- [[Gene Database Testing Report- cw20151119]]. With my Coder hat, I followed the instructions on the [[Coder| Coder Guild Page]] to set up a GitHub repository clone of the XMLPipeDB project on my personal laptop, the Eclipse developer rig, and the initial as-is build for gmbuilder. The electronic lab notebook for my QA and Coder work is present on my [[Bklein7 Week 12| Week 12 Page]]. Finally, I wrote a PowerPoint presentation on our genome sequencing paper, which is linked to on my [[Bklein7 Week 12| Week 12 Page]] as well.&lt;br /&gt;
***[[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 18:48, 23 November 2015 (PST)&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Lena (GenMAPP):&amp;#039;&amp;#039;&amp;#039; I worked on downloading the correct data sample files from the provided files on the microarray paper page. The files were unzipped and prepared to be imported into Excel. In Excel, the data was manipulated to form a spreadsheet that had all of the gene IDs from the different samples with their appropriate columns to be analyzed. The corrections and further manipulations of the data will continue in the coming week in order to create the desired dataset to be exported from Excel. [[File:Bpertussis CompiledRawData MS2015.xlsx]]&lt;br /&gt;
***[[User:Lenaolufson|Lenaolufson]] ([[User talk:Lenaolufson|talk]]) 17:33, 23 November 2015 (PST)&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Mahrad (GenMAPP--&amp;gt; Quality Assurance)&amp;#039;&amp;#039;&amp;#039;: This week I downloaded the six data sample files provided by the microarray paper. The process is detailed in my [[Msaeedi23 Week 12| Week 12 Journal Entry]]. Files were unzipped, imported into Excel, and manipulated to form a single spreadsheet containing all gene IDs from the different samples. Each sample was placed in its respective column to be further analyzed and manipulated in the upcoming week. Following this, I assumed the position of quality assurance to accommodate the absence of Nicole.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Nicole&amp;#039;&amp;#039;&amp;#039; was absent this week. [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 18:52, 23 November 2015 (PST)&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** Monday, November 23: Seaver 120- Brandon and Lena met to work on the GenMAPP testing of the gene IDs from our database.&lt;br /&gt;
&lt;br /&gt;
==Week 11==&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** For all:&lt;br /&gt;
*** Outline your assigned paper on your user page and include a list of 10 defined terms from the paper.&lt;br /&gt;
**Nicole &amp;amp; Brandon&lt;br /&gt;
***Prepare Journal Club presentation on the designated genome sequencing article&lt;br /&gt;
***Slides Due: by midnight, Tuesday, November 17&lt;br /&gt;
***Presentation Date: Tuesday, November 24&lt;br /&gt;
**Lena &amp;amp; Mahrad&lt;br /&gt;
***Prepare Journal Club presentation on the designated microarray paper&lt;br /&gt;
***Slides Due: by midnight, Tuesday, November 17&lt;br /&gt;
***Presentation Date: Tuesday, November 17&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**Nicole Anguiano (Coder): Nicole was absent this week for a medical emergency and is (hopefully) getting some much deserved rest. [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 23:14, 16 November 2015 (PST)&lt;br /&gt;
**Brandon Klein (QA): This week I made several edits to the [https://xmlpipedb.cs.lmu.edu/biodb/fall2015/index.php/The_Class_Whoopers Class Whoopers Team Page] in accordance with the [https://xmlpipedb.cs.lmu.edu/biodb/fall2015/index.php/Week_11 Week 11 assignment]. These edits included the following: revising the Class Whoopers template, reorganizing the Team Page structure, commenting out unneeded articles in the annotated bibliography, creating the new bibliography entry as requested by Dr. Dahlquist, and writing the naming conventions for our files. Additionally, I outlined our genome sequencing paper for &amp;quot;Bordetella pertussis&amp;quot; and assessed the [http://www.genedb.org/Homepage/Bpertussis GeneDB MOD] on my [https://xmlpipedb.cs.lmu.edu/biodb/fall2015/index.php/Bklein7_Week_11#Identifying_the_Bordetella_Pertussis_MOD Week 11 Individual Journal Entry]. A preliminary draft of the genome sequencing paper that I will likely be presenting solo was uploaded there. Finally, I kept tabs on group members as the interim Project Manager. [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 23:14, 16 November 2015 (PST)&lt;br /&gt;
**Lena Olufson (GenMAPP): This week Mahrad and I met up and analyzed the microarray paper together. We split up the PowerPoint into two halves; I did the introduction/significance of the study as well as the methods performed. Mahrad and I created our presentation together and worked through a Google Doc to edit it simultaneously as we discussed out loud. We also created a flow chart together that demonstrated the experimental design, thus we have the same ones included in our individual assignments. We made sure to check in with the temporary project manager and keep him updated on our progress. [[User:Lenaolufson|Lenaolufson]] ([[User talk:Lenaolufson|talk]]) 23:24, 16 November 2015 (PST) &lt;br /&gt;
**Mahrad Saeedi (GenMAPP): This week Lena and I worked on analyzing the microarray paper and creating an outline. The outline and detailed process involved with the experiment can be found in my [[Msaeedi23 Week 11| Week 11 Journal Entry]]. We each defined 10 terms separately based upon words we didn&amp;#039;t recognize in the article. We then proceeded to produce the PowerPoint presentation for journal club. &lt;br /&gt;
[[User:Msaeedi23|Msaeedi23]] ([[User talk:Msaeedi23|talk]]) 23:46, 16 November 2015 (PST)&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**11/15- Lena &amp;amp; Mahrad met to work on outlining article and answering questions&lt;br /&gt;
**11/16- Lena &amp;amp; Mahrad met to prepare powerpoint presentation for journal club&lt;br /&gt;
&lt;br /&gt;
==Week 10==&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** For all:&lt;br /&gt;
*** Create an annotated bibliography including one genome sequencing paper and two microarray experiments for Bordetella pertussis&lt;br /&gt;
*** Create/update team page &amp;amp; compile group annotated bibliography&lt;br /&gt;
*** Assignment due date: Midnight Tuesday, November 10&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**All group members created annotated bibliographies and compiled them on the newly created group page.&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** Monday, November 9, 8pm-9pm, Seaver 120&lt;br /&gt;
&lt;br /&gt;
= Annotated Bibliography =&lt;br /&gt;
== Genome Sequencing Paper ==&lt;br /&gt;
&lt;br /&gt;
Neither of these papers is the &amp;#039;&amp;#039;first&amp;#039;&amp;#039; to report the genome sequence of &amp;#039;&amp;#039;B. pertussis.&amp;#039;&amp;#039;  The paper that you will want to use is [http://www.nature.com/ng/journal/v35/n1/full/ng1227.html this one].  I found it by looking at the introduction and references of the Zhang et al. (2011) paper.  For your Week 11 assignment, please remove your annotated bibliography entries for the two papers below and create one for this new paper by Parkhill et al. (2003).  You will use the Parkhill paper for your project.  &amp;#039;&amp;#039;&amp;amp;mdash; [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 09:54, 10 November 2015 (PST)&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
*Parkhill, J., Sebaihia, M., Preston, A., Murphy, L. D., et al. (2003). Comparative analysis of the genome sequences of Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica. Nature genetics, 35(1), 32-40. doi:10.1038/ng1227&lt;br /&gt;
* PubMed Abstract:  http://www.ncbi.nlm.nih.gov/pubmed/12910271&lt;br /&gt;
* PubMed Central:  Not available on PubMed Central.&lt;br /&gt;
* Publisher Full Text (HTML):  http://www.nature.com/ng/journal/v35/n1/full/ng1227.html&lt;br /&gt;
* Publisher Full Text (PDF):  http://www.nature.com/ng/journal/v35/n1/pdf/ng1227.pdf&lt;br /&gt;
* Copyright: ©2003 Nature Publishing Group (information found on PDF version of article). This article is not Open Access, but it is freely available 6 months after publication.&lt;br /&gt;
* Publisher: Nature Publishing Group (for-profit).&lt;br /&gt;
* Availability: In print and online.&lt;br /&gt;
* Did LMU pay a fee for this article: Yes, LMU pays a subscription fee for access to the journal &amp;#039;&amp;#039;Nature Genetics&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== Microarray Paper ==&lt;br /&gt;
&lt;br /&gt;
This paper is suitable for your project.  &amp;#039;&amp;#039;&amp;amp;mdash; [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 10:04, 10 November 2015 (PST)&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Hoo, R., Lam, J.H., Huot, L., Pant, A., Li, R., Hot, D., &amp;amp; Alonso, S. (2014). Evidence for a Role of the Polysaccharide Capsule Transport Proteins in Pertussis Pathogenesis. PLoS ONE, 9(12):e115243. doi: 10.1371/journal.pone.0115243&lt;br /&gt;
* PubMed Abstract: http://www.ncbi.nlm.nih.gov/pubmed/25501560 &lt;br /&gt;
* PubMed Central: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4264864/&lt;br /&gt;
* Publisher Full Text (HTML): http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0115243&lt;br /&gt;
* Publisher Full Text (PDF): http://www.plosone.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pone.0115243&amp;amp;representation=PDF&lt;br /&gt;
* Copyright: © 2014 Hoo et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited (info found [http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0115243 here]).&lt;br /&gt;
* Publisher: PLOS ONE (respected open access organization).&lt;br /&gt;
* Availability: Online only.&lt;br /&gt;
* Did LMU pay a fee for this article: No.&lt;br /&gt;
* Web site where the data resides: [http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62088 NCBI GEO data]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--#2&lt;br /&gt;
*Brickman, T. J., Cummings, C. A., Liew, S.-Y., Relman, D. A., &amp;amp; Armstrong, S. K. (2011). Transcriptional Profiling of the Iron Starvation Response in Bordetella pertussis Provides New Insights into Siderophore Utilization and Virulence Gene Expression . Journal of Bacteriology, 193(18), 4798–4812. http://doi.org/10.1128/JB.05136-11&lt;br /&gt;
* ArrayExpress Abstract: https://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-3263/?keywords=&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC532028/&lt;br /&gt;
*PubMed Abstract: http://www.ncbi.nlm.nih.gov/pubmed/?term=Transcriptional+Profiling+of+the+Iron+Starvation+Response+in+Bordetella+pertussis+Provides+New+Insights+into+Siderophore+Utilization+and+Virulence+Gene+Expression&lt;br /&gt;
* Publisher Full Text (HTML): http://jb.asm.org/content/193/18/4798.full&lt;br /&gt;
* Publisher Full Text (PDF):  http://jb.asm.org/content/193/18/4798.full.pdf+html &lt;br /&gt;
* Copyright:  2011 by the American Society for Microbiology &lt;br /&gt;
* Publisher:  Journal of Bacteriology &lt;br /&gt;
* Availability:  in print and online&lt;br /&gt;
* Did LMU pay a fee for this article: yes&lt;br /&gt;
*Link to where the microarray data resides: https://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-3263/?keywords=&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&lt;br /&gt;
&lt;br /&gt;
#3&lt;br /&gt;
King, A. J., van der Lee, S., Mohangoo, A., van Gent, M., van der Ark, A., &amp;amp; van de Waterbeemd, B. (2013). Genome-Wide Gene Expression Analysis of Bordetella pertussis Isolates Associated with a Resurgence in Pertussis: Elucidation of Factors Involved in the Increased Fitness of Epidemic Strains. PLoS ONE, 8(6): e66150. doi: 10.1371/journal.pone.0066150&lt;br /&gt;
* PubMed Abstract:  http://www.ncbi.nlm.nih.gov/pubmed/23776625&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3679012/&lt;br /&gt;
* Publisher Full Text (HTML):  http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066150&lt;br /&gt;
* Publisher Full Text (PDF):  http://www.plosone.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pone.0066150&amp;amp;representation=PDF&lt;br /&gt;
* Copyright: © 2013 King et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. (info found [http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066150 here])&lt;br /&gt;
* Publisher: PLOS ONE (respected open access organization)&lt;br /&gt;
* Availability: online only&lt;br /&gt;
* Did LMU pay a fee for this article: no&lt;br /&gt;
* Web site where the data resides: [http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1594/samples/?keywords=Bordetella+pertussis&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&amp;amp;s_page=1&amp;amp;s_pagesize=100 EBI ArrayExpress Data]&lt;br /&gt;
&lt;br /&gt;
#4 (note, all of the papers from this point on involve additional species other than Bordetella pertussis)&lt;br /&gt;
&lt;br /&gt;
* Cummings, C. A., Bootsma, H. J., Relman, D. A., &amp;amp; Miller, J. F. (2006). Species-and strain-specific control of a complex, flexible regulon by Bordetella BvgAS. Journal of bacteriology, 188(5), 1775-1785.&lt;br /&gt;
* ArrayExpress Abstract: https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-29/?keywords=&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1426559/&lt;br /&gt;
* PubMed Abstract: http://www.ncbi.nlm.nih.gov/pubmed/?term=Species-+and+Strain-Specific+Control+of+a+Complex%2C+Flexible+Regulon+by+Bordetella+BvgAS&lt;br /&gt;
* Publisher Full Text (HTML): http://jb.asm.org/content/188/5/1775.full&lt;br /&gt;
* Publisher Full Text (PDF):  http://jb.asm.org/content/188/5/1775.full.pdf+html&lt;br /&gt;
* Copyright:  2006 by the American Society for Microbiology &lt;br /&gt;
* Publisher:  Journal of Bacteriology &lt;br /&gt;
* Availability:  in print and online&lt;br /&gt;
* Did LMU pay a fee for this article: yes&lt;br /&gt;
*Link to where the microarray data resides: https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-29/?keywords=&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&lt;br /&gt;
&lt;br /&gt;
#5&lt;br /&gt;
&lt;br /&gt;
* Brinig, M., Register, K., Ackermann, M., &amp;amp; Relman, D. (2006). Genomic features of Bordetella parapertussis clades with distinct host species specificity. Genome Biology, 7(9). doi:10.1186/gb-2006-7-9-r81&lt;br /&gt;
* PubMed Abstract:  http://www.ncbi.nlm.nih.gov/pubmed/16956413?dopt=Abstract&amp;amp;holding=f1000,f1000m,isrctn&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1794550/&lt;br /&gt;
* Publisher Full Text (HTML):  http://www.genomebiology.com/2006/7/9/R81&lt;br /&gt;
* Publisher Full Text (PDF):  http://www.genomebiology.com/content/pdf/gb-2006-7-9-r81.pdf&lt;br /&gt;
* Copyright: Brinig et al.; licensee BioMed Central Ltd. (information found on the article); open access&lt;br /&gt;
* Publisher:  BioMed Central Ltd (for-profit publisher)&lt;br /&gt;
* Availability:  online&lt;br /&gt;
* Did LMU pay a fee for this article: no&lt;br /&gt;
# What experiment was performed?  What was the &amp;quot;treatment&amp;quot; and what was the &amp;quot;control&amp;quot; in the experiment?&lt;br /&gt;
# Were replicate experiments of the &amp;quot;treatment&amp;quot; and &amp;quot;control&amp;quot; conditions conducted?  Were these biological or technical replicates?  How many of each?&lt;br /&gt;
* Link to microarray data: https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-95/&lt;br /&gt;
&lt;br /&gt;
#6&lt;br /&gt;
&lt;br /&gt;
* Cummings, C., Bootsma, H., Relman, D., &amp;amp; Miller, J. (2006). Species- and Strain-Specific Control of a Complex, Flexible Regulon by Bordetella BvgAS. Journal of Bacteriology, 188(5), 1775-1785. doi:10.1128/JB.188.5.1775-1785.2006&lt;br /&gt;
* PubMed Abstract: http://www.ncbi.nlm.nih.gov/pubmed/16484188?dopt=Abstract&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1426559/&lt;br /&gt;
* Publisher Full Text (HTML):  http://jb.asm.org/content/188/5/1775.full&lt;br /&gt;
* Publisher Full Text (PDF):  http://jb.asm.org/content/188/5/1775.full.pdf&lt;br /&gt;
* Copyright: American Society for Microbiology; open access&lt;br /&gt;
* Publisher:  American Society for Microbiology (professional organization for scientists)&lt;br /&gt;
* Availability:  online and in print&lt;br /&gt;
* Did LMU pay a fee for this article: no&lt;br /&gt;
# What experiment was performed?  What was the &amp;quot;treatment&amp;quot; and what was the &amp;quot;control&amp;quot; in the experiment?&lt;br /&gt;
# Were replicate experiments of the &amp;quot;treatment&amp;quot; and &amp;quot;control&amp;quot; conditions conducted?  Were these biological or technical replicates?  How many of each?&lt;br /&gt;
* Link to microarray data: https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-28/&lt;br /&gt;
&lt;br /&gt;
#7&lt;br /&gt;
&lt;br /&gt;
*King, A. J., van Gorkom, T., Pennings, J. L., van der Heide, H. G., He, Q., Diavatopoulos, D., … Mooi, F. R. (2010). Correction: Comparative genomic profiling of Dutch clinical Bordetella pertussis isolates using DNA microarrays: identification of genes absent from epidemic strains. BMC Genomics, 11, 196. http://doi.org/10.1186/1471-2164-11-196&lt;br /&gt;
* PubMed Abstract:  http://www.biomedcentral.com/1471-2164/9/311#abs&lt;br /&gt;
* PubMed Central: &amp;lt;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2481270/&amp;gt;&lt;br /&gt;
* Publisher Full Text (HTML):  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2481270/&lt;br /&gt;
* Publisher Full Text (PDF): &amp;lt;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2481270/pdf/1471-2164-9-311.pdf&amp;gt;&lt;br /&gt;
* Copyright:  © 2008 King et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.&lt;br /&gt;
* Publisher: BMC Genomics&lt;br /&gt;
* Availability: online access&lt;br /&gt;
* Did LMU pay a fee for this article: no&lt;br /&gt;
&lt;br /&gt;
#8&lt;br /&gt;
&lt;br /&gt;
*Nakamura, M. M., Liew, S.-Y., Cummings, C. A., Brinig, M. M., Dieterich, C., &amp;amp; Relman, D. A. (2006). Growth Phase- and Nutrient Limitation-Associated Transcript Abundance Regulation in Bordetella pertussis. Infection and Immunity, 74(10), 5537–5548. http://doi.org/10.1128/IAI.00781-06&lt;br /&gt;
* PubMed Abstract:  &amp;lt;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1594893/?report=reader#__abstractid499869title&amp;gt;&lt;br /&gt;
* PubMed Central: &amp;lt;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1594893/&amp;gt;&lt;br /&gt;
* Publisher Full Text (HTML): &amp;lt;http://iai.asm.org/content/74/10/5537.full&amp;gt;&lt;br /&gt;
* Publisher Full Text (PDF):  &amp;lt;http://iai.asm.org/content/74/10/5537.full.pdf+html&amp;gt;&lt;br /&gt;
* Copyright: © 2006, American Society for Microbiology&lt;br /&gt;
* Publisher: Infection and Immunity&lt;br /&gt;
* Availability: online access&lt;br /&gt;
* Did LMU pay a fee for this article: no!--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8193</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8193"/>
				<updated>2015-12-19T03:20:18Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: updated number&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[Media:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database: [[Media:ReadMe bpertussis-std cw20151210.docx]]&lt;br /&gt;
** [[Media:Bpertussis genedatabase schema cw20151210.jpg|Gene Database Schema diagram (also included in ReadMe)]]&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[Media:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[Media:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[Media:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[Media:Bpertussis compiledrawdata cw20151218.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[Media:Bpertussis compiledrawdata cw20151218.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO.txt|Increased]]&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO.txt|Decreased]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[Media:Bpertussis compiledrawdata cw20151218.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO - Filtered.xlsx|Increased]] &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO - Filtered.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[Media: Bpertussis ribosomepathway cw20151218.mapp]]&lt;br /&gt;
* Group Report: [[File:Bpertussis groupreport cw20151218.pdf]] &amp;#039;&amp;#039;version 2&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[Media:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_groupreport_cw20151218.pdf&amp;diff=8192</id>
		<title>File:Bpertussis groupreport cw20151218.pdf</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_groupreport_cw20151218.pdf&amp;diff=8192"/>
				<updated>2015-12-19T03:17:14Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:Bpertussis groupreport cw20151218.pdf&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;group report for bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8191</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8191"/>
				<updated>2015-12-19T00:30:36Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: added group report version 1&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[Media:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database: [[Media:ReadMe bpertussis-std cw20151210.docx]]&lt;br /&gt;
** [[Media:Bpertussis genedatabase schema cw20151210.jpg|Gene Database Schema diagram (also included in ReadMe)]]&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[Media:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[Media:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[Media:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[Media:Bpertussis compiledrawdata cw20151218.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[Media:Bpertussis compiledrawdata cw20151218.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO.txt|Increased]]&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO.txt|Decreased]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[Media:Bpertussis compiledrawdata cw20151218.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO - Filtered.xlsx|Increased]] &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO - Filtered.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[Media: Bpertussis ribosomepathway cw20151218.mapp]]&lt;br /&gt;
* [[Media:Bpertussis groupreport cw20151218.pdf| Group Report]] &amp;#039;&amp;#039;version 1&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[Media:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_groupreport_cw20151218.pdf&amp;diff=8190</id>
		<title>File:Bpertussis groupreport cw20151218.pdf</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_groupreport_cw20151218.pdf&amp;diff=8190"/>
				<updated>2015-12-19T00:29:36Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: group report for bordetella pertussis&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;group report for bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8166</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8166"/>
				<updated>2015-12-18T22:39:45Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: changed file to media&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[Media:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database: [[Media:ReadMe bpertussis-std cw20151210.docx]]&lt;br /&gt;
** [[Media:Bpertussis genedatabase schema cw20151210.jpg|Gene Database Schema diagram (also included in ReadMe)]]&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[Media:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[Media:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[Media:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[Media:Bpertussis compiledrawdata cw20151218.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[Media:Bpertussis compiledrawdata cw20151218.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO.txt|Increased]]&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO.txt|Decreased]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[Media:Bpertussis compiledrawdata cw20151218.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO - Filtered.xlsx|Increased]] &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO - Filtered.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[Media: Bpertussis ribosomepathway cw20151218.mapp]]&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[Media:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Gdb_testingreport_cw20151210.pdf&amp;diff=8165</id>
		<title>File:Gdb testingreport cw20151210.pdf</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Gdb_testingreport_cw20151210.pdf&amp;diff=8165"/>
				<updated>2015-12-18T22:39:22Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:Gdb testingreport cw20151210.pdf&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Final Gene Database Testing Report, .pdf version&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8164</id>
		<title>Gene Database Testing Report- cw20151210</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8164"/>
				<updated>2015-12-18T22:36:16Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: added cropped file&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
&lt;br /&gt;
For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[File:Testingreport cw20151210.zip]]&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip], a tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*We downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW\bpertussis_cw20151210&amp;#039;&amp;#039; on our computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* We went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, we navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** We clicked on the &amp;quot;Download&amp;quot; button at the top of the page above and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, we navigated to the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa GOA file for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**The browser opened this text file automatically rather than downloading it, so we saved the file manually.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* We downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
* We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====Downloaded GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# We downloaded the custom version of GenMAPP Builder including the most recent version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class (Version 3.0.0 Build 5 - cw20151210): [[File:Dist cw20151210.zip]].&lt;br /&gt;
# We extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* We launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, we created a new database: &amp;#039;&amp;#039;bpertussis_cw20151210_gmb3build5&amp;#039;&amp;#039;.&lt;br /&gt;
** We opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** We clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** We clicked on the Execute Query icon to run these commands.&lt;br /&gt;
***In viewing the schema for this database, we confirmed that there were 167 tables after running the above commands.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, we launched gmbuilder.bat.&lt;br /&gt;
* We selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*We specified and imported the downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; into the database through the following steps:&lt;br /&gt;
** Selected File &amp;gt; Import UniProt XML...&lt;br /&gt;
** Selected File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** Clicked OK to the message asking to process the GO data.&lt;br /&gt;
** Selected File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* We selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* We entered our coder&amp;#039;s name (Brandon Klein) in the owner field.&lt;br /&gt;
* We selected the custom profile &amp;quot;Bordetella pertussis, Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;.&lt;br /&gt;
* We checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, we clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5 - cw20151210&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120, last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_12&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.88 minutes&lt;br /&gt;
** Note: The import time was similar to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (2.59 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- December 10, 2015 (TIME?)&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 6.97 minutes &lt;br /&gt;
* Time taken to process: 4.52 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (7.08 minutes and 4.42 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151210.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 08-Dec-2015 02:45&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa GOA file for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.03 minutes&lt;br /&gt;
** Note: The import time was very similar to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (0.04 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 1:19 AM&lt;br /&gt;
** End time: 2:11 AM&lt;br /&gt;
** Elapsed time: 52 minutes&lt;br /&gt;
Note: No interruptions occurred during the export process.&lt;br /&gt;
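&lt;br /&gt;
The elapsed time reported above follows directly from the start and end times. The short Python sketch below shows that arithmetic; the variable names are illustrative only, and the calculation assumes both times fall on the same day.&lt;br /&gt;

```python
from datetime import datetime

# Clock times as recorded in the report (assumed to be on the same day).
TIME_FORMAT = "%I:%M %p"
start = datetime.strptime("1:19 AM", TIME_FORMAT)
end = datetime.strptime("2:11 AM", TIME_FORMAT)

# Difference between the two times, expressed in whole minutes.
elapsed_minutes = int((end - start).total_seconds() // 60)
print(elapsed_minutes)
```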
&lt;br /&gt;
===TallyEngine===&lt;br /&gt;
* We ran the TallyEngine in GenMAPP Builder and specified the following files:&lt;br /&gt;
**XML- [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
**GO- [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
*Results:&lt;br /&gt;
**[[File:TallyEngineResults cw20151210.png]]&lt;br /&gt;
***All TallyEngine results were consistent across both files.&lt;br /&gt;
***The TallyEngine was not customized to reflect the coding changes made to GenMAPP Builder Version 3.0.0 Build 5 - cw20151210.&lt;br /&gt;
****Therefore, the total count for &amp;quot;Ordered Locus Names&amp;quot; and &amp;quot;ORF&amp;quot; gene IDs remained 3446. The extra ID that was imported in this build, &amp;quot;BP3167A&amp;quot;, was not listed in either of these categories.&lt;br /&gt;
****&amp;#039;&amp;#039;&amp;#039;Further TallyEngine customization is necessary to raise the count to 3447 gene IDs.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
===Using XMLPipeDB Match to Validate the XML Results from the TallyEngine===&lt;br /&gt;
The following functions were performed using the Windows command line (cmd).&lt;br /&gt;
*We entered the project folder using the following command:&lt;br /&gt;
 cd /d T:\Bklein7_CW\bpertussis_cw20151210&lt;br /&gt;
*We used XMLPipeDB match to identify matches of gene IDs in the UniProt XML file that conformed to the following patterns: &amp;quot;BP####&amp;quot;, &amp;quot;BP####.1&amp;quot;, &amp;quot;BP####A&amp;quot;, and &amp;quot;BP####B&amp;quot;. The command used was as follows:&lt;br /&gt;
 java -jar xmlpipedb-match-1.1.1.jar &amp;quot;BP[0-9][0-9][0-9][0-9](A|B|\.1|)&amp;quot; &amp;lt; &amp;quot;uniprot-proteome%3AUP000002676_cw20151201.xml&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Match Results:&lt;br /&gt;
*[[File:Xmlpipedbmatch cw20151203.png]]&lt;br /&gt;
**The number of unique matches generated by XMLPipeDB Match, 3447, matched our expectation. The count includes the ordered locus (3435) and ORF (11) gene IDs along with the unique EnsemblBacteria reference ID &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
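&lt;br /&gt;
The counting step above can be sketched in Python. This is a minimal illustration and not the XMLPipeDB Match implementation: the function name count_unique_ids and the sample text are hypothetical, and only the pattern is taken from the command above.&lt;br /&gt;

```python
import re

# Same pattern as passed to XMLPipeDB Match above: "BP" plus four digits,
# optionally followed by "A", "B", or ".1".
GENE_ID_PATTERN = re.compile(r"BP[0-9][0-9][0-9][0-9](?:A|B|\.1|)")


def count_unique_ids(text):
    """Return (number of unique matches, total number of matches) in text."""
    matches = GENE_ID_PATTERN.findall(text)
    return len(set(matches)), len(matches)


# Hypothetical sample input; the real input was the UniProt XML file.
sample = "BP0001 BP0001 BP3167A BP1234.1 XBP99"
unique_count, total_count = count_unique_ids(sample)
print(unique_count, total_count)
```

Counting unique matches, not total matches, is what makes the result comparable to a count of distinct gene IDs, since the same ID can appear more than once in the XML file.&lt;br /&gt;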
&lt;br /&gt;
===Using SQL Queries to Validate the PostgreSQL Database Results from the TallyEngine===&lt;br /&gt;
We used the SQL &amp;quot;union&amp;quot; operation to count the number of &amp;quot;ordered locus&amp;quot; gene IDs, which conform to the pattern &amp;quot;BP####&amp;quot;, in addition to all gene IDs that matched the patterns &amp;quot;BP####A&amp;quot; &amp;amp; &amp;quot;BP####B&amp;quot; (including 11 &amp;quot;ORF&amp;quot; gene IDs and 1 EnsemblBacteria reference ID):&lt;br /&gt;
&lt;br /&gt;
 select count(value) from (&lt;br /&gt;
   select value from genenametype where type = &amp;#039;ordered locus&amp;#039;&lt;br /&gt;
   union&lt;br /&gt;
   select value from propertytype&lt;br /&gt;
     inner join dbreferencetype on (propertytype.dbreferencetype_property_hjid = dbreferencetype.hjid)&lt;br /&gt;
   where dbreferencetype.type = &amp;#039;EnsemblBacteria&amp;#039;&lt;br /&gt;
     and propertytype.type = &amp;#039;gene ID&amp;#039;&lt;br /&gt;
     and propertytype.value ~ &amp;#039;BP[0-9][0-9][0-9][0-9](A|B)&amp;#039;&lt;br /&gt;
 ) as combined;&lt;br /&gt;
&lt;br /&gt;
Note: This query was crafted by [[User:Dondi|Dr. Dionisio]].&lt;br /&gt;
&lt;br /&gt;
Results:&lt;br /&gt;
*[[File:PostgreSQL Count cw20151210.png]]&lt;br /&gt;
* The number of unique matches yielded by this SQL query, 3447, matched the count generated by XMLPipeDB Match. Thus, the locations of all 3447 gene IDs in the PostgreSQL relational database were accounted for here.&lt;br /&gt;
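&lt;br /&gt;
The role of the &amp;quot;union&amp;quot; operation in the query can be illustrated with Python sets: SQL union keeps distinct rows only, so an ID returned by both subqueries is counted once. The ID lists below are hypothetical stand-ins for the two subquery results, not rows from the actual database.&lt;br /&gt;

```python
# Hypothetical results of the two subqueries (not real database rows).
ordered_locus_ids = ["BP0001", "BP0002", "BP0003A"]
ensembl_gene_ids = ["BP0003A", "BP3167A"]

# SQL "union" keeps distinct rows only, which a set union mirrors.
combined = set(ordered_locus_ids) | set(ensembl_gene_ids)

# Adding the two counts separately would count the shared ID twice.
naive_total = len(ordered_locus_ids) + len(ensembl_gene_ids)

print(len(combined), naive_total)
```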
&lt;br /&gt;
===OriginalRowCounts Comparison===&lt;br /&gt;
&lt;br /&gt;
We opened the gene database file [[File:Bpertussis-std_cw20151210.zip]] in  Microsoft Access and assessed the &amp;quot;OriginalRowCounts&amp;quot; table to see if the expected tables were listed with the expected number of records. The contents of this table were compared to the &amp;#039;&amp;#039;OriginalRowCounts&amp;#039;&amp;#039; table of an existing .gdb file created during Week 9.&lt;br /&gt;
 &lt;br /&gt;
Benchmark .gdb file: [[File:Vc-Std 20151027 TR.gdb]]&lt;br /&gt;
&lt;br /&gt;
&amp;quot;OriginalRowCounts&amp;quot; table from the benchmark and new gdb:&lt;br /&gt;
*[[File:ComparisonToBenchmark cw20151210.PNG]]&lt;br /&gt;
**All 52 tables present in the 2015 &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; database were also present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; gene database, &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;. This confirmed that all expected tables were successfully created.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; table count is listed as 3447. &amp;#039;&amp;#039;&amp;#039;This count demonstrates that the missing ID, &amp;quot;BP3167A&amp;quot;, was successfully added to the export (confirmed below).&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
***[[File:BP3167A Confirmed cw20151210.PNG]]&lt;br /&gt;
&lt;br /&gt;
Note: The &amp;quot;OriginalRowCounts&amp;quot; tables were too large to screenshot. To circumvent this problem and facilitate the comparison, we copied the &amp;quot;OriginalRowCounts&amp;quot; tables from both gene databases into an Excel file and zoomed out. The above screenshot was taken from this Excel file. The &amp;quot;OrderedLocusNames&amp;quot; row count for &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039; is highlighted in yellow.&lt;br /&gt;
&lt;br /&gt;
===Visual Inspection===&lt;br /&gt;
We visually inspected individual tables within [[File:Bpertussis-std_cw20151210.zip]] using Microsoft Access.&lt;br /&gt;
&lt;br /&gt;
*Systems Table&lt;br /&gt;
**35 gene ID systems were listed, 11 of which were used in the creation of this .gdb file and listed the appropriate import date (12/10/2015).&lt;br /&gt;
***All gene ID systems relevant to &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; were listed. These include EMBL, EnsemblBacteria, GeneID, GeneOntology, InterPro, OrderedLocusNames, Pfam, RefSeq, and UniProt.&lt;br /&gt;
***This result corresponded with that of the benchmark .gdb file listed in the &amp;quot;OriginalRowCounts Comparison&amp;quot; section.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; listing properly displayed customizations to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile.&lt;br /&gt;
***In this row, the species was listed correctly as &amp;quot;Bordetella pertussis&amp;quot;.&lt;br /&gt;
***In this row, the link corresponded to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; database at GeneDB. The link was as follows: http://www.genedb.org/gene/~;jsessionid=A06A0EFE93C64E476380393D4CBEFA69?actionName=%2FQuery%2FquickSearch&amp;amp;resultsSize=1&amp;amp;taxonNodeName=Bpertussis.&lt;br /&gt;
*UniProt Table&lt;br /&gt;
**This table contained 3258 entries with 6-character IDs.&lt;br /&gt;
**All IDs in the UniProt table conformed to the following pattern:&lt;br /&gt;
*** [[File:UniProt Ascension Number info.PNG]]&lt;br /&gt;
*RefSeq Table&lt;br /&gt;
**This table contained 6627 entries. All IDs began with one of three prefixes: &amp;quot;NP_&amp;quot;, &amp;quot;YP_&amp;quot;, or &amp;quot;WP_&amp;quot;. The meanings of these prefixes are described in the [http://www.ncbi.nlm.nih.gov/books/NBK50679/ RefSeq documentation].&lt;br /&gt;
***&amp;quot;NP_&amp;quot; and &amp;quot;YP_&amp;quot; Prefixes&lt;br /&gt;
****Refer to proteins. There are 3410 &amp;quot;NP_&amp;quot; IDs and 7 &amp;quot;YP_&amp;quot; IDs.&lt;br /&gt;
***&amp;quot;WP_&amp;quot; Prefixes&lt;br /&gt;
****Refer to &amp;quot;autonomous non-redundant proteins that are not yet directly annotated on a genome&amp;quot;. There were 3210 IDs with the &amp;quot;WP_&amp;quot; prefix.&lt;br /&gt;
***Overall, every entry in the ID column was an expected value.&lt;br /&gt;
*OrderedLocusNames Table&lt;br /&gt;
**This table contained 3447 entries (consistent with the XMLPipeDB Match result).&lt;br /&gt;
**The IDs were copied into an Excel document for analysis:&lt;br /&gt;
***3434 IDs conformed to the pattern &amp;quot;BP####&amp;quot;.&lt;br /&gt;
***11 IDs conformed to the pattern &amp;quot;BP####A&amp;quot;.&lt;br /&gt;
****This included 10 ORF gene IDs &amp;amp; &amp;quot;BP3167A&amp;quot; (reference to an EnsemblBacteria ID).&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####B&amp;quot;.&lt;br /&gt;
****This corresponded to an ORF gene ID.&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####.1&amp;quot;.&lt;br /&gt;
****This ID was the manner in which UniProt classified &amp;quot;BP3167A&amp;quot;. &lt;br /&gt;
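The ID pattern tally described for the OrderedLocusNames table could also be scripted rather than done by hand in Excel. The following is a minimal sketch, not the procedure we used: apart from BP1123 and BP3167A, the sample IDs are hypothetical placeholders, and the function name is chosen only for illustration.&lt;br /&gt;

```python
import re
from collections import Counter

# Illustrative sample only. BP1123 and BP3167A appear in this report;
# the other IDs are hypothetical placeholders for the exported table.
sample_ids = ["BP1123", "BP0002", "BP0003A", "BP3167A", "BP0004B", "BP0005.1"]

# One regular expression per ID shape discussed in the report.
PATTERNS = {
    "BP####": re.compile(r"BP\d{4}"),
    "BP####A": re.compile(r"BP\d{4}A"),
    "BP####B": re.compile(r"BP\d{4}B"),
    "BP####.1": re.compile(r"BP\d{4}\.1"),
}


def classify(gene_id):
    """Return the label of the pattern that matches the whole ID."""
    for label, pattern in PATTERNS.items():
        if pattern.fullmatch(gene_id):
            return label
    return "unexpected"


# Tally how many IDs fall under each pattern.
counts = Counter(classify(gene_id) for gene_id in sample_ids)
for label, count in counts.items():
    print(label, count)
```

Any ID reported as &amp;quot;unexpected&amp;quot; would warrant a closer look before trusting the export.&lt;br /&gt;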
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==bpertussis-std_cw20151210.gdb Use in GenMAPP==&lt;br /&gt;
&lt;br /&gt;
The following analysis was conducted in GenMAPP Version 2.1. Within GenMAPP, the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database was loaded by selecting Data &amp;gt; Choose Gene Database and then selecting the file &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
===Putting a Gene on the MAPP Using the GeneFinder Window===&lt;br /&gt;
&lt;br /&gt;
We made a sample MAPP in which gene IDs conforming to the naming conventions of the 5 major gene databases containing &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome data were added. A screenshot of the resulting MAPP is provided below:&lt;br /&gt;
*[[File:Samplegenemapp.png]]&lt;br /&gt;
*Gene IDs:&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;bp1123&amp;#039;&amp;#039;&amp;#039; refers to the OrderedLocusNames gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;CAE43716&amp;#039;&amp;#039;&amp;#039; refers to the EnsemblBacteria gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Q7VWE5&amp;#039;&amp;#039;&amp;#039; refers to the UniProt gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;2665491&amp;#039;&amp;#039;&amp;#039; refers to the GeneID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;NP_881255&amp;#039;&amp;#039;&amp;#039; refers to the RefSeq gene ID system.&lt;br /&gt;
&lt;br /&gt;
Note: Gene IDs tested from the above gene ID systems all had complete Backpages and were successfully placed on the MAPP.&lt;br /&gt;
&lt;br /&gt;
===Creating an Expression Dataset in the Expression Dataset Manager===&lt;br /&gt;
The file [[File:Bpertussis compiledrawdata cw20151208.txt]] was used to create an expression dataset in GenMAPP.&lt;br /&gt;
&lt;br /&gt;
*Total Number of Gene IDs Imported&lt;br /&gt;
** 3211 of the 3552 gene IDs from the microarray dataset were imported into the expression dataset.&lt;br /&gt;
**There were 341 exceptions during the creation of the expression dataset. A screenshot of the error message is shown here: &lt;br /&gt;
***[[File:Errors in genmapp.png]]&lt;br /&gt;
*Investigating Errors in the Exceptions File (EX.txt)&lt;br /&gt;
**All 341 exceptions triggered the following error message: &amp;quot;Gene not found in OrderedLocusNames or any related system.&amp;quot;&lt;br /&gt;
**Gene IDs that triggered this error message conformed to the patterns &amp;quot;BP####&amp;quot; and &amp;quot;BP####A&amp;quot;, indicating that no unique gene ID patterns were the cause of these errors.&lt;br /&gt;
***Example gene IDs that triggered this error are the following: BP0101, BP1677, BP0910A, and BP2029A.&lt;br /&gt;
****Searching for any of these gene IDs in UniProt returns the message &amp;quot;Sorry, no results found for your search term.&amp;quot;:&lt;br /&gt;
*****[[File:ErroneousID Uniprot cw20151210.PNG]]&lt;br /&gt;
***The 341 gene IDs were copied into a new Excel file and compared to the gene IDs present in the file [[File:Bpertussis-std_cw20151210.zip]] (adapted from the &amp;quot;OrderedLocusNames&amp;quot; table in Microsoft Access).&lt;br /&gt;
****None of the 341 gene IDs were present in the .gdb file.&lt;br /&gt;
***The 341 gene IDs were each individually searched for in UniProt.&lt;br /&gt;
****None of the 341 gene IDs retrieved results in UniProt.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Conclusion: None of the gene IDs that triggered errors were present in the original UniProt XML file.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
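The comparison of the exception IDs against the gene database IDs amounts to a set intersection, and the import count follows from simple subtraction. The sketch below illustrates both under stated assumptions: the four exception IDs are the examples quoted above, while the gene database IDs other than BP1123 and BP3167A are hypothetical placeholders for the full OrderedLocusNames list.&lt;br /&gt;

```python
# Stand-in for the 3447 IDs in the OrderedLocusNames table.
# BP0001 and BP0002 are hypothetical placeholders.
gdb_ids = {"BP0001", "BP0002", "BP1123", "BP3167A"}

# Example IDs quoted from the exceptions file (EX.txt).
exception_ids = {"BP0101", "BP1677", "BP0910A", "BP2029A"}

# IDs present in both lists; an empty result mirrors the finding that
# none of the exception IDs were present in the .gdb file.
overlap = exception_ids.intersection(gdb_ids)
print("Exception IDs found in the gene database:", len(overlap))

# Import arithmetic reported by the Expression Dataset Manager.
total_ids = 3552
exceptions = 341
imported = total_ids - exceptions
print("Imported gene IDs:", imported)
```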
&lt;br /&gt;
===Coloring a MAPP with Expression Data===&lt;br /&gt;
&lt;br /&gt;
====Creating a New Color Set====&lt;br /&gt;
We customized the new Expression Dataset by creating a new color set entitled &amp;quot;LogFoldChange&amp;quot;.&lt;br /&gt;
# First, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;increase&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Increased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as red using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;gt; 0.25 AND [B-H_Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;.&lt;br /&gt;
#Second, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;decrease&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Decreased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as green using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;lt; -0.25 AND [B-H_Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;&lt;br /&gt;
# After entering these criteria, we saved the entire Expression Dataset by selecting Save from the Expression Dataset menu. This updated our .gex file with the new Color Set.&lt;br /&gt;
&lt;br /&gt;
Screenshot of Color Set criteria:&lt;br /&gt;
*[[File:Expression dataset BHpvalue criteria.png]]&lt;br /&gt;
&lt;br /&gt;
Note: No errors were encountered in the creation of the Color Set.&lt;br /&gt;
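The two criteria in the &amp;quot;LogFoldChange&amp;quot; color set can be restated as a small function. This is only a sketch of the logic, not GenMAPP code: the function name is hypothetical, and returning gray when neither criterion is met is an assumption based on the gray genes seen on our MAPPs.&lt;br /&gt;

```python
from operator import gt, lt  # function forms of greater-than and less-than

FOLD_CUTOFF = 0.25  # threshold applied to Avg_ABC_Samples
P_CUTOFF = 0.05     # threshold applied to B-H_Pvalue


def color_for(avg_abc_samples, bh_pvalue):
    """Return the color a gene would receive under the two criteria."""
    significant = lt(bh_pvalue, P_CUTOFF)
    if significant and gt(avg_abc_samples, FOLD_CUTOFF):
        return "red"    # the "Increased" criterion
    if significant and lt(avg_abc_samples, -FOLD_CUTOFF):
        return "green"  # the "Decreased" criterion
    return "gray"       # assumed color when neither criterion is met


print(color_for(0.8, 0.01))   # meets "Increased"
print(color_for(-0.6, 0.01))  # meets "Decreased"
print(color_for(0.8, 0.30))   # fails the p value cutoff
```

Both comparisons are strict, so a gene with a value of exactly 0.25 or a p value of exactly 0.05 meets neither criterion.&lt;br /&gt;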
&lt;br /&gt;
====Creating a Pathway-Based MAPP Using Colored Genes====&lt;br /&gt;
====Ribosome Kegg Pathway====&lt;br /&gt;
* We created a MAPP of the ribosome pathway using the genes provided by the http://www.genome.jp/kegg/ website.&lt;br /&gt;
** After accessing the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Ribosome&amp;quot; under section 2.2 Translation and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the ribosome pathway with the gene IDs that pertained to our specific organism. We were then able to create a MAPP using these genes in GenMAPP.&lt;br /&gt;
** Each of the green highlighted genes on the ribosome pathway was added to the MAPP in GenMAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151218&amp;quot; was then applied to color code the genes.&lt;br /&gt;
**Here is a picture of the final MAPP created for the ribosome pathway:&lt;br /&gt;
*[[File: Bpertussis ribosomepathway cw20151218.jpg]]&lt;br /&gt;
** Most of the ribosome genes on this MAPP were colored green, symbolizing a decrease; the gray genes were not significantly changed in this experiment. Because every significantly changed gene mapped for the ribosome pathway was green, the expression of the genes in the ribosome category decreased during the microarray experiment. Ribosomes play a key role in translation, and without them genes are often repressed and unable to perform their proper functions. The microarray experiment analysis revealed that the absence of a membrane-associated protein named KpsT in &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; resulted in global down-regulation of gene expression, including key virulence genes. The decreased expression seen in the ribosome pathway links translation to these down-regulated genes. We interpret this to mean that, lacking a necessary protein, these genes were not translated and the ribosomes were not engaged, which ultimately led to the decrease in expression of the genes mapped in the ribosome pathway.&lt;br /&gt;
&lt;br /&gt;
====Nitrogen Cycle Kegg Pathway====&lt;br /&gt;
* We also created a MAPP using the nitrogen metabolism pathway genes provided by the http://www.genome.jp/kegg/ website.&lt;br /&gt;
** After accessing the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Nitrogen Metabolism&amp;quot; under section 1.2 Energy Metabolism and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the nitrogen metabolism pathway with the gene IDs that pertained to our specific organism. We were then able to create a MAPP using these genes in GenMAPP.&lt;br /&gt;
** Each of the green highlighted genes on the nitrogen metabolism pathway was added to the MAPP in GenMAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151218&amp;quot; was then applied to color code the genes.&lt;br /&gt;
** Here is a picture of the final MAPP created for the nitrogen cycle pathway:&lt;br /&gt;
*[[File:Finalnitrogencyclebpertussis cw20151218.jpg]]&lt;br /&gt;
** This MAPP displayed both red and green genes, with green symbolizing a decrease and red symbolizing an increase, as well as a couple of gray genes that did not meet either criterion. We created this nitrogen cycle MAPP because of the important metabolic processes, specifically nitrogen metabolism, that keep cells alive and reproducing. The genes displayed in red had increased expression during the microarray experiment, and the KEGG pathway for nitrogen metabolism shows that these genes specifically aid in the metabolism of glutamate. Glutamate is important to cells because it plays a role in providing the energy that allows them to operate correctly. Since the glutamate-related genes that we mapped were increased, we infer that glutamate plays a role in supplying the underlying energy that allows the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strains to produce the polysaccharide capsule transport proteins studied in the microarray experiment.&lt;br /&gt;
&lt;br /&gt;
===Running MAPPFinder===&lt;br /&gt;
*MAPPFinder Procedure&lt;br /&gt;
** We launched the MAPPFinder program from within GenMAPP and ensured that the &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039; gene database was still loaded into GenMAPP.&lt;br /&gt;
** We clicked on the button &amp;quot;Calculate New Results&amp;quot; followed by &amp;quot;Find File&amp;quot;, at which point we specified the .gex file updated during the creation of the &amp;quot;LogFoldChange&amp;quot; color set.&lt;br /&gt;
** We chose to apply both the &amp;quot;Increased&amp;quot; and &amp;quot;Decreased&amp;quot; criteria present within the LogFoldChange color set to the data.&lt;br /&gt;
** We checked the boxes next to &amp;quot;Gene Ontology&amp;quot; and &amp;quot;p value&amp;quot;, specified the results file, and then clicked &amp;quot;Run MAPPFinder&amp;quot;.&lt;br /&gt;
***This analysis took several minutes to complete.&lt;br /&gt;
*MAPPFinder Analysis Results&lt;br /&gt;
**We selected &amp;quot;Show Ranked List&amp;quot; to see a list of the most significant Gene Ontology terms. A screenshot of this output is shown below:&lt;br /&gt;
**[[File:GeneontologyresultsBHpvalue.png]]&lt;br /&gt;
***The majority of the most significant gene ontology terms pertained to ribosome biosynthesis and translation.&lt;br /&gt;
&lt;br /&gt;
Note: The MAPPFinder analysis took approximately 8 minutes to complete. No errors were encountered in the process. MAPPFinder thus was confirmed to work with the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database.&lt;br /&gt;
&lt;br /&gt;
=== Compare Gene Database to Outside Resource===&lt;br /&gt;
&lt;br /&gt;
To assess the completeness of this version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database, we explored the original genome sequencing data from Parkhill et al. (2003) that was deposited at the [http://www.genedb.org/Homepage/Bpertussis GeneDB Model Organism Database (MOD)]. From the GeneDB Home Page, we accessed a &amp;#039;&amp;#039;Gene Type&amp;#039;&amp;#039; search function that was used to quantify the number of gene listings present under each provided gene category. The results of this investigation are presented below.&lt;br /&gt;
&lt;br /&gt;
====Protein-Coding Genes====&lt;br /&gt;
[[File:GDB protein-coding.png]]&lt;br /&gt;
*There are 3447 protein-coding genes present in the [http://www.genedb.org/Homepage/Bpertussis GeneDB] database. This result verified that the set of protein-coding genes exported into [[File:Bpertussis-std cw20151210.zip]] from UniProt is complete. No further changes to the gene database export procedures are necessary at this time.&lt;br /&gt;
&lt;br /&gt;
====Non-Protein Genome Features====&lt;br /&gt;
&lt;br /&gt;
#Pseudogenes&lt;br /&gt;
#*[[File:GDB_pseudogenes.png]]&lt;br /&gt;
#**GeneDB indicated that 359 pseudogenes are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. Pseudogenes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#rRNA&lt;br /&gt;
#*[[File:GDB_rRNA.png]]&lt;br /&gt;
#**GeneDB indicated that 9 genes that encode rRNA are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. These genes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#tRNA&lt;br /&gt;
#*[[File:GDB_tRNA.png]]&lt;br /&gt;
#**GeneDB indicated that 51 genes that encode tRNA are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. These genes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#snoRNA&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode snoRNA.&lt;br /&gt;
#snRNA&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode snRNA.&lt;br /&gt;
#&amp;quot;miscRNA&amp;quot;&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode &amp;quot;miscRNA&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;A total of 419 non-protein coding genes were identified in the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome in addition to the 3447 protein-coding genes captured in our gene database.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
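The total above can be checked by summing the per-category counts reported by GeneDB. The short sketch below does only that arithmetic; the variable names are chosen for illustration.&lt;br /&gt;

```python
# Counts reported by the GeneDB "Gene Type" search for each category.
non_protein_counts = {
    "pseudogene": 359,
    "rRNA": 9,
    "tRNA": 51,
    "snoRNA": 0,
    "snRNA": 0,
    "miscRNA": 0,
}
protein_coding = 3447  # protein-coding genes captured in the gene database

non_protein_total = sum(non_protein_counts.values())
all_features = protein_coding + non_protein_total

print("Non-protein genome features:", non_protein_total)
print("All listed genome features:", all_features)
```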
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Finalnitrogencyclebpertussis_cw20151218.jpg&amp;diff=8163</id>
		<title>File:Finalnitrogencyclebpertussis cw20151218.jpg</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Finalnitrogencyclebpertussis_cw20151218.jpg&amp;diff=8163"/>
				<updated>2015-12-18T22:35:33Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:Finalnitrogencyclebpertussis cw20151218.jpg&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;nitrogen cycle mapp with Bhpvalue criteria&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Gdb_testingreport_cw20151210.pdf&amp;diff=8162</id>
		<title>File:Gdb testingreport cw20151210.pdf</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Gdb_testingreport_cw20151210.pdf&amp;diff=8162"/>
				<updated>2015-12-18T22:31:35Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:Gdb testingreport cw20151210.pdf&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Final Gene Database Testing Report, .pdf version&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8161</id>
		<title>Gene Database Testing Report- cw20151210</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8161"/>
				<updated>2015-12-18T22:29:28Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: edited media&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
&lt;br /&gt;
For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[File:Testingreport cw20151210.zip]]&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip] tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*We downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW\bpertussis_cw20151210&amp;#039;&amp;#039; on our computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* We went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, we navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** We clicked on the &amp;quot;Download&amp;quot; button at the top of the page above and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, we navigated to the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa GOA file for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**This text file was automatically opened by the browser. Therefore, we had to manually download the file.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* We downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
* We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====Downloaded GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# We downloaded the custom version of GenMAPP Builder including the most recent version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class (Version 3.0.0 Build 5 - cw20151210): [[File:Dist cw20151210.zip]].&lt;br /&gt;
# We extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* We launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, we created a new database: &amp;#039;&amp;#039;bpertussis_cw20151210_gmb3build5&amp;#039;&amp;#039;.&lt;br /&gt;
** We opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** We clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** We clicked on the Execute Query icon to run this command.&lt;br /&gt;
***In viewing the schema for this database, we confirmed that there were 167 tables after running the above command.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, we launched gmbuilder.bat.&lt;br /&gt;
* We selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*The downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; were specified and imported into the database by clicking on the following buttons:&lt;br /&gt;
** Selected File &amp;gt; Import UniProt XML...&lt;br /&gt;
** Selected File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** Clicked OK to the message asking to process the GO data.&lt;br /&gt;
** Selected File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* We selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* We typed in our coder&amp;#039;s name in the owner field (Brandon Klein).&lt;br /&gt;
* We selected the custom profile &amp;quot;Bordetella pertussis, Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;.&lt;br /&gt;
* We checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, we clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5 - cw20151210&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_12&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.88 minutes&lt;br /&gt;
** Note: The import time was similar to that when creating the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (2.59 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- December 10, 2015 (TIME?)&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 6.97 minutes &lt;br /&gt;
* Time taken to process: 4.52 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (7.08 minutes and 4.42 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151210.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 08-Dec-2015 02:45&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.03 minutes&lt;br /&gt;
** Note: The import time was very similar to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (0.04 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 1:19 AM&lt;br /&gt;
** End time: 2:11 AM&lt;br /&gt;
** Elapsed time: 52 minutes&lt;br /&gt;
Note: No interruptions occurred during the export process.&lt;br /&gt;
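The elapsed export time follows directly from the recorded start and end times. As a quick check, the subtraction can be done as below; the variable names are illustrative, and the sketch assumes both times fall on the same day and that the AM/PM marker is parsed under an English locale.&lt;br /&gt;

```python
from datetime import datetime

TIME_FORMAT = "%I:%M %p"  # 12-hour clock with AM/PM marker

start = datetime.strptime("1:19 AM", TIME_FORMAT)
end = datetime.strptime("2:11 AM", TIME_FORMAT)

# Difference between the two clock times, expressed in minutes.
elapsed_minutes = (end - start).total_seconds() / 60
print("Elapsed minutes:", elapsed_minutes)
```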
&lt;br /&gt;
===TallyEngine===&lt;br /&gt;
* We ran the TallyEngine in GenMAPP Builder and specified the following files:&lt;br /&gt;
**XML- [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
**GO- [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
*Results:&lt;br /&gt;
**[[File:TallyEngineResults cw20151210.png]]&lt;br /&gt;
***All TallyEngine results were consistent across both files.&lt;br /&gt;
***The TallyEngine was not customized to reflect the coding changes made to GenMAPP Builder Version 3.0.0 Build 5 - cw20151210.&lt;br /&gt;
****Therefore, the total count for &amp;quot;Ordered Locus Names&amp;quot; and &amp;quot;ORF&amp;quot; gene IDs remained 3446. The extra ID that was imported in this build, &amp;quot;BP3167A&amp;quot;, was not listed in either of these categories.&lt;br /&gt;
****&amp;#039;&amp;#039;&amp;#039;Further TallyEngine customization is necessary to raise the count to 3447 gene IDs.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
===Using XMLPipeDB Match to Validate the XML Results from the TallyEngine===&lt;br /&gt;
The following functions were performed using the Windows command line (cmd).&lt;br /&gt;
*We entered the project folder using the following command:&lt;br /&gt;
 cd /d T:\Bklein7_CW\bpertussis_cw20151210&lt;br /&gt;
*We used XMLPipeDB match to identify matches of gene IDs in the UniProt XML file that conformed to the following patterns: &amp;quot;BP####&amp;quot;, &amp;quot;BP####.1&amp;quot;, &amp;quot;BP####A&amp;quot;, and &amp;quot;BP####B&amp;quot;. The command used was as follows:&lt;br /&gt;
 java -jar xmlpipedb-match-1.1.1.jar &amp;quot;BP[0-9][0-9][0-9][0-9](A|B|\.1|)&amp;quot; &amp;lt; &amp;quot;uniprot-proteome%3AUP000002676_cw20151201.xml&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Match Results:&lt;br /&gt;
*[[File:Xmlpipedbmatch cw20151203.png]]&lt;br /&gt;
**The number of unique matches generated by XMLPipeDB Match, 3447, matched with our expectation. The count includes the total number of ordered locus (3435) and ORF (11) gene IDs along with the unique EnsemblBacteria reference ID &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
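For readers without the XMLPipeDB match utility, the same regular expression can be tried with Python&amp;#039;s re module. This is a rough stand-in and not the tool we ran: the sample text is a hypothetical excerpt rather than the UniProt XML file, and the function name is illustrative.&lt;br /&gt;

```python
import re

# Same pattern as in the command above. The empty final alternative lets
# a plain four-digit ID match when no suffix is present.
LOCUS_PATTERN = re.compile(r"BP[0-9][0-9][0-9][0-9](A|B|\.1|)")


def unique_matches(text):
    """Return the set of distinct IDs matched anywhere in the text."""
    return {match.group(0) for match in LOCUS_PATTERN.finditer(text)}


# Hypothetical excerpt standing in for the contents of the XML file.
sample_text = "BP1123 BP1123 BP3167A BP3167.1 BP0005B"

found = unique_matches(sample_text)
print(len(found), sorted(found))
```

Repeated IDs are counted once, which corresponds to the unique match count reported by the utility.&lt;br /&gt;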
&lt;br /&gt;
===Using SQL Queries to Validate the PostgreSQL Database Results from the TallyEngine===&lt;br /&gt;
We used the SQL &amp;quot;union&amp;quot; operation to count the number of &amp;quot;ordered locus&amp;quot; gene IDs, which conform to the pattern &amp;quot;BP####&amp;quot;, in addition to all gene IDs that matched the patterns &amp;quot;BP####A&amp;quot; &amp;amp; &amp;quot;BP####B&amp;quot; (including 11 &amp;quot;ORF&amp;quot; gene IDs and 1 EnsemblBacteria reference ID):&lt;br /&gt;
&lt;br /&gt;
 select count(value) from (select value from genenametype where type = &lt;br /&gt;
 &amp;#039;ordered locus&amp;#039; union select value from propertytype inner join dbreferencetype&lt;br /&gt;
  on (propertytype.dbreferencetype_property_hjid = dbreferencetype.hjid)&lt;br /&gt;
   where dbreferencetype.type = &amp;#039;EnsemblBacteria&amp;#039; and propertytype.type = &lt;br /&gt;
   &amp;#039;gene ID&amp;#039; and propertytype.value ~ &amp;#039;BP[0-9][0-9][0-9][0-9](A|B)&amp;#039;) as combined;&lt;br /&gt;
&lt;br /&gt;
Note: This query was crafted by [[User:Dondi|Dr. Dionisio]].&lt;br /&gt;
&lt;br /&gt;
Results:&lt;br /&gt;
*[[File:PostgreSQL Count cw20151210.png]]&lt;br /&gt;
* The number of unique matches yielded by this SQL query, 3447, matched the count generated by XMLPipeDB Match. Thus, the locations of all 3447 gene IDs in the PostgreSQL relational database were accounted for here.&lt;br /&gt;
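&lt;br /&gt;
The deduplicating behavior of the union in the query above can be illustrated with Python sets. This is a sketch on made-up IDs, not a reproduction of the database: the two lists stand in for the results of the genenametype and propertytype subqueries, and combined_count is a hypothetical helper.&lt;br /&gt;

```python
import re

# Unanchored search, like the PostgreSQL ~ operator used in the query.
SUFFIXED_ID = re.compile(r"BP[0-9][0-9][0-9][0-9](A|B)")


def combined_count(ordered_locus_values, ensembl_gene_ids):
    """Count distinct IDs across both sources, as SQL UNION would."""
    suffixed = {v for v in ensembl_gene_ids if SUFFIXED_ID.search(v)}
    return len(set(ordered_locus_values) | suffixed)


# Hypothetical rows standing in for the two subqueries.
ordered_locus = ["BP0001", "BP0002", "BP0007.1"]
ensembl_ids = ["BP0001", "BP0005A", "BP0005A", "BP0009B"]
print(combined_count(ordered_locus, ensembl_ids))
```

Because a union keeps each value once, an ID that appears in both sources or several times in one source contributes a single row to the count.&lt;br /&gt;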
&lt;br /&gt;
===OriginalRowCounts Comparison===&lt;br /&gt;
&lt;br /&gt;
We opened the gene database file [[File:Bpertussis-std_cw20151210.zip]] in  Microsoft Access and assessed the &amp;quot;OriginalRowCounts&amp;quot; table to see if the expected tables were listed with the expected number of records. The contents of this table were compared to the &amp;#039;&amp;#039;OriginalRowCounts&amp;#039;&amp;#039; table of an existing .gdb file created during Week 9.&lt;br /&gt;
 &lt;br /&gt;
Benchmark .gdb file: [[File:Vc-Std 20151027 TR.gdb]]&lt;br /&gt;
&lt;br /&gt;
&amp;quot;OriginalRowCounts&amp;quot; table from the benchmark and new gdb:&lt;br /&gt;
*[[File:ComparisonToBenchmark cw20151210.PNG]]&lt;br /&gt;
**All 52 tables present in the 2015 &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; database were also present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; gene database, &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;. This confirmed that all expected tables were successfully created.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; table count is listed as 3447. &amp;#039;&amp;#039;&amp;#039;This count demonstrates that the missing ID, &amp;quot;BP3167A&amp;quot;, was successfully added to the export (confirmed below).&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
***[[File:BP3167A Confirmed cw20151210.PNG]]&lt;br /&gt;
&lt;br /&gt;
Note: The &amp;quot;OriginalRowCounts&amp;quot; tables were too large to screenshot. To circumvent this problem and facilitate the comparison, we copied the &amp;quot;OriginalRowCounts&amp;quot; tables from both gene databases into an Excel file and zoomed out. The above screenshot was taken from this Excel file. The &amp;quot;OrderedLocusNames&amp;quot; row count for &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039; is highlighted in yellow.&lt;br /&gt;
&lt;br /&gt;
===Visual Inspection===&lt;br /&gt;
We visually inspected individual tables within [[File:Bpertussis-std_cw20151210.zip]] using Microsoft Access.&lt;br /&gt;
&lt;br /&gt;
*Systems Table&lt;br /&gt;
**35 gene ID systems were listed, 11 of which were used in the creation of this .gdb file and listed the appropriate import date (12/10/2015).&lt;br /&gt;
***All gene ID systems relevant to &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; were listed. These include EMBL, EnsemblBacteria, GeneID, GeneOntology, InterPro, OrderedLocusNames, Pfam, RefSeq, and UniProt.&lt;br /&gt;
***This result corresponded with that of the benchmark .gdb file listed in the &amp;quot;OriginalRowCounts Comparison&amp;quot; section.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; listing properly displayed customizations to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile.&lt;br /&gt;
***In this row, the species was listed correctly as &amp;quot;Bordetella pertussis&amp;quot;.&lt;br /&gt;
***In this row, the link corresponded to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; database at GeneDB. The link was as follows: http://www.genedb.org/gene/~;jsessionid=A06A0EFE93C64E476380393D4CBEFA69?actionName=%2FQuery%2FquickSearch&amp;amp;resultsSize=1&amp;amp;taxonNodeName=Bpertussis.&lt;br /&gt;
*UniProt Table&lt;br /&gt;
**This table contained 3258 entries with 6 character IDs.&lt;br /&gt;
**All IDs in the UniProt table conformed to the following pattern:&lt;br /&gt;
*** [[File:UniProt Ascension Number info.PNG]]&lt;br /&gt;
*RefSeq Table&lt;br /&gt;
**This table contained 6627 entries. All IDs began with one of three prefixes: &amp;quot;NP_&amp;quot;, &amp;quot;YP_&amp;quot;, or &amp;quot;WP_&amp;quot;. The meanings of these prefixes can be found in the RefSeq documentation found [http://www.ncbi.nlm.nih.gov/books/NBK50679/ here].&lt;br /&gt;
***&amp;quot;NP_&amp;quot; and &amp;quot;YP_&amp;quot; Prefixes&lt;br /&gt;
****Refer to proteins. There were 3410 &amp;quot;NP_&amp;quot; IDs and 7 &amp;quot;YP_&amp;quot; IDs.&lt;br /&gt;
***&amp;quot;WP_&amp;quot; Prefix&lt;br /&gt;
****Refers to &amp;quot;autonomous non-redundant proteins that are not yet directly annotated on a genome&amp;quot;. There were 3210 IDs with the &amp;quot;WP_&amp;quot; prefix.&lt;br /&gt;
***Overall, every entry in the ID column was an expected value.&lt;br /&gt;
*OrderedLocusNames Table&lt;br /&gt;
**This table contained 3447 entries (consistent with the XMLPipeDB Match result).&lt;br /&gt;
**The IDs were copied into an Excel document for analysis:&lt;br /&gt;
***3434 IDs conformed to the pattern &amp;quot;BP####&amp;quot;.&lt;br /&gt;
***11 IDs conformed to the pattern &amp;quot;BP####A&amp;quot;.&lt;br /&gt;
****This included 10 ORF gene IDs &amp;amp; &amp;quot;BP3167A&amp;quot; (reference to an EnsemblBacteria ID).&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####B&amp;quot;.&lt;br /&gt;
****This corresponded to an ORF gene ID.&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####.1&amp;quot;.&lt;br /&gt;
****This ID is the form in which UniProt records &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
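&lt;br /&gt;
The Excel tally above can also be scripted. The following is a minimal Python sketch, assuming the IDs have been exported to a list; the sample list and the classify helper are hypothetical. The final lines check that the reported category counts add up to the table total.&lt;br /&gt;

```python
import re
from collections import Counter

# Patterns described in the list above, checked against the whole ID.
PATTERNS = {
    "BP####": re.compile(r"BP\d{4}"),
    "BP####A": re.compile(r"BP\d{4}A"),
    "BP####B": re.compile(r"BP\d{4}B"),
    "BP####.1": re.compile(r"BP\d{4}\.1"),
}


def classify(gene_id):
    """Return the name of the first pattern the whole ID matches, else 'other'."""
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(gene_id):
            return name
    return "other"


# Hypothetical sample; the real input would be the 3447 exported IDs.
sample_ids = ["BP0001", "BP0002", "BP0005A", "BP0009B", "BP0007.1"]
tally = Counter(classify(g) for g in sample_ids)
print(dict(tally))

# Counts reported above: 3434 + 11 + 1 + 1 should equal the table total.
reported_total = 3434 + 11 + 1 + 1
print(reported_total == 3447)
```

Using a full match rather than a search keeps an ID such as one ending in A from also being counted under the plain four-digit pattern.&lt;br /&gt;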
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==bpertussis-std_cw20151210.gdb Use in GenMAPP==&lt;br /&gt;
&lt;br /&gt;
The following analysis was conducted in GenMAPP Version 2.1. Within GenMAPP, the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database was loaded by selecting Data &amp;gt; Choose Gene Database and then selecting the file &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
===Putting a Gene on the MAPP Using the GeneFinder Window===&lt;br /&gt;
&lt;br /&gt;
We made a sample MAPP in which gene IDs conforming to the naming conventions of the 5 major gene databases containing &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome data were added. A screenshot of the resulting MAPP is provided below:&lt;br /&gt;
*[[File:Samplegenemapp.png]]&lt;br /&gt;
*Gene IDs:&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;bp1123&amp;#039;&amp;#039;&amp;#039; refers to the OrderedLocusNames gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;CAE43716&amp;#039;&amp;#039;&amp;#039; refers to the EnsemblBacteria gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Q7VWE5&amp;#039;&amp;#039;&amp;#039; refers to the UniProt gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;2665491&amp;#039;&amp;#039;&amp;#039; refers to the GeneID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;NP_881255&amp;#039;&amp;#039;&amp;#039; refers to the RefSeq gene ID system.&lt;br /&gt;
&lt;br /&gt;
Note: Gene IDs tested from the above gene ID systems all had complete Backpages and were successfully placed on the MAPP.&lt;br /&gt;
&lt;br /&gt;
===Creating an Expression Dataset in the Expression Dataset Manager===&lt;br /&gt;
The file [[File:Bpertussis compiledrawdata cw20151208.txt]] was used to create an expression dataset in GenMAPP.&lt;br /&gt;
&lt;br /&gt;
*Total Number of Gene IDs Imported&lt;br /&gt;
** 3211 of the 3552 gene IDs from the microarray dataset were imported into the expression dataset.&lt;br /&gt;
**There were 341 exceptions during the creation of the expression dataset. A screenshot of the error message is shown here: &lt;br /&gt;
***[[File:Errors in genmapp.png]]&lt;br /&gt;
*Investigating Errors in the Exceptions File (EX.txt)&lt;br /&gt;
**All 341 exceptions triggered the following error message: &amp;quot;Gene not found in OrderedLocusNames or any related system.&amp;quot;&lt;br /&gt;
**Gene IDs that triggered this error message conformed to the patterns &amp;quot;BP####&amp;quot; and &amp;quot;BP####A&amp;quot;, indicating that unusual gene ID patterns were not the cause of these errors.&lt;br /&gt;
***Example gene IDs that triggered this error are the following: BP0101, BP1677, BP0910A, and BP2029A.&lt;br /&gt;
****Searching for any of these gene IDs in UniProt returns the message &amp;quot;Sorry, no results found for your search term.&amp;quot;:&lt;br /&gt;
*****[[File:ErroneousID Uniprot cw20151210.PNG]]&lt;br /&gt;
***The 341 gene IDs were copied into a new Excel file and compared to the gene IDs present in the file [[File:Bpertussis-std_cw20151210.zip]] (adapted from the &amp;quot;OrderedLocusNames&amp;quot; table in Microsoft Access).&lt;br /&gt;
****None of the 341 gene IDs were present in the .gdb file.&lt;br /&gt;
***The 341 gene IDs were each individually searched for in UniProt.&lt;br /&gt;
****None of the 341 gene IDs retrieved results in UniProt.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Conclusion: None of the gene IDs that triggered errors were present in the original UniProt XML file.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
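&lt;br /&gt;
The comparison of the excepted IDs against the gene database can be expressed as a set difference. The following Python sketch works under stated assumptions: the database list is made up, and missing_from_database is a hypothetical helper rather than part of GenMAPP. The arithmetic check uses the counts reported above.&lt;br /&gt;

```python
def missing_from_database(exception_ids, database_ids):
    """Return the exception IDs that do not appear in the gene database, sorted."""
    return sorted(set(exception_ids) - set(database_ids))


# Example exception IDs quoted above; the database list is hypothetical.
exceptions = ["BP0101", "BP1677", "BP0910A"]
database = ["BP0001", "BP0002", "BP3167A"]
print(missing_from_database(exceptions, database))

# 3552 IDs in the dataset minus 3211 imported should leave the 341 exceptions.
print(3552 - 3211 == 341)
```

If every excepted ID comes back from the helper, none of them is present in the gene database, which is the outcome reported above.&lt;br /&gt;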
&lt;br /&gt;
===Coloring a MAPP with Expression Data===&lt;br /&gt;
&lt;br /&gt;
====Creating a New Color Set====&lt;br /&gt;
We customized the new Expression Dataset by creating a new color set entitled &amp;quot;LogFoldChange&amp;quot;.&lt;br /&gt;
# First, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;increase&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Increased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as red using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;gt; 0.25 AND [B-H_Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;.&lt;br /&gt;
#Second, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;decrease&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Decreased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as green using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;lt; -0.25 AND [B-H_Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;&lt;br /&gt;
# After entering these criteria, we saved the entire Expression Dataset by selecting Save from the Expression Dataset menu. This updated our .gex file with the new Color Set.&lt;br /&gt;
&lt;br /&gt;
Screenshot of Color Set criteria:&lt;br /&gt;
*[[File:Expression dataset BHpvalue criteria.png]]&lt;br /&gt;
&lt;br /&gt;
Note: No errors were encountered in the creation of the Color Set.&lt;br /&gt;
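&lt;br /&gt;
The two criteria above can be restated as a small Python function. This is a sketch of the logic only, not how GenMAPP evaluates criteria; the function name, the example values, and the gray fallback label for genes meeting neither criterion are assumptions.&lt;br /&gt;

```python
def color_for(avg_log_fold_change, bh_p_value):
    """Apply the Increased and Decreased criteria from the list above."""
    significant = 0.05 > bh_p_value
    if significant and avg_log_fold_change > 0.25:
        return "red"    # Increased
    if significant and -0.25 > avg_log_fold_change:
        return "green"  # Decreased
    return "gray"       # neither criterion met


# Made-up example values, not taken from the dataset.
print(color_for(1.2, 0.01))
print(color_for(-0.8, 0.03))
print(color_for(0.9, 0.20))
```

Both criteria share the same p value cutoff, so a large fold change with a p value of 0.05 or more is left uncolored by either one.&lt;br /&gt;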
&lt;br /&gt;
====Creating a Pathway-Based MAPP Using Colored Genes====&lt;br /&gt;
====Ribosome KEGG Pathway====&lt;br /&gt;
* We created a MAPP of the ribosome pathway using the genes listed on the KEGG website (http://www.genome.jp/kegg/).&lt;br /&gt;
** After accessing the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Ribosome&amp;quot; under section 2.2 Translation and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected the Bordetella pertussis Tohama I organism, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the ribosome pathway with the gene IDs that pertained to our specific organism. We then used these genes to create a MAPP in GenMAPP.&lt;br /&gt;
** Each of the green-highlighted genes on the ribosome pathway was added to the MAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151218&amp;quot; was then applied to color-code the genes.&lt;br /&gt;
**Here is the final MAPP created for the ribosome pathway:&lt;br /&gt;
*[[File: Bpertussis ribosomepathway cw20151218.jpg]]&lt;br /&gt;
** Most of the ribosome genes on this MAPP were colored green, indicating decreased expression; the remaining genes were gray, meaning that their expression did not change significantly in this experiment. Expression of the genes in the ribosome category therefore largely decreased during the microarray experiment. Ribosomes play a key role in translation; without them, genes cannot be expressed as proteins and so cannot perform their proper functions. The microarray analysis revealed that the absence of a membrane-associated protein named KpsT in B. pertussis resulted in global down-regulation of gene expression, including key virulence genes. The decreased expression of the ribosome pathway genes links translation to this down-regulation: because the down-regulated genes lacked a protein they needed, translation of these genes did not occur and the ribosomes were not involved, which ultimately led to the decrease in expression of the genes mapped in the ribosome pathway.&lt;br /&gt;
&lt;br /&gt;
====Nitrogen Cycle KEGG Pathway====&lt;br /&gt;
* We also created a MAPP of the nitrogen cycle pathway using the genes listed on the KEGG website (http://www.genome.jp/kegg/).&lt;br /&gt;
** After accessing the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Nitrogen Metabolism&amp;quot; under section 1.2 Energy Metabolism and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected the Bordetella pertussis Tohama I organism, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the nitrogen metabolism pathway with the gene IDs that pertained to our specific organism. We then used these genes to create a MAPP in GenMAPP.&lt;br /&gt;
** Each of the green-highlighted genes on the nitrogen metabolism pathway was added to the MAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151218&amp;quot; was then applied to color-code the genes.&lt;br /&gt;
** Here is the final MAPP created for the nitrogen cycle pathway:&lt;br /&gt;
* [[File:Finalnitrogencyclebpertussis cw20151218.jpg]]&lt;br /&gt;
** This MAPP displayed both red and green genes: green indicates decreased expression and red indicates increased expression. A couple of gray genes did not meet either criterion. We created this nitrogen cycle MAPP because nitrogen metabolism is among the important metabolic processes that keep cells alive and reproducing. The genes colored red had increased expression during the microarray experiment, and the KEGG pathway for nitrogen metabolism shows that these genes specifically aid in the metabolism of glutamate. Glutamate is important to cells because it helps provide the energy they need to operate correctly. Since expression of the glutamate-related genes that we mapped increased, this suggests that glutamate plays a role in supplying the energy that allows the Bordetella pertussis strains to produce the polysaccharide capsule transport proteins studied in the microarray experiment.&lt;br /&gt;
&lt;br /&gt;
===Running MAPPFinder===&lt;br /&gt;
*MAPPFinder Procedure&lt;br /&gt;
** We launched the MAPPFinder program from within GenMAPP and ensured that the &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039; gene database was still loaded into GenMAPP.&lt;br /&gt;
** We clicked on the button &amp;quot;Calculate New Results&amp;quot; followed by &amp;quot;Find File&amp;quot;, at which point we specified the .gex file updated during the creation of the &amp;quot;LogFoldChange&amp;quot; color set.&lt;br /&gt;
** We chose to apply both the &amp;quot;Increased&amp;quot; and &amp;quot;Decreased&amp;quot; criteria present within the LogFoldChange color set to the data.&lt;br /&gt;
** We checked the boxes next to &amp;quot;Gene Ontology&amp;quot; and &amp;quot;p value&amp;quot;, specified the results file, and then clicked &amp;quot;Run MAPPFinder&amp;quot;.&lt;br /&gt;
***This analysis took several minutes to complete.&lt;br /&gt;
*MAPPFinder Analysis Results&lt;br /&gt;
**We selected &amp;quot;Show Ranked List&amp;quot; to see a list of the most significant Gene Ontology terms. A screenshot of this output is shown below:&lt;br /&gt;
**[[File:GeneontologyresultsBHpvalue.png]]&lt;br /&gt;
***The majority of the most significant gene ontology terms pertained to ribosome biosynthesis and translation.&lt;br /&gt;
&lt;br /&gt;
Note: The MAPPFinder analysis took approximately 8 minutes to complete. No errors were encountered in the process. MAPPFinder thus was confirmed to work with the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database.&lt;br /&gt;
&lt;br /&gt;
=== Compare Gene Database to Outside Resource===&lt;br /&gt;
&lt;br /&gt;
To assess the completeness of this version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database, we explored the original genome sequencing data from Parkhill et al. (2003) that was deposited at the [http://www.genedb.org/Homepage/Bpertussis GeneDB Model Organism Database (MOD)]. From the GeneDB Home Page, we accessed a &amp;#039;&amp;#039;Gene Type&amp;#039;&amp;#039; search function that was used to quantify the number of gene listings present under each provided gene category. The results of this investigation are presented below.&lt;br /&gt;
&lt;br /&gt;
====Protein-Coding Genes====&lt;br /&gt;
[[File:GDB protein-coding.png]]&lt;br /&gt;
*There are 3447 protein-coding genes present in the [http://www.genedb.org/Homepage/Bpertussis GeneDB] database. This result verified that the set of protein-coding genes exported into [[File:Bpertussis-std cw20151210.zip]] from UniProt is complete. No further changes to the gene database export procedures are necessary at this time.&lt;br /&gt;
&lt;br /&gt;
====Non-Protein Genome Features====&lt;br /&gt;
&lt;br /&gt;
#Pseudogenes&lt;br /&gt;
#*[[File:GDB_pseudogenes.png]]&lt;br /&gt;
#**GeneDB indicated that 359 pseudogenes are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. Pseudogenes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#rRNA&lt;br /&gt;
#*[[File:GDB_rRNA.png]]&lt;br /&gt;
#**GeneDB indicated that 9 genes that encode for rRNA are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. These genes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#tRNA&lt;br /&gt;
#*[[File:GDB_tRNA.png]]&lt;br /&gt;
#**GeneDB indicated that 51 genes that encode for tRNA are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. These genes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#snoRNA&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode for snoRNA.&lt;br /&gt;
#snRNA&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode for snRNA.&lt;br /&gt;
#&amp;quot;miscRNA&amp;quot;&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode for &amp;quot;miscRNA&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;A total of 419 non-protein coding genes were identified in the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome in addition to the 3447 protein-coding genes captured in our gene database.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
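&lt;br /&gt;
The totals above can be checked with a few lines of Python. The counts are those reported in this section; the dictionary layout and variable names are only an illustration.&lt;br /&gt;

```python
# Non-protein feature counts reported by GeneDB in this section.
non_protein = {
    "pseudogene": 359,
    "rRNA": 9,
    "tRNA": 51,
    "snoRNA": 0,
    "snRNA": 0,
    "miscRNA": 0,
}
protein_coding = 3447

non_protein_total = sum(non_protein.values())
print(non_protein_total)                   # non-protein features
print(protein_coding + non_protein_total)  # all features listed in this section
```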
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_ribosomepathway_cw20151218.jpg&amp;diff=8160</id>
		<title>File:Bpertussis ribosomepathway cw20151218.jpg</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_ribosomepathway_cw20151218.jpg&amp;diff=8160"/>
				<updated>2015-12-18T22:27:53Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:Bpertussis ribosomepathway cw20151218.jpg&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;jpg of ribosome mapp with Bhpvalue with correct dataset&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8124</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8124"/>
				<updated>2015-12-18T21:45:22Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Changed &amp;quot;File&amp;quot; to &amp;quot;Media&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[Media:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database: [[Media:ReadMe bpertussis-std cw20151210.docx]]&lt;br /&gt;
** [[Media:Bpertussis genedatabase schema cw20151210.jpg|Gene Database Schema diagram (also included in ReadMe)]]&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[Media:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[Media:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[Media:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[Media:Bpertussis expressiondataset cw20151213.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[Media:Bpertussis expressiondataset exceptions cw20151213.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO.txt|Increased]]&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO.txt|Decreased]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[Media:Bpertussis compiledrawdata cw20151213.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO - Filtered.xlsx|Increased]] &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO - Filtered.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[Media:Bpertussis ribosomepathway cw20151215.mapp]]&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[Media:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8121</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8121"/>
				<updated>2015-12-18T21:43:35Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: fixed file tags&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database: [[File:ReadMe bpertussis-std cw20151210.docx]]&lt;br /&gt;
** [[Media:Bpertussis genedatabase schema cw20151210.jpg|Gene Database Schema diagram (also included in ReadMe)]]&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[File:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[File:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[File:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[File:Bpertussis expressiondataset cw20151213.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[File:Bpertussis expressiondataset exceptions cw20151213.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO.txt|Increased]]&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO.txt|Decreased]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[File:Bpertussis compiledrawdata cw20151213.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO - Filtered.xlsx|Increased]] &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO - Filtered.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[File:Bpertussis ribosomepathway cw20151215.mapp]]&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8120</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8120"/>
				<updated>2015-12-18T21:43:12Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: attempt at fixing labels&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database: [[File:ReadMe bpertussis-std cw20151210.docx]]&lt;br /&gt;
** [[Media:Bpertussis genedatabase schema cw20151210.jpg|Gene Database Schema diagram (also included in ReadMe)]]&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[File:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[File:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[File:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[File:Bpertussis expressiondataset cw20151213.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[File:Bpertussis expressiondataset exceptions cw20151213.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151218-Criterion0-GO.txt|Increased]]&lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151218-Criterion1-GO.txt|Decreased]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[File:Bpertussis compiledrawdata cw20151213.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion0-GO - Filtered.xlsx|Increased]] &lt;br /&gt;
** [[Media:Bpertussis mappfinderresults cw20151218-Criterion1-GO - Filtered.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[File:Bpertussis ribosomepathway cw20151215.mapp]]&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8116</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8116"/>
				<updated>2015-12-18T21:40:34Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: added description to file link&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database: [[File:ReadMe bpertussis-std cw20151210.docx]]&lt;br /&gt;
** [[File:Bpertussis genedatabase schema cw20151210.jpg|Gene Database Schema diagram (also included in ReadMe)]]&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[File:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[File:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[File:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[File:Bpertussis expressiondataset cw20151213.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[File:Bpertussis expressiondataset exceptions cw20151213.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151218-Criterion1-GO.txt]]&lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151218-Criterion0-GO.txt]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[File:Bpertussis compiledrawdata cw20151213.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion0-GO.xlsx|Increased]] &lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion1-GO.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[File:Bpertussis ribosomepathway cw20151215.mapp]]&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8114</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8114"/>
				<updated>2015-12-18T21:39:33Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: added ReadMe and Schema&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database: [[File:ReadMe bpertussis-std cw20151210.docx]]&lt;br /&gt;
** Gene Database Schema diagram (also included in ReadMe): [[File:Bpertussis genedatabase schema cw20151210.jpg]]&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[File:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[File:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[File:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[File:Bpertussis expressiondataset cw20151213.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[File:Bpertussis expressiondataset exceptions cw20151213.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151218-Criterion1-GO.txt]]&lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151218-Criterion0-GO.txt]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[File:Bpertussis compiledrawdata cw20151213.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion0-GO.xlsx|Increased]] &lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion1-GO.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[File:Bpertussis ribosomepathway cw20151215.mapp]]&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_genedatabase_schema_cw20151210.jpg&amp;diff=8113</id>
		<title>File:Bpertussis genedatabase schema cw20151210.jpg</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Bpertussis_genedatabase_schema_cw20151210.jpg&amp;diff=8113"/>
				<updated>2015-12-18T21:39:17Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: uploaded schema for final bordetella pertussis gdb file (20151210)&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;uploaded schema for final bordetella pertussis gdb file (20151210)&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:ReadMe_bpertussis-std_cw20151210.docx&amp;diff=8111</id>
		<title>File:ReadMe bpertussis-std cw20151210.docx</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:ReadMe_bpertussis-std_cw20151210.docx&amp;diff=8111"/>
				<updated>2015-12-18T21:37:54Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: uploaded readme file for the final bordetella pertussis gene database (20151210)&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;uploaded readme file for the final bordetella pertussis gene database (20151210)&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=User:Bklein7&amp;diff=8101</id>
		<title>User:Bklein7</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=User:Bklein7&amp;diff=8101"/>
				<updated>2015-12-18T20:42:31Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: made edit to invoke template&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Contact Information==&lt;br /&gt;
&lt;br /&gt;
   Brandon J. Klein&lt;br /&gt;
   Loyola Marymount University&lt;br /&gt;
   1 LMU Drive, MSB #3393&lt;br /&gt;
   Los Angeles, CA 90045&lt;br /&gt;
   E-mail: bklein7@lion.lmu.edu&lt;br /&gt;
&lt;br /&gt;
==Education==&lt;br /&gt;
===Loyola Marymount University, Los Angeles===&lt;br /&gt;
[[Image:LMU_Seal.png|right]]&lt;br /&gt;
&amp;lt;!-- find out how to scale down photo to fit within section --&amp;gt;&lt;br /&gt;
&amp;lt;!-- tack on chemistry minor in the future if appropriate along with further upper division courses --&amp;gt; &lt;br /&gt;
*Major: Biology, Minor: Applied Mathematics&lt;br /&gt;
*Expected Graduation Date: May 6, 2018&lt;br /&gt;
*Upper Division Coursework:&lt;br /&gt;
**MATH 360- Intro to Probability and Statistics&lt;br /&gt;
**[[Main Page|BIOL/CMSI 367- Biological Databases]]&lt;br /&gt;
&lt;br /&gt;
==Career Interests and Goals==&lt;br /&gt;
&lt;br /&gt;
===Career Goals===&lt;br /&gt;
#To gain admittance into medical school.&lt;br /&gt;
#To complete my residency as a specialist, ideally in ophthalmology or neurology.&lt;br /&gt;
#To apply my skills as a physician to improve the quality of life of those around me and advance medical research.&lt;br /&gt;
&lt;br /&gt;
===Research===&lt;br /&gt;
====Current Research Projects====&lt;br /&gt;
&amp;lt;!-- Update as the project progresses and is presented --&amp;gt;&lt;br /&gt;
*Character of Retinal Thickness Measurements and their Relationship to Visual Acuity in Progressing Cases of Dry Macular Degeneration&lt;br /&gt;
**Faculty Mentor: Dr. Lily Khadjavi, Loyola Marymount University&lt;br /&gt;
**We are currently preparing findings for presentation at future conferences and research symposia. A preliminary presentation on this research can be found [[media:Klein SURPResearch June24 PDF.pdf|here]].&lt;br /&gt;
&lt;br /&gt;
====Research Interests====&lt;br /&gt;
# Age-related macular degeneration: quantifying progression through statistical models and exploring treatment options such as stem cell therapy.&lt;br /&gt;
# Genomics research, particularly with respect to understanding the mechanisms and outcomes of gene expression.&lt;br /&gt;
# Abiogenesis and the stepwise creation of artificial life from synthesized organic molecules.&lt;br /&gt;
&lt;br /&gt;
==Work Experience==&lt;br /&gt;
*Ophthalmic Medical Assistant&lt;br /&gt;
**[http://www.sweye.net Southwestern Eye Associates], Las Vegas, Nevada&lt;br /&gt;
**Summer 2015 - Present&lt;br /&gt;
**Responsibilities:&lt;br /&gt;
***Preliminary patient screening: history taking, refraction, and applanation tonometry&lt;br /&gt;
***Performing OCT exams on patients using the CIRRUS Photo 600 by Zeiss©&lt;br /&gt;
***Assisting in minor medical procedures&lt;br /&gt;
&lt;br /&gt;
*Retinal Photographer&lt;br /&gt;
**[http://www.sweye.net Southwestern Eye Associates], Las Vegas, Nevada&lt;br /&gt;
**Fall 2013 - Spring 2015&lt;br /&gt;
**Responsibilities:&lt;br /&gt;
***Performing OCT exams on patients using the CIRRUS Photo 600 by Zeiss©&lt;br /&gt;
***Greeting patients and scheduling exams&lt;br /&gt;
***Cleaning office equipment&lt;br /&gt;
&lt;br /&gt;
==Personal Interests and Hobbies==&lt;br /&gt;
&lt;br /&gt;
===Hobbies===&lt;br /&gt;
[[Image:Vinyl Transparent.png|right]]&lt;br /&gt;
*Music&lt;br /&gt;
**I have been a percussionist for nearly a decade and occasionally trifle with recording performances.&lt;br /&gt;
**My passion for music often extends to listening to a broad swath of genres, from country to indie rock.&lt;br /&gt;
**My instruments:&lt;br /&gt;
**#Drum Set&lt;br /&gt;
**#Piano&lt;br /&gt;
**#Guitar&lt;br /&gt;
*Travel&lt;br /&gt;
**I enjoy travelling and immersing myself in the various cultures of the world.&lt;br /&gt;
**Unique locations I have visited include Colombia and China.&lt;br /&gt;
&lt;br /&gt;
===Academic Passions===&lt;br /&gt;
*Biology&lt;br /&gt;
**My favorite thing about biology is its capacity to explain the mechanisms of life, knowledge which can be used to doctor and improve our quality of life.&lt;br /&gt;
*Philosophy&lt;br /&gt;
**My favorite thing about philosophy is its exploration of the subjective aspects of life in endeavoring to determine what is real. Metaphysics and existentialism are fascinating.&lt;br /&gt;
*Computer Science&lt;br /&gt;
**My favorite thing about computer science is the manner in which we program mechanisms that resemble cognition. Such explorations help supplement our understanding of information processes and the human brain.&lt;br /&gt;
&lt;br /&gt;
==Links==&lt;br /&gt;
{{Template:Bklein7}}&lt;br /&gt;
&lt;br /&gt;
[[Category:Journal Entry]]&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8100</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8100"/>
				<updated>2015-12-18T20:37:38Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: added file and bolded yet to be uploaded files&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;ReadMe file to accompany the Gene Database&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Include Gene Database Schema diagram in ReadMe&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database: [[File:Gdb testingreport cw20151210.pdf]]&lt;br /&gt;
* Processed and analyzed DNA microarray dataset: [[File:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP: [[File:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file: [[File:Bpertussis expressiondataset cw20151213.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP: [[File:Bpertussis expressiondataset exceptions cw20151213.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files: &lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151213-criterion0-GO.txt|Increased]]&lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151213-criterion1-GO.txt|Decreased]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[File:Bpertussis compiledrawdata cw20151213.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results:&lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion0-GO.xlsx|Increased]] &lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion1-GO.xlsx|Decreased]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species: [[File:Bpertussis ribosomepathway cw20151215.mapp]]&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Gdb_testingreport_cw20151210.pdf&amp;diff=8099</id>
		<title>File:Gdb testingreport cw20151210.pdf</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Gdb_testingreport_cw20151210.pdf&amp;diff=8099"/>
				<updated>2015-12-18T20:36:41Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Final Gene Database Testing Report, .pdf version&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Final Gene Database Testing Report, .pdf version&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Gene_Database_Testing_Report-_cw20151210.pdf&amp;diff=8098</id>
		<title>File:Gene Database Testing Report- cw20151210.pdf</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:Gene_Database_Testing_Report-_cw20151210.pdf&amp;diff=8098"/>
				<updated>2015-12-18T20:35:54Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Final Gene Database Testing Report, .pdf version&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Final Gene Database Testing Report, .pdf version&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8097</id>
		<title>Gene Database Testing Report- cw20151210</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8097"/>
				<updated>2015-12-18T20:32:14Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: fixed formatting issue&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
&lt;br /&gt;
For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[File:Testingreport cw20151210.zip]]&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip] tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*We downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW\bpertussis_cw20151210&amp;#039;&amp;#039; on our computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* We went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, we navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** We clicked on the &amp;quot;Download&amp;quot; button at the top of that page and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, we navigated to the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa GOA file for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**The browser opened this text file automatically, so we had to download it manually.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* We downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
* We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# We downloaded the custom version of GenMAPP Builder including the most recent version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class (Version 3.0.0 Build 5 - cw20151210): [[File:Dist cw20151210.zip]].&lt;br /&gt;
# We extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* We launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, we created a new database: &amp;#039;&amp;#039;bpertussis_cw20151210_gmb3build5&amp;#039;&amp;#039;.&lt;br /&gt;
** We opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** We clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** We clicked on the Execute Query icon to run these commands.&lt;br /&gt;
***In viewing the schema for this database, we confirmed that there were 167 tables after running the above commands.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, we launched gmbuilder.bat.&lt;br /&gt;
* We selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*We imported the downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; into the database using the following menu commands, specifying the corresponding file for each:&lt;br /&gt;
** File &amp;gt; Import UniProt XML...&lt;br /&gt;
** File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** We clicked OK on the message asking to process the GO data.&lt;br /&gt;
** File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* We selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* We typed in our coder&amp;#039;s name in the owner field (Brandon Klein).&lt;br /&gt;
* We selected the custom profile &amp;quot;Bordetella pertussis, Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;.&lt;br /&gt;
* We checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, we clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5 - cw20151210&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_12&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.88 minutes&lt;br /&gt;
** Note: The import time was similar to that when creating the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (2.59 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- December 10, 2015 (TIME?)&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 6.97 minutes &lt;br /&gt;
* Time taken to process: 4.52 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (7.08 minutes and 4.42 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151210.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 08-Dec-2015 02:45&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa GOA file for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.03 minutes&lt;br /&gt;
** Note: The import time was very similar to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (0.04 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 1:19 AM&lt;br /&gt;
** End time: 2:11 AM&lt;br /&gt;
** Elapsed time: 52 minutes&lt;br /&gt;
Note: No interruptions occurred during the export process.&lt;br /&gt;
&lt;br /&gt;
===TallyEngine===&lt;br /&gt;
* We ran the TallyEngine in GenMAPP Builder and specified the following files:&lt;br /&gt;
**XML- [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
**GO- [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
*Results:&lt;br /&gt;
**[[File:TallyEngineResults cw20151210.png]]&lt;br /&gt;
***All TallyEngine results were consistent across both files.&lt;br /&gt;
***The TallyEngine was not customized to reflect the coding changes made to GenMAPP Builder Version 3.0.0 Build 5 - cw20151210.&lt;br /&gt;
****Therefore, the total count for &amp;quot;Ordered Locus Names&amp;quot; and &amp;quot;ORF&amp;quot; gene IDs remained 3446. The extra ID that was imported in this build, &amp;quot;BP3167A&amp;quot;, was not listed in either of these categories.&lt;br /&gt;
****&amp;#039;&amp;#039;&amp;#039;Further TallyEngine customization is necessary to raise the count to 3447 gene IDs.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
===Using XMLPipeDB Match to Validate the XML Results from the TallyEngine===&lt;br /&gt;
The following functions were performed using the Windows command line (cmd).&lt;br /&gt;
*We entered the project folder using the following command:&lt;br /&gt;
 cd /d T:\Bklein7_CW\bpertussis_cw20151210&lt;br /&gt;
*We used XMLPipeDB match to identify gene IDs in the UniProt XML file that conformed to the following patterns: &amp;quot;BP####&amp;quot;, &amp;quot;BP####.1&amp;quot;, &amp;quot;BP####A&amp;quot;, and &amp;quot;BP####B&amp;quot;. The command used was as follows:&lt;br /&gt;
 java -jar xmlpipedb-match-1.1.1.jar &amp;quot;BP[0-9][0-9][0-9][0-9](A|B|\.1|)&amp;quot; &amp;lt; &amp;quot;uniprot-proteome%3AUP000002676_cw20151201.xml&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Match Results:&lt;br /&gt;
*[[File:Xmlpipedbmatch cw20151203.png]]&lt;br /&gt;
**The number of unique matches generated by XMLPipeDB Match, 3447, matched our expectation. The count includes the total number of ordered locus (3435) and ORF (11) gene IDs along with the unique EnsemblBacteria reference ID &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
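&lt;br /&gt;
The unique-match count can also be cross-checked outside of XMLPipeDB Match. The following Python sketch is illustrative only and does not reproduce how the utility is implemented. It applies the same regular expression to a short made-up string and collects the distinct matches in a set. The names GENE_ID_PATTERN, unique_gene_ids, and sample are hypothetical, and the sample text merely stands in for the contents of the UniProt XML file.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# Same pattern as in the XMLPipeDB Match command above; the empty final
# alternative lets plain four-digit IDs match as well.
GENE_ID_PATTERN = re.compile(r"BP[0-9][0-9][0-9][0-9](A|B|\.1|)")


def unique_gene_ids(text):
    """Return the set of distinct gene IDs found in the given text."""
    return {match.group(0) for match in GENE_ID_PATTERN.finditer(text)}


# Hypothetical snippet standing in for the UniProt XML file contents.
sample = "BP0001 BP0001 BP3167A BP3167.1 BP1234B XBP12"
ids = unique_gene_ids(sample)
print(len(ids), sorted(ids))
```
&lt;br /&gt;
Because the matches are stored in a set, an ID that appears many times in the file is counted only once, which is the sense in which the count above is a count of unique matches.&lt;br /&gt;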
&lt;br /&gt;
===Using SQL Queries to Validate the PostgreSQL Database Results from the TallyEngine===&lt;br /&gt;
We used the SQL &amp;quot;union&amp;quot; operation to count the number of &amp;quot;ordered locus&amp;quot; gene IDs, which conform to the pattern &amp;quot;BP####&amp;quot;, in addition to all gene IDs that matched the patterns &amp;quot;BP####A&amp;quot; &amp;amp; &amp;quot;BP####B&amp;quot; (including 11 &amp;quot;ORF&amp;quot; gene IDs and 1 EnsemblBacteria reference ID):&lt;br /&gt;
&lt;br /&gt;
 select count(value) from (select value from genenametype where type = &lt;br /&gt;
 &amp;#039;ordered locus&amp;#039; union select value from propertytype inner join dbreferencetype&lt;br /&gt;
  on (propertytype.dbreferencetype_property_hjid = dbreferencetype.hjid)&lt;br /&gt;
   where dbreferencetype.type = &amp;#039;EnsemblBacteria&amp;#039; and propertytype.type = &lt;br /&gt;
   &amp;#039;gene ID&amp;#039; and propertytype.value ~ &amp;#039;BP[0-9][0-9][0-9][0-9](A|B)&amp;#039;) as combined;&lt;br /&gt;
&lt;br /&gt;
Note: This query was crafted by [[User:Dondi|Dr. Dionisio]].&lt;br /&gt;
&lt;br /&gt;
Results:&lt;br /&gt;
*[[File:PostgreSQL Count cw20151210.png]]&lt;br /&gt;
* The number of unique matches yielded by this SQL query, 3447, matched the count generated by XMLPipeDB Match. Thus, the locations of all 3447 gene IDs in the PostgreSQL relational database were accounted for here.&lt;br /&gt;
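&lt;br /&gt;
The deduplicating effect of the union in the query above can be tried on a toy dataset. The sketch below is an assumption-laden illustration rather than the query that was run: it uses an in-memory SQLite database from the Python standard library in place of PostgreSQL, replaces the PostgreSQL regular expression operator with an SQLite GLOB pattern, and fills three simplified tables with made-up rows. Only the table and column names are taken from the query above.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table genenametype (value text, type text);
    create table dbreferencetype (hjid integer, type text);
    create table propertytype (
        value text, type text, dbreferencetype_property_hjid integer
    );
    -- Made-up rows: three ordered locus names.
    insert into genenametype values ('BP0001', 'ordered locus');
    insert into genenametype values ('BP0002', 'ordered locus');
    insert into genenametype values ('BP0004A', 'ordered locus');
    -- Made-up EnsemblBacteria references and their gene ID properties.
    insert into dbreferencetype values (1, 'EnsemblBacteria');
    insert into dbreferencetype values (2, 'EnsemblBacteria');
    insert into dbreferencetype values (3, 'EnsemblBacteria');
    insert into dbreferencetype values (4, 'EnsemblBacteria');
    insert into propertytype values ('BP0001', 'gene ID', 1);
    insert into propertytype values ('BP0002A', 'gene ID', 2);
    insert into propertytype values ('BP3167A', 'gene ID', 3);
    insert into propertytype values ('BP0004A', 'gene ID', 4);
    """
)

# GLOB with character classes approximates the regular expression used
# in the PostgreSQL query (an assumption made for this sketch).
query = """
    select count(value) from (
        select value from genenametype where type = 'ordered locus'
        union
        select propertytype.value from propertytype
        inner join dbreferencetype
            on propertytype.dbreferencetype_property_hjid = dbreferencetype.hjid
        where dbreferencetype.type = 'EnsemblBacteria'
          and propertytype.type = 'gene ID'
          and propertytype.value glob '*BP[0-9][0-9][0-9][0-9][AB]*'
    ) as combined
"""
combined_count = conn.execute(query).fetchone()[0]
print(combined_count)
```
&lt;br /&gt;
In this toy data, BP0004A appears in both halves of the union but contributes a single row to the count, which is why the union avoids double-counting IDs that are stored in more than one table.&lt;br /&gt;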
&lt;br /&gt;
===OriginalRowCounts Comparison===&lt;br /&gt;
&lt;br /&gt;
We opened the gene database file [[File:Bpertussis-std_cw20151210.zip]] in Microsoft Access and inspected the &amp;quot;OriginalRowCounts&amp;quot; table to see if the expected tables were listed with the expected number of records. The contents of this table were compared to the &amp;#039;&amp;#039;OriginalRowCounts&amp;#039;&amp;#039; table of an existing .gdb file created during Week 9.&lt;br /&gt;
 &lt;br /&gt;
Benchmark .gdb file: [[File:Vc-Std 20151027 TR.gdb]]&lt;br /&gt;
&lt;br /&gt;
&amp;quot;OriginalRowCounts&amp;quot; table from the benchmark and new gdb:&lt;br /&gt;
*[[File:ComparisonToBenchmark cw20151210.PNG]]&lt;br /&gt;
**All 52 tables present in the 2015 &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; database were also present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; gene database, &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;. This confirmed that all expected tables were successfully created.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; table count is listed as 3447. &amp;#039;&amp;#039;&amp;#039;This count demonstrates that the missing ID, &amp;quot;BP3167A&amp;quot;, was successfully added to the export (confirmed below).&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
***[[File:BP3167A Confirmed cw20151210.PNG]]&lt;br /&gt;
&lt;br /&gt;
Note: The &amp;quot;OriginalRowCounts&amp;quot; tables were too large to screenshot. To circumvent this problem and facilitate the comparison, we copied the &amp;quot;OriginalRowCounts&amp;quot; tables from both gene databases into an Excel file and zoomed out. The above screenshot was taken from this Excel file. The &amp;quot;OrderedLocusNames&amp;quot; row count for &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039; is highlighted in yellow.&lt;br /&gt;
&lt;br /&gt;
===Visual Inspection===&lt;br /&gt;
We visually inspected individual tables within [[File:Bpertussis-std_cw20151210.zip]] using Microsoft Access.&lt;br /&gt;
&lt;br /&gt;
*Systems Table&lt;br /&gt;
**35 gene ID systems were listed, 11 of which were used in the creation of this .gdb file and listed the appropriate import date (12/10/2015).&lt;br /&gt;
***All gene ID systems relevant to &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; were listed. This includes: EMBL, EnsemblBacteria, GeneID, GeneOntology, InterPro, OrderedLocusNames, Pfam, RefSeq, and UniProt.&lt;br /&gt;
***This result corresponded with that of the benchmark .gdb file listed in the &amp;quot;OriginalRowCounts Comparison&amp;quot; section.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; listing properly displayed customizations to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile.&lt;br /&gt;
***In this row, the species was listed correctly as &amp;quot;Bordetella pertussis&amp;quot;.&lt;br /&gt;
***In this row, the link corresponded to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; database at GeneDB. The link was as follows: http://www.genedb.org/gene/~;jsessionid=A06A0EFE93C64E476380393D4CBEFA69?actionName=%2FQuery%2FquickSearch&amp;amp;resultsSize=1&amp;amp;taxonNodeName=Bpertussis.&lt;br /&gt;
*UniProt Table&lt;br /&gt;
**This table contained 3258 entries with 6 character IDs.&lt;br /&gt;
**All IDs in the UniProt table conformed to the following pattern:&lt;br /&gt;
*** [[File:UniProt Ascension Number info.PNG]]&lt;br /&gt;
*RefSeq Table&lt;br /&gt;
**This table contained 6627 entries. All IDs began with one of three prefixes: &amp;quot;NP_&amp;quot;, &amp;quot;YP_&amp;quot;, or &amp;quot;WP_&amp;quot;. The meanings of these prefixes can be found in the RefSeq documentation found [http://www.ncbi.nlm.nih.gov/books/NBK50679/ here].&lt;br /&gt;
***&amp;quot;NP_&amp;quot; and &amp;quot;YP_&amp;quot; Prefixes&lt;br /&gt;
****These prefixes refer to proteins. There were 3410 &amp;quot;NP_&amp;quot; IDs and 7 &amp;quot;YP_&amp;quot; IDs.&lt;br /&gt;
***&amp;quot;WP_&amp;quot; Prefixes&lt;br /&gt;
****This prefix refers to &amp;quot;autonomous non-redundant proteins that are not yet directly annotated on a genome&amp;quot;. There were 3210 IDs with the &amp;quot;WP_&amp;quot; prefix.&lt;br /&gt;
***Overall, every entry in the ID column was an expected value.&lt;br /&gt;
*OrderedLocusNames Table&lt;br /&gt;
**This table contained 3447 entries (consistent with the XMLPipeDB Match result).&lt;br /&gt;
**The IDs were copied into an Excel document for analysis:&lt;br /&gt;
***3434 IDs conformed to the pattern &amp;quot;BP####&amp;quot;.&lt;br /&gt;
***11 IDs conformed to the pattern &amp;quot;BP####A&amp;quot;.&lt;br /&gt;
****This included 10 ORF gene IDs &amp;amp; &amp;quot;BP3167A&amp;quot; (reference to an EnsemblBacteria ID).&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####B&amp;quot;.&lt;br /&gt;
****This corresponded to an ORF gene ID.&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####.1&amp;quot;.&lt;br /&gt;
****This ID was the manner in which UniProt classified &amp;quot;BP3167A&amp;quot;. &lt;br /&gt;
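The pattern tally for the OrderedLocusNames table was done in Excel. As a minimal sketch of the same idea, assuming the IDs are available as a plain list of strings, the classification could look like the following. The function names and the sample IDs are hypothetical and chosen only for illustration.&lt;br /&gt;

```python
import re
from collections import Counter

# Patterns described in the report. The labels are hypothetical names
# chosen for this sketch; they are not produced by any tool.
PATTERNS = [
    ("BP####.1", re.compile(r"BP\d{4}\.1")),
    ("BP####A", re.compile(r"BP\d{4}A")),
    ("BP####B", re.compile(r"BP\d{4}B")),
    ("BP####", re.compile(r"BP\d{4}")),
]


def classify(gene_id):
    """Return the label of the pattern that matches the whole ID."""
    for label, regex in PATTERNS:
        if regex.fullmatch(gene_id):
            return label
    return "unexpected"


def tally(gene_ids):
    """Count how many IDs fall under each pattern label."""
    return Counter(classify(gene_id) for gene_id in gene_ids)


# Illustrative sample, not the full table. Only BP1123 and BP3167A
# appear in the report; the other values are made up.
sample = ["BP1123", "BP3167A", "BP0000B", "BP0000.1", "XY0001"]
print(tally(sample))

# Arithmetic check of the counts reported above.
assert 3434 + 11 + 1 + 1 == 3447
```

Any ID counted under the unexpected label would need to be inspected by hand.&lt;br /&gt;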
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==bpertussis-std_cw20151210.gdb Use in GenMAPP==&lt;br /&gt;
&lt;br /&gt;
The following analysis was conducted in GenMAPP Version 2.1. Within GenMAPP, the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database was loaded by selecting Data &amp;gt; Choose Gene Database and then selecting the file &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
===Putting a Gene on the MAPP Using the GeneFinder Window===&lt;br /&gt;
&lt;br /&gt;
We made a sample MAPP in which gene IDs conforming to the naming conventions of the 5 major gene databases containing &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome data were added. A screenshot of the resulting MAPP is provided below:&lt;br /&gt;
*[[File:Samplegenemapp.png]]&lt;br /&gt;
*Gene IDs:&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;bp1123&amp;#039;&amp;#039;&amp;#039; refers to the OrderedLocusNames gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;CAE43716&amp;#039;&amp;#039;&amp;#039; refers to the EnsemblBacteria gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Q7VWE5&amp;#039;&amp;#039;&amp;#039; refers to the UniProt gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;2665491&amp;#039;&amp;#039;&amp;#039; refers to the GeneID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;NP_881255&amp;#039;&amp;#039;&amp;#039; refers to the RefSeq gene ID system.&lt;br /&gt;
&lt;br /&gt;
Note: Gene IDs tested from the above gene ID systems all had complete Backpages and were successfully placed on the MAPP.&lt;br /&gt;
&lt;br /&gt;
===Creating an Expression Dataset in the Expression Dataset Manager===&lt;br /&gt;
The file [[File:Bpertussis compiledrawdata cw20151208.txt]] was used to create an expression dataset in GenMAPP.&lt;br /&gt;
&lt;br /&gt;
*Total Number of Gene IDs Imported&lt;br /&gt;
** 3211 of the 3552 gene IDs from the microarray dataset were imported into the expression dataset.&lt;br /&gt;
**There were 341 exceptions during the creation of the expression dataset. A screenshot of the error message is shown here: &lt;br /&gt;
***[[File:Errors in genmapp.png]]&lt;br /&gt;
*Investigating Errors in the Exceptions File (EX.txt)&lt;br /&gt;
**All 341 exceptions triggered the following error message: &amp;quot;Gene not found in OrderedLocusNames or any related system.&amp;quot;&lt;br /&gt;
**Gene IDs that triggered this error message conformed to the patterns &amp;quot;BP####&amp;quot; and &amp;quot;BP####A&amp;quot;, indicating that the errors were not caused by unusual gene ID patterns.&lt;br /&gt;
***Example gene IDs that triggered this error are the following: BP0101, BP1677, BP0910A, and BP2029A.&lt;br /&gt;
****Searching for any of these gene IDs in UniProt returns the message &amp;quot;Sorry, no results found for your search term.&amp;quot;:&lt;br /&gt;
*****[[File:ErroneousID Uniprot cw20151210.PNG]]&lt;br /&gt;
***The 341 gene IDs were copied into a new Excel file and compared to the gene IDs present in the file [[File:Bpertussis-std_cw20151210.zip]] (adapted from the &amp;quot;OrderedLocusNames&amp;quot; table in Microsoft Access).&lt;br /&gt;
****None of the 341 gene IDs were present in the .gdb file.&lt;br /&gt;
***The 341 gene IDs were each individually searched for in UniProt.&lt;br /&gt;
****None of the 341 gene IDs retrieved results in UniProt.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Conclusion: None of the gene IDs that triggered errors were present in the original UniProt XML file.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
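The comparison of the exception IDs against the gene database was done in Excel. A minimal sketch of the same check as a set difference is given here, assuming both ID lists have been exported as plain strings. The function name and the short lists are hypothetical; the real lists hold 3552 and 3447 IDs.&lt;br /&gt;

```python
def missing_ids(dataset_ids, database_ids):
    """Return dataset IDs absent from the gene database, in sorted order.

    IDs are stripped and upper-cased so that formatting differences do
    not cause false mismatches (an assumption of this sketch).
    """
    known = {gene_id.strip().upper() for gene_id in database_ids}
    wanted = {gene_id.strip().upper() for gene_id in dataset_ids}
    return sorted(wanted - known)


# Illustrative values only, drawn from IDs mentioned in this report.
dataset = ["BP0101", "BP1123", "BP0910A"]
database = ["BP1123", "BP3167A"]
print(missing_ids(dataset, database))  # ['BP0101', 'BP0910A']

# Count check from the report: imported plus exceptions equals the total.
assert 3211 + 341 == 3552
```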
&lt;br /&gt;
===Coloring a MAPP with Expression Data===&lt;br /&gt;
&lt;br /&gt;
====Creating a New Color Set====&lt;br /&gt;
We customized the new Expression Dataset by creating a new color set entitled &amp;quot;LogFoldChange&amp;quot;.&lt;br /&gt;
# We created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;increase&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Increased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as red using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;gt; 0.25 AND [Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;.&lt;br /&gt;
#Second, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;decrease&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Decreased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as green using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;lt; -0.25 AND [Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;&lt;br /&gt;
# After entering these criteria, we saved the entire Expression Dataset by selecting Save from the Expression Dataset menu. This updated our .gex file with the new Color Set.&lt;br /&gt;
&lt;br /&gt;
Screenshot of Color Set criteria:&lt;br /&gt;
*[[File:Expressioncolorset.png]]&lt;br /&gt;
&lt;br /&gt;
Note: No errors were encountered in the creation of the Color Set.&lt;br /&gt;
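The two criteria in the LogFoldChange color set can be read as a simple decision rule. The sketch below restates that rule outside GenMAPP, assuming each gene has one Avg_ABC_Samples value and one Pvalue. The function name, the fallback label, and the example values are hypothetical and are not taken from the dataset.&lt;br /&gt;

```python
def classify_gene(avg_abc_samples, p_value):
    """Apply the two color set criteria to one gene.

    Comparisons are written with the larger value on the left, which is
    equivalent to the criteria as entered in the Criteria Builder.
    """
    significant = 0.05 > p_value
    if significant and avg_abc_samples > 0.25:
        return "Increased"  # colored red in the color set
    if significant and -0.25 > avg_abc_samples:
        return "Decreased"  # colored green in the color set
    return "No criteria met"  # fallback label assumed for this sketch


# Hypothetical values for illustration only.
examples = {
    "gene_a": (0.80, 0.01),
    "gene_b": (-0.60, 0.03),
    "gene_c": (0.90, 0.20),
}
for name, (avg, p) in examples.items():
    print(name, classify_gene(avg, p))
```

A gene with a large fold change but a p value of 0.05 or more, such as gene_c here, meets neither criterion.&lt;br /&gt;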
&lt;br /&gt;
====Creating a Pathway-Based MAPP Using Colored Genes====&lt;br /&gt;
====Ribosome Kegg Pathway====&lt;br /&gt;
* We created a MAPP of the ribosome pathway using the genes provided by the http://www.genome.jp/kegg/ website.&lt;br /&gt;
** After accessing the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Ribosome&amp;quot; under section 2.2 Translation and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the ribosome pathway with the gene IDs that pertain to our organism. We then created a MAPP from these genes in GenMAPP.&lt;br /&gt;
** Each of the green-highlighted genes on the ribosome pathway was added to the MAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151213&amp;quot; was then applied to color-code the genes.&lt;br /&gt;
**Here is a screenshot of the final MAPP for the ribosome pathway:&lt;br /&gt;
* [[File:RibosomeGenMAPP.png]]&lt;br /&gt;
** Most of the ribosome genes on this MAPP were colored green, indicating decreased expression. The remaining genes were gray, meaning that their expression did not change significantly in this experiment. Expression of the mapped ribosome genes therefore either decreased or stayed the same during the microarray experiment. Ribosomes carry out translation; without them, gene products cannot be synthesized and genes cannot perform their functions. The microarray analysis revealed that the absence of a membrane-associated protein named KpsT in &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; resulted in global down-regulation of gene expression, including key virulence genes. We interpret the decreased expression of the ribosome pathway genes as linked to this down-regulation: because the affected genes lacked a protein they required, they were not translated, fewer ribosomes were needed, and expression of the genes mapped in the ribosome pathway decreased.&lt;br /&gt;
&lt;br /&gt;
====Nitrogen Cycle Kegg Pathway====&lt;br /&gt;
* We also created a MAPP using the nitrogen metabolism pathway genes provided by the http://www.genome.jp/kegg/ website.&lt;br /&gt;
** After accessing the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Nitrogen Metabolism&amp;quot; under section 1.2 Energy Metabolism and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the nitrogen metabolism pathway with the gene IDs that pertain to our organism. We then created a MAPP from these genes in GenMAPP.&lt;br /&gt;
** Each of the green-highlighted genes on the nitrogen metabolism pathway was added to the MAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151213&amp;quot; was then applied to color-code the genes.&lt;br /&gt;
** Here is a screenshot of the final MAPP for the nitrogen cycle pathway:&lt;br /&gt;
* [[File:NitrogencycleGenMAPP.png]]&lt;br /&gt;
** This MAPP displayed both red and green genes, with green indicating decreased expression and red indicating increased expression, as well as a couple of gray genes that met neither criterion. We created this nitrogen cycle MAPP because nitrogen metabolism is one of the metabolic processes that keep cells alive and reproducing. The genes displayed in red had increased expression during the microarray experiment, and the KEGG pathway for nitrogen metabolism shows that these genes specifically aid in the metabolism of glutamate. Glutamate helps provide the energy that cells need to operate correctly. Since expression of the glutamate-related genes that we mapped increased, glutamate may help supply the energy that allows the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strains to produce the polysaccharide capsule transport proteins studied in the microarray experiment.&lt;br /&gt;
&lt;br /&gt;
===Running MAPPFinder===&lt;br /&gt;
*MAPPFinder Procedure&lt;br /&gt;
** We launched the MAPPFinder program from within GenMAPP and ensured that the &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039; gene database was still loaded into GenMAPP.&lt;br /&gt;
** We clicked on the button &amp;quot;Calculate New Results&amp;quot; followed by &amp;quot;Find File&amp;quot;, at which point we specified the .gex file updated during the creation of the &amp;quot;LogFoldChange&amp;quot; color set.&lt;br /&gt;
** We chose to apply both the &amp;quot;Increased&amp;quot; and &amp;quot;Decreased&amp;quot; criteria present within the LogFoldChange color set to the data.&lt;br /&gt;
** We checked the boxes next to &amp;quot;Gene Ontology&amp;quot; and &amp;quot;p value&amp;quot;, specified the results file, and then clicked &amp;quot;Run MAPPFinder&amp;quot;.&lt;br /&gt;
***This analysis took several minutes to complete.&lt;br /&gt;
*MAPPFinder Analysis Results&lt;br /&gt;
**We selected &amp;quot;Show Ranked List&amp;quot; to see a list of the most significant Gene Ontology terms. A screenshot of this output is shown below:&lt;br /&gt;
**[[File:Mappfinderrankedlist.png]]&lt;br /&gt;
***The majority of the most significant gene ontology terms pertained to ribosome biosynthesis and translation.&lt;br /&gt;
&lt;br /&gt;
Note: The MAPPFinder analysis took approximately 8 minutes to complete. No errors were encountered in the process. MAPPFinder thus was confirmed to work with the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database.&lt;br /&gt;
&lt;br /&gt;
=== Compare Gene Database to Outside Resource===&lt;br /&gt;
&lt;br /&gt;
To assess the completeness of this version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database, we explored the original genome sequencing data from Parkhill et al. (2003) that was deposited at the [http://www.genedb.org/Homepage/Bpertussis GeneDB Model Organism Database (MOD)]. From the GeneDB Home Page, we accessed a &amp;#039;&amp;#039;Gene Type&amp;#039;&amp;#039; search function that was used to quantify the number of gene listings present under each provided gene category. The results of this investigation are presented below.&lt;br /&gt;
&lt;br /&gt;
====Protein-Coding Genes====&lt;br /&gt;
[[File:GDB protein-coding.png]]&lt;br /&gt;
*There are 3447 protein-coding genes present in the [http://www.genedb.org/Homepage/Bpertussis GeneDB] database. This result verified that the set of protein-coding genes exported into [[File:Bpertussis-std cw20151210.zip]] from UniProt is complete. No further changes to the gene database export procedures are necessary at this time.&lt;br /&gt;
&lt;br /&gt;
====Non-Protein Genome Features====&lt;br /&gt;
&lt;br /&gt;
#Pseudogenes&lt;br /&gt;
#*[[File:GDB_pseudogenes.png]]&lt;br /&gt;
#**GeneDB indicated that 359 pseudogenes are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. Pseudogenes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#rRNA&lt;br /&gt;
#*[[File:GDB_rRNA.png]]&lt;br /&gt;
#**GeneDB indicated that 9 genes that encode for rRNA are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. These genes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#tRNA&lt;br /&gt;
#*[[File:GDB_tRNA.png]]&lt;br /&gt;
#**GeneDB indicated that 51 genes that encode for tRNA are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. These genes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#snoRNA&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode for snoRNA.&lt;br /&gt;
#snRNA&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode for snRNA.&lt;br /&gt;
#&amp;quot;miscRNA&amp;quot;&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode for &amp;quot;miscRNA&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;A total of 419 non-protein coding genes were identified in the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome in addition to the 3447 protein-coding genes captured in our gene database.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
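The total above can be rechecked from the individual GeneDB counts reported in this section. The snippet below is only a worked sum; the dictionary keys are labels chosen for this sketch.&lt;br /&gt;

```python
# Counts reported by the GeneDB Gene Type search in this section.
non_protein_features = {
    "pseudogene": 359,
    "rRNA": 9,
    "tRNA": 51,
    "snoRNA": 0,
    "snRNA": 0,
    "miscRNA": 0,
}
protein_coding = 3447

non_protein_total = sum(non_protein_features.values())
print(non_protein_total)                   # 419
print(protein_coding + non_protein_total)  # 3866
```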
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_tRNA.png&amp;diff=8096</id>
		<title>File:GDB tRNA.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_tRNA.png&amp;diff=8096"/>
				<updated>2015-12-18T20:31:23Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:GDB tRNA.png&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of genes coding for tRNA in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_rRNA.png&amp;diff=8095</id>
		<title>File:GDB rRNA.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_rRNA.png&amp;diff=8095"/>
				<updated>2015-12-18T20:31:11Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:GDB rRNA.png&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of genes coding for rRNA in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_pseudogenes.png&amp;diff=8094</id>
		<title>File:GDB pseudogenes.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_pseudogenes.png&amp;diff=8094"/>
				<updated>2015-12-18T20:30:54Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:GDB pseudogenes.png&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of pseudogenes in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_protein-coding.png&amp;diff=8093</id>
		<title>File:GDB protein-coding.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_protein-coding.png&amp;diff=8093"/>
				<updated>2015-12-18T20:30:39Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:GDB protein-coding.png&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of protein-coding genes in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_tRNA.png&amp;diff=8090</id>
		<title>File:GDB tRNA.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_tRNA.png&amp;diff=8090"/>
				<updated>2015-12-18T20:24:31Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:GDB tRNA.png&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of genes coding for tRNA in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_rRNA.png&amp;diff=8089</id>
		<title>File:GDB rRNA.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_rRNA.png&amp;diff=8089"/>
				<updated>2015-12-18T20:24:20Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:GDB rRNA.png&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of genes coding for rRNA in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_pseudogenes.png&amp;diff=8088</id>
		<title>File:GDB pseudogenes.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_pseudogenes.png&amp;diff=8088"/>
				<updated>2015-12-18T20:24:04Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:GDB pseudogenes.png&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of pseudogenes in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_protein-coding.png&amp;diff=8087</id>
		<title>File:GDB protein-coding.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_protein-coding.png&amp;diff=8087"/>
				<updated>2015-12-18T20:23:52Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: Bklein7 uploaded a new version of File:GDB protein-coding.png&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of protein-coding genes in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8086</id>
		<title>Gene Database Testing Report- cw20151210</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8086"/>
				<updated>2015-12-18T20:20:53Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: completed comparison of .gdb file to outside resource&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
&lt;br /&gt;
For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[File:Testingreport cw20151210.zip]]&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip] tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*We downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW\bpertussis_cw20151210&amp;#039;&amp;#039; on our computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* We went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, we navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** We clicked on the &amp;quot;Download&amp;quot; button at the top of the page above and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, we navigated to the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa GOA file for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**This text file was automatically opened by the browser. Therefore, we had to manually download the file.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* We downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
* We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====Downloaded GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# We downloaded the custom version of GenMAPP Builder including the most recent version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class (Version 3.0.0 Build 5 - cw20151210): [[File:Dist cw20151210.zip]].&lt;br /&gt;
# We extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* We launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, we created a new database: &amp;#039;&amp;#039;bpertussis_cw20151210_gmb3build5&amp;#039;&amp;#039;.&lt;br /&gt;
** We opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** We clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** We clicked on the Execute Query icon to run this command.&lt;br /&gt;
***In viewing the schema for this database, we confirmed that there were 167 tables after running the above command.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, we launched gmbuilder.bat.&lt;br /&gt;
* We selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*The downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; were specified and imported into the database by clicking on the following buttons:&lt;br /&gt;
** Selected File &amp;gt; Import UniProt XML...&lt;br /&gt;
** Selected File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** Clicked OK to the message asking to process the GO data.&lt;br /&gt;
** Selected File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* We selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* We typed in our coder&amp;#039;s name in the owner field (Brandon Klein).&lt;br /&gt;
* We selected the custom profile &amp;quot;Bordetella pertussis, Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;.&lt;br /&gt;
* We checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, we clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5 - cw20151210&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_12&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.88 minutes&lt;br /&gt;
** Note: The import time was similar to that when creating the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (2.59 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- December 10, 2015 (TIME?)&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 6.97 minutes &lt;br /&gt;
* Time taken to process: 4.52 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (7.08 minutes and 4.42 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151210.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 08-Dec-2015 02:45&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.03 minutes&lt;br /&gt;
** Note: The import time was very similar to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (0.04 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 1:19 AM&lt;br /&gt;
** End time: 2:11 AM&lt;br /&gt;
** Elapsed time: 52 minutes&lt;br /&gt;
Note: No interruptions occurred during the export process.&lt;br /&gt;
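The elapsed export time follows from the recorded start and end times. As a small sketch, assuming both times fall on the same day and use the 12-hour format shown above, the calculation could be written as follows. The function name is hypothetical.&lt;br /&gt;

```python
from datetime import datetime


def elapsed_minutes(start, end):
    """Return whole minutes between two same-day 12-hour clock times."""
    clock_format = "%I:%M %p"
    delta = datetime.strptime(end, clock_format) - datetime.strptime(
        start, clock_format
    )
    return int(delta.total_seconds() // 60)


print(elapsed_minutes("1:19 AM", "2:11 AM"))  # 52
```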
&lt;br /&gt;
===TallyEngine===&lt;br /&gt;
* We ran the TallyEngine in GenMAPP Builder and specified the following files:&lt;br /&gt;
**XML- [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
**GO- [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
*Results:&lt;br /&gt;
**[[File:TallyEngineResults cw20151210.png]]&lt;br /&gt;
***All TallyEngine results were consistent across both files.&lt;br /&gt;
***The TallyEngine was not customized to reflect the coding changes made to GenMAPP Builder Version 3.0.0 Build 5 - cw20151210.&lt;br /&gt;
****Therefore, the total count for &amp;quot;Ordered Locus Names&amp;quot; and &amp;quot;ORF&amp;quot; gene IDs remained 3446. The extra ID that was imported in this build, &amp;quot;BP3167A&amp;quot;, was not listed in either of these categories.&lt;br /&gt;
****&amp;#039;&amp;#039;&amp;#039;Further TallyEngine customization is necessary to raise the count to 3447 gene IDs.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
===Using XMLPipeDB Match to Validate the XML Results from the TallyEngine===&lt;br /&gt;
The following functions were performed using the Windows command line (cmd).&lt;br /&gt;
*We entered the project folder using the following command:&lt;br /&gt;
 cd /d T:\Bklein7_CW\bpertussis_cw20151210&lt;br /&gt;
*We used XMLPipeDB Match to identify gene IDs in the UniProt XML file that conformed to the following patterns: &amp;quot;BP####&amp;quot;, &amp;quot;BP####.1&amp;quot;, &amp;quot;BP####A&amp;quot;, and &amp;quot;BP####B&amp;quot;. The command used was as follows:&lt;br /&gt;
 java -jar xmlpipedb-match-1.1.1.jar &amp;quot;BP[0-9][0-9][0-9][0-9](A|B|\.1|)&amp;quot; &amp;lt; &amp;quot;uniprot-proteome%3AUP000002676_cw20151201.xml&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Match Results:&lt;br /&gt;
*[[File:Xmlpipedbmatch cw20151203.png]]&lt;br /&gt;
**The number of unique matches generated by XMLPipeDB Match, 3447, matched with our expectation. The count includes the total number of ordered locus (3435) and ORF (11) gene IDs along with the unique EnsemblBacteria reference ID &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
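&lt;br /&gt;
Note: The unique-match count can also be cross-checked outside of XMLPipeDB Match. The following is a minimal Python sketch, not the tool&amp;#039;s own implementation. The pattern is copied from the command above, while the function name count_unique_ids and the sample text are hypothetical.&lt;br /&gt;

```python
import re

# Pattern copied from the XMLPipeDB Match command; the empty last
# alternative lets plain "BP####" IDs match as well.
BP_ID = re.compile(r"BP[0-9][0-9][0-9][0-9](A|B|\.1|)")


def count_unique_ids(text):
    """Return the number of distinct gene IDs in text that match BP_ID."""
    return len({match.group(0) for match in BP_ID.finditer(text)})


# Hypothetical sample text, not taken from the UniProt XML file.
sample = "BP0001 BP0001 BP3167A BP1234B BP3167.1 BP12"
print(count_unique_ids(sample))
```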
&lt;br /&gt;
===Using SQL Queries to Validate the PostgreSQL Database Results from the TallyEngine===&lt;br /&gt;
We used the SQL &amp;quot;union&amp;quot; operation to count the number of &amp;quot;ordered locus&amp;quot; gene IDs, which conform to the pattern &amp;quot;BP####&amp;quot;, in addition to all gene IDs that matched the patterns &amp;quot;BP####A&amp;quot; &amp;amp; &amp;quot;BP####B&amp;quot; (including 11 &amp;quot;ORF&amp;quot; gene IDs and 1 EnsemblBacteria reference ID):&lt;br /&gt;
&lt;br /&gt;
 select count(value) from (select value from genenametype where type = &lt;br /&gt;
 &amp;#039;ordered locus&amp;#039; union select value from propertytype inner join dbreferencetype&lt;br /&gt;
  on (propertytype.dbreferencetype_property_hjid = dbreferencetype.hjid)&lt;br /&gt;
   where dbreferencetype.type = &amp;#039;EnsemblBacteria&amp;#039; and propertytype.type = &lt;br /&gt;
   &amp;#039;gene ID&amp;#039; and propertytype.value ~ &amp;#039;BP[0-9][0-9][0-9][0-9](A|B)&amp;#039;) as combined;&lt;br /&gt;
&lt;br /&gt;
Note: This query was crafted by [[User:Dondi|Dr. Dionisio]].&lt;br /&gt;
&lt;br /&gt;
Results:&lt;br /&gt;
*[[File:PostgreSQL Count cw20151210.png]]&lt;br /&gt;
* The number of unique matches yielded by this SQL query, 3447, matched the count generated by XMLPipeDB Match. Thus, the locations of all 3447 gene IDs in the PostgreSQL relational database were accounted for here.&lt;br /&gt;
&lt;br /&gt;
===OriginalRowCounts Comparison===&lt;br /&gt;
&lt;br /&gt;
We opened the gene database file [[File:Bpertussis-std_cw20151210.zip]] in Microsoft Access and assessed the &amp;quot;OriginalRowCounts&amp;quot; table to see if the expected tables were listed with the expected number of records. The contents of this table were compared to the &amp;#039;&amp;#039;OriginalRowCounts&amp;#039;&amp;#039; table of an existing .gdb file created during Week 9.&lt;br /&gt;
 &lt;br /&gt;
Benchmark .gdb file: [[File:Vc-Std 20151027 TR.gdb]]&lt;br /&gt;
&lt;br /&gt;
&amp;quot;OriginalRowCounts&amp;quot; table from the benchmark and new gdb:&lt;br /&gt;
*[[File:ComparisonToBenchmark cw20151210.PNG]]&lt;br /&gt;
**All 52 tables present in the 2015 &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; database were also present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; gene database, &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;. This confirmed that all expected tables were successfully created.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; table count is listed as 3447. &amp;#039;&amp;#039;&amp;#039;This count demonstrates that the missing ID, &amp;quot;BP3167A&amp;quot;, was successfully added to the export (confirmed below).&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
***[[File:BP3167A Confirmed cw20151210.PNG]]&lt;br /&gt;
&lt;br /&gt;
Note: The &amp;quot;OriginalRowCounts&amp;quot; tables were too large to capture in a single screenshot. To circumvent this problem and facilitate the comparison, we copied the &amp;quot;OriginalRowCounts&amp;quot; tables from both gene databases into an Excel file and zoomed out. The above screenshot was taken from this Excel file. The &amp;quot;OrderedLocusNames&amp;quot; row count for &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039; is highlighted in yellow.&lt;br /&gt;
&lt;br /&gt;
===Visual Inspection===&lt;br /&gt;
We visually inspected individual tables within [[File:Bpertussis-std_cw20151210.zip]] using Microsoft Access.&lt;br /&gt;
&lt;br /&gt;
*Systems Table&lt;br /&gt;
**35 gene ID systems were listed, 11 of which were used in the creation of this .gdb file and listed the appropriate import date (12/10/2015).&lt;br /&gt;
***All gene ID systems relevant to &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; were listed. This includes: EMBL, EnsemblBacteria, GeneID, GeneOntology, InterPro, OrderedLocusNames, Pfam, RefSeq, and UniProt.&lt;br /&gt;
***This result corresponded with that of the benchmark .gdb file listed in the &amp;quot;OriginalRowCounts Comparison&amp;quot; section.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; listing properly displayed customizations to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile.&lt;br /&gt;
***In this row, the species was listed correctly as &amp;quot;Bordetella pertussis&amp;quot;.&lt;br /&gt;
***In this row, the link corresponded to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; database at GeneDB. The link was as follows: http://www.genedb.org/gene/~;jsessionid=A06A0EFE93C64E476380393D4CBEFA69?actionName=%2FQuery%2FquickSearch&amp;amp;resultsSize=1&amp;amp;taxonNodeName=Bpertussis.&lt;br /&gt;
*UniProt Table&lt;br /&gt;
**This table contained 3258 entries with 6-character IDs.&lt;br /&gt;
**All IDs in the UniProt table conformed to the following pattern:&lt;br /&gt;
*** [[File:UniProt Ascension Number info.PNG]]&lt;br /&gt;
*RefSeq Table&lt;br /&gt;
**This table contained 6627 entries. All IDs began with one of three prefixes: &amp;quot;NP_&amp;quot;, &amp;quot;YP_&amp;quot;, or &amp;quot;WP_&amp;quot;. The meanings of these prefixes can be found in the RefSeq documentation found [http://www.ncbi.nlm.nih.gov/books/NBK50679/ here].&lt;br /&gt;
***&amp;quot;NP_&amp;quot; and &amp;quot;YP_&amp;quot; Prefixes&lt;br /&gt;
****These prefixes refer to proteins. There were 3410 &amp;quot;NP_&amp;quot; IDs and 7 &amp;quot;YP_&amp;quot; IDs.&lt;br /&gt;
***&amp;quot;WP_&amp;quot; Prefixes&lt;br /&gt;
****This prefix refers to &amp;quot;autonomous non-redundant proteins that are not yet directly annotated on a genome&amp;quot;. There were 3210 IDs with the &amp;quot;WP_&amp;quot; prefix.&lt;br /&gt;
***Overall, every entry in the ID column was an expected value.&lt;br /&gt;
*OrderedLocusNames Table&lt;br /&gt;
**This table contained 3447 entries (consistent with the XMLPipeDB Match result).&lt;br /&gt;
**The IDs were copied into an Excel document for analysis:&lt;br /&gt;
***3434 IDs conformed to the pattern &amp;quot;BP####&amp;quot;.&lt;br /&gt;
***11 IDs conformed to the pattern &amp;quot;BP####A&amp;quot;.&lt;br /&gt;
****This included 10 ORF gene IDs &amp;amp; &amp;quot;BP3167A&amp;quot; (reference to an EnsemblBacteria ID).&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####B&amp;quot;.&lt;br /&gt;
****This corresponded to an ORF gene ID.&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####.1&amp;quot;.&lt;br /&gt;
****This ID was the form in which UniProt recorded &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
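&lt;br /&gt;
Note: The pattern tally above was done by hand in Excel. The following is a minimal Python sketch of the same idea, assuming the IDs are available as a list of strings. The function name tally_patterns, the category labels, and the sample IDs are hypothetical; the REPORTED counts are the ones listed above.&lt;br /&gt;

```python
import re
from collections import Counter

# ID categories described in the list above; the labels are illustrative.
PATTERNS = [
    ("BP####", re.compile(r"BP\d{4}")),
    ("BP####A", re.compile(r"BP\d{4}A")),
    ("BP####B", re.compile(r"BP\d{4}B")),
    ("BP####.1", re.compile(r"BP\d{4}\.1")),
]

# Counts reported in this section for the OrderedLocusNames table.
REPORTED = {"BP####": 3434, "BP####A": 11, "BP####B": 1, "BP####.1": 1}


def tally_patterns(ids):
    """Count IDs per pattern using full-string matches; the rest go to 'other'."""
    counts = Counter()
    for gene_id in ids:
        for label, pattern in PATTERNS:
            if pattern.fullmatch(gene_id):
                counts[label] += 1
                break
        else:
            counts["other"] += 1
    return counts


# Hypothetical IDs for illustration only.
print(tally_patterns(["BP0001", "BP3167A", "BP1234B", "BP3167.1", "bp1123"]))
# The reported category counts should add up to the table total of 3447.
print(sum(REPORTED.values()))
```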
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==bpertussis-std_cw20151210.gdb Use in GenMAPP==&lt;br /&gt;
&lt;br /&gt;
The following analysis was conducted in GenMAPP Version 2.1. Within GenMAPP, the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database was loaded by selecting Data &amp;gt; Choose Gene Database and then selecting the file &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
===Putting a Gene on the MAPP Using the GeneFinder Window===&lt;br /&gt;
&lt;br /&gt;
We made a sample MAPP in which gene IDs conforming to the naming conventions of the 5 major gene databases containing &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome data were added. A screenshot of the resulting MAPP is provided below:&lt;br /&gt;
[[File:Samplegenemapp.png]]&lt;br /&gt;
*Gene IDs:&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;bp1123&amp;#039;&amp;#039;&amp;#039; refers to the OrderedLocusNames gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;CAE43716&amp;#039;&amp;#039;&amp;#039; refers to the EnsemblBacteria gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Q7VWE5&amp;#039;&amp;#039;&amp;#039; refers to the UniProt gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;2665491&amp;#039;&amp;#039;&amp;#039; refers to the GeneID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;NP_881255&amp;#039;&amp;#039;&amp;#039; refers to the RefSeq gene ID system.&lt;br /&gt;
&lt;br /&gt;
Note: Gene IDs tested from the above gene ID systems all had complete Backpages and were successfully placed on the MAPP.&lt;br /&gt;
&lt;br /&gt;
===Creating an Expression Dataset in the Expression Dataset Manager===&lt;br /&gt;
The file [[File:Bpertussis compiledrawdata cw20151208.txt]] was used to create an expression dataset in GenMAPP.&lt;br /&gt;
&lt;br /&gt;
*Total Number of Gene IDs Imported&lt;br /&gt;
** 3211 of the 3552 gene IDs from the microarray dataset were imported into the expression dataset.&lt;br /&gt;
**There were 341 exceptions during the creation of the expression dataset. A screenshot of the error message is shown here: &lt;br /&gt;
***[[File:Errors in genmapp.png]]&lt;br /&gt;
*Investigating Errors in the Exceptions File (EX.txt)&lt;br /&gt;
**All 341 exceptions triggered the following error message: &amp;quot;Gene not found in OrderedLocusNames or any related system.&amp;quot;&lt;br /&gt;
**Gene IDs that triggered this error message conformed to the patterns &amp;quot;BP####&amp;quot; and &amp;quot;BP####A&amp;quot;, indicating that no unique gene ID patterns were the cause of these errors.&lt;br /&gt;
***Example gene IDs that triggered this error are the following: BP0101, BP1677, BP0910A, and BP2029A.&lt;br /&gt;
****Searching for any of these gene IDs in UniProt returns the message &amp;quot;Sorry, no results found for your search term.&amp;quot;:&lt;br /&gt;
*****[[File:ErroneousID Uniprot cw20151210.PNG]]&lt;br /&gt;
***The 341 gene IDs were copied into a new Excel file and compared to the gene IDs present in the file [[File:Bpertussis-std_cw20151210.zip]] (adapted from the &amp;quot;OrderedLocusNames&amp;quot; table in Microsoft Access).&lt;br /&gt;
****None of the 341 gene IDs were present in the .gdb file.&lt;br /&gt;
***The 341 gene IDs were each individually searched for in UniProt.&lt;br /&gt;
****None of the 341 gene IDs retrieved results in UniProt.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Conclusion: None of the gene IDs that triggered errors were present in the original UniProt XML file.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
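&lt;br /&gt;
Note: The comparison between the exception IDs and the gene database was done in Excel. The following is a minimal Python sketch of an equivalent check, assuming both ID lists have been exported as plain strings. The function name missing_ids, the case-insensitive comparison, and the sample gene database IDs are assumptions for illustration.&lt;br /&gt;

```python
def missing_ids(exception_ids, gdb_ids):
    """Return the exception IDs that are absent from the gene database.

    IDs are compared case-insensitively, which is an assumption of this sketch.
    """
    known = {gene_id.upper() for gene_id in gdb_ids}
    return sorted(gene_id for gene_id in exception_ids if gene_id.upper() not in known)


# Two of the exception IDs named above, checked against hypothetical database IDs.
print(missing_ids(["BP0101", "BP0910A"], ["BP0001", "BP0002"]))

# The exception count equals the gap between dataset IDs and imported IDs.
print(3552 - 3211)
```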
&lt;br /&gt;
===Coloring a MAPP with Expression Data===&lt;br /&gt;
&lt;br /&gt;
====Creating a New Color Set====&lt;br /&gt;
We customized the new Expression Dataset by creating a new color set entitled &amp;quot;LogFoldChange&amp;quot;.&lt;br /&gt;
# First, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;increase&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Increased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as red using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;gt; 0.25 AND [Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;.&lt;br /&gt;
#Second, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;decrease&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Decreased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as green using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;lt; -0.25 AND [Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;&lt;br /&gt;
# Upon entering these criteria, we saved the entire Expression Dataset by selecting Save from the Expression Dataset menu. This effectively updated our .gex file with the new Color Set.&lt;br /&gt;
&lt;br /&gt;
Screenshot of Color Set criteria:&lt;br /&gt;
*[[File:Expressioncolorset.png]]&lt;br /&gt;
&lt;br /&gt;
Note: No errors were encountered in the creation of the Color Set.&lt;br /&gt;
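&lt;br /&gt;
Note: The two criteria can be restated as a small Python function. This is a minimal sketch of the logic only, not GenMAPP&amp;#039;s implementation. The cutoffs 0.25 and 0.05 come from the criteria above, while the function name classify, the parameter names, and the fallback label are hypothetical.&lt;br /&gt;

```python
def classify(avg_abc_samples, p_value, fold_cutoff=0.25, p_cutoff=0.05):
    """Apply the Increased (red) and Decreased (green) criteria to one gene."""
    significant = p_cutoff > p_value  # same condition as [Pvalue] below 0.05
    if significant and avg_abc_samples > fold_cutoff:
        return "Increased"
    if significant and -avg_abc_samples > fold_cutoff:
        return "Decreased"
    return "No criteria met"  # hypothetical label for genes that stay gray


# Hypothetical values for illustration only.
for value, p in [(0.5, 0.01), (-0.5, 0.01), (0.5, 0.2), (0.1, 0.01)]:
    print(value, p, classify(value, p))
```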
&lt;br /&gt;
====Creating a Pathway-Based MAPP Using Colored Genes====&lt;br /&gt;
====Ribosome KEGG Pathway====&lt;br /&gt;
* We created a MAPP of the ribosome pathway using the genes provided on the http://www.genome.jp/kegg/ website.&lt;br /&gt;
** After accessing the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Ribosome&amp;quot; under section 2.2 Translation and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the ribosome pathway with the gene IDs that pertained to our specific organism. We were then able to create a MAPP using these genes in GenMAPP.&lt;br /&gt;
** Each of the green-highlighted genes on the ribosome pathway was added to the MAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151213&amp;quot; was then applied to color-code the genes.&lt;br /&gt;
**Here is a screenshot of the final MAPP created for the ribosome pathway:&lt;br /&gt;
* [[File:RibosomeGenMAPP.png]]&lt;br /&gt;
** Most of the ribosome genes on this MAPP were colored green, indicating decreased expression; the remaining genes were gray, indicating no significant change in this experiment. Thus, every ribosome pathway gene that changed significantly was down-regulated during the microarray experiment. Ribosomes carry out translation, so without them gene products cannot be made and cannot perform their proper functions. The microarray analysis revealed that the absence of a membrane-associated protein named KpsT in &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; resulted in global down-regulation of gene expression, including key virulence genes. We interpret the decreased expression of the ribosome pathway genes as linked to this global down-regulation: with fewer transcripts available for translation, less ribosomal machinery would be required, which is consistent with the decreased expression of the genes mapped in the ribosome pathway.&lt;br /&gt;
&lt;br /&gt;
====Nitrogen Cycle KEGG Pathway====&lt;br /&gt;
* We created a second MAPP using the nitrogen metabolism pathway genes provided on the http://www.genome.jp/kegg/ website.&lt;br /&gt;
** After accessing the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Nitrogen Metabolism&amp;quot; under section 1.2 Energy Metabolism and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the nitrogen metabolism pathway with the gene IDs that pertained to our specific organism. We were then able to create a MAPP using these genes in GenMAPP.&lt;br /&gt;
** Each of the green-highlighted genes on the nitrogen metabolism pathway was added to the MAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151213&amp;quot; was then applied to color-code the genes.&lt;br /&gt;
** Here is a screenshot of the final MAPP created for the nitrogen cycle pathway:&lt;br /&gt;
* [[File:NitrogencycleGenMAPP.png]]&lt;br /&gt;
** This MAPP displayed both red and green genes, with green indicating decreased expression and red indicating increased expression, as well as a couple of gray genes that did not meet either criterion. We created this nitrogen metabolism MAPP because of the important metabolic processes that keep cells alive and reproducing. The genes colored red had increased expression during the microarray experiment and, according to the KEGG nitrogen metabolism pathway, specifically aid in the metabolism of glutamate. Glutamate is important to cells because it plays a role in providing the energy that allows them to operate correctly. Because the glutamate-related genes that we mapped were increased, this suggests that glutamate may help supply the energy that allows the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strains to produce the polysaccharide capsule transport proteins studied in the microarray experiment.&lt;br /&gt;
&lt;br /&gt;
===Running MAPPFinder===&lt;br /&gt;
*MAPPFinder Procedure&lt;br /&gt;
** We launched the MAPPFinder program from within GenMAPP and ensured that the &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039; gene database was still loaded into GenMAPP.&lt;br /&gt;
** We clicked on the button &amp;quot;Calculate New Results&amp;quot; followed by &amp;quot;Find File&amp;quot;, at which point we specified the .gex file updated during the creation of the &amp;quot;LogFoldChange&amp;quot; color set.&lt;br /&gt;
** We chose to apply both the &amp;quot;Increased&amp;quot; and &amp;quot;Decreased&amp;quot; criteria present within the LogFoldChange color set to the data.&lt;br /&gt;
** We checked the boxes next to &amp;quot;Gene Ontology&amp;quot; and &amp;quot;p value&amp;quot;, specified the results file, and then clicked &amp;quot;Run MAPPFinder&amp;quot;.&lt;br /&gt;
***This analysis took several minutes to complete.&lt;br /&gt;
*MAPPFinder Analysis Results&lt;br /&gt;
**We selected &amp;quot;Show Ranked List&amp;quot; to see a list of the most significant Gene Ontology terms. A screenshot of this output is shown below:&lt;br /&gt;
**[[File:Mappfinderrankedlist.png]]&lt;br /&gt;
***The majority of the most significant gene ontology terms pertained to ribosome biosynthesis and translation.&lt;br /&gt;
&lt;br /&gt;
Note: The MAPPFinder analysis took approximately 8 minutes to complete. No errors were encountered in the process. MAPPFinder thus was confirmed to work with the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database.&lt;br /&gt;
&lt;br /&gt;
=== Compare Gene Database to Outside Resource===&lt;br /&gt;
&lt;br /&gt;
To assess the completeness of this version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database, we explored the original genome sequencing data from Parkhill et al. (2003) that was deposited at the [http://www.genedb.org/Homepage/Bpertussis GeneDB Model Organism Database (MOD)]. From the GeneDB Home Page, we accessed a &amp;#039;&amp;#039;Gene Type&amp;#039;&amp;#039; search function that was used to quantify the number of gene listings present under each provided gene category. The results of this investigation are presented below.&lt;br /&gt;
&lt;br /&gt;
====Protein-Coding Genes====&lt;br /&gt;
[[File:GDB protein-coding.png]]&lt;br /&gt;
*There are 3447 protein-coding genes present in the [http://www.genedb.org/Homepage/Bpertussis GeneDB] database. This result verified that the set of protein-coding genes exported into [[File:Bpertussis-std cw20151210.zip]] from UniProt is complete. No further changes to the gene database export procedures are necessary at this time.&lt;br /&gt;
&lt;br /&gt;
====Non-Protein Genome Features====&lt;br /&gt;
&lt;br /&gt;
#Pseudogenes&lt;br /&gt;
#*[[File:GDB_pseudogenes.png]]&lt;br /&gt;
#**GeneDB indicated that 359 pseudogenes are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. Pseudogenes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#rRNA&lt;br /&gt;
#*[[File:GDB_rRNA.png]]&lt;br /&gt;
#**GeneDB indicated that 9 genes that encode rRNA are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. These genes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#tRNA&lt;br /&gt;
#*[[File:GDB_tRNA.png]]&lt;br /&gt;
#**GeneDB indicated that 51 genes that encode tRNA are present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; genome. These genes do not code for proteins and were therefore not included in the original UniProt listing.&lt;br /&gt;
#snoRNA&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode snoRNA.&lt;br /&gt;
#snRNA&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode snRNA.&lt;br /&gt;
#&amp;quot;miscRNA&amp;quot;&lt;br /&gt;
#*GeneDB retrieved 0 genes that encode &amp;quot;miscRNA&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;A total of 419 non-protein coding genes were identified in the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome in addition to the 3447 protein-coding genes captured in our gene database.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_tRNA.png&amp;diff=8085</id>
		<title>File:GDB tRNA.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_tRNA.png&amp;diff=8085"/>
				<updated>2015-12-18T20:10:37Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: screenshot indicating the number of genes coding for tRNA in the GeneDB MOD for Bordetella pertussis&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of genes coding for tRNA in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_rRNA.png&amp;diff=8084</id>
		<title>File:GDB rRNA.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_rRNA.png&amp;diff=8084"/>
				<updated>2015-12-18T20:10:10Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: screenshot indicating the number of genes coding for rRNA in the GeneDB MOD for Bordetella pertussis&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of genes coding for rRNA in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_pseudogenes.png&amp;diff=8083</id>
		<title>File:GDB pseudogenes.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_pseudogenes.png&amp;diff=8083"/>
				<updated>2015-12-18T20:09:35Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: screenshot indicating the number of pseudogenes in the GeneDB MOD for Bordetella pertussis&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of pseudogenes in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_protein-coding.png&amp;diff=8082</id>
		<title>File:GDB protein-coding.png</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=File:GDB_protein-coding.png&amp;diff=8082"/>
				<updated>2015-12-18T20:09:09Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: screenshot indicating the number of protein-coding genes in the GeneDB MOD for Bordetella pertussis&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;screenshot indicating the number of protein-coding genes in the GeneDB MOD for Bordetella pertussis&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8081</id>
		<title>Gene Database Testing Report- cw20151210</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8081"/>
				<updated>2015-12-18T19:42:03Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: edited out excess punctuation&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
&lt;br /&gt;
For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[File:Testingreport cw20151210.zip]]&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip], a tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*We downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW\bpertussis_cw20151210&amp;#039;&amp;#039; on our computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* We went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, we navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** We clicked on the &amp;quot;Download&amp;quot; button at the top of the page above and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, we navigated to the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa GOA file for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**Because the browser opened this text file automatically, we had to save it manually.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* We downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
* We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====Downloaded GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# We downloaded the custom version of GenMAPP Builder including the most recent version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class (Version 3.0.0 Build 5 - cw20151210): [[File:Dist cw20151210.zip]].&lt;br /&gt;
# We extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* We launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, we created a new database: &amp;#039;&amp;#039;bpertussis_cw20151210_gmb3build5&amp;#039;&amp;#039;.&lt;br /&gt;
** We opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** We clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** We clicked on the Execute Query icon to run this command.&lt;br /&gt;
***In viewing the schema for this database, we confirmed that there were 167 tables after running the above command.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, we launched gmbuilder.bat.&lt;br /&gt;
* We selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*The downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; were specified and imported into the database by clicking on the following buttons:&lt;br /&gt;
** Selected File &amp;gt; Import UniProt XML...&lt;br /&gt;
** Selected File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** Clicked OK to the message asking to process the GO data.&lt;br /&gt;
** Selected File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* We selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* We typed in our coder&amp;#039;s name in the owner field (Brandon Klein).&lt;br /&gt;
* We selected the custom profile &amp;quot;Bordetella pertussis, Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;.&lt;br /&gt;
* We checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, we clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5 - cw20151210&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_12&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.88 minutes&lt;br /&gt;
** Note: The import time was similar to that when creating the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (2.59 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- December 10, 2015 (TIME?)&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 6.97 minutes &lt;br /&gt;
* Time taken to process: 4.52 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (7.08 minutes and 4.42 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151210.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 08-Dec-2015 02:45&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.03 minutes&lt;br /&gt;
** Note: The import time was very similar to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (0.04 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 1:19 AM&lt;br /&gt;
** End time: 2:11 AM&lt;br /&gt;
** Elapsed time: 52 minutes&lt;br /&gt;
Note: No interruptions occurred during the export process.&lt;br /&gt;
&lt;br /&gt;
===TallyEngine===&lt;br /&gt;
* We ran the TallyEngine in GenMAPP Builder and specified the following files:&lt;br /&gt;
**XML- [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
**GO- [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
*Results:&lt;br /&gt;
**[[File:TallyEngineResults cw20151210.png]]&lt;br /&gt;
***All TallyEngine results were consistent across both files.&lt;br /&gt;
***The TallyEngine was not customized to reflect the coding changes made to GenMAPP Builder Version 3.0.0 Build 5 - cw20151210.&lt;br /&gt;
****Therefore, the total count for &amp;quot;Ordered Locus Names&amp;quot; and &amp;quot;ORF&amp;quot; gene IDs remained 3446. The extra ID that was imported in this build, &amp;quot;BP3167A&amp;quot;, was not listed in either of these categories.&lt;br /&gt;
****&amp;#039;&amp;#039;&amp;#039;Further TallyEngine customization is necessary to raise the count to 3447 gene IDs.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
===Using XMLPipeDB Match to Validate the XML Results from the TallyEngine===&lt;br /&gt;
The following functions were performed using the Windows command line (cmd).&lt;br /&gt;
*We entered the project folder using the following command:&lt;br /&gt;
 cd /d T:\Bklein7_CW\bpertussis_cw20151210&lt;br /&gt;
*We used XMLPipeDB Match to identify gene IDs in the UniProt XML file that conformed to the following patterns: &amp;quot;BP####&amp;quot;, &amp;quot;BP####.1&amp;quot;, &amp;quot;BP####A&amp;quot;, and &amp;quot;BP####B&amp;quot;. The command used was as follows:&lt;br /&gt;
 java -jar xmlpipedb-match-1.1.1.jar &amp;quot;BP[0-9][0-9][0-9][0-9](A|B|\.1|)&amp;quot; &amp;lt; &amp;quot;uniprot-proteome%3AUP000002676_cw20151201.xml&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Match Results:&lt;br /&gt;
*[[File:Xmlpipedbmatch cw20151203.png]]&lt;br /&gt;
**The number of unique matches generated by XMLPipeDB Match, 3447, matched with our expectation. The count includes the total number of ordered locus (3435) and ORF (11) gene IDs along with the unique EnsemblBacteria reference ID &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
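&lt;br /&gt;
The unique-match count above can be illustrated with a minimal Python sketch. It reuses the regular expression from the command, but the function name and the sample text are hypothetical and are not taken from XMLPipeDB Match or the real UniProt XML file.&lt;br /&gt;

```python
import re

# Same pattern as in the xmlpipedb-match command above. The trailing empty
# alternative lets plain "BP####" IDs match without a suffix.
GENE_ID_PATTERN = re.compile(r"BP[0-9][0-9][0-9][0-9](?:A|B|\.1|)")


def unique_gene_ids(text):
    """Return the set of distinct gene IDs found in text (hypothetical helper)."""
    return {match.group(0) for match in GENE_ID_PATTERN.finditer(text)}


# Hypothetical sample text; the repeated ID is counted only once.
sample = "BP0001 BP0001 BP0910A BP1234B BP3167.1 BP3167A XBP99"
ids = unique_gene_ids(sample)
print(sorted(ids))
print(len(ids))
```

Collecting the matches into a set is what turns a raw match count into a count of unique IDs.&lt;br /&gt;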
&lt;br /&gt;
===Using SQL Queries to Validate the PostgreSQL Database Results from the TallyEngine===&lt;br /&gt;
We used the SQL &amp;quot;union&amp;quot; operation to count the number of &amp;quot;ordered locus&amp;quot; gene IDs, which conform to the pattern &amp;quot;BP####&amp;quot;, in addition to all gene IDs that matched the patterns &amp;quot;BP####A&amp;quot; &amp;amp; &amp;quot;BP####B&amp;quot; (including 11 &amp;quot;ORF&amp;quot; gene IDs and 1 EnsemblBacteria reference ID):&lt;br /&gt;
&lt;br /&gt;
 select count(value) from (select value from genenametype where type = &lt;br /&gt;
 &amp;#039;ordered locus&amp;#039; union select value from propertytype inner join dbreferencetype&lt;br /&gt;
  on (propertytype.dbreferencetype_property_hjid = dbreferencetype.hjid)&lt;br /&gt;
   where dbreferencetype.type = &amp;#039;EnsemblBacteria&amp;#039; and propertytype.type = &lt;br /&gt;
   &amp;#039;gene ID&amp;#039; and propertytype.value ~ &amp;#039;BP[0-9][0-9][0-9][0-9](A|B)&amp;#039;) as combined;&lt;br /&gt;
&lt;br /&gt;
Note: This query was crafted by [[User:Dondi|Dr. Dionisio]].&lt;br /&gt;
&lt;br /&gt;
Results:&lt;br /&gt;
*[[File:PostgreSQL Count cw20151210.png]]&lt;br /&gt;
* The number of unique matches yielded by this SQL query, 3447, matched the count generated by XMLPipeDB Match. Thus, the locations of all 3447 gene IDs in the PostgreSQL relational database were accounted for here.&lt;br /&gt;
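&lt;br /&gt;
The de-duplicating behavior of the SQL union can be sketched in Python with sets. This is only an illustration under stated assumptions: the ID lists below are hypothetical stand-ins for the two branches of the query, not values read from the PostgreSQL database.&lt;br /&gt;

```python
import re

# Hypothetical stand-in for the first branch: genenametype rows of type "ordered locus".
ordered_locus = {"BP0001", "BP0002", "BP0910A"}

# Hypothetical stand-in for EnsemblBacteria "gene ID" property values.
ensembl_gene_ids = ["BP0001", "BP0910A", "BP3167A", "BP0003"]

# The PostgreSQL ~ operator is an unanchored regular expression match,
# so re.search is the closest Python equivalent.
suffix_pattern = re.compile(r"BP[0-9][0-9][0-9][0-9](A|B)")
ensembl_branch = {value for value in ensembl_gene_ids if suffix_pattern.search(value)}

# Like SQL UNION (as opposed to UNION ALL), a set union keeps each value once.
combined = ordered_locus.union(ensembl_branch)
print(sorted(combined))
print(len(combined))
```

In this sample, BP0910A appears in both branches but contributes only once to the final count, while BP3167A is contributed by the second branch alone.&lt;br /&gt;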
&lt;br /&gt;
===OriginalRowCounts Comparison===&lt;br /&gt;
&lt;br /&gt;
We opened the gene database file [[File:Bpertussis-std_cw20151210.zip]] in  Microsoft Access and assessed the &amp;quot;OriginalRowCounts&amp;quot; table to see if the expected tables were listed with the expected number of records. The contents of this table were compared to the &amp;#039;&amp;#039;OriginalRowCounts&amp;#039;&amp;#039; table of an existing .gdb file created during Week 9.&lt;br /&gt;
 &lt;br /&gt;
Benchmark .gdb file: [[File:Vc-Std 20151027 TR.gdb]]&lt;br /&gt;
&lt;br /&gt;
&amp;quot;OriginalRowCounts&amp;quot; table from the benchmark and new gdb:&lt;br /&gt;
*[[File:ComparisonToBenchmark cw20151210.PNG]]&lt;br /&gt;
**All 52 tables present in the 2015 &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; database were also present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; gene database, &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;. This confirmed that all expected tables were successfully created.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; table count is listed as 3447. &amp;#039;&amp;#039;&amp;#039;This count demonstrates that the missing ID, &amp;quot;BP3167A&amp;quot;, was successfully added to the export (confirmed below).&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
***[[File:BP3167A Confirmed cw20151210.PNG]]&lt;br /&gt;
&lt;br /&gt;
Note: The &amp;quot;OriginalRowCounts&amp;quot; tables were too large to screenshot. To circumvent this problem and facilitate the comparison, I copied the &amp;quot;OriginalRowCounts&amp;quot; tables from both gene databases into an Excel file and zoomed out. The above screenshot was taken from this Excel file. The &amp;quot;OrderedLocusNames&amp;quot; row count for &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039; is highlighted in yellow.&lt;br /&gt;
&lt;br /&gt;
===Visual Inspection===&lt;br /&gt;
We visually inspected individual tables within [[File:Bpertussis-std_cw20151210.zip]] using Microsoft Access.&lt;br /&gt;
&lt;br /&gt;
*Systems Table&lt;br /&gt;
**35 gene ID systems were listed, 11 of which were used in the creation of this .gdb file and listed the appropriate import date (12/10/2015).&lt;br /&gt;
***All gene ID systems relevant to &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; were listed. This includes: EMBL, EnsemblBacteria, GeneID, GeneOntology, InterPro, OrderedLocusNames, Pfam, RefSeq, and UniProt.&lt;br /&gt;
***This result corresponded with that of the benchmark .gdb file listed in the &amp;quot;OriginalRowCounts Comparison&amp;quot; section.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; listing properly displayed customizations to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile.&lt;br /&gt;
***In this row, the species was listed correctly as &amp;quot;Bordetella pertussis&amp;quot;.&lt;br /&gt;
***In this row, the link corresponded to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; database at GeneDB. The link was as follows: http://www.genedb.org/gene/~;jsessionid=A06A0EFE93C64E476380393D4CBEFA69?actionName=%2FQuery%2FquickSearch&amp;amp;resultsSize=1&amp;amp;taxonNodeName=Bpertussis.&lt;br /&gt;
*UniProt Table&lt;br /&gt;
**This table contained 3258 entries with 6-character IDs.&lt;br /&gt;
**All IDs in the UniProt table conformed to the following pattern:&lt;br /&gt;
*** [[File:UniProt Ascension Number info.PNG]]&lt;br /&gt;
*RefSeq Table&lt;br /&gt;
**This table contained 6627 entries. All IDs began with one of three prefixes: &amp;quot;NP_&amp;quot;, &amp;quot;YP_&amp;quot;, or &amp;quot;WP_&amp;quot;. The meanings of these prefixes can be found in the RefSeq documentation found [http://www.ncbi.nlm.nih.gov/books/NBK50679/ here].&lt;br /&gt;
***&amp;quot;NP_&amp;quot; and &amp;quot;YP_&amp;quot; Prefixes&lt;br /&gt;
****Refer to proteins. There were 3410 &amp;quot;NP_&amp;quot; IDs and 7 &amp;quot;YP_&amp;quot; IDs.&lt;br /&gt;
***&amp;quot;WP_&amp;quot; Prefix&lt;br /&gt;
****Refers to &amp;quot;autonomous non-redundant proteins that are not yet directly annotated on a genome&amp;quot;. There were 3210 IDs with the &amp;quot;WP_&amp;quot; prefix.&lt;br /&gt;
***Overall, every entry in the ID column was an expected value.&lt;br /&gt;
*OrderedLocusNames Table&lt;br /&gt;
**This table contained 3447 entries (consistent with the XMLPipeDB Match result).&lt;br /&gt;
**The IDs were copied into an Excel document for analysis:&lt;br /&gt;
***3434 IDs conformed to the pattern &amp;quot;BP####&amp;quot;.&lt;br /&gt;
***11 IDs conformed to the pattern &amp;quot;BP####A&amp;quot;.&lt;br /&gt;
****This included 10 ORF gene IDs &amp;amp; &amp;quot;BP3167A&amp;quot; (reference to an EnsemblBacteria ID).&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####B&amp;quot;.&lt;br /&gt;
****This corresponded to an ORF gene ID.&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####.1&amp;quot;.&lt;br /&gt;
****This was the ID under which UniProt classified &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==bpertussis-std_cw20151210.gdb Use in GenMAPP==&lt;br /&gt;
&lt;br /&gt;
The following analysis was conducted in GenMAPP Version 2.1. Within GenMAPP, the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database was loaded by selecting Data &amp;gt; Choose Gene Database and then selecting the file &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
===Putting a Gene on the MAPP Using the GeneFinder Window===&lt;br /&gt;
&lt;br /&gt;
We made a sample MAPP in which gene IDs conforming to the naming conventions of the 5 major gene databases containing &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome data were added. A screenshot of the resulting MAPP is provided below:&lt;br /&gt;
[[File:Samplegenemapp.png]]&lt;br /&gt;
*Gene IDs:&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;bp1123&amp;#039;&amp;#039;&amp;#039; refers to the OrderedLocusNames gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;CAE43716&amp;#039;&amp;#039;&amp;#039; refers to the EnsemblBacteria gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Q7VWE5&amp;#039;&amp;#039;&amp;#039; refers to the UniProt gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;2665491&amp;#039;&amp;#039;&amp;#039; refers to the GeneID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;NP_881255&amp;#039;&amp;#039;&amp;#039; refers to the RefSeq gene ID system.&lt;br /&gt;
&lt;br /&gt;
Note: Gene IDs tested from the above gene ID systems all had complete Backpages and were successfully placed on the MAPP.&lt;br /&gt;
&lt;br /&gt;
===Creating an Expression Dataset in the Expression Dataset Manager===&lt;br /&gt;
The file [[File:Bpertussis compiledrawdata cw20151208.txt]] was used to create an expression dataset in GenMAPP.&lt;br /&gt;
&lt;br /&gt;
*Total Number of Gene IDs Imported&lt;br /&gt;
** 3211 of the 3552 gene IDs from the microarray dataset were imported into the expression dataset.&lt;br /&gt;
**There were 341 exceptions during the creation of the expression dataset. A screenshot of the error message is shown here: &lt;br /&gt;
***[[File:Errors in genmapp.png]]&lt;br /&gt;
*Investigating Errors in the Exceptions File (EX.txt)&lt;br /&gt;
**All 341 exceptions triggered the following error message: &amp;quot;Gene not found in OrderedLocusNames or any related system.&amp;quot;&lt;br /&gt;
**Gene IDs that triggered this error message conformed to the patterns &amp;quot;BP####&amp;quot; and &amp;quot;BP####A&amp;quot;, indicating that no unique gene ID patterns were the cause of these errors.&lt;br /&gt;
***Example gene IDs that triggered this error are the following: BP0101, BP1677, BP0910A, and BP2029A.&lt;br /&gt;
****Searching for any of these gene IDs in UniProt returns the message &amp;quot;Sorry, no results found for your search term.&amp;quot;:&lt;br /&gt;
*****[[File:ErroneousID Uniprot cw20151210.PNG]]&lt;br /&gt;
***The 341 gene IDs were copied into a new Excel file and compared to the gene IDs present in the file [[File:Bpertussis-std_cw20151210.zip]] (adapted from the &amp;quot;OrderedLocusNames&amp;quot; table in Microsoft Access).&lt;br /&gt;
****None of the 341 gene IDs were present in the .gdb file.&lt;br /&gt;
***The 341 gene IDs were each individually searched for in UniProt.&lt;br /&gt;
****None of the 341 gene IDs retrieved results in UniProt.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Conclusion: None of the gene IDs that triggered errors were present in the original UniProt XML file.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
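&lt;br /&gt;
The Excel comparison of the exception IDs against the .gdb IDs amounts to a membership check, sketched below in Python. The helper name and the short ID lists are hypothetical; only the totals 3552, 341, and 3211 come from the results reported above.&lt;br /&gt;

```python
def split_by_presence(dataset_ids, gdb_ids):
    """Split dataset IDs into those found in the gene database and those missing.

    Hypothetical helper for illustration; it is not part of GenMAPP.
    """
    known = set(gdb_ids)
    found = [gene_id for gene_id in dataset_ids if gene_id in known]
    missing = [gene_id for gene_id in dataset_ids if gene_id not in known]
    return found, missing


# Hypothetical miniature example, not the real microarray or .gdb contents.
gdb_ids = ["BP0001", "BP0002", "BP3167A"]
dataset_ids = ["BP0001", "BP0101", "BP0002", "BP0910A"]
found, missing = split_by_presence(dataset_ids, gdb_ids)
print(found)
print(missing)

# Consistency check on the reported totals: imported plus exceptions equals all IDs.
total_ids, exceptions = 3552, 341
print(total_ids - exceptions)
```

The final line reproduces the reported number of imported gene IDs, 3211, from the other two totals.&lt;br /&gt;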
&lt;br /&gt;
===Coloring a MAPP with Expression Data===&lt;br /&gt;
&lt;br /&gt;
====Creating a New Color Set====&lt;br /&gt;
We customized the new Expression Dataset by creating a new color set entitled &amp;quot;LogFoldChange&amp;quot;.&lt;br /&gt;
# First, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;increase&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Increased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as red using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;gt; 0.25 AND [Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;.&lt;br /&gt;
#Second, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;decrease&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Decreased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as green using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;lt; -0.25 AND [Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;&lt;br /&gt;
# After entering these criteria, we saved the entire Expression Dataset by selecting Save from the Expression Dataset menu. This updated our .gex file with the new Color Set.&lt;br /&gt;
&lt;br /&gt;
Screenshot of Color Set criteria:&lt;br /&gt;
*[[File:Expressioncolorset.png]]&lt;br /&gt;
&lt;br /&gt;
Note: No errors were encountered in the creation of the Color Set.&lt;br /&gt;
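&lt;br /&gt;
The logic of the two criteria can be sketched as a small Python function. This is an illustration only, not how GenMAPP evaluates criteria internally; the function name and the &amp;quot;Unchanged&amp;quot; label for genes meeting neither criterion are assumptions. The comparisons are written with the greater-than sign only, which is equivalent to the criteria as stated above.&lt;br /&gt;

```python
def classify(avg_abc_samples, p_value):
    """Apply the Increased and Decreased criteria to one gene (hypothetical helper)."""
    # Both criteria require a p value below 0.05.
    significant = 0.05 > p_value
    if significant and avg_abc_samples > 0.25:
        return "Increased"  # colored red in the color set
    if significant and -0.25 > avg_abc_samples:
        return "Decreased"  # colored green in the color set
    return "Unchanged"  # assumed label for genes meeting neither criterion


# Hypothetical values, not taken from the microarray dataset.
print(classify(0.40, 0.01))
print(classify(-0.60, 0.03))
print(classify(0.40, 0.20))
```

Because both thresholds are strict, a gene with a value of exactly 0.25 or -0.25 meets neither criterion.&lt;br /&gt;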
&lt;br /&gt;
====Creating a Pathway-Based MAPP Using Colored Genes====&lt;br /&gt;
====Ribosome KEGG Pathway====&lt;br /&gt;
* We created a MAPP of the ribosome pathway using the genes provided on the http://www.genome.jp/kegg/ website.&lt;br /&gt;
** After opening the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Ribosome&amp;quot; under section 2.2 Translation and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the ribosome pathway with the gene IDs that pertained to our specific organism. We then used these genes to create a MAPP in GenMAPP.&lt;br /&gt;
** Each of the genes highlighted in green on the KEGG ribosome pathway was added to the MAPP by entering its gene ID and the name given on the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151213&amp;quot; was then applied to color code the genes.&lt;br /&gt;
**Here is a screenshot of the final MAPP for the ribosome pathway:&lt;br /&gt;
* [[File:RibosomeGenMAPP.png]]&lt;br /&gt;
** Most of the ribosome genes on this MAPP were colored green, indicating decreased expression; the remaining genes were gray, indicating no significant change in this experiment. Thus, every ribosome gene that changed significantly decreased in expression during the microarray experiment. Ribosomes carry out translation, so without them transcripts cannot be translated into functional proteins. The microarray analysis revealed that the absence of a membrane-associated protein named KpsT in &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; resulted in global down-regulation of gene expression, including key virulence genes. The decreased expression of the ribosome pathway genes links translation to this global down-regulation: in the strain lacking KpsT, reduced translation of the down-regulated genes would lower the demand for ribosomes, which may account for the decreased expression of the genes mapped in the ribosome pathway.&lt;br /&gt;
&lt;br /&gt;
====Nitrogen Cycle KEGG Pathway====&lt;br /&gt;
* We created a second MAPP using the nitrogen metabolism pathway genes provided on the http://www.genome.jp/kegg/ website.&lt;br /&gt;
** After opening the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Nitrogen Metabolism&amp;quot; under section 1.2 Energy Metabolism and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the nitrogen metabolism pathway with the gene IDs that pertained to our specific organism. We then used these genes to create a MAPP in GenMAPP.&lt;br /&gt;
** Each of the genes highlighted in green on the KEGG nitrogen metabolism pathway was added to the MAPP by entering its gene ID and the name given on the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151213&amp;quot; was then applied to color code the genes.&lt;br /&gt;
** Here is a screenshot of the final MAPP for the nitrogen metabolism pathway:&lt;br /&gt;
* [[File:NitrogencycleGenMAPP.png]]&lt;br /&gt;
** This MAPP displayed both green genes (decreased expression) and red genes (increased expression), as well as a couple of gray genes that did not meet either criterion. We created this MAPP because nitrogen metabolism is among the metabolic processes that keep cells alive and reproducing. The red genes had increased expression during the microarray experiment, and according to the KEGG nitrogen metabolism pathway, these genes aid specifically in the metabolism of glutamate. Glutamate helps provide the energy that cells need to operate correctly. Because the glutamate-related genes we mapped were increased, glutamate may help supply the energy that allows the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strains to produce the polysaccharide capsule transport proteins studied in the microarray experiment.&lt;br /&gt;
&lt;br /&gt;
===Running MAPPFinder===&lt;br /&gt;
*MAPPFinder Procedure&lt;br /&gt;
** We launched the MAPPFinder program from within GenMAPP and ensured that the &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039; gene database was still loaded into GenMAPP.&lt;br /&gt;
** We clicked on the button &amp;quot;Calculate New Results&amp;quot; followed by &amp;quot;Find File&amp;quot;, at which point we specified the .gex file updated during the creation of the &amp;quot;LogFoldChange&amp;quot; color set.&lt;br /&gt;
** We chose to apply both the &amp;quot;Increased&amp;quot; and &amp;quot;Decreased&amp;quot; criteria present within the LogFoldChange color set to the data.&lt;br /&gt;
** We checked the boxes next to &amp;quot;Gene Ontology&amp;quot; and &amp;quot;p value&amp;quot;, specified the results file, and then clicked &amp;quot;Run MAPPFinder&amp;quot;.&lt;br /&gt;
***This analysis took several minutes to complete.&lt;br /&gt;
*MAPPFinder Analysis Results&lt;br /&gt;
**We selected &amp;quot;Show Ranked List&amp;quot; to see a list of the most significant Gene Ontology terms. A screenshot of this output is shown below:&lt;br /&gt;
**[[File:Mappfinderrankedlist.png]]&lt;br /&gt;
***The majority of the most significant gene ontology terms pertained to ribosome biosynthesis and translation.&lt;br /&gt;
&lt;br /&gt;
Note: The MAPPFinder analysis took approximately 8 minutes to complete. No errors were encountered in the process. MAPPFinder thus was confirmed to work with the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database.&lt;br /&gt;
&lt;br /&gt;
=== Compare Gene Database to Outside Resource===&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Template:Bklein7&amp;diff=8078</id>
		<title>Template:Bklein7</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Template:Bklein7&amp;diff=8078"/>
				<updated>2015-12-18T19:33:42Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: updated team journal links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*&amp;#039;&amp;#039;&amp;#039;User Page:&amp;#039;&amp;#039;&amp;#039; [[User:Bklein7|Brandon Klein]]&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Team Page:&amp;#039;&amp;#039;&amp;#039; [[The Class Whoopers]]&lt;br /&gt;
&lt;br /&gt;
===Assignments Pages===&lt;br /&gt;
*[[Week_1|Week 1 Assignment]]&lt;br /&gt;
*[[Week_2|Week 2 Assignment]]&lt;br /&gt;
*[[Week_3|Week 3 Assignment]]&lt;br /&gt;
*[[Week_4|Week 4 Assignment]]&lt;br /&gt;
*[[Week_5|Week 5 Assignment]]&lt;br /&gt;
*[[Week_6|Week 6 Assignment]]&lt;br /&gt;
*[[Week_7|Week 7 Assignment]]&lt;br /&gt;
*[[Week_8|Week 8 Assignment]]&lt;br /&gt;
*[[Week_9|Week 9 Assignment]]&lt;br /&gt;
*[[Week_10|Week 10 Assignment]]&lt;br /&gt;
*[[Week_11|Week 11 Assignment]]&lt;br /&gt;
*[[Week_12|Week 12 Assignment]]&lt;br /&gt;
*&amp;#039;&amp;#039;No Week 13 Assignment&amp;#039;&amp;#039;&lt;br /&gt;
*[[Week_14|Week 14 Assignment]]&lt;br /&gt;
*[[Week_15|Week 15 Assignment]]&lt;br /&gt;
&lt;br /&gt;
===Individual Journal Entries===&lt;br /&gt;
*[[User:Bklein7|Week 1 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_2|Week 2 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_3|Week 3 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_4|Week 4 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_5|Week 5 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_6|Week 6 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_7|Week 7 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_8|Week 8 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_9|Week 9 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_10|Week 10 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_11|Week 11 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_12|Week 12 Individual Journal]]&lt;br /&gt;
*&amp;#039;&amp;#039;No Week 13 Journal&amp;#039;&amp;#039;&lt;br /&gt;
*[[Bklein7_Week_14|Week 14 Individual Journal]]&lt;br /&gt;
*[[Bklein7_Week_15|Week 15 Individual Journal]]&lt;br /&gt;
&lt;br /&gt;
===Shared Journal Entries===&lt;br /&gt;
*[[Class_Journal_Week_1|Week 1 Class Journal]]&lt;br /&gt;
*[[Class_Journal_Week_2|Week 2 Class Journal]]&lt;br /&gt;
*[[Class_Journal_Week_3|Week 3 Class Journal]]&lt;br /&gt;
*[[Class_Journal_Week_4|Week 4 Class Journal]]&lt;br /&gt;
*[[Class_Journal_Week_5|Week 5 Class Journal]]&lt;br /&gt;
*[[Class_Journal_Week_6|Week 6 Class Journal]]&lt;br /&gt;
*[[Class_Journal_Week_7|Week 7 Class Journal]]&lt;br /&gt;
*[[Class_Journal_Week_8|Week 8 Class Journal]]&lt;br /&gt;
*[[Class_Journal_Week_9|Week 9 Class Journal]]&lt;br /&gt;
*[[The_Class_Whoopers|Week 10 Team Journal]]&lt;br /&gt;
*[[The_Class_Whoopers#Week11|Week 11 Team Journal]]&lt;br /&gt;
*[[The_Class_Whoopers#Week12|Week 12 Team Journal]]&lt;br /&gt;
*&amp;#039;&amp;#039;No Week 13 Journal&amp;#039;&amp;#039;&lt;br /&gt;
*[[The_Class_Whoopers#Week14|Week 14 Team Journal]]&lt;br /&gt;
*[[The_Class_Whoopers#Week15|Week 15 Team Journal]]&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Template:Class_Whoopers&amp;diff=8077</id>
		<title>Template:Class Whoopers</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Template:Class_Whoopers&amp;diff=8077"/>
				<updated>2015-12-18T19:31:11Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: deleted excess info&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;GenMAPP Analysis of &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Microarray Data&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
{{Gene Database Project Links}}&lt;br /&gt;
&lt;br /&gt;
===Journal Entries===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;width: 100%; text-align: center&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;5&amp;quot;|Class Whoopers Individual Journal Entries&lt;br /&gt;
|-&lt;br /&gt;
! [[User:Bklein7 | Brandon Klein]]&lt;br /&gt;
! [[Bklein7_Week_11|Week 11]]&lt;br /&gt;
! [[Bklein7_Week_12|Week 12]]&lt;br /&gt;
! [[Bklein7_Week_14|Week 14]]&lt;br /&gt;
! [[Bklein7_Week_15|Week 15]]&lt;br /&gt;
|-&lt;br /&gt;
! [[User:Lenaolufson | Lena Olufson]]&lt;br /&gt;
! [[lenaolufson Week 11|Week 11]]&lt;br /&gt;
! [[lenaolufson Week 12|Week 12]]&lt;br /&gt;
! [[lenaolufson Week 14|Week 14]]&lt;br /&gt;
! [[lenaolufson Week 15|Week 15]]&lt;br /&gt;
|-&lt;br /&gt;
! [[User:Msaeedi23 | Mahrad Saeedi]]&lt;br /&gt;
! [[Msaeedi23 Week 11| Week 11]]&lt;br /&gt;
! [[Msaeedi23 Week 12| Week 12]]&lt;br /&gt;
! [[Msaeedi23 Week 14| Week 14]]&lt;br /&gt;
! [[Msaeedi23 Week 15| Week 15]]&lt;br /&gt;
|-&lt;br /&gt;
! Team Entries&lt;br /&gt;
! [[#Week_11|Week 11]]&lt;br /&gt;
! [[#Week_12|Week 12]]&lt;br /&gt;
! [[#Week_14|Week 14]]&lt;br /&gt;
! [[#Week_15|Week 15]]&lt;br /&gt;
|}&lt;br /&gt;
=== Group Members ===&lt;br /&gt;
* Project Manager &amp;amp; Coder: [[User:Bklein7 | Brandon Klein]]&lt;br /&gt;
* Quality Assurance: [[User:Msaeedi23 | Mahrad Saeedi]]&lt;br /&gt;
* GenMAPP User: [[User:Lenaolufson | Lena Olufson]]&lt;br /&gt;
* [[The Class Whoopers | Team Page]]&lt;br /&gt;
&lt;br /&gt;
=== Team Weekly Assignments ===&lt;br /&gt;
* [[Week 10]] Creation of page and combined annotated bibliography (midnight 11/10)&lt;br /&gt;
* [[Week 11]] (midnight 11/17)&lt;br /&gt;
* [[Week 12]] (midnight 11/24)&lt;br /&gt;
* [[Week 14]] (midnight 12/8)&lt;br /&gt;
* [[Week 15]] (midnight 12/15)&lt;br /&gt;
&lt;br /&gt;
[[Category:Group Projects]] [[Category:Class Whoopers]]&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=The_Class_Whoopers&amp;diff=8076</id>
		<title>The Class Whoopers</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=The_Class_Whoopers&amp;diff=8076"/>
				<updated>2015-12-18T19:30:23Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: made change to invoke new template&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Team Information &amp;amp; Links =&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;br /&gt;
&lt;br /&gt;
= Deliverables =&lt;br /&gt;
[[Bordetella Pertussis GenMAPP Analysis Deliverables]]&lt;br /&gt;
&lt;br /&gt;
==Presentation Download Links==&lt;br /&gt;
*Journal Club&lt;br /&gt;
** Genome Paper: [[File:Genomepaper_cw20151116.pdf]]&lt;br /&gt;
** Microarray Paper: [[File: Microarray_Journal_Club_Presentation.pdf]]&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Final Project&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**[[File:Bpertussis_findings_powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==File Naming Protocol==&lt;br /&gt;
All file types generated in this project will receive their own unique names composed of two key parts:&lt;br /&gt;
#Description&lt;br /&gt;
#*This will contain a brief, file-specific description of what content the file contains.&lt;br /&gt;
#*Descriptions for different versions of the same file will remain consistent.&lt;br /&gt;
#Identifier Tag&lt;br /&gt;
#*This tag will be listed as a suffix in the following form: &amp;quot;_cwYYYYMMDD&amp;quot;&lt;br /&gt;
#**cw- team name abbreviation&lt;br /&gt;
#**YYYYMMDD- date the file was created in the form year/month/day&lt;br /&gt;
&lt;br /&gt;
Additionally, the following file naming best practices will be observed when creating descriptions for new files:&lt;br /&gt;
*Our species will be referred to consistently as &amp;quot;bpertussis&amp;quot;.&lt;br /&gt;
*Spaces will be written as underscores.&lt;br /&gt;
*No capitalization will be used.&lt;br /&gt;
*No special characters will be used.&lt;br /&gt;
*If sequential numbering systems are used, leading zeros will be included for clarity.&lt;br /&gt;
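&lt;br /&gt;
As a rough illustration of the protocol above, the following Python sketch assembles a file name from a description, a creation date, and an extension. The function name build_file_name and the decision to keep hyphens (as in bpertussis-std_cw20151210.zip) are assumptions made for this example and are not part of any project tooling.&lt;br /&gt;

```python
import re
from datetime import date

TEAM_TAG = "cw"  # team name abbreviation from the protocol


def build_file_name(description, created, extension):
    """Return description + "_cwYYYYMMDD" + extension (hypothetical helper)."""
    # Spaces are written as underscores and no capitalization is used.
    cleaned = description.strip().lower().replace(" ", "_")
    # Drop special characters. Keeping hyphens is an assumption based on
    # existing names such as bpertussis-std_cw20151210.zip.
    cleaned = re.sub(r"[^a-z0-9_-]", "", cleaned)
    # Identifier tag: team abbreviation followed by the date as YYYYMMDD.
    tag = TEAM_TAG + created.strftime("%Y%m%d")
    return "{}_{}{}".format(cleaned, tag, extension)


print(build_file_name("Bpertussis groupreport", date(2015, 12, 18), ".pdf"))
```

For example, the description &amp;quot;Bpertussis groupreport&amp;quot; with a creation date of December 18, 2015 and the extension &amp;quot;.pdf&amp;quot; yields bpertussis_groupreport_cw20151218.pdf.&lt;br /&gt;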
&lt;br /&gt;
=Weekly Updates=&lt;br /&gt;
==Week 15==&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Assignment due date:&amp;#039;&amp;#039;&amp;#039; Midnight Tuesday, December 15&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Coder:&amp;#039;&amp;#039;&amp;#039; Adjust the GenMAPP Builder code to account for the one EnsemblBacteria reference ID that was missing in our last export; conduct a new import-export cycle to create the (hopefully) final .gdb file; begin characterizing the exported .gdb file in a Gene Database Testing Report; customize the GenMAPP Builder TallyEngine to account for any changes made.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Quality Assurance:&amp;#039;&amp;#039;&amp;#039; Reconfigure TallyEngine with the Coder to accommodate the gene IDs that were not exported the previous time. Test the revised database by running TallyEngine count, XMLPipeDB Match, and PostgreSQL. Locate any gene IDs that are still missing. &lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;GenMAPP User:&amp;#039;&amp;#039;&amp;#039; Import data into GenMAPP, create ColorSets, and run MAPPFinder. Document and take notes on test runs with GenMAPP. Use the EX.txt file to help the Coder/Quality Assurance team members to validate the .gdb. Create a .mapp file showing one pathway that is changed in your data.&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Brandon (Coder and Project Manager):&amp;#039;&amp;#039;&amp;#039; I began this week by customizing the GenMAPP Builder TallyEngine to report ORF counts for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; (see [[Bklein7_Week_15]]). After this, I worked with [[User:Msaeedi23|Mahrad]] to identify the 1 gene ID that was missing in the .gdb file [[File:bpertussis-std_cw20151203.zip]]. I found that this gene was a necessary EnsemblBacteria reference ID and edited the GenMAPP Builder code with the help of [[User:Dondi|Dr. Dionisio]] to include this ID in our next export (see [[Bklein7_Week_15]]). I conducted a complete import-export cycle on 12/10/2015 to create the .gdb file [[File:bpertussis-std_cw20151210.zip]]. I then characterized this export, authoring sections 1-5.2 of its testing report: [[Gene_Database_Testing_Report-_cw20151210]]. During our Sunday meeting, I worked with [[User:Lenaolufson|Lena]] to use this new gene database in GenMAPP. During our Monday meeting, I worked on our PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]].&lt;br /&gt;
*** [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 22:31, 14 December 2015 (PST)&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Mahrad (Quality Assurance):&amp;#039;&amp;#039;&amp;#039; I worked closely with the coder [[User: Bklein7|Brandon]] to re-customize TallyEngine to include the 11 missing ORF genes. Having located the missing gene IDs, Brandon edited the code in Eclipse so that they would be included in the export. Following this, we tested our revised gene database to make sure these missing IDs were actually exported. We ran TallyEngine count, which gave a total of 3446 gene IDs, demonstrating that the IDs were now exported. Then we ran XMLPipeDB Match, which reported a total of 3447 gene IDs, one more than TallyEngine. Finally, we ran PostgreSQL, which gave a total of 3446 gene IDs. We found that gene &amp;quot;BP3167A&amp;quot; was in the original XML file but not accounted for in the exported file. With further investigation, we concluded that &amp;quot;BP3167A&amp;quot; is a reference ID from EnsemblBacteria that corresponds to the same gene as &amp;quot;BP3167.1&amp;quot;, which was exported. &lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Lena (GenMAPP User):&amp;#039;&amp;#039;&amp;#039; I imported the data into GenMAPP and then created color sets in order to run MAPPFinder. I obtained the ontology results and did some background research on how the top results related to the microarray article. I then used KEGG pathways for my organism to create two separate MAPPs, one for the ribosome and one for the nitrogen cycle.&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**This week, our group used class work sessions to coordinate our work:&lt;br /&gt;
***Tuesday, December 8, 2:40 - 4:00&lt;br /&gt;
***Thursday, December 10, 2:40 - 4:00&lt;br /&gt;
**In addition, we scheduled meetings outside of class to work on the final PowerPoint Presentation and deliverables for our project:&lt;br /&gt;
***Sunday, December 13, 7:00 PM - 1:00 AM&lt;br /&gt;
***Monday, December 14, 2:00 PM - 11:00 PM&lt;br /&gt;
&lt;br /&gt;
==Week 14==&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Assignment due date:&amp;#039;&amp;#039;&amp;#039; Midnight Tuesday, December 8&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Coder:&amp;#039;&amp;#039;&amp;#039; Create the custom species profile for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039;, run an export using the customized version of GenMAPP Builder, add further customizations to the custom species profile as appear necessary, and run a second export using the further customized version of GenMAPP Builder.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Quality Assurance:&amp;#039;&amp;#039;&amp;#039; Identify gene IDs that are missing in the first custom export, work with the coder to classify these IDs, configure the Tally Engine, and complete a gene database testing report for the second custom export.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;GenMAPP User:&amp;#039;&amp;#039;&amp;#039; Complete the statistical analysis of the data, format the data for import into GenMAPP, and coordinate with the coder/QA to import this data into GenMAPP using the custom gene database.&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Brandon (Coder and Project Manager):&amp;#039;&amp;#039;&amp;#039; This week, I focused on creating and customizing the species profile for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; in GenMAPP Builder, the details of which can be found in my [[Bklein7 Week 14| Week 14 Journal Entry]]. I documented the first export I conducted using a custom &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile here: [[Gene Database Testing Report- cw20151201]]. I demonstrated that the custom species information implemented in this export worked as intended, but Mahrad and I identified 11 ORF genes that failed to export. I updated the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile to account for these ORF genes and conducted a new export, detailed here: [[Gene Database Testing Report- cw20151203]]. Mahrad analyzed the exported .gdb file. In addition to this, I kept tabs on my fellow group members to keep us on track to accomplish our long-term project goals in a timely manner.&lt;br /&gt;
*** [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 13:39, 7 December 2015 (PST)&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Mahrad (Quality Assurance):&amp;#039;&amp;#039;&amp;#039; This week, as QA, I worked directly with Brandon on the initial data exports. The work is summarized here: [[Msaeedi23 Week 14| Week 14 Journal Entry]]. Next, we meticulously characterized regular expression patterns to detect discrepancies in extracting the data from the original samples. In the following week, I will focus on customizing the Tally Engine configuration for our species, which may take some time and coding assistance from Brandon. Once the Tally Engine has been configured for our species, Lena can proceed with GenMAPP processing. &lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Lena (GenMAPP User):&amp;#039;&amp;#039;&amp;#039; This week, I made progress on performing the statistical analysis of the data to prepare it for GenMAPP. I posted my progress for each of the class working sessions on my [[Lenaolufson Week 14| Week 14 Journal Entry]], updating the Excel data sheets after each session. Dr. Dahlquist helped me figure out a problem with the original raw data that was causing the values to be very skewed. I then sent her my updated data sheet, and she used a program to separate the duplicates of the chips. After she sent me back the data with the sorted values, I performed the statistical analysis on the data. The most recent version of the file can be found on my Week 14 journal entry linked above. &lt;br /&gt;
[[User:Lenaolufson|Lenaolufson]] ([[User talk:Lenaolufson|talk]]) 19:54, 7 December 2015 (PST)&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**This week, our group used class work sessions to coordinate our work:&lt;br /&gt;
***Tuesday, December 1, 2:40 - 4:00&lt;br /&gt;
***Thursday, December 3, 2:40 - 4:00&lt;br /&gt;
*** Monday, December 7, 10:30 - 12 am&lt;br /&gt;
&lt;br /&gt;
==Week 12==&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Assignment due date:&amp;#039;&amp;#039;&amp;#039; Midnight Tuesday, November 24&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Coder:&amp;#039;&amp;#039;&amp;#039; Set up a GitHub repository clone of the XMLPipeDB project on your development device, the development rig, and the initial as-is build for gmbuilder. Complete an import-export cycle in association with QA.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Quality Assurance:&amp;#039;&amp;#039;&amp;#039; Complete an import-export cycle for the 1st &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database. Complete a Gene Database Testing Report for this export.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;GenMAPP Users:&amp;#039;&amp;#039;&amp;#039; Create a Master Raw Data file that contains the IDs and columns of data required for further analysis. Consult with Dr. Dahlquist on how to process the data (normalization, statistics).&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Brandon (Quality Assurance and Interim Coder):&amp;#039;&amp;#039;&amp;#039; This week, I focused on completing an import-export cycle for our first &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database- [[File:Bpertussis-std cw20151119.zip]]. With my QA hat, I imported the appropriate data, exported the gene database, and discussed the gene database creation &amp;amp; counting protocol here- [[Gene Database Testing Report- cw20151119]]. With my Coder hat, I followed the instructions on the [[Coder| Coder Guild Page]] to set up a GitHub repository clone of the XMLPipeDB project on my personal laptop, the Eclipse developer rig, and the initial as-is build for gmbuilder. The electronic lab notebook for my QA and Coder work is present on my [[Bklein7 Week 12| Week 12 Page]]. Finally, I wrote a PowerPoint presentation on our genome sequencing paper, which is linked to on my [[Bklein7 Week 12| Week 12 Page]] as well. &lt;br /&gt;
***[[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 18:48, 23 November 2015 (PST)&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Lena (GenMAPP):&amp;#039;&amp;#039;&amp;#039; I downloaded the correct data sample files from those provided on the microarray paper page. The files were unzipped and prepared for import into Excel. In Excel, the data was arranged into a spreadsheet containing all of the gene IDs from the different samples, with the appropriate columns to be analyzed. Corrections and further manipulation of the data will continue in the coming week in order to create the desired dataset for export from Excel. [[File:Bpertussis CompiledRawData MS2015.xlsx]]&lt;br /&gt;
***[[User:Lenaolufson|Lenaolufson]] ([[User talk:Lenaolufson|talk]]) 17:33, 23 November 2015 (PST)&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Mahrad (GenMAPP--&amp;gt; Quality Assurance)&amp;#039;&amp;#039;&amp;#039;: Downloaded the six data sample files provided by the microarray paper. Files were unzipped, imported into Excel, and manipulated to form a single spreadsheet containing all gene IDs from the different samples. Each sample was placed in its respective column to be further analyzed and manipulated in the upcoming week. Following this, I assumed the position of quality assurance to accommodate the absence of Nicole.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Nicole&amp;#039;&amp;#039;&amp;#039; was absent this week. [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 18:52, 23 November 2015 (PST)&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** Monday, November 23: Seaver 120- Brandon and Lena met to work on the GenMAPP testing of the gene IDs from our database.&lt;br /&gt;
&lt;br /&gt;
==Week 11==&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** For all:&lt;br /&gt;
*** Outline your assigned paper on your user page and include a list of 10 defined terms from the paper.&lt;br /&gt;
**Nicole &amp;amp; Brandon&lt;br /&gt;
***Prepare Journal Club presentation on the designated genome sequencing article&lt;br /&gt;
***Slides Due: by midnight, Tuesday, November 17&lt;br /&gt;
***Presentation Date: Tuesday, November 24&lt;br /&gt;
**Lena &amp;amp; Mahrad&lt;br /&gt;
***Prepare Journal Club presentation on the designated microarray paper&lt;br /&gt;
***Slides Due: by midnight, Tuesday, November 17&lt;br /&gt;
***Presentation Date: Tuesday, November 17&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**Nicole Anguiano (Coder): Nicole was absent this week for a medical emergency and is (hopefully) getting some much deserved rest. [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 23:14, 16 November 2015 (PST)&lt;br /&gt;
**Brandon Klein (QA): This week I made several edits to the [https://xmlpipedb.cs.lmu.edu/biodb/fall2015/index.php/The_Class_Whoopers Class Whoopers Team Page] in accordance with the [https://xmlpipedb.cs.lmu.edu/biodb/fall2015/index.php/Week_11 Week 11 assignment]. These edits included the following: revising the Class Whoopers template, reorganizing the Team Page structure, commenting out unneeded articles in the annotated bibliography, creating the new bibliography entry as requested by Dr. Dahlquist, and writing the naming conventions for our files. Additionally, I outlined our genome sequencing paper for &amp;quot;Bordetella pertussis&amp;quot; and assessed the [http://www.genedb.org/Homepage/Bpertussis GeneDB MOD] on my [https://xmlpipedb.cs.lmu.edu/biodb/fall2015/index.php/Bklein7_Week_11#Identifying_the_Bordetella_Pertussis_MOD Week 11 Individual Journal Entry]. A preliminary draft of the genome sequencing paper that I will likely be presenting solo was uploaded there. Finally, I kept tabs on group members as the interim Project Manager. [[User:Bklein7|Bklein7]] ([[User talk:Bklein7|talk]]) 23:14, 16 November 2015 (PST)&lt;br /&gt;
**Lena Olufson (GenMAPP): This week Mahrad and I met up and analyzed the microarray paper together. We split up the PowerPoint into two halves; I did the introduction/significance of the study as well as the methods performed. Mahrad and I created our presentation together and worked through a Google Doc to edit it simultaneously as we discussed out loud. We also created a flow chart together that demonstrated the experimental design, so the same flow chart is included in both of our individual assignments. We made sure to check in with the temporary project manager and keep him updated on our progress. [[User:Lenaolufson|Lenaolufson]] ([[User talk:Lenaolufson|talk]]) 23:24, 16 November 2015 (PST) &lt;br /&gt;
**Mahrad Saeedi (GenMAPP): This week Lena and I worked on analyzing the microarray paper and creating an outline. We each defined 10 terms separately based upon words we didn&amp;#039;t recognize in the article. We then proceeded to produce the PowerPoint presentation for journal club. &lt;br /&gt;
[[User:Msaeedi23|Msaeedi23]] ([[User talk:Msaeedi23|talk]]) 23:46, 16 November 2015 (PST)&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**11/15- Lena &amp;amp; Mahrad met to work on outlining article and answering questions&lt;br /&gt;
**11/16- Lena &amp;amp; Mahrad met to prepare powerpoint presentation for journal club&lt;br /&gt;
&lt;br /&gt;
==Week 10==&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Goals&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** For all:&lt;br /&gt;
*** Create an annotated bibliography including one genome sequencing paper and two microarray experiments for Bordetella pertussis&lt;br /&gt;
*** Create/update team page &amp;amp; compile group annotated bibliography&lt;br /&gt;
*** Assignment due date: Midnight Tuesday, November 10&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Progress&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**All group members created annotated bibliographies and compiled them on the newly created group page.&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;Meetings!&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** Monday, November 9, 8pm-9pm, Seaver 120&lt;br /&gt;
&lt;br /&gt;
= Annotated Bibliography =&lt;br /&gt;
== Genome Sequencing Paper ==&lt;br /&gt;
&lt;br /&gt;
Neither of these papers is the &amp;#039;&amp;#039;first&amp;#039;&amp;#039; to report the genome sequence of &amp;#039;&amp;#039;B. pertussis.&amp;#039;&amp;#039;  The paper that you will want to use is [http://www.nature.com/ng/journal/v35/n1/full/ng1227.html this one].  I found it by looking at the introduction and references of the Zhang et al. (2011) paper.  For your Week 11 assignment, please remove your annotated bibliography entries for the two papers below and create one for this new paper by Parkhill et al. (2003).  You will use the Parkhill paper for your project.  &amp;#039;&amp;#039;&amp;amp;mdash; [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 09:54, 10 November 2015 (PST)&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
*Parkhill, J., Sebaihia, M., Preston, A., Murphy, L. D., et al. (2003). Comparative analysis of the genome sequences of Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica. Nature genetics, 35(1), 32-40. doi:10.1038/ng1227&lt;br /&gt;
* PubMed Abstract:  http://www.ncbi.nlm.nih.gov/pubmed/12910271&lt;br /&gt;
* PubMed Central:  Not available on PubMed Central.&lt;br /&gt;
* Publisher Full Text (HTML):  http://www.nature.com/ng/journal/v35/n1/full/ng1227.html&lt;br /&gt;
* Publisher Full Text (PDF):  http://www.nature.com/ng/journal/v35/n1/pdf/ng1227.pdf&lt;br /&gt;
* Copyright: ©2003 Nature Publishing Group (information found on PDF version of article). This article is not Open Access, but it is freely available 6 months after publication.&lt;br /&gt;
* Publisher: Nature Publishing Group (for-profit).&lt;br /&gt;
* Availability: In print and online.&lt;br /&gt;
* Did LMU pay a fee for this article: Yes, LMU pays a subscription fee for access to the journal &amp;#039;&amp;#039;Nature Genetics&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== Microarray Paper ==&lt;br /&gt;
&lt;br /&gt;
This paper is suitable for your project.  &amp;#039;&amp;#039;&amp;amp;mdash; [[User:Kdahlquist|Kdahlquist]] ([[User talk:Kdahlquist|talk]]) 10:04, 10 November 2015 (PST)&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Hoo, R., Lam, J.H., Huot, L., Pant, A., Li, R., Hot, D., &amp;amp; Alonso, S. (2014). Evidence for a Role of the Polysaccharide Capsule Transport Proteins in Pertussis Pathogenesis. PLoS ONE, 9(12):e115243. doi: 10.1371/journal.pone.0115243&lt;br /&gt;
* PubMed Abstract: http://www.ncbi.nlm.nih.gov/pubmed/25501560 &lt;br /&gt;
* PubMed Central: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4264864/&lt;br /&gt;
* Publisher Full Text (HTML): http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0115243&lt;br /&gt;
* Publisher Full Text (PDF): http://www.plosone.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pone.0115243&amp;amp;representation=PDF&lt;br /&gt;
* Copyright: © 2014 Hoo et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited (info found [http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0115243 here]).&lt;br /&gt;
* Publisher: PLOS ONE (respected open access organization).&lt;br /&gt;
* Availability: Online only.&lt;br /&gt;
* Did LMU pay a fee for this article: No.&lt;br /&gt;
* Web site where the data resides: [http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62088 NCBI GEO data]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--#2&lt;br /&gt;
*Brickman, T. J., Cummings, C. A., Liew, S.-Y., Relman, D. A., &amp;amp; Armstrong, S. K. (2011). Transcriptional Profiling of the Iron Starvation Response in Bordetella pertussis Provides New Insights into Siderophore Utilization and Virulence Gene Expression . Journal of Bacteriology, 193(18), 4798–4812. http://doi.org/10.1128/JB.05136-11&lt;br /&gt;
* ArrayExpress Abstract: https://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-3263/?keywords=&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC532028/&lt;br /&gt;
*PubMed Abstract: http://www.ncbi.nlm.nih.gov/pubmed/?term=Transcriptional+Profiling+of+the+Iron+Starvation+Response+in+Bordetella+pertussis+Provides+New+Insights+into+Siderophore+Utilization+and+Virulence+Gene+Expression&lt;br /&gt;
* Publisher Full Text (HTML): http://jb.asm.org/content/193/18/4798.full&lt;br /&gt;
* Publisher Full Text (PDF):  http://jb.asm.org/content/193/18/4798.full.pdf+html &lt;br /&gt;
* Copyright:  2011 by the American Society for Microbiology &lt;br /&gt;
* Publisher:  Journal of Bacteriology &lt;br /&gt;
* Availability:  in print and online&lt;br /&gt;
* Did LMU pay a fee for this article: yes&lt;br /&gt;
*Link to where the microarray data resides: https://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-3263/?keywords=&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&lt;br /&gt;
&lt;br /&gt;
#3&lt;br /&gt;
King, A. J., van der Lee, S., Mohangoo, A., van Gent, M., van der Ark, A., &amp;amp; van de Waterbeemd, B. (2013). Genome-Wide Gene Expression Analysis of Bordetella pertussis Isolates Associated with a Resurgence in Pertussis: Elucidation of Factors Involved in the Increased Fitness of Epidemic Strains. PLoS ONE, 8(6): e66150. doi: 10.1371/journal.pone.0066150&lt;br /&gt;
* PubMed Abstract:  http://www.ncbi.nlm.nih.gov/pubmed/23776625&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3679012/&lt;br /&gt;
* Publisher Full Text (HTML):  http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066150&lt;br /&gt;
* Publisher Full Text (PDF):  http://www.plosone.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pone.0066150&amp;amp;representation=PDF&lt;br /&gt;
* Copyright: © 2013 King et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. (info found [http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066150 here])&lt;br /&gt;
* Publisher: PLOS ONE (respected open access organization)&lt;br /&gt;
* Availability: online only&lt;br /&gt;
* Did LMU pay a fee for this article: no&lt;br /&gt;
* Web site where the data resides: [http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1594/samples/?keywords=Bordetella+pertussis&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&amp;amp;s_page=1&amp;amp;s_pagesize=100 EBI ArrayExpress Data]&lt;br /&gt;
&lt;br /&gt;
#4 (note, all of the papers from this point on involve additional species other than Bordetella pertussis)&lt;br /&gt;
&lt;br /&gt;
* Cummings, C. A., Bootsma, H. J., Relman, D. A., &amp;amp; Miller, J. F. (2006). Species-and strain-specific control of a complex, flexible regulon by Bordetella BvgAS. Journal of bacteriology, 188(5), 1775-1785.&lt;br /&gt;
* ArrayExpress Abstract: https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-29/?keywords=&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1426559/&lt;br /&gt;
*PubMed Abstract:http://www.ncbi.nlm.nih.gov/pubmed/?term=Species-+and+Strain-Specific+Control+of+a+Complex%2C+Flexible+Regulon+by+Bordetella+BvgAS&lt;br /&gt;
* Publisher Full Text (HTML): http://jb.asm.org/content/188/5/1775.full&lt;br /&gt;
* Publisher Full Text (PDF):  http://jb.asm.org/content/188/5/1775.full.pdf+html&lt;br /&gt;
* Copyright:  2006 by the American Society for Microbiology &lt;br /&gt;
* Publisher:  Journal of Bacteriology &lt;br /&gt;
* Availability:  in print and online&lt;br /&gt;
* Did LMU pay a fee for this article: yes&lt;br /&gt;
*Link to where the microarray data resides: https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-29/?keywords=&amp;amp;organism=Bordetella+pertussis&amp;amp;exptype%5B%5D=%22rna+assay%22&amp;amp;exptype%5B%5D=%22array+assay%22&amp;amp;array=&lt;br /&gt;
&lt;br /&gt;
#5&lt;br /&gt;
&lt;br /&gt;
* Brinig, M., Register, K., Ackermann, M., &amp;amp; Relman, D. (2006). Genomic features of Bordetella parapertussis clades with distinct host species specificity. Genome Biology, 7(9). doi:10.1186/gb-2006-7-9-r81&lt;br /&gt;
* PubMed Abstract:  http://www.ncbi.nlm.nih.gov/pubmed/16956413?dopt=Abstract&amp;amp;holding=f1000,f1000m,isrctn&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1794550/&lt;br /&gt;
* Publisher Full Text (HTML):  http://www.genomebiology.com/2006/7/9/R81&lt;br /&gt;
* Publisher Full Text (PDF):  http://www.genomebiology.com/content/pdf/gb-2006-7-9-r81.pdf&lt;br /&gt;
* Copyright: Brinig et al.; licensee BioMed Central Ltd. (information found on the article); open access&lt;br /&gt;
* Publisher:  BioMed Central Ltd (for-profit publisher)&lt;br /&gt;
* Availability:  online&lt;br /&gt;
* Did LMU pay a fee for this article: no&lt;br /&gt;
# What experiment was performed?  What was the &amp;quot;treatment&amp;quot; and what was the &amp;quot;control&amp;quot; in the experiment?&lt;br /&gt;
# Were replicate experiments of the &amp;quot;treatment&amp;quot; and &amp;quot;control&amp;quot; conditions conducted?  Were these biological or technical replicates?  How many of each?&lt;br /&gt;
* Link to microarray data: https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-95/&lt;br /&gt;
&lt;br /&gt;
#6&lt;br /&gt;
&lt;br /&gt;
* Cummings, C., Bootsma, H., Relman, D., &amp;amp; Miller, J. (2006). Species- and Strain-Specific Control of a Complex, Flexible Regulon by Bordetella BvgAS. Journal of Bacteriology, 188(5), 1775-1785. doi: 10.1128/JB.188.5.1775-1785.2006&lt;br /&gt;
* PubMed Abstract: http://www.ncbi.nlm.nih.gov/pubmed/16484188?dopt=Abstract&lt;br /&gt;
* PubMed Central:  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1426559/&lt;br /&gt;
* Publisher Full Text (HTML):  http://jb.asm.org/content/188/5/1775.full&lt;br /&gt;
* Publisher Full Text (PDF):  http://jb.asm.org/content/188/5/1775.full.pdf&lt;br /&gt;
* Copyright: American Society for Microbiology; open access&lt;br /&gt;
* Publisher:  American Society for Microbiology (professional organization for scientists)&lt;br /&gt;
* Availability:  online and in print&lt;br /&gt;
* Did LMU pay a fee for this article: no&lt;br /&gt;
# What experiment was performed?  What was the &amp;quot;treatment&amp;quot; and what was the &amp;quot;control&amp;quot; in the experiment?&lt;br /&gt;
# Were replicate experiments of the &amp;quot;treatment&amp;quot; and &amp;quot;control&amp;quot; conditions conducted?  Were these biological or technical replicates?  How many of each?&lt;br /&gt;
* Link to microarray data: https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-28/&lt;br /&gt;
&lt;br /&gt;
#7&lt;br /&gt;
&lt;br /&gt;
*King, A. J., van Gorkom, T., Pennings, J. L., van der Heide, H. G., He, Q., Diavatopoulos, D., … Mooi, F. R. (2010). Correction: Comparative genomic profiling of Dutch clinical Bordetella pertussis isolates using DNA microarrays: identification of genes absent from epidemic strains. BMC Genomics, 11, 196. http://doi.org/10.1186/1471-2164-11-196&lt;br /&gt;
* PubMed Abstract:  http://www.biomedcentral.com/1471-2164/9/311#abs&lt;br /&gt;
* PubMed Central: &amp;lt;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2481270/&amp;gt;&lt;br /&gt;
* Publisher Full Text (HTML):  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2481270/&lt;br /&gt;
* Publisher Full Text (PDF): &amp;lt;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2481270/pdf/1471-2164-9-311.pdf&amp;gt;&lt;br /&gt;
* Copyright:  © 2008 King et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.&lt;br /&gt;
* Publisher: BMC Genomics&lt;br /&gt;
* Availability: online access&lt;br /&gt;
* Did LMU pay a fee for this article: no&lt;br /&gt;
&lt;br /&gt;
#8&lt;br /&gt;
&lt;br /&gt;
*Nakamura, M. M., Liew, S.-Y., Cummings, C. A., Brinig, M. M., Dieterich, C., &amp;amp; Relman, D. A. (2006). Growth Phase- and Nutrient Limitation-Associated Transcript Abundance Regulation in Bordetella pertussis  . Infection and Immunity, 74(10), 5537–5548. http://doi.org/10.1128/IAI.00781-06&lt;br /&gt;
* PubMed Abstract:  &amp;lt;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1594893/?report=reader#__abstractid499869title&amp;gt;&lt;br /&gt;
* PubMed Central: &amp;lt;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1594893/&amp;gt;&lt;br /&gt;
* Publisher Full Text (HTML): &amp;lt;http://iai.asm.org/content/74/10/5537.full&amp;gt;&lt;br /&gt;
* Publisher Full Text (PDF):  &amp;lt;http://iai.asm.org/content/74/10/5537.full.pdf+html&amp;gt;&lt;br /&gt;
* Copyright: © 2006, American Society for Microbiology&lt;br /&gt;
* Publisher: Infection and Immunity&lt;br /&gt;
* Availability: online access&lt;br /&gt;
* Did LMU pay a fee for this article: no!--&amp;gt;&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Template:Class_Whoopers&amp;diff=8075</id>
		<title>Template:Class Whoopers</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Template:Class_Whoopers&amp;diff=8075"/>
				<updated>2015-12-18T19:29:44Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: added team journal entry links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;GenMAPP Analysis of &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; Microarray Data&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
{{Gene Database Project Links}}&lt;br /&gt;
&lt;br /&gt;
===Journal Entries===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;width: 100%; text-align: center&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;5&amp;quot;|Class Whoopers Individual Journal Entries&lt;br /&gt;
|-&lt;br /&gt;
! [[User:Bklein7 | Brandon Klein]]&lt;br /&gt;
! [[Bklein7_Week_11|Week 11]]&lt;br /&gt;
! [[Bklein7_Week_12|Week 12]]&lt;br /&gt;
! [[Bklein7_Week_14|Week 14]]&lt;br /&gt;
! [[Bklein7_Week_15|Week 15]]&lt;br /&gt;
|-&lt;br /&gt;
! [[User:Lenaolufson | Lena Olufson]]&lt;br /&gt;
! [[lenaolufson Week 11|Week 11]]&lt;br /&gt;
! [[lenaolufson Week 12|Week 12]]&lt;br /&gt;
! [[lenaolufson Week 14|Week 14]]&lt;br /&gt;
! [[lenaolufson Week 15|Week 15]]&lt;br /&gt;
|-&lt;br /&gt;
! [[User:Msaeedi23 | Mahrad Saeedi]]&lt;br /&gt;
! [[Msaeedi23 Week 11| Week 11]]&lt;br /&gt;
! [[Msaeedi23 Week 12| Week 12]]&lt;br /&gt;
! [[Msaeedi23 Week 14| Week 14]]&lt;br /&gt;
! [[Msaeedi23 Week 15| Week 15]]&lt;br /&gt;
|-&lt;br /&gt;
! Team Entries&lt;br /&gt;
! [[#Week_11|Week 11]]&lt;br /&gt;
! [[#Week_12|Week 12]]&lt;br /&gt;
! [[#Week_14|Week 14]]&lt;br /&gt;
! [[#Week_15|Week 15]]&lt;br /&gt;
|}&lt;br /&gt;
=== Group Members ===&lt;br /&gt;
* Project Manager &amp;amp; Coder: [[User:Bklein7 | Brandon Klein]]&lt;br /&gt;
* Quality Assurance: [[User:Msaeedi23 | Mahrad Saeedi]]&lt;br /&gt;
* GenMAPP User: [[User:Lenaolufson | Lena Olufson]]&lt;br /&gt;
* [[The Class Whoopers | Team Page]]&lt;br /&gt;
&lt;br /&gt;
=== Team Weekly Assignments ===&lt;br /&gt;
* [[Week 10]] Creation of page and combined annotated bibliography (midnight 11/10)&lt;br /&gt;
* [[Week 11]] Uploaded the microarray presentation [[File: Microarray_Journal_Club_Presentation.pdf]] (midnight 11/17)&lt;br /&gt;
* [[Week 12]] (midnight 11/24)&lt;br /&gt;
* [[Week 14]] (midnight 12/8)&lt;br /&gt;
* [[Week 15]] (midnight 12/15)&lt;br /&gt;
&lt;br /&gt;
[[Category:Group Projects]] [[Category:Class Whoopers]]&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8074</id>
		<title>Gene Database Testing Report- cw20151210</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151210&amp;diff=8074"/>
				<updated>2015-12-18T19:22:01Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: replaced category with template&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
&lt;br /&gt;
For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[File:Testingreport cw20151210.zip]]&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip] tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*We downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW\bpertussis_cw20151210&amp;#039;&amp;#039; on our computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* We went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, we navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** We clicked on the &amp;quot;Download&amp;quot; button at the top of the page above and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, we navigated to the GOA file [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**This text file was opened automatically by the browser, so we had to download it manually.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* We downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
* We extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====Downloaded GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# We downloaded the custom version of GenMAPP Builder including the most recent version of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class (Version 3.0.0 Build 5 - cw20151210): [[File:Dist cw20151210.zip]].&lt;br /&gt;
# We extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* We launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, we created a new database: &amp;#039;&amp;#039;bpertussis_cw20151210_gmb3build5&amp;#039;&amp;#039;.&lt;br /&gt;
** We opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** We clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** We clicked on the Execute Query icon to run these commands.&lt;br /&gt;
***In viewing the schema for this database, we confirmed that there were 167 tables after running the above commands.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, we launched gmbuilder.bat.&lt;br /&gt;
* We selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*The downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; were imported into the database through the following steps, in order:&lt;br /&gt;
** We selected File &amp;gt; Import UniProt XML...&lt;br /&gt;
** We selected File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** We clicked OK on the message asking to process the GO data.&lt;br /&gt;
** We selected File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* We selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* We typed our coder&amp;#039;s name (Brandon Klein) into the owner field.&lt;br /&gt;
* We selected the custom profile &amp;quot;Bordetella pertussis, Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;.&lt;br /&gt;
* We checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, we clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5 - cw20151210&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151210_gmb3build5&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_12&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.88 minutes&lt;br /&gt;
** Note: The import time was similar to that when creating the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (2.59 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- December 10, 2015 (TIME?)&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 6.97 minutes &lt;br /&gt;
* Time taken to process: 4.52 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (7.08 minutes and 4.42 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151210.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 08-Dec-2015 02:45&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.03 minutes&lt;br /&gt;
** Note: The import time was very similar to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151203.gdb (0.04 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 1:19 AM&lt;br /&gt;
** End time: 2:11 AM&lt;br /&gt;
** Elapsed time: 52 minutes&lt;br /&gt;
Note: No interruptions occurred during the export process.&lt;br /&gt;
&lt;br /&gt;
===TallyEngine===&lt;br /&gt;
* We ran the TallyEngine in GenMAPP Builder and specified the following files:&lt;br /&gt;
**XML- [[File:Uniprot-proteome-UP000002676 cw20151210.zip]]&lt;br /&gt;
**GO- [[File:Go daily-termdb cw20151210.zip]]&lt;br /&gt;
*Results:&lt;br /&gt;
**[[File:TallyEngineResults cw20151210.png]]&lt;br /&gt;
***All TallyEngine results were consistent across both files.&lt;br /&gt;
***The TallyEngine was not customized to reflect the coding changes made to GenMAPP Builder Version 3.0.0 Build 5 - cw20151210.&lt;br /&gt;
****Therefore, the total count for &amp;quot;Ordered Locus Names&amp;quot; and &amp;quot;ORF&amp;quot; gene IDs remained 3446. The extra ID that was imported in this build, &amp;quot;BP3167A&amp;quot;, was not listed in either of these categories.&lt;br /&gt;
****&amp;#039;&amp;#039;&amp;#039;Further TallyEngine customization is necessary to raise the count to 3447 gene IDs.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
===Using XMLPipeDB Match to Validate the XML Results from the TallyEngine===&lt;br /&gt;
The following functions were performed using the Windows command line (cmd).&lt;br /&gt;
*We entered the project folder using the following command:&lt;br /&gt;
 cd /d T:\Bklein7_CW\bpertussis_cw20151210&lt;br /&gt;
*We used XMLPipeDB match to identify gene IDs in the UniProt XML file that conformed to the following patterns: &amp;quot;BP####&amp;quot;, &amp;quot;BP####.1&amp;quot;, &amp;quot;BP####A&amp;quot;, and &amp;quot;BP####B&amp;quot;. The command used was as follows:&lt;br /&gt;
 java -jar xmlpipedb-match-1.1.1.jar &amp;quot;BP[0-9][0-9][0-9][0-9](A|B|\.1|)&amp;quot; &amp;lt; &amp;quot;uniprot-proteome%3AUP000002676_cw20151201.xml&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Match Results:&lt;br /&gt;
*[[File:Xmlpipedbmatch cw20151203.png]]&lt;br /&gt;
**The number of unique matches generated by XMLPipeDB Match, 3447, matched our expectation. The count includes the total number of ordered locus (3435) and ORF (11) gene IDs along with the unique EnsemblBacteria reference ID &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
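&lt;br /&gt;
The counting step can be approximated as follows. This is a minimal Python sketch, not the actual implementation of XMLPipeDB Match: it assumes that the utility tallies every occurrence of the pattern and reports the number of distinct matches. The function name and the sample text are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import re
from collections import Counter

# Same pattern as in the command above; the trailing empty alternative
# lets a plain "BP####" ID match without any suffix.
GENE_ID = re.compile(r"BP[0-9][0-9][0-9][0-9](A|B|\.1|)")

def count_unique_ids(text):
    """Return (number of distinct matches, Counter of every match)."""
    tally = Counter(m.group(0) for m in GENE_ID.finditer(text))
    return len(tally), tally

# Hypothetical text standing in for the contents of the UniProt XML file.
sample = "BP0001 BP0001 BP3167.1 BP3167A BP0910B"
unique, tally = count_unique_ids(sample)
print(unique)
for gene_id, occurrences in sorted(tally.items()):
    print(gene_id, occurrences)
```
&lt;br /&gt;
Under the stated assumption, running the function on the full text of the UniProt XML file would give a first return value comparable to the unique-match total reported above.&lt;br /&gt;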
&lt;br /&gt;
===Using SQL Queries to Validate the PostgreSQL Database Results from the TallyEngine===&lt;br /&gt;
We used the SQL &amp;quot;union&amp;quot; operation to count the number of &amp;quot;ordered locus&amp;quot; gene IDs, which conform to the pattern &amp;quot;BP####&amp;quot;, in addition to all gene IDs that matched the patterns &amp;quot;BP####A&amp;quot; &amp;amp; &amp;quot;BP####B&amp;quot; (including 11 &amp;quot;ORF&amp;quot; gene IDs and 1 EnsemblBacteria reference ID):&lt;br /&gt;
&lt;br /&gt;
 select count(value) from (&lt;br /&gt;
   select value from genenametype where type = &amp;#039;ordered locus&amp;#039;&lt;br /&gt;
   union&lt;br /&gt;
   select value from propertytype&lt;br /&gt;
     inner join dbreferencetype on (propertytype.dbreferencetype_property_hjid = dbreferencetype.hjid)&lt;br /&gt;
     where dbreferencetype.type = &amp;#039;EnsemblBacteria&amp;#039; and propertytype.type = &amp;#039;gene ID&amp;#039;&lt;br /&gt;
       and propertytype.value ~ &amp;#039;BP[0-9][0-9][0-9][0-9](A|B)&amp;#039;&lt;br /&gt;
 ) as combined;&lt;br /&gt;
&lt;br /&gt;
Note: This query was crafted by [[User:Dondi|Dr. Dionisio]].&lt;br /&gt;
&lt;br /&gt;
Results:&lt;br /&gt;
*[[File:PostgreSQL Count cw20151210.png]]&lt;br /&gt;
* The number of unique matches yielded by this SQL query, 3447, matched the count generated by XMLPipeDB Match. Thus, the locations of all 3447 gene IDs in the PostgreSQL relational database were accounted for here.&lt;br /&gt;
&lt;br /&gt;
===OriginalRowCounts Comparison===&lt;br /&gt;
&lt;br /&gt;
We opened the gene database file [[File:Bpertussis-std_cw20151210.zip]] in  Microsoft Access and assessed the &amp;quot;OriginalRowCounts&amp;quot; table to see if the expected tables were listed with the expected number of records. The contents of this table were compared to the &amp;#039;&amp;#039;OriginalRowCounts&amp;#039;&amp;#039; table of an existing .gdb file created during Week 9.&lt;br /&gt;
 &lt;br /&gt;
Benchmark .gdb file: [[File:Vc-Std 20151027 TR.gdb]]&lt;br /&gt;
&lt;br /&gt;
&amp;quot;OriginalRowCounts&amp;quot; table from the benchmark and new gdb:&lt;br /&gt;
*[[File:ComparisonToBenchmark cw20151210.PNG]]&lt;br /&gt;
**All 52 tables present in the 2015 &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; database were also present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; gene database, &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039;. This confirmed that all expected tables were successfully created.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; table count is listed as 3447. &amp;#039;&amp;#039;&amp;#039;This count demonstrates that the missing ID, &amp;quot;BP3167A&amp;quot;, was successfully added to the export (confirmed below).&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
***[[File:BP3167A Confirmed cw20151210.PNG]]&lt;br /&gt;
&lt;br /&gt;
Note: The &amp;quot;OriginalRowCounts&amp;quot; tables were too large to screenshot. To circumvent this problem and facilitate the comparison, we copied the &amp;quot;OriginalRowCounts&amp;quot; tables from both gene databases into an Excel file and zoomed out. The above screenshot was taken from this Excel file. The &amp;quot;OrderedLocusNames&amp;quot; row count for &amp;#039;&amp;#039;bpertussis-std_cw20151210&amp;#039;&amp;#039; is highlighted in yellow.&lt;br /&gt;
&lt;br /&gt;
===Visual Inspection===&lt;br /&gt;
We visually inspected individual tables within [[File:Bpertussis-std_cw20151210.zip]] using Microsoft Access.&lt;br /&gt;
&lt;br /&gt;
*Systems Table&lt;br /&gt;
**35 gene ID systems were listed, 11 of which were used in the creation of this .gdb file and listed the appropriate import date (12/10/2015).&lt;br /&gt;
***All gene ID systems relevant to &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; were listed. These include: EMBL, EnsemblBacteria, GeneID, GeneOntology, InterPro, OrderedLocusNames, Pfam, RefSeq, and UniProt.&lt;br /&gt;
***This result corresponded with that of the benchmark .gdb file listed in the &amp;quot;OriginalRowCounts Comparison&amp;quot; section.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; listing properly displayed customizations to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; species profile.&lt;br /&gt;
***In this row, the species was listed correctly as &amp;quot;Bordetella pertussis&amp;quot;.&lt;br /&gt;
***In this row, the link corresponded to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; database at GeneDB. The link was as follows: http://www.genedb.org/gene/~;jsessionid=A06A0EFE93C64E476380393D4CBEFA69?actionName=%2FQuery%2FquickSearch&amp;amp;resultsSize=1&amp;amp;taxonNodeName=Bpertussis.&lt;br /&gt;
*UniProt Table&lt;br /&gt;
**This table contained 3258 entries with 6-character IDs.&lt;br /&gt;
**All IDs in the UniProt table conformed to the following pattern:&lt;br /&gt;
*** [[File:UniProt Ascension Number info.PNG]]&lt;br /&gt;
*RefSeq Table&lt;br /&gt;
**This table contained 6627 entries. All IDs began with one of three prefixes: &amp;quot;NP_&amp;quot;, &amp;quot;YP_&amp;quot;, or &amp;quot;WP_&amp;quot;. The meanings of these prefixes are given in the [http://www.ncbi.nlm.nih.gov/books/NBK50679/ RefSeq documentation].&lt;br /&gt;
***&amp;quot;NP_&amp;quot; and &amp;quot;YP_&amp;quot; Prefixes&lt;br /&gt;
****These prefixes refer to proteins. There were 3410 &amp;quot;NP_&amp;quot; IDs and 7 &amp;quot;YP_&amp;quot; IDs.&lt;br /&gt;
***&amp;quot;WP_&amp;quot; Prefix&lt;br /&gt;
****This prefix refers to &amp;quot;autonomous non-redundant proteins that are not yet directly annotated on a genome&amp;quot;. There were 3210 IDs with the &amp;quot;WP_&amp;quot; prefix.&lt;br /&gt;
***Overall, every entry in the ID column was an expected value.&lt;br /&gt;
*OrderedLocusNames Table&lt;br /&gt;
**This table contained 3447 entries (consistent with the XMLPipeDB Match result).&lt;br /&gt;
**The IDs were copied into an Excel document for analysis:&lt;br /&gt;
***3434 IDs conformed to the pattern &amp;quot;BP####&amp;quot;.&lt;br /&gt;
***11 IDs conformed to the pattern &amp;quot;BP####A&amp;quot;.&lt;br /&gt;
****This included 10 ORF gene IDs &amp;amp; &amp;quot;BP3167A&amp;quot; (reference to an EnsemblBacteria ID).&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####B&amp;quot;.&lt;br /&gt;
****This corresponded to an ORF gene ID.&lt;br /&gt;
***1 ID exhibited the pattern &amp;quot;BP####.1&amp;quot;.&lt;br /&gt;
****This ID is the form in which UniProt lists &amp;quot;BP3167A&amp;quot;.&lt;br /&gt;
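&lt;br /&gt;
The pattern breakdown that we carried out in Excel can also be sketched in code. The Python example below is an illustrative assumption rather than the procedure we actually used: the pattern labels follow the list above, while the function names and sample IDs are hypothetical. Any ID that fits none of the four shapes is placed in an &amp;quot;unexpected&amp;quot; bucket so that it stands out.&lt;br /&gt;
&lt;br /&gt;
```python
import re
from collections import Counter

# Expected ID shapes, taken from the breakdown above.
# Each pattern must match the whole ID (fullmatch), so the buckets do not overlap.
PATTERNS = {
    "BP####": re.compile(r"BP[0-9]{4}"),
    "BP####A": re.compile(r"BP[0-9]{4}A"),
    "BP####B": re.compile(r"BP[0-9]{4}B"),
    "BP####.1": re.compile(r"BP[0-9]{4}\.1"),
}

def classify(gene_id):
    """Return the label of the pattern the whole ID matches, else 'unexpected'."""
    for label, pattern in PATTERNS.items():
        if pattern.fullmatch(gene_id):
            return label
    return "unexpected"

def tally_patterns(gene_ids):
    """Count how many IDs fall into each pattern bucket."""
    return Counter(classify(g) for g in gene_ids)

# Hypothetical sample; the real input would be the ID column of the table.
sample_ids = ["BP0001", "BP0002", "BP0003A", "BP3167A", "BP0004B", "BP0005.1", "XY0001"]
for label, count in tally_patterns(sample_ids).items():
    print(label, count)
```
&lt;br /&gt;
For the full table, the four pattern counts should sum to the total number of entries, and the &amp;quot;unexpected&amp;quot; bucket should be empty.&lt;br /&gt;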
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==bpertussis-std_cw20151210.gdb Use in GenMAPP==&lt;br /&gt;
&lt;br /&gt;
The following analysis was conducted in GenMAPP Version 2.1. Within GenMAPP, the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database was loaded by selecting Data &amp;gt; Choose Gene Database and then selecting the file &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
===Putting a Gene on the MAPP Using the GeneFinder Window===&lt;br /&gt;
&lt;br /&gt;
We made a sample MAPP in which gene IDs conforming to the naming conventions of the 5 major gene databases containing &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; genome data were added. A screenshot of the resulting MAPP is provided below:&lt;br /&gt;
[[File:Samplegenemapp.png]]&lt;br /&gt;
*Gene IDs:&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;bp1123&amp;#039;&amp;#039;&amp;#039; refers to the OrderedLocusNames gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;CAE43716&amp;#039;&amp;#039;&amp;#039; refers to the EnsemblBacteria gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;Q7VWE5&amp;#039;&amp;#039;&amp;#039; refers to the UniProt gene ID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;2665491&amp;#039;&amp;#039;&amp;#039; refers to the GeneID system.&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;NP_881255&amp;#039;&amp;#039;&amp;#039; refers to the RefSeq gene ID system.&lt;br /&gt;
&lt;br /&gt;
Note: Gene IDs tested from the above gene ID systems all had complete Backpages and were successfully placed on the MAPP.&lt;br /&gt;
&lt;br /&gt;
===Creating an Expression Dataset in the Expression Dataset Manager===&lt;br /&gt;
The file [[File:Bpertussis compiledrawdata cw20151208.txt]] was used to create an expression dataset in GenMAPP.&lt;br /&gt;
&lt;br /&gt;
*Total Number of Gene IDs Imported&lt;br /&gt;
** 3211 of the 3552 gene IDs from the microarray dataset were imported into the expression dataset.&lt;br /&gt;
**There were 341 exceptions during the creation of the expression dataset. A screenshot of the error message is shown here: &lt;br /&gt;
***[[File:Errors in genmapp.png]]&lt;br /&gt;
*Investigating Errors in the Exceptions File (EX.txt)&lt;br /&gt;
**All 341 exceptions triggered the following error message: &amp;quot;Gene not found in OrderedLocusNames or any related system.&amp;quot;&lt;br /&gt;
**Gene IDs that triggered this error message conformed to the patterns &amp;quot;BP####&amp;quot; and &amp;quot;BP####A&amp;quot;, indicating that the errors were not caused by an unusual gene ID pattern.&lt;br /&gt;
***Example gene IDs that triggered this error are the following: BP0101, BP1677, BP0910A, and BP2029A.&lt;br /&gt;
****Searching for any of these gene IDs in UniProt returns the message &amp;quot;Sorry, no results found for your search term.&amp;quot;:&lt;br /&gt;
*****[[File:ErroneousID Uniprot cw20151210.PNG]]&lt;br /&gt;
***The 341 gene IDs were copied into a new Excel file and compared to the gene IDs present in the file [[File:Bpertussis-std_cw20151210.zip]] (adapted from the &amp;quot;OrderedLocusNames&amp;quot; table in Microsoft Access).&lt;br /&gt;
****None of the 341 gene IDs were present in the .gdb file.&lt;br /&gt;
***The 341 gene IDs were each individually searched for in UniProt.&lt;br /&gt;
****None of the 341 gene IDs retrieved results in UniProt.&lt;br /&gt;
**&amp;#039;&amp;#039;&amp;#039;Conclusion: None of the gene IDs that triggered errors were present in the original UniProt XML file.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
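&lt;br /&gt;
The comparison between the exception IDs and the gene database IDs was done by hand in Excel. A minimal Python sketch of the same check is given below, assuming both ID lists have already been read into memory. The function name is hypothetical, the exception IDs are the four examples quoted above, and the database IDs are made-up placeholders.&lt;br /&gt;
&lt;br /&gt;
```python
def split_by_presence(exception_ids, database_ids):
    """Split exception IDs into those absent from and those present in the database."""
    known = set(database_ids)
    absent = sorted(g for g in exception_ids if g not in known)
    present = sorted(g for g in exception_ids if g in known)
    return absent, present

# Example exception IDs quoted in the report (a stand-in for the full EX.txt list).
exceptions = ["BP0101", "BP1677", "BP0910A", "BP2029A"]
# Hypothetical placeholder for the ID column of the OrderedLocusNames table.
gdb_ids = ["BP0001", "BP0002", "BP3167A"]

absent, present = split_by_presence(exceptions, gdb_ids)
print("absent from database:", absent)
print("present in database:", present)
```
&lt;br /&gt;
An empty second list corresponds to the outcome described above, in which none of the exception IDs were found in the .gdb file.&lt;br /&gt;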
&lt;br /&gt;
===Coloring a MAPP with Expression Data===&lt;br /&gt;
&lt;br /&gt;
====Creating a New Color Set====&lt;br /&gt;
We customized the new Expression Dataset by creating a new color set entitled &amp;quot;LogFoldChange&amp;quot;.&lt;br /&gt;
# First, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;increase&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Increased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as red using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;gt; 0.25 AND [Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;.&lt;br /&gt;
#Second, we created a criterion for this color set to label genes that demonstrated a significant &amp;#039;&amp;#039;decrease&amp;#039;&amp;#039; in their expression.&lt;br /&gt;
#*We specified the gene value as &amp;quot;Avg_ABC_Samples&amp;quot; for the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; microarray dataset.&lt;br /&gt;
#*We activated the &amp;#039;&amp;#039;Criteria Builder&amp;#039;&amp;#039; by clicking the &amp;#039;&amp;#039;New&amp;#039;&amp;#039; button and named the criterion &amp;quot;Decreased&amp;quot;.&lt;br /&gt;
#*We selected the color for this criterion as green using the color box.&lt;br /&gt;
#*We stated the criterion as follows and added it to the Criteria List: &amp;lt;code&amp;gt;[Avg_ABC_Samples] &amp;lt; -0.25 AND [Pvalue] &amp;lt; 0.05&amp;lt;/code&amp;gt;&lt;br /&gt;
# Finally, after entering these criteria, we saved the entire Expression Dataset by selecting Save from the Expression Dataset menu. This updated our .gex file with the new Color Set.&lt;br /&gt;
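&lt;br /&gt;
The two criteria can be summarized in code. The following Python sketch is only an illustration of the logic, not part of GenMAPP: the function name and parameter names are hypothetical, the cutoffs are the ones entered in the Criteria Builder, and the use of grey for genes that meet neither criterion is an assumption based on how unchanged genes appear on our MAPPs.&lt;br /&gt;
&lt;br /&gt;
```python
def color_for(avg_log_fold, p_value, fold_cutoff=0.25, p_cutoff=0.05):
    """Return the color given by the two criteria, or grey when neither is met."""
    significant = p_cutoff > p_value              # [Pvalue] below 0.05
    if significant and avg_log_fold > fold_cutoff:
        return "red"                              # "Increased" criterion
    if significant and -fold_cutoff > avg_log_fold:
        return "green"                            # "Decreased" criterion
    return "grey"                                 # assumed color when no criterion is met

# Hypothetical (Avg_ABC_Samples, Pvalue) pairs for illustration.
for fold, p in [(0.80, 0.01), (-0.60, 0.03), (0.80, 0.20), (0.10, 0.01)]:
    print(fold, p, color_for(fold, p))
```
&lt;br /&gt;
Both comparisons are strict, so a gene whose average log fold change is exactly 0.25 or -0.25 meets neither criterion.&lt;br /&gt;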
&lt;br /&gt;
Screenshot of Color Set criteria:&lt;br /&gt;
*[[File:Expressioncolorset.png]]&lt;br /&gt;
&lt;br /&gt;
Note: No errors were encountered in the creation of the Color Set.&lt;br /&gt;
&lt;br /&gt;
====Creating a Pathway-Based MAPP Using Colored Genes====&lt;br /&gt;
====Ribosome Kegg Pathway====&lt;br /&gt;
* We created a MAPP of the ribosome pathway using the genes listed on the KEGG website (http://www.genome.jp/kegg/).&lt;br /&gt;
** On the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Ribosome&amp;quot; under section 2.2 Translation and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected Bordetella pertussis Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the ribosome pathway with the gene IDs that pertained to our organism. We then created a MAPP from these genes in GenMAPP.&lt;br /&gt;
** Each of the genes highlighted in green on the KEGG ribosome pathway was added to the MAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151213&amp;quot; was then applied to color-code the genes.&lt;br /&gt;
**Here is a screenshot of the final MAPP created for the ribosome pathway:&lt;br /&gt;
* [[File:RibosomeGenMAPP.png]]&lt;br /&gt;
** Most of the ribosome genes on this MAPP were colored green, indicating a significant decrease in expression; the remaining genes were grey, meaning that their expression did not change significantly in this experiment. Every ribosome gene that changed significantly therefore decreased in expression during the microarray experiment. Ribosomes play a key role in translation, and without them transcripts cannot be translated into functional proteins. The microarray experiment analysis revealed that the absence of a membrane-associated protein named KpsT in B. pertussis resulted in global down-regulation of gene expression, including key virulence genes. The decreased expression of the genes in the ribosome pathway fits this pattern and links the translation process to the down-regulated genes identified in the experiment.&lt;br /&gt;
&lt;br /&gt;
====Nitrogen Cycle Kegg Pathway====&lt;br /&gt;
* We also created a second MAPP using the nitrogen metabolism pathway genes listed on the KEGG website (http://www.genome.jp/kegg/).&lt;br /&gt;
** On the website, we selected KEGG PATHWAY from the main page.&lt;br /&gt;
** Next, we scrolled down to &amp;quot;Nitrogen Metabolism&amp;quot; under section 1.2 Energy Metabolism and selected it.&lt;br /&gt;
** Then, we searched for our organism in the drop-down menu at the top of the page, selected Bordetella pertussis Tohama I, and clicked &amp;quot;Go&amp;quot;.&lt;br /&gt;
** This led us to a page showing the nitrogen metabolism pathway with the gene IDs that pertained to our organism. We then created a MAPP from these genes in GenMAPP.&lt;br /&gt;
** Each of the genes highlighted in green on the KEGG nitrogen metabolism pathway was added to the MAPP by entering its gene ID and the name given in the KEGG pathway. The expression dataset &amp;quot;bpertussis_expressiondataset_cw20151213&amp;quot; was then applied to color-code the genes.&lt;br /&gt;
** Here is a screenshot of the final MAPP created for the nitrogen cycle pathway:&lt;br /&gt;
* [[File:NitrogencycleGenMAPP.png]]&lt;br /&gt;
** This MAPP displayed both red and green genes: green genes decreased in expression and red genes increased, while a couple of grey genes met neither criterion. We created this MAPP because nitrogen metabolism is one of the metabolic processes that keep cells alive and reproducing. The genes shown in red had increased expression during the microarray experiment, and the KEGG nitrogen metabolism pathway shows that these genes specifically aid in the metabolism of glutamate. Glutamate is important to cells because it helps provide the energy they need to operate correctly. Since the glutamate-related genes that we mapped were increased, this suggests that glutamate may help supply the energy that allows Bordetella pertussis strains to produce the polysaccharide capsule transport proteins studied in the microarray experiment.&lt;br /&gt;
&lt;br /&gt;
===Running MAPPFinder===&lt;br /&gt;
*MAPPFinder Procedure&lt;br /&gt;
** We launched the MAPPFinder program from within GenMAPP and ensured that the &amp;#039;&amp;#039;bpertussis-std_cw20151210.gdb&amp;#039;&amp;#039; gene database was still loaded into GenMAPP.&lt;br /&gt;
** We clicked on the button &amp;quot;Calculate New Results&amp;quot; followed by &amp;quot;Find File&amp;quot;, at which point we specified the .gex file updated during the creation of the &amp;quot;LogFoldChange&amp;quot; color set.&lt;br /&gt;
** We chose to apply both the &amp;quot;Increased&amp;quot; and &amp;quot;Decreased&amp;quot; criteria present within the LogFoldChange color set to the data.&lt;br /&gt;
** We checked the boxes next to &amp;quot;Gene Ontology&amp;quot; and &amp;quot;p value&amp;quot;, specified the results file, and then clicked &amp;quot;Run MAPPFinder&amp;quot;.&lt;br /&gt;
***This analysis took several minutes to complete.&lt;br /&gt;
*MAPPFinder Analysis Results&lt;br /&gt;
**We selected &amp;quot;Show Ranked List&amp;quot; to see a list of the most significant Gene Ontology terms. A screenshot of this output is shown below:&lt;br /&gt;
**[[File:Mappfinderrankedlist.png]]&lt;br /&gt;
***The majority of the most significant gene ontology terms pertained to ribosome biosynthesis and translation.&lt;br /&gt;
&lt;br /&gt;
Note: The MAPPFinder analysis took approximately 8 minutes to complete. No errors were encountered in the process. MAPPFinder thus was confirmed to work with the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database.&lt;br /&gt;
&lt;br /&gt;
=== Compare Gene Database to Outside Resource===&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151203&amp;diff=8073</id>
		<title>Gene Database Testing Report- cw20151203</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151203&amp;diff=8073"/>
				<updated>2015-12-18T19:21:24Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: replaced category with template&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
&lt;br /&gt;
For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[]]&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip] tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*I downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW\bpertussis_cw20151203&amp;#039;&amp;#039; on my computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* I went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, I navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** I clicked on the &amp;quot;Download&amp;quot; button at the top of the page above and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**I extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, I navigated to the GOA file [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**My browser opened this text file automatically instead of downloading it, so I had to save the file manually.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* I downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
*I extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====Downloaded GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# I downloaded the custom version of GenMAPP Builder including the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class expanded to include ORF listings in exports (Version 3.0.0 Build 5 - cw20151203): [[File:Dist cw20151203.zip]].&lt;br /&gt;
# I extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* I launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, I created a new database: &amp;#039;&amp;#039;bpertussis_cw20151201_gmb3build5&amp;#039;&amp;#039;.&lt;br /&gt;
** I opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** I clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** I clicked on the Execute Query icon to run this command.&lt;br /&gt;
***In viewing the schema for this database, I confirmed that there were 167 tables after running the above command.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, I launched gmbuilder.bat.&lt;br /&gt;
* I selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151201_gmb3build5&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*The downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; were specified and imported into the database by clicking on the following buttons:&lt;br /&gt;
** Selected File &amp;gt; Import UniProt XML...&lt;br /&gt;
** Selected File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** Clicked OK to the message asking to process the GO data.&lt;br /&gt;
** Selected File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* I selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* I typed my name in the owner field (Brandon Klein).&lt;br /&gt;
* I selected &amp;quot;Bordetella pertussis (strain Tohama I/ATCC BAA-589/NCTC 13251), Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151203&amp;#039;&amp;#039;.&lt;br /&gt;
* I checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, I clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5 - cw20151203&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151201_gmb3build5&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151201.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_11&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.59 minutes&lt;br /&gt;
** Note: The import time was nearly equivalent to that when creating the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151119.gdb (2.60 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151201.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- December 01, 2015, 2:21:31 AM&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 7.08 minutes &lt;br /&gt;
* Time taken to process: 4.42 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151119.gdb (6.99 minutes and 4.48 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151201.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 11/10/15 1:39:00 PM&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.04 minutes&lt;br /&gt;
** Note: The import time was equal to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151119.gdb. No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151203.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 4:02 PM&lt;br /&gt;
** End time: 4:56 PM&lt;br /&gt;
** Elapsed time: 54 minutes&lt;br /&gt;
Note: No interruptions occurred during the export process.&lt;br /&gt;
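&lt;br /&gt;
The elapsed export time can be double-checked from the recorded start and end times. The following is a minimal Python sketch, not part of the original procedure; the function name elapsed_minutes is hypothetical, and it assumes both times fall on the same day.&lt;br /&gt;

```python
from datetime import datetime

def elapsed_minutes(start, end):
    """Return whole minutes between two same-day clock times such as '4:02 PM'."""
    fmt = "%I:%M %p"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)

# Start and end times as recorded for the export.
print(elapsed_minutes("4:02 PM", "4:56 PM"))  # 54
```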
&lt;br /&gt;
===TallyEngine===&lt;br /&gt;
* I ran the TallyEngine in GenMAPP Builder and specified the following files:&lt;br /&gt;
**XML- [[File:Uniprot-proteome-UP000002676 cw20151201.zip]]&lt;br /&gt;
**GO- [[File:Go daily-termdb cw20151201.zip]]&lt;br /&gt;
*Results:&lt;br /&gt;
**[[File: Tallyenginecustomization_cw20151203.png]]&lt;br /&gt;
***All tally results were consistent across both files.&lt;br /&gt;
***Further, the tally results reflect the customization we made to the TallyEngine, listing all 11 ORF genes present in the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database.&lt;br /&gt;
&lt;br /&gt;
===Using XMLPipeDB Match to Validate the XML Results from the TallyEngine===&lt;br /&gt;
The following functions were performed using the Windows command line (cmd).&lt;br /&gt;
*I entered my project folder using the following command:&lt;br /&gt;
 cd /d T:\Bklein7_CW\bpertussis_cw20151203&lt;br /&gt;
*I used XMLPipeDB match to identify matches of gene IDs in the UniProt XML file that conformed to the following patterns: &amp;quot;BP####&amp;quot;, &amp;quot;BP####.1&amp;quot;, &amp;quot;BP####A&amp;quot;, and &amp;quot;BP####B&amp;quot;. The command used was as follows:&lt;br /&gt;
 java -jar xmlpipedb-match-1.1.1.jar &amp;quot;BP[0-9][0-9][0-9][0-9](A|B|\.1|)&amp;quot; &amp;lt; &amp;quot;uniprot-proteome%3AUP000002676_cw20151201.xml&amp;quot;&lt;br /&gt;
&lt;br /&gt;
Match Results:&lt;br /&gt;
*[[File:Xmlpipedbmatch cw20151203.png]]&lt;br /&gt;
**The number of unique matches generated by XMLPipeDB Match, 3447, was one higher than expected.&lt;br /&gt;
**To identify the extra gene ID, the XMLPipeDB Match output was compared to the &amp;quot;OrderedLocusNames&amp;quot; table present in the .gdb file. The discrepant ID was &amp;#039;&amp;#039;&amp;#039;BP3167A&amp;#039;&amp;#039;&amp;#039;, listed with the type &amp;quot;gene ID&amp;quot; in the original .xml file.&lt;br /&gt;
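&lt;br /&gt;
The unique-match count and the comparison against the OrderedLocusNames table can be approximated outside of XMLPipeDB Match. The Python sketch below is only an illustration: it applies the same pattern with the standard re module to a short made-up string (not the real UniProt XML file), and the names unique_ids, sample_text, and gdb_ids are hypothetical.&lt;br /&gt;

```python
import re

# Same pattern as the XMLPipeDB Match command: BP plus four digits,
# optionally followed by A, B, or .1
PATTERN = re.compile(r"BP[0-9][0-9][0-9][0-9](?:A|B|\.1|)")

def unique_ids(text):
    """Return the set of distinct gene IDs found in the text."""
    return {m.group(0) for m in PATTERN.finditer(text)}

# Made-up text for illustration only; not taken from the UniProt XML.
sample_text = "BP0001 BP0001 BP3167.1 BP3167A BP0101A"
xml_ids = unique_ids(sample_text)

# Hypothetical stand-in for the IDs listed in the OrderedLocusNames table.
gdb_ids = {"BP0001", "BP3167.1", "BP0101A"}

print(len(xml_ids))               # 4
print(sorted(xml_ids - gdb_ids))  # ['BP3167A']
```

A set difference of this kind is one way to single out an ID that appears in the XML matches but not in the exported table.&lt;br /&gt;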
&lt;br /&gt;
===Using SQL Queries to Validate the PostgreSQL Database Results from the TallyEngine===&lt;br /&gt;
I ran a SQL query designed to count all gene IDs listed by the types &amp;quot;ordered locus&amp;quot; and &amp;quot;ORF&amp;quot;:&lt;br /&gt;
&lt;br /&gt;
 select count (*) from genenametype where type = &amp;#039;ordered locus&amp;#039;;&lt;br /&gt;
&lt;br /&gt;
Results:&lt;br /&gt;
*[[File:Sqlcount_cw20151203.png]]&lt;br /&gt;
* The number of unique matches yielded by this SQL query, 3446, reflects the total number of ordered locus (3435) and ORF (11) gene IDs present in the database. This number is consistent with that produced by the TallyEngine.&lt;br /&gt;
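&lt;br /&gt;
A count that covers both name types in a single query can be tried out on a toy table. The sketch below uses the sqlite3 module built into Python rather than PostgreSQL, and the rows are invented for illustration. Only the table name genenametype and the type values &amp;quot;ordered locus&amp;quot; and &amp;quot;ORF&amp;quot; come from this report; the value column and the sample rows are assumptions.&lt;br /&gt;

```python
import sqlite3

# In-memory toy database; the real data lives in PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("create table genenametype (value text, type text)")
rows = [
    ("BP0001", "ordered locus"),
    ("BP0002", "ordered locus"),
    ("BP0101A", "ORF"),
    ("ptxA", "primary"),
]
conn.executemany("insert into genenametype values (?, ?)", rows)

# Count rows whose type is either 'ordered locus' or 'ORF'.
query = "select count(*) from genenametype where type in ('ordered locus', 'ORF')"
count = conn.execute(query).fetchone()[0]
print(count)  # 3
conn.close()
```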
&lt;br /&gt;
===OriginalRowCounts Comparison===&lt;br /&gt;
&lt;br /&gt;
I opened the gene database file [[File:Bpertussis-std_cw20151203.zip]] in  Microsoft Access and assessed the &amp;quot;OriginalRowCounts&amp;quot; table to see if the expected tables were listed with the expected number of records. The contents of this table were compared to the &amp;#039;&amp;#039;OriginalRowCounts&amp;#039;&amp;#039; table of an existing .gdb file created during Week 9.&lt;br /&gt;
 &lt;br /&gt;
Benchmark .gdb file: &amp;#039;&amp;#039;Vc-Std_20151027_TR&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;quot;OriginalRowCounts&amp;quot; table from the benchmark and new gdb:&lt;br /&gt;
[[File:Rowcountcomparison cw20151203.PNG]]&lt;br /&gt;
*All 52 tables present in the 2015 &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; database were also present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; gene database, &amp;#039;&amp;#039;bpertussis-std_cw20151203&amp;#039;&amp;#039;. This confirmed that all expected tables were successfully created.&lt;br /&gt;
*Further, the &amp;quot;OrderedLocusNames&amp;quot; table count is listed as 3446, which represents the combined number of &amp;quot;ordered locus&amp;quot; and &amp;quot;ORF&amp;quot; gene IDs as was desired. This count is consistent with the TallyEngine result.&lt;br /&gt;
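&lt;br /&gt;
The table-by-table comparison could also be scripted once both OriginalRowCounts tables have been copied out of Access. In the hedged Python sketch below, the table names and row counts are invented placeholders, except for the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; OrderedLocusNames count of 3446; the function name missing_tables is hypothetical.&lt;br /&gt;

```python
def missing_tables(benchmark, candidate):
    """Return benchmark table names that are absent from the candidate."""
    return sorted(set(benchmark) - set(candidate))

# Placeholder row counts for illustration; only 3446 comes from the report.
benchmark_counts = {"Info": 1, "Systems": 10, "OrderedLocusNames": 100}
candidate_counts = {"Info": 1, "Systems": 10, "OrderedLocusNames": 3446}

print(missing_tables(benchmark_counts, candidate_counts))  # []
print(candidate_counts["OrderedLocusNames"])               # 3446
```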
&lt;br /&gt;
Note: The &amp;quot;OriginalRowCounts&amp;quot; tables were too large to screenshot. To circumvent this problem and facilitate the comparison, I copied the &amp;quot;OriginalRowCounts&amp;quot; tables from both gene databases into an Excel file and zoomed out. The above screenshot was taken from this Excel file. The &amp;quot;OrderedLocusNames&amp;quot; row counts are highlighted in yellow.&lt;br /&gt;
&lt;br /&gt;
===Visual Inspection===&lt;br /&gt;
I visually inspected individual tables within [[File:Bpertussis-std_cw20151203.zip]] using Microsoft Access.&lt;br /&gt;
&lt;br /&gt;
*Systems Table&lt;br /&gt;
**35 gene ID systems were listed, 11 of which were used in the creation of this .gdb file and listed the appropriate import date (12/03/2015).&lt;br /&gt;
***All gene ID systems relevant to &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; were listed. These include: EMBL, EnsemblBacteria, GeneID, GeneOntology, InterPro, OrderedLocusNames, Pfam, RefSeq, and UniProt.&lt;br /&gt;
***This result corresponded with that of the benchmark .gdb file listed in the &amp;quot;OriginalRowCounts Comparison&amp;quot; section.&lt;br /&gt;
**The &amp;quot;OrderedLocusNames&amp;quot; listing properly displayed the implemented changes.&lt;br /&gt;
***In this row, the species was listed correctly as &amp;quot;Bordetella pertussis&amp;quot;.&lt;br /&gt;
***In this row, the link corresponded to the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; database at GeneDB. The link was as follows: http://www.genedb.org/gene/~;jsessionid=A06A0EFE93C64E476380393D4CBEFA69?actionName=%2FQuery%2FquickSearch&amp;amp;resultsSize=1&amp;amp;taxonNodeName=Bpertussis.&lt;br /&gt;
*UniProt Table&lt;br /&gt;
**This table contained 3258 entries with 6-character IDs.&lt;br /&gt;
**All IDs in the UniProt table conform to the following pattern: [[File:UniProt Ascension Number info.PNG]]&lt;br /&gt;
*RefSeq Table&lt;br /&gt;
**This table contained 6627 entries. All IDs began with one of three prefixes: &amp;quot;NP_&amp;quot;, &amp;quot;YP_&amp;quot;, or &amp;quot;WP_&amp;quot;. The meanings of these prefixes can be found in the RefSeq documentation found [http://www.ncbi.nlm.nih.gov/books/NBK50679/ here].&lt;br /&gt;
***&amp;quot;NP_&amp;quot; and &amp;quot;YP_&amp;quot; Prefixes&lt;br /&gt;
****Refer to proteins. There are 3410 &amp;quot;NP_&amp;quot; IDs and 7 &amp;quot;YP_&amp;quot; IDs.&lt;br /&gt;
***&amp;quot;WP_&amp;quot; Prefixes&lt;br /&gt;
****Refer to &amp;quot;autonomous non-redundant proteins that are not yet directly annotated on a genome&amp;quot;. There were 3210 IDs with the &amp;quot;WP_&amp;quot; prefixes.&lt;br /&gt;
***Overall, every entry in the ID column was an expected value.&lt;br /&gt;
*OrderedLocusNames Table&lt;br /&gt;
**This table contained 3446 entries (consistent with the TallyEngine &amp;amp; SQL counts).&lt;br /&gt;
**The IDs were copied into an Excel document for analysis:&lt;br /&gt;
***3434 IDs conformed to the pattern &amp;quot;BP####&amp;quot;&lt;br /&gt;
***1 ID was unique: &amp;quot;BP3167.1&amp;quot;&lt;br /&gt;
***The other 11 IDs were ORFs that conformed to one of the following two patterns: &amp;quot;BP####A&amp;quot; or &amp;quot;BP####B&amp;quot;&lt;br /&gt;
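&lt;br /&gt;
The sorting of OrderedLocusNames IDs into the three patterns above was done in Excel, but it can also be sketched in Python. The example below is an assumption-laden illustration, not the method used in this report: the function name classify and its labels are hypothetical, and the sample is just a handful of IDs named in this report rather than the full table.&lt;br /&gt;

```python
import re
from collections import Counter

def classify(gene_id):
    """Label an ID by the patterns described in the report."""
    if re.fullmatch(r"BP[0-9]{4}", gene_id):
        return "BP####"
    if re.fullmatch(r"BP[0-9]{4}\.1", gene_id):
        return "BP####.1"
    if re.fullmatch(r"BP[0-9]{4}[AB]", gene_id):
        return "ORF (A/B suffix)"
    return "unexpected"

# A handful of IDs named in this report, used as a small sample.
sample = ["BP1252", "BP3167.1", "BP0101A", "BP0101B", "BP0684A", "BP0970A"]
tally = Counter(classify(g) for g in sample)
print(tally["BP####"], tally["BP####.1"], tally["ORF (A/B suffix)"])  # 1 1 4

# The reported category counts add up to the table size.
print(3434 + 1 + 11)  # 3446
```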
&lt;br /&gt;
===.gdb Use in GenMAPP===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--Need to add more instructions here.--&amp;gt;&lt;br /&gt;
*By following the instructions in [http://www.openwetware.org/wiki/BIOL367/F10:GenMAPP_and_MAPPFinder_Protocols Part 2 of the &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; Microarray Data Analysis] and looking at Brandon&amp;#039;s Week 9 individual journal assignment, I was able to verify that our Gene Database works in GenMAPP.&lt;br /&gt;
*I was able to open the GenMAPP program on the computer, and then I went to Data -&amp;gt; Choose Gene Database -&amp;gt; and selected the cw20151119 gdb file.&lt;br /&gt;
**There were no problems thus far as our database was able to load into the program.&lt;br /&gt;
&lt;br /&gt;
====Putting a gene on the MAPP using the GeneFinder window====&lt;br /&gt;
&lt;br /&gt;
*For each gene ID tested, I repeated the same steps:&lt;br /&gt;
**In the main GenMAPP Drafting Board window, I left-clicked on the icon for &amp;quot;Gene&amp;quot; in the upper left corner of the window.&lt;br /&gt;
**I clicked on the Drafting Board to place the Gene on the MAPP.&lt;br /&gt;
**Then I right-clicked on the gene to access the GeneFinder window.&lt;br /&gt;
**I typed the gene ID into the Gene ID field and selected the Gene ID system.&lt;br /&gt;
*I went through each of the five inconsistent Gene IDs in GeneFinder, plus one consistent Gene ID:&lt;br /&gt;
**BP3167.1, with &amp;quot;OrderedLocusName&amp;quot; selected for the Gene ID system.&lt;br /&gt;
**BP1252, with &amp;quot;OrderedLocusName&amp;quot; selected for the Gene ID system. This served as a &amp;quot;control&amp;quot;: a consistent Gene ID to compare against.&lt;br /&gt;
**BP0101A, with &amp;quot;OrderedLocusName&amp;quot; selected for the Gene ID system.&lt;br /&gt;
**BP0101B, with &amp;quot;OrderedLocusName&amp;quot; selected for the Gene ID system.&lt;br /&gt;
**BP0684A, with &amp;quot;OrderedLocusName&amp;quot; selected for the Gene ID system. When the gene was found, its back page popped up.&lt;br /&gt;
**BP0970A, with &amp;quot;GeneID&amp;quot; selected for the Gene ID system.&lt;br /&gt;
*All of the expected cross-referenced IDs were present.&lt;br /&gt;
&lt;br /&gt;
*Screenshot of all of the sample IDs on a MAPP:&lt;br /&gt;
*[[File: Genesonmap_cw20151203.png]]&lt;br /&gt;
&lt;br /&gt;
====Expression Dataset and MAPPFinder Analysis====&lt;br /&gt;
*We do not yet have the expression dataset that is to be created by the GenMAPP Builders; they are still making corrections to the data that have been compiled into an Excel spreadsheet.&lt;br /&gt;
*Once the file is complete, we will proceed with the data analysis using the desired programs.&lt;br /&gt;
&lt;br /&gt;
=== Compare Gene Database to Outside Resource===&lt;br /&gt;
*We will complete this step after progressing further into the project.&lt;br /&gt;
The OrderedLocusNames IDs in the exported Gene Database are derived from the UniProt XML.  It is a good idea to check your list of OrderedLocusNames IDs to see how complete it is using the original source of the data (the sequencing organization, the MOD, etc.)  Because UniProt is a protein database, it does not reference any non-protein genome features such as genes that code for functional RNAs, centromeres, telomeres, etc.&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151201&amp;diff=8072</id>
		<title>Gene Database Testing Report- cw20151201</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151201&amp;diff=8072"/>
				<updated>2015-12-18T19:21:10Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: replaced category with template&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[File:Gdtr cw20151201.zip]]&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip] tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*I downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW\bpertussis_cw20151201&amp;#039;&amp;#039; on my computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* I went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, I navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** I clicked on the &amp;quot;Download&amp;quot; button at the top of the page above and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**I extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, I navigated to the GOA file [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**My browser opened this text file automatically instead of downloading it, so I had to save the file manually.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* I downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
*I extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====Downloaded GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# I downloaded the first custom version of &amp;#039;&amp;#039;gmbuilder&amp;#039;&amp;#039; including the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class: [[File:Dist cw20151201.zip]].&lt;br /&gt;
# I extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* I launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, I created a new database: &amp;#039;&amp;#039;bpertussis_cw20151201_gmb3build5custom&amp;#039;&amp;#039;.&lt;br /&gt;
** I opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** I clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** I clicked on the Execute Query icon to run this command.&lt;br /&gt;
***In viewing the schema for this database, I confirmed that there were 167 tables after running the above command.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, I launched gmbuilder.bat.&lt;br /&gt;
* I selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151201_gmb3build5custom&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*The downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; were specified and imported into the database by clicking on the following buttons:&lt;br /&gt;
** Selected File &amp;gt; Import UniProt XML...&lt;br /&gt;
** Selected File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** Clicked OK to the message asking to process the GO data.&lt;br /&gt;
** Selected File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* I selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* I typed my name in the owner field (Brandon Klein).&lt;br /&gt;
* I selected &amp;quot;Bordetella pertussis (strain Tohama I/ATCC BAA-589/NCTC 13251), Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151201&amp;#039;&amp;#039;.&lt;br /&gt;
* I checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, I clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151201_gmb3build5custom&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151201.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_11&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.59 minutes&lt;br /&gt;
** Note: The import time was nearly identical to that recorded when creating the previous &amp;quot;Bordetella pertussis&amp;quot; gene database, bpertussis-std_cw20151119.gdb (2.60 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151201.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- December 01, 2015, 2:21:31 AM&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 7.08 minutes &lt;br /&gt;
* Time taken to process: 4.42 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151119.gdb (6.99 minutes and 4.48 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151201.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 11/10/15 1:39:00 PM&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.04 minutes&lt;br /&gt;
** Note: The import time was equal to that of the previous &amp;quot;Bordetella pertussis&amp;quot; gene database: bpertussis-std_cw20151119.gdb. No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151201.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 4:06 PM&lt;br /&gt;
** End time: 5:00 PM&lt;br /&gt;
** Elapsed time: 54 minutes&lt;br /&gt;
Note: No interruptions occurred during the export process.&lt;br /&gt;
&lt;br /&gt;
===Confirming that Changes to GenMAPP Builder Worked As Expected===&lt;br /&gt;
#Systems Table Update&lt;br /&gt;
#*In creating the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; custom class, changes were implemented to edit the &amp;#039;&amp;#039;OrderedLocusNames&amp;#039;&amp;#039; column in the exported .gdb file. Specifically, the species should be listed as &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; and a custom GeneDB link should be included in this row.&lt;br /&gt;
#*To verify that these changes were working as expected, I opened the file bpertussis-std_cw20151201.gdb in Microsoft Access. Upon opening the &amp;#039;&amp;#039;Systems&amp;#039;&amp;#039; table, I confirmed that all changes had been successfully implemented:&lt;br /&gt;
#**[[File: Systemstable_cw20151201.png]]&lt;br /&gt;
#GenMAPP Gene Backpage Update&lt;br /&gt;
#*The &amp;#039;&amp;#039;OrderedLocusNames&amp;#039;&amp;#039; GeneDB link added (as described above) was also designed to have functionality in GenMAPP. Specifically, gene backpages opened while using this gene database in GenMAPP should redirect users to the GeneDB information page for the specific &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene.&lt;br /&gt;
#*I specified bpertussis-std_cw20151201.gdb as the gene database in GenMAPP and put the gene BP3701 on the map. When I clicked on the link to the ID &amp;quot;BP3701&amp;quot; under &amp;quot;OrderedLocusNames&amp;quot; in Gene Finder, I was successfully brought to the corresponding GeneDB information page:&lt;br /&gt;
#**[[File:Custombackpage cw20151201.PNG]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151119&amp;diff=8071</id>
		<title>Gene Database Testing Report- cw20151119</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Gene_Database_Testing_Report-_cw20151119&amp;diff=8071"/>
				<updated>2015-12-18T19:20:12Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: replaced category with template&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Files Asked for in the Gene Database Testing Report==&lt;br /&gt;
For convenience, all of the files explicitly asked for in the &amp;quot;Gene Database Testing Report&amp;quot; section were compressed together in this file: [[File:Gdtr cw20151119.zip]]&lt;br /&gt;
*Note: File names were automatically capitalized and underscores were removed upon upload, but the files themselves conform to the guidelines created on the CW team page.&lt;br /&gt;
&lt;br /&gt;
==Pre-requisites==&lt;br /&gt;
The following set of software was used in the creation and testing of the &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; gene database:&lt;br /&gt;
&lt;br /&gt;
# [http://www.7-zip.org/ 7-zip], a tool for unpacking .gz and .zip files&lt;br /&gt;
# [http://www.postgresql.org PostgreSQL] on Windows (version 9.4.x)&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ GenMAPP Builder]&lt;br /&gt;
# Java JDK 1.8 64-bit&lt;br /&gt;
# [https://github.com/GenMAPPCS/genmapp GenMAPP 2]&lt;br /&gt;
# [https://sourceforge.net/projects/xmlpipedb/files/ XMLPipeDB match utility] for counting IDs in XML files&lt;br /&gt;
# Microsoft Access for reading .mdb files&lt;br /&gt;
&lt;br /&gt;
==Gene Database Creation==&lt;br /&gt;
===Downloading Data Source Files and GenMAPP Builder===&lt;br /&gt;
&lt;br /&gt;
*I downloaded the UniProt XML, GOA, and GO OBO-XML files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; along with the GenMAPP Builder program.&lt;br /&gt;
**All files were saved to the folder &amp;#039;&amp;#039;Bklein7_CW&amp;#039;&amp;#039; on my computer&amp;#039;s ThawSpace.&lt;br /&gt;
**Files that required extraction were unzipped using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
**Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.&lt;br /&gt;
&lt;br /&gt;
====UniProt XML====&lt;br /&gt;
&lt;br /&gt;
* I went to the [http://www.uniprot.org/taxonomy/complete-proteomes UniProt Complete Proteomes] page.&lt;br /&gt;
**From there, I navigated to the complete proteome download page for [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)].&lt;br /&gt;
** I clicked on the &amp;quot;Download&amp;quot; button at the top of the page above and selected the following options:&lt;br /&gt;
***&amp;quot;Download all&amp;quot;&lt;br /&gt;
***&amp;quot;XML&amp;quot; from the &amp;quot;Format&amp;quot; drop-down menu&lt;br /&gt;
***&amp;quot;Compressed&amp;quot; format&lt;br /&gt;
**I extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====GOA====&lt;br /&gt;
&lt;br /&gt;
* UniProt-GOA files can be downloaded from the [http://ftp.ebi.ac.uk/pub/databases/GO/goa/ UniProt-GOA ftp site].&lt;br /&gt;
*Within the above site, I navigated to the GOA file [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I].&lt;br /&gt;
**This text file was automatically opened by my browser. Therefore, I had to manually download the file.&lt;br /&gt;
&lt;br /&gt;
====GO OBO-XML====&lt;br /&gt;
&lt;br /&gt;
* I downloaded the GO OBO-XML formatted file from the [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page].&lt;br /&gt;
*I extracted the file using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
====Downloaded GenMAPP Builder====&lt;br /&gt;
&lt;br /&gt;
# I downloaded the GenMAPP Builder zip folder: [https://github.com/lmu-bioinformatics/xmlpipedb/releases/download/untagged-bd04fffc4da853fedf30/gmbuilder-3.0.0-build-5.zip Download gmbuilder-3.0.0-build-5.zip].&lt;br /&gt;
# I extracted the GenMAPP Builder folder using [http://www.7-zip.org/ 7-zip].&lt;br /&gt;
&lt;br /&gt;
===Creating the New Database in PostgreSQL===&lt;br /&gt;
&lt;br /&gt;
* I launched &amp;#039;&amp;#039;pgAdmin III&amp;#039;&amp;#039; and connected to the PostgreSQL 9.4 server (localhost:5432).&lt;br /&gt;
** On this server, I created a new database: &amp;#039;&amp;#039;bpertussis_cw20151119_gmb3build5&amp;#039;&amp;#039;.&lt;br /&gt;
** I opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.&lt;br /&gt;
*** I clicked on the Open File icon and selected the file &amp;#039;&amp;#039;gmbuilder.sql&amp;#039;&amp;#039;. This imported a series of SQL commands into the editor tab.&lt;br /&gt;
*** I clicked on the Execute Query icon to run this command.&lt;br /&gt;
***In viewing the schema for this database, I confirmed that there were 167 tables after running the above command.&lt;br /&gt;
&lt;br /&gt;
===Configuring GenMAPP Builder to Connect to the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
* To begin, I launched gmbuilder.bat.&lt;br /&gt;
* I selected the &amp;quot;Configure Database&amp;quot; option and entered the following information into the fields below:&lt;br /&gt;
** Host or address: localhost&lt;br /&gt;
** Port number: 5432&lt;br /&gt;
** Database name: bpertussis_cw20151119_gmb3build5&lt;br /&gt;
** Username: postgres&lt;br /&gt;
** Password: Welcome1&lt;br /&gt;
&lt;br /&gt;
===Importing Data into the PostgreSQL Database===&lt;br /&gt;
&lt;br /&gt;
*The downloaded data files for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; were specified and imported into the database by clicking on the following buttons:&lt;br /&gt;
** Selected File &amp;gt; Import UniProt XML...&lt;br /&gt;
** Selected File &amp;gt; Import GO OBO-XML...&lt;br /&gt;
** Clicked OK to the message asking to process the GO data.&lt;br /&gt;
** Selected File &amp;gt; Import GOA...&lt;br /&gt;
&lt;br /&gt;
===Exporting a GenMAPP Gene Database (.gdb)===&lt;br /&gt;
&lt;br /&gt;
* I selected File &amp;gt; Export to GenMAPP Gene Database... to begin the export process.&lt;br /&gt;
* I typed my name in the owner field (Brandon Klein).&lt;br /&gt;
* I selected &amp;quot;Bordetella pertussis (strain Tohama I/ATCC BAA-589/NCTC 13251), Taxon ID 257313&amp;quot; as the gene database species and then clicked &amp;#039;&amp;#039;Next&amp;#039;&amp;#039;.&lt;br /&gt;
* The database was saved as &amp;#039;&amp;#039;bpertussis-std_cw20151119&amp;#039;&amp;#039;.&lt;br /&gt;
* I checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.&lt;br /&gt;
* Finally, I clicked the &amp;quot;Next&amp;quot; button to begin the export process.&lt;br /&gt;
&lt;br /&gt;
==Gene Database Testing Report==&lt;br /&gt;
===Export Information===&lt;br /&gt;
&lt;br /&gt;
Version of GenMAPP Builder: Version 3.0.0 Build 5&lt;br /&gt;
&lt;br /&gt;
Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room&lt;br /&gt;
&lt;br /&gt;
Postgres Database name: bpertussis_cw20151119_gmb3build5&lt;br /&gt;
&lt;br /&gt;
UniProt XML filename: [[File:Uniprot-proteome-UP000002676 cw20151119.zip]]&lt;br /&gt;
* UniProt XML version (The version information was found at [http://uniprot.org/news the UniProt News Page]): 2015_11&lt;br /&gt;
* UniProt XML download link: [http://www.uniprot.org/proteomes/UP000002676 Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)]&lt;br /&gt;
* Time taken to import: 2.60 minutes&lt;br /&gt;
** Note: The import time was similar to that of &amp;#039;&amp;#039;V. cholerae&amp;#039;&amp;#039; in Week 9 (2.92 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
GO OBO-XML filename: [[File:Go daily-termdb cw20151119.zip]]&lt;br /&gt;
* GO OBO-XML version (The version information was found in the file properties): Last Modified- Thursday, November 19, 2015, 2:24:27 AM&lt;br /&gt;
* GO OBO-XML download link: [http://geneontology.org/page/download-ontology#Legacy_Downloads Gene Ontology legacy download page]&lt;br /&gt;
* Time taken to import: 6.99 minutes &lt;br /&gt;
* Time taken to process: 4.48 minutes&lt;br /&gt;
** Note: The import and processing times were similar to those for &amp;#039;&amp;#039;V. cholerae&amp;#039;&amp;#039; in Week 9 (6.88 minutes and 4.49 minutes respectively). No interruptions occurred during these processes.&lt;br /&gt;
&lt;br /&gt;
GOA filename: [[File:145.B pertussis ATCC BAA-589 cw20151119.zip]]&lt;br /&gt;
* GOA version (found in the Last modified field on the [ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ FTP site]): Last Modified- 11/10/15 1:39:00 PM&lt;br /&gt;
* GOA download link: [http://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/145.B_pertussis_ATCC_BAA-589.goa for &amp;#039;&amp;#039;Bordetella pertussis&amp;#039;&amp;#039; strain Tohama I]&lt;br /&gt;
* Time taken to import: 0.04 minutes&lt;br /&gt;
** Note: The import time was similar to that of &amp;#039;&amp;#039;V. cholerae&amp;#039;&amp;#039; in Week 9 (0.06 minutes). No interruptions occurred during this process.&lt;br /&gt;
&lt;br /&gt;
Name of .gdb file: [[File:Bpertussis-std cw20151119.zip]]&lt;br /&gt;
* Time taken to export: &lt;br /&gt;
** Start time: 4:06 PM&lt;br /&gt;
** End time: 4:46 PM&lt;br /&gt;
** Elapsed time: 40 minutes&lt;br /&gt;
Note: All export windows remained open when I returned to check the export status. No interruptions occurred during the export process.&lt;br /&gt;
&lt;br /&gt;
===TallyEngine===&lt;br /&gt;
&lt;br /&gt;
* I ran the TallyEngine in GenMAPP Builder and specified the following files:&lt;br /&gt;
**XML- [[File:Uniprot-proteome-UP000002676 cw20151119.zip]]&lt;br /&gt;
**GO- [[File:Go daily-termdb cw20151119.zip]]&lt;br /&gt;
*Results:&lt;br /&gt;
**[[File:TallyEngine cw20151119.png]]&lt;br /&gt;
**All tally results were consistent across both files.&lt;br /&gt;
=== Using XMLPipeDB match to Validate the XML Results from the TallyEngine===&lt;br /&gt;
The following functions were performed using the Windows command line (cmd).&lt;br /&gt;
*I entered my project folder using the following command:&lt;br /&gt;
 cd /d T:\Bklein7_CW&lt;br /&gt;
*I used XMLPipeDB match to identify matches of any ordered locus name following the pattern &amp;quot;BP####&amp;quot; in the UniProt XML file. The command used is as follows:&lt;br /&gt;
 java -jar xmlpipedb-match-1.1.1.jar &amp;quot;BP[0-9][0-9][0-9][0-9]&amp;quot; &amp;lt; &amp;quot;uniprot-proteome%3AUP000002676_cw20151119.xml&amp;quot;&lt;br /&gt;
*Match Results:&lt;br /&gt;
**[[File:XMLPipeDBMatch cw20151119.png]]&lt;br /&gt;
**The total number of unique matches listed above, 3438, differs from the Ordered Locus Names count of 3435 produced by the Tally Engine. Thus, 3 gene IDs present in the original XML file were not imported into &amp;#039;&amp;#039;bpertussis-std_cw20151119.gdb&amp;#039;&amp;#039;.&lt;br /&gt;
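The unique-match tally above can also be approximated outside of XMLPipeDB match. The following is a minimal Python sketch, not the XMLPipeDB implementation; the function name count_unique_matches and the sample text are hypothetical and were not part of the original workflow.&lt;br /&gt;

```python
import re
from collections import Counter


def count_unique_matches(text, pattern="BP[0-9][0-9][0-9][0-9]"):
    """Tally every distinct substring of text that matches pattern."""
    return Counter(re.findall(pattern, text))


# Made-up stand-in for the XML contents; a real run would read the file instead.
sample_text = "BP0001 BP0002 BP0001 BP3167.1 note BP12"
tally = count_unique_matches(sample_text)
unique_total = len(tally)  # number of distinct matches, not total occurrences
print(unique_total, dict(tally))
```

In practice, the made-up sample text would be replaced by the contents of the extracted UniProt XML file. Because the pattern in this sketch is not anchored, an ID such as BP3167.1 is tallied under its first six characters.&lt;br /&gt;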
&lt;br /&gt;
=== Using SQL Queries to Validate the PostgreSQL Database Results from the TallyEngine===&lt;br /&gt;
I ran a SQL query designed to match the pattern BP####:&lt;br /&gt;
&lt;br /&gt;
 select count (*) from genenametype where type = &amp;#039;ordered locus&amp;#039; and value ~ &amp;#039;BP[0-9][0-9][0-9][0-9]&amp;#039;;&lt;br /&gt;
&lt;br /&gt;
Results:&lt;br /&gt;
*[[File:SQLQuery cw20151119.png]]&lt;br /&gt;
* The number of unique matches yielded by this SQL query, 3435, matched that produced by the Tally Engine. However, this count was also 3 less than that yielded by the XMLPipeDB Match result reported above. This further indicates that there was an error present in importing all gene IDs from the original XML file.&lt;br /&gt;
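A difference in counts shows that IDs are missing but not which ones. One possible follow-up, sketched here in Python under the assumption that both ID lists have been exported to plain lists, is a set difference. The function name missing_ids and all IDs shown are hypothetical placeholders, not results from this project.&lt;br /&gt;

```python
def missing_ids(source_ids, imported_ids):
    """Return, sorted, the IDs present in the source but absent from the import."""
    return sorted(set(source_ids) - set(imported_ids))


# Hypothetical ID lists for illustration only; not taken from the real files.
xml_ids = ["BP0001", "BP0002", "BP0003", "BP0004"]
db_ids = ["BP0001", "BP0003"]
not_imported = missing_ids(xml_ids, db_ids)
print(not_imported)
```

A pattern search over the whole XML file may also match text outside of ordered locus name fields, so any ID reported this way would still need to be checked against the source record.&lt;br /&gt;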
&lt;br /&gt;
===OriginalRowCounts Comparison===&lt;br /&gt;
&lt;br /&gt;
I opened the gene database file [[File:Bpertussis-std cw20151119.zip]] in  Microsoft Access and assessed the &amp;quot;OriginalRowCounts&amp;quot; table to see if the expected tables were listed with the expected number of records. The contents of this table were compared to the &amp;#039;&amp;#039;OriginalRowCounts&amp;#039;&amp;#039; table of an existing .gdb file created during Week 9.&lt;br /&gt;
 &lt;br /&gt;
Benchmark .gdb file: &amp;#039;&amp;#039;Vc-Std_20151027_TR&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;quot;OriginalRowCounts&amp;quot; table from the benchmark and new gdb:&lt;br /&gt;
*[[File:OriginalRowCountsComparison cw20151119.PNG]]&lt;br /&gt;
*All 52 tables present in the 2015 &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; database were also present in the &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; database tagged _cw20151119. This confirmed that all expected tables were successfully created.&lt;br /&gt;
*Further, the &amp;quot;OrderedLocusNames&amp;quot; count of 3435 generated by the Tally Engine was confirmed to reflect the actual contents of the database.&lt;br /&gt;
&lt;br /&gt;
Note: The &amp;quot;OriginalRowCounts&amp;quot; tables were too large to screenshot. To circumvent this problem and facilitate the comparison, I copied the &amp;quot;OriginalRowCounts&amp;quot; tables from both gene databases into an Excel file and zoomed out. The above screenshot was taken from this Excel file. The &amp;quot;OrderedLocusNames&amp;quot; row counts are highlighted in yellow.&lt;br /&gt;
&lt;br /&gt;
===Visual Inspection===&lt;br /&gt;
I visually inspected individual tables within [[File:Bpertussis-std cw20151119.zip]] using Microsoft Access.&lt;br /&gt;
*Systems Table&lt;br /&gt;
**35 gene ID systems were listed, 11 of which listed the appropriate import date (11/19/2015)&lt;br /&gt;
***All gene ID systems relevant to &amp;#039;&amp;#039;B. pertussis&amp;#039;&amp;#039; were listed. This includes: EMBL, EnsemblBacteria, GeneID, GeneOntology, InterPro, OrderedLocusNames, Pfam, RefSeq, and UniProt.&lt;br /&gt;
***This result corresponded with that of the benchmark .gdb file listed in the &amp;quot;OriginalRowCounts Comparison&amp;quot; section.&lt;br /&gt;
*UniProt Table&lt;br /&gt;
**This table contained 3258 entries with 6 character IDs.&lt;br /&gt;
**All IDs in the UniProt table conform to the following pattern: [[File:UniProt Ascension Number info.PNG]]&lt;br /&gt;
**There are no apparent issues with the 3258 entries. However, it is curious that only 3258 out of 3438 IDs identified through XMLPipeDB Match made it into the database. This suggests the need for gmbuilder coding changes.&lt;br /&gt;
*RefSeq Table&lt;br /&gt;
**This table contained 6627 entries. All IDs began with one of three prefixes: &amp;quot;NP_&amp;quot;, &amp;quot;YP_&amp;quot;, or &amp;quot;WP_&amp;quot;. The meanings of these prefixes can be found in the RefSeq documentation found [http://www.ncbi.nlm.nih.gov/books/NBK50679/ here].&lt;br /&gt;
***&amp;quot;NP_&amp;quot; and &amp;quot;YP_&amp;quot; Prefixes&lt;br /&gt;
****Refer to proteins. There are 3410 &amp;quot;NP_&amp;quot; IDs and 7 &amp;quot;YP_&amp;quot; IDs.&lt;br /&gt;
***&amp;quot;WP_&amp;quot; Prefixes&lt;br /&gt;
****Refer to &amp;quot;autonomous non-redundant proteins that are not yet directly annotated on a genome&amp;quot;. There were 3210 IDs with the &amp;quot;WP_&amp;quot; prefixes.&lt;br /&gt;
***Overall, every entry in the ID column was an expected value.&lt;br /&gt;
*OrderedLocusNames Table&lt;br /&gt;
**This table contained 3435 entries (consistent with Tally Engine &amp;amp; SQL counts).&lt;br /&gt;
**The IDs were copied into an Excel document for analysis:&lt;br /&gt;
***3434 IDs conformed to the pattern &amp;quot;BP####&amp;quot;&lt;br /&gt;
***1 ID was unique: &amp;quot;BP3167.1&amp;quot;&lt;br /&gt;
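The pattern check performed in Excel could also be scripted. The sketch below is one possible approach in Python, assuming the IDs have been copied into a list; the function name split_by_pattern is hypothetical, and BP0001 is a made-up placeholder.&lt;br /&gt;

```python
import re

# Exactly "BP" followed by four digits; fullmatch rejects any extra characters.
LOCUS_PATTERN = re.compile("BP[0-9]{4}")


def split_by_pattern(ids):
    """Separate IDs that are exactly BP plus four digits from all others."""
    conforming = [i for i in ids if LOCUS_PATTERN.fullmatch(i)]
    other = [i for i in ids if not LOCUS_PATTERN.fullmatch(i)]
    return conforming, other


# Illustrative list; BP0001 is a made-up placeholder ID.
ids = ["BP0001", "BP3167.1", "BP3723"]
conforming, other = split_by_pattern(ids)
print(len(conforming), other)
```

The whole-string match is a deliberate choice here. An unanchored search would also accept BP3167.1, which appears consistent with the SQL count of 3435 including that ID.&lt;br /&gt;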
&lt;br /&gt;
===.gdb Use in GenMAPP===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--Need to add more instructions here.--&amp;gt;&lt;br /&gt;
*By following the instructions in [http://www.openwetware.org/wiki/BIOL367/F10:GenMAPP_and_MAPPFinder_Protocols Part 2 of the &amp;#039;&amp;#039;Vibrio cholerae&amp;#039;&amp;#039; Microarray Data Analysis] and looking at Brandon&amp;#039;s Week 9 individual journal assignment, I was able to verify that our Gene Database works in GenMAPP.&lt;br /&gt;
*I was able to open the GenMAPP program on the computer, and then I went to Data -&amp;gt; Choose Gene Database -&amp;gt; and selected the cw20151119 gdb file.&lt;br /&gt;
**There were no problems thus far as our database was able to load into the program.&lt;br /&gt;
&lt;br /&gt;
====Putting a gene on the MAPP using the GeneFinder window====&lt;br /&gt;
&lt;br /&gt;
*In the main GenMAPP Drafting Board window, I left-clicked on the icon for &amp;quot;Gene&amp;quot; in the upper left corner of the window.  &lt;br /&gt;
**I clicked on the Drafting Board to place the Gene on the MAPP.  &lt;br /&gt;
**Then I right-clicked on the gene to access the GeneFinder window.  &lt;br /&gt;
**I typed the gene ID BP3723 into the Gene ID field and selected &amp;quot;OrderedLocusName&amp;quot; for the Gene ID system.&lt;br /&gt;
**When the gene was found this is the back page that popped up:&lt;br /&gt;
**[[File:GeneIDbackpage-BP3723orderedlocusname cw20151123.png]]&lt;br /&gt;
**Not all of the expected cross-referenced IDs were present.&lt;br /&gt;
*I decided to try another OrderedLocusNames gene ID in GenMAPP to see whether all of the expected cross-referenced IDs were present for a different gene.&lt;br /&gt;
**In the main GenMAPP Drafting Board window, I left-clicked on the icon for &amp;quot;Gene&amp;quot; in the upper left corner of the window.  &lt;br /&gt;
***I clicked on the Drafting Board to place the Gene on the MAPP.  &lt;br /&gt;
***Then I right-clicked on the gene to access the GeneFinder window.  &lt;br /&gt;
***I typed the gene ID BP2980 into the Gene ID field and selected &amp;quot;OrderedLocusName&amp;quot; for the Gene ID system.&lt;br /&gt;
***When the gene was found this is the back page that popped up:&lt;br /&gt;
***[[File:GeneIDbackpage-BP2980orderedlocusname cw20151123.png]]&lt;br /&gt;
***Not all of the expected cross-referenced IDs were present.&lt;br /&gt;
*In the main GenMAPP Drafting Board window, I left-clicked on the icon for &amp;quot;Gene&amp;quot; in the upper left corner of the window.  &lt;br /&gt;
**I clicked on the Drafting Board to place the Gene on the MAPP.  &lt;br /&gt;
**Then I right-clicked on the gene to access the GeneFinder window.  &lt;br /&gt;
**I typed the gene ID Q7VUE5 into the Gene ID field and selected &amp;quot;UniProt&amp;quot; for the Gene ID system.&lt;br /&gt;
**When the gene was found this is the back page that popped up:&lt;br /&gt;
**[[File:GeneIDbackpage-Q7VUE5uniprot cw20151123.png]]&lt;br /&gt;
**All of the expected cross-referenced IDs were present.&lt;br /&gt;
*In the main GenMAPP Drafting Board window, I left-clicked on the icon for &amp;quot;Gene&amp;quot; in the upper left corner of the window.  &lt;br /&gt;
**I clicked on the Drafting Board to place the Gene on the MAPP.  &lt;br /&gt;
**Then I right-clicked on the gene to access the GeneFinder window.  &lt;br /&gt;
**I typed the gene ID CAE44048 into the Gene ID field and selected &amp;quot;EnsemblBacteria&amp;quot; for the Gene ID system.&lt;br /&gt;
**When the gene was found this is the back page that popped up:&lt;br /&gt;
**[[File:GeneIDbackpage-CAE44048ensemblbacteria cw20151123.png]]&lt;br /&gt;
**All of the expected cross-referenced IDs were present.&lt;br /&gt;
*In the main GenMAPP Drafting Board window, I left-clicked on the icon for &amp;quot;Gene&amp;quot; in the upper left corner of the window.  &lt;br /&gt;
**I clicked on the Drafting Board to place the Gene on the MAPP.  &lt;br /&gt;
**Then I right-clicked on the gene to access the GeneFinder window.  &lt;br /&gt;
**I typed the gene ID 2666659 into the Gene ID field and selected &amp;quot;GeneID&amp;quot; for the Gene ID system.&lt;br /&gt;
**When the gene was found this is the back page that popped up:&lt;br /&gt;
**[[File:GeneIDbackpage-2666659geneid cw20151123.png]]&lt;br /&gt;
**All of the expected cross-referenced IDs were present.&lt;br /&gt;
*In the main GenMAPP Drafting Board window, I left-clicked on the icon for &amp;quot;Gene&amp;quot; in the upper left corner of the window.  &lt;br /&gt;
**I clicked on the Drafting Board to place the Gene on the MAPP.  &lt;br /&gt;
**Then I right-clicked on the gene to access the GeneFinder window.  &lt;br /&gt;
**I typed the gene ID NP_878949 into the Gene ID field and selected &amp;quot;RefSeq&amp;quot; for the Gene ID system.&lt;br /&gt;
**When the gene was found this is the back page that popped up:&lt;br /&gt;
**[[File:GeneIDbackpage-NP 878949refseq cw20151123.png]]&lt;br /&gt;
**Not all of the expected cross-referenced IDs were present.&lt;br /&gt;
*Screenshot of all of the sample IDs on a MAPP:&lt;br /&gt;
*[[File:GenMAPPgenesmapped cw20151123.png]]&lt;br /&gt;
&lt;br /&gt;
====Expression Dataset and MAPPFinder Analysis====&lt;br /&gt;
*We do not yet have the expression dataset that is to be created by the GenMAPP Builders; they are still correcting the data that have been compiled into an Excel spreadsheet.&lt;br /&gt;
*Once the file is complete, we will proceed with the data analysis using the desired programs.&lt;br /&gt;
&lt;br /&gt;
=== Compare Gene Database to Outside Resource===&lt;br /&gt;
*We will complete this step after progressing further into the project.&lt;br /&gt;
The OrderedLocusNames IDs in the exported Gene Database are derived from the UniProt XML.  It is a good idea to check your list of OrderedLocusNames IDs to see how complete it is using the original source of the data (the sequencing organization, the MOD, etc.)  Because UniProt is a protein database, it does not reference any non-protein genome features such as genes that code for functional RNAs, centromeres, telomeres, etc.&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8070</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8070"/>
				<updated>2015-12-18T19:17:47Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: fixed template&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species (&amp;#039;&amp;#039;.gdb&amp;#039;&amp;#039;): [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database (&amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&lt;br /&gt;
** Include Gene Database Schema diagram in ReadMe&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database (print from wiki to &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039; file)&lt;br /&gt;
* Processed and analyzed DNA microarray dataset (&amp;#039;&amp;#039;.xls&amp;#039;&amp;#039; or &amp;#039;&amp;#039;.xlsx&amp;#039;&amp;#039;): [[File:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP (&amp;#039;&amp;#039;.txt&amp;#039;&amp;#039; or &amp;#039;&amp;#039;.csv&amp;#039;&amp;#039;): [[File:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file (&amp;#039;&amp;#039;.gex&amp;#039;&amp;#039;): [[File:Bpertussis expressiondataset cw20151213.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP (&amp;#039;&amp;#039;.EX.txt&amp;#039;&amp;#039;): [[File:Bpertussis expressiondataset exceptions cw20151213.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files (&amp;#039;&amp;#039;-GO.txt&amp;#039;&amp;#039;): &lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151213-criterion0-GO.txt]]&lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151213-criterion1-GO.txt]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[File:Bpertussis compiledrawdata cw20151213.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results (&amp;#039;&amp;#039;.xls&amp;#039;&amp;#039; or &amp;#039;&amp;#039;.xlsx&amp;#039;&amp;#039;):&lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion0-GO.xlsx]] &lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion1-GO.xlsx]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species (&amp;#039;&amp;#039;.mapp&amp;#039;&amp;#039;): [[File:Bpertussis ribosomepathway cw20151215.mapp]]&lt;br /&gt;
* [[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&lt;br /&gt;
* PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
{{Template:Class Whoopers}}&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	<entry>
		<id>https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8069</id>
		<title>Bordetella Pertussis GenMAPP Analysis Deliverables</title>
		<link rel="alternate" type="text/html" href="https://xmlpipedb.lmucs.io/biodb/fall2015/index.php?title=Bordetella_Pertussis_GenMAPP_Analysis_Deliverables&amp;diff=8069"/>
				<updated>2015-12-18T19:17:22Z</updated>
		
		<summary type="html">&lt;p&gt;Bklein7: made edit to invoke final&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Group Files and Datasets ==&lt;br /&gt;
&lt;br /&gt;
* GenMAPP Gene Database for assigned species (&amp;#039;&amp;#039;.gdb&amp;#039;&amp;#039;): [[File:Bpertussis-std cw20151210.zip]]&lt;br /&gt;
* ReadMe file to accompany the Gene Database (&amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&lt;br /&gt;
** Include Gene Database Schema diagram in ReadMe&lt;br /&gt;
* Gene Database Testing Report for final submitted Gene Database (print from wiki to &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039; file)&lt;br /&gt;
* Processed and analyzed DNA microarray dataset (&amp;#039;&amp;#039;.xls&amp;#039;&amp;#039; or &amp;#039;&amp;#039;.xlsx&amp;#039;&amp;#039;): [[File:Bpertussis compiledrawdata cw20151208.xlsx]]&lt;br /&gt;
* Data file used for import into GenMAPP (&amp;#039;&amp;#039;.txt&amp;#039;&amp;#039; or &amp;#039;&amp;#039;.csv&amp;#039;&amp;#039;): [[File:Bpertussis compiledrawdata cw20151208.txt]]&lt;br /&gt;
* GenMAPP Expression Dataset file (&amp;#039;&amp;#039;.gex&amp;#039;&amp;#039;): [[File:Bpertussis expressiondataset cw20151213.gex]]&lt;br /&gt;
* Exceptions file of data imported into GenMAPP (&amp;#039;&amp;#039;.EX.txt&amp;#039;&amp;#039;): [[File:Bpertussis expressiondataset exceptions cw20151213.EX.txt]]&lt;br /&gt;
* Raw MAPPFinder results files (&amp;#039;&amp;#039;-GO.txt&amp;#039;&amp;#039;): &lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151213-criterion0-GO.txt]]&lt;br /&gt;
** [[File:Bpertussis mappfinderresults cw20151213-criterion1-GO.txt]]&lt;br /&gt;
* &amp;#039;&amp;#039;.gmf&amp;#039;&amp;#039; file: [[File:Bpertussis compiledrawdata cw20151213.gmf]]&lt;br /&gt;
* Filtered MAPPFinder Results (&amp;#039;&amp;#039;.xls&amp;#039;&amp;#039; or &amp;#039;&amp;#039;.xlsx&amp;#039;&amp;#039;):&lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion0-GO.xlsx]] &lt;br /&gt;
** [[File:Bpertussis mappfinderresults filtered cw20151213-Criterion1-GO.xlsx]]&lt;br /&gt;
* Sample MAPP file of a relevant biological pathway for your species (&amp;#039;&amp;#039;.mapp&amp;#039;&amp;#039;): [[File:Bpertussis ribosomepathway cw20151215.mapp]]&lt;br /&gt;
* [[Gene Database Project Deliverables#Group Report | Group Report]] describing the creation of the Gene Database and the biological analysis (&amp;#039;&amp;#039;.doc&amp;#039;&amp;#039;, &amp;#039;&amp;#039;.docx&amp;#039;&amp;#039;, or &amp;#039;&amp;#039;.pdf&amp;#039;&amp;#039;)&lt;br /&gt;
* PowerPoint presentation: [[File:Bpertussis findings powerpoint.pdf]]&lt;br /&gt;
&lt;br /&gt;
==Team Information &amp;amp; Links==&lt;br /&gt;
[[Template:Class Whoopers]]&lt;/div&gt;</summary>
		<author><name>Bklein7</name></author>	</entry>

	</feed>