Gene Database Testing Report- cw20151203

From LMU BioDB 2015
Revision as of 19:54, 7 December 2015 by Bklein7 (edited language)

Files Asked for in the Gene Database Testing Report

For convenience, all of the files explicitly asked for in the sections below were compressed together in this file: [[]]

Pre-requisites

The following set of software was used in the creation and testing of the Bordetella pertussis gene database:

  1. 7-Zip tool for unpacking .gz and .zip files
  2. PostgreSQL on Windows (version 9.4.x)
  3. GenMAPP Builder
  4. Java JDK 1.8 64-bit
  5. GenMAPP 2
  6. XMLPipeDB match utility for counting IDs in XML files
  7. Microsoft Access for reading .mdb files

Gene Database Creation

Downloading Data Source Files and GenMAPP Builder

  • I downloaded the UniProt XML, GOA, and GO OBO-XML files for Bordetella pertussis along with the GenMAPP Builder program.
    • All files were saved to the folder Bklein7_CW\bpertussis_cw20151203 on my computer's ThawSpace.
    • Files that required extraction were unzipped using 7-zip.
    • Data files that remained in a folder after unzipping were removed from their folders to facilitate organization and command line processing.
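As a rough illustration only, the unzipping and folder-flattening steps above could also be scripted rather than done by hand in 7-Zip. The sketch below uses Python's standard library; the function name unpack_and_flatten and the paths passed to it are hypothetical and were not part of the original workflow.

```python
import gzip
import shutil
import zipfile
from pathlib import Path
from typing import List


def unpack_and_flatten(archive: Path, dest: Path) -> List[Path]:
    """Unpack a .zip or .gz archive so its data files sit directly in dest."""
    dest.mkdir(parents=True, exist_ok=True)
    extracted: List[Path] = []
    if archive.suffix == ".zip":
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                # Drop any folder prefix so the file lands directly in dest.
                target = dest / Path(info.filename).name
                with zf.open(info) as src, open(target, "wb") as out:
                    shutil.copyfileobj(src, out)
                extracted.append(target)
    elif archive.suffix == ".gz":
        # A .gz archive wraps a single file; the output name drops the .gz suffix.
        target = dest / archive.stem
        with gzip.open(archive, "rb") as src, open(target, "wb") as out:
            shutil.copyfileobj(src, out)
        extracted.append(target)
    else:
        raise ValueError(f"unsupported archive type: {archive.name}")
    return extracted
```

Calling unpack_and_flatten on each downloaded archive with the working folder as dest would leave every data file at the top level of that folder, which is the layout the later import steps expect. Files with the same name in different subfolders would overwrite one another, so this sketch assumes the names are unique.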

UniProt XML

GOA

GO OBO-XML

Downloaded GenMAPP Builder

  1. I downloaded the custom version of GenMAPP Builder, which includes the Bordetella pertussis custom class expanded to include ORF listings in exports (Version 3.0.0 Build 5 - cw20151203): File:Dist cw20151203.zip.
  2. I extracted the GenMAPP Builder folder using 7-zip.

Creating the New Database in PostgreSQL

  • I launched pgAdmin III and connected to the PostgreSQL 9.4 server (localhost:5432).
    • On this server, I created a new database: bpertussis_cw20151201_gmb3build5.
    • I opened the SQL Editor tab to use an XMLPipeDB query to create the tables in the database.
      • I clicked on the Open File icon and selected the file gmbuilder.sql. This imported a series of SQL commands into the editor tab.
      • I clicked on the Execute Query icon to run these commands.
      • In viewing the schema for this database, I confirmed that it contained 167 tables after the commands had run.
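One possible cross-check on the table count, sketched here under an assumption: if gmbuilder.sql defines each table with its own CREATE TABLE statement (not verified against the actual file), counting those statements should give the same number seen in the schema view. The function name below is hypothetical.

```python
import re


def count_create_table_statements(sql_text: str) -> int:
    """Count lines that begin a CREATE TABLE statement (case-insensitive)."""
    pattern = re.compile(r"^\s*create\s+table\b", re.IGNORECASE | re.MULTILINE)
    return len(pattern.findall(sql_text))


# Small made-up sample, not the contents of gmbuilder.sql.
sample_sql = """
CREATE TABLE a (id integer);
create table b (id integer);
INSERT INTO a VALUES (1);
"""
print(count_create_table_statements(sample_sql))  # prints 2
```

To apply it to the real script, the file's text would be read first, for example with Path("gmbuilder.sql").read_text(), and the result compared against the 167 tables reported above. Statements that appear inside SQL comments would also be counted, so a mismatch would need a manual look.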

Configuring GenMAPP Builder to Connect to the PostgreSQL Database

  • To begin, I launched gmbuilder.bat.
  • I selected the "Configure Database" option and entered the following information:
    • Host or address: localhost
    • Port number: 5432
    • Database name: bpertussis_cw20151201_gmb3build5
    • Username: postgres
    • Password: Welcome1
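For orientation, the host, port, and database name entered above correspond to the usual PostgreSQL JDBC connection URL form, jdbc:postgresql://host:port/database. How GenMAPP Builder assembles its connection internally is not described in this report, so the sketch below is only an assumption-based illustration, and the function name is hypothetical.

```python
def jdbc_postgres_url(host: str, port: int, database: str) -> str:
    """Build a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Values taken from the configuration fields listed above.
url = jdbc_postgres_url("localhost", 5432, "bpertussis_cw20151201_gmb3build5")
print(url)  # prints jdbc:postgresql://localhost:5432/bpertussis_cw20151201_gmb3build5
```

The username and password are supplied separately from the URL, which is why they do not appear in it.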

Importing Data into the PostgreSQL Database

  • The downloaded data files for Bordetella pertussis were imported into the database through the following menu actions:
    • I selected File > Import UniProt XML...
    • I selected File > Import GO OBO-XML...
    • I clicked OK on the message asking to process the GO data.
    • I selected File > Import GOA...

Exporting a GenMAPP Gene Database (.gdb)

  • I selected File > Export to GenMAPP Gene Database... to begin the export process.
  • I typed my name in the owner field (Brandon Klein).
  • I selected "Bordetella pertussis (strain Tohama I/ATCC BAA-589/NCTC 13251), Taxon ID 257313" as the gene database species and then clicked Next.
  • The database was saved as bpertussis-std_cw20151203.
  • I checked the boxes for exporting all Molecular Function, Cellular Component, and Biological Process Gene Ontology Terms.
  • Finally, I clicked the "Next" button to begin the export process.

Gene Database Testing Report

Export Information

Version of GenMAPP Builder: Version 3.0.0 Build 5 - cw20151203

Computer on which export was run: Seaver 120- Last computer on the right in the row farthest from the front of the room

Postgres Database name: bpertussis_cw20151201_gmb3build5

UniProt XML filename: File:Uniprot-proteome-UP000002676 cw20151201.zip

GO OBO-XML filename: File:Go daily-termdb cw20151201.zip

  • GO OBO-XML version (The version information was found in the file properties): Last Modified- December 01, 2015, 2:21:31 AM
  • GO OBO-XML download link: Gene Ontology legacy download page
  • Time taken to import: 7.08 minutes
  • Time taken to process: 4.42 minutes
    • Note: The import and processing times were similar to those for the previous "Bordetella pertussis" gene database: bpertussis-std_cw20151119.gdb (6.99 minutes and 4.48 minutes respectively). No interruptions occurred during these processes.
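To make the "similar" comparison in the note concrete, the decimal-minute times can be converted to a difference in seconds. This is a small arithmetic sketch using the four values quoted above; the function name is hypothetical.

```python
def difference_in_seconds(current_minutes: float, previous_minutes: float) -> float:
    """Return current minus previous, converted from minutes to seconds."""
    return round((current_minutes - previous_minutes) * 60, 1)


# Import: 7.08 min (this export) vs 6.99 min (bpertussis-std_cw20151119.gdb).
import_diff = difference_in_seconds(7.08, 6.99)
# Processing: 4.42 min (this export) vs 4.48 min (previous).
process_diff = difference_in_seconds(4.42, 4.48)
print(import_diff, process_diff)  # prints 5.4 -3.6
```

In other words, the import ran about 5 seconds longer and the processing about 4 seconds shorter than in the previous build.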

GOA filename: File:145.B pertussis ATCC BAA-589 cw20151201.zip

  • GOA version (found in the Last modified field on the FTP site): Last Modified- 11/10/15 1:39:00 PM
  • GOA download link: for Bordetella pertussis strain Tohama I
  • Time taken to import: 0.04 minutes
    • Note: The import time was equal to that of the previous "Bordetella pertussis" gene database: bpertussis-std_cw20151119.gdb. No interruptions occurred during this process.

Name of .gdb file: File:Bpertussis-std cw20151203.zip

  • Time taken to export:
    • Start time: 4:02 PM
    • End time: 4:56 PM
    • Elapsed time: 54 minutes
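The elapsed time follows directly from the start and end times. A minimal sketch of that arithmetic, assuming both times fall on the same day (the function name is hypothetical):

```python
from datetime import datetime


def elapsed_minutes(start: str, end: str) -> int:
    """Return whole minutes between two same-day clock times like '4:02 PM'."""
    fmt = "%I:%M %p"  # 12-hour clock with AM/PM
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


print(elapsed_minutes("4:02 PM", "4:56 PM"))  # prints 54
```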

Note: No interruptions occurred during the export process.