Difference between revisions of "Jwoodlee Week 14"

From LMU BioDB 2015
(added some procedure)
 
(Milestone 3: Species Profile Creation: added non customized procedure)
Line 1: Line 1:
 
=== Milestone 3: Species Profile Creation ===
 
 +
==== Adding a Species Profile to GenMAPP Builder ====
 +
 +
All of this work happens in the ''Java'' perspective, so switch to that first if you’re not already there.
 +
 +
====== Create the Species Profile ======
 +
 +
# Expose the contents of the ''src'' folder.
 +
# Right-click on the ''edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles'' package and choose '''New > Class''' from the popup menu.
 +
# In the dialog that appears, enter the following:
 +
#* '''Name:''' <code>''name-of-your-species-without-spaces''UniProtSpeciesProfile</code> (in ''camel case'': no spaces, capitalizing the first letters of each word)
 +
#* '''Superclass:''' <code>edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtSpeciesProfile</code> (you can also click on ''Browse...'' to navigate to this if you don’t feel like typing)
 +
# Click ''Finish''. There should now be a new ''.java'' file (the one you just created) within the ''edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles'' package.
 +
 +
====== Customize the Species Profile ======
 +
 +
* Open the file that you have just created. It should appear in the editor area of Eclipse.
 +
* Supply the name of the species and the description of the profile: add the following constructor right below the ''public class'' line in the new file. Remember to customize it for your particular species; the portions that need to be customized are marked with asterisks.
 +
public ***NameOfYourSpecies***UniProtSpeciesProfile() {
 +
    super("***Genus species***",
 +
        ***taxonIDOfYourSpecies***,
 +
        "This profile customizes the GenMAPP Builder export for " +
 +
            "***Genus species***" +
 +
            " data loaded from a UniProt XML file.");
 +
}
 +
* To set the species name in the OrderedLocusNames record of the Systems table, as well as a link query for that same record, add the following method right below the constructor that you added above.  Again, the information to customize is marked with asterisks.
 +
@Override
 +
public TableManager getSystemsTableManagerCustomizations(TableManager tableManager, DatabaseProfile dbProfile) {
 +
    super.getSystemsTableManagerCustomizations(tableManager, dbProfile);
 +
    tableManager.submit("Systems", QueryType.update, new String[][] {
 +
        { "SystemCode", "N" },
 +
        { "Species", "|" + getSpeciesName() + "|" }
 +
    });
 +
 +
    tableManager.submit("Systems", QueryType.update, new String[][] {
 +
        { "SystemCode", "N" },
 +
        { "Link", "***species-specific-database-link***" }
 +
    });
 +
 +
    return tableManager;
 +
}
 +
* Note the '''species-specific-database-link''' placeholder above. This is a species-specific URL that returns a web page describing a gene for that species. It should look like a standard URL, with the tilde ('''~''') standing in for the gene ID. For example, the link for ''Vibrio cholerae'' is <code>http://bacteria.ensembl.org/Multi/Search/Results?species=all;idx=;q=~;site=ensemblunit</code>. The link for ''Plasmodium falciparum'' is <code>http://plasmodb.org/plasmo/showRecord.do?name=GeneRecordClasses.GeneRecordClass&project_id=PlasmoDB&source_id=~</code>. Work with your GenMAPP User and/or QA to determine the appropriate URL for your species.
 +
* Your code may show a red error badge at this point. Assuming you typed everything exactly as shown, choosing ''Organize Imports'' from the ''Source'' menu should fix it. If the badge persists, check your code against the blocks above for typos.
 +
* Save the file and see if these changes worked (see below).
 +
 +
Additional customization, particularly with regard to the exported data, will depend on the species. Communicate with your QA to see if additional customization is needed. If the additional customization is not too complicated, you might be able to do the work yourself with some instructions. However, if the customization is too difficult, Dr. Dionisio will probably be the one to do the work.
 +
 +
====== Add the Species Profile to the Catalog of Known Species Profiles ======
 +
 +
The last step involves actually making GenMAPP Builder ''know'' that your new species profile exists. This involves a change in an existing file:
 +
* Under ''edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles'', open ''UniProtDatabaseProfile.java''.
 +
* Near the top of the file is a block that looks like this:
 +
super("org.uniprot.uniprot.Uniprot",
 +
    "This profile defines the requirements "
 +
        + "for any UniProt centric gene database.",
 +
    new SpeciesProfile[] {
 +
    new EscherichiaColiUniProtSpeciesProfile(),
 +
    new ArabidopsisThalianaUniProtSpeciesProfile(),
 +
    new PlasmodiumFalciparumUniProtSpeciesProfile(),
 +
    new VibrioCholeraeUniprotSpeciesProfile() });
 +
* What you want to do is add the species profile that you just created to this block. If your species profile is called ''MySpecialUniProtSpeciesProfile'', your modified code should look like this:
 +
super("org.uniprot.uniprot.Uniprot",
 +
    "This profile defines the requirements "
 +
        + "for any UniProt centric gene database.",
 +
    new SpeciesProfile[] {
 +
    new EscherichiaColiUniProtSpeciesProfile(),
 +
    new ArabidopsisThalianaUniProtSpeciesProfile(),
 +
    new PlasmodiumFalciparumUniProtSpeciesProfile(),
 +
    new VibrioCholeraeUniprotSpeciesProfile(),
 +
    new MySpecialUniProtSpeciesProfile() });
 +
* Essentially, you need to add an item to the comma-separated list, beginning with ''new'', followed by the species profile name, finally followed by ''()''.
 +
* Save your changes, run ''Organize Imports'' (in the ''Source'' menu) to eliminate any red errors, and try a test build!
 +
 +
====== Build, Test, and Possibly Commit ======
 +
 +
# Create a new distribution of GenMAPP Builder by following the steps in [[#Creating a Distribution|Creating a Distribution]].
 +
# Perform a new export run with this version of GenMAPP Builder (you can skip the import steps and use the same PostgreSQL database if it’s available).
 +
# Check the ''Systems'' table in the resulting ''.gdb'' to see if it contains the custom information:
 +
#* Open the ''.gdb'' in Microsoft Access, then open the ''Systems'' table.
 +
#* Look for the record for ''OrderedLocusNames''. Your species name should appear under the ''Species'' column and your link URL should appear under the ''Link'' column.
 +
# If all goes well, commit your code as described in [[#Updating and Committing Code|Updating and Committing Code]].  You have now officially contributed to the XMLPipeDB project ''':)'''
 +
 +
 +
  
 
Follow the instructions in the [[#Adding a Species Profile to GenMAPP Builder|Adding a Species Profile to GenMAPP Builder]] section of this wiki page in order to:
 

Revision as of 22:41, 3 December 2015

Milestone 3: Species Profile Creation

Adding a Species Profile to GenMAPP Builder

All of this work happens in the Java perspective, so switch to that first if you’re not already there.

Create the Species Profile
  1. Expose the contents of the src folder.
  2. Right-click on the edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles package and choose New > Class from the popup menu.
  3. In the dialog that appears, enter the following:
    • Name: name-of-your-species-without-spacesUniProtSpeciesProfile (in camel case: no spaces, capitalizing the first letters of each word)
    • Superclass: edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtSpeciesProfile (you can also click on Browse... to navigate to this if you don’t feel like typing)
  4. Click Finish. There should now be a new .java file (the one you just created) within the edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles package.
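
The naming rule in step 3 can be sketched in a few lines of Java. This is only an illustration of the camel-case convention, not part of GenMAPP Builder: the class ProfileNameSketch and the method toProfileClassName are hypothetical names invented for this example.

```java
import java.util.Locale;

// Hypothetical helper, not part of XMLPipeDB: illustrates the naming convention only.
class ProfileNameSketch {

    /** Builds "<GenusSpecies>UniProtSpeciesProfile" from a species name such as "Genus species". */
    static String toProfileClassName(String speciesName) {
        StringBuilder sb = new StringBuilder();
        for (String word : speciesName.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is blank
            }
            // Capitalize the first letter of each word and drop the spaces.
            sb.append(Character.toUpperCase(word.charAt(0)));
            sb.append(word.substring(1).toLowerCase(Locale.ROOT));
        }
        return sb.append("UniProtSpeciesProfile").toString();
    }

    public static void main(String[] args) {
        System.out.println(toProfileClassName("Plasmodium falciparum"));
    }
}
```

For "Plasmodium falciparum" this yields PlasmodiumFalciparumUniProtSpeciesProfile, which matches the existing profile of that name in the catalog.
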
Customize the Species Profile
  • Open the file that you have just created. It should appear in the editor area of Eclipse.
  • Supply the name of the species and the description of the profile: add the following constructor right below the public class line in the new file. Remember to customize it for your particular species; the portions that need to be customized are marked with asterisks.
public ***NameOfYourSpecies***UniProtSpeciesProfile() {
    super("***Genus species***",
        ***taxonIDOfYourSpecies***,
        "This profile customizes the GenMAPP Builder export for " +
            "***Genus species***" +
            " data loaded from a UniProt XML file.");
}
  • To set the species name in the OrderedLocusNames record of the Systems table, as well as a link query for that same record, add the following method right below the constructor that you added above. Again, the information to customize is marked with asterisks.
@Override
public TableManager getSystemsTableManagerCustomizations(TableManager tableManager, DatabaseProfile dbProfile) {
    super.getSystemsTableManagerCustomizations(tableManager, dbProfile);
    tableManager.submit("Systems", QueryType.update, new String[][] {
        { "SystemCode", "N" },
        { "Species", "|" + getSpeciesName() + "|" }
    });

    tableManager.submit("Systems", QueryType.update, new String[][] {
        { "SystemCode", "N" },
        { "Link", "***species-specific-database-link***" }
    });

    return tableManager;
}
  • Note the species-specific-database-link placeholder above. This is a species-specific URL that returns a web page describing a gene for that species. It should look like a standard URL, with the tilde (~) standing in for the gene ID. For example, the link for Vibrio cholerae is http://bacteria.ensembl.org/Multi/Search/Results?species=all;idx=;q=~;site=ensemblunit. The link for Plasmodium falciparum is http://plasmodb.org/plasmo/showRecord.do?name=GeneRecordClasses.GeneRecordClass&project_id=PlasmoDB&source_id=~. Work with your GenMAPP User and/or QA to determine the appropriate URL for your species.
  • Your code may show a red error badge at this point. Assuming you typed everything exactly as shown, choosing Organize Imports from the Source menu should fix it. If the badge persists, check your code against the blocks above for typos.
  • Save the file and see if these changes worked (see below).

Additional customization, particularly with regard to the exported data, will depend on the species. Communicate with your QA to see if additional customization is needed. If the additional customization is not too complicated, you might be able to do the work yourself with some instructions. However, if the customization is too difficult, Dr. Dionisio will probably be the one to do the work.

Add the Species Profile to the Catalog of Known Species Profiles

The last step involves actually making GenMAPP Builder know that your new species profile exists. This involves a change in an existing file:

  • Under edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles, open UniProtDatabaseProfile.java.
  • Near the top of the file is a block that looks like this:
super("org.uniprot.uniprot.Uniprot",
    "This profile defines the requirements "
        + "for any UniProt centric gene database.",
    new SpeciesProfile[] {
    new EscherichiaColiUniProtSpeciesProfile(),
    new ArabidopsisThalianaUniProtSpeciesProfile(),
    new PlasmodiumFalciparumUniProtSpeciesProfile(),
    new VibrioCholeraeUniprotSpeciesProfile() });
  • What you want to do is add the species profile that you just created to this block. If your species profile is called MySpecialUniProtSpeciesProfile, your modified code should look like this:
super("org.uniprot.uniprot.Uniprot",
    "This profile defines the requirements "
        + "for any UniProt centric gene database.",
    new SpeciesProfile[] {
    new EscherichiaColiUniProtSpeciesProfile(),
    new ArabidopsisThalianaUniProtSpeciesProfile(),
    new PlasmodiumFalciparumUniProtSpeciesProfile(),
    new VibrioCholeraeUniprotSpeciesProfile(),
    new MySpecialUniProtSpeciesProfile() });
  • Essentially, you need to add an item to the comma-separated list, beginning with new, followed by the species profile name, finally followed by ().
  • Save your changes, run Organize Imports (in the Source menu) to eliminate any red errors, and try a test build!
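
For readers unfamiliar with Java array initializers, the pattern used in that block can be tried in isolation. The sketch below uses stand-in classes (StubProfile and StubCatalog are hypothetical and are not the XMLPipeDB types) purely to show how one more new ...() entry extends the comma-separated list.

```java
import java.util.ArrayList;
import java.util.List;

class CatalogSketch {

    /** Hypothetical stand-in for a species profile; not the XMLPipeDB class. */
    static class StubProfile {
        final String className;

        StubProfile(String className) {
            this.className = className;
        }
    }

    /** Hypothetical stand-in for the object that receives the array of profiles. */
    static class StubCatalog {
        final StubProfile[] profiles;

        StubCatalog(StubProfile[] profiles) {
            this.profiles = profiles;
        }

        List<String> names() {
            List<String> result = new ArrayList<>();
            for (StubProfile profile : profiles) {
                result.add(profile.className);
            }
            return result;
        }
    }

    static StubCatalog buildCatalog() {
        // Each entry is "new", the class name, then "()" (here with a label argument
        // because the stand-in needs one); entries are separated by commas.
        return new StubCatalog(new StubProfile[] {
            new StubProfile("EscherichiaColiUniProtSpeciesProfile"),
            new StubProfile("ArabidopsisThalianaUniProtSpeciesProfile"),
            new StubProfile("PlasmodiumFalciparumUniProtSpeciesProfile"),
            new StubProfile("VibrioCholeraeUniprotSpeciesProfile"),
            new StubProfile("MySpecialUniProtSpeciesProfile") // the newly added entry
        });
    }

    public static void main(String[] args) {
        System.out.println(buildCatalog().names());
    }
}
```

The point to notice is that the previous last entry gains a trailing comma, and the closing }); moves to the new last entry.
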
Build, Test, and Possibly Commit
  1. Create a new distribution of GenMAPP Builder by following the steps in Creating a Distribution.
  2. Perform a new export run with this version of GenMAPP Builder (you can skip the import steps and use the same PostgreSQL database if it’s available).
  3. Check the Systems table in the resulting .gdb to see if it contains the custom information:
    • Open the .gdb in Microsoft Access, then open the Systems table.
    • Look for the record for OrderedLocusNames. Your species name should appear under the Species column and your link URL should appear under the Link column.
  4. If all goes well, commit your code as described in Updating and Committing Code. You have now officially contributed to the XMLPipeDB project :)
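
The manual check in step 3 amounts to two simple conditions, which can be written down as a small sketch. It assumes the Species value is stored wrapped in pipes (as the "|" + getSpeciesName() + "|" expression in the customization method suggests) and that a usable link starts with http and contains a tilde; the class and method names are hypothetical.

```java
// Hypothetical checks mirroring what to look for in the OrderedLocusNames record.
class SystemsRowCheckSketch {

    /** True if the Species column holds the species name wrapped in pipes. */
    static boolean speciesLooksRight(String speciesColumn, String expectedSpecies) {
        return ("|" + expectedSpecies + "|").equals(speciesColumn);
    }

    /** True if the Link column looks like a URL with a tilde placeholder for the gene ID. */
    static boolean linkLooksRight(String linkColumn) {
        return linkColumn != null
                && linkColumn.startsWith("http")
                && linkColumn.contains("~");
    }

    public static void main(String[] args) {
        System.out.println(speciesLooksRight("|Vibrio cholerae|", "Vibrio cholerae"));
        System.out.println(linkLooksRight(
                "http://bacteria.ensembl.org/Multi/Search/Results?species=all;idx=;q=~;site=ensemblunit"));
    }
}
```

If either condition fails for the values you see in Access, revisit the customization method before committing.
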



Follow the instructions in the Adding a Species Profile to GenMAPP Builder section of this wiki page in order to:

  • Add a species profile to the GenMAPP Builder code base.
  • Customize the species profile with the species name in the OrderedLocusNames record of the Systems table.
  • Customize the Link field in the OrderedLocusNames record of the Systems table to hold a URL query with ~ standing in for the gene ID.
    • The URL needs to be determined first, in consultation with QA.
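
To make the role of the tilde concrete, here is a minimal sketch of how a link template turns into a gene-specific URL. How GenMAPP itself performs the substitution is not described on this page, so plain string replacement is an assumption; the class LinkTemplateSketch, the method toGeneUrl, and the gene ID VC0001 are used for illustration only.

```java
// Hypothetical illustration of the "~ stands in for the gene ID" convention.
class LinkTemplateSketch {

    /** Replaces the tilde placeholder in a link template with the given gene ID. */
    static String toGeneUrl(String linkTemplate, String geneId) {
        if (!linkTemplate.contains("~")) {
            throw new IllegalArgumentException("Link template has no ~ placeholder");
        }
        return linkTemplate.replace("~", geneId);
    }

    public static void main(String[] args) {
        String template =
                "http://bacteria.ensembl.org/Multi/Search/Results?species=all;idx=;q=~;site=ensemblunit";
        System.out.println(toGeneUrl(template, "VC0001"));
    }
}
```

Pasting the resulting URL into a browser is a quick way to confirm, together with QA, that the template really leads to a gene page for your species.
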

Milestone 4: Species Export Customization

  1. Based on observations from the GenMAPP User and QA, determine and document (as thoroughly as possible) any other changes to export behavior that GenMAPP Builder needs for this species.
  2. Implement this export behavior.
  3. As needed, commit and push your work to your GitHub branch.
  4. Additional milestones will depend on how the rest of the project goes, and the bugs/features generated by that work.
  5. Document/log all work done, problems encountered, and how they were resolved.
  6. When your work is complete, issue a GitHub pull request to merge your branch into the main development line.