Nanguiano Week 6
From LMU BioDB 2015
Revision as of 22:35, 13 October 2015 by Nanguiano (Talk | contribs) (→The FDA Drug Database: added text analysis)
Contents
The FDA Drug Database
Direct Download/Unzipping Commands
- First, to begin the assignment, I ssh'd into my server.
ssh nanguiano@lion.lmu.edu
- Next, I entered my folder for this class and created a directory for this week's assignment, then entered the directory.
cd biodb mkdir week6 cd week6
- In the directory, I downloaded the file from the FDA.
curl -O http://www.fda.gov/downloads/Drugs/InformationOnDrugs/UCM054599.zip
- After the download completed, I unzipped the file.
unzip UCM054599.zip
- Next, I needed to move the application.txt and Product.txt files into my public_html folder. However, I first wanted to create a folder for the files. I entered my public_html folder and created a folder for this class, biodb.
cd ~/public_html mkdir biodb
- I noticed after making the folder that the "movie.sql.txt" file from a few classes ago was still there. I moved that file into the biodb folder, then returned to my week6 folder to begin the transfer of the files I needed to move.
mv movie.sql.txt biodb cd ~/biodb/week6
- I moved the files into the prepared folder in the public_html directory.
mv application.txt ~/public_html/biodb mv Product.txt ~/public_html/biodb
- Next, I needed to process the files to prepare them for entry into PostgreSQL. The files needed to be prepared with the insert commands so that they could be added into my database. Before doing this, i needed to ensure I knew the structure of the tables that I was going to enter the files into. To begin this process, I downloaded the two files and opened them in Excel so I could see what the header of each column was, and what type of data it contained.
- Opening the "application.txt" file showed me that it contained 9 columns: ApplNo (int), ApplType (char), SponsorApplicant (varchar), MostRecentLabelAvailableFlag (bool), CurrentPatentFlag (bool), ActionType (varchar), ChemicalType (int), Ther_Potential (char), and Orphan_Code (char).
- Opening the "Product.txt" file showed me that it also contained 9 columns: ApplNo (int), ProductNo (int), Form (varchar), Dosage (varchar), ProductMktStatus (int), TECode (varchar), ReferenceDrug (int), drugname (varchar), and activeingred (varchar).
Questions to Answer
- Provide the DDL (
create table
) statements that you used for your application and product tables. - Provide the
sed
command sequences that you used to convert the raw text files into sequences of SQLinsert
statements. - Using the command line, how can you determine the number of records in each file? Provide the command.
- Using SQL, how can you determine the number of records in the table corresponding to the file? Provide the SQL
select
statement. - In your database, are these numbers the same or different? Explain why you think so.
- For the remaining questions, supply (a) the answer to the question and (b) the SQL statement that you used to answer the question.
- What are the names of the drug products that are administered in the form
INJECTABLE;INTRAVENOUS, SUBCUTANEOUS
? - What are the names of the drug products whose active ingredient (activeingred) is
ATROPINE
? - In what forms and dosages can the drug product named
BENADRYL
be administered? - Which drug products have a name ending in
ESTROL
? - Produce a table listing all of the known values for the therapeutic_potential column in the application table and how many application records there are of each. (Side note: The therapeutic_potential codes are explained in the ReviewClass_Lookup.txt file, in case you’re interested.)
- Produce a table listing all of the known values for the chemical_type column in the application table and how many application records there are of each. (Side note: The chemical_type codes are explained in the ChemTypeLookup.txt file, in case you’re interested.)
- What are the names of the drug products that are sponsored (sponsor applicant column) by
MERCK
? - Which sponsor applicant companies have the text
LABS
in their names and have products whose active ingredients (activeingred) include bothASPIRIN
andCAFFEINE
?
Links
Nicole Anguiano
BIOL 367, Fall 2015
Assignment Links
- Week 1 Assignment
- Week 2 Assignment
- Week 3 Assignment
- Week 4 Assignment
- Week 5 Assignment
- Week 6 Assignment
- Week 7 Assignment
- Week 8 Assignment
- Week 9 Assignment
- Week 10 Assignment
- Week 11 Assignment
- Week 12 Assignment
- Week 14 Assignment
- Week 15 Assignment
Individual Journals
- Individual Journal Week 2
- Individual Journal Week 3
- Individual Journal Week 4
- Individual Journal Week 5
- Individual Journal Week 6
- Individual Journal Week 7
- Individual Journal Week 8
- Individual Journal Week 9
- Individual Journal Week 10
- Individual Journal Week 11
- Individual Assessment
- Deliverables