Bklein7 Week 6
From LMU BioDB 2015
Revision as of 02:06, 12 October 2015 by Bklein7 (Talk | contribs) (Added notes on downloading and editing the zip file)
Contents
The FDA Drug Database
Acquiring Data From the Drugs@FDA ZIP File
Downloading and Viewing
- To download the zip file directly from the FDA website, I typed the following into the command line:
curl -O http://www.fda.gov/downloads/Drugs/InformationOnDrugs/UCM054599.zip
- Next, I unziped the FDA file
unzip UCM054599.zip
- The resulting output included the 'application.txt' and 'Product.txt' files. To view the format of these files, I used more commands. The output I received appeared to have listed columns which were not clearly defined. Therefore, I visualized the data using 'Notepad'. This made the columns easier to read.
- In 'Notepad', I got a better idea of how I needed to edit the txt files to import them into a table in 'pgAdminIII'.
Editing the txt files with sed
- application.txt
- To begin, I wanted to determine if there were any special characters in the text file such as ';' or that needed to be worked around. I did two searches in notepad for these characters and found that this was not the case.
- Next, I began the pipe with a cat command and then wanted to clean up the file to make it easier to read.
- I started with a sed command to remove excess spaces. Although this cleaned up the file, there were still spaces separating separate values. I deduced that these values were tabs.
- To figure out how to replace the tabs (from a tab delimited file) with commas, I did some google searching. I found a way to tackle this problem on electrictoolbox. Using this piece of information, I wrote a sed command to replace tabs with commas. Immediately the output looked more like what I wanted.
- Having separated
- I started the pipe with a cat command and a sed command to delete excess spaces and make the file easier to read.
- Next, I added a sed command to include the proper insert command at the beginning of each line.
Questions to Answer
- Provide the DDL (
create table
) statements that you used for your application and product tables. - Provide the
sed
command sequences that you used to convert the raw text files into sequences of SQLinsert
statements. - Using the command line, how can you determine the number of records in each file? Provide the command.
- Using SQL, how can you determine the number of records in the table corresponding to the file? Provide the SQL
select
statement. - In your database, are these numbers the same or different? Explain why you think so.
- For the remaining questions, supply (a) the answer to the question and (b) the SQL statement that you used to answer the question.
- What are the names of the drug products that are administered in the form
INJECTABLE;INTRAVENOUS, SUBCUTANEOUS
? - What are the names of the drug products whose active ingredient (activeingred) is
ATROPINE
? - In what forms and dosages can the drug product named
BENADRYL
be administered? - Which drug products have a name ending in
ESTROL
? - Produce a table listing all of the known values for the therapeutic_potential column in the application table and how many application records there are of each. (Side note: The therapeutic_potential codes are explained in the ReviewClass_Lookup.txt file, in case you’re interested.)
- Produce a table listing all of the known values for the chemical_type column in the application table and how many application records there are of each. (Side note: The chemical_type codes are explained in the ChemTypeLookup.txt file, in case you’re interested.)
- What are the names of the drug products that are sponsored (sponsor applicant column) by
MERCK
? - Which sponsor applicant companies have the text
LABS
in their names and have products whose active ingredients (activeingred) include bothASPIRIN
andCAFFEINE
?
Links
- User Page: Brandon Klein
- Team Page: The Class Whoopers
Assignments Pages
- Week 1 Assignment
- Week 2 Assignment
- Week 3 Assignment
- Week 4 Assignment
- Week 5 Assignment
- Week 6 Assignment
- Week 7 Assignment
- Week 8 Assignment
- Week 9 Assignment
- Week 10 Assignment
- Week 11 Assignment
- Week 12 Assignment
- No Week 13 Assignment
- Week 14 Assignment
- Week 15 Assignment
Individual Journal Entries
- Week 1 Individual Journal
- Week 2 Individual Journal
- Week 3 Individual Journal
- Week 4 Individual Journal
- Week 5 Individual Journal
- Week 6 Individual Journal
- Week 7 Individual Journal
- Week 8 Individual Journal
- Week 9 Individual Journal
- Week 10 Individual Journal
- Week 11 Individual Journal
- Week 12 Individual Journal
- No Week 13 Journal
- Week 14 Individual Journal
- Week 15 Individual Journal
- Week 1 Class Journal
- Week 2 Class Journal
- Week 3 Class Journal
- Week 4 Class Journal
- Week 5 Class Journal
- Week 6 Class Journal
- Week 7 Class Journal
- Week 8 Class Journal
- Week 9 Class Journal
- Week 10 Team Journal
- Week 11 Team Journal
- Week 12 Team Journal
- No Week 13 Journal
- Week 14 Team Journal
- Week 15 Team Journal