Difference between revisions of "Malverso Week 6"
(fixed error in column name) |
(answered #13) |
||
(17 intermediate revisions by the same user not shown) | |||
Line 22: | Line 22: | ||
*I created a table for Product.txt using this command: | *I created a table for Product.txt using this command: | ||
− | create table Product(ApplNo varchar, productNo varchar, Form varchar, Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug | + | create table Product(ApplNo varchar, productNo varchar, Form varchar, Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, Drugname varchar, Activeingred varchar) |
+ | *I couldn't remeber how to declare a foreign key when I was making the table, and ended up adding it afterwards in with the help of the tutorial found on http://www.w3schools.com/sql/sql_foreignkey.asp . | ||
+ | Here are the additional SQL commands I used: | ||
+ | ALTER TABLE Product | ||
+ | ADD FOREIGN KEY (ApplNo) | ||
+ | REFERENCES application(ApplNo) | ||
+ | |||
===#2=== | ===#2=== | ||
Line 35: | Line 41: | ||
*I changed the previous sed command that added the ' to the end of the line to add '); to finish off the SQL command. | *I changed the previous sed command that added the ' to the end of the line to add '); to finish off the SQL command. | ||
− | Here is my final | + | Here is my final command sequence (with line breaks for readability): |
cat application.txt | sed "1D" | sed "s/\t/','/g" | sed "s/ *'/'/g" | sed "s/^/'/g" | sed "s/\r$/');/g" | cat application.txt | sed "1D" | sed "s/\t/','/g" | sed "s/ *'/'/g" | sed "s/^/'/g" | sed "s/\r$/');/g" | ||
− | | sed "s/''/null/g" | sed "s/^/insert into application(ApplNo, ApplType, SponsorApplicant, | + | | sed "s/<nowiki>''</nowiki>/null/g" | sed "s/^/insert into application(ApplNo, ApplType, SponsorApplicant, |
MostRecentLabelAvailableFlag, CurrentPatentFlag, ActionType, Chemical_Type, Therapeutic_Potential, Orphan_Code) | MostRecentLabelAvailableFlag, CurrentPatentFlag, ActionType, Chemical_Type, Therapeutic_Potential, Orphan_Code) | ||
values (/g" > ~/public_html/application.sql.txt | values (/g" > ~/public_html/application.sql.txt | ||
+ | The final output SQL command sequence can be found at http://my.cs.lmu.edu/~malverso/application.sql.txt . | ||
+ | I copy and pasted this into the SQL editor to fill up my table. | ||
+ | ====Product.txt==== | ||
+ | *First I removed the first line, using sed command "1D". | ||
+ | *Since there are single quotes in this .txt file, I decided to replace them with double quotes using the sed command "s/'/"/g", which didn't work for reasons which now seem obvious, so I changed my sed command to "s/'/\"/g" which worked. | ||
+ | *Noting that there weren't any extra spaces, I got rid of the tabs by using the sed command "s/\t/','". I Then added the extra characters to the end of the line, using the sed command "s/\r$/');/g" which includes the ending of the SQL code. | ||
+ | *I used the same sed command as before to insert the null values : "s/''/null/g". | ||
+ | *Next I added the SQL command to the beginning. | ||
+ | Here is the final command sequence (with line breaks for readability): | ||
+ | cat Product.txt | sed "1D" | sed "s/'/\"/g" | sed "s/\t/','/g"| sed "s/\r$/');/g" | ||
+ | | sed "s/<nowiki>''</nowiki>/null/g" | sed "s/^/insert into Product(ApplNo, ProductNo, Form, Dosage, ProductMktStatus, | ||
+ | TECode, ReferenceDrug, drugname, activeingred) values ('/g" > ~/public_html/Product.sql.txt | ||
+ | The final output SQL command sequence can be found at http://my.cs.lmu.edu/~malverso/Product.sql.txt . | ||
+ | I copy and pasted this into the SQL editor to fill up my table. | ||
+ | |||
+ | ===#3=== | ||
+ | |||
+ | Here is the command that produces the number of records in each file on the command line (if added to the end of my command sequence found in the answer to #2, after removing the > ~/public_html/Product.sql.txt or > ~/public_html/application.sql.txt): | ||
+ | |||
+ | wc | ||
+ | |||
+ | The correct number is the first one, because it is the number of lines in the file which is equal to the number of records. | ||
+ | |||
+ | ===#4=== | ||
+ | |||
+ | Here are the SQL commands that produce the number of records in each file: | ||
+ | |||
+ | select count(*) from Product | ||
+ | select count(*) from application | ||
+ | |||
+ | ===#5=== | ||
+ | |||
+ | These numbers are the same because the number of lines in the text file is equal to the number of insert statements, each statement inserting one record into the table. | ||
+ | |||
+ | ===#6=== | ||
+ | *Form is found in the Product table. | ||
+ | ====a.==== | ||
+ | Name of Drugs: | ||
+ | LOVENOX | ||
+ | VELCADE | ||
+ | VIDAZA | ||
+ | ENOXAPARIN SODIUM | ||
+ | ACTEMRA | ||
+ | AZACITIDINE | ||
+ | |||
+ | ====b.==== | ||
+ | *I looked at my notes to see how to specialize select statements. | ||
+ | SQL statement: | ||
+ | Select drugname from Product where Form = 'INJECTABLE;INTRAVENOUS, SUBCUTANEOUS' | ||
+ | |||
+ | ===#7=== | ||
+ | *activeingred is found in the Product table. | ||
+ | ====a.==== | ||
+ | *There are four drugs with the name ATROPEN. | ||
+ | Name of Drugs: | ||
+ | ATROPEN | ||
+ | ATROPINE | ||
+ | ====b.==== | ||
+ | SQL statement: | ||
+ | Select drugname from Product where activeingred = 'ATROPINE' | ||
+ | |||
+ | ===#8=== | ||
+ | *Form and dosage are found in the Product table. | ||
+ | ====a.==== | ||
+ | ''Form, Dosage | ||
+ | CAPSULE;ORAL, 50MG | ||
+ | ELIXIR;ORAL, 12.5MG/5ML | ||
+ | CAPSULE;ORAL, 25MG | ||
+ | INJECTABLE;INJECTION, 10MG/ML | ||
+ | INJECTABLE;INJECTION, 50MG/ML | ||
+ | |||
+ | ====b.==== | ||
+ | SQL statement: | ||
+ | Select form,dosage from Product where drugname = 'BENADRYL' | ||
+ | |||
+ | ===#9=== | ||
+ | *Drugname is found in the Product table. | ||
+ | *I looked through my notes to see that we can use grep and sed commands within SQL. | ||
+ | ====a.==== | ||
+ | Name of Drugs: | ||
+ | DIETHYLSTILBESTROL // Eleven results with this drug name | ||
+ | STILBESTROL // Twelve results with this drug name | ||
+ | DIENESTROL // One result with this drug name | ||
+ | |||
+ | ====b.==== | ||
+ | SQL statement: | ||
+ | select drugname from Product where drugname ~'ESTROL$' | ||
+ | |||
+ | ===#10=== | ||
+ | *I referred to my notes to remember how to use the group by statement when counting. | ||
+ | *I wasn't quite sure how we were supposed to format the table within our wiki. | ||
+ | ====a.==== | ||
+ | *I removed the count of the Therapeutic_Potential when the value is null. | ||
+ | ''Therapeutic_Potential; Count | ||
+ | S*; 3 | ||
+ | S; 3460 | ||
+ | P; 928 | ||
+ | P*; 26 | ||
+ | |||
+ | ====b.==== | ||
+ | SQL statement: | ||
+ | select therapeutic_potential,count(*) from application group by therapeutic_potential | ||
+ | |||
+ | ===#11=== | ||
+ | ====a.==== | ||
+ | *I removed the count of the chemical_type when the value is null. | ||
+ | ''Chemical_Type; Count | ||
+ | 8; 27 | ||
+ | 9; 1 | ||
+ | 2; 139 | ||
+ | 23; 8 | ||
+ | 34; 9 | ||
+ | 5; 1047 | ||
+ | 3; 1464 | ||
+ | 1; 1272 | ||
+ | 7; 35 | ||
+ | 24; 8 | ||
+ | 14; 34 | ||
+ | 4; 323 | ||
+ | 6; 87 | ||
+ | 10; 2 | ||
+ | |||
+ | ====b.==== | ||
+ | SQL statement: | ||
+ | select chemical_type,count(*) from application group by chemical_type | ||
+ | |||
+ | ===#12=== | ||
+ | *Sponsor applicant is found in the application table, but drugname is found in the product table. Therefore, this exercise will make use of the foreign key. | ||
+ | ====a.==== | ||
+ | Name of Drug: | ||
+ | DECADRON | ||
+ | HUMORSOL | ||
+ | NEO-HYDELTRASOL | ||
+ | PRINIVIL | ||
+ | MAXALT-MLT | ||
+ | HYDROCORTONE | ||
+ | PERIACTIN | ||
+ | PROPECIA | ||
+ | PROSCAR | ||
+ | CLINORIL | ||
+ | PRINZIDE | ||
+ | ELSPAR | ||
+ | ALDOMET | ||
+ | ALDORIL D30 | ||
+ | EMEND | ||
+ | DIUPRES-500 | ||
+ | NEODECADRON | ||
+ | ALDOCLOR-250 | ||
+ | MEFOXIN IN DEXTROSE 5% IN PLASTIC CONTAINER | ||
+ | ARAMINE | ||
+ | TIAMATE | ||
+ | BLOCADREN | ||
+ | CANCIDAS | ||
+ | CHIBROXIN | ||
+ | CORTONE | ||
+ | PEPCID | ||
+ | TRUSOPT | ||
+ | REDISOL | ||
+ | VIOXX | ||
+ | FLOROPRYL | ||
+ | ALPHAREDISOL | ||
+ | DECADRON-LA | ||
+ | MEFOXIN IN SODIUM CHLORIDE 0.9% IN PLASTIC CONTAINER | ||
+ | PEPCID RPD | ||
+ | NOROXIN | ||
+ | ZOCOR | ||
+ | LERITINE | ||
+ | DECADERM | ||
+ | DOLOBID | ||
+ | MANNITOL 25% | ||
+ | DECADRON W/ XYLOCAINE | ||
+ | ALDORIL D50 | ||
+ | TIMOLIDE 10-25 | ||
+ | CYCLAINE | ||
+ | HYDROPRES 25 | ||
+ | AMINOHIPPURATE SODIUM | ||
+ | MEVACOR | ||
+ | MODURETIC 5-50 | ||
+ | ALDORIL 25 | ||
+ | SINGULAIR | ||
+ | COLBENEMID | ||
+ | DIUPRES-250 | ||
+ | HYDELTRA-TBA | ||
+ | PRIMAXIN | ||
+ | BENEMID | ||
+ | MAXALT | ||
+ | FOSAMAX PLUS D | ||
+ | HYDRODIURIL | ||
+ | HYDELTRASOL | ||
+ | HYDROPRES 50 | ||
+ | ZOLINZA | ||
+ | ALDORIL 15 | ||
+ | FOSAMAX | ||
+ | DECASPRAY | ||
+ | COGENTIN | ||
+ | ALDOCLOR-150 | ||
+ | PEPCID PRESERVATIVE FREE | ||
+ | |||
+ | ====b.==== | ||
+ | * First I tried select drugname from product where Product.applno = application.applno and application.sponsorapplicant = 'MERCK', but it threw an error message. I realized I needed to select from both tables. | ||
+ | |||
+ | SQL statement: | ||
+ | select drugname from product,application where Product.applno = application.applno and application.sponsorapplicant = 'MERCK' group by drugname | ||
+ | |||
+ | ===#13=== | ||
+ | *Sponsor applicant is found in the application table and activeingred is found in the product table. | ||
+ | ====a.==== | ||
+ | Sponsor Applicants: | ||
+ | ACTAVIS LABS UT INC | ||
+ | WATSON LABS | ||
+ | |||
+ | ====b.==== | ||
+ | SQL statement: | ||
+ | select sponsorapplicant from product,application where application.applno = product.applno and sponsorapplicant ~ 'LABS' and activeingred ~ 'ASPIRIN' and activeingred ~ 'CAFFEINE' group by sponsorapplicant | ||
{{Template:Malverso}} | {{Template:Malverso}} |
Latest revision as of 03:38, 15 October 2015
Contents
FDA Drug Database
Direct Download/Unzipping Commands
- I logged onto PuTTy and also brought up my public folder on the web browser at the web address my.cs.lmu.edu/~malverso/
- I downloaded and unzipped my files through the command line, using the curl and unzip commands.
- I put the untouched file onto my public web page so I could visually see my progress.
- My database is saved under the student account of PgAdminIII under the name FDA-malverso.
#1
application.txt
- I viewed the application.txt file using the more command to see the column titles, which are:
ApplNo ApplType SponsorApplicant MostRecentLabelAvailableFlag CurrentPatentFlag ActionType Chemical_Type Ther_Potential Orphan_Code
- I referenced the PostgreSQL Tutorial for how to create a table and http://www.fda.gov/Drugs/InformationOnDrugs/ucm079750.htm for primary key information. I then created a table using this command:
create table application (ApplNo varchar primary key, ApplType varchar, SponsorApplicant varchar, MostRecentLabelAvailableFlag boolean, CurrentPatentFlag boolean, ActionType varchar, Chemical_Type varchar, Therapeutic_Potential varchar, Orphan_Code varchar)
Product.txt
- I viewed the Product.txt file using the more command to see the column titles, which are:
ApplNo ProductNo Form Dosage ProductMktStatus TECode ReferenceDrug drugname activeingred
- I created a table for Product.txt using this command:
create table Product(ApplNo varchar, productNo varchar, Form varchar, Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, Drugname varchar, Activeingred varchar)
- I couldn't remeber how to declare a foreign key when I was making the table, and ended up adding it afterwards in with the help of the tutorial found on http://www.w3schools.com/sql/sql_foreignkey.asp .
Here are the additional SQL commands I used:
ALTER TABLE Product ADD FOREIGN KEY (ApplNo) REFERENCES application(ApplNo)
#2
application.txt
- I began by removing the first line, since that is just the column headers, using the sed command "1D".
- I then aimed to put single quotes around all of the values and get rid of the spaces and replace the tabs with commas, which I did with only two sed commands, "s/^/'/g" and "s/\t/','/g".
- I saw next that there was a lot of extra spaces after the values that were unnecessary. I tried to remove the extra spaces by using the sed command "s/( ){*}'/'/g", but that did nothing. I looked back at my assignment 4 to see that it was actually "s/ *'/'/g" that would remove all the extra spaces.
- I also saw that the end of every line was misisng a ' character. I tried a bunch of sed commands to try and add this on, such as "s/$/'/g" but the only command that ended up working was "s/,'/,/8" , which I found buy counting out the 8 commas that separated the 9 columns. I reread the supplementary information section to realize my previous sed commands did not work because the end of lines were formatted differently...so I decided to change my sed command to "s/\r$/'/g"
- Next I saw all of the empty fields and replaced the empty single quotes with the word null. Veronica informed me while I was doing this that null could not be surrounded by single quotes in order to be recognized accurately by SQL, so I used the sed command "s//null/g".
- I then added the appropriate SQL commands to the beginning of my line so that the text would insert into the table.
- I changed the previous sed command that added the ' to the end of the line to add '); to finish off the SQL command.
Here is my final command sequence (with line breaks for readability):
cat application.txt | sed "1D" | sed "s/\t/','/g" | sed "s/ *'/'/g" | sed "s/^/'/g" | sed "s/\r$/');/g" | sed "s/''/null/g" | sed "s/^/insert into application(ApplNo, ApplType, SponsorApplicant, MostRecentLabelAvailableFlag, CurrentPatentFlag, ActionType, Chemical_Type, Therapeutic_Potential, Orphan_Code) values (/g" > ~/public_html/application.sql.txt
The final output SQL command sequence can be found at http://my.cs.lmu.edu/~malverso/application.sql.txt . I copy and pasted this into the SQL editor to fill up my table.
Product.txt
- First I removed the first line, using sed command "1D".
- Since there are single quotes in this .txt file, I decided to replace them with double quotes using the sed command "s/'/"/g", which didn't work for reasons which now seem obvious, so I changed my sed command to "s/'/\"/g" which worked.
- Noting that there weren't any extra spaces, I got rid of the tabs by using the sed command "s/\t/','". I Then added the extra characters to the end of the line, using the sed command "s/\r$/');/g" which includes the ending of the SQL code.
- I used the same sed command as before to insert the null values : "s//null/g".
- Next I added the SQL command to the beginning.
Here is the final command sequence (with line breaks for readability):
cat Product.txt | sed "1D" | sed "s/'/\"/g" | sed "s/\t/','/g"| sed "s/\r$/');/g" | sed "s/''/null/g" | sed "s/^/insert into Product(ApplNo, ProductNo, Form, Dosage, ProductMktStatus, TECode, ReferenceDrug, drugname, activeingred) values ('/g" > ~/public_html/Product.sql.txt
The final output SQL command sequence can be found at http://my.cs.lmu.edu/~malverso/Product.sql.txt . I copy and pasted this into the SQL editor to fill up my table.
#3
Here is the command that produces the number of records in each file on the command line (if added to the end of my command sequence found in the answer to #2, after removing the > ~/public_html/Product.sql.txt or > ~/public_html/application.sql.txt):
wc
The correct number is the first one, because it is the number of lines in the file which is equal to the number of records.
#4
Here are the SQL commands that produce the number of records in each file:
select count(*) from Product select count(*) from application
#5
These numbers are the same because the number of lines in the text file is equal to the number of insert statements, each statement inserting one record into the table.
#6
- Form is found in the Product table.
a.
Name of Drugs:
LOVENOX VELCADE VIDAZA ENOXAPARIN SODIUM ACTEMRA AZACITIDINE
b.
- I looked at my notes to see how to specialize select statements.
SQL statement:
Select drugname from Product where Form = 'INJECTABLE;INTRAVENOUS, SUBCUTANEOUS'
#7
- activeingred is found in the Product table.
a.
- There are four drugs with the name ATROPEN.
Name of Drugs:
ATROPEN ATROPINE
b.
SQL statement:
Select drugname from Product where activeingred = 'ATROPINE'
#8
- Form and dosage are found in the Product table.
a.
Form, Dosage CAPSULE;ORAL, 50MG ELIXIR;ORAL, 12.5MG/5ML CAPSULE;ORAL, 25MG INJECTABLE;INJECTION, 10MG/ML INJECTABLE;INJECTION, 50MG/ML
b.
SQL statement:
Select form,dosage from Product where drugname = 'BENADRYL'
#9
- Drugname is found in the Product table.
- I looked through my notes to see that we can use grep and sed commands within SQL.
a.
Name of Drugs:
DIETHYLSTILBESTROL // Eleven results with this drug name STILBESTROL // Twelve results with this drug name DIENESTROL // One result with this drug name
b.
SQL statement:
select drugname from Product where drugname ~'ESTROL$'
#10
- I referred to my notes to remember how to use the group by statement when counting.
- I wasn't quite sure how we were supposed to format the table within our wiki.
a.
- I removed the count of the Therapeutic_Potential when the value is null.
Therapeutic_Potential; Count S*; 3 S; 3460 P; 928 P*; 26
b.
SQL statement: select therapeutic_potential,count(*) from application group by therapeutic_potential
#11
a.
- I removed the count of the chemical_type when the value is null.
Chemical_Type; Count 8; 27 9; 1 2; 139 23; 8 34; 9 5; 1047 3; 1464 1; 1272 7; 35 24; 8 14; 34 4; 323 6; 87 10; 2
b.
SQL statement:
select chemical_type,count(*) from application group by chemical_type
#12
- Sponsor applicant is found in the application table, but drugname is found in the product table. Therefore, this exercise will make use of the foreign key.
a.
Name of Drug:
DECADRON HUMORSOL NEO-HYDELTRASOL PRINIVIL MAXALT-MLT HYDROCORTONE PERIACTIN PROPECIA PROSCAR CLINORIL PRINZIDE ELSPAR ALDOMET ALDORIL D30 EMEND DIUPRES-500 NEODECADRON ALDOCLOR-250 MEFOXIN IN DEXTROSE 5% IN PLASTIC CONTAINER ARAMINE TIAMATE BLOCADREN CANCIDAS CHIBROXIN CORTONE PEPCID TRUSOPT REDISOL VIOXX FLOROPRYL ALPHAREDISOL DECADRON-LA MEFOXIN IN SODIUM CHLORIDE 0.9% IN PLASTIC CONTAINER PEPCID RPD NOROXIN ZOCOR LERITINE DECADERM DOLOBID MANNITOL 25% DECADRON W/ XYLOCAINE ALDORIL D50 TIMOLIDE 10-25 CYCLAINE HYDROPRES 25 AMINOHIPPURATE SODIUM MEVACOR MODURETIC 5-50 ALDORIL 25 SINGULAIR COLBENEMID DIUPRES-250 HYDELTRA-TBA PRIMAXIN BENEMID MAXALT FOSAMAX PLUS D HYDRODIURIL HYDELTRASOL HYDROPRES 50 ZOLINZA ALDORIL 15 FOSAMAX DECASPRAY COGENTIN ALDOCLOR-150 PEPCID PRESERVATIVE FREE
b.
- First I tried select drugname from product where Product.applno = application.applno and application.sponsorapplicant = 'MERCK', but it threw an error message. I realized I needed to select from both tables.
SQL statement:
select drugname from product,application where Product.applno = application.applno and application.sponsorapplicant = 'MERCK' group by drugname
#13
- Sponsor applicant is found in the application table and activeingred is found in the product table.
a.
Sponsor Applicants:
ACTAVIS LABS UT INC WATSON LABS
b.
SQL statement:
select sponsorapplicant from product,application where application.applno = product.applno and sponsorapplicant ~ 'LABS' and activeingred ~ 'ASPIRIN' and activeingred ~ 'CAFFEINE' group by sponsorapplicant
Team Page
Assignments
- Week 1
- Week 2
- Week 3
- Week 4
- Week 5
- Week 6
- Week 7
- Week 8
- Week 9
- Week 10
- Week 11
- Week 12
- Week 13
- Week 14
- Week 15
Individual Journal Entries
- Malverso User Page (Week 1)
- Week 2
- Week 3
- Week 4
- Week 5
- Week 6
- Week 7
- Week 8
- Week 9
- Week 10
- Week 11
- Week 12
- Week 13
- Week 14
- Week 15