Difference between revisions of "Jkuroda Week 6"
(first edit) |
(edits) |
||
Line 16: | Line 16: | ||
sed "s/''/null/g" | sed "s/'//1" | sed "s/^/insert into | sed "s/''/null/g" | sed "s/'//1" | sed "s/^/insert into | ||
application(id,applicationType,sponsorApplicant,mostRecentLabelAvailableFlag, | application(id,applicationType,sponsorApplicant,mostRecentLabelAvailableFlag, | ||
− | currentPatentFlag,actionType,chemicalType,therPotential,orphanCode)/g" > ~/public_html/application.sql.txt | + | currentPatentFlag,actionType,chemicalType,therPotential,orphanCode) values (/g" > ~/public_html/application.sql.txt |
+ | |||
+ | cat Product.txt | sed "1D" | sed "s/'/\"/g" | sed "s/\t/','/g" | sed "s/\r$/');/g" | sed "s/'//1" | | ||
+ | sed "s/''/null/g" | sed "s/^/insert into product | ||
+ | (applNumber,productNumber,form,dosage,productMktStatus,TECode,referenceDrug,drugName,activeIngredient) values (/g" | ||
+ | > ~/public_html/Product.sql.txt | ||
+ | |||
+ | ==Number of Records== | ||
+ | |||
+ | We learned that the word count command is used for finding the number of lines in a file, so I used that command on both files: | ||
+ | |||
+ | cat application.txt | wc | ||
+ | |||
+ | cat Product.txt | wc | ||
+ | |||
+ | I found that there was 19757 - 1 = 19756 records in the application file, and 32796 - 1 = 32795 records in the product file. | ||
+ | |||
+ | ==SQL Command== | ||
+ | |||
+ | In SQL, I used the select command using <code>count</code>. | ||
+ | |||
+ | select count(*) from application | ||
+ | |||
+ | select count(*) from product | ||
+ | |||
+ | Using this method, I got 19756 records in the application table and 32795 in the product table. | ||
+ | |||
+ | I got slightly different numbers for both methods since the <code>wc</code> command included the first line of column labels. | ||
+ | |||
+ | ==Injectable; Intravenous, and Subcutaneous== | ||
+ | |||
+ | To find the drugs that were administered in these forms, I searched the product table and got the following drugs: | ||
+ | |||
+ | "LOVENOX" | ||
+ | "VELCADE" | ||
+ | "VIDAZA" | ||
+ | "ENOXAPARIN SODIUM" | ||
+ | "ACTEMRA" | ||
+ | "AZACITIDINE" | ||
+ | |||
+ | select drugname from product where form = 'INJECTABLE;INTRAVENOUS, SUBCUTANEOUS' | ||
+ | |||
+ | ==Atropine as the Active Ingredient== | ||
+ | |||
+ | Similarly, I searched the product table for atropine in the active ingredient column and got the following drugs: | ||
+ | |||
+ | "ATROPEN" | ||
+ | "ATROPEN" | ||
+ | "ATROPEN" | ||
+ | "ATROPEN" | ||
+ | "ATROPINE" | ||
+ | |||
+ | Atropen came up four times. | ||
+ | |||
+ | select drugname from product where activeIngredient = 'ATROPINE' | ||
+ | |||
+ | ==Benadryl== | ||
+ | |||
+ | To find the forms and dosages for which benadryl can be administered, I once again used the <code>select</code> command to get these results: (DOSAGE;FORM) | ||
+ | |||
+ | "50MG";"CAPSULE;ORAL" | ||
+ | "12.5MG/5ML";"ELIXIR;ORAL" | ||
+ | "25MG";"CAPSULE;ORAL" | ||
+ | "10MG/ML";"INJECTABLE;INJECTION" | ||
+ | "50MG/ML";"INJECTABLE;INJECTION" | ||
+ | |||
+ | select dosage, form from product where drugname = 'BENADRYL' | ||
+ | |||
+ | ==Ending in Estrol== | ||
+ | |||
+ | To find the drug products whose name ended in 'estrol,' I had to use the <code>$</code> symbol, which I learned from using the command line. I got the following results: | ||
+ | |||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "DIETHYLSTILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "DIENESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | "STILBESTROL" | ||
+ | |||
+ | select drugname from product where drugname ~ 'ESTROL$' | ||
+ | |||
+ | ==Therapeutic Potential== | ||
+ | |||
+ | Used the <code>group</code> modifier to get a result that showed me the following: | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! Therapeutic Potential | ||
+ | ! Number | ||
+ | |- | ||
+ | | | ||
+ | | 15339 | ||
+ | |- | ||
+ | | S* | ||
+ | | 3 | ||
+ | |- | ||
+ | | S | ||
+ | | 3460 | ||
+ | |- | ||
+ | | P* | ||
+ | | 928 | ||
+ | |- | ||
+ | | P | ||
+ | | 26 | ||
+ | |} | ||
+ | |||
+ | select therPotential, count(*) from application group by therPotential | ||
+ | |||
+ | ==Chemical Types== | ||
+ | |||
+ | Used the same method as above to get the chemical types from the application table. | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! ChemicalType | ||
+ | ! count | ||
+ | |- | ||
+ | | || 15300 | ||
+ | |- | ||
+ | | 6 || 87 | ||
+ | |- | ||
+ | | 14 || 34 | ||
+ | |- | ||
+ | | 34 || 9 | ||
+ | |- | ||
+ | | 8 || 27 | ||
+ | |- | ||
+ | | 1 || 1272 | ||
+ | |- | ||
+ | | 2 || 139 | ||
+ | |- | ||
+ | | 3 || 1464 | ||
+ | |- | ||
+ | | 10 || 2 | ||
+ | |- | ||
+ | | 23 || 8 | ||
+ | |- | ||
+ | | 4 || 323 | ||
+ | |- | ||
+ | | 5 || 1047 | ||
+ | |- | ||
+ | | 9 || 1 | ||
+ | |- | ||
+ | | 24 || 8 | ||
+ | |- | ||
+ | | 7 || 35 | ||
+ | |- | ||
+ | |} | ||
+ | |||
+ | ==MERCK== | ||
+ | |||
+ | To get this result, the command was a little more tricky, and involved the <code>inner join</code> modifier to bring the two tables together. I got the following names, with a couple of repeated ones: | ||
+ | |||
+ | "AMINOHIPPURATE SODIUM" | ||
+ | "MANNITOL 25%" | ||
+ | "REDISOL" | ||
+ | "CORTONE" | ||
+ | "CORTONE" | ||
+ | "CORTONE" | ||
+ | "BENEMID" | ||
+ | "HYDROCORTONE" | ||
+ | "HYDROCORTONE" | ||
+ | "CYCLAINE" | ||
+ | "HYDROCORTONE" | ||
+ | "HYDROCORTONE" | ||
+ | "HYDROCORTONE" | ||
+ | "COGENTIN" | ||
+ | "COGENTIN" | ||
+ | "COGENTIN" | ||
+ | "ARAMINE" | ||
+ | "LERITINE" | ||
+ | "HYDELTRA-TBA" | ||
+ | "LERITINE" | ||
+ | "FLOROPRYL" | ||
+ | "HYDELTRASOL" | ||
+ | "HYDELTRASOL" | ||
+ | "DIUPRES-250" | ||
+ | "DIUPRES-500" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "HYDRODIURIL" | ||
+ | "HYDRODIURIL" | ||
+ | "HYDRODIURIL" | ||
+ | "HUMORSOL" | ||
+ | "HUMORSOL" | ||
+ | "HYDROPRES 25" | ||
+ | "HYDROPRES 50" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "HYDROCORTONE" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "DECADRON" | ||
+ | "COLBENEMID" | ||
+ | "PERIACTIN" | ||
+ | "DECASPRAY" | ||
+ | "PERIACTIN" | ||
+ | "DECADRON W/ XYLOCAINE" | ||
+ | "ALDOMET" | ||
+ | "ALDOMET" | ||
+ | "ALDOMET" | ||
+ | "ALDOMET" | ||
+ | "ALDORIL 15" | ||
+ | "ALDORIL 25" | ||
+ | "ALDORIL D30" | ||
+ | "ALDORIL D50" | ||
+ | "DECADERM" | ||
+ | "ALDOCLOR-150" | ||
+ | "ALDOCLOR-250" | ||
+ | "DECADRON-LA" | ||
+ | "CLINORIL" | ||
+ | "CLINORIL" | ||
+ | "BLOCADREN" | ||
+ | "BLOCADREN" | ||
+ | "BLOCADREN" | ||
+ | "TIMOLIDE 10-25" | ||
+ | "MODURETIC 5-50" | ||
+ | "ALDOMET" | ||
+ | "DOLOBID" | ||
+ | "DOLOBID" | ||
+ | "NOROXIN" | ||
+ | "PEPCID" | ||
+ | "PEPCID PRESERVATIVE FREE" | ||
+ | "PRINIVIL" | ||
+ | "PRINIVIL" | ||
+ | "PRINIVIL" | ||
+ | "PRINIVIL" | ||
+ | "PRINIVIL" | ||
+ | "MEVACOR" | ||
+ | "MEVACOR" | ||
+ | "MEVACOR" | ||
+ | "ZOCOR" | ||
+ | "ZOCOR" | ||
+ | "ZOCOR" | ||
+ | "ZOCOR" | ||
+ | "CHIBROXIN" | ||
+ | "ZOCOR" | ||
+ | "PRINZIDE" | ||
+ | "PRINZIDE" | ||
+ | "PRINZIDE" | ||
+ | "PROSCAR" | ||
+ | "TRUSOPT" | ||
+ | "TIAMATE" | ||
+ | "TIAMATE" | ||
+ | "TIAMATE" | ||
+ | "PEPCID RPD" | ||
+ | "PEPCID RPD" | ||
+ | "PROPECIA" | ||
+ | "SINGULAIR" | ||
+ | "SINGULAIR" | ||
+ | "SINGULAIR" | ||
+ | "MAXALT" | ||
+ | "MAXALT" | ||
+ | "MAXALT-MLT" | ||
+ | "MAXALT-MLT" | ||
+ | "VIOXX" | ||
+ | "VIOXX" | ||
+ | "VIOXX" | ||
+ | "VIOXX" | ||
+ | "VIOXX" | ||
+ | "CANCIDAS" | ||
+ | "CANCIDAS" | ||
+ | "SINGULAIR" | ||
+ | "EMEND" | ||
+ | "EMEND" | ||
+ | "EMEND" | ||
+ | "FOSAMAX" | ||
+ | "VIOXX" | ||
+ | "VIOXX" | ||
+ | "VIOXX" | ||
+ | "FOSAMAX PLUS D" | ||
+ | "FOSAMAX PLUS D" | ||
+ | "ZOLINZA" | ||
+ | "NEODECADRON" | ||
+ | "NEODECADRON" | ||
+ | "NEO-HYDELTRASOL" | ||
+ | "MEFOXIN IN SODIUM CHLORIDE 0.9% IN PLASTIC CONTAINER" | ||
+ | "MEFOXIN IN SODIUM CHLORIDE 0.9% IN PLASTIC CONTAINER" | ||
+ | "MEFOXIN IN DEXTROSE 5% IN PLASTIC CONTAINER" | ||
+ | "MEFOXIN IN DEXTROSE 5% IN PLASTIC CONTAINER" | ||
+ | "PRIMAXIN" | ||
+ | "PRIMAXIN" | ||
+ | "PRIMAXIN" | ||
+ | "PRIMAXIN" | ||
+ | "PRIMAXIN" | ||
+ | "PRIMAXIN" | ||
+ | "ALPHAREDISOL" | ||
+ | "ELSPAR" | ||
+ | |||
+ | select drugname from product inner join application on (product.applNumber = application.id) where sponsorApplicant = 'MERCK' | ||
+ | |||
+ | ==Labs, Aspirin and Caffeine== | ||
+ | |||
+ | At first, this command seemed like it would be tricky, but it just turned out to be long, and after a couple of failed attempts, I got the two results that matched up with the requirements: | ||
+ | |||
+ | "ACTAVIS LABS UT INC" | ||
+ | "ACTAVIS LABS UT INC" | ||
+ | "ACTAVIS LABS UT INC" | ||
+ | "WATSON LABS" | ||
+ | "WATSON LABS" | ||
+ | "WATSON LABS" | ||
+ | "WATSON LABS" | ||
+ | |||
+ | select sponsorApplicant from application inner join product on (product.applNumber = application.id) where | ||
+ | (sponsorApplicant like '%LABS%' and product.activeIngredient like '%CAFFEINE%' and activeIngredient like '%ASPIRIN%') |
Revision as of 05:12, 15 October 2015
Contents
Creating Tables
To begin the assignment, I first downloaded pgAdmin3 to my computer and created my own database with which I could work. I used the curl command in the terminal to download the FDA files to my directory in my.cs.lmu.edu
then I set up the tables for both the application and product lists using the following commands:
create table application (id int primary key, applicationType varchar, sponsorApplicant varchar, mostRecentLabelAvailableFlag boolean, currentPatentFlag boolean, actionType varchar, chemicalType int, therPotential varchar, orphanCode varchar)
create table product (int references application(id), productNumber int, form varchar, dosage varchar, productMktStatus int, TECode varchar, referenceDrug int, drugName varchar, activeIngredient varchar)
SQL Insert Statements
Now I had to get the data from the FDA into these formatted tables, so I drew upon my experience from using the command line in previous weeks and came up with the following:
cat application.txt | sed "1D" | sed "s/\t/','/g" | sed "s/\r$/');/g" | sed "s/ *'/'/g" | sed "s//null/g" | sed "s/'//1" | sed "s/^/insert into application(id,applicationType,sponsorApplicant,mostRecentLabelAvailableFlag, currentPatentFlag,actionType,chemicalType,therPotential,orphanCode) values (/g" > ~/public_html/application.sql.txt
cat Product.txt | sed "1D" | sed "s/'/\"/g" | sed "s/\t/','/g" | sed "s/\r$/');/g" | sed "s/'//1" | sed "s//null/g" | sed "s/^/insert into product (applNumber,productNumber,form,dosage,productMktStatus,TECode,referenceDrug,drugName,activeIngredient) values (/g" > ~/public_html/Product.sql.txt
Number of Records
We learned that the word count command is used for finding the number of lines in a file, so I used that command on both files:
cat application.txt | wc
cat Product.txt | wc
I found that there was 19757 - 1 = 19756 records in the application file, and 32796 - 1 = 32795 records in the product file.
SQL Command
In SQL, I used the select command using count
.
select count(*) from application
select count(*) from product
Using this method, I got 19756 records in the application table and 32795 in the product table.
I got slightly different numbers for both methods since the wc
command included the first line of column labels.
Injectable; Intravenous, and Subcutaneous
To find the drugs that were administered in these forms, I searched the product table and got the following drugs:
"LOVENOX" "VELCADE" "VIDAZA" "ENOXAPARIN SODIUM" "ACTEMRA" "AZACITIDINE"
select drugname from product where form = 'INJECTABLE;INTRAVENOUS, SUBCUTANEOUS'
Atropine as the Active Ingredient
Similarly, I searched the product table for atropine in the active ingredient column and got the following drugs:
"ATROPEN" "ATROPEN" "ATROPEN" "ATROPEN" "ATROPINE"
Atropen came up four times.
select drugname from product where activeIngredient = 'ATROPINE'
Benadryl
To find the forms and dosages for which benadryl can be administered, I once again used the select
command to get these results: (DOSAGE;FORM)
"50MG";"CAPSULE;ORAL" "12.5MG/5ML";"ELIXIR;ORAL" "25MG";"CAPSULE;ORAL" "10MG/ML";"INJECTABLE;INJECTION" "50MG/ML";"INJECTABLE;INJECTION"
select dosage, form from product where drugname = 'BENADRYL'
Ending in Estrol
To find the drug products whose name ended in 'estrol,' I had to use the $
symbol, which I learned from using the command line. I got the following results:
"DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "DIETHYLSTILBESTROL" "STILBESTROL" "STILBESTROL" "STILBESTROL" "STILBESTROL" "STILBESTROL" "STILBESTROL" "DIENESTROL" "STILBESTROL" "STILBESTROL" "STILBESTROL" "STILBESTROL" "STILBESTROL" "STILBESTROL"
select drugname from product where drugname ~ 'ESTROL$'
Therapeutic Potential
Used the group
modifier to get a result that showed me the following:
Therapeutic Potential | Number |
---|---|
15339 | |
S* | 3 |
S | 3460 |
P* | 928 |
P | 26 |
select therPotential, count(*) from application group by therPotential
Chemical Types
Used the same method as above to get the chemical types from the application table.
ChemicalType | count |
---|---|
15300 | |
6 | 87 |
14 | 34 |
34 | 9 |
8 | 27 |
1 | 1272 |
2 | 139 |
3 | 1464 |
10 | 2 |
23 | 8 |
4 | 323 |
5 | 1047 |
9 | 1 |
24 | 8 |
7 | 35 |
MERCK
To get this result, the command was a little more tricky, and involved the inner join
modifier to bring the two tables together. I got the following names, with a couple of repeated ones:
"AMINOHIPPURATE SODIUM" "MANNITOL 25%" "REDISOL" "CORTONE" "CORTONE" "CORTONE" "BENEMID" "HYDROCORTONE" "HYDROCORTONE" "CYCLAINE" "HYDROCORTONE" "HYDROCORTONE" "HYDROCORTONE" "COGENTIN" "COGENTIN" "COGENTIN" "ARAMINE" "LERITINE" "HYDELTRA-TBA" "LERITINE" "FLOROPRYL" "HYDELTRASOL" "HYDELTRASOL" "DIUPRES-250" "DIUPRES-500" "DECADRON" "DECADRON" "DECADRON" "DECADRON" "DECADRON" "DECADRON" "HYDRODIURIL" "HYDRODIURIL" "HYDRODIURIL" "HUMORSOL" "HUMORSOL" "HYDROPRES 25" "HYDROPRES 50" "DECADRON" "DECADRON" "DECADRON" "HYDROCORTONE" "DECADRON" "DECADRON" "DECADRON" "COLBENEMID" "PERIACTIN" "DECASPRAY" "PERIACTIN" "DECADRON W/ XYLOCAINE" "ALDOMET" "ALDOMET" "ALDOMET" "ALDOMET" "ALDORIL 15" "ALDORIL 25" "ALDORIL D30" "ALDORIL D50" "DECADERM" "ALDOCLOR-150" "ALDOCLOR-250" "DECADRON-LA" "CLINORIL" "CLINORIL" "BLOCADREN" "BLOCADREN" "BLOCADREN" "TIMOLIDE 10-25" "MODURETIC 5-50" "ALDOMET" "DOLOBID" "DOLOBID" "NOROXIN" "PEPCID" "PEPCID PRESERVATIVE FREE" "PRINIVIL" "PRINIVIL" "PRINIVIL" "PRINIVIL" "PRINIVIL" "MEVACOR" "MEVACOR" "MEVACOR" "ZOCOR" "ZOCOR" "ZOCOR" "ZOCOR" "CHIBROXIN" "ZOCOR" "PRINZIDE" "PRINZIDE" "PRINZIDE" "PROSCAR" "TRUSOPT" "TIAMATE" "TIAMATE" "TIAMATE" "PEPCID RPD" "PEPCID RPD" "PROPECIA" "SINGULAIR" "SINGULAIR" "SINGULAIR" "MAXALT" "MAXALT" "MAXALT-MLT" "MAXALT-MLT" "VIOXX" "VIOXX" "VIOXX" "VIOXX" "VIOXX" "CANCIDAS" "CANCIDAS" "SINGULAIR" "EMEND" "EMEND" "EMEND" "FOSAMAX" "VIOXX" "VIOXX" "VIOXX" "FOSAMAX PLUS D" "FOSAMAX PLUS D" "ZOLINZA" "NEODECADRON" "NEODECADRON" "NEO-HYDELTRASOL" "MEFOXIN IN SODIUM CHLORIDE 0.9% IN PLASTIC CONTAINER" "MEFOXIN IN SODIUM CHLORIDE 0.9% IN PLASTIC CONTAINER" "MEFOXIN IN DEXTROSE 5% IN PLASTIC CONTAINER" "MEFOXIN IN DEXTROSE 5% IN PLASTIC CONTAINER" "PRIMAXIN" "PRIMAXIN" "PRIMAXIN" "PRIMAXIN" "PRIMAXIN" "PRIMAXIN" "ALPHAREDISOL" "ELSPAR"
select drugname from product inner join application on (product.applNumber = application.id) where sponsorApplicant = 'MERCK'
Labs, Aspirin and Caffeine
At first, this command seemed like it would be tricky, but it just turned out to be long, and after a couple of failed attempts, I got the two results that matched up with the requirements:
"ACTAVIS LABS UT INC" "ACTAVIS LABS UT INC" "ACTAVIS LABS UT INC" "WATSON LABS" "WATSON LABS" "WATSON LABS" "WATSON LABS"
select sponsorApplicant from application inner join product on (product.applNumber = application.id) where (sponsorApplicant like '%LABS%' and product.activeIngredient like '%CAFFEINE%' and activeIngredient like '%ASPIRIN%')