Difference between revisions of "Msaeedi23 Week 6"
From LMU BioDB 2015
(questions to answer) |
(→Questions to Answer: #1) |
||
Line 58: | Line 58: | ||
# Provide the DDL (<code>create table</code>) statements that you used for your ''application'' and ''product'' tables. | # Provide the DDL (<code>create table</code>) statements that you used for your ''application'' and ''product'' tables. | ||
+ | '''application.txt''' | ||
+ | #*create table application (ApplNo int primary key, ApplType varchar, | ||
+ | SponsorApplicant varchar, MostRecentLabelFlag boolean, CurrentPatentFlag boolean, | ||
+ | ActionType varchar, ChemicalType int, Ther_Potential varchar, OrphanCode varchar); | ||
+ | |||
+ | '''product.txt''' | ||
+ | #*create table product (ApplNo int references application, ProductNo int, Form varchar, | ||
+ | Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, | ||
+ | drugname varchar, activeingred varchar); | ||
+ | |||
# Provide the <code>sed</code> command sequences that you used to convert the raw text files into sequences of SQL <code>insert</code> statements. | # Provide the <code>sed</code> command sequences that you used to convert the raw text files into sequences of SQL <code>insert</code> statements. | ||
# Using the command line, how can you determine the number of records in each file? Provide the command. | # Using the command line, how can you determine the number of records in each file? Provide the command. |
Revision as of 04:20, 15 October 2015
Contents
Table Creation on PostgreSQL
Start by logging into ssh & enter password
ssh <msaeedi@my.cs.lmu.edu>
Then I inputted:
ls
To show the files in the directory that we want to use.
In PostgreSQL I typed the following to create an "application" table:
create table application (ApplNo int primary key, ApplType varchar, SponsorApplicant varchar, MostRecentLabelFlag boolean, CurrentPatentFlag boolean, ActionType varchar, ChemicalType int, Ther_Potential varchar, OrphanCode varchar);
Using similar techniques I typed in the following command into PostgreSQL to create the "Product" table:
create table product (ApplNo int references application, ProductNo int, Form varchar, Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, drugname varchar, activeingred varchar);
Formatting application.txt
I removed the spaces using the regex for the tab character which was \t
.
Redirected the processed data and formatted the original lines using:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g"
Following this command it was necessary to remove excess spaces:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g" | sed "s/ *//g"
Then I targeted the V at the end of every line, if there was one there I put it in quotes, if not I added a null:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g" | sed "s/ *//g" | sed "s/V\r$/,'V'/g"
Lastly I added null to the end of each line, and replaced the empty characters -"- with null as well:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g" | sed "s/ *//g" | sed "s/V\r$/,'V'/g" | sed "s/\r$/,null/g" | sed "s/,'V',null/,'V'/g" | sed "s/,,/,null,/g"
Finally, we pipe it all into the application.sql.txt file:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g" | sed "s/ *//g" | sed "s/V\r$/,'V'/g" | sed "s/\r$/,null/g" | sed "s/,'V',null/,'V'/g" | sed "s/,,/,null,/g" | sed "s/^/insert into application(ApplNo,ApplType,SponsorApplicant,MostRecentLabelAvailableFlag, CurrentPatentFlag,ActionType,ChemicalType,Ther_Potential,Orphan_Code) values(/g" | sed "1d" > ~/public_html/application.sql.txt\
Formatting product.txt
This was much easier to do after formatting the application.txt due to the fact that I looked over the data prior to processing it: cat Product.txt | sed "1D" | sed "s/'//g"| sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\r$/'/g" | sed "s/,,/,null,/g" | Finally we pipe it all into the product.sql.txt file: sed "s/^/insert into product(ApplNo,ProductNo,Form,Dosage,ProductMktStatus,TECode, ReferenceDrug,drugname,activeingred) values(/g" > ~/public_html/Product.sql.txt
Questions to Answer
- Provide the DDL (
create table
) statements that you used for your application and product tables.
application.txt
- create table application (ApplNo int primary key, ApplType varchar,
SponsorApplicant varchar, MostRecentLabelFlag boolean, CurrentPatentFlag boolean, ActionType varchar, ChemicalType int, Ther_Potential varchar, OrphanCode varchar);
product.txt
- create table product (ApplNo int references application, ProductNo int, Form varchar,
Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, drugname varchar, activeingred varchar);
- Provide the
sed
command sequences that you used to convert the raw text files into sequences of SQLinsert
statements. - Using the command line, how can you determine the number of records in each file? Provide the command.
- Using SQL, how can you determine the number of records in the table corresponding to the file? Provide the SQL
select
statement. - In your database, are these numbers the same or different? Explain why you think so.
- For the remaining questions, supply (a) the answer to the question and (b) the SQL statement that you used to answer the question.
- What are the names of the drug products that are administered in the form
INJECTABLE;INTRAVENOUS, SUBCUTANEOUS
? - What are the names of the drug products whose active ingredient (activeingred) is
ATROPINE
? - In what forms and dosages can the drug product named
BENADRYL
be administered? - Which drug products have a name ending in
ESTROL
? - Produce a table listing all of the known values for the therapeutic_potential column in the application table and how many application records there are of each. (Side note: The therapeutic_potential codes are explained in the ReviewClass_Lookup.txt file, in case you’re interested.)
- Produce a table listing all of the known values for the chemical_type column in the application table and how many application records there are of each. (Side note: The chemical_type codes are explained in the ChemTypeLookup.txt file, in case you’re interested.)
- What are the names of the drug products that are sponsored (sponsor applicant column) by
MERCK
? - Which sponsor applicant companies have the text
LABS
in their names and have products whose active ingredients (activeingred) include bothASPIRIN
andCAFFEINE
?