Difference between revisions of "Malverso Week 6"

From LMU BioDB 2015
Jump to: navigation, search
(added foreign key)
(answered #3)
Line 65: Line 65:
 
I copy and pasted this into the SQL editor to fill up my table.
 
I copy and pasted this into the SQL editor to fill up my table.
  
 +
===#3===
 +
 +
Here are the commands that produce the number of records in each file:
 +
 +
select count(*) from Product
 +
select count(*) from application
 
   
 
   
  
  
 
{{Template:Malverso}}
 
{{Template:Malverso}}

Revision as of 01:39, 15 October 2015

FDA Drug Database

Direct Download/Unzipping Commands

  • I logged onto PuTTy and also brought up my public folder on the web browser at the web address my.cs.lmu.edu/~malverso/
  • I downloaded and unzipped my files through the command line, using the curl and unzip commands.
  • I put the untouched file onto my public web page so I could visually see my progress.
  • My database is saved under the student account of PgAdminIII under the name FDA-malverso.

#1

application.txt

  • I viewed the application.txt file using the more command to see the column titles, which are:
ApplNo	ApplType	SponsorApplicant	MostRecentLabelAvailableFlag	CurrentPatentFlag	ActionType Chemical_Type	Ther_Potential	Orphan_Code
create table application (ApplNo varchar primary key, ApplType varchar, SponsorApplicant varchar, MostRecentLabelAvailableFlag boolean, CurrentPatentFlag boolean, ActionType varchar, Chemical_Type varchar, Therapeutic_Potential varchar, Orphan_Code varchar)


Product.txt

  • I viewed the Product.txt file using the more command to see the column titles, which are:
ApplNo	ProductNo	Form	Dosage	ProductMktStatus	TECode	ReferenceDrug	drugname	activeingred 
  • I created a table for Product.txt using this command:
create table Product(ApplNo varchar, productNo varchar, Form varchar, Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, Drugname varchar, Activeingred varchar)

Here are the additional SQL commands I used:

ALTER TABLE Product
ADD FOREIGN KEY (ApplNo)
REFERENCES application(ApplNo)

#2

application.txt

  • I began by removing the first line, since that is just the column headers, using the sed command "1D".
  • I then aimed to put single quotes around all of the values and get rid of the spaces and replace the tabs with commas, which I did with only two sed commands, "s/^/'/g" and "s/\t/','/g".
  • I saw next that there was a lot of extra spaces after the values that were unnecessary. I tried to remove the extra spaces by using the sed command "s/( ){*}'/'/g", but that did nothing. I looked back at my assignment 4 to see that it was actually "s/ *'/'/g" that would remove all the extra spaces.
  • I also saw that the end of every line was misisng a ' character. I tried a bunch of sed commands to try and add this on, such as "s/$/'/g" but the only command that ended up working was "s/,'/,/8" , which I found buy counting out the 8 commas that separated the 9 columns. I reread the supplementary information section to realize my previous sed commands did not work because the end of lines were formatted differently...so I decided to change my sed command to "s/\r$/'/g"
  • Next I saw all of the empty fields and replaced the empty single quotes with the word null. Veronica informed me while I was doing this that null could not be surrounded by single quotes in order to be recognized accurately by SQL, so I used the sed command "s//null/g".
  • I then added the appropriate SQL commands to the beginning of my line so that the text would insert into the table.
  • I changed the previous sed command that added the ' to the end of the line to add '); to finish off the SQL command.

Here is my final command sequence (with line breaks for readability):

cat application.txt | sed "1D" | sed "s/\t/','/g" | sed "s/ *'/'/g" | sed "s/^/'/g" | sed "s/\r$/');/g" 
| sed "s/''/null/g" | sed "s/^/insert into application(ApplNo, ApplType, SponsorApplicant, 
MostRecentLabelAvailableFlag, CurrentPatentFlag, ActionType, Chemical_Type, Therapeutic_Potential, Orphan_Code)
 values (/g"  > ~/public_html/application.sql.txt

The final output SQL command sequence can be found at http://my.cs.lmu.edu/~malverso/application.sql.txt . I copy and pasted this into the SQL editor to fill up my table.

Product.txt

  • First I removed the first line, using sed command "1D".
  • Since there are single quotes in this .txt file, I decided to replace them with double quotes using the sed command "s/'/"/g", which didn't work for reasons which now seem obvious, so I changed my sed command to "s/'/\"/g" which worked.
  • Noting that there weren't any extra spaces, I got rid of the tabs by using the sed command "s/\t/','". I Then added the extra characters to the end of the line, using the sed command "s/\r$/');/g" which includes the ending of the SQL code.
  • I used the same sed command as before to insert the null values : "s//null/g".
  • Next I added the SQL command to the beginning.

Here is the final command sequence (with line breaks for readability):

cat Product.txt | sed "1D" | sed "s/'/\"/g" | sed "s/\t/','/g"| sed "s/\r$/');/g" 
| sed "s/''/null/g" | sed "s/^/insert into Product(ApplNo, ProductNo, Form, Dosage, ProductMktStatus,
 TECode, ReferenceDrug, drugname, activeingred) values ('/g" > ~/public_html/Product.sql.txt

The final output SQL command sequence can be found at http://my.cs.lmu.edu/~malverso/Product.sql.txt . I copy and pasted this into the SQL editor to fill up my table.

#3

Here are the commands that produce the number of records in each file:

select count(*) from Product
select count(*) from application


Team Page

Heavy Metal HaterZ

Assignments

Individual Journal Entries

Shared Journal Entries