Malverso Week 6

From LMU BioDB 2015
Revision as of 00:37, 15 October 2015 by Malverso (Talk | contribs) (answered #2 for application.txt)

Jump to: navigation, search

FDA Drug Database

Direct Download/Unzipping Commands

  • I logged onto PuTTy and also brought up my public folder on the web browser at the web address my.cs.lmu.edu/~malverso/
  • I downloaded and unzipped my files through the command line, using the curl and unzip commands.
  • I put the untouched file onto my public web page so I could visually see my progress.
  • My database is saved under the student account of PgAdminIII under the name FDA-malverso.

#1

application.txt

  • I viewed the application.txt file using the more command to see the column titles, which are:
ApplNo	ApplType	SponsorApplicant	MostRecentLabelAvailableFlag	CurrentPatentFlag	ActionType Chemical_Type	Ther_Potential	Orphan_Code
create table application (ApplNo varchar primary key, ApplType varchar, SponsorApplicant varchar, MostRecentLabelAvailableFlag boolean, CurrentPatentFlag boolean, ActionType varchar, Chemical_Type varchar, Therapeutic_Potential varchar, Orphan_Code varchar)


Product.txt

  • I viewed the Product.txt file using the more command to see the column titles, which are:
ApplNo	ProductNo	Form	Dosage	ProductMktStatus	TECode	ReferenceDrug	drugname	activeingred 
  • I created a table for Product.txt using this command:
create table Product(ApplNo varchar, productNo varchar, Form varchar, Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug boolean, Drugname varchar, Activeingred varchar)

#2

application.txt

  • I began by removing the first line, since that is just the column headers, using the sed command "1D".
  • I then aimed to put single quotes around all of the values and get rid of the spaces and replace the tabs with commas, which I did with only two sed commands, "s/^/'/g" and "s/\t/','/g".
  • I saw next that there was a lot of extra spaces after the values that were unnecessary. I tried to remove the extra spaces by using the sed command "s/( ){*}'/'/g", but that did nothing. I looked back at my assignment 4 to see that it was actually "s/ *'/'/g" that would remove all the extra spaces.
  • I also saw that the end of every line was misisng a ' character. I tried a bunch of sed commands to try and add this on, such as "s/$/'/g" but the only command that ended up working was "s/,'/,/8" , which I found buy counting out the 8 commas that separated the 9 columns. I reread the supplementary information section to realize my previous sed commands did not work because the end of lines were formatted differently...so I decided to change my sed command to "s/\r$/'/g"
  • Next I saw all of the empty fields and replaced the empty single quotes with the word null. Veronica informed me while I was doing this that null could not be surrounded by single quotes in order to be recognized accurately by SQL, so I used the sed command "s//null/g".
  • I then added the appropriate SQL commands to the beginning of my line so that the text would insert into the table.
  • I changed the previous sed command that added the ' to the end of the line to add '); to finish off the SQL command.

Here is my final sed command (with line breaks for readability):

cat application.txt | sed "1D" | sed "s/\t/','/g" | sed "s/ *'/'/g" | sed "s/^/'/g" | sed "s/\r$/');/g" 
| sed "s//null/g" | sed "s/^/insert into application(ApplNo, ApplType, SponsorApplicant, 
MostRecentAvailableFlag, CurrentPatentFlag, ActionType, Chemical_Type, Therapeutic_Potential, Orphan_Code)
 values (/g"  > ~/public_html/application.sql.txt




Team Page

Heavy Metal HaterZ

Assignments

Individual Journal Entries

Shared Journal Entries