Difference between revisions of "Troque Week 6"
From LMU BioDB 2015
m (→Creating the Table on PostgreSQL: Made the PostgreSQL command neater) |
(→Creating the Table on PostgreSQL: Added creating product table) |
||
Line 15: | Line 15: | ||
SponsorApplicant varchar, MostRecentLabelFlag boolean, CurrentPatentFlag boolean, | SponsorApplicant varchar, MostRecentLabelFlag boolean, CurrentPatentFlag boolean, | ||
ActionType varchar, ChemicalType int, Ther_Potential varchar, OrphanCode varchar); | ActionType varchar, ChemicalType int, Ther_Potential varchar, OrphanCode varchar); | ||
+ | |||
+ | To create the table for the product data, enter the following command on the PostgreSQL window: | ||
+ | |||
+ | create table product (ApplNo int references application, ProductNo int, Form varchar, | ||
+ | Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, | ||
+ | drugname varchar, activeingred varchar) | ||
== Processing the Data on the Command Line == | == Processing the Data on the Command Line == |
Revision as of 00:21, 15 October 2015
Contents
Creating the Table on PostgreSQL
First, login to the LMU CS server using ssh. Type in the following in a command prompt (Windows) or terminal (Mac) window:
ssh <username@my.cs.lmu.edu>
Enter your password. Note: You will not visibly see the cursor move when typing in your password so just keep typing. You will end up on your home directory.
Here, you can use the command ls
in order to see the list of files in the directory. Then we can start manipulating some files.
On the PostgreSQL command window, type the following to create a table called "application":
create table application (ApplNo int primary key, ApplType varchar, SponsorApplicant varchar, MostRecentLabelFlag boolean, CurrentPatentFlag boolean, ActionType varchar, ChemicalType int, Ther_Potential varchar, OrphanCode varchar);
To create the table for the product data, enter the following command on the PostgreSQL window:
create table product (ApplNo int references application, ProductNo int, Form varchar, Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, drugname varchar, activeingred varchar)
Processing the Data on the Command Line
Formatting the application.txt file
cat application.txt | sed "s/\t/,/g" | sed "s/,/,'/1" | sed "s/,'[A-Z]/&'/1" | sed "s/,/,'/2" | sed "s/\s*,False/',False/1" | sed "s/,/,'/5 " | sed "s/,/',/6" | sed "s/,,/,null,/" | sed "s/,,/,null,/" | sed "s/,\r$/,null/g" | sed "s/,S,/,'S',/g" | sed "s/,P,/,'P',/g" | sed "s/,V/,'V'/g" | sed "s/P\*/'P\*'/g" | sed "s/S\*/'S\*'/g" | sed "s/,AP,/,'AP',/g" | sed "s/$/);/g" | sed "s/^/insert into application (ApplNo,ApplType,SponsorApplicant,MostRecentLabelFlag, CurrentPatentFlag,ActionType,ChemicalType,Ther_Potential, OrphanCode) values (/g" | sed "1d" > ~/public_html/application.sql.txt
Formatting product.txt file
From doing this assignment, I realized how badly formatted these data are.
Assignment Links
Weekly Assignments
- Week 1
- Week 2
- Week 3
- Week 4
- Week 5
- Week 6
- Week 7
- Week 8
- Week 9
- Week 10
- Week 11
- Week 12
- No Week 13 Assignment
- Week 14
- Week 15
Individual Journal Entries
- Week 1 - This is technically the user page.
- Week 2
- Week 3
- Week 4
- Week 5
- Week 6
- Week 7
- Week 8
- Week 9
- Week 10
- Week 11
- Week 12
- No Week 13 Assignment
- Week 14
- Week 15