Difference between revisions of "Msaeedi23 Week 6"
From LMU BioDB 2015
(formatting product.txt) |
(→Formatting product.txt) |
||
Line 36: | Line 36: | ||
application(ApplNo,ApplType,SponsorApplicant,MostRecentLabelAvailableFlag, CurrentPatentFlag,ActionType,ChemicalType,Ther_Potential,Orphan_Code) values(/g" | sed "1d" > ~/public_html/application.sql.txt\ | application(ApplNo,ApplType,SponsorApplicant,MostRecentLabelAvailableFlag, CurrentPatentFlag,ActionType,ChemicalType,Ther_Potential,Orphan_Code) values(/g" | sed "1d" > ~/public_html/application.sql.txt\ | ||
− | ==Formatting product.txt == | + | === Formatting product.txt === |
+ | This was much easier to do after formatting the application.txt due to the fact that I looked over the data prior to processing it: | ||
+ | cat Product.txt | | ||
+ | sed "1D" | | ||
+ | sed "s/'//g"| | ||
+ | sed "s/\t/,/" | | ||
+ | sed "s/\t/,'/" | | ||
+ | sed "s/\t/','/" | | ||
+ | sed "s/\t/',/" | | ||
+ | sed "s/\t/,'/" | | ||
+ | sed "s/\t/',/" | | ||
+ | sed "s/\t/,'/" | | ||
+ | sed "s/\t/','/" | | ||
+ | sed "s/\r$/'/g" | | ||
+ | sed "s/,,/,null,/g" | | ||
+ | Finally we pipe it all into the product.sql.txt file: | ||
+ | sed "s/^/insert into product(ApplNo,ProductNo,Form,Dosage,ProductMktStatus,TECode, | ||
+ | ReferenceDrug,drugname,activeingred) values(/g" > ~/public_html/Product.sql.txt |
Revision as of 03:55, 15 October 2015
Table Creation on PostgreSQL
Start by logging into ssh & enter password
ssh <msaeedi@my.cs.lmu.edu>
Then I inputted:
ls
To show the files in the directory that we want to use.
In PostgreSQL I typed the following to create an "application" table:
create table application (ApplNo int primary key, ApplType varchar, SponsorApplicant varchar, MostRecentLabelFlag boolean, CurrentPatentFlag boolean, ActionType varchar, ChemicalType int, Ther_Potential varchar, OrphanCode varchar);
Using similar techniques I typed in the following command into PostgreSQL to create the "Product" table:
create table product (ApplNo int references application, ProductNo int, Form varchar, Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int, drugname varchar, activeingred varchar);
Formatting application.txt
I removed the spaces using the regex for the tab character which was \t
.
Redirected the processed data and formatted the original lines using:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g"
Following this command it was necessary to remove excess spaces:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g" | sed "s/ *//g"
Then I targeted the V at the end of every line, if there was one there I put it in quotes, if not I added a null:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g" | sed "s/ *//g" | sed "s/V\r$/,'V'/g"
Lastly I added null to the end of each line, and replaced the empty characters -"- with null as well:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g" | sed "s/ *//g" | sed "s/V\r$/,'V'/g" | sed "s/\r$/,null/g" | sed "s/,'V',null/,'V'/g" | sed "s/,,/,null,/g"
Finally, we pipe it all into the application.sql.txt file:
cat application.txt | sed "1D" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/'/" | sed "s/,,/,null,/g" | sed "s/ *//g" | sed "s/V\r$/,'V'/g" | sed "s/\r$/,null/g" | sed "s/,'V',null/,'V'/g" | sed "s/,,/,null,/g" | sed "s/^/insert into application(ApplNo,ApplType,SponsorApplicant,MostRecentLabelAvailableFlag, CurrentPatentFlag,ActionType,ChemicalType,Ther_Potential,Orphan_Code) values(/g" | sed "1d" > ~/public_html/application.sql.txt\
Formatting product.txt
This was much easier to do after formatting the application.txt due to the fact that I looked over the data prior to processing it: cat Product.txt | sed "1D" | sed "s/'//g"| sed "s/\t/,/" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/',/" | sed "s/\t/,'/" | sed "s/\t/','/" | sed "s/\r$/'/g" | sed "s/,,/,null,/g" | Finally we pipe it all into the product.sql.txt file: sed "s/^/insert into product(ApplNo,ProductNo,Form,Dosage,ProductMktStatus,TECode, ReferenceDrug,drugname,activeingred) values(/g" > ~/public_html/Product.sql.txt