Difference between revisions of "Troque Week 6"

From LMU BioDB 2015
Jump to: navigation, search
m (Formatting the application.txt file: Starting to add some info on my thought process)
m (Added more infor)
Line 27: Line 27:
 
Before starting this assignment, I didn't realized that I could easily remove the spaces using the regex for the tab character <code>\t</code>, and so I tried to take out the unnecessary blank spaces using the regex <code>\s*</code>. The result of using this method is the following (which didn't really work out since I got frustrated from all the outliers in this txt file and I eventually trashed this set of commands):
 
Before starting this assignment, I didn't realized that I could easily remove the spaces using the regex for the tab character <code>\t</code>, and so I tried to take out the unnecessary blank spaces using the regex <code>\s*</code>. The result of using this method is the following (which didn't really work out since I got frustrated from all the outliers in this txt file and I eventually trashed this set of commands):
  
 +
cat application.txt |
 +
sed -e 's/\s\+/,/2' |
 +
sed -e 's/\s/,/1'|
 +
sed "s/,/,'/1" |
 +
sed "s/,/',/2" |
 +
sed "s/,/,'/2" |
 +
sed "s/\s*False/',False/1" |
 +
sed "s/False\s*False/FalseFalse/g" |
 +
sed "s/False\s*/False,/g" |
 +
sed "s/,/,'/5" |
 +
sed "s/,'../&',/3" |
 +
sed "s/,\s*/,/6" |
 +
sed "s/\t/,'/" |
 +
sed "s/\t/'/" |
 +
sed "s/P',$/&'','',''/g" |
 +
sed "s/TA',$/&'','',''/g" |
 +
sed "s/''/null/g" |
 +
sed "s/V\r$/,'V'/g" |
 +
sed "s/'S'/&,null/g" |
 +
sed "s/'P'/&,null/g" |
 +
sed "1d" |
 +
sed "s/\t//g" |
 +
sed "s/[0-9],null/&,null/" |
 +
sed "s/^/insert into application (ApplNo,ApplType,SponsorApplicant,MostRecentLabelFlag,CurrentPatentFlag,
 +
                                  ActionType,ChemicalType,Ther_Potential,OrphanCode) values (/g" |
 +
sed "s/\r$/);/" |
 +
sed "s/null$/&);/g" |
 +
sed "s/'V'$/&);/" > ~/public_html/application.sql.txt
  
 +
From doing this, I realized how badly formatted these data are so I decided to start over:
  
 
  cat application.txt |
 
  cat application.txt |
Line 73: Line 102:
 
                               (/g" > ~/public_html/Product.sql.txt
 
                               (/g" > ~/public_html/Product.sql.txt
  
<!-- From doing this assignment, I realized how badly formatted these data are. -->
+
<!--  -->
 
+
  <!-- cat application.txt |
+
sed -e 's/\s\+/,/2' |
+
sed -e 's/\s/,/1'|
+
sed "s/,/,'/1" |
+
sed "s/,/',/2" |
+
sed "s/,/,'/2" |
+
sed "s/\s*False/',False/1" |
+
sed "s/False\s*False/FalseFalse/g" |
+
sed "s/False\s*/False,/g" |
+
sed "s/,/,'/5" |
+
sed "s/,'../&',/3" |
+
sed "s/,\s*/,/6" |
+
sed "s/\t/,'/" |
+
sed "s/\t/'/" |
+
sed "s/P',$/&'','',''/g" |
+
sed "s/TA',$/&'','',''/g" |
+
sed "s/''/null/g" |
+
sed "s/V\r$/,'V'/g" |
+
sed "s/'S'/&,null/g" |
+
sed "s/'P'/&,null/g" |
+
sed "1d" |
+
sed "s/\t//g" |
+
sed "s/[0-9],null/&,null/" |
+
sed "s/^/insert into application (ApplNo,ApplType,SponsorApplicant,MostRecentLabelFlag,CurrentPatentFlag,ActionType,ChemicalType,Ther_Potential,OrphanCode) values (/g" |
+
sed "s/\r$/);/" |
+
sed "s/null$/&);/g" |
+
sed "s/'V'$/&);/" > ~/public_html/application.sql.txt
+
  
-->
 
  
 
{{Template:Troque_Journal}}
 
{{Template:Troque_Journal}}

Revision as of 00:28, 15 October 2015

User Page        Bio Databases Main Page       


Creating the Table on PostgreSQL

First, login to the LMU CS server using ssh. Type in the following in a command prompt (Windows) or terminal (Mac) window:

ssh <username@my.cs.lmu.edu>

Enter your password. Note: You will not visibly see the cursor move when typing in your password so just keep typing. You will end up on your home directory.

Here, you can use the command ls in order to see the list of files in the directory. Then we can start manipulating some files.

On the PostgreSQL command window, type the following to create a table called "application":

create table application (ApplNo int primary key, ApplType varchar, 
SponsorApplicant varchar, MostRecentLabelFlag boolean, CurrentPatentFlag boolean, 
ActionType varchar, ChemicalType int, Ther_Potential varchar, OrphanCode varchar);

To create the table for the product data, enter the following command on the PostgreSQL window:

create table product (ApplNo int references application, ProductNo int, Form varchar,
Dosage varchar, ProductMktStatus int, TECode varchar, ReferenceDrug int,
drugname varchar, activeingred varchar)

Processing the Data on the Command Line

Formatting the application.txt file

Before starting this assignment, I didn't realized that I could easily remove the spaces using the regex for the tab character \t, and so I tried to take out the unnecessary blank spaces using the regex \s*. The result of using this method is the following (which didn't really work out since I got frustrated from all the outliers in this txt file and I eventually trashed this set of commands):

cat application.txt |

sed -e 's/\s\+/,/2' | 
sed -e 's/\s/,/1'| 
sed "s/,/,'/1" | 
sed "s/,/',/2" | 
sed "s/,/,'/2" | 
sed "s/\s*False/',False/1" | 
sed "s/False\s*False/FalseFalse/g" | 
sed "s/False\s*/False,/g" | 
sed "s/,/,'/5" | 
sed "s/,'../&',/3" | 
sed "s/,\s*/,/6" | 
sed "s/\t/,'/" | 
sed "s/\t/'/" | 
sed "s/P',$/&,,/g" | 
sed "s/TA',$/&,,/g" | 
sed "s//null/g" | 
sed "s/V\r$/,'V'/g" | 
sed "s/'S'/&,null/g" | 
sed "s/'P'/&,null/g" | 
sed "1d" | 
sed "s/\t//g" | 
sed "s/[0-9],null/&,null/" | 
sed "s/^/insert into application (ApplNo,ApplType,SponsorApplicant,MostRecentLabelFlag,CurrentPatentFlag,
                                  ActionType,ChemicalType,Ther_Potential,OrphanCode) values (/g" | 
sed "s/\r$/);/" | 
sed "s/null$/&);/g" | 
sed "s/'V'$/&);/" > ~/public_html/application.sql.txt

From doing this, I realized how badly formatted these data are so I decided to start over:

cat application.txt |
sed "s/\t/,/g" | 
sed "s/,/,'/1" | 
sed "s/,'[A-Z]/&'/1" | 
sed "s/,/,'/2" | 
sed "s/\s*,False/',False/1" | 
sed "s/,/,'/5 " | 
sed "s/,/',/6" | 
sed "s/,,/,null,/" |
sed "s/,,/,null,/" |
sed "s/,\r$/,null/g" |
sed "s/,S,/,'S',/g" |
sed "s/,P,/,'P',/g" |
sed "s/,V/,'V'/g" |
sed "s/P\*/'P\*'/g" |
sed "s/S\*/'S\*'/g" |
sed "s/,AP,/,'AP',/g" |
sed "s/$/);/g" |
sed "s/^/insert into application (ApplNo,ApplType,SponsorApplicant,MostRecentLabelFlag,
                                  CurrentPatentFlag,ActionType,ChemicalType,Ther_Potential,
                                  OrphanCode) values (/g" |
sed "1d"  > ~/public_html/application.sql.txt

Formatting product.txt file

cat Product.txt |
sed "1d" |
sed "s/,/;/g" |
sed "s/\t/,/g" |
sed "s/,,/,null,/g" |
sed "s/'//g" |
sed "s/,/,'/2" |
sed "s/,/','/3" |
sed "s/,/',/4" |
sed "s/,/,'/5" |
sed "s/,/',/6" |
sed "s/,/,'/7" |
sed "s/,/','/8" |
sed "s/\r$/');/g" |
sed "s/'null'/null/" |
sed "s/^/insert into product (ApplNo,ProductNo,Form,Dosage,ProductMktStatus,TECode,
                              ReferenceDrug,drugname,activeingred) values 
                              (/g" > ~/public_html/Product.sql.txt


Assignment Links

Weekly Assignments

Individual Journal Entries

Shared Journal Entries