Vpachec3 Week 6

From LMU BioDB 2015
Revision as of 07:10, 13 October 2015 by Vpachec3 (Talk | contribs) (Added to section Product text file)

Jump to: navigation, search

application text file

My partner,Kevin, and I met on Sunday at 1:30pm in Seaver 120 to begin our assignment. Downloading and unzipping the file was not a problem. The problem occurred when we had to condense the information. What took me a while to understand is how to get rid of the tabs and spaces efficiently.


Here is the break down for each section of the command:

cat application.txt 

This opened the text file.

cat application.txt|sed "s/\t/~/g"

The added sed command replaced each tab instance with a tilde. We used a tilde as a filler because it was easier to put the tilde instead of another marker such as an asterick in terms of syntax. Now we need to get rid of the extra spaces in the line. However, we can't just use a command to get rid of the the spaces because some of the company names have spaces in them. Since the names only have one space we can make the command to get rid of spaces in pairs.

cat application.txt|sed "s/\t/~/g"|sed -r "s/( ){2}//g"

The second sed command would allow for any pairs of spaces to reduce.

I am going to jump in my explanation and explain the next four added commands (see below). I am lumping the explanation because they all are going for the same goal. The fist of the four gets rid of the remaining space in front of the tilde by replacing it with just a tilde, no space. The following step was to make sure that the columns with no answer in them (any empty entry in the columns) have null instead of an empty section. We need to do this exact same command twice because the computer will only read the groups of two if there were a greater number it would only count the first two. The last command of the four newly added makes sure that the end of line has a null if there is no information in the column for each row.

cat application.txt|sed "s/\t/~/g"|sed -r "s/( ){2}//g"| sed "s/ \~/~/g"| sed "s/~~/~null~/g"| sed "s/~~/~null~/g"| sed "s/~\r$/~null/g"


adding apostrophes to the varchar components in the file. We need the apostrophes so that they can be read as text in SQL. Booleans don't need the single quotes so we are in the clear for those. We also have to make sure that the nulls do not have apostrophes. The following pipeline shows that we need to have go by instance of each tilde to add in the apostrophe. And at the end, we had to make sure that the nulls didn't have apostrophes.

cat application.txt|sed "s/\t/~/g"|sed -r "s/( ){2}//g"| sed "s/ \~/~/g"| sed "s/~~/~null~/g"| sed "s/~~/~null~/g"| sed "s/~\r$/~null/g"| sed "s/~/~'/1"| sed "s/~/'~'/2" | sed "s/~/'~/3"| sed "s/~/~'/5"|sed "s/~/'~/6"| sed "s/~/~'/7"|sed "s/~/'~/8"|sed "s/~'null'~/~null~/g"| sed "s/\r$/'/g"| sed "s/~/~'/8"| sed "s/~'null/~null/g"


change the tilde into commas.

cat application.txt|sed "s/\t/~/g"|sed -r "s/( ){2}//g"| sed "s/ \~/~/g"| sed "s/~~/~null~/g"| sed "s/~~/~null~/g"| sed "s/~\r$/~null/g"| sed "s/~/~'/1"| sed "s/~/'~'/2" | sed "s/~/'~/3"| sed "s/~/~'/5"|sed "s/~/'~/6"| sed "s/~/~'/7"|sed "s/~/'~/8"|sed "s/~'null'~/~null~/g"| sed "s/\r$/'/g"| sed "s/~/~'/8"| sed "s/~'null/~null/g"|sed "s/~/,/g"


To get to our final command we just added the wording and syntax necessary for SQL in front of the line so that we can just copy and paste all the commands straight into SQL. http://my.cs.lmu.edu/~vpachec3/application.sql.txt

FINAL COMMAND

cat application.txt|sed "s/\t/~/g"|sed -r "s/( ){2}//g"| sed "s/ \~/~/g"| sed "s/~~/~null~/g"| sed "s/~~/~null~/g"| sed "s/~\r$/~null/g"| sed "s/~/~'/1"| sed "s/~/'~'/2" | sed "s/~/'~/3"| sed "s/~/~'/5"|sed "s/~/'~/6"| sed "s/~/~'/7"|sed "s/~/'~/8"|sed "s/~'null'~/~null~/g"| sed "s/\r$/'/g"| sed "s/~/~'/8"| sed "s/~'null/~null/g"|sed "s/~/,/g"|sed "s/^/insert into application(ApplNo,ApplType,SponsorApplicant,MostRecentLabelAvailableFlag,CurrentPatentFlag,ActionType,Chemical_Type,Ther_Potential,Orphan_code) values(/g" |sed "s/$/);/g"  > ~/public_html/application.sql.txt

Product text file

Here is where I left off on Monday 10/12

cat Product.txt | sed "s/\t/~/g"|sed "s/~~/~null~/g"| sed "s/~/~'/2"|sed "s/~/'~'/3"|sed "s/~/'~/4"|sed "s/~/~'/7"| sed "s/~/'~'/8"|sed "s/~/,/g"|sed "s/^/insert into Product(ApplNo,ProductNo,Form,Dosage,ProductMktStatus,TECode,ReferenceDrug,drugname,activeingred) values(/g" |sed "s/\r$/);/g" > ~/public_html/application.sql.txt


Links

Vpachec3 User Page