Difference between revisions of "Kmill104 Week 8"
(creating week 8 page) |
(→Methods: fixing typo again again) |
||
(45 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
==Purpose== | ==Purpose== | ||
+ | The purpose of this week's assignment was to learn how data sets are organized and how relationships are defined between them, as well as how to download data and set up these relationships on our own database. It also gave us practice using queries in Microsoft Access and following the SQL format. | ||
==Methods== | ==Methods== | ||
+ | #My first step was to click on the [http://www.fda.gov/Drugs/InformationOnDrugs/ucm079750.htm Drugs@FDA Data Files] link on the Week 8 page. | ||
+ | #*On the database website, there is a file containing the FDA Drug Database's data sets. | ||
+ | #**Screenshot of website: [[File:DBScreenshot.png]] [http://www.fda.gov/Drugs/InformationOnDrugs/ucm079750.htm Drugs@FDA Data Files] | ||
+ | #*Clicking this file link then downloaded a zip folder, which I had to decompress in FileExplorer. | ||
+ | #*Within the folder, there were 11 text documents for each table in this image: [[File:DataBTables.png]] [http://www.fda.gov/Drugs/InformationOnDrugs/ucm079750.htm Drugs@FDA Data Files] | ||
+ | #I then created my own Microsoft Access database, and navigated to the External Data tab in the toolbar. | ||
+ | #*From there, I was able to click on New Data Source, which gave me the option to choose a text file. | ||
+ | #*I then imported each of the text files as a new table into the current database. | ||
+ | #For each file, I kept the data in the Delimited format and chose tab as the delimiter to separate the fields. | ||
+ | #*I also made sure to check off that the first row contained the field names. | ||
+ | #*For each field type, I kept the names the same but, if needed, I changed the data type to match what is seen on the FDA website. | ||
+ | #**On the website, each table column is assigned some data type, with the five options for data types being char, varchar, int, text, and datetime. | ||
+ | #**Submissions table as website example: [[File:DBSubmissions.png]] [http://www.fda.gov/Drugs/InformationOnDrugs/ucm079750.htm Drugs@FDA Data Files] | ||
+ | #**In Access, each column that has data type char on the website I then assigned the Short Text data type. For the others, varchar was assigned as Long Text, int as Integer or Long Integer, text as Long Text, and datetime as Date with Time. | ||
+ | #***For the Submissions and ApplicationDocs files, I ended up with import errors when downloading some of the columns, specifically the datetime columns. Because these columns were listed as NULL on the website, I then chose to not import them and I had no more import errors. | ||
+ | #*I also attempted to match the table's primary keys to what is seen on the website, but Access often stated that the column could not be a primary key as it contained duplicates or was missing data in some rows. In these cases, I chose there to be no primary key. | ||
+ | #After downloading each file and configuring the fields in its table, I navigated to the Database Tools tab and then clicked on Relationships. | ||
+ | #*In Relationships, I was able to click on "Edit Relationships" which opened a tool where I could enter my desired two tables under Table/Query: and Related Table/Query:, and then underneath these I could choose the related columns. After clicking OK, a line indicating the relationship is displayed between the two tables. | ||
+ | #*I repeated this step for every relationship that is shown on the FDA website, which is similarly indicated by lines between tables. | ||
+ | #**For those tables that had multiple columns in common, I made sure to define the relationship for each of them. | ||
+ | #After setting up my database, I was able to navigate to the Create tab, where I clicked QueryDesign to create a new query. The queries I used to answer the questions are indicated below, and they are all in the SQL format. | ||
==Results== | ==Results== | ||
#What are the names of the drug products that are administered in the form INJECTABLE;INTRAVENOUS, SUBCUTANEOUS? | #What are the names of the drug products that are administered in the form INJECTABLE;INTRAVENOUS, SUBCUTANEOUS? | ||
+ | #*Drug Names: [[Media:MillerQuery1.xlsx | Question 1 Answers]] | ||
+ | #**SELECT DrugName FROM Products WHERE form ='INJECTABLE;INTRAVENOUS, SUBCUTANEOUS'; | ||
#What are the names of the drug products whose active ingredient (activeingred) is ATROPINE? | #What are the names of the drug products whose active ingredient (activeingred) is ATROPINE? | ||
+ | #*Drug Names: [[Media:MillerQuery2.xlsx | Question 2 Answers]] | ||
+ | #**SELECT DrugName FROM Products WHERE ActiveIngredient ='ATROPINE'; | ||
#In what forms and dosages can the drug product named BENADRYL be administered? | #In what forms and dosages can the drug product named BENADRYL be administered? | ||
+ | #*Forms and Dosages: [[Media:MillerQuery3.xlsx | Question 3 Answers]] | ||
+ | #**SELECT Form, Strength FROM Products WHERE DrugName = 'BENADRYL'; | ||
#Which drug products have a name ending in ESTROL? | #Which drug products have a name ending in ESTROL? | ||
− | #Produce a table listing all of the known values for the therapeutic_potential column in the application table and how many application records there are of each. (Side note: The therapeutic_potential codes are explained in the ReviewClass_Lookup.txt file, in case you’re interested.) | + | #*Drug Names: [[Media:MillerQuery4.xlsx | Question 4 Answers]] |
+ | #**SELECT DrugName FROM Products WHERE DrugName LIKE '*ESTROL'; | ||
+ | #Produce a table listing all of the known values for the therapeutic_potential column in the application table and how many application records there are of each. (Side note: The therapeutic_potential codes are explained in the ReviewClass_Lookup.txt file, in case you’re interested.) | ||
+ | #*Not possible, no therapeutic_potential column | ||
#Produce a table listing all of the known values for the chemical_type column in the application table and how many application records there are of each. (Side note: The chemical_type codes are explained in the ChemTypeLookup.txt file, in case you’re interested.) | #Produce a table listing all of the known values for the chemical_type column in the application table and how many application records there are of each. (Side note: The chemical_type codes are explained in the ChemTypeLookup.txt file, in case you’re interested.) | ||
+ | #*Not possible, no chemical_type column | ||
#What are the names of the drug products that are sponsored (sponsor applicant column) by MERCK? | #What are the names of the drug products that are sponsored (sponsor applicant column) by MERCK? | ||
+ | #*Drug Names: [[Media:MillerQuery7.xlsx | Question 7 Answers]] | ||
+ | #**SELECT Products.DrugName FROM Products LEFT JOIN Applications ON Products.ApplNo = Applications.ApplNo WHERE (((Applications.[SponsorName])='MERCK')); | ||
#Which sponsor applicant companies have the text LABS in their names and have products whose active ingredients (activeingred) include both ASPIRIN and CAFFEINE? | #Which sponsor applicant companies have the text LABS in their names and have products whose active ingredients (activeingred) include both ASPIRIN and CAFFEINE? | ||
− | #Also answer, would you have been able to answer these questions using the web interface that the FDA provides to this database (Drugs@FDA?) | + | #*Sponsor Applicant Companies: [[Media:MillerQuery8.xlsx | Question 8 Answers]] |
+ | #**SELECT SponsorName, ActiveIngredient FROM Applications RIGHT JOIN Products ON Applications.ApplNo = Products.ApplNo WHERE ActiveIngredient LIKE '*ASPIRIN*CAFFEINE*' AND SponsorName LIKE '*LABS*'; | ||
+ | #Also answer, would you have been able to answer these questions using the web interface that the FDA provides to this database (Drugs@FDA?) No, the web interface [https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm search function] only allows the user to search by drug name, active ingredient, or application number. For many of the questions, we would not be able to find the specific answers for what we are searching for. For example, in Question 1, the FDA website does not have an option to search by form. Another example is in Question 9, where we have two requirements for our answers, which is that the name contains LABS and the active ingredients include both aspirin and caffeine. The FDA website cannot search by company name, and it can only search by one active ingredient at a time, so finding the answers for Question 9 is not possible. | ||
+ | |||
+ | ==Scientific Conclusion== | ||
+ | This project taught me how to download data that is stored in files, as well as how to convert it into tables for one's own database. I also learned how to define relationships between these tables, as well as how to use the query function in Microsoft Access. I did fulfill the purpose, as I learned how to download data sets and configure them, and I got some practice with performing queries. | ||
+ | |||
+ | ==Data and Files== | ||
+ | [[Media:KmillerDataBase.zip | Katie Miller's Database]] | ||
+ | |||
+ | [[Media:MillerQuery1.xlsx | Question 1 Answers]] | ||
+ | |||
+ | [[Media:MillerQuery2.xlsx | Question 2 Answers]] | ||
+ | |||
+ | [[Media:MillerQuery3.xlsx | Question 3 Answers]] | ||
+ | |||
+ | [[Media:MillerQuery4.xlsx | Question 4 Answers]] | ||
+ | |||
+ | [[Media:MillerQuery7.xlsx | Question 7 Answers]] | ||
+ | |||
+ | [[Media:MillerQuery8.xlsx | Question 8 Answers]] | ||
+ | |||
+ | ==Acknowledgements== | ||
+ | I worked with my classmates during class time on 3-07-24, 3-09-24, and 3-12-24. On the 7th and 9th, we worked under the guidance of Dr. Dionisio, and on the 12th under the guidance of [[User:Kdahlquist | Dr. Dahlquist]]. | ||
+ | |||
+ | Except for what is noted above, this individual journal entry was completed by me and not copied from another source. | ||
+ | |||
+ | [[User:Kmill104|Kmill104]] ([[User talk:Kmill104|talk]]) 22:59, 13 March 2024 (PDT) | ||
+ | |||
+ | ==References== | ||
+ | *Access SQL: basic concepts, vocabulary, and syntax—Microsoft Support. (n.d.). Retrieved March 13, 2024, from https://support.microsoft.com/en-us/office/access-sql-basic-concepts-vocabulary-and-syntax-444d0303-cde1-424e-9a74-e8dc3e460671?ui=en-us&rs=en-us&ad=us | ||
+ | *Drugs@FDA Data Files. (n.d.). FDA; FDA. Retrieved March 13, 2024, from https://www.fda.gov/drugs/drug-approvals-and-databases/drugsfda-data-files | ||
+ | *Drugs@FDA: FDA-Approved Drugs. (n.d.). Retrieved March 13, 2024, from https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm | ||
+ | *Ford, P. (2015, June 11). What Is Code? Bloomberg. https://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/ | ||
+ | *LMU BioDB 2024. (2024). Week 8. Retrieved March 13, 2024, from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Week_8 | ||
+ | |||
+ | {{Template:Kmill104}} |
Latest revision as of 22:50, 13 March 2024
Contents
Purpose
The purpose of this week's assignment was to learn how data sets are organized and how relationships are defined between them, as well as how to download data and set up these relationships on our own database. It also gave us practice using queries in Microsoft Access and following the SQL format.
Methods
- My first step was to click on the Drugs@FDA Data Files link on the Week 8 page.
- On the database website, there is a file containing the FDA Drug Database's data sets.
- Screenshot of website: Drugs@FDA Data Files
- Clicking this file link then downloaded a zip folder, which I had to decompress in FileExplorer.
- Within the folder, there were 11 text documents for each table in this image: Drugs@FDA Data Files
- On the database website, there is a file containing the FDA Drug Database's data sets.
- I then created my own Microsoft Access database, and navigated to the External Data tab in the toolbar.
- From there, I was able to click on New Data Source, which gave me the option to choose a text file.
- I then imported each of the text files as a new table into the current database.
- For each file, I kept the data in the Delimited format and chose tab as the delimiter to separate the fields.
- I also made sure to check off that the first row contained the field names.
- For each field type, I kept the names the same but, if needed, I changed the data type to match what is seen on the FDA website.
- On the website, each table column is assigned some data type, with the five options for data types being char, varchar, int, text, and datetime.
- Submissions table as website example: Drugs@FDA Data Files
- In Access, each column that has data type char on the website I then assigned the Short Text data type. For the others, varchar was assigned as Long Text, int as Integer or Long Integer, text as Long Text, and datetime as Date with Time.
- For the Submissions and ApplicationDocs files, I ended up with import errors when downloading some of the columns, specifically the datetime columns. Because these columns were listed as NULL on the website, I then chose to not import them and I had no more import errors.
- I also attempted to match the table's primary keys to what is seen on the website, but Access often stated that the column could not be a primary key as it contained duplicates or was missing data in some rows. In these cases, I chose there to be no primary key.
- After downloading each file and configuring the fields in its table, I navigated to the Database Tools tab and then clicked on Relationships.
- In Relationships, I was able to click on "Edit Relationships" which opened a tool where I could enter my desired two tables under Table/Query: and Related Table/Query:, and then underneath these I could choose the related columns. After clicking OK, a line indicating the relationship is displayed between the two tables.
- I repeated this step for every relationship that is shown on the FDA website, which is similarly indicated by lines between tables.
- For those tables that had multiple columns in common, I made sure to define the relationship for each of them.
- After setting up my database, I was able to navigate to the Create tab, where I clicked QueryDesign to create a new query. The queries I used to answer the questions are indicated below, and they are all in the SQL format.
Results
- What are the names of the drug products that are administered in the form INJECTABLE;INTRAVENOUS, SUBCUTANEOUS?
- Drug Names: Question 1 Answers
- SELECT DrugName FROM Products WHERE form ='INJECTABLE;INTRAVENOUS, SUBCUTANEOUS';
- Drug Names: Question 1 Answers
- What are the names of the drug products whose active ingredient (activeingred) is ATROPINE?
- Drug Names: Question 2 Answers
- SELECT DrugName FROM Products WHERE ActiveIngredient ='ATROPINE';
- Drug Names: Question 2 Answers
- In what forms and dosages can the drug product named BENADRYL be administered?
- Forms and Dosages: Question 3 Answers
- SELECT Form, Strength FROM Products WHERE DrugName = 'BENADRYL';
- Forms and Dosages: Question 3 Answers
- Which drug products have a name ending in ESTROL?
- Drug Names: Question 4 Answers
- SELECT DrugName FROM Products WHERE DrugName LIKE '*ESTROL';
- Drug Names: Question 4 Answers
- Produce a table listing all of the known values for the therapeutic_potential column in the application table and how many application records there are of each. (Side note: The therapeutic_potential codes are explained in the ReviewClass_Lookup.txt file, in case you’re interested.)
- Not possible, no therapeutic_potential column
- Produce a table listing all of the known values for the chemical_type column in the application table and how many application records there are of each. (Side note: The chemical_type codes are explained in the ChemTypeLookup.txt file, in case you’re interested.)
- Not possible, no chemical_type column
- What are the names of the drug products that are sponsored (sponsor applicant column) by MERCK?
- Drug Names: Question 7 Answers
- SELECT Products.DrugName FROM Products LEFT JOIN Applications ON Products.ApplNo = Applications.ApplNo WHERE (((Applications.[SponsorName])='MERCK'));
- Drug Names: Question 7 Answers
- Which sponsor applicant companies have the text LABS in their names and have products whose active ingredients (activeingred) include both ASPIRIN and CAFFEINE?
- Sponsor Applicant Companies: Question 8 Answers
- SELECT SponsorName, ActiveIngredient FROM Applications RIGHT JOIN Products ON Applications.ApplNo = Products.ApplNo WHERE ActiveIngredient LIKE '*ASPIRIN*CAFFEINE*' AND SponsorName LIKE '*LABS*';
- Sponsor Applicant Companies: Question 8 Answers
- Also answer, would you have been able to answer these questions using the web interface that the FDA provides to this database (Drugs@FDA?) No, the web interface search function only allows the user to search by drug name, active ingredient, or application number. For many of the questions, we would not be able to find the specific answers for what we are searching for. For example, in Question 1, the FDA website does not have an option to search by form. Another example is in Question 9, where we have two requirements for our answers, which is that the name contains LABS and the active ingredients include both aspirin and caffeine. The FDA website cannot search by company name, and it can only search by one active ingredient at a time, so finding the answers for Question 9 is not possible.
Scientific Conclusion
This project taught me how to download data that is stored in files, as well as how to convert it into tables for one's own database. I also learned how to define relationships between these tables, as well as how to use the query function in Microsoft Access. I did fulfill the purpose, as I learned how to download data sets and configure them, and I got some practice with performing queries.
Data and Files
Acknowledgements
I worked with my classmates during class time on 3-07-24, 3-09-24, and 3-12-24. On the 7th and 9th, we worked under the guidance of Dr. Dionisio, and on the 12th under the guidance of Dr. Dahlquist.
Except for what is noted above, this individual journal entry was completed by me and not copied from another source.
Kmill104 (talk) 22:59, 13 March 2024 (PDT)
References
- Access SQL: basic concepts, vocabulary, and syntax—Microsoft Support. (n.d.). Retrieved March 13, 2024, from https://support.microsoft.com/en-us/office/access-sql-basic-concepts-vocabulary-and-syntax-444d0303-cde1-424e-9a74-e8dc3e460671?ui=en-us&rs=en-us&ad=us
- Drugs@FDA Data Files. (n.d.). FDA; FDA. Retrieved March 13, 2024, from https://www.fda.gov/drugs/drug-approvals-and-databases/drugsfda-data-files
- Drugs@FDA: FDA-Approved Drugs. (n.d.). Retrieved March 13, 2024, from https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm
- Ford, P. (2015, June 11). What Is Code? Bloomberg. https://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/
- LMU BioDB 2024. (2024). Week 8. Retrieved March 13, 2024, from https://xmlpipedb.cs.lmu.edu/biodb/spring2024/index.php/Week_8
User Page
Assignment Pages
- Week 1
- Week 2
- Week 3
- Week 4
- Week 5
- Week 6
- Week 8
- Week 9
- Week 10
- Week 11
- Week 12
- Week 13
- Week 14
- Week 15
Individual Journal Entry Pages
- Kmill104 Week 1
- Kmill104 Week 2
- MSymond1 KMill104 Week 3
- Monarch Initiative Week 4
- Kmill104 Week 5
- Kmill104 Week 6
- Kmill104 Week 8
- Kmill104 Week 9
- Kmill104 Week 10
- Kmill104 Week 11
- Kmill104 Week 12
- Data Analysts Week 13
- Data Analysts Week 14
- Data Analysts Week 15