Difference between revisions of "A Quick Relational Database Tour"

From LMU BioDB 2017
Jump to: navigation, search
(Loading Data: Start typing out data-loading instructions.)
(Loading Data: Adjust title; fix typo.)
Line 32: Line 32:
 
To start using a database, click on its icon. The red '''x''' disappears from the database icon and you should now be able to work.
 
To start using a database, click on its icon. The red '''x''' disappears from the database icon and you should now be able to work.
  
== Loading Data ==
+
== Creating a Table ==
  
 
It’s hard to use a database without any data—so let’s bring some in. For this tour, we’ll use some publicly-available data: the [https://www.fda.gov/Drugs/InformationOnDrugs/ucm079750.htm FDA Drug Database].
 
It’s hard to use a database without any data—so let’s bring some in. For this tour, we’ll use some publicly-available data: the [https://www.fda.gov/Drugs/InformationOnDrugs/ucm079750.htm FDA Drug Database].
Line 56: Line 56:
 
  select * from product
 
  select * from product
  
The '''Data Output''' tab of the ''Output Pane'' should now show an empty tabular display with headings for '''apple_no''', '''product_no''', '''form''', '''strength''', '''reference_drug''', '''drug_name''', '''active_ingredient''', and '''reference_standard'''.
+
The '''Data Output''' tab of the ''Output Pane'' should now show an empty tabular display with headings for '''appl_no''', '''product_no''', '''form''', '''strength''', '''reference_drug''', '''drug_name''', '''active_ingredient''', and '''reference_standard'''.
  
 
== SQL Injection ==
 
== SQL Injection ==

Revision as of 05:34, 26 September 2017

This page gives you a tutorial-style walkthrough of a relational database, specifically PostgreSQL. The walkthrough assumes that you are at a Seaver 120 lab computer.

Overview

Architecture-layers.png

This diagram is a simplified rendition of how many web databases are structured. Most of the time, we only interact with these systems through a web browser. Later this semester, we will interact with the web service layer. This tutorial gives you a taste of what it’s like to interact directly with a database.

Running PostgreSQL on the Lab Machines

  1. Login to the computer as usual
  2. Launch pgAdmin III
  3. The pgAdmin III window starts with a hierarchical view on the left that starts with three layers:
    • Server Groups
      • Servers (1)
        • PostgreSQL 9.6 (localhost:5432)
  4. Double-click on PostgreSQL 9.6 (localhost:5432) to connect to the database server
  5. The password to start the server will be given in class

Creating a Database

Once the server is running, the red x disappears from the PostgreSQL 9.6 (localhost:5432) icon, and additional icons appear beneath it. If you click on the + button to the left of the Databases icon, you will see the databases that are currently available. Initially, you will see a single database called postgres.

To create a database, right-click on the Databases icon and choose New Database... from the menu that appears. In the New Database dialog, the only information you need to supply is your new database’s name.

When you click OK, you will return to the main pgAdmin III window and you should see your new database underneath the Databases icon.

Note that you only need to go through this creation process once; that database will remain available until it is explicitly deleted.

Connecting to a Database

To start using a database, click on its icon. The red x disappears from the database icon and you should now be able to work.

Creating a Table

It’s hard to use a database without any data—so let’s bring some in. For this tour, we’ll use some publicly-available data: the FDA Drug Database.

Download the linked .zip file in the Download File section of that page. Or curl it from the command line! You can use curl -O to automatically save the download as a file.

When you decompress the file, you’ll see that it’s a bunch of .txt files. Many databases are distributed in this format for maximum interoperability. However, this means that we have to do some work to get it into an actual relational database management system like PostgreSQL.

First, we have to know what is in the database. The FDA provides this in the “Data Definitions and ERD for Drugs@FDA” section of the above-linked page. If you click it, you’ll see a diagram of how the data in the files are related and the column definitions of each table.

We don’t have time to load everything, so we’ll just take one: the Products table. We see its definition on the page, and we can translate this into an SQL CREATE statement:

create table product(appl_no varchar(6) not null, product_no varchar(6) not null primary key, form varchar(255), strength varchar(240), reference_drug int, drug_name varchar(125), active_ingredient varchar(255), reference_standard int)

To do this in pgAdmin III, click the SQL button in the toolbar. A new window with an SQL Editor tab appears. You can type (or copy-paste) the above command into that tab.

As always, watch out for typos! When ready, click on the Execute query button in the toolbar. (its button looks like a green play button)

Upon executing the query, a success message should appear in the Messages tab of the Output Pane in the bottom half of the window.

To assure yourself that the product table is indeed there, type and execute this query:

select * from product

The Data Output tab of the Output Pane should now show an empty tabular display with headings for appl_no, product_no, form, strength, reference_drug, drug_name, active_ingredient, and reference_standard.

SQL Injection

First, a little leadoff cartoon: http://xkcd.com/327

SQL injection is the term for a security exploit that attempts to bypass the web front end of a database and go directly to an SQL query. The exploit works on websites that do not—as stated in the XKCD strip—“sanitize their inputs.” In other words, a site is at risk for SQL injection if its query text field does not check for characters that might “break out of” the search term and thus spill into the SQL code.

This is probably best illustrated with an example…