SQL basics

SQL is a programming language for working with data stored in relational databases. There are many flavors of SQL used by various database systems (MySQL, Postgres, Oracle, SQL Server etc.), but they share the same basic syntax and functionality. Here we show how to use SQL while working with databases created using SQLite software.

SQL in Jupyter Notebook

In order to execute SQL code in Jupyter Notebook, we will use the ipython-sql extension of the notebook. The following command, executed in a notebook, will install this extension:

[1]:
Copy to clipboard
%pip install ipython-sql
Copy to clipboard

Next, we need to load the extension:

[2]:
Copy to clipboard
 %load_ext sql
Copy to clipboard

From now on, when the first line of a code cell is %%sql, the rest of the cell will be evaluated as an SQL (and not Python) code.

Creating tables

To access data stored in a database, we need to connect to the database first. Typically this requires connecting to a remote database server, but SQLite is special in this respect - its databases are saved in files stored locally on a computer. The command below connects to a database stored in a file tiny.sqlite. If such file does not exist, it will be automatically created:

[3]:
Copy to clipboard
%%sql
sqlite:///tiny.sqlite
Copy to clipboard
[3]:
Copy to clipboard
'Connected: @tiny.sqlite'
Copy to clipboard

The newly created database is empty. The code below creates in it a table named courses with three columns named course_id, course_name, and credits:

[ ]:
Copy to clipboard
%%sql
CREATE TABLE courses (
    course_id TEXT,
    course_name TEXT,
    credits INT);
Copy to clipboard

The code

course_id TEXT,
course_name TEXT,
credits INT
Copy to clipboard

specifies that the columns course_id and course_name will be used to store text, and the credits column will contain integer values. While many database systems enforce such data type specifications, SQLite treats them merely as suggestions - e.g. it will let us store text values in the credits column without any problems.

SQL does not provide a standard way of listing tables existing in a database. However, every SQLite database contains a special table sqlite_master which records all database components. The code below retrieves the whole content of sqlite_master. It shows that the courses table now exists in the database:

[5]:
Copy to clipboard
%%sql
SELECT *
FROM sqlite_master;
Copy to clipboard
 * sqlite:///tiny.sqlite
Done.
Copy to clipboard
[5]:
Copy to clipboard
type name tbl_name rootpage sql
table courses courses 2 CREATE TABLE courses (
course_id TEXT,
course_name TEXT,
credits INT)

Deleting tables

In order to delete a table my_table from a database we use the statement

DROP TABLE  my_table;
Copy to clipboard

This will generate an error if there is no table with the specified name:

[6]:
Copy to clipboard
%%sql
DROP TABLE  my_table;
Copy to clipboard
 * sqlite:///tiny.sqlite
(sqlite3.OperationalError) no such table: my_table
[SQL: DROP TABLE  my_table;]
(Background on this error at: http://sqlalche.me/e/e3q8)
Copy to clipboard

This error can be avoided by adding a check if the table exists prior to deleting it:

[ ]:
Copy to clipboard
%%sql
DROP TABLE IF EXISTS my_table;
Copy to clipboard

Creating table records

The statement

SELECT *
FROM my_table
Copy to clipboard

retrieves all content of my_table. Applied to the courses table, it will give just column headers, since at the moment this table is empty:

[8]:
Copy to clipboard
%%sql
SELECT *
FROM courses;
Copy to clipboard
 * sqlite:///tiny.sqlite
Done.
Copy to clipboard
[8]:
Copy to clipboard
course_id course_name credits

Records can be added to a table using the syntax

INSERT INTO my_table (column1, column2 ,..)
VALUES (value1, value2 ,...);
Copy to clipboard

This adds a new record to my_table with value1 placed in column1 etc.:

[9]:
Copy to clipboard
%%sql
INSERT INTO courses (course_id, course_name, credits)
VALUES ('MTH 141','Calculus I', 4);
Copy to clipboard
 * sqlite:///tiny.sqlite
1 rows affected.
Copy to clipboard
[9]:
Copy to clipboard
[]
Copy to clipboard

We can verify that a record has been added to the courses table:

[10]:
Copy to clipboard
%%sql
SELECT *
FROM courses;
Copy to clipboard
 * sqlite:///tiny.sqlite
Done.
Copy to clipboard
[10]:
Copy to clipboard
course_id course_name credits
MTH 141 Calculus I 4

Several records can be added to a table at once:

[11]:
Copy to clipboard
%%sql
INSERT INTO courses (course_id, course_name, credits)
VALUES
  ('MTH 142', 'Calculus II', 4),
  ('MTH 448', 'Data Oriented Computing', 3),
  ('MTH 462', 'Topics in Analysis', 2);

SELECT *
FROM courses;
Copy to clipboard
 * sqlite:///tiny.sqlite
3 rows affected.
Done.
Copy to clipboard
[11]:
Copy to clipboard
course_id course_name credits
MTH 141 Calculus I 4
MTH 142 Calculus II 4
MTH 448 Data Oriented Computing 3
MTH 462 Topics in Analysis 2

Modifying records

Existing records can be modified using the statement

UPDATE my_table
SET
    column_1 = new_value_1,
    column_2 = new_value_2
WHERE
    condition;
Copy to clipboard

The condition given after WHERE specifies that only the records for which this condition holds true are to be changed. For example, the code below changes the number of credits of the MTH 141 course to 6:

[12]:
Copy to clipboard
%%sql
UPDATE courses
SET
    credits = 6
WHERE
    course_id = 'MTH 141';

SELECT *
FROM courses;
Copy to clipboard
 * sqlite:///tiny.sqlite
1 rows affected.
Done.
Copy to clipboard
[12]:
Copy to clipboard
course_id course_name credits
MTH 141 Calculus I 6
MTH 142 Calculus II 4
MTH 448 Data Oriented Computing 3
MTH 462 Topics in Analysis 2

Deleting records

The statement

DELETE FROM my_table
WHERE
    condition;
Copy to clipboard

deletes from my_table all records for which the specified condition is satisfied. For example, we can use it to delete all courses with less than 4 credits from the courses table:

[13]:
Copy to clipboard
%%sql
DELETE FROM courses
WHERE
    credits < 4;

SELECT *
FROM courses;
Copy to clipboard
 * sqlite:///tiny.sqlite
2 rows affected.
Done.
Copy to clipboard
[13]:
Copy to clipboard
course_id course_name credits
MTH 141 Calculus I 6
MTH 142 Calculus II 4

Note

Here are a few comments on the general SQL syntax:

  1. Indentations and newlines in SQL code are ignored, but they can be used to increase code readability.

  2. While it is customary to capitalize keywords in SQL code, this is not necessary. select * from courses is equivalent to SELECT * FROM courses.

  3. The semicolon ; marks the end of an SQL statement and separates each statement from the next one. If we are executing only one statement, its ending semicolon can be omitted.

  4. In general, names of tables, columns, and other database objects can be used without quotes. The exception from this rule are names which contain special characters (e.g. spaces) or coincide with SQL keywords. Such names must be enclosed in “double quotes”. Good practice is to avoid names which require this treatment.

  5. All other strings in SQL code (data in table records etc.) must be enclosed in ‘single quotes’.