This specifies a search condition for a group.
select sender_id, sum(amount) as balance from transactions
where sender_id is not NULL
group by sender_id HAVING balance > 20
order by balance;
Get an average
SELECT AVG(age) from USERS
WHERE country_code = 'AU';
If no precision is given, will round to nearest integer
ROUND (value, precision)
SELECT song_name, ROUND(AVG(song_length),1)
A single value that's derived by combing several other values.
get a count of records rather than display the data
SELECT COUNT(*) from employees;
Returns the sum of a set of values
SELECT SUM(amount) FROM transactions
WHERE user_id = 9;
Get a max value, and in this case, change the displayed column data from MAX(age) to age
SELECT MAX(age) as age from users
where is_admin = true;
SELECT MIN(age) from users;
Group rows with similar values into summary rows. It returns one row for each group.
Each group can have an aggregate function applied to it that operates only on the group data
Sorts queries, by default in ascending order.
SELECT name, price, quantity FROM products
ORDER BY price;
-- if there is a where clause, put it before ORDER BY
SELECT * FROM transactions
WHERE amount > 10
ORDER BY amount DESC;
Select or view data from a table(s)
SELECT <fields> FROM <table>;
SELECT amount from transactions;
SELECT sender_id AS id from transaction; -- rename the field just for displaying this query
DISTINCT. Use when you want to eliminate any duplicates
SELECT DISTINCT country_code FROM users;
select records where a condition is met
SELECT username FROM users WHERE is_admin IS true;
SELECT username FROM users WHERE hobby IS NOT NULL;
-- add in some logical operators
WHERE shipment_status = 'pending' AND quantity BETWEEN 0 AND 10;
-- user parantheses to adjust order of operations
(this AND that) OR the_other
SELECT COUNT(*) AS junior_count FROM USERS
WHERE (country_code = 'US' OR country_code = 'AU')
AND age < 18;
The IN operator is shorthand for multiple OR conditions
SELECT product_name, shipment_status
FROM product
WHERE shipment_status IN ('shipped','preparing','out of stock');
-- not using IN
SELECT product_name, shipment_status
FROM product
WHERE shipment_status = 'shipped'
OR shipment_status = 'preparing'
OR shipment_status = 'out of stock'
Can be put at the end of a SELECT statement to limit the number of records returned
SELECT * FROM products
WHERE product_name LIKE '%cocko%'
LIMIT 50;
Allows use of % and _ for wildcard operations.
% will match zero or more characters. Similar to how * is used as a wildcard most other places
_ will match only matches 1 character
SELECT* from products
WHERE product_name LIKE 'fu%';
-- or
WHERE product_name LIKE '%it';
-- or
WHERE product_name LIKE 'fu_';
-- or
WHERE product_name LIKE 'fu__';
You specify the fields you want to insert into, then the data
INSERT INTO users (
id,
name,
age
) VALUES (
1,
"cocko",
69
);
Create a new table in a database
CREATE TABLE <table name> ( <field name> <field type>, ... );
CREATE TABLE employees(
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
age INTEGER,
is_manager BOOLEAN,
salary INTEGER
);
Creating foreign keys
Below:
CONSTRAINT fk_departments creates a constraint called fk_departments
FOREIGN KEY (department_id) makes this constraint a foreign key assigned to the department_id field
REFERENCES department(id) link the foreign field id from the departments table
CREATE TABLE departments (
id INTEGER PRIMARY KEY,
department_name TEXT NOT NULL
);
CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
department_id INTEGER,
CONSTRAINT fk_departments
FOREIGN KEY (department_id)
REFERENCES departments(id)
);
While both INT and INTEGER are functionally the same, INTEGER is a standard compliant keyword.
Same with BOOLEAN. BOOL can be used, but BOOLEAN is preferred.
Change a table' structure.
-- rename a table
ALTER TABLE employees
RENAME TO contractors;
-- rename a column
ALTER TABLE contractors
RENAME COLUMN salary TO invoice;
-- add a column
ALTER TABLE contractors
ADD COLUMN job_title TEXT;
-- DROP (delete) a column
ALTER TABLE contractors
DROP COLUMN is_manager;
Some databases do support adding multiple columns in a single ALTER TABLE. Some don't
SQLite doesn't. Each column needs to be added in a seperate ALTER TABLE command.
Removes all records from a table that match the WHERE clause
DELETE FROM users WHERE id = 69;
Update the fields in one or more records
UPDATE users
SET is_admin = true
WHERE id = 69;
UPDATE users
SET job_title = "Big cocko", salary = 999999
WHERE id = 200;
not commonly supported.
select *,
iif (was_successful = true, 'No action required','Perform an audit') as audit
from transactions;
select *,
iif ( country_code = 'CA' OR age > 55, true, false) as discount_eligible
from users;
When creating a join table, it could be an idea to add the UNIQUE constraint as duplicates don't make sense. Below, it doens't make sense for there to be multiple entries in the users_countries join table where both entries are the same.
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
username TEXT UNIQUE NOT NULL,
);
CREATE TABLE countries (
id INTEGER PRIMARY KEY,
country_code TEXT,
name TEXT
);
CREATE TABLE users_countries (
country_id INTEGER,
user_id INTEGER,
UNIQUE(country_id, user_id)
);
A database migration is a change to the structure of a database.
Every migration records how the structure changes over time.
Good migrations are small, incremental and ideally reversible
UP Migrations applies changes to move the scheme forward
DOWN migrations rolls those changes back to the previous state
There are tools to help manage this
Go - Goose
Python - Alembic
In SQL, a null value indicates the absence of a value. Different to a zero value
When creating a table, columns can be constrained such that they can't be NULL
PRIMARY KEY - uniquely identified each row (i would assume this implies not null and unique)
NOT NULL - can't be null
UNIQUE - no two rows can have the same value
You can have a query that uses the result of a previous query rather than from the database.
The syntax for a subquery is that it is surrounded by parentheses.
In the below example, IN is uses as the subquery could return multiple rows
SELECT id, song_name, artist_id
FROM songs
WHERE artist_id IN (
SELECT id
FROM artists
WHERE artist_name LIKE 'Rick%'
);
The Simplist and default join is the Inner Join.
It returns all records in table_1 that have matching recordes in table_2
It returns all fields from both tables.
The INNER keybword affect the number of rows returned, not the number of columns.
department_id is the field in the employees table
department.id is the id field in the departments table
ON clause ensure that rows are matched based on these columns, creating the relationship
SELECT * FROM employees
INNER JOIN departments
ON employees.department_id = departments.id;
SELECT users.name, age, countries.name AS country_name
FROM users
INNER JOIN countries ON countries.country_code = users.country_code
order by country_name;
SELECT users.name, SUM(transactions.amount) AS sum, count(transactions.id) AS count
FROM users
LEFT JOIN transactions
ON users.id = transactions.user_id
GROUP BY users.id
ORDER BY sum DESC;
Will return all records in table_1 and any matching records from table_2
Will return all records in table_2 regardless of matches, and all matching records between the tables
Returns all records from both tables
the primary key is automatically indexed.
If you want to index other columns, you manually specify it.
Don't index too much. Only fields that you often to lookups on.
Indexing more than you need uses up memory and slows down writes
It's common to name an index after the column it's created on with the suffix _idx
CREATE INDEX index_name ON table_name (column_name);
eg
CREATE INDEX first_name_last_name_age_idx
ON users (first_name, last_name, age);
A multi column index is sorted by first column then 2nd... an so on
A lookup only on the first column as almost as fast as a single column index, but lookups on the 2nd or subsequent column of a multi column index don't perform well.
Only do a multi column index if you are frequently doing lookups on a specific combination of columns