SQL Basics (DB2 Version)⚓︎
This post provides the SQL basics. I thought I better right it up as its been a long time since I used a relational database and I want to have some skills if I ever have to come back to using a relationship database. The examples below a based around a simple two table data structure: - Customers - Transactions
The majority of this post will be on SELECT Command - this command can be used to extract raw data and summary data from a database.
This command is used to store data into the database (e.g. customers and transactions). Example; Insert a new customer record into the customer or transaction tables.
This command is used to update data already stored in the database. E.g. Update the customer's email address or change the status of a transaction.
UNION ALL - Puts the output of more than one query together.⚓︎
SELECT ‘2019’, * from <schema>.TRANSACTION_2019 UNON ALL SELECT ‘2020’, * from <schema>.TRANSACTION_2020
DECIMAL- Converts a character string containing a full stop to a number⚓︎
SELECT DECIMAL(amount), DECIMAL(fee) FROM <schema>.transaction
CHAR - Converts a Integer into a character field⚓︎
SELECT * FROM <schema>.transaction WHERE CHAR(ID) = 'REFERENCE_ID'
BIGINT - Converts a character field into a Number. Note; the Character field can not contain a full stop.⚓︎
SELECT * FROM <schema>.transaction WHERE ID = BIGINT(REFERENCE_ID)
LENGTH - Returns the character length of the field.⚓︎
select LENGTH(email) from <schema>.CUSTOMER
DATE - Converts Character string into a Date Value And Converts a DateTime field to a Date Value⚓︎
select date('2004-01-01') from <schema>.TRANSACTION
select DATE(created_date_time) from <schema>.TRANSACTION
COLASE - Return a specified value when the displayed field contains NULL (Does not work for character fields)⚓︎
SELECT COALASE(DECIMAL(fee),0) FROM <schema>.TRANSACTION
SUBSTR - Returns certain characters from a specified field⚓︎
SELECT SUBSTR(account,1,3) from <schema>.TRANSACTION
POSSTR - Returns the first position of the character specified.⚓︎
SELECT POSSTR(‘@’, EMAIL) from <schema>.CUSTOMER
CASE - Allows you to apply logic to your output fields⚓︎
SELECT USE CASE WHEN DIRECTION = ‘I’ THEN ‘Incoming’ AND DIRECTION = ‘O’ THEN ‘Outgoin’ AND ELSE ‘Unknown’ END FROM <schema>.TRANSACTION
COUNT - Counts the number of records returned⚓︎
SELECT COUNT(1) from <schema>.CUSTOMER (to count all the customers).
MAX = Returns the maximum value for the field⚓︎
SELECT MAX(created_date_time) from <schema>.TRANSACTION
MIN - Returns the minimum value for the field⚓︎
SELECT MIN(created_date_time) from <schema>.TRANSACTION
AVG - Returns the average value for a field⚓︎
SELECT AVG(fee) from <schema>.TRANSACTION
STDEV - Returns the standard deviation of a field. Standard deviation is representation of the dispersal of values.⚓︎
SELECT STDEV(age) from <schema>.CUSTOMER
This means that 68% of customers are aged between the age of 20 and 30 This means that 96% of customers are aged between the age of 15 and 35
ORDER BY - The order by command allows users to sort the output of a query⚓︎
select FEE from <schema>.TRANSACTION order by FEE desc
GROUP BY - The Group By command allows users to summarize data⚓︎
SELECT DATE(created_date_time) , avg(fee) from <schema>.TRANSACTION group by DATE(created_date_time)
HAVING - Allows us to apply conditions to the results of a group by.⚓︎
SELECT customer_id, count(1) from <schema>.TRANSACTION group by customer_id HAVING count(1) > 100
Combing - SQL Functions can be combined to perform relatively complex logic.⚓︎
SELECT SUBSTR( POSSTR(‘@’, EMAIL) SUBSTR(POSSTR(‘@’, EMAIL) , COUNT(1) from <schema>.CUSTOMER GROUP BY SUBSTR( POSSTR(‘@’, EMAIL) SUBSTR(POSSTR(‘@’, EMAIL)
Creates a count of all the customers domains. The query is broken down as follows: - First, find the position of the “@” symbol - Second, find the position of the first “.” after the “@” symbol - Third, return the fields in between these to characters - Fourth, Group by this field to provide a summary of the information
- Inner joins are the most common join and have the best performance.
- Include only records that are contained in both tables being joined (intersection of the two datasets).
- Customers with no transactions would not be returned in the results.
SELECT * FROM TRANSACTION, CUSTOMER WHERE TRANSACTION.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
Full Outer Joins:⚓︎
- Full Outer joins are rarely used and have the poorest performance.
- The result of a full outer join includes data from both sides of the join even if no corresponding record is found within the other dataset.
SELECT * FROM TRANSACTION FULL OUTER JOIN CUSTOMER ON TRANSACTION.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
Left and Right Outer Joins:⚓︎
SELECT * FROM TRANSACTION LEFT OUTER JOIN CUSTOMER ON CUSTOMER.CUSTOMER_ID = TRANSCTION.CUSTOMER_ID