Listing of the different types of queries
Equi Joins
Hierarchy Queries
Corelated Subqueries
Hierarchy Query SYS_CONNECT_BY_PATH
The Oracle CASE Statement
Listagg: Usage
Pivot Query
Cube, Rollup,Grouping
The WITH Clause
Equi Joins An equijoin is is a join condition having an equality operator on tables having rows containing similar values. Here care must be taken to include all the required columns in the join condition.
The join could be tables having a 1to 1 mapping or a 1 to many relation.
If a join is made on 2 tables having a 1 to 1 mapping of rows between the two tables, an ideal table design would be to have a primary key defined on the join columns on both the tables and to include all the required columns of the primary key in the join condition.
If the join is a 1 to many join, the join column could be a foreign key in the secondary table and index could be created on this column in the secondary table.
Below is an example of an employee and his office location details. Here at a time the employee could be only in 1 office location. We have a join of 4 tables here based on the principles of normalisation.
The employee details are in EMPLOYEES table.
The mapping of the employee to a location is in the EMP_LOCATION_DETAILS table.
The location address details are in the LOCATIONS table.
The country details for the employee location are in the COUNTRIES table.
SELECT emp.first_name,
emp.last_name,
loc.street_address,
loc.postal_code,
loc.city,
loc.state_province,
cnt.COUNTRY_NAME
FROM EMPLOYEES emp,
EMP_LOCATION_DETAILS e1d,
LOCATIONS loc,
COUNTRIES cnt
WHERE emp.EMPLOYEE_ID = e1d.EMPLOYEE_ID
AND e1d.LOCATION_ID = loc.LOCATION_ID
AND loc.COUNTRY_ID = cnt.COUNTRY_ID
AND emp.EMPLOYEE_ID IN (100,101,102,103,104);
Hierarchy Queries
A hierarchial query is a query which handles hierarchial model data.
Example - To trace a family tree from the youngest child to the oldest traceable great great grand parent.
Or to trace the organisation tree from the junior most employee to the highest level manager.
A few rules to be followed for hierarchial queries are
The CONNECT BY condition must be qualified by the PRIOR operator.
PRIOR evaluates the immediately following expression for the parent row of the current row in a hierarchial query. It causes Oracle to use the value of the immediate parent row in the expression column to connect to the child row.
LEVEL is a pseudo column which shows the level in the hierarchy.
Below is an example of a transaction table.
When a new trade is formed the status of the trade is marked as TRADE.
When an amend is put on that trade, the status of the new trade is marked as REJECT and the status of the amend is marked as TRADE.
Thus a trade can be amended multiple times and each time the status of the latest AMEND trade is marked as TRADE and the status of the previous AMEND trade is updated to REJECT.
Below is a query to get the latest amended trades in the table and the number of amend cycles it has gone through.
CREATE TABLE TRADE_TBL(
trade_id NUMBER,
txn_type VARCHAR2(100),
trade_sts VARCHAR2(100),
trade_rprtd_rfrnce VARCHAR2(100),
trade_rltd_rfrnce VARCHAR2(100)
);
ALTER TABLE TRADE_TBL ADD CONSTRAINT PK_TR_ID PRIMARY KEY (trade_id);
A sample set of trades with their amend history.
INSERT INTO TRADE_TBL VALUES(1003, 'AMNDTRADE', 'TRADE','AMND2','AMND1');
INSERT INTO TRADE_TBL VALUES(1002, 'AMNDTRADE', 'REJECT','AMND1','NEWT1');
INSERT INTO TRADE_TBL VALUES(1001, 'NEWTRADE', 'REJECT','NEWT1',NULL);
INSERT INTO TRADE_TBL VALUES(1014, 'AMNDTRADE', 'TRADE','AMNDTR3','AMNDTR2');
INSERT INTO TRADE_TBL VALUES(1013, 'AMNDTRADE', 'REJECT','AMNDTR2','AMNDTR1');
INSERT INTO TRADE_TBL VALUES(1012, 'AMNDTRADE', 'REJECT','AMNDTR1','NEWTR');
INSERT INTO TRADE_TBL VALUES(1011, 'NEWTRADE', 'REJECT','NEWTR',NULL);
INSERT INTO TRADE_TBL VALUES(1021, 'NEWTRADE', 'TRADE','TRREF1',NULL);
The hierarchy query to get the latest trade and the level of amends on that trade
SELECT a.*, LEVEL FROM TRADE_TBL a
WHERE TRADE_STS = 'TRADE'
CONNECT BY PRIOR TRADE_RPRTD_RFRNCE = TRADE_RLTD_RFRNCE
START WITH TRADE_RLTD_RFRNCE IS NULL
AND TXN_TYPE = 'NEWTRADE';
A corelated subquery is a subquery that contains a reference to a table that also appears in the outer query. It uses values from the outer query, requiring the inner query to execute once for each row of the outer query. Thus the two queries are interdependent on each other. The inner query takes the value from the outer query to derive an output which is in turn used by the outer query for further evaluation.
The working of the corelated subquery can become more clear with a few examples below
For the example we will use the HR sample schema provided with the Oracle database.
To get the list of employees who have handled more than one portfolio in the company.
SELECT FIRST_NAME || ' ' || LAST_NAME, JOB_ID
FROM EMPLOYEES e
WHERE EXISTS
(SELECT 1
FROM JOB_HISTORY h
WHERE e.employee_id = h.employee_id);
Using a corelated subquery in an dml update.
In the employee table we also want to store the city in which the employee is working.
Add a column in the EMPLOYEES table.
ALTER TABLE EMPLOYEES ADD (EMP_CITY VARCHAR2(30 BYTE));
In the employee table, for each employee, we corelate it with the DEPARTMENTS table and get the department location id.
This we join with the LOCATIONS table to get the location CITY. This CITY fetched from the inner query is updated in EMP_CITY column in the EMPLOYEES table.
UPDATE EMPLOYEES e
SET EMP_CITY =
(
SELECT CITY
FROM DEPARTMENTS d, LOCATIONS l
WHERE d.DEPARTMENT_ID = e.DEPARTMENT_ID
AND d.LOCATION_ID = l.LOCATION_ID
)
WHERE e.DEPARTMENT_ID IS NOT NULL;
Suppose we want to assign employees not allocated to any department to Seattle location, we can do that with NVL condition.
UPDATE EMPLOYEES e
SET EMP_CITY =
NVL((
SELECT CITY
FROM DEPARTMENTS d, LOCATIONS l
WHERE d.DEPARTMENT_ID = e.DEPARTMENT_ID
AND d.LOCATION_ID = l.LOCATION_ID
),'Seattle');
A corelated subquery can also be written as a normal join query between 2 tables, but in many cases it is useful to write the query as a corelated query. For example
To get a list of all customers who have at least 1 order in the ORDERS table you can use the EXISTS condition.
This also enhances the performance as EXISTS checks for a single occurance and evaluates to TRUE without scanning for further records.
The WITH Clause
The WITH clause (also known and Common Table Expression CTE) can be considered as a sub query, an inline view or a temporary table to store the data in cache, which can then be read multiple times in the main query. The WITH clause makes the query more manageable making it easier to read and maintain especially in the case of large queries involving multiple sub queries.
An example to select the departments where the count of employees are more than the overall average count of employees in each department.
WITH avg_dept_count as (
select count(EMPLOYEE_ID)/count(distinct(DEPARTMENT_ID)) avg_count FROM EMPLOYEES)
SELECT DEPARTMENT_ID, count(1) from employees
group by DEPARTMENT_ID
HAVING COUNT(1) > (SELECT avg_count from avg_dept_count);
DEPARTMENT_ID, COUNT(1) | |
---|---|
50 | 45 |
80 | 34 |
- a correlated sub query
- an inline query
- the WITH clause Here for each employee, it will corelate the department id of that employee with the department id in the sub query and get the average salary for that department.
- the WITH clause used as an inline subquery
SELECT EMPLOYEE_ID, FIRST_NAME || LAST_NAME emp_name, SALARY, DEPARTMENT_ID
FROM EMPLOYEES a
WHERE SALARY > (SELECT ROUND(AVG(SALARY)) avg_sal
FROM EMPLOYEES b WHERE a.DEPARTMENT_ID = b.DEPARTMENT_ID
GROUP BY b.DEPARTMENT_ID)
ORDER BY DEPARTMENT_ID, emp_name;
In the inline sub query example below, the sub query will compute department wise average salary.
The employee table is joined with this inline table on department id, and then checks if the employee salary is greater than the average salary for that department which is computed in the sub query.
SELECT EMPLOYEE_ID, FIRST_NAME || LAST_NAME emp_name, SALARY, a.DEPARTMENT_ID
FROM EMPLOYEES a, (SELECT ROUND(AVG(SALARY)) avg_sal, DEPARTMENT_ID
from employees b
group by DEPARTMENT_ID) tbl_avg_sal
WHERE
a.DEPARTMENT_ID = tbl_avg_sal.DEPARTMENT_ID
AND a.SALARY > tbl_avg_sal.avg_sal
ORDER BY a.DEPARTMENT_ID, emp_name;
Below is an example of using the WITH clause similar to the corelated sub query.
WITH dept_avg_sal AS (SELECT ROUND(AVG(SALARY)) avg_sal, DEPARTMENT_ID
from employees b
group by DEPARTMENT_ID)
SELECT EMPLOYEE_ID, FIRST_NAME || LAST_NAME emp_name, SALARY, DEPARTMENT_ID
FROM EMPLOYEES a
WHERE SALARY > (select avg_sal
FROM dept_avg_sal b WHERE a.DEPARTMENT_ID = b.DEPARTMENT_ID)
ORDER BY DEPARTMENT_ID, emp_name;
Below is an example of using the WITH clause similar to the inline sub query.WITH dept_avg_sal AS (SELECT ROUND(AVG(SALARY)) avg_sal, DEPARTMENT_ID
from employees b
group by DEPARTMENT_ID)
SELECT EMPLOYEE_ID, FIRST_NAME || LAST_NAME emp_name, SALARY, a.DEPARTMENT_ID, avg_sal
FROM EMPLOYEES a, dept_avg_sal b
WHERE a.DEPARTMENT_ID = b.DEPARTMENT_ID
AND SALARY > avg_sal
ORDER BY a.DEPARTMENT_ID, emp_name;
- The output from all three queries is the same as show below.
EMPLOYEE_ID EMP_NAME SALARY DEPARTMENT_ID 201 MichaelHartstein 13000 20 114 DenRaphaely 11000 30 121 AdamFripp 8200 50 185 AlexisBull 4100 50 193 BritneyEverett 3900 50 189 JenniferDilly 3600 50 188 KellyChung 3800 50 124 KevinMourgos 5800 50 120 MatthewWeiss 8000 50 184 NanditaSarchand 4200 50 122 PayamKaufling 7900 50 137 RenskeLadwig 3600 50 192 SarahBell 4000 50 123 ShantaVollman 6500 50 141 TrennaRajs 3500 50 103 AlexanderHunold 9000 60 104 BruceErnst 6000 60 147 AlbertoErrazuriz 12000 80 158 AllanMcEwen 9000 80 162 ClaraVishney 10500 80 163 DanielleGreene 9500 80 151 DavidBernstein 9500 80 149 EleniZlotkey 10500 80 174 EllenAbel 11000 80 148 GeraldCambrault 11000 80 169 HarrisonBloom 10000 80 156 JanetteKing 10000 80 145 JohnRussell 14000 80 146 KarenPartners 13500 80 168 LisaOzer 11500 80 157 PatrickSully 9500 80 152 PeterHall 9000 80 150 PeterTucker 10000 80 170 TaylerFox 9600 80 100 StevenKing 30000 90 109 DanielFaviet 9000 100 108 NancyGreenberg 12008 100 205 ShelleyHiggins 12008 110
The explain plan for all three queries is as below
EXPLAIN PLAN FOR {query}
The corelated sub query and the WITH clause query have the same execution plan and cost.
The inline query has a different plan and overall has a lower cost.
Execution plan for query WITH clause and corelated subquery
Plan hash value: 2823490115
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 62 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL | EMPLOYEES | 107 | 2782 | 3 (0)| 00:00:01 |
| 3 | SORT GROUP BY NOSORT| | 1 | 7 | 3 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | EMPLOYEES | 10 | 70 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("SALARY"> (SELECT ROUND(SUM("SALARY")/COUNT("SALARY")) FROM
"EMPLOYEES" "B" WHERE "B"."DEPARTMENT_ID"=:B1 GROUP BY
"B"."DEPARTMENT_ID"))
4 - filter("B"."DEPARTMENT_ID"=:B1)
Plan hash value: 112417831
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 714 | 8 (25)| 00:00:01 |
| 1 | SORT ORDER BY | | 17 | 714 | 8 (25)| 00:00:01 |
|* 2 | HASH JOIN | | 17 | 714 | 7 (15)| 00:00:01 |
| 3 | VIEW | | 11 | 176 | 4 (25)| 00:00:01 |
| 4 | HASH GROUP BY | | 11 | 77 | 4 (25)| 00:00:01 |
| 5 | TABLE ACCESS FULL| EMPLOYEES | 107 | 749 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | EMPLOYEES | 107 | 2782 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"."DEPARTMENT_ID"="TBL_AVG_SAL"."DEPARTMENT_ID")
filter("A"."SALARY">"TBL_AVG_SAL"."AVG_SAL")
Thus we see that the cost of the query using inline subquery is less than that of the corelated sub query and the WITH clause. Based on the nature, size and distribution of data the programmer can judiciously decide the type of query to use which gives the fastest output with minimum resources.
Another Example of the WITH clause using Analytic function
To get the list of all employees from the same department have same job title as ADAMS.
WITH x AS ( SELECT first_name || ' ' || last_name ename, job_id job, department_id deptno, COUNT(DECODE(first_name, 'Bruce', 1)) OVER(PARTITION BY department_id, job_id) cnt FROM hr.employees ) SELECT ename, job, deptno FROM x WHERE cnt > 0 ORDER BY 2, 3, 1;
ENAME | JOB | DEPTNO |
---|---|---|
Alexander Hunold | IT_PROG | 60 |
Bruce Ernst | IT_PROG | 60 |
David Austin | IT_PROG | 60 |
Diana Lorentz | IT_PROG | 60 |
Valli Pataballa | IT_PROG | 60 |
References
https://sqlpatterns.wordpress.com
References
A Positive thought for the day
Believe you can and you are half way there
--Theodore Roosevelt
No comments:
Post a Comment