Hierarchical Queries in Oracle

1- What is hierarchical query?

A hierarchical query is a type of SQL query that handles hierarchical model data. They are special case of more general recursive fixpoint queries

2- Demo Database

In this post, I use SCOTT schema in order to illustrate the examples. If you do not have this schema, you can create it via the following Script:
** Create Table **
-- Create table
create table EMP
(
  EMPNO    NUMBER(4) not null,
  ENAME    VARCHAR2(10),
  JOB      VARCHAR2(9),
  MGR      NUMBER(4),
  HIREDATE DATE,
  SAL      NUMBER(7,2),
  COMM     NUMBER(7,2),
  DEPTNO   NUMBER(2)
)
alter table EMP
  add constraint EMP_PK primary key (EMPNO);

alter table EMP
  add constraint EMP_E_FK foreign key (MGR)
  references EMP (EMPNO);

 
** Insert Data **
insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7839, 'KING', 'PRESIDENT', null, to_date('17-11-1981', 'dd-mm-yyyy'), 5000, null, 10);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7566, 'JONES', 'MANAGER', 7839, to_date('02-04-1981', 'dd-mm-yyyy'), 2975, null, 20);


insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7698, 'BLAKE', 'MANAGER', 7839, to_date('01-05-1981', 'dd-mm-yyyy'), 2850, null, 30);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7782, 'CLARK', 'MANAGER', 7839, to_date('09-06-1981', 'dd-mm-yyyy'), 2450, null, 10);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7788, 'SCOTT', 'ANALYST', 7566, to_date('19-04-1987', 'dd-mm-yyyy'), 3000, null, 20);


insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7844, 'TURNER', 'SALESMAN', 7698, to_date('08-09-1981', 'dd-mm-yyyy'), 1500, 0, 30);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7876, 'ADAMS', 'CLERK', 7788, to_date('23-05-1987', 'dd-mm-yyyy'), 1100, null, 20);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7900, 'JAMES', 'CLERK', 7698, to_date('03-12-1981', 'dd-mm-yyyy'), 950, null, 30);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7902, 'FORD', 'ANALYST', 7566, to_date('03-12-1981', 'dd-mm-yyyy'), 3000, null, 20);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7934, 'MILLER', 'CLERK', 7782, to_date('23-01-1982', 'dd-mm-yyyy'), 1300, null, 10);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7369, 'SMITH', 'CLERK', 7902, to_date('17-12-1980', 'dd-mm-yyyy'), 800, null, 20);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7499, 'ALLEN', 'SALESMAN', 7698, to_date('20-02-1981', 'dd-mm-yyyy'), 1600, 300, 30);

insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7521, 'WARD', 'SALESMAN', 7698, to_date('22-02-1981', 'dd-mm-yyyy'), 1250, 500, 30);


insert into EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)
values (7654, 'MARTIN', 'SALESMAN', 7698, to_date('28-09-1981', 'dd-mm-yyyy'), 1250, 1400, 30);

commit;
In the EMP table:
  • EMPNO is ID of employee (Primary key of table)
  • MRG is ID of direct manager of the employee.

3- Hierarchical queries

Syntax of Hierarchical query:
** Syntax **
Select Column1, Column2, ...

From <Table1>, <Table2>, ...

Where <Condition3>

Connect By <Codition2>

Start With <Condition1>

3.1- Principle of operation

  1. Firstly, statement will retrieve all rows in the table correspond to the conditions in the 'start with' clause (condition1) as the root of the tree (root or level 1).
  2. Next, for each of root, it is needed to scan the entire table in order to take out the next records that sastifies the condition in the connect by clause (condition2) (level 2 node). Similarly, for each of level 2 node, it is needed to scan the entire table in order to take out the next records for the next level that sastifies the condition in the connect by clause (level 3 node), let's keep it continuous until there is no any records sastifying the connect by condition, the previous node is the level of tree.
  3. Finally, let's check the condition of where (condition3) clause in order to take out all records of the select tree clause.
For example, the hierarchical tree starts from those who do not have a manager (MRG is null).
Select Level
     ,Emp.Empno
     ,Emp.Ename
     
     , -- Manager no
      Emp.Mgr Manager_No
     
     , -- Name of manager
     -- Prior: The Operator point to prior record.
      Prior Emp.Ename Manager_Name
     
From   Emp
Connect By Prior Emp.Empno = Emp.Mgr
Start  With Emp.Mgr Is Null;
Results:
Tree model:

3.2- Removing a Node or a tree branch

Based on the operational mechanism that is above mentioned, you can solve the problem: How to remove 1 certain node or the entire branch of the tree.

Remove Nodes:

Remove any node: You need to wait for the tree to be successfully created ( connect by finished) and put conditions on where clause to remove the node
Select Level
     ,Emp.Empno Emp_No
     ,Lpad(' '
          ,4 * (Level - 1)) || Emp.Ename Emp_Name
     ,Emp.Mgr Manager_No
     ,Prior Emp.Ename Manager_Name
From   Emp
Where  Emp.Ename != 'JONES'
Connect By Prior Emp.Empno = Emp.Mgr
Start  With Emp.Mgr Is Null;

Remove a branch:

To remove any branch. You need to put conditions on the process of creating the tree, which is included in the CONNECT BY clause.
Select Level
     ,Emp.Empno Emp_No
     ,Lpad(' '
          ,4 * (Level - 1)) || Emp.Ename Emp_Name
     ,Emp.Mgr Manager_No
     ,Prior Emp.Ename Manager_Name
From   Emp
Connect By Prior Emp.Empno = Emp.Mgr
   And    Emp.Ename != 'JONES'
Start  With Emp.Mgr Is Null;

4- The virtual columns used in the Hierarchical query

Từ khóa/Toán tử Ý nghĩa
Level Show the level of node (depth level), root node with level 1
Prior Operator shows the previous record
Connect_By_Isleaf Leaf node or not? the leaf node is 1, if not is 0
CONNECT_BY_ROOT(Column) Return the column value of root node (level = 1)
NOCYCLE Operator in the 'connect by' clause to remove duplication. For example, A manages B, B manages C, C manages A. In case, connect by determines manager who will be duplicated endlessly and  NOCYCLE operator will be helpful in this case.
CONNECT_BY_ISCYCLE Column determine the value is be duplicated or not? The duplicated result is 1 and if not is 0. The column needs to be used together with  NOCYCLE operator in 'connect by' clause
SYS_CONNECT_BY_PATH(column, separated values) Create column value path from root node to current node, each of note is separated by "separated values".
ORDER SIBLINGS BY Arrange child nodes in the same level.
Example:
Select Level
     ,Emp.Empno Emp_No
     ,Lpad(' '
          ,4 * (Level - 1)) || Emp.Ename Emp_Name
     ,Emp.Mgr Mgr_No
     ,Prior Emp.Ename Mgr_Name
     ,Connect_By_Isleaf Is_Leaf -- 
     ,Connect_By_Root(Emp.Ename) Root_Mgr_Name -- Name of root employee
     ,Connect_By_Iscycle Iscycle -
     ,Sys_Connect_By_Path(Emp.Empno
                         ,':') Path -- The path
From   Emp
Connect By Nocycle Prior Emp.Empno = Emp.Mgr
Start  With Emp.Mgr Is Null
Order  Siblings By Emp.Ename;