8.6 Enhancements in Oracle Database 10g
Oracle
Database
10g introduces some new features for
hierarchical queries. The new features include the CONNECT_BY_ROOT
operator, the new CONNECT_BY_ISCYCLE and CONNECT_BY_ISLEAF
pseudocolumns, and the NOCYCLE keyword. We will discuss each of these
enhancements in the following sections.
8.6.1 Getting Data from the Root Row
Remember how you can use the PRIOR operator to retrieve a value from
a node's parent row? You can now use the
CONNECT_BY_ROOT operator to retrieve
a value from a node's root. For example:
SELECT lname "Employee", CONNECT_BY_ROOT lname "Top Manager"
FROM employee
START WITH manager_emp_id = 7839
CONNECT BY PRIOR emp_id = manager_emp_id;
Employee Top Manager
-------------------- ------------
JONES JONES
SCOTT JONES
ADAMS JONES
FORD JONES
SMITH JONES
BLAKE BLAKE
ALLEN BLAKE
WARD BLAKE
MARTIN BLAKE
TURNER BLAKE
JAMES BLAKE
CLARK CLARK
MILLER CLARK
In this example, the hierarchy is built by starting with the rows
that meet the condition manager_emp_id = 7839.
This means that anyone whose manager is 7839 will be considered a
root for this query. Those employees will be listed in the result set
of the query along with the name of the top-most manager in their
tree. The CONNECT_BY_ROOT operator returns that top-most manager name
by accessing the root row for each row returned by the query.
8.6.2 Ignoring Cycles
Cycles are not
allowed in
a true tree structure. But life is not perfect, and someday
you're bound to encounter hierarchical data
containing cycles in which a node's child is also
its parent. Such cycles are usually not good, need to be fixed, but
can be frustratingly difficult to identify. You can try to find
cycles by issuing a START WITH . . . CONNECT BY query, but such a
query will report an error if there is a cycle (also known as a loop)
in the data. In Oracle Database 10g, all this
changes.
To allow the START WITH . . . CONNECT BY construct
to work properly even if cycles are present in the data, Oracle
Database 10g provides the new NOCYCLE keyword.
If there are cycles in your data, you can use the
NOCYCLE
keyword in the CONNECT BY clause, and you will not get an error when
hierarchically querying that data.
The test data we have in the employee table
doesn't have a cycle. To test the NOCYCLE feature,
you can introduce a cycle into the existing
employee data by updating the
manager_emp_id column of the top-most employee
(KING with emp_id=7839) with the
manager_emp_id of one of the lowest level
employees (MARTIN with emp_id =
7654):
UPDATE employee
SET manager_emp_id
= 7654
WHERE manager_emp_id IS NULL;
Now, if you perform the following hierarchical query, you will get an
ORA-01436 error:
SELECT LEVEL, LPAD(' ',2*(LEVEL - 1)) || lname "EMPLOYEE",
emp_id, manager_emp_id
FROM employee
START WITH emp_id = 7839
CONNECT BY PRIOR emp_id = manager_emp_id;
LEVEL EMPLOYEE EMP_ID MANAGER_EMP_ID
---------- -------------------- ---------- --------------
1 KING 7839 7654
2 JONES 7566 7839
3 SCOTT 7788 7566
4 ADAMS 7876 7788
3 FORD 7902 7566
4 SMITH 7369 7902
2 BLAKE 7698 7839
3 ALLEN 7499 7698
3 WARD 7521 7698
3 MARTIN 7654 7698
4 KING 7839 7654
5 JONES 7566 7839
6 SCOTT 7788 7566
7 ADAMS 7876 7788
6 FORD 7902 7566
ERROR:
ORA-01436: CONNECT BY loop in user data
15 rows selected.
Other than the error, notice that the whole tree starting with KING
starts repeating under MARTIN. This is erroneous and confusing. Use
the NOCYCLE keyword in the CONNECT BY clause to get rid of the error
message, and to prevent the listing of erroneously cyclic data:
SELECT LEVEL, LPAD(' ',2*(LEVEL - 1)) || lname "EMPLOYEE",
emp_id, manager_emp_id
FROM employee
START WITH emp_id = 7839
CONNECT BY NOCYCLE PRIOR emp_id = manager_emp_id;
LEVEL EMPLOYEE EMP_ID MANAGER_EMP_ID
---------- -------------------- ---------- --------------
1 KING 7839 7654
2 JONES 7566 7839
3 SCOTT 7788 7566
4 ADAMS 7876 7788
3 FORD 7902 7566
4 SMITH 7369 7902
2 BLAKE 7698 7839
3 ALLEN 7499 7698
3 WARD 7521 7698
3 MARTIN 7654 7698
3 TURNER 7844 7698
3 JAMES 7900 7698
2 CLARK 7782 7839
3 MILLER 7934 7782
This query recognizes that there is a cycle, ignores the cycle (as an
impact of the NOCYCLE keyword), and returns the rows as if there were
no cycles. Having the ability to query data containing cycles, your
next problem is to identify those cycles.
|
You can use the NOCYCLE keyword regardless of whether you have a
cycle in your data.
|
|
8.6.3 Identifying Cycles
It is sometimes difficult
to
identify cycles in hierarchical data. Oracle Database
10g's new pseudocolumn,
CONNECT_BY_ISCYCLE,
can help you identify the cycles in the data easily.
CONNECT_BY_ISCYCLE can be used only in conjunction with the NOCYCLE
keyword in a hierarchical query. The CONNECT_BY_ISCYCLE pseudocolumn
returns 1 if the current row has a child that is also its ancestor;
otherwise, it returns 0. For example:
SELECT lname, CONNECT_BY_ISCYCLE
FROM employee
START WITH emp_id = 7839
CONNECT BY NOCYCLE PRIOR emp_id = manager_emp_id;
LNAME CONNECT_BY_ISCYCLE
-------------------- ------------------
KING 0
JONES 0
SCOTT 0
ADAMS 0
FORD 0
SMITH 0
BLAKE 0
ALLEN 0
WARD 0
MARTIN 1
TURNER 0
JAMES 0
CLARK 0
MILLER 0
Since MARTIN is KING's manager in this data set, and
MARTIN also comes under KING in the organization tree, the row for
MARTIN has the value 1 for CONNECT_BY_ISCYCLE.
|
For correct results in subsequent queries, you should revert our
example data back to its original state by rolling back the earlier
change that forced a cycle in the data. If you have already committed
the previous UPDATE, you should update the
employee table again to set the
manager_emp_id column to NULL for KING.
|
|
8.6.4 Identifying Leaf Nodes
In a tree structure, the
nodes at the lowest level of the
tree are referred to as leaf nodes. Leaf nodes have no children.
CONNECT_BY_ISLEAF
is a pseudocolumn that returns 1 if the current row is a leaf, and
returns 0 if the current row is not a leaf. For example:
SELECT lname, CONNECT_BY_ISLEAF
FROM employee
START WITH manager_emp_id IS NULL
CONNECT BY PRIOR emp_id = manager_emp_id;
LNAME CONNECT_BY_ISLEAF
--------------- -----------------
KING 0
JONES 0
SCOTT 0
ADAMS 1
FORD 0
SMITH 1
BLAKE 0
ALLEN 1
WARD 1
MARTIN 1
TURNER 1
JAMES 1
CLARK 0
MILLER 1
This new feature can help simplify SQL statements that need to
identify all the leaf nodes in a hierarchy. Without this
pseudocolumn, to identify the leaf nodes, you would write a query
like the following:
SELECT emp_id, lname, salary, hire_date
FROM employee e
WHERE NOT EXISTS
(SELECT emp_id FROM employee e1 WHERE e.emp_id = e1.manager_emp_id);
EMP_ID LNAME SALARY HIRE_DATE
------- --------------- ---------- ---------
7369 SMITH 800 17-DEC-80
7499 ALLEN 1600 20-FEB-81
7521 WARD 1250 22-FEB-81
7654 MARTIN 1250 28-SEP-81
7844 TURNER 1500 08-SEP-81
7876 ADAMS 1100 23-MAY-87
7900 JAMES 950 03-DEC-81
7934 MILLER 1300 23-JAN-82
However, you can make this query much simpler by using the new
pseudocolumn CONNECT_BY_ISLEAF, as shown here:
SELECT emp_id, lname, salary, hire_date
FROM employee e
WHERE CONNECT_BY_ISLEAF = 1
START WITH manager_emp_id IS NULL
CONNECT BY PRIOR emp_id = manager_emp_id;
EMP_ID LNAME SALARY HIRE_DATE
------- --------------- ---------- ---------
7876 ADAMS 1100 23-MAY-87
7369 SMITH 800 17-DEC-80
7499 ALLEN 1600 20-FEB-81
7521 WARD 1250 22-FEB-81
7654 MARTIN 1250 28-SEP-81
7844 TURNER 1500 08-SEP-81
7900 JAMES 950 03-DEC-81
7934 MILLER 1300 23-JAN-82
This query builds the complete organization tree, and filters out
only the leaf nodes by performing the check CONNECT_BY_ISLEAF = 1.
|
No comments:
Post a Comment