Tuesday, January 19, 2010

8.6 Enhancements in Oracle Database 10g











 < Day Day Up > 







8.6 Enhancements in Oracle Database 10g





Oracle

Database

10g introduces some new features for

hierarchical queries. The new features include the CONNECT_BY_ROOT

operator, the new CONNECT_BY_ISCYCLE and CONNECT_BY_ISLEAF

pseudocolumns, and the NOCYCLE keyword. We will discuss each of these

enhancements in the following sections.







8.6.1 Getting Data from the Root Row





Remember how you can use the PRIOR operator to retrieve a value from

a node's parent row? You can now use the







CONNECT_BY_ROOT operator to retrieve

a value from a node's root. For example:





SELECT lname "Employee", CONNECT_BY_ROOT lname "Top Manager"

FROM employee

START WITH manager_emp_id = 7839

CONNECT BY PRIOR emp_id = manager_emp_id;



Employee Top Manager

-------------------- ------------

JONES JONES

SCOTT JONES

ADAMS JONES

FORD JONES

SMITH JONES

BLAKE BLAKE

ALLEN BLAKE

WARD BLAKE

MARTIN BLAKE

TURNER BLAKE

JAMES BLAKE

CLARK CLARK

MILLER CLARK







In this example, the hierarchy is built by starting with the rows

that meet the condition manager_emp_id = 7839.

This means that anyone whose manager is 7839 will be considered a

root for this query. Those employees will be listed in the result set

of the query along with the name of the top-most manager in their

tree. The CONNECT_BY_ROOT operator returns that top-most manager name

by accessing the root row for each row returned by the query.









8.6.2 Ignoring Cycles





Cycles are not



allowed in

a true tree structure. But life is not perfect, and someday

you're bound to encounter hierarchical data

containing cycles in which a node's child is also

its parent. Such cycles are usually not good, need to be fixed, but

can be frustratingly difficult to identify. You can try to find

cycles by issuing a START WITH . . . CONNECT BY query, but such a

query will report an error if there is a cycle (also known as a loop)

in the data. In Oracle Database 10g, all this

changes.





To allow the START WITH . . . CONNECT BY construct

to work properly even if cycles are present in the data, Oracle

Database 10g provides the new NOCYCLE keyword.

If there are cycles in your data, you can use the



NOCYCLE

keyword in the CONNECT BY clause, and you will not get an error when

hierarchically querying that data.





The test data we have in the employee table

doesn't have a cycle. To test the NOCYCLE feature,

you can introduce a cycle into the existing

employee data by updating the

manager_emp_id column of the top-most employee

(KING with emp_id=7839) with the

manager_emp_id of one of the lowest level

employees (MARTIN with emp_id =

7654):





UPDATE employee

SET manager_emp_id

= 7654

WHERE manager_emp_id IS NULL;







Now, if you perform the following hierarchical query, you will get an

ORA-01436 error:





SELECT LEVEL, LPAD('  ',2*(LEVEL - 1)) || lname "EMPLOYEE", 

emp_id, manager_emp_id

FROM employee

START WITH emp_id = 7839

CONNECT BY PRIOR emp_id = manager_emp_id;



LEVEL EMPLOYEE EMP_ID MANAGER_EMP_ID

---------- -------------------- ---------- --------------

1 KING 7839 7654

2 JONES 7566 7839

3 SCOTT 7788 7566

4 ADAMS 7876 7788

3 FORD 7902 7566

4 SMITH 7369 7902

2 BLAKE 7698 7839

3 ALLEN 7499 7698

3 WARD 7521 7698

3 MARTIN 7654 7698

4 KING 7839 7654

5 JONES 7566 7839

6 SCOTT 7788 7566

7 ADAMS 7876 7788

6 FORD 7902 7566

ERROR:

ORA-01436: CONNECT BY loop in user data







15 rows selected.







Other than the error, notice that the whole tree starting with KING

starts repeating under MARTIN. This is erroneous and confusing. Use

the NOCYCLE keyword in the CONNECT BY clause to get rid of the error

message, and to prevent the listing of erroneously cyclic data:





SELECT LEVEL, LPAD('  ',2*(LEVEL - 1)) || lname "EMPLOYEE", 

emp_id, manager_emp_id

FROM employee

START WITH emp_id = 7839

CONNECT BY NOCYCLE PRIOR emp_id = manager_emp_id;



LEVEL EMPLOYEE EMP_ID MANAGER_EMP_ID

---------- -------------------- ---------- --------------

1 KING 7839 7654

2 JONES 7566 7839

3 SCOTT 7788 7566

4 ADAMS 7876 7788

3 FORD 7902 7566

4 SMITH 7369 7902

2 BLAKE 7698 7839

3 ALLEN 7499 7698

3 WARD 7521 7698

3 MARTIN 7654 7698

3 TURNER 7844 7698

3 JAMES 7900 7698

2 CLARK 7782 7839

3 MILLER 7934 7782







This query recognizes that there is a cycle, ignores the cycle (as an

impact of the NOCYCLE keyword), and returns the rows as if there were

no cycles. Having the ability to query data containing cycles, your

next problem is to identify those cycles.







You can use the NOCYCLE keyword regardless of whether you have a

cycle in your data.












8.6.3 Identifying Cycles





It is sometimes difficult

to



identify cycles in hierarchical data. Oracle Database

10g's new pseudocolumn,





CONNECT_BY_ISCYCLE,

can help you identify the cycles in the data easily.

CONNECT_BY_ISCYCLE can be used only in conjunction with the NOCYCLE

keyword in a hierarchical query. The CONNECT_BY_ISCYCLE pseudocolumn

returns 1 if the current row has a child that is also its ancestor;

otherwise, it returns 0. For example:





SELECT lname, CONNECT_BY_ISCYCLE

FROM employee

START WITH emp_id = 7839

CONNECT BY NOCYCLE PRIOR emp_id = manager_emp_id;



LNAME CONNECT_BY_ISCYCLE

-------------------- ------------------

KING 0

JONES 0

SCOTT 0

ADAMS 0

FORD 0

SMITH 0

BLAKE 0

ALLEN 0

WARD 0

MARTIN 1

TURNER 0

JAMES 0

CLARK 0

MILLER 0







Since MARTIN is KING's manager in this data set, and

MARTIN also comes under KING in the organization tree, the row for

MARTIN has the value 1 for CONNECT_BY_ISCYCLE.







For correct results in subsequent queries, you should revert our

example data back to its original state by rolling back the earlier

change that forced a cycle in the data. If you have already committed

the previous UPDATE, you should update the

employee table again to set the

manager_emp_id column to NULL for KING.












8.6.4 Identifying Leaf Nodes





In a tree structure, the

nodes at the lowest level of the

tree are referred to as leaf nodes. Leaf nodes have no children.





CONNECT_BY_ISLEAF

is a pseudocolumn that returns 1 if the current row is a leaf, and

returns 0 if the current row is not a leaf. For example:





SELECT lname, CONNECT_BY_ISLEAF

FROM employee

START WITH manager_emp_id IS NULL

CONNECT BY PRIOR emp_id = manager_emp_id;



LNAME CONNECT_BY_ISLEAF

--------------- -----------------

KING 0

JONES 0

SCOTT 0

ADAMS 1

FORD 0

SMITH 1

BLAKE 0

ALLEN 1

WARD 1

MARTIN 1

TURNER 1

JAMES 1

CLARK 0

MILLER 1







This new feature can help simplify SQL statements that need to

identify all the leaf nodes in a hierarchy. Without this

pseudocolumn, to identify the leaf nodes, you would write a query

like the following:





SELECT emp_id, lname, salary, hire_date

FROM employee e

WHERE NOT EXISTS

(SELECT emp_id FROM employee e1 WHERE e.emp_id = e1.manager_emp_id);



EMP_ID LNAME SALARY HIRE_DATE

------- --------------- ---------- ---------

7369 SMITH 800 17-DEC-80

7499 ALLEN 1600 20-FEB-81

7521 WARD 1250 22-FEB-81

7654 MARTIN 1250 28-SEP-81

7844 TURNER 1500 08-SEP-81

7876 ADAMS 1100 23-MAY-87

7900 JAMES 950 03-DEC-81

7934 MILLER 1300 23-JAN-82







However, you can make this query much simpler by using the new

pseudocolumn CONNECT_BY_ISLEAF, as shown here:





SELECT emp_id, lname, salary, hire_date

FROM employee e

WHERE CONNECT_BY_ISLEAF = 1

START WITH manager_emp_id IS NULL

CONNECT BY PRIOR emp_id = manager_emp_id;



EMP_ID LNAME SALARY HIRE_DATE

------- --------------- ---------- ---------

7876 ADAMS 1100 23-MAY-87

7369 SMITH 800 17-DEC-80

7499 ALLEN 1600 20-FEB-81

7521 WARD 1250 22-FEB-81

7654 MARTIN 1250 28-SEP-81

7844 TURNER 1500 08-SEP-81

7900 JAMES 950 03-DEC-81

7934 MILLER 1300 23-JAN-82







This query builds the complete organization tree, and filters out

only the leaf nodes by performing the check CONNECT_BY_ISLEAF = 1.





















     < Day Day Up > 



    No comments: