Thursday, February 4, 2010

Section 3.3. Self Joins








 

 












3.3 Self Joins





There are situations





in which one row of a table is related

to another row of the same table. The EMPLOYEE table is a good

example. The manager of one employee is also an employee. The rows

for both are in the same EMPLOYEE table. This relationship is

indicated in the MANAGER_EMP_ID column:





CREATE TABLE EMPLOYEE (

EMP_ID NUMBER (4) NOT NULL PRIMARY KEY,

FNAME VARCHAR2 (15),

LNAME VARCHAR2 (15),

DEPT_ID NUMBER (2),

MANAGER_EMP_ID NUMBER (4) REFERENCES EMPLOYEE(EMP_ID),

SALARY NUMBER (7,2),

HIRE_DATE DATE,

JOB_ID NUMBER (3));




To get information about an employee and his manager, you have to

join the EMPLOYEE table with itself. This is achieved by specifying

the EMPLOYEE table twice in the FROM clause and using two different

table

aliases,

thereby treating EMPLOYEE as if it were two separate tables. The

following example lists the name of each employee and his manager:





SELECT E.NAME EMPLOYEE, M.NAME MANAGER

FROM EMPLOYEE E, EMPLOYEE M

WHERE E.MANAGER_EMP_ID = M.EMP_ID;



EMPLOYEE MANAGER

---------- ----------

SMITH FORD

ALLEN BLAKE

WARD BLAKE

JONES KING

MARTIN BLAKE

BLAKE KING

CLARK KING

SCOTT JONES

TURNER BLAKE

ADAMS SCOTT

JAMES BLAKE

FORD JONES

MILLER CLARK



13 rows selected.




Notice the use of the EMPLOYEE table twice in the FROM clause with

two different aliases. Also notice the join condition that reads as:

"Where the employee's

MANAGER_EMP_ID is the same as his manager's

EMP_ID."







3.3.1 Self Outer Joins





Even though







the EMPLOYEE

table has 14 rows, the previous query returned only 13 rows. This is

because there is an employee without a MANAGER_EMP_ID. Oracle

excludes this row from the result set while performing the self inner

join. To include the employee(s) without a MANAGER_EMP_ID, you need

an outer join:





SELECT E.LNAME EMPLOYEE, M.LNAME MANAGER

FROM EMPLOYEE E, EMPLOYEE M

WHERE E.MANAGER_EMP_ID = M.EMP_ID (+);



EMPLOYEE MANAGER

---------- ----------

SMITH FORD

ALLEN BLAKE

WARD BLAKE

JONES KING

MARTIN BLAKE

BLAKE KING

CLARK KING

SCOTT JONES

KING

TURNER BLAKE

ADAMS SCOTT

JAMES BLAKE

FORD JONES

MILLER CLARK



14 rows selected.




Be careful when placing the (+) operator



in a join

condition. If you put the (+) on the wrong side, you will get an

absurd result set that makes no sense. In this case, the EMPLOYEE

table we need to make optional is the one from which we are drawing

manager names.









3.3.2 Self Non-Equi-Joins





The previous

example showed self-equi-joins.

However, there are situations when you need to perform

self-non-equi-joins. We will illustrate this by an example.

Let's assume that you are in charge of organizing

interdepartmental basket ball competition within your company. It is

your responsibility to draw the teams and schedule the competition.

You query the DEPARTMENT table and get the following result:





SELECT NAME FROM DEPARTMENT;



NAME

--------------

ACCOUNTING

RESEARCH

SALES

OPERATIONS




You find that there are four departments, and to make a fair

competition, you decide that each department plays against the other

three departments once, and at the end the department with the

maximum wins is declared the winner. You have been to an Oracle SQL

training class recently, and decide to apply the concept of self

join. You execute the following query:





SELECT D1.NAME TEAM1, D2.NAME TEAM2

FROM DEPARTMENT D1, DEPARTMENT D2;



TEAM1 TEAM2

-------------- --------------

ACCOUNTING ACCOUNTING

RESEARCH ACCOUNTING

SALES ACCOUNTING

OPERATIONS ACCOUNTING

ACCOUNTING RESEARCH

RESEARCH RESEARCH

SALES RESEARCH

OPERATIONS RESEARCH

ACCOUNTING SALES

RESEARCH SALES

SALES SALES

OPERATIONS SALES

ACCOUNTING OPERATIONS

RESEARCH OPERATIONS

SALES OPERATIONS

OPERATIONS OPERATIONS



16 rows selected.




Disappointing results. From your knowledge of high school

mathematics, you know that four teams each playing once with the

other three makes six combinations. However, your SQL query returned

16 rows. Now you realize that since you didn't

specify any join condition, you got a Cartesian product from your

query. You put in a join condition, and your query and results now

look as follows:





SELECT D1.NAME TEAM1, D2.NAME TEAM2

FROM DEPARTMENT D1, DEPARTMENT D2

WHERE D1.DEPT_ID = D2.DEPT_ID;



TEAM1 TEAM2

-------------- --------------

ACCOUNTING ACCOUNTING

RESEARCH RESEARCH

SALES SALES

OPERATIONS OPERATIONS




Oops! The equi-join returned a very unwanted result. A team

can't play against itself. You realize your mistake,

and this sparks the idea that you can use non-equi-joins in this

situation. You rewrite the query as a non-equi-join. You

don't want a team to play against itself, and

therefore replace the "=" operator

in the join condition with "!=".

Let's look at the results:





SELECT D1.NAME TEAM1, D2.NAME TEAM2

FROM DEPARTMENT D1, DEPARTMENT D2

WHERE D1.DEPT_ID != D2.DEPT_ID;



TEAM1 TEAM2

-------------- --------------

RESEARCH ACCOUNTING

SALES ACCOUNTING

OPERATIONS ACCOUNTING

ACCOUNTING RESEARCH

SALES RESEARCH

OPERATIONS RESEARCH

ACCOUNTING SALES

RESEARCH SALES

OPERATIONS SALES

ACCOUNTING OPERATIONS

RESEARCH OPERATIONS

SALES OPERATIONS



12 rows selected.




Still not done. In this result set, you have permutations such as

(RESEARCH, ACCOUNTING) and (ACCOUNTING, RESEARCH), and so on.

Therefore, each team plays against the others twice. You need to

remove these permutations, which you rightly consider to be

duplicates. You think about using DISTINCT. DISTINCT will not help here,

because the row (RESEARCH, ACCOUNTING) is different from the row

(ACCOUNTING, RESEARCH) from the viewpoint of DISTINCT; but not from

the viewpoint of your requirement. After some thought, you want to

try out an inequality operator other than

"!=". You decide to go with the

less-than (<) operator. Here are the results you get:





SELECT D1.NAME TEAM1, D2.NAME TEAM2

FROM DEPARTMENT D1, DEPARTMENT D2

WHERE D1.DEPT_ID < D2.DEPT_ID;



TEAM1 TEAM2

-------------- --------------

ACCOUNTING RESEARCH

ACCOUNTING SALES

RESEARCH SALES

ACCOUNTING OPERATIONS

RESEARCH OPERATIONS

SALES OPERATIONS



6 rows selected.




That's it! Now you have six combinations: each team

plays against the other three just once. Let's

examine why this version of the query works. Conceptually, when

Oracle executes this query, a Cartesian product is first formed with

16 rows. Then the less-than (<) operator

in the join condition restricts

the result set to those rows in which the DEPT_ID of Team 1 is less

than the DEPT_ID of Team 2. The less-than (<) operator eliminates

the duplicates, because for any given permutation of two departments

this condition is satisfied for only one. Using greater-than (>)



instead of less-than (<) will

also give you the required result, but the TEAM1 and TEAM2 values

will be reversed:





SELECT D1.NAME TEAM1, D2.NAME TEAM2

FROM DEPARTMENT D1, DEPARTMENT D2

WHERE D1.DEPT_ID > D2.DEPT_ID;



TEAM1 TEAM2

-------------- --------------

RESEARCH ACCOUNTING

SALES ACCOUNTING

OPERATIONS ACCOUNTING

SALES RESEARCH

OPERATIONS RESEARCH

OPERATIONS SALES



6 rows selected.




Don't be disheartened by the painful process you had

to go through to get this result. Sometimes you have to go through an

agonizing experience to get simple results such as these.

That's life. Now that you have the team combinations

right, go a bit further and assign a date for each match. Use

"tomorrow" as the starting date:





SELECT D1.NAME TEAM1, D2.NAME TEAM2, SYSDATE + ROWNUM MATCH_DATE

FROM DEPARTMENT D1, DEPARTMENT D2

WHERE D1.DEPT_ID < D2.DEPT_ID;



TEAM1 TEAM2 MATCH_DATE

-------------- -------------- ---------

ACCOUNTING RESEARCH 30-APR-01

ACCOUNTING SALES 01-MAY-01

RESEARCH SALES 02-MAY-01

ACCOUNTING OPERATIONS 03-MAY-01

RESEARCH OPERATIONS 04-MAY-01

SALES OPERATIONS 05-MAY-01



6 rows selected.




Now publish these results on the corporate intranet along with the

rules and regulations for the competition, and you are done.


















     

     


    No comments: