Monday, January 11, 2010

Section 1.4. DML Statements








 

 












1.4 DML Statements





In this section, we introduce the four statements that comprise the

DML portion of SQL. The information presented in this section should

be enough to allow you to start writing DML statements. As is

discussed at the end of the section, however, DML can look

deceptively simple, so keep in mind while reading the section that

there are many more facets to DML than are discussed here.







1.4.1 The SELECT Statement





The SELECT statement

is used to retrieve data from a database. The

set of data retrieved via a SELECT statement is referred to as a

result set. Like a table, a result set is comprised

of rows and columns, making it possible to populate a table using the

result set of a SELECT statement. The SELECT statement can be

summarized as follows:





SELECT <one or more things>

FROM <one or more places>

WHERE <zero, one, or more conditions apply>




While the SELECT and FROM clauses are required, the WHERE clause is

optional (although you will seldom see it omitted). We therefore

begin with a simple example that retrieves three columns from every

row of the



customer table:





SELECT cust_nbr, name, region_id

FROM customer;



CUST_NBR NAME REGION_ID

---------- ------------------------------ ----------

1 Cooper Industries 5

2 Emblazon Corp. 5

3 Ditech Corp. 5

4 Flowtech Inc. 5

5 Gentech Industries 5

6 Spartan Industries 6

7 Wallace Labs 6

8 Zantech Inc. 6

9 Cardinal Technologies 6

10 Flowrite Corp. 6

11 Glaven Technologies 7

12 Johnson Labs 7

13 Kimball Corp. 7

14 Madden Industries 7

15 Turntech Inc. 7

16 Paulson Labs 8

17 Evans Supply Corp. 8

18 Spalding Medical Inc. 8

19 Kendall-Taylor Corp. 8

20 Malden Labs 8

21 Crimson Medical Inc. 9

22 Nichols Industries 9

23 Owens-Baxter Corp. 9

24 Jackson Medical Inc. 9

25 Worcester Technologies 9

26 Alpha Technologies 10

27 Phillips Labs 10

28 Jaztech Corp. 10

29 Madden-Taylor Inc. 10

30 Wallace Industries 10




Since we neglected to impose any conditions via a WHERE clause, our

query returns every row from the customer table. If we want to

restrict the set of data returned by the query, we could





include a

WHERE clause with a single condition:





SELECT cust_nbr, name, region_id 

FROM customer

WHERE region_id = 8;



CUST_NBR NAME REGION_ID

---------- ------------------------------ ----------

16 Paulson Labs 8

17 Evans Supply Corp. 8

18 Spalding Medical Inc. 8

19 Kendall-Taylor Corp. 8

20 Malden Labs 8




Our result set now includes only those customers residing in the

region with a region_id of 8. But what if we want to specify a region

by name instead of region_id? We could query the region table for a

particular name and then query the customer table using the retrieved

region_id. Instead of issuing two different queries, however, we

could produce the same

outcome using a single query by

introducing a join, as in:





SELECT customer.cust_nbr, customer.name, region.name

FROM customer, region

WHERE region.name = 'New England'

AND region.region_id = customer.region_id;



CUST_NBR NAME NAME

---------- ------------------------------ -----------

1 Cooper Industries New England

2 Emblazon Corp. New England

3 Ditech Corp. New England

4 Flowtech Inc. New England

5 Gentech Industries New England




Our FROM clause now contains two tables instead of one, and the WHERE

clause contains a join condition that specifies

that the customer and region tables are to be joined using the

region_id column found in both tables. Joins and join conditions will

be explored in detail in Chapter 3.





Since both the customer and region tables contain a column called

name, you must specify which

table's name column you are interested in. This is

done in the previous example by using dot-notation to append the

table name in front of each column name. If you would rather not type

the full table names, you can assign

table aliases

to each table in the FROM clause and use those aliases instead of the

table names in the SELECT and WHERE clauses, as in:





SELECT c.cust_nbr, c.name, r.name

FROM customer c, region r

WHERE r.name = `New England'

AND r.region_id = c.region_id;




In this example, we assigned the alias

"c" to the customer table and the

alias "r" to the region table.

Thus, we can use "c." and

"r." instead of

"customer." and

"region." in the SELECT and WHERE





clauses.







1.4.1.1 SELECT clause elements




In the examples thus far, the result sets generated

by our queries have contained columns from one or more tables. While

most elements in your SELECT clauses will typically be simple column

references, a SELECT clause may also include:







  • Literal values, such as numbers (1) or strings ('abc')



  • Expressions, such as shape.diameter * 3.1415927



  • Functions, such as TO_DATE('01-JAN-2002','DD-MON-YYYY')



  • Pseudocolumns, such as ROWID, ROWNUM, or LEVEL





While the first three items in this list are fairly straightforward,

the last item merits further discussion. Oracle makes available

several phantom columns, known as pseudocolumns,

that do not exist in any tables. Rather, they are values visible

during query execution that can be helpful in certain situations.





For example, the

pseudocolumn ROWID

represents the physical location of a row. This information

represents the fastest possible access mechanism. It can be useful if

you plan to delete or update a row retrieved via a query. However,

you should never store ROWID values in the database, nor should you

reference them outside of the transaction in which they are

retrieved, since a row's ROWID can change in certain

situations, and ROWIDs can be reused after a row has been deleted.





The next example demonstrates each of the different elements from the

previous list:





SELECT rownum,

cust_nbr,

1 multiplier,

'cust # ' || cust_nbr cust_nbr_str,

'hello' greeting,

TO_CHAR(last_order_dt, 'DD-MON-YYYY') last_order

FROM customer;



ROWNUM CUST_NBR MULTIPLIER CUST_NBR_STR GREETING LAST_ORDER

------ -------- ---------- ------------ -------- -----------

1 1 1 cust # 1 hello 15-JUN-2000

2 2 1 cust # 2 hello 27-JUN-2000

3 3 1 cust # 3 hello 07-JUL-2000

4 4 1 cust # 4 hello 15-JUL-2000

5 5 1 cust # 5 hello 01-JUN-2000

6 6 1 cust # 6 hello 10-JUN-2000

7 7 1 cust # 7 hello 17-JUN-2000

8 8 1 cust # 8 hello 22-JUN-2000

9 9 1 cust # 9 hello 25-JUN-2000

10 10 1 cust # 10 hello 01-JUN-2000

11 11 1 cust # 11 hello 05-JUN-2000

12 12 1 cust # 12 hello 07-JUN-2000

13 13 1 cust # 13 hello 07-JUN-2000

14 14 1 cust # 14 hello 05-JUN-2000

15 15 1 cust # 15 hello 01-JUN-2000

16 16 1 cust # 16 hello 31-MAY-2000

17 17 1 cust # 17 hello 28-MAY-2000

18 18 1 cust # 18 hello 23-MAY-2000

19 19 1 cust # 19 hello 16-MAY-2000

20 20 1 cust # 20 hello 01-JUN-2000

21 21 1 cust # 21 hello 26-MAY-2000

22 22 1 cust # 22 hello 18-MAY-2000

23 23 1 cust # 23 hello 08-MAY-2000

24 24 1 cust # 24 hello 26-APR-2000

25 25 1 cust # 25 hello 01-JUN-2000

26 26 1 cust # 26 hello 21-MAY-2000

27 27 1 cust # 27 hello 08-MAY-2000

28 28 1 cust # 28 hello 23-APR-2000

29 29 1 cust # 29 hello 06-APR-2000

30 30 1 cust # 30 hello 01-JUN-2000




Interestingly, your SELECT clause is not required to reference

columns from any of the tables in the FROM clause. For example, the

next query's result set is composed entirely of

literals:





SELECT 1 num, 'abc' str

FROM customer;



NUM STR

---------- ---

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc

1 abc




Since there are 30 rows in the customer table, the

query's result set includes 30 identical rows of





data.











1.4.1.2 Ordering your results




In general,



there is no guarantee that the result set

generated by your query will be in any particular order. If you want

your results to be sorted by one or more columns, you can add an

ORDER BY clause after the WHERE clause. The following example sorts

the results from our New England query by customer name:





SELECT c.cust_nbr, c.name, r.name

FROM customer c, region r

WHERE r.name = 'New England'

AND r.region_id = c.region_id

ORDER BY c.name;



CUST_NBR NAME NAME

-------- ------------------------------ -----------

1 Cooper Industries New England

3 Ditech Corp. New England

2 Emblazon Corp. New England

4 Flowtech Inc. New England

5 Gentech Industries New England




You may also designate the sort column(s) by their position in the

SELECT clause. To sort the previous query by customer number, which

is the first column in the SELECT clause, you could issue the

following statement:





SELECT c.cust_nbr, c.name, r.name

FROM customer c, region r

WHERE r.name = 'New England'

AND r.region_id = c.region_id

ORDER BY 1;



CUST_NBR NAME NAME

---------- ------------------------------ -----------

1 Cooper Industries New England

2 Emblazon Corp. New England

3 Ditech Corp. New England

4 Flowtech Inc. New England

5 Gentech Industries New England




Specifying sort keys by position will certainly save you some typing,

but it can often lead to errors if you later change the order of the

columns in your SELECT clause.











1.4.1.3 Removing duplicates




In some

cases, your

result set

may contain duplicate data. For example, if you are compiling a list

of parts that were included in last month's orders,

the same part number would appear multiple times if more than one

order included that part. If you want duplicates removed from your

result set, you can include the DISTINCT keyword in your SELECT

clause, as in:





SELECT DISTINCT li.part_nbr

FROM cust_order co, line_item li

WHERE co.order_dt >= TO_DATE('01-JUL-2001','DD-MON-YYYY')

AND co.order_dt < TO_DATE('01-AUG-2001','DD-MON-YYYY')

AND co.order_nbr = li.order_nbr;




This query returns the distinct set of parts ordered during July of

2001. Without the DISTINCT keyword, the result set would contain one

row for every line-item of every order, and the same part would

appear multiple times if it was included in multiple orders. When

deciding whether to include DISTINCT in your SELECT clause, keep in

mind that finding and removing duplicates necessitates a sort

operation, which can add quite a bit of overhead to your query.











1.4.2 The INSERT Statement





The INSERT statement

is

the mechanism for loading data into your database. Data can be

inserted into only one table at a time, although the data being

loaded into the table can be pulled from one or more additional

tables. When inserting data into a table, you do not need to provide

values for every column in the table; however, you need to be aware

of the columns that require

non-NULL[3] values

and the ones that do not. Let's look at the

definition of the employee table:



[3] NULL indicates the absence of a value. The use of NULL will be

studied in Chapter 2.





describe employee



Name Null? Type

----------------------------------------- -------- ------------

EMP_ID NOT NULL NUMBER(5)

FNAME VARCHAR2(20)

LNAME NOT NULL VARCHAR2(20)

DEPT_ID NOT NULL NUMBER(5)

MANAGER_EMP_ID NUMBER(5)

SALARY NUMBER(5)

HIRE_DATE DATE

JOB_ID NUMBER(3)




The NOT NULL designation for the emp_id, lname, and dept_id columns

indicates that values are required for these three columns.

Therefore, we must be sure to provide values for at least these three

columns in our INSERT statements, as demonstrated by the following:





INSERT INTO employee (emp_id, lname, dept_id)

VALUES (101, 'Smith', 2);




The VALUES clause must contain the same number of elements as the

column list, and the data types must match the column definitions. In

the example, emp_id and dept_id hold numeric values while lname holds

character data, so our INSERT statement will execute without error.

Oracle always tries to convert data from one type to another

automatically, however, so the following statement will also run

without errors:





INSERT INTO employee (emp_id, lname, dept_id)

VALUES ('101', 'Smith', '2');




Sometimes, the data to be inserted needs to be retrieved from one or

more tables. Since the SELECT statement generates a result set

consisting of rows and columns of data, you can feed the result set

from a SELECT statement directly into an INSERT statement, as in:





INSERT INTO employee (emp_id, fname, lname, dept_id, hire_date)

SELECT 101, 'Dave', 'Smith', d.dept_id, SYSDATE

FROM department d

WHERE d.name = 'Accounting';




In this example, the purpose of the SELECT statement is to retrieve

the department ID for the Accounting department. The other four

columns in the SELECT clause are supplied as literals.









1.4.3 The DELETE Statement





The DELETE

statement

facilitates the removal of data from the database. Like the SELECT

statement, the DELETE statement contains a WHERE clause that

specifies the conditions used to identify rows to be deleted. If you

neglect to add a WHERE clause to your DELETE statement, all rows will

be deleted from the target table. The following statement will delete

all employees with the last name of Hooper from the employee table:





DELETE FROM employee

WHERE lname = 'Hooper';




In some cases, the values needed for one or more of the conditions in

your WHERE clause exist in another table. For example, your company

may decide to outsource its accounting functions, thereby

necessitating the removal of all Accounting personnel from the

employee table:





DELETE FROM employee

WHERE dept_id =

(SELECT dept_id

FROM department

WHERE name = 'Accounting');




The use of the SELECT statement in this example is known as a

subquery and will be studied in detail in Chapter 5.









1.4.4 The UPDATE Statement





Modifications to

existing

data are handled by the UPDATE statement. Like the DELETE statement,

the UPDATE statement includes a WHERE clause in order to specify

which rows should be targeted. The following example shows how you

might give a 10% raise to everyone making less than $40,000:





UPDATE employee

SET salary = salary * 1.1

WHERE salary < 40000;




If you want to modify more than one column in the table, you have two

choices: provide a set of column/value pairs separated by commas, or

provide a set of columns and a subquery. The following two UPDATE

statements modify the inactive_dt and inactive_ind columns in the

customer table for any customer who hasn't placed an

order in the past year:





UPDATE customer

SET inactive_dt = SYSDATE, inactive_ind = 'Y'

WHERE last_order_dt < SYSDATE -- 365;



UPDATE customer

SET (inactive_dt, inactive_ind) =

(SELECT SYSDATE, 'Y' FROM dual)

WHERE last_order_dt < SYSDATE -- 365;




The subquery in the second example is a bit forced, since it uses a

query against the dual[4] table to build a result set containing two

literals, but it should give you an idea of how you would use a

subquery in an UPDATE statement. In later chapters, you will see far

more interesting uses for subqueries.



[4] Dual is an Oracle-provided

table containing exactly one row with one column. It comes in handy

when you need to construct a query that returns exactly one

row.









1.4.5 So Why Are There 13 More Chapters?





After reading this chapter, you might think that SQL looks pretty

simple (at least the DML portion). At a high level, it is fairly

simple, and you now know enough about the language to go write some

code. However, you will learn over time that there are numerous ways

to arrive at the same end point, and some are more efficient and

elegant than others. The true test of SQL mastery is when you no

longer have the desire to return to what you were working on the

previous year, rip out all the SQL, and recode it. For one of us, it

took about nine years to reach that point. Hopefully, this book will

help you reach that point in far less time.





While you are reading the rest of the book, you might notice that the

majority of examples use SELECT statements, with the remainder

somewhat evenly distributed across INSERT, UPDATE, and DELETE

statements. This disparity is not indicative of the relative

importance of SELECT statements over the other three DML statements;

rather, SELECT statements are favored because we can show the

query's result set, which should help you to better

understand the query, and because many of the points being made using

SELECT statements can be applied to UPDATE and DELETE statements as

well.


















     

     


    No comments: