Hotsos Symposium Null Values: Nothing to Worry About...
Dallas, March 2005 Lex de Haan (lex.de.haan@naturaljoin.nl)
Who Am I Lex de Haan lex.de.haan@naturaljoin.nl
Applied Maths, TU Delft Oracle employee 1990-2004 – Teacher, delivery manager, product manager seminars, curriculum developer/manager, ST development
ISO SQL Standardization March 2004: Natural Join B.V. http://www.naturaljoin.nl
2
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 1
Topics 1. 2. 3. 4. 5.
Four simple sample tables Some teasers and appetizers Null-values and three-valued logic Misleading behavior of group functions Compare IN and EXISTS operators
6. Writing correct SQL statements: 1. Who has attended all general courses? 2. Which departments are unknown to have salesmen? 3. For which departments is the total salary of all clerks less than 2000? 3
1.
Simple Sample Tables: Diagram EMPCRS EMP
DEPT #deptno dname loc
#empno ename job mgr hiredate sal comm deptno
#empno #crsid #crsdate days_att
COURSES #crsid crsname type duration 4
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 2
Sample Table: EMP
EMPNO ----7782 7839 7934 7369 7876 7902 7788 7566 7499 7698 7654 7900 7844 7521
ENAME -----CLARK KING MILLER SMITH ADAMS FORD SCOTT JONES ALLEN BLAKE MARTIN JAMES TURNER WARD
JOB MGR HIREDATE SAL COMM DEPTNO --------- ----- ---------- ----- ----- -----MANAGER 7839 09-06-1981 2450 10 PRESIDENT 17-11-1981 5000 10 CLERK 7782 23-01-1982 1300 10 CLERK 7902 17-12-1980 800 20 CLERK 7788 12-01-1983 1100 20 ANALYST 7566 03-12-1981 3000 20 ANALYST 7566 09-12-1982 3000 20 MANAGER 7839 02-04-1981 2975 20 SALESMAN 7698 20-02-1981 1600 300 30 MANAGER 7839 01-05-1981 2850 30 SALESMAN 7698 28-09-1981 1250 1400 30 CLERK 7698 03-12-1981 950 30 SALESMAN 7698 08-09-1991 1500 0 30 SALESMAN 7698 22-02-1981 1250 500 30
5
Sample Tables: DEPT and COURSES DEPTNO -----10 20 30 40 CRSID ----PLS JAVA XML D2K SQL OSO ERM RSD
DNAME -------------ACCOUNTING RESEARCH SALES OPERATIONS
CRSNAME -----------------------PL/SQL FUNDAMENTALS JAVA FOR DEVELOPERS ORACLE XML DEVELOPER 2000 INTRODUCTION TO SQL ORACLE SERVER OVERVIEW ER MODELLING RELATIONAL SYSTEM DESIGN
LOC -------NEW YORK DALLAS CHICAGO BOSTON TYPE DURATION ----------- -------DEVELOPMENT 4 DEVELOPMENT 3 DEVELOPMENT 4 DEVELOPMENT 3 GENERAL 4 GENERAL 2 DESIGN 4 DESIGN 3 6
©2005, Natural Join B.V.
Null Values: Nothing to Worry About - 3
Sample Table: EMPCRS EMPNO ----7499 7566 7566 7566 7788 7788 7788 7876 7876 7902 7902 7902 7902 7902 7902 7934
CRSID ----SQL PLS D2K D2K SQL OSO XML SQL PLS SQL SQL PLS JAVA ERM D2K SQL
CRSDATE DAYS_ATT ---------- -------20-03-2002 4 25-03-2001 3 15-04-2001 2 12-05-2001 3 10-01-2001 2 18-02-2001 2 10-04-2001 4 20-03-2002 4 20-09-2002 3 14-01-2002 2 14-12-2002 4 15-01-2001 4 25-03-2001 3 01-04-2001 4 12-05-2001 3 20-03-2002 4 7
2. Some Teasers and Appetizers (1/3) Which of these five predicates are true? e1,e2 RANGES OVER emp d RANGES OVER dept s RANGES OVER emp where job='SALESMAN' -----------------------------------------------1. FORALL e1 (FORALL e2 (e1.empno <> e2.empno) ) 2. FORALL e1 (EXISTS e2 (e1.mgr = e2.empno)
)
3. EXISTS d (NOT EXISTS e (d.deptno = e.deptno)) 4. NOT EXISTS d (EXISTS e (d.deptno = e.deptno)) 5. FORALL s (NOT EXISTS e (s.empno
What is the result of this query?
= e.mgr)
)
select e.* from emp e where e.comm = e.comm 8
©2005, Natural Join B.V.
Null Values: Nothing to Worry About - 4
Some Teasers and Appetizers (2/3)
Equivalent queries? select e1.* from emp e1 where e1.sal (select from where and
select e1.* from emp e1 where e1.sal (select from where and
> ALL e2.sal emp e2 e2.deptno = 10 e2.job = 'SALESMAN')
Equivalent PL/SQL?
> ALL MAX(e2.sal) emp e2 e2.deptno = 10 e2.job = 'SALESMAN')
if P then r:='Y' else r:='N' if NOT P then r:='N' else r:='Y'
9
Some Teasers and Appetizers (3/3) Equivalent queries?
select e1.* from emp e1 where e1.empno NOT IN (select e2.mgr from emp e2)
select e1.* from emp e1 where NOT EXISTS (select e2.* from emp e2 where e2.mgr = e1.empno)
Equivalent expressions? Equivalent expressions? sum(sal) + sum(comm) sum(sal + comm)
sum(comm) nvl(sum(comm),0) sum(nvl(comm ,0)) 10
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 5
3. Truth Tables of Three-Valued Logic “Not TRUE” is not the same as “NOT TRUE” (NOT is not the complement operator anymore) Two expressions are logically equivalent iff (if and only if) they have identical truth tables P T T T T T F U U F U F F F
Q T T U F F T U F F T U F
P AND Q P OR Q P => Q T T T T T T U T U F T F F T F F U T T U U U F F T F U U F T T Two-valued logic F U T F F T
P T T U F F
NOT P F F U T T
11
4. Misleading Behaviour of Group Functions Arithmetic expressions yield NULL if any operand is NULL ... but NULLs are ignored by group functions!
C1
C2
C1+C2
10 NULL 30
NULL 20 10
NULL NULL 40
SUM(C1+C2) = 40 SUM(C1)+SUM(C2) = 70
SUM(C1+C2) != SUM(C1)+SUM(C2) ... 12
©2005, Natural Join B.V.
Null Values: Nothing to Worry About - 6
Group Functions on the Empty Set COUNT correctly returns zero SUM returns NULL, but should return zero too (?) AVG, MAX, and MIN return NULL
SQL> 2 3 4
select , from where
count(*), count(sal) sum(sal), avg(sal), min(sal) emp 1 = 2;
COUNT(*) COUNT(SAL) SUM(SAL) AVG(SAL) MIN(SAL) -------- ---------- -------- -------- -------0 0 NULL NULL NULL 13
5.
The IN Operator in SQL t IN (T)
<=> t = t1 t = t2 ... t = tn
t NOT IN (T) <=> t != t1 t != t2 ... t != tn
OR OR OR
AND AND AND
What happens if T contains (or returns) a NULL value? 14
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 7
The EXISTS Operator In SQL:
EXISTS (select * from ...) <=> (select count(*)...) > 0
In two-valued calculus: EXISTS t(P) <=> count(t where P) > 0 No problems in two-valued logic; however …
15
EXISTS and Three-Valued Calculus EXISTS t(P) <=> count(t where P) > 0 This is not true anymore in three-valued logic. Check the situation if predicate P results in: – UNKNOWN for at least one t – FALSE for all other t
Or look at it this way: – In SQL: EXISTS can only yield TRUE or FALSE – In Calculus: TRUE, FALSE or UNKNOWN …
16
©2005, Natural Join B.V.
Null Values: Nothing to Worry About - 8
6.
Three Query Examples 1. Which employees have attended all general courses? 2. Which departments are unknown to have salesmen? 3. For which departments is the total salary of all clerks less than 2000?
17
Example 1 Show: Number and name of all employees who have attended all courses of type GENERAL Alternative formulation: Employee x should appear in the result if and only if for every GENERAL course, you can find at least one attendance of that course by that employee x
18
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 9
Example 1: Relational Calculus RC e.empno, e.ename where FORALL c (c.type = 'GENERAL' => exists ec ( ec.crsid = c.crsid and ec.empno = e.empno))
Use the following three rewrite rules: (P => Q) <=> (NOT P) OR Q FORALL t (P) <=> NOT EXISTS t (NOT P) NOT(P OR Q) <=> (NOT P) AND (NOT Q)
19
Example 1: Relational Calculus RC e.empno, e.ename -- 1 where forall c (c.type = 'GENERAL' => exists ec ( ec.crsid = c.crsid and ec.empno = e.empno)) <=>
e.empno, e.ename -- 2 where forall c (c.type <> 'GENERAL' or exists ec ( ec.crsid = c.crsid and ec.empno = e.empno)) 20
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 10
RC
<=>
<=>
e.empno, e.ename -- 2 where FORALL c (c.type <> 'GENERAL' or exists ec ( ec.crsid = c.crsid and ec.empno = e.empno)) e.empno, e.ename -- 3 where NOT EXISTS c NOT (c.type <> 'GENERAL' or exists ec ( ec.crsid = c.crsid and ec.empno = e.empno)) e.empno, e.ename -- 4 where NOT EXISTS c (c.type = 'GENERAL' AND NOT exists ec ( ec.crsid = c.crsid and ec.empno = e.empno)) 21
Example 1: SQL Solution 1 SQL> 2 3 4 5 6 7 8 9 10 11 12 13 14
select e.empno , e.ename from emp e where NOT EXISTS (select c.* from courses c where c.type = 'GENERAL' and NOT EXISTS (select ec.* from empcrs ec where ec.crsid = c.crsid and ec.empno = e.empno ) ); 22
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 11
Example 1 Revisited Show: Number and name of all employees who have attended all courses of type GENERAL Another alternative formulation: Show number and name of all employees for which the set of GENERAL courses is a subset of the set of courses they attended
23
Example 1: Venn Diagram
Set of all courses GC: Set of GENERAL courses
CA(e): Courses attended by employee e
The fixed set GC must be a subset of the variable set CA(e) 24
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 12
Example 1: Set Theory A is a subset of B <=> (A - B) is the empty set Applied to this example:
GC is a subset of CA(e) <=> GC minus CA(e) is the empty set This leads to an SQL solution with the MINUS operator, using NOT EXISTS as an “empty set checker”
25
Example 1: SQL Solution 2 SQL> select e.empno, e.ename 2 from emp e 3 where not exists 4 (select c.crsid 5 from courses c 6 where c.type = 'GENERAL' 7 MINUS 8 select ec.crsid 9 from empcrs ec 10 where ec.empno = e.empno);
26
©2005, Natural Join B.V.
Null Values: Nothing to Worry About - 13
Example 2 Show:
Departments for which it is unknown whether they have salesmen RC
range of d is DEPT range of e is EMP d where IS_UNK (exists e (e.deptno = d.deptno and e.job = 'SALESMAN'))
This is easy; but what about SQL? 27
Example 2: Intermediate Step Departments which are known to have no salesmen SQL> select d.* 2 from dept d 3 where NOT exists 4 (select e.* 5 from emp e 6 where (e.deptno = d.deptno 7 or e.deptno IS NULL ) 8 and ( e.job = 'SALESMAN' 9 or e.job IS NULL )) no rows selected. The subquery looks for employees for which it is TRUE or UNKNOWN that they are salesmen belonging to a department, so the main query returns departments for which it is FALSE that they have salesmen 28
Š2005, Natural Join B.V.
Null Values: Nothing to Worry About - 14
Example 2: Solution
SQL> 2 3 4 5 6 7 8 9 10 11 12
select d.* from dept d where NOT exists (select e.* from emp e where e.deptno = d.deptno TRUE and e.job='SALESMAN' ) and exists (select e.* from emp e where (e.deptno = d.deptno or e.deptno is null) TRUE or UNK and (e.job = 'SALESMAN' or e.job is null ));
29
Example 3 EMPNO ----7839 7698 7654 7844 7900 7934
ENAME ------KING BLAKE MARTIN TURNER JAMES MILLER
JOB --------PRESIDENT MANAGER SALESMAN SALESMAN CLERK CLERK
DEPTNO -----10 30 40
DNAME ---------ACCOUNTING SALES OPERATIONS
LOC -------NEW YORK CHICAGO BOSTON
MGR ---7839 7698 7698 7839 7839
SAL ---5000 1250 1500 1300
DEPTNO -----10 30 30 30 10 10
Show the departments for which it is sure that the total salary of all clerks is less than $2000 30
©2005, Natural Join B.V.
Null Values: Nothing to Worry About - 15
Example 3: Expected Answer? EMPNO ----7839 7698 7654 7844 7900 7934
ENAME ------KING BLAKE MARTIN TURNER JAMES MILLER
JOB --------PRESIDENT MANAGER SALESMAN SALESMAN CLERK CLERK
DEPTNO -----10 30 40
DNAME ---------ACCOUNTING SALES OPERATIONS
LOC -------NEW YORK CHICAGO BOSTON
MGR ---7839 7698 7698 7839 7839
SAL ---5000 1250 1500 NULL 1300
DEPTNO -----10 30 30 30 10 10
Show the departments for which it is sure that the total salary of all clerks is less than $2000 31
Example 3: First Attempt This query does not yield the desired result, but at least it does not show department 10 ☺ SQL> 2 3 4 5
select from where group by having
e.deptno, sum(e.sal) emp e e.job = 'CLERK' e.deptno sum(COALESCE(e.sal,2000)) < 2000; no rows selected
32
©2005, Natural Join B.V.
Null Values: Nothing to Worry About - 16
Example 3: Correct Solution SQL> select d.deptno 2 from dept d 3 where (select COALESCE 4 (sum(COALESCE(e.sal,2000)),0) 5 from emp e 6 where e.deptno = d.deptno 7 and e.job = 'CLERK' 8 ) < 2000; DEPTNO -----30 40 33
That’s All...
By the way, did you ever play with the LNNVL function? 34
©2005, Natural Join B.V.
Null Values: Nothing to Worry About - 17