Database System Concepts, 6th Ed. ©Silberschatz, Korth and Sudarshan See www.db-book.com for conditions on re-use Chapter 3: Introduction to SQL ©Silberschatz, Korth and Sudarshan3.2Database System Concepts - 6th Edition Chapter 3: Introduction to SQL  Overview of the SQL Query Language  Data Definition  Basic Query Structure  Additional Basic Operations  Set Operations  Null Values  Aggregate Functions  Nested Subqueries  Modification of the Database ©Silberschatz, Korth and Sudarshan3.3Database System Concepts - 6th Edition History  IBM Sequel language developed as part of System R project at the IBM San Jose Research Laboratory  Renamed Structured Query Language (SQL)  ANSI and ISO standard SQL:  SQL-86, SQL-89, SQL-92  SQL:1999, SQL:2003, SQL:2008  Commercial systems offer most, if not all, SQL-92 features, plus varying feature sets from later standards and special proprietary features.  Not all examples here may work on your particular system. ©Silberschatz, Korth and Sudarshan3.4Database System Concepts - 6th Edition Data Definition Language  The schema for each relation.  The domain of values associated with each attribute.  Integrity constraints  And as we will see later, also other information such as  The set of indices to be maintained for each relations.  Security and authorization information for each relation.  The physical storage structure of each relation on disk. The SQL data-definition language (DDL) allows the specification of information about relations, including: ©Silberschatz, Korth and Sudarshan3.5Database System Concepts - 6th Edition Domain Types in SQL  char(n). Fixed length character string, with user-specified length n.  varchar(n). Variable length character strings, with user-specified maximum length n.  int. Integer (a finite subset of the integers that is machine- dependent).  smallint. Small integer (a machine-dependent subset of the integer domain type).  numeric(p,d). Fixed point number, with user-specified precision of p digits, with d digits to the right of decimal point.  real, double precision. Floating point and double-precision floating point numbers, with machine-dependent precision.  float(n). Floating point number, with user-specified precision of at least n digits.  More are covered in Chapter 4. ©Silberschatz, Korth and Sudarshan3.6Database System Concepts - 6th Edition Create Table Construct  An SQL relation is defined using the create table command: create table r (A1 D1, A2 D2, ..., An Dn, (integrity-constraint1), ..., (integrity-constraintk))  r is the name of the relation  each Ai is an attribute name in the schema of relation r  Di is the data type of values in the domain of attribute Ai  Example: create table instructor ( ID char(5), name varchar(20) not null, dept_name varchar(20), salary numeric(8,2))  insert into instructor values („10211‟, ‟Smith‟, ‟Biology‟, 66000);  insert into instructor values („10211‟, null, ‟Biology‟, 66000); ©Silberschatz, Korth and Sudarshan3.7Database System Concepts - 6th Edition Integrity Constraints in Create Table  not null  primary key (A1, ..., An )  foreign key (Am, ..., An ) references r Example: Declare dept_name as the primary key for department . create table instructor ( ID char(5), name varchar(20) not null, dept_name varchar(20), salary numeric(8,2), primary key (ID), foreign key (dept_name) references department) primary key declaration on an attribute automatically ensures not null ©Silberschatz, Korth and Sudarshan3.8Database System Concepts - 6th Edition And a Few More Relation Definitions  create table student ( ID varchar(5), name varchar(20) not null, dept_name varchar(20), tot_cred numeric(3,0), primary key (ID), foreign key (dept_name) references department) );  create table takes ( ID varchar(5), course_id varchar(8), sec_id varchar(8), semester varchar(6), year numeric(4,0), grade varchar(2), primary key (ID, course_id, sec_id, semester, year), foreign key (ID) references student, foreign key (course_id, sec_id, semester, year) references section );  Note: sec_id can be dropped from primary key above, to ensure a student cannot be registered for two sections of the same course in the same semester ©Silberschatz, Korth and Sudarshan3.9Database System Concepts - 6th Edition And more still  create table course ( course_id varchar(8) primary key, title varchar(50), dept_name varchar(20), credits numeric(2,0), foreign key (dept_name) references department) );  Primary key declaration can be combined with attribute declaration as shown above ©Silberschatz, Korth and Sudarshan3.10Database System Concepts - 6th Edition Drop and Alter Table Constructs  drop table student  Deletes the table and its contents  delete from student  Deletes all contents of table, but retains table  alter table  alter table r add A D  where A is the name of the attribute to be added to relation r and D is the domain of A.  All tuples in the relation are assigned null as the value for the new attribute.  alter table r drop A  where A is the name of an attribute of relation r  Dropping of attributes not supported by many databases ©Silberschatz, Korth and Sudarshan3.11Database System Concepts - 6th Edition Basic Query Structure  The SQL data-manipulation language (DML) provides the ability to query information, and insert, delete and update tuples  A typical SQL query has the form: select A1, A2, ..., An from r1, r2, ..., rm where P  Ai represents an attribute  ri represents a relation  P is a predicate.  The result of an SQL query is a relation. ©Silberschatz, Korth and Sudarshan3.12Database System Concepts - 6th Edition The select Clause  The select clause list the attributes desired in the result of a query  corresponds to the projection operation of the relational algebra  Example: find the names of all instructors: select name from instructor  NOTE: SQL names are case insensitive (i.e., you may use upper- or lower-case letters.)  E.g. Name ≡ NAME ≡ name  Some people use upper case wherever we use bold font. ©Silberschatz, Korth and Sudarshan3.13Database System Concepts - 6th Edition The select Clause (Cont.)  SQL allows duplicates in relations as well as in query results.  To force the elimination of duplicates, insert the keyword distinct after select.  Find the names of all departments with instructor, and remove duplicates select distinct dept_name from instructor  The keyword all specifies that duplicates not be removed. select all dept_name from instructor ©Silberschatz, Korth and Sudarshan3.14Database System Concepts - 6th Edition The select Clause (Cont.)  An asterisk in the select clause denotes “all attributes” select * from instructor  The select clause can contain arithmetic expressions involving the operation, +, –, , and /, and operating on constants or attributes of tuples.  The query: select ID, name, salary/12 from instructor would return a relation that is the same as the instructor relation, except that the value of the attribute salary is divided by 12. ©Silberschatz, Korth and Sudarshan3.15Database System Concepts - 6th Edition The where Clause  The where clause specifies conditions that the result must satisfy  Corresponds to the selection predicate of the relational algebra.  To find all instructors in Comp. Sci. dept with salary > 80000 select name from instructor where dept_name = ‘Comp. Sci.' and salary > 80000  Comparison results can be combined using the logical connectives and, or, and not.  Comparisons can be applied to results of arithmetic expressions. ©Silberschatz, Korth and Sudarshan3.16Database System Concepts - 6th Edition The from Clause  The from clause lists the relations involved in the query  Corresponds to the Cartesian product operation of the relational algebra.  Find the Cartesian product instructor X teaches select  from instructor, teaches  generates every possible instructor – teaches pair, with all attributes from both relations  Cartesian product not very useful directly, but useful combined with where-clause condition (selection operation in relational algebra) ©Silberschatz, Korth and Sudarshan3.17Database System Concepts - 6th Edition Cartesian Product: instructor  teaches instructor teaches ©Silberschatz, Korth and Sudarshan3.18Database System Concepts - 6th Edition Joins  For all instructors who have taught some course, find their names and the course ID of the courses they taught. select name, course_id from instructor, teaches where instructor.ID = teaches.ID  Find the course ID, semester, year and title of each course offered by the Comp. Sci. department select section.course_id, semester, year, title from section, course where section.course_id = course.course_id and dept_name = „Comp. Sci.' ©Silberschatz, Korth and Sudarshan3.19Database System Concepts - 6th Edition Natural Join  Natural join matches tuples with the same values for all common attributes, and retains only one copy of each common column  select * from instructor natural join teaches; ©Silberschatz, Korth and Sudarshan3.20Database System Concepts - 6th Edition Natural Join Example  List the names of instructors along with the course ID of the courses that they taught.  select name, course_id from instructor, teaches where instructor.ID = teaches.ID;  select name, course_id from instructor natural join teaches; ©Silberschatz, Korth and Sudarshan3.21Database System Concepts - 6th Edition Natural Join (Cont.)  Danger in natural join: beware of unrelated attributes with same name which get equated incorrectly  List the names of instructors along with the titles of courses that they teach  Incorrect version (makes course.dept_name = instructor.dept_name)  select name, title from instructor natural join teaches natural join course;  Correct version  select name, title from instructor natural join teaches, course where teaches.course_id = course.course_id;  Another correct version  select name, title from (instructor natural join teaches) join course using(course_id); ©Silberschatz, Korth and Sudarshan3.22Database System Concepts - 6th Edition The Rename Operation  The SQL allows renaming relations and attributes using the as clause: old-name as new-name  E.g.  select ID, name, salary/12 as monthly_salary from instructor  Find the names of all instructors who have a higher salary than some instructor in „Comp. Sci‟.  select distinct T. name from instructor as T, instructor as S where T.salary > S.salary and S.dept_name = ‘Comp. Sci.’  Keyword as is optional and may be omitted instructor as T ≡ instructor T  Keyword as must be omitted in Oracle ©Silberschatz, Korth and Sudarshan3.23Database System Concepts - 6th Edition String Operations  SQL includes a string-matching operator for comparisons on character strings. The operator “like” uses patterns that are described using two special characters:  percent (%). The % character matches any substring.  underscore (_). The _ character matches any character.  Find the names of all instructors whose name includes the substring “dar”. select name from instructor where name like '%dar%'  Match the string “100 %” like ‘100 \%' escape '\' ©Silberschatz, Korth and Sudarshan3.24Database System Concepts - 6th Edition String Operations (Cont.)  Patters are case sensitive.  Pattern matching examples:  „Intro%‟ matches any string beginning with “Intro”.  „%Comp%‟ matches any string containing “Comp” as a substring.  „_ _ _‟ matches any string of exactly three characters.  „_ _ _ %‟ matches any string of at least three characters.  SQL supports a variety of string operations such as  concatenation (using “||”)  converting from upper to lower case (and vice versa)  finding string length, extracting substrings, etc. ©Silberschatz, Korth and Sudarshan3.25Database System Concepts - 6th Edition Ordering the Display of Tuples  List in alphabetic order the names of all instructors select distinct name from instructor order by name  We may specify desc for descending order or asc for ascending order, for each attribute; ascending order is the default.  Example: order by name desc  Can sort on multiple attributes  Example: order by dept_name, name ©Silberschatz, Korth and Sudarshan3.26Database System Concepts - 6th Edition Where Clause Predicates  SQL includes a between comparison operator  Example: Find the names of all instructors with salary between $90,000 and $100,000 (that is,  $90,000 and  $100,000)  select name from instructor where salary between 90000 and 100000  Tuple comparison  select name, course_id from instructor, teaches where (instructor.ID, dept_name) = (teaches.ID, ‟Biology‟); ©Silberschatz, Korth and Sudarshan3.27Database System Concepts - 6th Edition Duplicates  In relations with duplicates, SQL can define how many copies of tuples appear in the result.  Multiset versions of some of the relational algebra operators – given multiset relations r1 and r2: 1.  (r1): If there are c1 copies of tuple t1 in r1, and t1 satisfies selections ,, then there are c1 copies of t1 in  (r1). 2. A (r ): For each copy of tuple t1 in r1, there is a copy of tuple A (t1) in A (r1) where A (t1) denotes the projection of the single tuple t1. 3. r1  r2 : If there are c1 copies of tuple t1 in r1 and c2 copies of tuple t2 in r2, there are c1  c2 copies of the tuple t1 t2 in r1  r2 ©Silberschatz, Korth and Sudarshan3.28Database System Concepts - 6th Edition Duplicates (Cont.)  Example: Suppose multiset relations r1 (A, B) and r2 (C) are as follows: r1 = {(1, a) (2,a)} r2 = {(2), (3), (3)}  Then B(r1) would be {(a), (a)}, while B(r1)  r2 would be {(a,2), (a,2), (a,3), (a,3), (a,3), (a,3)}  SQL duplicate semantics: select A1,, A2, ..., An from r1, r2, ..., rm where P is equivalent to the multiset version of the expression: ))(( 21,,, 21 mPAAA rrrn    ©Silberschatz, Korth and Sudarshan3.29Database System Concepts - 6th Edition Set Operations  Find courses that ran in Fall 2009 or in Spring 2010  Find courses that ran in Fall 2009 but not in Spring 2010 (select course_id from section where sem = „Fall‟ and year = 2009) union (select course_id from section where sem = „Spring‟ and year = 2010)  Find courses that ran in Fall 2009 and in Spring 2010 (select course_id from section where sem = „Fall‟ and year = 2009) intersect (select course_id from section where sem = „Spring‟ and year = 2010) (select course_id from section where sem = „Fall‟ and year = 2009) except (select course_id from section where sem = „Spring‟ and year = 2010) ©Silberschatz, Korth and Sudarshan3.30Database System Concepts - 6th Edition Set Operations  Set operations union, intersect, and except  Each of the above operations automatically eliminates duplicates  To retain all duplicates use the corresponding multiset versions union all, intersect all and except all. Suppose a tuple occurs m times in r and n times in s, then, it occurs:  m + n times in r union all s  min(m,n) times in r intersect all s  max(0, m – n) times in r except all s ©Silberschatz, Korth and Sudarshan3.31Database System Concepts - 6th Edition Null Values  It is possible for tuples to have a null value, denoted by null, for some of their attributes  null signifies an unknown value or that a value does not exist.  The result of any arithmetic expression involving null is null  Example: 5 + null returns null  The predicate is null can be used to check for null values.  Example: Find all instructors whose salary is null. select name from instructor where salary is null ©Silberschatz, Korth and Sudarshan3.33Database System Concepts - 6th Edition Aggregate Functions  These functions operate on the multiset of values of a column of a relation, and return a value avg: average value min: minimum value max: maximum value sum: sum of values count: number of values ©Silberschatz, Korth and Sudarshan3.34Database System Concepts - 6th Edition Aggregate Functions (Cont.)  Find the average salary of instructors in the Computer Science department  select avg (salary) from instructor where dept_name= ‟Comp. Sci.‟;  Find the total number of instructors who teach a course in the Spring 2010 semester  select count (distinct ID) from teaches where semester = ‟Spring‟ and year = 2010  Find the number of tuples in the course relation  select count (*) from course; ©Silberschatz, Korth and Sudarshan3.35Database System Concepts - 6th Edition Aggregate Functions – Group By  Find the average salary of instructors in each department  select dept_name, avg (salary) from instructor group by dept_name;  Note: departments with no instructor will not appear in result ©Silberschatz, Korth and Sudarshan3.36Database System Concepts - 6th Edition Aggregation (Cont.)  Attributes in select clause outside of aggregate functions must appear in group by list  /* erroneous query */ select dept_name, ID, avg (salary) from instructor group by dept_name; ©Silberschatz, Korth and Sudarshan3.37Database System Concepts - 6th Edition Aggregate Functions – Having Clause  Find the names and average salaries of all departments whose average salary is greater than 42000 Note: predicates in the having clause are applied after the formation of groups whereas predicates in the where clause are applied before forming groups select dept_name, avg (salary) from instructor group by dept_name having avg (salary) > 42000; ©Silberschatz, Korth and Sudarshan3.38Database System Concepts - 6th Edition Null Values and Aggregates  Total all salaries select sum (salary ) from instructor  Above statement ignores null amounts  Result is null if there is no non-null amount  All aggregate operations except count(*) ignore tuples with null values on the aggregated attributes  What if collection has only null values?  count returns 0  all other aggregates return null ©Silberschatz, Korth and Sudarshan3.39Database System Concepts - 6th Edition Nested Subqueries  SQL provides a mechanism for the nesting of subqueries.  A subquery is a select-from-where expression that is nested within another query.  A common use of subqueries is to perform tests for set membership, set comparisons, and set cardinality. ©Silberschatz, Korth and Sudarshan3.40Database System Concepts - 6th Edition Example Query  Find courses offered in Fall 2009 and in Spring 2010  Find courses offered in Fall 2009 but not in Spring 2010 select distinct course_id from section where semester = ‟Fall‟ and year= 2009 and course_id in (select course_id from section where semester = ‟Spring‟ and year= 2010); select distinct course_id from section where semester = ‟Fall‟ and year= 2009 and course_id not in (select course_id from section where semester = ‟Spring‟ and year= 2010); ©Silberschatz, Korth and Sudarshan3.41Database System Concepts - 6th Edition Example Query  Find the total number of (distinct) students who have taken course sections taught by the instructor with ID 10101  Note: Above query can be written in a much simpler manner. The formulation above is simply to illustrate SQL features. select count (distinct ID) from takes where (course_id, sec_id, semester, year) in (select course_id, sec_id, semester, year from teaches where teaches.ID= 10101); ©Silberschatz, Korth and Sudarshan3.42Database System Concepts - 6th Edition Set Comparison  Find names of instructors with salary greater than that of some (at least one) instructor in the Biology department.  Same query using > some clause select name from instructor where salary > some (select salary from instructor where dept_name = ‟Biology‟); select distinct T.name from instructor as T, instructor as S where T.salary > S.salary and S.dept_name = ‟Biology‟; ©Silberschatz, Korth and Sudarshan3.43Database System Concepts - 6th Edition Definition of Some Clause  F some r   t  r such that (F t ) Where can be:      0 5 6 (5 < some ) = true 0 5 0 ) = false 5 0 5(5  some ) = true (since 0  5) (read: 5 < some tuple in the relation) (5 < some ) = true(5 = some (= some)  in However, ( some)  not in ©Silberschatz, Korth and Sudarshan3.44Database System Concepts - 6th Edition Example Query  Find the names of all instructors whose salary is greater than the salary of all instructors in the Biology department. select name from instructor where salary > all (select salary from instructor where dept_name = ‟Biology‟); ©Silberschatz, Korth and Sudarshan3.45Database System Concepts - 6th Edition Definition of all Clause  F all r   t  r (F t) 0 5 6 (5 < all ) = false 6 10 4 ) = true 5 4 6(5  all ) = true (since 5  4 and 5  6) (5 < all ) = false(5 = all ( all)  not in However, (= all)  in ©Silberschatz, Korth and Sudarshan3.46Database System Concepts - 6th Edition Test for Empty Relations  The exists construct returns the value true if the argument subquery is nonempty.  exists r  r  Ø  not exists r  r = Ø ©Silberschatz, Korth and Sudarshan3.49Database System Concepts - 6th Edition Test for Absence of Duplicate Tuples  The unique construct tests whether a subquery has any duplicate tuples in its result.  (Evaluates to “true” on an empty set)  Find all courses that were offered at most once in 2009 select T.course_id from course as T where unique (select R.course_id from section as R where T.course_id= R.course_id and R.year = 2009); ©Silberschatz, Korth and Sudarshan3.50Database System Concepts - 6th Edition Subqueries in the From Clause  SQL allows a subquery expression to be used in the from clause  Find the average instructors‟ salaries of those departments where the average salary is greater than $42,000. select dept_name, avg_salary from (select dept_name, avg (salary) as avg_salary from instructor group by dept_name) where avg_salary > 42000;  Note that we do not need to use the having clause  Another way to write above query select dept_name, avg_salary from (select dept_name, avg (salary) from instructor group by dept_name) as dept_avg (dept_name, avg_salary) where avg_salary > 42000; ©Silberschatz, Korth and Sudarshan3.54Database System Concepts - 6th Edition Scalar Subquery  Scalar subquery is one which is used where a single value is expected  E.g. select dept_name, (select count(*) from instructor where department.dept_name = instructor.dept_name) as num_instructors from department;  E.g. select name from instructor where salary * 10 > (select budget from department where department.dept_name = instructor.dept_name)  Runtime error if subquery returns more than one result tuple ©Silberschatz, Korth and Sudarshan3.55Database System Concepts - 6th Edition Modification of the Database  Deletion of tuples from a given relation  Insertion of new tuples into a given relation  Updating values in some tuples in a given relation ©Silberschatz, Korth and Sudarshan3.56Database System Concepts - 6th Edition Modification of the Database – Deletion  Delete all instructors delete from instructor  Delete all instructors from the Finance department delete from instructor where dept_name= ‟Finance‟;  Delete all tuples in the instructor relation for those instructors associated with a department located in the Watson building. delete from instructor where dept_name in (select dept_name from department where building = ‟Watson‟); ©Silberschatz, Korth and Sudarshan3.57Database System Concepts - 6th Edition Deletion (Cont.)  Delete all instructors whose salary is less than the average salary of instructors delete from instructor where salary< (select avg (salary) from instructor);  Problem: as we delete tuples from deposit, the average salary changes  Solution used in SQL: 1. First, compute avg salary and find all tuples to delete 2. Next, delete all tuples found above (without recomputing avg or retesting the tuples) ©Silberschatz, Korth and Sudarshan3.58Database System Concepts - 6th Edition Modification of the Database – Insertion  Add a new tuple to course insert into course values (‟CS-437‟, ‟Database Systems‟, ‟Comp. Sci.‟, 4);  or equivalently insert into course (course_id, title, dept_name, credits) values (‟CS-437‟, ‟Database Systems‟, ‟Comp. Sci.‟, 4);  Add a new tuple to student with tot_creds set to null insert into student values (‟3003‟, ‟Green‟, ‟Finance‟, null); ©Silberschatz, Korth and Sudarshan3.59Database System Concepts - 6th Edition Insertion (Cont.)  Add all instructors to the student relation with tot_creds set to 0 insert into student select ID, name, dept_name, 0 from instructor  The select from where statement is evaluated fully before any of its results are inserted into the relation (otherwise queries like insert into table1 select * from table1 would cause problems, if table1 did not have any primary key defined. ©Silberschatz, Korth and Sudarshan3.60Database System Concepts - 6th Edition Modification of the Database – Updates  Increase salaries of instructors whose salary is over $100,000 by 3%, and all others receive a 5% raise  Write two update statements: update instructor set salary = salary * 1.03 where salary > 100000; update instructor set salary = salary * 1.05 where salary <= 100000;  The order is important  Can be done better using the case statement (next slide) ©Silberschatz, Korth and Sudarshan3.61Database System Concepts - 6th Edition Case Statement for Conditional Updates  Same query as before but with case statement update instructor set salary = case when salary <= 100000 then salary * 1.05 else salary * 1.03 end ©Silberschatz, Korth and Sudarshan3.62Database System Concepts - 6th Edition Updates with Scalar Subqueries  Recompute and update tot_creds value for all students update student S set tot_cred = ( select sum(credits) from takes natural join course where S.ID= takes.ID and takes.grade <> ‟F‟ and takes.grade is not null);  Sets tot_creds to null for students who have not taken any course  Instead of sum(credits), use: case when sum(credits) is not null then sum(credits) else 0 end Database System Concepts, 6th Ed. ©Silberschatz, Korth and Sudarshan See www.db-book.com for conditions on re-use End of Chapter 3