3.1 Overview of the SQL Query Language
The original version of SQL, called Sequel, was developed by IBM in the early 1970s. It has evolved since then and changed its name to SQL (Structured Query Language).
The SQL language has several important parts
- Data-definition language (DDL)
- The SQL DDL provides commands for defining relation schemas, deleting relations, modifying relation schemas.
 
 - Data-manipulation language (DML)
- The SQL DML provides the ability to query information from DB, insert tules into, delete tuples from, and modify tuples in the database.
 
 - Integrity
- The SQL DDL includes commands for specifying integrity constraints that the data in the database must satisfy.
 - Updates that violate integrity constraints are disallowed.
 
 - View definition
- The SQL DDL includes commands for defining views.
 
 - Transaction control
- Commands for specifying the beginning and end-points of transactions.
 
 - Embedded SQL and Dynamic SQL
- How SQL statements can be embedded within general-purpose programming languages
 
 - Authorization
- The SQL DDL includes commands for specifying access rights to relations and views
 
 
3.2 SQL Data Definition
The set of relations in a database are specified using a data-definition language (DDL).
The SQL DDL not only allows specification of a set of relations, but also information about each relation, including
- Schema for each relation
 - Types of values associated with each attribute
 - Integrity constraints
 - Set of indices to be maintained for each relation
 - Security and authorization information for each relation
 - Physical storage structure of each relation on disk
 
Basic Types
THe SQL standard supports built-in types, including
char(n)- A fixed-length character string with length 
n. 
- A fixed-length character string with length 
 varchar(n): A variable-length character string with user-specified maximum lengthn.int- An integer
 
smallint- A small integer (machine-dependent subset of int)
 
numeric(p, d)- A fixed-point number with 
pdigits (plus a sign). dofpdigits are to the right of the decimal points.- Ex) 
numeric(3,1)allows44.5but does not allow444.5or0.32 
- A fixed-point number with 
 real, double precision- Floating-point and double-precision floating-point numbers
 
float(n)- A floating-point number with precision of at least 
ndigits 
- A floating-point number with precision of at least 
 
Each type may include a special value called
nullvalue indicating an absent value.
When storing a
charvalue with the less length than the specified lengthnthen the spaces are appended to match the length.When comparing two chars, the shorter one is appended with spaces to match the length.
When char and varchar are compared, extra spaces may be added to the varchar type to match the length. However, it may vary depending on the database system.
Hence, Always use
varcharinstead ofchartype to avoid these problems.
Basic Schema Definition
- Define an SQL relation by 
CREATE TABLEcommand.1 2 3 4 5 6
CREATE TABLE department ( dept_name VARCHAR(20), building VARCHAR(15), budget NUMERIC(12, 2), PRIMARY KEY (dept_name) );
 - The general form of the 
CREATE TABLEis1 2 3 4 5 6 7 8 9
CREATE TABLE r ( A1 D1, A2 D2, ..., An Dn, <integrity-constraint 1> ..., <integrity-constraint k> );
r: Name of the relation- Each 
Aiis the name of an attribute in the schema of relationr Diis the domain of attribute Ai (type of attribute) with optional constraints that restrict the domain of Ai.
 - Some of integrity constraints commands
- PRIMARY KEY $(A_{J_1}, A_{j_2},…,A_{j_m})$
- Attributes$(A_{J_1}, A_{j_2},…,A_{j_m})$ form the primary key for the relation.
 - The primary-key attributes MUST be NON-NULL and UNIQUE.
 
 - FOREIGN KEY $(A_{J_1}, A_{j_2},…,A_{j_m})$ REFERENCES $s$
- Attributes $(A_{J_1}, A_{j_2},…,A_{j_m})$ for any tuple in the relation must correspond to values of the primary key attributes of some tuple in the relation $s$.
 
1 2 3 4 5 6 7 8 9 10 11 12 13
CREATE TABLE COURSE ( course_id VARCHAR(7), title VARCHAR(50), dept_name VARCHAR(20), PRIMARY KEY (course_id), FOREIGN KEY (dept_name) REFERENCES department(dept_name) ); CREATE TABLE department ( dept_name VARCHAR(20), building VARCHAR(15), PRIMARY KEY (dept_name) );
FOREIGN KEY (dept_name) REFERENCES department(dept_name)specifies that for each course tuple,dept_namemust exist in the primary key attributedept_nameof department relation.- Some database systems allow foreign-key constrain definition without specifying referenced attributes explicitly: 
FOREIGN KEY (dept_name) REFERENCES department 
 - NOT NULL
- NULL value is NOT allowed for the specified attributes.
 
 
 - PRIMARY KEY $(A_{J_1}, A_{j_2},…,A_{j_m})$
 
A newly created relation is initially empty.
Data Manipulation Statements
- Remove a relation from an SQL database
DROP TABLE r;- Deletes not only all tuples of 
rbut also the schema forr. 
- Deletes not only all tuples of 
 DELETE FROM r;- Just deletes all tuples in 
r. 
- Just deletes all tuples in 
 
 - Alter an existing relation
ALTER TABLE r ADD A D;r: Name of existing relationA: Name of the attribute to be addedD: Type of the added attribute1
ALTER TABLE course ADD course_tag VARCHAR(255);
ALTER TABLE r DROP A;- Many database systems do not support dropping of attributes
 A: Name of an attribute to be dropped.
 
3.3 Basic Structure of SQL Queries
The basic structure of an SQL query consists of three clauses
selectfromwhere
3.3.1 Queries on a Single Relation
SELECT
SELECT dept_name FROM instructor;SELECT DISTINCT dept_name FROM instructor;- remove duplicatesSELECT dept_name FROM instructor;- allow duplicates- The select may contain arithmetic expressions 
+, -, *, /SELECT ID, name, dept_name, salary * 1.1 FROM instructor;
 
WHERE
SELECT name FROM instructor WHERE dept_name = 'Comp. Sci';
3.3.2 Queries on Multiple Relations
We often query against multiple relations.
1
2
3
SELECT A1, A2, ..., An
FROM r1, r2, ..., rm
where P;
SELECT: list the attributes desired in the result of a queryFROM: list of the relations to be accessed- It defines a Cartesian product of the relations
 - The result relation has all attributes from all the relationis in the FROM clause. So the resulted relation might be extremely big. So we usually restrict with WHERE clause.
 - If the same attribute name appears in both 
r1andr2, we prefix the name of the relation. (ex,instructor.ID,instructor.dept_name, etc) - For those attributes that appear only one of the schemas, we can drop the relation-name prefix
 
WHERE: Predicate involving attributes of the relation in the FROM clause.
In general, the meaning of an SQL query can be understood as
- Generate a Cartesian product of the relations listed in the 
FROMclause. - Apply the predicates specified in the 
WHEREclause - For each tuple in Step (2), output the attributes specified in the 
SELECTclause. 
This explains “what” the result of an SQL query, not “how” it should be executed.
- A real implementation optimizes evaluation by generating only elements that satisfy the WHERE clause (as far as possible)
 
3.4 Additional Basic Operations
Rename operation: AS
- Renaming attributes
1 2 3
SELECT name as instructor_name, course_id FROM instructor, teaches WHERE instructor.ID = teaches.ID;
 - Renaming relations
1 2 3
SELECT T.name, S.course_id FROM instructor as T, teaches as S WHERE T.ID = S.ID;
 - An identifier such as 
TandSis referred to as a correlation name or table alias, correlation variable, or tuple variable 
String operations
LIKEfor pattern matching%: matches any substring_: matches any character- Pattern matching is case-sensitive
1 2 3
SELECT dept_name FROM department WHERE building LIKE '%Watson%';
 - Offers 
NOT LIKEfor mismatches \for escape characterLIKE 'ab\%cd%'matches all strings beginning with “ab%cd”
Attribute Specification in the SELECT Clause
The asterisk * can be used in SELECT to denote all attributes.
1
2
3
SELECT instructor.*
FROM instructor, teaches
WHERE instructor.ID = teaches.ID;
SELECT * FROM ... indicates all attributes of the result relation are selected
Ordering the Display of Tuples
SQL offers ORDER BY clause to sort the result of a query.
1
2
3
4
SELECT name
FROM instructor
WHERE dept_name = 'Physics'
ORDER BY name;
- By default, 
ORDER BYsorts in ascending order - To specify the order, use 
DESCorASC 
WHERE-clause Predicates
between: comparison operator for range1 2 3
SELECT name FROM instructor WHERE salary between 90000 and 100000;
- Similarly for 
NOT BETWEEN 
- Similarly for 
 
Set Operations
SQL supports the following set operations.
UNIONINTERSECTEXCEPT
Suppose we have two sets of courses.
1
2
3
SELECT course_id
FROM section
WHERE semester = 'Fall' and year = 2017;
1
2
3
SELECT course_id
FROM section
WHERE semester = 'Spring' and year = 2018;
Union- Automatically eliminates duplicates. (use 
UNION ALLto retain duplicates)1 2 3 4 5 6 7
(SELECT course_id FROM section WHERE semester = 'Fall' and year = 2017;) UNION (SELECT course_id FROM section WHERE semester = 'Spring' and year = 2018);
 
- Automatically eliminates duplicates. (use 
 INTERSECT- Automatically eliminates duplicates. (use 
INTERSECTION ALLto retain duplicates)1 2 3 4 5 6 7
(SELECT course_id FROM section WHERE semester = 'Fall' and year = 2017;) INTERSECT (SELECT course_id FROM section WHERE semester = 'Spring' and year = 2018);
 
- Automatically eliminates duplicates. (use 
 EXCEPT- Automatically eliminates duplicates. (use 
EXCEPT ALLto retain duplicates) - To find all courses taught in Fall 2017 but NOT in Spring 2018 semester.
1 2 3 4 5 6 7
(SELECT course_id FROM section WHERE semester = 'Fall' and year = 2017;) INTERSECT (SELECT course_id FROM section WHERE semester = 'Spring' and year = 2018);
 
- Automatically eliminates duplicates. (use