Lecture7.pdf
💡Recall
🖋 Notes
Relational Algebra
- relational algebra is a formal system for manipulating and querying relational databases
- it is a set of mathematical operations that can be performed on the relations (tables) in a database
- query languages in the relational mode
- relational algebra and calculus - formal query languages with a significant influence on SQL:
- relational algebra
- queries are specified in an operational manner
- relational calculus
- queries describe the desired answer, without specifying how it will be computed (declarative)
- not expected to be Turing complete
- not intended for complex calculations
- provide efficient access to large datasets
- allow optimizations
- used by DBMSs to represent query execution plans
- a relational algebra query:
- is built using a collection of operators
- describes a step-by-step procedure for computing the result set
- is evaluated on the input relations' instances
- produces an instance of the output relation
- every operation returns a relation, so operators can be composed, the algebra is closed
- the result of an algebra expression is a relation, and a relation is a set of tuples
- relational algebra on bags (multisets) - duplicates are not eliminated
Conditions
Conditions that can be used in several algebraic operators (similar to the SELECT filter conditions):
- attribute_name relational_operator value
- value - attribute name, expression
- attribute_name IS [NOT] IN single_column_relation
- a relation with one column can be considered a set
- the condition tests whether a value belongs to a set
- relation {IS [NOT] IN | = | <>} relation
- the relations in the condition must be union-compatible
- (condition)
NOT condition
condition1 AND condition2
condition1 OR condition2
- where condition, condition1, condition2 are conditions of type 1-4
Operators in the Algebra
Selection
-
selects a subset of rows from a table that meet a certain condition
-
equivalent SELECT statements can be specified for the relational algebra expressions
-
notation: ${\sigma{c}}^{}(R)$
-
resulting relation:
- schema: $R$ 's schema
- tuples: records in $R$ that satisfy condition $C$
-
equivalent SELECT statement:
SELECT *
FROM R
WHERE C
Projection
-
selects a subset of columns from a table
-
notation: ${\prod{
α}}^{}(R)$
-
resulting relation:
- schema: attributes in
$α$
- tuples: every record in $R$ is projected on
$α$
-
equivalent SELECT statement:
SELECT DISTINCT 𝛼
FROM R
SELECT 𝛼
FROM R -- algebra on bags
Cross/Cartesian product
-
combines every row from one table with every row from another table
-
notation: $R1 X R2$
-
resulting relation:
- schema: the attributes of $R1$ followed by the attrbiutes of $R2$
- tuples: every tuple $r1$ in $R1$ is concatenated with every tuple $r2$ in $R2$
-
equivalent SELECT statement:
SELECT *
FROM R1 CROSS JOIN R2
Union
-
combines the rows from two tables, removing duplicates
-
notation: $R1 ∪ R2$
-
resulting relation:
- $R1$ and $R2$ must be union-compatible:
- same number of columns
- corresponding columns, taken in order from left to right, have the same domains
-
equivalent SELECT statement:
SELECT *
FROM R1
UNION
SELECT *
FROM R2
-- algebra on bags: SELECT statements that don't eliminate duplicates (e.g. UNION ALL)
Intersection
-
returns only the rows that are in both tables
-
notation: $R1 - R2$
-
resulting relation:
- $R1$ and $R2$ must be union-compatible:
- same number of columns
- corresponding columns, taken in order from left to right, have the same domains
-
equivalent SELECT statement:
SELECT *
FROM R1
EXCEPT
SELECT *
FROM R2
Difference
- returns only the rows that are in the first table but not in the second table
- notation: $R1 ∩ R2$
- resulting relation:
- $R1$ and $R2$ must be union-compatible:
- same number of columns
- corresponding columns, taken in order from left to right, have the same domains
- equivalent SELECT statement:
Join