in the relational model there are multiple ways to model a data collection (using tables)
database design should assure that
common normal forms: 1NF, 2NF, 3NF, BCNF, 4NF, 5NF [defined by Codd (the first 3), Boyce and Codd (BCNF), and Fagin (4NF, 5NF)]
if a relation is not in normal form X, it can be decomposed into multiple relations that are in normal form X
to decompose a relation we use:
or
Decomposition that is not good:
LearningContracts[Student, FacultyMember, Course]
Two new relations obtained by applying the projection operator to this relation: SF[Student, FacultyMember]
and FC[FacultyMember, Course]
assume the original relation contains the following values:
using the ∏ and * operators, the following values are obtained for the 2 projections and their natural join:
the decomposition is not good, since the SF*FC relation contains extra records, i.e. records that were not present in the original relation
simple attribute
composite attribute = a set of attributes in a relation (with at least two attributes)
repeating attributes
in some applications, attributes (simple or composite) can take on multiple values for a record in the relation
cannot be used in the relational model, hence they are to be avoided, without losing data
Let $R[A]$ be a relation, $A$ the set of attributes. Let $α$ be a repeating attribute in $R$ (simple or composite):
$R$ can be decomposed into 2 relations, such that $α$ is not a repeating attribute anymore If $K$ is a key in $R$, the two relations into which $R$ is decomposed are:
$R'[K\cupα] = \prod_{K\cupα}^{}(R)$
$R''[A-α] = \prod_{A-α}^{}(R)$
attribute values must be scalar, atomic (they cannot be further decomposed) when defining a relation (in the relational model)
Consider the relation: STUDENT [NAME, BIRTHYEAR, GROUP, COURSE, GRADE]
key: NAME
composite repeating attribute: the pair {COURSE, GRADE}
this relation could store the following values:
The STUDENT relation is decomposed into the following 2 relations:
GENERAL_DATA [NAME, BIRTHYEAR, GROUP]
RESULTS [NAME, COURSE, GRADE]
The values in these relations are:
Consider the relation: BOOK [BId, AuthorsNames, Title, Publisher, PublishingYear, Keywords]
The BOOK relation is decomposed into the following 3 relations (there are two repeating attributes in the BOOK relation):
BOOKS [BId, Title, Publisher, PublishingYear]
AUTHORS [BId, AuthorName]
KEYWORDS [BId, Keyword]
If a book is not associated with any authors / keywords, it must have one corresponding tuple in AUTHORS / KEYWORDS, with the second attribute set to null.
In the absence of such tuples, the BOOK relation can't be obtained from the three relations using just the natural join (outer join operators are required).
the following 3 relational normal forms use a very important notion: the functional dependency among subsets of attributes
the database administrator is responsible with determining the functional dependencies
these depend on the nature, the semantics of the data stored in the relation
Let $R[A1,A2,...,An]$ be a relation and $α, β$ two subsets of attributes of $R$.
$α$ → $β$ , the (simple or composite) attribute $α$ functionally determines attribute $β$ (simple or composite)
If an $α$ value appears in multiple rows, each of these rows will contain the same value for $β$:
$\prod_{α}^{}(r) = \prod_{α}^{}(r')$ $implies$ $\prod_{β}^{}(r) = \prod_{β}^{}(r')$