Lecture1.pdf
🖋 Notes
Data Storage Methods
- databases
- distributed databases
- files
e.g. a bank storing a large collection of data on employees, clients, accounts, transactions, etc.
requirements:
- quick answers to questions about the data
- protecting the data from inconsistent changes made by users accessing it at the same time
- restricting access to some parts of the data, e.g. salaries
difficulties encountered when storing and managing the data using a collection of files:
- multiple data storage formats
- data redundancy: some parts of the data can be stored in multiple files ⇒ potential inconsistencies
- read / write operations are described in the program (using certain record structures) ⇒ difficulties in program development (changes in the file structure lead to changes in the program)
- changing data (modifying / removing records) and retrieving data based on search criteria are difficult operations
- integrity constraints must be checked in the program
- main memory management, e.g. how is a data collection of tens / hundreds of GB loaded for processing?
- no adequate security policies that would let different users access different segments of the data
- concurrent data access is difficult to manage
- data must be restored to a consistent state in the event of a system failure, e.g. a bank transaction transferring money from account A to account B is interrupted by a blackout after having debited account A, but prior to crediting account B ⇒ money must be put back into account A
files are still useful:
- for single-user programs dealing with a small amount of data
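The interrupted bank transfer listed among the difficulties can be sketched with a DBMS that supports transactions. This is only an illustration, assuming Python's built-in `sqlite3` module; the `account` table, the `transfer` and `balances` helpers, the starting balances, and the `fail_midway` flag that simulates the blackout are all hypothetical.

```python
import sqlite3

# In-memory database with two hypothetical accounts, A and B.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            if fail_midway:
                # Stand-in for the blackout: src is debited, dst not yet credited.
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit has already been undone by the rollback


def balances(conn):
    """Return {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


transfer(conn, "A", "B", 30, fail_midway=True)
after_failure = balances(conn)   # the money is back in account A

transfer(conn, "A", "B", 30)
after_success = balances(conn)   # both updates were applied together
```

With plain files, the program itself would have to detect the half-finished transfer and undo the debit; here the rollback is done by the database system.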
Data Description Models
- data description model: a set of concepts and rules used to model / describe data
- the set describes the:
- structure of the data
- consistency constraints
- relationships with other data
- schema (the data structure or template): the data structures used to describe a collection of data stored in a database
- instance of the schema: the data stored in the database at a given moment, conforming to the schema
[analogy: classes and objects in object-oriented programming]
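The class / object analogy can be made concrete with a small sketch. It assumes Python dataclasses; the `Student` class, its attributes, and the sample values are hypothetical and serve only to illustrate the distinction.

```python
from dataclasses import dataclass, fields


# The class plays the role of the schema: it fixes names and types.
@dataclass(frozen=True)
class Student:
    sid: str
    name: str
    year: int


# The schema can be read back from the class itself.
schema = [(f.name, f.type.__name__) for f in fields(Student)]

# The objects play the role of the instance: concrete data that
# conforms to the schema at a given moment.
instance = [
    Student("s1", "Ana", 1),
    Student("s2", "Dan", 2),
]
```

The schema stays the same while the instance changes every time students are added or removed.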
Types
- entity-relationship
- relational
- network
- hierarchical
- object-oriented
- NoSQL
- semistructured (XML)
The Relational Model
- relation is the main concept used to describe data
- the schema of a relation:
- the relation's name
- for each field (column): name and type
e.g. Movie(mid: string, title: string, director: string, year: integer)
an instance of the Movie relation is a table in which every row has 4 columns (mid, title, director, year)
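As a rough illustration, the Movie schema and one possible instance can be written down with Python's built-in `sqlite3` module. The mapping of string to `TEXT` and integer to `INTEGER` is an assumption about how the types would be expressed in SQLite, and the two rows are made-up placeholder values, not data from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema of the relation: its name plus the name and type of each field.
conn.execute(
    """
    CREATE TABLE Movie (
        mid      TEXT,
        title    TEXT,
        director TEXT,
        year     INTEGER
    )
    """
)

# A hypothetical instance of the relation: every row has 4 columns.
rows = [
    ("m1", "Title One", "Director A", 2001),
    ("m2", "Title Two", "Director B", 2010),
]
conn.executemany("INSERT INTO Movie VALUES (?, ?, ?, ?)", rows)

# Column names as recorded by the database (second field of table_info).
columns = [info[1] for info in conn.execute("PRAGMA table_info(Movie)")]

# The current instance, read back from the database.
instance = conn.execute(
    "SELECT mid, title, director, year FROM Movie ORDER BY mid"
).fetchall()
```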

The Entity-Relationship Model
- semantic, more abstract, high-level model
- eases the task of developing a good initial description of the data
- even though the database management system's model hides many details, it's still closer to the manner in which the data is stored than to the user's perspective on the data
concepts:
- entity
- a piece of data, an object in the real world
- described by attributes (properties)
- entity set (class / entity schema)
- entities with the same structure (e.g. the set of students)
- name, list of attributes
- attribute
- name, domain of possible values, conditions to check correctness
- key
- a restriction defined on an entity set
- set of attributes with distinct values in the entity set's instances
- relationship
- specifies an association among 2 or more entities
- descriptive attributes can be used
- relationship set (relationship schema)
- describes all relationships with the same structure
- name, entity sets used in the association, descriptive attributes
- schema of the model
- set of entity sets and relationship sets
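- binary relationships (between entity sets T1 and T2) - relationship types:
- 1:1 (an entity in T1 is associated with at most one entity in T2, and vice versa)
- 1:n (an entity in T1 can be associated with any number of entities in T2; an entity in T2 is associated with at most one entity in T1)
- m:n (an entity in either set can be associated with any number of entities in the other set)
- restrictions in the database:
- when the database is changed, the system checks whether the relationship is of the specified type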
- graphical representation of the model
Example
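A minimal sketch of how the type of a binary relationship could be checked when the data changes. The representation of a relationship set as a set of (T1 entity, T2 entity) pairs, the `satisfies` function, and the group / student data are all hypothetical; a real DBMS enforces such restrictions through its own constraint mechanisms.

```python
from collections import Counter


def satisfies(pairs, kind):
    """Check whether a set of (t1, t2) pairs respects a relationship type.

    kind is one of "1:1", "1:n", "m:n".
    """
    pairs = set(pairs)  # a relationship set holds no duplicate pairs
    per_t1 = Counter(t1 for t1, _ in pairs)  # partners in T2 for each T1 entity
    per_t2 = Counter(t2 for _, t2 in pairs)  # partners in T1 for each T2 entity

    if kind == "m:n":
        return True  # no restriction on either side
    if kind == "1:n":
        # each T2 entity is linked to at most one T1 entity
        return all(count <= 1 for count in per_t2.values())
    if kind == "1:1":
        # at most one partner on both sides
        return all(count <= 1 for count in per_t1.values()) and all(
            count <= 1 for count in per_t2.values()
        )
    raise ValueError(f"unknown relationship type: {kind}")


# Hypothetical relationship between groups (T1) and students (T2).
member_of = {("g1", "s1"), ("g1", "s2"), ("g2", "s3")}

ok_one_to_many = satisfies(member_of, "1:n")   # each student is in one group
ok_one_to_one = satisfies(member_of, "1:1")    # g1 has two students

# Putting student s1 into a second group violates the 1:n restriction.
changed = member_of | {("g2", "s1")}
still_one_to_many = satisfies(changed, "1:n")
still_many_to_many = satisfies(changed, "m:n")
```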

Databases and Database Management Systems
- a database contains:
- separation between:
- database design
- data analysis
- database management system (DBMS)
- DBMS examples
- database system
The Structures of a Database