client application - SQL statement execution request
statement execution - stages:
the following operators are necessary in the querying process:
an SQL query can be written in multiple ways
programs[id, pname, pdescription]
groups[id, program, yearofstudy, gdescription]
students[cnp, lastname, firstname, sgroup, gpa, addr, email]
-- query:
-- Find students (lastname, firstname, year of study, program name, gpa)
-- in a given program (e.g. with id = 2, can be a parameter),
-- with a gpa >= 9 (can be a parameter).
-- a
SELECT lastname, firstname, yearofstudy, pname, gpa
FROM students st, groups gr, programs pr
WHERE st.sgroup = gr.id AND gr.program = pr.id AND program = 2 and gpa >= 9
**ππ½(ππΆ(π π‘π’ππππ‘π Γ ππππ’ππ Γ ππππππππ ))**
-- b
SELECT lastname, firstname, yearofstudy, pname, gpa
FROM (students st INNER JOIN groups gr ON st.sgroup = gr.id)
INNER JOIN programs pr ON gr.program = pr.id
WHERE program = 2 AND gpa >= 9
ππ½(ππΆ1((π π‘π’ππππ‘π β¨πΆ2ππππ’ππ )β¨πΆ3ππππππππ ))
-- c
SELECT lastname, firstname, yearofstudy, pname, gpa
FROM(
(
SELECT lastname, firstname, sgroup, gpa
FROM students
WHERE gpa >= 9
) st INNER JOIN (
SELECT *
FROM groups
WHERE program = 2
) gr ON st.sgroup = gr.id
) INNER JOIN (
SELECT id, pname
FROM programs
WHERE id = 2
) pr ON gr.program = pr.id
ππ½(((ππ½1(ππΆ2(π π‘π’ππππ‘π ))) βπΆ3 (ππΆ4(ππππ’ππ ))) βπΆ5 (ππ½2(ππΆ6(ππππππππ ))))
Galaxies[GID, GName, Diameter]
SolarSystems[SID, SName, Galaxy, Mass]
Planets[PID, PName, SolarSystem, AvgOrbitalSpeed, AvgRadius, Volume]
-- Find planets (planet name, avg speed, solar system name, galaxy name, avg radius)
-- from galaxy with GID = 2, with a volume >= 1.5*10^15 km^3.
-- a
SELECT PName, AvgOrbitalSpeed, SName, GName, AvgRadius
FROM Planets P, SolarSystems S, Galaxies G -- cross product by just enumerating
WHERE P.SolarSystem = S.SID AND S.Galaxy = G.GID AND Galaxy = 2 AND Volume >= 150000000000000
**ππ½(ππΆ(Planets Γ SolarSystem Γ Galaxies))**
-- b
SELECT PName, AvgOrbitalSpeed, SName, GName, AvgRadius
FROM (Planets P INNER JOIN SolarSystems S ON P.SolarSystem = S.SID)
INNER JOIN Galaxies G ON S.Galaxy = G.GID
WHERE Galaxy = 2 AND Volume >= 150000000000000
ππ½(ππΆ1((**Planets**β¨πΆ2**SolarSystem**)β¨πΆ3**Galaxies**))
-- c
SELECT PName, AvgOrbitalSpeed, SName, GName, AvgRadius
FROM(
(
SELECT PName, AvgOrbitalSpeed, AvgRadius, SolarSystem
FROM Planets
WHERE Volume >= 150000000000000
) P INNER JOIN (
SELECT *
FROM SolarSystems
WHERE Galaxy = 2
) S ON P.SolarSystem = S.SID
) INNER JOIN (
SELECT GID, GName
FROM Galaxies
WHERE GID = 2
) G ON S.Galaxy = G.GID
ππ½(((ππ½1(ππΆ2(**Planets**))) βπΆ3 (ππΆ4(**SolarSystem**))) βπΆ5 (ππ½2(ππΆ6(**Galaxies**))))
e.g.
B+ tree, entries a2
To obtain desired records:
β index scan
β get recordsβ addresses
β get records from relation
this algorithm is used to evaluate a cross-product
bR, bS = the number of blocks storing R and S, respectively
m, n = the number of blocks from R and S that can simultaneously appear in the main memory (there are m+n buffers for the 2 tables)
the following algorithm can be used to generate the cross-product {(r, s) | r β R, s β S}:
for every group of max. m blocks in R:
read the group of blocks from R into main memory; let M1 be the set of records in these blocks
for every group of max. n blocks in S:
read the group of blocks from S into main memory; let M2 be the set of records in these blocks
for every r β M1:
for every s β M2:
add (r, s) to the result
algorithm complexity:
to minimize this value, m should be maximized (the other operands are constants), one buffer can be used for S (so n = 1), while the remaining space can be used for R (m max.)
switch the 2 relations (in the algorithm and when computing the complexity):
choose better version
obs.: if bR β€ m or bS β€ n β complexity bR + bS