I

Let R and S be 2 relations. R has 10.000 records; a page can hold 10 R records.

S has 2.000 records; a page can hold 10 S records.

52 buffer pages are available.

Compute the cost of:

SELECT *

FROM R INNER JOIN S

ON R.a = S.b

using page-oriented nested loops join and block nested loops join;

S is the outer relation.

⇒ we need to consider the number of I/O operations required for each join algorithm:

Page-Oriented Nested Loops Join:

In this join algorithm, we iterate through each page of the outer relation (S) and for each page, we iterate through each record in the inner relation (R). The cost is determined by the number of I/O operations required to read and write the data.

Given:
- R has 10,000 records with a page size of 10 records per page. So, R has 10000 / 10 = 1,000 pages.
- S has 2,000 records with a page size of 10 records per page. So, S has 2000 / 10 = 200 pages.
- 52 buffer pages are available.
The cost of page-oriented nested loops join can be calculated as follows:
- Cost of reading the outer relation (S)
  
  Since S is the outer relation, we need to read all its pages once.
  
  So, the cost is equal to the number of pages of S.
  
  Cost of reading S = 200 (pages)
- Cost of reading the inner relation (R):
  
  For each page of S, we need to read all pages of R.
  
  Since there are 200 pages in S, we need to read 200 * 1000 = 200000 pages of R.
  
  Cost of reading R = 200000 (pages)
- Total cost:
  
  The total cost is the sum of the costs of reading the outer relation (S) and the inner relation (R).
  
  Total cost of page-oriented nested loops join = Cost of reading S + Cost of reading R = 200 + 200000 = 200200 (pages) I/Os
Block Nested Loops Join:

In this join algorithm, we read blocks of pages from the outer relation (S) and the inner relation (R) instead of individual pages. The cost is determined by the number of I/O operations required to read and write the data.

Given:
- R has 10,000 records with a page size of 10 records per page. So, R has 10,000 / 10 = 1,000 pages.
- S has 2,000 records with a page size of 10 records per page. So, S has 2,000 / 10 = 200 pages.
- 52 buffer pages are available.
The cost of block nested loops join can be calculated as follows:
- Cost of reading the outer relation (S):
  
  Since S is the outer relation, we need to read all its pages once.
  
  So, the cost is equal to the number of pages of S.
  
  Cost of reading S = 200 (pages)
- Cost of reading the inner relation (R):
  
  For each block of S, we need to read all blocks of R.
  
  The number of records that can be accommodated in a block is determined by the available buffer pages.
  
  Number of records in a block = buffer pages * page size = 52 * 10 = 520 records
  
  Number of blocks in R = Total records in R / records in a block = 10,000 / 520 ≈ 19.23 (blocks)
  
  Since we cannot have a fraction of a block, we need to round it up to the nearest integer.
  
  Number of blocks in R = 20 (blocks)
  
  Cost of reading R = Number of blocks in R = 20 (blocks)
  
  4 * 1000 = 4000 I/Os
- Total cost: The total cost is the sum of the costs of reading the outer relation (S) and the inner relation (R).
  
  Total cost of block nested loops join = Cost of reading S + Cost of reading R = 200 + 20 = 220 (blocks)
  
  block size: 50 ⇒ 200/50 = 4 S blocks
  
  200 + 4 * 1000 = 4200 I/Os

Therefore, the cost of the SELECT query using page-oriented nested loops join is 200200 I/Os, and the cost using block nested loops join is 4200 I/Os.

Let R and S be 2 relations.

R has 10.000 records; a page can hold 10 R records.

S has 2.000 records; a page can hold 10 S records.

Compute the cost of sorting R using external merge sort with 200 buffer pages.
Let R and S be 2 relations.

R has 10.000 records; a page can hold 10 R records.

S has 2.000 records; a page can hold 10 S records.

R is stored at București, S is stored at Cluj-Napoca.

Compute the cost of: SELECT *

FROM R INNER JOIN S

ON R.a = S.b

using simple nested loops join (tuple-oriented) in Cluj-Napoca, without caching;

S is the outer relation.
- td time to R / W a page from / to disk
- ts time to ship a page
- 200td + 2000 * 1000 (td + ts) = 200td + 2000000 (td + ts)
Encode the data de gustibus non disputandum using the secret encryption key metallica and the table of codes below. Write the last 5 characters in the result.

II

T1 and T2 are 2 concurrent transactions, both active at time t.

Choose the correct answer(s): a. The following execution describes a write read conflict: At time t, T2 is reading a data object previously written by T1. b. The following execution describes a write read conflict: At time t, T2 is writing a data object previously read by T1. c. The following execution describes a read write conflict: At time t, T2 is reading a data object previously written by T1. d. The following execution describes a read write conflict: At time t, T2 is writing a data object previously read by T1. e. none of the above answers is correct.

In a write read conflict, one transaction writes a data object and another transaction reads the same data object. This creates a conflict because the reading transaction might see an inconsistent state if it reads the data object before the writing transaction commits its changes.
A schedule S:

a. is conflict serializable if and only if its precedence graph has exactly one cycle.

b. is conflict serializable if and only if its precedence graph is acyclic.

c. is conflict serializable if and only if its precedence graph has exactly two cycles.

d. is conflict serializable if and only if its precedence graph has exactly three cycles.

e. none of the above answers is correct.

Conflict serializability is a property of a schedule in concurrency control, indicating that the schedule is equivalent to some serial schedule. A schedule is conflict serializable if the order of conflicting operations (read-write or write-write) in the schedule can be rearranged to form a valid serial schedule without changing the outcome of the transactions.

The precedence graph represents the conflicts between operations in a schedule. It is a directed graph where the nodes represent the transactions, and the edges represent the conflicts between transactions. In the precedence graph, a cycle indicates a conflict that cannot be resolved, and therefore the schedule is not conflict serializable.

Options a, c, and d are incorrect because they specify a specific number of cycles in the precedence graph as a condition for conflict serializability, which is not accurate. The presence of any cycle in the precedence graph indicates a conflict that violates conflict serializability.
In SQL Server, under the READ UNCOMMITTED isolation level:
In horizontal fragmentation:
I is an index with search key <C1, C2, C3, C4>.
Let R be a relation with P pages. The cost of sorting R using simple two-way merge sort (i.e., with 3 pages in the buffer pool) is:
Consider the query: SELECT * FROM R1, R2, R3 WHERE p1 AND p2 AND p3 The conditions tested by the predicates in the WHERE clause are statistically independent. The cardinality of a relation R is denoted by |R|. The reduction factor associated with predicate p is denoted by RF(p). The cardinality of the query’s result set can be estimated by: