Welcome

Our website is made possible by displaying online advertisements to our visitors.
Please disable your ad blocker to continue.

Computer Engineering - Data Bases 2

Full exam

Databases 2 - exam S. Comai, P. Fraternali, D. Martinenghi September 1, 2022 A. Concurrency Control (8 points)Given the resource hierarchy X(Y(A,B), Z(S,T)), describe the behavior of the following arrival sequence of requests managed by a scheduler that applies hierarchical locking. Suppose that locks are released after the commit of each transaction, which occurs immediately after the last operation of each transaction. r1(S) w1(A) w2(Z) r2(A) r3(X) w1(Y) Solution.Op.XYABZST r1(S)ISL 1---ISL 1SL 1- w1(A)IXL 1(upgr)IXL 1XL 1-ISL 1SL 1- w2(Z)IXL 1,2IXL 1XL 1-ISL 1(Conflict - T2 waits!)SL 1- r2(A)T2 is waiting r3(X)IXL 1,2Conflict - T3 waits!IXL 1XL 1-ISL 1(Conflict - T2 waits!)SL 1- w1(Y)IXL 1,2XL 1(upgr)XL 1-ISL 1SL 1- T1 commitsIXL 2------ w2(Z)IXL 2---XL 2-- r2(A)IXL 2ISL 2SL 2-XL 2-- T2 commits------- r3(X)SL 3------ T3 commits------- 1 B. Physical Databases (8 points) A tableFILE(fid, name, mediaType, description) stores 100K tuples on 1K blocks in a primary entry sequenced storage. A tableHYPERLINK(source, target, rel) stores 1M tuples on 10K blocks in a primary hash built on the primary key with negligible overflow chains. The relation is not symmetric: if A→B is inHYPERLINKB→A may or may not be inHYPERLINK. We also know that •val(mediaType)=200; •only 5% of the hyperlinks correspond to a relation of type “alternate”. Consider the query that finds pairs of HTML files, one of which is an alternative version of the other: select F1.fid, F2.fid from (FILE F1 join HYPERLINK on F1.fid=source) join FILE F2 on F2.fid = target where F1.mediaType="HTML" and F2.mediaType="HTML" and rel="alternate" and source target Describe briefly (but precisely) a reasonable query plan and estimate its cost in the following scenarios:(a)there are no secondary indexes; (b)there is also a B+(mediaType) index forFILE(F=25, 3 levels, 500 leaf nodes). Solution.(a)In case there are no secondary indexes, there are two strategies: (1) without caching, and (2) withcaching:(1)Self-join nested loop onFILEtable + 500×499 hash lookups onHYPERLINK(there are 500 HTML files because there are 100k tuples and 200 files per media type, and once a file is chosen, there are only 499 distinct files to choose from). The total cost is: ≈1k×1k + (1×250k)≈1.25M (2)Indexed filtered nested loop withFILEtable as external and caching. The strategy is the following: •Find the 500 HTML files (caching theirfid) •Table scan (cost: 1k) •For each pair of HTML filesfid: –Build the hash key from the pair (500×499 key values≈250k) –Access the primary hash onHYPERLINKto check if there is a link (1 block per access =⇒250k) –Get the tuple from primary storage and check therelvalue The output is formed by the linked pairs withrel = ’alternate’, and the total cost is: ≈1k + 1×250k≈251k The plan can actually be improved, because, when caching is available, instead of computing all source-target pairs and use the hash to check the existence of an alternate hyperlink, we can simply scan the 10K blocks of the hash to look for a match with the cached files, so the cost becomes≈1k (scanFILE) + 10k (scanHYPERLINK) = 11k (b)Indexed filtered nested loop withFILEtable as external, optimized with filtering throughmediaType B+ onFILE. The strategy is the following: •Find the 500 HTML files –Lookup the index with search key =HTML –2 intermediate nodes + 3 leaf nodes (100K tuples / 500 leaf nodes = 200 pointers per node, so 3 nodes for 500 HTML files) + 500 data blocks for HTML files = 505 –500 tuples→get (cache) thefidvalue •For each pair of HTML files –Build the hash key from the pair 2 – 500×499≈250k –Access the primary hash onHYPERLINKto check if there is a link –1 block per access→250K –Get the tuple from primary storage and check the relation typerel The output is formed by the linked pairs withrel = ’alternate’, and the total cost is: ≈2 + 3 + 500 + 250k≈250,505 As in the previous case, the plan can be improved by simply scanning the 10K blocks of the hash instead of checking all the source-target pairs, with cost ≈2 + 3 + 500 (find files on B+) + 10k (scanHYPERLINK)≈10.5k 3 C. Triggers (9 points) The following relational schema offers a simplified description of a university (primary keys underlined):Person(Pid, Name) Course(Cid, Dept, Credits) Schedule(Teacher, Cid) Registration(Student, Cid) with obvious foreign keys fromCidtoCourse(Cid)and fromTeacherandStudenttoPerson(Pid). The university policy requires the following integrity constraints to hold:•No one is both a student and a teacher for the same course. •Every course in the CS department is worth at most 10 credits. •At least one course in the CS department has at least 5 students. Design a trigger system that keeps the integrity constraints satisfied by rejecting illegal operations. In particular, the triggers must react to the following kinds of events occurring on the database tables: (i)insertions (any table);(ii)deletions (any table);(iii)updates ofCourse. Solution.Triggers for insertions: CREATE TRIGGER insert_schedule BEFORE INSERT ON Schedule FOR EACH ROW BEGINIF EXISTS( SELECT * FROM RegistrationWHERE new.Teacher = Student AND new.Cid = Cid ) THEN RAISE EXCEPTION ’Existing student with the same name’; END IF; END; CREATE TRIGGER insert_registration BEFORE INSERT ON Registration FOR EACH ROW BEGINIF EXISTS( SELECT * FROM ScheduleWHERE new.Student = Teacher AND new.Cid = Cid ) THENRAISE EXCEPTION ’Existing teacher with the same name’; END IF; END; CREATE TRIGGER insert_course BEFORE INSERT ON Course FOR EACH ROW BEGINIF new.Credits > 10 AND new.Dept = ’CS’ THENRAISE EXCEPTION ’Too many credits for a CS course’; END IF; END; Triggers for deletions: CREATE TRIGGER delete_registration BEFORE DELETE ON Registration FOR EACH ROW BEGIN 4 IF (SELECT dept FROM Course WHERE Cid=old.Cid) = ’CS’ AND (SELECT COUNT(*) FROM Registration WHERE Cid=old.Cid) = 5 AND NOT EXISTS( SELECT 1 FROM Registration R JOIN Course C ON R.Cid=C.CidWHERE dept=’CS’ AND R.Cidold.Cid GROUP BY R.Cid HAVING COUNT(*)>=5 ) THEN RAISE EXCEPTION ’No course left with at least 5 students’; END IF; END; Triggers for updates ofCourse: CREATE TRIGGER update_course_1 BEFORE UPDATE ON Course FOR EACH ROW WHEN NEW.dept = ’CS’ -- redundant, already checked in code BEGIN-- same code as trigger insert_course IF new.Credits > 10 AND new.Dept = ’CS’ THENRAISE EXCEPTION ’Too many credits for a CS course’; END IF; END; CREATE TRIGGER update_course_2 BEFORE UPDATE ON Course FOR EACH ROW WHEN new.Dept ’CS’ AND old.Dept = ’CS’ BEGIN-- almost same code as trigger delete_registration IF (SELECT COUNT(*) FROM Registration WHERE Cid=old.Cid) >= 5 AND NOT EXISTS( SELECT 1 FROM Registration R JOIN Course C ON R.Cid=C.CidWHERE Dept=’CS’ AND R.Cidold.Cid GROUP BY R.Cid HAVING COUNT(*)>=5 ) THENRAISE EXCEPTION ’No course left with at least 5 students’; END IF; END; 5 D. JPA (7 points) An application lets a researcher edit the pro jects s/he is responsible for. A pro ject has a name, a start and end date, a type and a budget (an integer number). A researcher can create pro jects and assign other researchers to them. Researchers have a name, an id, a photo, and a seniority level (a string). Researchers also pertain to one or more disciplines (e.g., computer science, automation, telecommunications, etc.). The application lets a researcher access the list of pro jects s/he manages (in descending order of end date) and the list of pro jects s/he works in. By selecting one pro ject s/he can see the details of the pro ject including the list of the researchers that work in it/manage it with the disciplines they belong to. Researchers are in the order of hundreds, pro jects are in the order of thousands and disciplines in the order of tens.Show the JPA entities (with all their attributes) that map the domain ob jects of the conceptual model, taking into account the above mentioned access paths of the application and data cardinalities. When designing the annotations for the relationships, specify the owner side of the relationship, the mapped-by attribute, the fetch policy and the cascading policies you consider more appropriate to support the access required by the web application. Comment on the design choices you have made.Discipline id id id id id id id id id id id id idid id id id nameResearcher id id id id id id id id id id id id idid id id id firstname lastname photo seniorityPro ject id id id id id id id id id id id id idid id id id name type start end budgetsector0:N1:Nmanages 0:N1:1 worksin0:N0:N Solution. See separate pdf file. 6