Computer Engineering - Machine Learning
Full exam
DIPARTIMENTO DI ELETTRONICA, INFORMAZIONE E BIOINGEGNERIA
Politecnico di Milano
Machine Learning (Code: 097683) - February 16, 2024

Surname:        Name:        Student ID:        Row:        Column:
Time: 2 hours        Maximum Marks: 33

The following exam is composed of 8 exercises (one per page). The first page needs to be filled in with your name, surname and student ID. The following pages should be used only inside the large squares present on each page. Any solution provided either outside these spaces or without a motivation will not be considered for the final mark.

During this exam you are not allowed to use electronic devices, such as laptops, smartphones, tablets and/or similar. Likewise, you are not allowed to bring with you any kind of note, book, written scheme, and/or similar. You are also not allowed to communicate with other students during the exam. The first reported violation of the above-mentioned rules will be annotated on the exam and will be considered for the final mark decision. The second reported violation of the above-mentioned rules will imply the immediate expulsion of the student from the exam room and the annulment of the exam.

You are allowed to write the exam either with a pen (black or blue) or a pencil. It is your responsibility to provide a readable solution. We will not be held accountable for accidental partial or total cancellation of the exam. The exam can be written either in English or Italian.

You are allowed to withdraw from the exam at any time without any penalty. You are allowed to leave the room no earlier than half the duration of the exam. You are not allowed to keep the text of the exam with you while leaving the room.

Three of the points will be given on the basis of how quick you are in solving the exam: if you finish earlier than 30 minutes before the end of the exam you will get 3 points, if you finish earlier than 20 minutes you will get 2 points, and if you finish earlier than 10 minutes you will get 1 point (the points cannot be accumulated). The box on Page 10 can only be used to complete Exercises 7 and/or 8.

Ex. 1: /7    Ex. 2: /7    Ex. 3: /2    Ex. 4: /2    Ex. 5: /2    Ex. 6: /2    Ex. 7: /4    Ex. 8: /4    Time: /3    Tot.: /33

Exercise 1 (7 marks)
Write the pseudocode of the perceptron algorithm and prove that the update rule decreases the error for the currently processed sample at each iteration.

Exercise 2 (7 marks)
Provide an overview of the feature selection methods that you know and explain their pros and cons.

Exercise 3 (2 marks)
Consider the following snippet of code:

    1  Q = np.zeros(nS * nA)
    2  Qold = np.ones(nS * nA)
    3  V = np.zeros(nS)
    4
    5  while np.any(Q != Qold):
    6      Qold = Q
    7      for s in range(nS):
    8          V[s] = np.max(Qold[s * nA:(s + 1) * nA])
    9      Q = Rsa + gamma * Psas @ V

1. What algorithm is the snippet of code above implementing? What problem is it trying to solve?
2. Is the code snippet above correct? If not, list the mistakes and provide a fix for each of them.
3. Is the while loop guaranteed to stop after a finite number of iterations (if gamma < 1)?

Solution:
1. The snippet implements Value Iteration; the problem it is trying to solve is computing the optimal value function of an MDP with nS states and nA actions, from which an optimal policy can be derived.
2. The exact-equality stopping condition at line 5 is a mistake, since Q and Qold may never become exactly equal; it should be replaced by a tolerance check such as np.any(np.abs(Q - Qold) > eps), where eps > 0 is a user-defined threshold.
3. Value Iteration is guaranteed to converge only asymptotically. Thus, the while loop with the control condition at line 5 is not guaranteed to terminate after a finite number of steps (of course, one may argue that the loop can terminate anyway because of the finite machine precision). A simple counterexample is given by nS = nA = 1, Rsa = 1. Here, at the i-th iteration, we have Qold = sum_{l=0}^{i-1} gamma^l and Q = sum_{l=0}^{i} gamma^l = Qold + gamma^i and, consequently, Q != Qold over all iterations.
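For reference, a minimal runnable sketch of the snippet with the corrected stopping condition, assuming a small synthetic MDP (the dimensions, the discount factor, the threshold eps and the randomly generated Rsa and Psas are illustrative choices, not part of the exam text):

    import numpy as np

    nS, nA, gamma, eps = 4, 2, 0.9, 1e-8    # illustrative values

    rng = np.random.default_rng(0)
    Rsa = rng.random(nS * nA)                # reward for each (state, action) pair
    Psas = rng.random((nS * nA, nS))         # transition probabilities P(s' | s, a)
    Psas /= Psas.sum(axis=1, keepdims=True)  # normalize each row to sum to 1

    Q = np.zeros(nS * nA)
    Qold = np.ones(nS * nA)
    V = np.zeros(nS)

    # Stop when the largest change in Q falls below the user-defined threshold eps
    while np.any(np.abs(Q - Qold) > eps):
        Qold = Q
        for s in range(nS):
            V[s] = np.max(Qold[s * nA:(s + 1) * nA])
        Q = Rsa + gamma * Psas @ V

    print("Approximately optimal value function:", V)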
Exercise 4 (2 marks)
Tell whether the following statements hold or not for the logistic regression and KNN methods. Provide motivations for your answers.

1. There exists a closed-form solution for the optimization procedure required to train the model.
2. They are methods used for classification, but can be easily adapted to solve regression problems.
3. The prediction phase for the model is computationally light.
4. They allow the use of regularization methods for model selection purposes.

Solution:
1. FALSE for both: logistic regression is trained using an iterative procedure (it is guaranteed to converge, but there is no closed-form solution); KNN is a non-parametric method that does not require any training process.
2. FALSE for logistic regression, TRUE for KNN: logistic regression represents an extension of the linear regression technique to classification problems. KNN can also be used for regression, by averaging the outputs of the selected samples instead of applying a majority-voting approach at the end of the procedure.
3. TRUE for logistic regression, DEPENDS for KNN: logistic regression requires a number of operations for prediction which is linear in the dimension of the parameter vector; KNN requires computing all the distances between the samples in the training set and the new sample (linear in the number of samples). If we assume to have a small number of samples, then this operation can be performed quickly too.
4. TRUE for both: the logistic regression loss function can be modified to include penalties for regularization purposes, and the KNN method uses the parameter K as a regularization tool (large K implies strong regularization).

Exercise 5 (2 marks)
Tell whether the following statements are true or false. Justify your answers.

1. SARSA is a prediction method since it makes use of the Bellman expectation equation for updating the value function.
2. Q-learning is based on Temporal Difference learning, while SARSA is based on Monte Carlo estimation.
3. Both SARSA and Q-learning necessitate updating the policy used to collect samples with the ε-greedy policy improvement.
4. Differently from SARSA, Q-learning does not need to wait for the action played in the next state for updating the value function.

Solution:
1. FALSE. Although it uses the Bellman expectation equation, SARSA is a control method since it aims at learning the optimal Q-function Q*, just like Q-learning.
2. FALSE. They are both based on Temporal Difference learning, since they bootstrap the previous estimate of the Q-function to define the targets of the update.
3. FALSE. This is true for SARSA only, where the policy used to collect samples needs to converge to the optimal policy, SARSA being an on-policy method. Since Q-learning is an off-policy method, the policy used to collect samples does not need to converge to the optimal policy; thus, for the latter, the ε-greedy policy improvement is not needed. (One may argue that other improvements are possible for SARSA besides the ε-greedy policy improvement; what matters is that they converge to the optimal policy.)
4. TRUE. SARSA needs the action a_{t+1} to define the TD target, which contains Q(s_{t+1}, a_{t+1}). Instead, Q-learning uses in the TD target the term max_{a' in A} Q(s_{t+1}, a'); thus, it does not need to wait for a_{t+1}.
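To make the difference between the two TD targets in statement 4 concrete, a minimal sketch of the two tabular update rules (the function names, the learning rate alpha and the discount factor gamma are illustrative assumptions, not part of the exam text):

    import numpy as np

    def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
        # On-policy: the TD target uses the action a_next actually played in s_next
        td_target = r + gamma * Q[s_next, a_next]
        Q[s, a] += alpha * (td_target - Q[s, a])

    def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
        # Off-policy: the TD target uses the greedy action in s_next,
        # so the update does not need to wait for the next action to be played
        td_target = r + gamma * np.max(Q[s_next])
        Q[s, a] += alpha * (td_target - Q[s, a])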
Exercise 6 (2 marks)
Consider the problem of designing an autonomous driving agent for a car. The system's performance must be evaluated based on both the time it takes to reach a designated location and the safety of the driver and any individuals the car may come across along its path. The system should replace the driver in the possible decisions/actions they could perform in a car with an automatic gearbox.

1. Model the problem as an MDP and provide a technique to learn the best driving policy. Motivate your decisions.
2. In case the learning process is performed on a real car in a real-world setting, would you prefer to adopt an off-policy or an on-policy approach? (In case it is off-policy, specify also the policy the agent should follow.)

Solution:
1. Since we want to find an optimal strategy, we want to solve a control problem. The MDP modeling such a problem has the following elements:
- States: location, speed and acceleration of the car. We might also add other parameters related to the distances from possible obstacles.
- Actions: steer (left or right, with a specific angle), accelerate/decelerate (with a specific intensity), emergency brake activation.
- Transition matrix: stochastic and unknown, since the set of possible situations is significantly large and complex to model.
- Reward: weighted sum of a value corresponding to the decrease in distance to the final target and an index related to the safety of the action (considering penalties if the car hits some obstacles); a sketch of such a reward is given below.
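As an illustration of the reward just described, a possible sketch (the weights, the penalty value and the function signature are illustrative assumptions, not part of the exam solution):

    def driving_reward(dist_prev, dist_curr, collided, w_progress=1.0, w_safety=100.0):
        # Progress term: positive when the distance to the designated location decreases
        progress = dist_prev - dist_curr
        # Safety term: -1 if the car hit an obstacle during this step, 0 otherwise
        safety = -1.0 if collided else 0.0
        return w_progress * progress + w_safety * safety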