Computer Engineering - Machine Learning

Full exam

DIPARTIMENTO DI ELETTRONICA, INFORMAZIONE E BIOINGEGNERIA
Politecnico di Milano
Machine Learning (Code: 097683), July 8, 2023

Surname:          Name:          Student ID:          Row:      Column:
Time: 2 hours          Maximum Marks: 33

- The following exam is composed of 8 exercises (one per page). The first page needs to be filled in with your name, surname and student ID. The following pages should be used only within the large squares present on each page. Any solution provided outside these spaces or without a motivation will not be considered for the final mark.
- During this exam you are not allowed to use electronic devices, such as laptops, smartphones, tablets and/or similar. Likewise, you are not allowed to bring with you any kind of note, book, written scheme, and/or similar. You are also not allowed to communicate with other students during the exam.
- The first reported violation of the above-mentioned rules will be annotated on the exam and will be considered for the final mark decision. The second reported violation will imply the immediate expulsion of the student from the exam room and the annulment of the exam.
- You are allowed to write the exam either with a pen (black or blue) or a pencil. It is your responsibility to provide a readable solution. We will not be held accountable for accidental partial or total cancellation of the exam.
- The exam can be written either in English or in Italian.
- You are allowed to withdraw from the exam at any time without any penalty. You are allowed to leave the room no earlier than half the duration of the exam. You are not allowed to keep the text of the exam with you while leaving the room.
- Three of the points will be given on the basis of how quickly you solve the exam: if you finish more than 30 minutes before the end of the exam you will get 3 points, if you finish more than 20 minutes before the end you will get 2 points, and if you finish more than 10 minutes before the end you will get 1 point (the points cannot be accumulated).
- The box on Page 10 can only be used to complete Exercises 7 and/or 8.

Ex. 1   Ex. 2   Ex. 3   Ex. 4   Ex. 5   Ex. 6   Ex. 7   Ex. 8   Time   Tot.
 / 7     / 7     / 2     / 2     / 2     / 2     / 4     / 4     / 3    / 33

Exercise 1 (7 marks)
Beginning with the expected squared error of a regression model, outline the steps that lead to the Bias-Variance Decomposition of this error, providing the resulting equations. Then, briefly discuss the intuitive meaning behind each term in the final decomposition.
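For reference, a minimal sketch of the resulting decomposition, under the standard assumptions that the targets are generated as t = f(x) + ε with zero-mean noise of variance σ², and that the expectation is taken over the noise and over the training set D used to learn the model y(x; D):

    \mathbb{E}\big[(t - y(x;\mathcal{D}))^2\big]
      \;=\; \underbrace{\sigma^2}_{\text{irreducible noise}}
      \;+\; \underbrace{\big(\mathbb{E}_{\mathcal{D}}[y(x;\mathcal{D})] - f(x)\big)^2}_{\text{bias}^2}
      \;+\; \underbrace{\mathbb{E}_{\mathcal{D}}\Big[\big(y(x;\mathcal{D}) - \mathbb{E}_{\mathcal{D}}[y(x;\mathcal{D})]\big)^2\Big]}_{\text{variance}}

The noise term is the error floor that no model can remove, the squared bias measures the systematic gap between the average prediction and the true function, and the variance measures how much the prediction fluctuates across different training sets.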
Exercise 2 (7 marks)
Explain what an ϵ-greedy policy is and why it is useful in Reinforcement Learning.
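As an illustrative sketch only (the names epsilon_greedy_action and q_estimates are hypothetical, not taken from the exam): given a vector of estimated action values, the policy picks a uniformly random action with probability ϵ and the greedy action otherwise.

    import numpy as np

    def epsilon_greedy_action(q_values, epsilon, rng):
        # With probability epsilon take a uniformly random action (exploration);
        # otherwise take the action with the highest estimated value (exploitation).
        if rng.random() < epsilon:
            return int(rng.integers(len(q_values)))
        return int(np.argmax(q_values))

    rng = np.random.default_rng(0)
    q_estimates = np.array([0.1, 0.5, 0.3])   # current action-value estimates
    action = epsilon_greedy_action(q_estimates, epsilon=0.1, rng=rng)

Keeping ϵ > 0 ensures continual exploration, so every action keeps being sampled and the value estimates can keep improving, while the greedy branch still exploits the current knowledge; in practice ϵ is often decreased over time.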
Exercise 3 (2 marks)
Consider the following snippet of code:

    1   X = zscore(dataset[['sepal-length', 'sepal-width']].values)
    2   t = dataset['class'].values == 'Iris-setosa'
    3   X, t = shuffle(X, t, random_state=0)
    4
    5   c1 = LogisticRegression(penalty='none')
    6   c1.fit(X, t)
    7
    8   c2 = GaussianNB()
    9   c2.fit(X, t)
    10
    11  pred1 = c1.predict(X)
    12  pred2 = c2.predict(X)
    13  pred3 = pred1 * pred2

1. Describe the operations executed and the purpose (which problem and which method has been used) of the snippet reported above.
2. Interpret the operation that is performed at line 13. Can you say something about the recall of pred3 compared to the recall of pred1? Motivate your answer.
3. Can any of the two models employed in the snippet be used for augmenting the available data with artificially generated samples? Motivate your answer.

Solution:
1. The snippet performs binary classification on the Iris dataset for the task of identifying 'Iris-setosa', using 'sepal-length' and 'sepal-width' (standardized with a z-score) as features. The methods used are Naïve Bayes with Gaussian class-conditional densities for the features and logistic regression with no regularization.
2. At line 13, we generate a new prediction for the training set in which a sample is classified as 'Iris-setosa' only if both models classify it as 'Iris-setosa'. Since the true positives TP can only decrease and the false negatives FN can only increase in pred3 compared to pred1, the recall, Recall = TP/(TP + FN) = 1/(1 + FN/TP), cannot increase.
3. Yes: Naïve Bayes is a generative model and, thus, it can be used to generate new artificial samples.

Exercise 4 (2 marks)
Tell whether the following properties are typical of filter feature selection, PCA, Lasso, or none of them. Motivate your answers.
1. It increases the bias of the model.
2. It provides features that are different from the original ones.
3. It is embedded into the training loss.
4. It can be applied in combination with linear regression only.

Solution:
1. All (PCA if a strict subset of the principal components is considered). All these techniques aim at trading off a reduction of the variance at the price of an increase in the bias of the model.
2. PCA. All the other methods perform a form of feature selection, identifying a subset of the original features; PCA, instead, is a feature extraction method that provides novel features which are linear combinations of the original ones.
3. Lasso. It is a regularization technique which alters the training loss by introducing a regularizer (the ℓ1 norm of the weights).
4. None of them. Filter feature selection and PCA are applied before training, irrespective of the model. Note that, even though we only saw Lasso for the regression problem, it can also be applied to any parametric model (regression or classification).

Exercise 5 (2 marks)
We are the manager of a football team and we are planning which players to buy and sell to build next year's team. The process of bargaining between teams to conclude a purchase is a process in which, in a sequence, we propose an offer (a mix of money and other players for a specific player) and the counterpart decides to accept the offer, refuse it and continue bargaining, or refuse it and stop the bargaining process.
1. Define the type of data one might consider when modeling this process (input, output, state, action, reward, etc.).
2. Tell which problem is the one of evaluating the performance of the bargaining strategy of the previous season. Suggest a method to solve this problem.
3. Can we use the same approach if we want to find the most appropriate way to maximize the value obtained from the bargaining process? If not, tell which problem we are facing and propose a method to solve it.

Solution:
1. The problem is a sequential decision problem that can be modeled as an MDP. The states are represented by the current situation of the team (players already in the team and money available for transfers), the actions are the different offers for the players of other teams, the rewards are positive if we close a deal, zero if we continue the bargaining, and negative if the bargaining is stopped, the transitions are stochastic (assuming that the managers of the other teams do not change their behaviour over time), the discount factor can be set to 1 since we assume that the episodes end at some point, and the initial state distribution is given by the initial amount of money and the current team.
2. In this case, we are facing an RL prediction problem, where we would like to estimate the cumulative value of a specific strategy in each of the different states. In this setting, we might use Monte Carlo (MC) or Temporal Difference (TD) methods.
3. In this case, we are facing an RL control problem, since the target is learning the optimal strategy. In this setting, we can resort to Q-learning or SARSA.

Exercise 6 (2 marks)
Tell whether the following statements are true for the UCB1 and/or Thompson Sampling (TS) algorithms. Properly motivate your answers.
1. It is able to include prior information on the MAB problem we are solving.
2. It can be applied if the rewards provided by pulling arms are unbounded.
3. It effectively solves online RL problems with a single state.
4. It solves the exploration/exploitation dilemma using the optimism in the face of uncertainty principle.

Solution:
1. TS: being a Bayesian algorithm, it is able to incorporate prior information by changing the shape of the initial prior distribution over the expected rewards.
2. NONE: UCB1 requires the rewards to have bounded support, and during the lectures we only presented the version of TS that requires the rewards to be drawn from a Bernoulli distribution. Extensions of both methods to unbounded rewards exist, but they have not been presented during the course.
3. BOTH: the MAB setting is equivalent to an RL problem with a single state and a finite time horizon T.
4. UCB1: it uses upper confidence bounds on the estimates of the expected rewards to avoid converging prematurely to suboptimal arms.

Exercise 7 (4 marks)
Consider a soft-margin linear SVM classifier trained on a non-linearly separable dataset, leading to the weight vector w = (−2, 1)⊤ and constant b = −1. Address the following questions, motivating your answers.
1. Draw the decision boundary and the margins.
2. Let x = (−3, 1)⊤ be a training point belonging to the negative class. Is it correctly classified? Tell which of the following conditions on the slack variable ξ of point x is correct: ξ = 0, ξ > 1, or 0 < ξ