This week we will deepen our knowledge of one of the most widely used supervised data science methods: linear regression. All of you are assumed to be familiar with this method, but when it is applied to big data, new aspects come into play. Typically, big data gives us many more predictors than we can use or are interested in, so how do we select the best ones? We will learn about two approaches to this problem: best subset selection, and shrinkage methods such as ridge regression and the lasso.
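To make the idea of shrinkage concrete, here is a minimal sketch in Python using only NumPy. It is not part of the lab materials: the simulated data, the penalty values, and the helper names `ridge_fit`, `soft_threshold`, and `lasso_fit` are all assumptions made for illustration. Ridge is computed from its closed-form solution, and the lasso with a simple coordinate descent loop, both without an intercept on standardized predictors.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^{-1} X'y, no intercept."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def soft_threshold(z, t):
    """Shrink z towards zero by t, setting it to exactly zero if |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2)*||y - Xb||^2 + lam*||b||_1, no intercept."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual that leaves out the contribution of predictor j
            r_j = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return beta

# Simulated data (hypothetical): 10 predictors, only the first 3 matter
rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the predictors
true_beta = np.array([3.0, -2.0, 1.5] + [0.0] * (p - 3))
y = X @ true_beta + rng.standard_normal(n)
y = y - y.mean()                            # center the response

beta_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_fit(X, y, lam=50.0)  # penalty values chosen arbitrarily
beta_lasso = lasso_fit(X, y, lam=40.0)

print("OLS:  ", np.round(beta_ols, 2))
print("Ridge:", np.round(beta_ridge, 2))
print("Lasso:", np.round(beta_lasso, 2))
```

Comparing the three printed coefficient vectors shows the key difference: ridge pulls all coefficients towards zero without removing any, whereas the lasso can set coefficients exactly to zero and so also performs variable selection. In practice the penalty would be chosen by cross-validation rather than fixed by hand.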
Complete questions 1-6 of the lab, which finishes the section “Part 1: to be completed before the lab”, and hand them in at least 2 hours before the start of the lab.
You can download the student zip, which includes all files needed for practical 4, here.