ETC3250/5250 Resources

Books and articles

This book by James, Witten, Hastie and Tibshirani contains the primary content for the unit. It explains the different methodologies, and includes practical labs and a range of exercises to work through. Use the second edition, with Applications in R.

This book by Boehmke & Greenwell is an accessible and practical guide to many aspects of machine learning. Its coverage of unsupervised classification is very good.

Machine learning is an active area of research across several disciplines, primarily statistics and computer science. Perhaps because of this, there are many ways to define and fit models. The tidy modeling approach coordinates these into a consistent and understandable workflow. It doesn’t interface with all software, but getting started with machine learning using this mindset helps you get organised despite the fragmented landscape. This book accompanies the tidymodels software.

This book contains the code to do most of the exercises from ISLR using the tidymodels thinking and coding style.

This book by Cook and Laa is the primary resource for learning how to visualise high-dimensional data, how to explore the data, and how to visually examine and diagnose models.

This book by Christoph Molnar serves as a guide for making black box models explainable. It is an excellent resource for developing your understanding of the different types of models and how to diagnose and interpret them.

This book by Biecek and Burzykowski provides useful approaches for making black boxes explainable.

This book by Emil Hvitfeldt covers creating new variables as broadly as possible. It includes classical methods such as dummy variables and Box-Cox transformations, methods for temporal and spatial data, and missing value imputation.