Week 5: Trees and forests
We will cover:
Pros and cons:
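Define the mean squared error (MSE) as
\[\mbox{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \widehat{y}_i)^2\]
where \(y_i\) is the observed response and \(\widehat{y}_i\) is the predicted value for observation \(i\).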
Split the data at the point where the combined MSE of the left bucket (\(\mbox{MSE}_L\)) and the right bucket (\(\mbox{MSE}_R\)) gives the biggest reduction from the overall MSE.
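The split search can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation, and it rests on several assumptions that the text above does not state: each bucket predicts the mean of its own responses, the two bucket MSEs are combined as a size-weighted average, candidate splits sit at midpoints between neighbouring x values, and the toy data is invented. The names `mse` and `best_split` are hypothetical.

```python
def mse(values):
    """MSE of a bucket, assuming the prediction is the bucket mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(x, y):
    """Return (split, reduction) for the split with the largest MSE reduction."""
    pairs = sorted(zip(x, y))           # sort observations by x
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(ys)
    overall = mse(ys)                   # MSE before splitting
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue                    # no split between tied x values
        split = (xs[i - 1] + xs[i]) / 2  # assumption: midpoint candidates
        left, right = ys[:i], ys[i:]
        # Assumption: combine MSE_L and MSE_R weighted by bucket size.
        combined = (len(left) * mse(left) + len(right) * mse(right)) / n
        reduction = overall - combined
        if best is None or reduction > best[1]:
            best = (split, reduction)
    return best


# Hypothetical toy data: the response jumps between x = 3 and x = 4.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 5.1, 4.8, 5.0]
split, reduction = best_split(x, y)
print(split, reduction)
```

On this toy data the chosen split is 3.5, the midpoint of the jump in the response. A different way of combining the two bucket errors (for example, summing squared errors) may be what a particular software package uses.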
| x | cl |
|---|---|
| 11 | A |
| 33 | A |
| 39 | B |
| 44 | A |
| 50 | A |
| 56 | B |
| 70 | B |
Note: x is sorted from lowest to highest!
All possible splits shown by vertical lines
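As a rough sketch, the candidate splits for the table above can be listed in Python. It assumes each split sits at the midpoint between two neighbouring sorted x values, which is a common convention but is not stated here; the function name `candidate_splits` is hypothetical.

```python
# Data from the table above.
x = [11, 33, 39, 44, 50, 56, 70]
cl = ["A", "A", "B", "A", "A", "B", "B"]


def candidate_splits(values):
    """Midpoints between consecutive distinct sorted values (an assumed convention)."""
    ordered = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


splits = candidate_splits(x)

# Show which class labels fall on each side of every candidate split.
for s in splits:
    left = [c for v, c in zip(x, cl) if v < s]
    right = [c for v, c in zip(x, cl) if v >= s]
    print(s, left, right)
```

With 7 distinct x values there are 6 candidate splits: 22, 36, 41.5, 47, 53 and 63.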