Statistical Learning. Part II

Tree-based Methods and (Deep) Neural Networks

Alex Sanchez (E. Vegas and F. Reverter)

Outline

  • Professor

  • What this second part is about (Contents)

  • Methodology

  • Evaluation

  • References

Professor

Contents

  • Tree-based Methods (~10.5 h, 3-4 weeks)
    1. The Basics of Decision Trees. Regression Trees. Classification Trees.
    2. Ensemble Learning. Bagging. Random Forests. Boosting.
  • Artificial Neural Networks (~10.5 h, 3-4 weeks)
    1. Feed-Forward Network Functions.
    2. Network Training.
    3. Error Backpropagation.
    4. Deep Learning models.
    5. Convolutional Neural Networks.

Contents (1) Decision Trees

  • A type of non-parametric classifier (see the code sketch below)
  • Very successful because of:
    • Interpretability
    • Flexibility
    • Decent accuracy
  • But also some cons:
    • Not very robust: small changes in the data can produce very different trees
    • Tend to overfit

[Figure: example of a decision tree. By Gilgoldm - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=90405437]
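To make this concrete, here is a minimal R sketch (not from the course materials) that grows a classification tree with the rpart package on the built-in iris data; the package choice and default settings are illustrative assumptions:

```r
# Minimal decision-tree sketch, assuming the rpart package is installed
library(rpart)

# Grow a classification tree for Species from the four iris measurements
tree <- rpart(Species ~ ., data = iris, method = "class")

# Inspect the splits, then plot the tree with per-node counts
print(tree)
plot(tree, margin = 0.1)
text(tree, use.n = TRUE)

# Predict classes for a few observations
predict(tree, head(iris), type = "class")
```

Pruning (via the cp complexity parameter, left at its default here) is the usual way to control the overfitting mentioned above.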

Contents (2) Ensemble Methods

  • Build predictions combining multiple models
  • Address some limitations of trees (see the sketch below):
    • improve accuracy and robustness
    • reduce overfitting.
    • capture complex relationships.

[Figure: an example of a random forest. Source: https://static.javatpoint.com/tutorial/machine-learning/images/]
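As a sketch of how an ensemble addresses these limitations, the example below fits a random forest with the randomForest package; the data set and the number of trees are assumptions made for illustration:

```r
# Minimal random-forest sketch, assuming the randomForest package is installed
library(randomForest)

set.seed(123)  # for reproducibility

# Fit an ensemble of 500 classification trees on the built-in iris data
rf <- randomForest(Species ~ ., data = iris, ntree = 500)

# Out-of-bag error: a built-in estimate of generalization performance
print(rf)

# Variable importance: which predictors the forest relies on most
importance(rf)
```

Averaging many de-correlated trees is what reduces the variance, and hence the overfitting, of a single tree.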

Contents (3) Neural Networks

  • ML models, inspired by the brain, that simulate how neurons behave (sketched below):
    • Receive inputs
    • Process them
    • Output predictions
  • For a long time they had limited applications:
    • Criticized as black boxes
    • Not very powerful
    • Hard to interpret
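The "receive input, process it, output a prediction" cycle can be sketched in a few lines of base R with a single artificial neuron; the inputs, weights, and bias below are made-up values for illustration:

```r
# One artificial neuron: weighted sum of the inputs, then a nonlinear activation
sigmoid <- function(z) 1 / (1 + exp(-z))
neuron  <- function(x, w, b) sigmoid(sum(w * x) + b)

# Hypothetical inputs, weights, and bias (illustrative values only)
x <- c(0.5, -1.2, 3.0)   # inputs received by the neuron
w <- c(0.8, 0.1, -0.4)   # connection weights
b <- 0.2                 # bias term
neuron(x, w, b)          # output in (0, 1), usable as a predicted probability
```

A network is just many such units arranged in layers, with the weights adjusted during training.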

Contents (4) Deep Neural Networks

  • ANNs with multiple hidden layers (see the sketch below).
  • Improve on classical ANNs.
  • Tune their internal representations automatically.
  • Handle complex tasks:
    • Computer vision
    • Natural Language Processing
    • Recommender systems
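As a sketch of what "multiple hidden layers" looks like in code, assuming the keras R package and a TensorFlow backend are installed; the layer sizes and the 10-feature input are arbitrary choices for this example:

```r
# Minimal deep feed-forward network, assuming the keras R package is available
library(keras)

model <- keras_model_sequential() %>%
  layer_dense(units = 64, activation = "relu", input_shape = c(10)) %>%  # hidden layer 1
  layer_dense(units = 32, activation = "relu") %>%                       # hidden layer 2
  layer_dense(units = 1, activation = "sigmoid")                         # output layer

# Configure the training procedure: optimizer, loss, and metric
model %>% compile(
  optimizer = "adam",
  loss = "binary_crossentropy",
  metrics = "accuracy"
)

summary(model)  # lists the layers and the number of trainable weights
```

Convolutional networks, covered at the end of this part, replace the dense layers with convolutional ones better suited to images.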


Methodology

  • Main concepts will be presented in class, using slides and the blackboard.
  • Practical applications will be demonstrated and followed using notebooks provided on the virtual campus.
  • Exercises for practice will be provided, and their solutions discussed in class.
  • Two compulsory tasks will be assigned; students complete them and submit their work by the planned deadlines.
  • Student participation is encouraged, either by presenting work in class or by contributing to the forum.

Evaluation

  • As indicated in the course guide:
  • Each part of the course counts for 50% of the final grade.
  • For each part:
    • A final exam, with a weight of 50%.
    • The remaining 50% is the average of the scores of the submitted tasks (worked example below).
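As a quick worked example of this rule (the scores are invented and the 0-10 scale is an assumption; the 50/50 weights come from the slide):

```r
# Grade for one part: 50% final exam + 50% average of the submitted tasks
part_grade <- function(exam, tasks) 0.5 * exam + 0.5 * mean(tasks)

# Hypothetical scores on a 0-10 scale
exam_score  <- 7.5
task_scores <- c(8, 9)   # the two compulsory tasks

part_grade(exam_score, task_scores)  # 0.5 * 7.5 + 0.5 * 8.5 = 8.0
```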

References and resources

References (1): Tree based methods

  • Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and regression trees. CRC Press.

  • Greenwell, B. M. (2022). Tree-based methods for statistical learning in R (1st ed.). Chapman and Hall/CRC. DOI: https://doi.org/10.1201/9781003089032. Web site

  • Efron, B., & Hastie, T. (2016). Computer age statistical inference. Cambridge University Press. Web site

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer.

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112). Springer. Web site

References (2): Neural networks

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning (Vol. 1). MIT Press. Web site

  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.

  • Chollet, F. (2018). Deep learning with Python. Manning Publications.

  • Chollet, F. (2023). Deep learning with R (2nd ed.). Manning Publications.

Online resources