Introduction to Statistical Learning
Introduction
This web page and its associated GitHub repository are intended to provide materials (slides, scripts, datasets, etc.) for some of the units[^1] of the Statistical Learning course at the UPC-UB MSc in Statistics and Operations Research (MESIO).
[^1]: Specifically, the materials on this page cover the following chapters:
- Introduction
- Tree and Ensemble Methods
- Artificial Neural Networks
The main reason for creating this page, apart from making the materials open and facilitating their hierarchical presentation, is that some labs may take time to run. Since running them in class is not an option, we provide links to the HTML files obtained by running the R or Python notebooks provided.
In addition, to make the materials completely reproducible, the page is based on a GitHub repository, which can be cloned or downloaded by anyone who wishes to play around with the materials and reproduce or improve the examples.
Introduction to Statistical Learning
The first part of the course introduces Statistical Learning and relates it to the neighbouring fields of Statistics and Machine Learning.
After this, an introduction to Supervised Learning (Prediction and Classification) and to Model Validation and Resampling (Bootstrap) concepts is presented.
- Complements
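As a small illustration of the resampling idea mentioned above, the following sketch estimates the standard error of a sample mean with the bootstrap. It is a minimal example under stated assumptions, not code from the course labs: the function name `bootstrap_se`, the number of resamples and the toy data are all hypothetical choices made for illustration.

```python
import random
import statistics


def bootstrap_se(sample, statistic=statistics.mean, n_resamples=1000, seed=0):
    """Bootstrap estimate of the standard error of `statistic` (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    # The spread of the replicates approximates the sampling variability.
    return statistics.stdev(replicates)


# Toy data, made up for this example.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
se = bootstrap_se(data)
print(f"bootstrap SE of the mean: {se:.3f}")
```

For the mean, the result can be compared with the usual formula (sample standard deviation divided by the square root of n). The appeal of the bootstrap is that the same procedure applies to statistics with no such closed form, for example by passing `statistics.median` as `statistic`.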
Tree based methods
Decision Trees
Decision trees are non-parametric classifiers that have been very successful because of their interpretability, flexibility and reasonably good accuracy.
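To make the interpretability point concrete, here is a minimal sketch of how a single split of a classification tree could be chosen on one numeric feature. It assumes Gini impurity as the split criterion (the one used by CART; the text above does not specify a criterion). The function names and the toy data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(xs, ys):
    """Return (threshold, weighted impurity) of the best rule `x <= threshold`."""
    best_t, best_score = None, float("inf")
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        # Impurity of the two children, weighted by their sizes.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Toy data, made up for this example: one feature, two classes.
x = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 8.0]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]
threshold, impurity = best_split(x, y)
print(f"rule: if x <= {threshold} predict 'a', else predict 'b'")
```

On this toy data the two classes are perfectly separated, so the chosen threshold is 4.25 and both children are pure. A full tree repeats this search recursively on each child and over all features, and the resulting set of if/else rules can be read directly.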
Ensemble methods
The term “ensemble” (“together” in French) refers to a family of approaches that build predictors by combining multiple models.
They have proved to address some limitations of single trees well, improving accuracy and robustness, reducing overfitting and capturing complex relationships.
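The following simulation gives some intuition for why combining models can improve accuracy. It is an idealised sketch, not an implementation of any specific ensemble method: it assumes each model is correct independently with the same probability, which real ensembles only approximate (bagging and random forests try to make their trees less correlated for this reason). The function name and all parameter values are hypothetical.

```python
import random


def majority_vote_accuracy(n_models, p_correct, n_trials=5000, seed=0):
    """Simulated accuracy of a majority vote among independent classifiers."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Each model is independently correct with probability p_correct.
        votes = sum(rng.random() < p_correct for _ in range(n_models))
        # The ensemble is correct when more than half of the models are.
        hits += votes > n_models / 2
    return hits / n_trials


single = majority_vote_accuracy(1, 0.6)
ensemble = majority_vote_accuracy(25, 0.6)
print(f"single model: {single:.3f}, majority vote of 25: {ensemble:.3f}")
```

Under the independence assumption, a vote among 25 models that are each right 60% of the time is right around 85% of the time. With correlated models the gain is smaller, which is why diversity among the combined models matters.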
Artificial Neural Networks
Shallow Neural Networks
These are traditional ML models, inspired by the brain, that simulate neuron behavior: they receive an input, process it and produce an output prediction.
For a long time their applicability was restricted to a few fields or problems, mainly because their “black box” functioning made them hard to interpret.
The scenario has now completely changed with the advent of deep neural networks, which underpin many powerful applications of artificial intelligence.
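The input, processing and output behavior of an artificial neuron described above can be sketched as a weighted sum followed by an activation function. The example below is a minimal illustration, assuming a sigmoid activation and a single hidden layer. The weights are hand-picked, hypothetical values rather than the result of any training.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def shallow_network(inputs, hidden_layer, output_unit):
    """One hidden layer of neurons feeding a single output neuron."""
    hidden = [neuron(inputs, w, b) for w, b in hidden_layer]
    w_out, b_out = output_unit
    return neuron(hidden, w_out, b_out)


# Hypothetical weights and biases, chosen by hand for illustration only.
hidden_layer = [([0.5, -0.4], 0.1), ([-0.3, 0.8], 0.0)]
output_unit = ([1.2, -0.7], 0.05)
prediction = shallow_network([1.0, 2.0], hidden_layer, output_unit)
print(f"network output: {prediction:.3f}")
```

In practice the weights are learned from data, typically by gradient-based optimisation, and the output in (0, 1) can be read as a predicted class probability in a binary classification problem.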
Deep Neural Networks
Essentially these are ANNs with multiple hidden layers, which allow them to overcome many of the limitations of shallow networks.
They can be tuned in a much more automatic way and have been applied to many complex tasks, such as computer vision, natural language processing or recommender systems.
References and resources
References for Tree based methods
Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and regression trees. CRC press.
Greenwell, B. M. (2022). Tree-based methods for statistical learning in R (1st ed.). Chapman and Hall/CRC. https://doi.org/10.1201/9781003089032
Efron, B., & Hastie, T. (2016). Computer age statistical inference. Cambridge University Press.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112). Springer.
References for deep neural networks
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning (Vol. 1). MIT press.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
Chollet, F. (2018). Deep learning with Python. Manning Publications.
Chollet, F. (2023). Deep learning with R. 2nd edition. Manning Publications.
Some interesting online resources
Statistical/Machine Learning in General
Decision Trees
Neural Networks
- Deep Learning and Neural Networks (Andrew Ng @ Coursera) (https://www.coursera.org/learn/neural-networks-deep-learning)
- The Neural Network Playground
This page has been created as a Quarto Website project.
To learn more about Quarto websites visit https://quarto.org/docs/websites.