Statistical Learning

A GitHub repository for the materials of the 2nd part of the Statistical Learning course at MESIO

Introduction

This GitHub repository is intended to provide materials (slides, scripts, datasets, etc.) for the 2nd part of the Statistical Learning course at the UPC-UB MSc in Statistics and Operations Research (MESIO).

A previous version of this repository can be found at https://github.com/ASPteaching/introstatlearning, but I have decided to re-create it because the repository had become too heavy due to caches and saved git history that I don’t know how to clean.

The second part of the course has two blocks, each with two parts.

  1. Tree-based methods

    1.1 Decision trees

    1.2 Ensemble methods

  2. Artificial neural networks

    2.1 Artificial neural networks

    2.2 Introduction to deep learning

Class material

All class materials are available from the repository https://aspteaching.github.io/Introduction2StatisticalLearning/.

On this page you will find links to the HTML version of the slides and other documents, as well as to datasets, references and other resources.

Decision Trees

Decision trees are non-parametric classifiers that have been very successful because of their interpretability, their flexibility and their quite decent accuracy.
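To illustrate where the interpretability comes from, here is a minimal sketch of how a single tree node could choose its splitting rule by minimising the Gini impurity. The helper names (`gini`, `best_split`) and the toy data are assumptions made for this example, not part of the course materials.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(xs, ys):
    """Return (threshold, weighted_impurity) for the best rule 'x <= threshold'.

    If no split improves on the unsplit node, the threshold is None.
    """
    best = (None, gini(ys))
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data: hours studied and exam outcome.
hours = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
outcome = ["fail", "fail", "fail", "pass", "pass", "pass"]

threshold, impurity = best_split(hours, outcome)
print(f"best rule: hours <= {threshold} (weighted Gini impurity {impurity})")
```

The resulting rule is a single readable comparison on one variable. A full tree repeats this search recursively on each side of the split.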

Ensemble methods

The term “ensemble” (“together” in French) refers to distinct approaches to building predictors by combining multiple models.

They have proved to address some limitations of trees well, improving accuracy and robustness, reducing overfitting and capturing complex relationships.
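As a rough illustration of combining multiple models, the sketch below aggregates three imperfect classifiers by majority vote. The predictions are invented for the example and were chosen so that the models err on different observations, which is the situation where voting helps most. The function names are assumptions, not a reference implementation of any particular ensemble method.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the predictions of several models for one observation."""
    return Counter(predictions).most_common(1)[0][0]

def accuracy(predicted, truth):
    """Fraction of observations predicted correctly."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

# Hypothetical predictions of three classifiers on six observations.
truth   = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1]  # wrong on the 3rd and 6th observations
model_b = [0, 0, 1, 1, 0, 0]  # wrong on the 1st observation
model_c = [1, 1, 1, 1, 0, 0]  # wrong on the 2nd observation

# One vote per model for each observation.
ensemble = [majority_vote(votes) for votes in zip(model_a, model_b, model_c)]

for name, preds in [("a", model_a), ("b", model_b), ("c", model_c), ("ensemble", ensemble)]:
    print(name, accuracy(preds, truth))
```

Because no two models fail on the same observation here, the vote corrects every individual mistake. With strongly correlated errors the gain would be much smaller.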

Artificial Neural Networks

These are traditional ML models, inspired by the brain, that simulate neuron behavior; that is, they receive an input, which is processed to produce an output prediction.

For a long time their applicability was relatively restricted to a few fields or problems, mainly because their “black box” functioning made them hard to interpret.

The scenario has now completely changed with the advent of deep neural networks, which are at the basis of many powerful applications of artificial intelligence.
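The input, processing, output scheme of an artificial neuron can be sketched in a few lines. This is a minimal example assuming a sigmoid activation. The weights and bias are hypothetical values picked by hand, whereas in practice they would be learned from data.

```python
import math

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical parameters, chosen by hand rather than learned.
output = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(output)
```

Here the weighted sum is exactly zero, so the neuron outputs 0.5, the midpoint of the sigmoid.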

Deep learning

Essentially these are ANNs with multiple hidden layers, which allows overcoming many of their limitations. They can be tuned in a much more automatic way and have been applied to many complex tasks, such as computer vision, natural language processing or recommender systems.
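To make "multiple hidden layers" concrete, the following sketch chains several fully connected layers in a forward pass. It is only an illustration under stated assumptions: the architecture (2 inputs, two hidden layers of 2 units, 1 output), the ReLU activation and all parameter values are hypothetical, and no training is performed.

```python
def relu(z):
    """Rectified linear unit: negative values become 0."""
    return max(0.0, z)

def identity(z):
    """Linear output, used here for the last layer."""
    return z

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each row of `weights` feeds one unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the inputs through every layer in order."""
    values = inputs
    for weights, biases, activation in layers:
        values = dense(values, weights, biases, activation)
    return values

# Hypothetical hand-picked parameters: 2 inputs -> 2 hidden -> 2 hidden -> 1 output.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], relu),
    ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0], relu),
    ([[1.0, 1.0]], [0.0], identity),
]

prediction = forward([3.0, 1.0], layers)
print(prediction)
```

Each layer takes the previous layer's outputs as its inputs, so adding depth only means adding entries to `layers`.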

References and resources

References for tree-based methods

References for deep neural networks

Some interesting online resources

- Decision Trees free course (9 videos), by Analytics Vidhya