**Classes**: Thursdays and Fridays 14:15 - 15:30, West Hall 8

**Contents.** This course gives an introduction to the basic concepts of statistical modeling. We bring together the two views of statistics and of machine learning. While both traditions have developed advanced statistical tools to "analyse data", the fundamental questions that are asked (and answered) differ. Stated briefly, statisticians try to *answer specific, decision-relevant questions* on the basis of data, whereas machine learners aim at *modeling complex pieces of the world as accurately and comprehensively as possible*, given data. Both views are important in the current rapid developments in "Big Data" and "Data Analytics". The course proceeds in four main parts: (i) the fundamental concepts of statistical modeling: probability spaces, observation spaces, random variables; (ii) a crash refresher on basic mathematical formulas and laws; (iii) an introduction to statistical methods (using the R programming language); (iv) an introduction to methods of machine learning (using Matlab or Python). The course will be jointly taught by a statistician (A. Wilhelm) and a machine learner (H. Jaeger), and will be highly enriched by examples, exercises and miniprojects.

**Lecture notes**

- Part 1: Face to Face with Probability: Clear Concepts, Clean Notation
- Parts 2 and 3: Introduction to Statistical Inference
- Part 4: Machine Learning in a Tiny Nutshell (version from Nov 10)

**Grading scheme**

The course grade will be computed from the following components:

1. Four miniquizzes (one at the end of each of our four theme blocks); the best three will be taken, each counting 15% toward the course grade.
2. Classroom presence: 10%
3. Homework: 20%
4. Final exam: 25%
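The weights above (3 × 15% + 10% + 20% + 25%) sum to 100%. As a minimal sketch of how the components combine, here is a hypothetical calculation in Python (the function name, inputs, and example scores are illustrative, not part of the official grading):

```python
def course_grade(quiz_scores, presence, homework, exam):
    """All inputs are percentages in [0, 100]; returns the weighted course total.

    quiz_scores: the four miniquiz scores; only the best three count, 15% each.
    """
    best_three = sorted(quiz_scores, reverse=True)[:3]
    quiz_part = sum(0.15 * q for q in best_three)
    return quiz_part + 0.10 * presence + 0.20 * homework + 0.25 * exam

# Example: quizzes 80, 60, 90, 70 -> best three are 90, 80, 70
print(course_grade([80, 60, 90, 70], presence=100, homework=85, exam=75))
# -> 81.75
```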

**Schedule (to be filled in as reality unfolds)**

| Date | Topic |
| --- | --- |
| Sep 1 | Introduction. Beginning of Part 1 (Herbert Jaeger) |
| Sep 2 | Examples of data-generating environments |
| Sep 8 | Formalizing DGEs, DRPs, DVS's by three simple-looking symbols. Products of sample spaces and RVs. Modeling stochastic processes with RVs. Exercise sheet 1 |
| Sep 9 | Formalizing stochastic processes; stopping times. Transformations of RVs. A first glimpse of sigma-fields. |
| Sep 15 | Second glimpse of sigma-fields: the Borel sigma-field. Exercise sheet 2 |
| Sep 16 | The full picture: probability spaces. Notation: how to correctly write down probability statements. Conditional probability. |
| Sep 22 | Miniquiz 1 (25 min at beginning of class, room: CNLH). Samples and estimators. |
| Sep 23 | Beginning of Parts 2 & 3 (Adi Wilhelm): distributions and random variables |
| Sep 29 | More on distributions and their characteristics. Exercise sheet 3; sample solution; some slides |
| Sep 30 | Functions of random variables |
| Oct 6 | The statistical model. Exercise sheet 4; (incomplete) sample solution |
| Oct 7 | The statistical problem |
| Oct 13 | Criteria for choosing a statistical procedure. Exercise sheet 5; sample solution |
| Oct 14 | Statistical learning |
| Oct 20 | Linear regression in a nutshell |
| Oct 21 | The mathematics of linear models |
| Oct 27 | Cross-validation |
| Oct 28 | Miniquiz 2 (25 min at beginning of class, room: Lecture Hall, Research II). Bootstrap. Miniquiz 3 data set |
| Nov 3 | Beginning of Part 4 (Herbert Jaeger). ML as learning complex, high-dimensional probability distributions. Distance surprises in high-dimensional metric spaces. Manifolds. |
| Nov 4 | Realizing complex manifold mappings by neural networks (intuitive). Subdomains of ML: statistical learning theory; symbolic learning & data mining. |
| Nov 10 | Introducing the digits dataset. Classification task. Optimality criterion: minimal misclassification rate. |
| Nov 11 | Machine learning as a special branch of statistics: ML algorithms as "statistical procedures". Curse of dimensionality. Feature extraction. |
| Nov 15 | Due date: miniproject (replacing Miniquiz 3) |
| Nov 17 | PCA: definition and algorithm |
| Nov 18 | PCA: singular values and how they relate to reconstruction accuracy. A basic learning pipeline for classifier training. Programming exercise: classifying digit images |
| Nov 24 | Refresher on the bias-variance dilemma / overfitting, cross-validation, regularization |
| Nov 25 | Feedforward neural networks: architecture, universal approximation properties |
| Dec 1 | Miniquiz 4 (venue: CNLH). Why deep neural networks work so well in principle |
| Dec 2 | Impressions from deep learning |
| Dec 16 | Pre-exam tutorial, 15:45, East Hall 8 |
| Dec 19 | Final exam, 16:00 - 18:00, Conference Hall (IRC) |