
Principles of Statistical Modeling Spring 2018 (340101)

Jacobs University Bremen, Spring 2018, Herbert Jaeger

Classes: Wed 9:45-11:00 (East Hall 4) and Fri 11:15-12:30 (East Hall 8)

Tutorial session: Tue 17:15-18:30, West Hall 4

TAs: Xu He (x.he at and Tianlin Liu (t.liu at

Contents. This course gives an introduction to the basic concepts of statistical modeling. We bring together the two views of statistics and of machine learning. While both traditions have developed advanced statistical tools to analyse data, the fundamental questions that are asked (and answered) differ. Stated briefly, statisticians try to answer specific, decision-relevant questions on the basis of data, whereas machine learners aim at modeling complex pieces of the world as accurately and comprehensively as possible, given data. Both views are important in the current fast developments in "Big Data" or "Data Analytics". The course proceeds in four main parts: (i) the fundamental concepts of statistical modeling: probability spaces, observation spaces, random variables; (ii) a crash refresher on basic mathematical formulas and laws; (iii) introduction to statistical methods; (iv) introduction to methods of machine learning. The course was developed jointly by a statistician (A. Wilhelm) and a machine learner (H. Jaeger), and will be highly enriched by examples, exercises and miniprojects.

Lecture notes

Homework. There will be two kinds of homework, which are treated quite differently. A. Paper-and-pencil problems. These homeworks give an opportunity to exercise the theoretical concepts introduced in the lecture. They will not be checked or graded, and doing them is not mandatory. Instead, the problems will be discussed and worked through in weekly tutorial sessions held by the TAs. Model solutions will be put online a week after the problem sheets are issued. B. Programming miniprojects. The other type of homework comes in the form of small programming projects. Students work in teams of two or three. Each team submits a single solution by email to the TAs, consisting of the code and documentation (a typeset PDF document, preferably generated in LaTeX; other word processing software is allowed). These miniproject homeworks will be graded. Programming can be done in Matlab or Python.

Grading. The course grade will be computed from the following components: 1. three miniquizzes written in class (30 min), of which the best two are taken, each counting 20% toward the course grade; 2. classroom presence, 10%; 3. programming homeworks, 20%; 4. final exam, 30%. All quizzes and exams are open-book.
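
As a rough illustration of the weighting above, the following Python sketch combines the components into one number. It assumes that every component is scored as a percentage between 0 and 100, which this page does not state; the function name `course_grade` and its parameters are made up for the example and are not part of any official grading tool.

```python
def course_grade(quiz_scores, presence, homework, final_exam):
    """Weighted course grade from the components listed above.

    Assumption (not stated on this page): every score is a
    percentage in the range 0-100.
    """
    if len(quiz_scores) != 3:
        raise ValueError("expected exactly three miniquiz scores")

    # Only the best two of the three miniquizzes count, 20% each.
    best_two = sorted(quiz_scores, reverse=True)[:2]

    return (
        0.20 * best_two[0]
        + 0.20 * best_two[1]
        + 0.10 * presence      # classroom presence, 10%
        + 0.20 * homework      # programming homeworks, 20%
        + 0.30 * final_exam    # final exam, 30%
    )


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(course_grade([55, 80, 70], presence=100, homework=85, final_exam=75))
```

Because the weakest miniquiz is dropped, a single bad quiz does not lower the result in this sketch; the weights sum to 100%.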

Schedule (to be filled in agreement with the unfolding of reality)

Feb 2

Feb 7 Lots of examples for probability measurement scenarios. Reading: Lecture Notes Part 1, Section 2. Exercise sheet 1
Feb 9 Elementary events and random variables. Reading: LN Section 3
Feb 14 Operations on RVs 1: products and projections. Reading: LN Section 4.1 and Appendix A
Feb 16 Operations on RVs 2: transformations of RVs. Modeling time series data by RVs. Reading: LN Sections 4.2 and 5. Exercise sheet 2
Feb 21 Events and sigma-fields. Reading: LN Section 7.1, 7.2 up to (excluding) Theorem 3.
Feb 23 More on sigma-fields. The Borel sigma-field. Generating sigma-fields. Reading: LN Section 7, to its end. Exercise sheet 3
Feb 28  
Mar 2  
Mar 7  
Mar 9  
Mar 14  
Mar 16  
Mar 21  
Mar 23  
Apr 4  
Apr 6  
Apr 11  
Apr 13  
Apr 18  
Apr 20  
Apr 25  
Apr 27  
May 2  
May 4  
May 9  
May 11  
May 16