This is a dual undergraduate and graduate course.
Class Sessions: Tuesday and Friday 15:45-17:00, West Hall 4
Contents: This course is a direct continuation of the course "Principles of Statistical Methods" held in the Fall. Machine Learning (ML) is a highly diversified field that has spawned a great variety of mathematical models and learning algorithms. Building on the understanding of basic concepts from statistical modeling gained in the Fall, this Spring course will give highlight introductions to a number of quite different methods and "ways of thinking" from ML. While a single-semester course cannot aspire to a comprehensive overview of ML, I hope at least that these highlight intros can provide "support vectors" that span a substantial portion of the field. I will place somewhat more emphasis on unsupervised learning methods than on supervised ones. Specifically, I will present the following themes: properties of n-dimensional Gaussians (plus a little linear algebra refresher), mixtures of Gaussians and related representations, the EM algorithm, sampling methods (elementary methods, the Gibbs sampler, MCMC sampling), applications of sampling (simulated annealing, Markov random fields, Hopfield networks, Boltzmann machines), Bayesian networks and message-passing algorithms, and a choice (YOUR choice) from A. recurrent neural networks, B. hidden Markov models, C. fuzzy logic, D. adaptive signal processing.
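To give a small foretaste of the mixture-of-Gaussians and EM theme named above, here is a minimal sketch of EM for a mixture of two 1D Gaussians. This is my own illustrative toy (synthetic data, my own variable names), assuming only numpy; it is not taken from the lecture notes.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic data: two Gaussian clusters at -2 and 3
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.5, 300)])

# initial parameter guesses: means, standard deviations, mixing weights
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

for _ in range(50):
    # E-step: responsibility of each component for each data point
    dens = pi[None, :] * gauss_pdf(x[:, None], mu[None, :], sigma[None, :])
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from responsibility-weighted data
    n_k = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((r * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / n_k)
    pi = n_k / len(x)
```

After a few dozen iterations the estimated means settle near the true cluster centers; the same two-step pattern (compute posteriors, then re-estimate) is what the general EM sessions will dissect.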
Lecture notes. A set of self-contained LNs will be supplied. Since I am teaching this course in this format for the first time, the LNs will grow in instalments throughout the semester. Links to these documents will be entered in the schedule below. Many parts will be adapted from my former graduate lecture notes on "Algorithmical and Statistical Modelling".
Grading and exams. The course grade is computed from classroom attendance (10%), homeworks and miniprojects (40%), the midterm exam (20%), and the final exam (30%).
Slides of a Neural Network Course (23 MB) first given at the "Interdisciplinary College" 2008
Slides of a basic ML Course (1.6 MB) first given at the "Interdisciplinary College" 2006
A condensed primer on measure theory and probability theory, by Manjunath Gandhi
An online textbook on probability theory (by Rick Durrett)
Hints for writing good miniproject reports, or rather, for avoiding typical blunders
For your exam preparation: solved exams from the somewhat similar old lecture on "Algorithmical and Statistical Modeling": final exam 2010, final exam 2007, final exam 2005, midterm 2010, midterm 2010 probability theory part, midterm 2007
The online lecture notes are self-contained, and no further literature is necessary for this course. However, if you want to study some topics in more depth, the following are recommended references.
Bishop, Christopher M.: Pattern Recognition and Machine Learning. Springer Verlag, 2006. Quite thick... (730 pages) -- more like a handbook for practitioners.
Michie, D., Spiegelhalter, D.J., Taylor, C.C.: Machine Learning, Neural and Statistical Classification (1994) Free and online at http://www.amsta.leeds.ac.uk/~charles/statlog/ and at the course resource repository. A transparently written book, concentrating on classification. Good backup reading.
T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Verlag, 2001. IRC: Q325.75 .H37 2001. I found this book only recently and haven't studied it in detail -- it looks extremely well written, combining (statistical) mathematics with applications and principal methods of machine learning, and is full of illuminating color graphics. It may become my favourite.
Schedule (this will be filled in synchrony with reality as we go along)
| ||Introduction. Geometry of multivariate Gaussians.|
|Feb 5||Estimating MoGs by gradient descent.|
|Feb 9||Estimating MoGs by EM. Lecture notes part 1 V1.2|
|Feb 12||Microproject 1|
|Feb 16||The EM algorithm in general.|
|Feb 19||Sampling basics.|
|Feb 24||Rejection sampling. Markov chain essentials.|
|Feb 26||Markov chains, cont'd. Gibbs sampler.|
|Mar 1||Metropolis sampler. Homework 2|
|Mar 4||(no class)|
|Mar 8||(no class)|
|Mar 11||(no class)|
|Mar 15||Case study: reconstructing evolutionary trees|
|Mar 18||Boltzmann distribution, simulated annealing. Homework project 3|
|Mar 29||Ising model.|
|Apr 1||Boltzmann machine. Lecture notes part 2 V1.0|
|Apr 4||(extra class 15:45, West Hall 8) Midterm rehearsal session|
|Apr 5||Midterm (written during class time)|
|Apr 8||Online adaptive signal processing: intro|
|Apr 12||(no class)|
|Apr 15||Signals: basic terminology, concepts and notation|
|Apr 18||(extra class 15:45, West Hall 8) Examples of adaptive signal processing applications. Lecture notes part 3 V0.3 (posted April 17)|
|Apr 19||The LMS algorithm - a quick derivation. Homework 4|
|Apr 22||Gradient descent on quadratic error surfaces - analytical gradient known|
|Apr 25||(extra class 15:45, West Hall 8) Stochastic gradient descent on quadratic error surfaces. Convergence properties of LMS. Miniproject 5|
|Apr 26||More on convergence properties.|
|Apr 29||Hopfield networks. Slides|
|May 3||Dynamical systems: a mini-tutorial. Slides|
|May 6||Dynamical systems, continued. Reservoir computing: basics. Slides for reservoir computing|
|May 17||Final exam (9:00 - 11:00, West Hall 4)|
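The sampling sessions in the schedule above (rejection sampling, Markov chains, Gibbs and Metropolis samplers) all revolve around one idea: drawing samples from a distribution that we can only evaluate up to a normalization constant. As a minimal sketch of that idea, here is a random-walk Metropolis sampler for a toy bimodal target; the function names and the target density are my own illustrative choices, assuming only numpy, and not taken from the lecture notes.

```python
import numpy as np

rng = np.random.default_rng(1)

def target(x):
    # unnormalized target density: a bimodal mixture with modes at -2 and +2
    return np.exp(-0.5 * (x - 2.0) ** 2) + np.exp(-0.5 * (x + 2.0) ** 2)

def metropolis(n_samples, step=1.0, x0=0.0):
    x = x0
    samples = []
    for _ in range(n_samples):
        prop = x + rng.normal(0.0, step)   # symmetric random-walk proposal
        # accept with probability min(1, target(prop) / target(x));
        # the normalization constant cancels in this ratio
        if rng.random() < min(1.0, target(prop) / target(x)):
            x = prop
        samples.append(x)                  # record current state either way
    return np.array(samples)

# draw a long chain and discard an initial burn-in segment
s = metropolis(20000)[2000:]
```

Note the design point that the lectures will elaborate: only density *ratios* are ever needed, which is what makes Metropolis-type samplers applicable to Boltzmann distributions and Markov random fields whose partition functions are intractable.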