Statistical Learning Theory

This page contains resources about Statistical Learning Theory, Computational Learning Theory, Algorithmic Learning Theory and Learning Theory in general.

Recently, many have started to refer to this field, incorrectly, as "Theoretical Machine Learning", which is a contradictory term.

Subfields and Concepts

 * Asymptotics
 * Vapnik-Chervonenkis (VC) Theory
 * VC dimension
 * Symmetrization
 * Chernoff Bounds
 * Kernel Methods
 * Support Vector Machines
 * Probably Approximately Correct (PAC) Learning
 * Empirical Risk Minimization Principle
 * Boosting
 * Estimation Theory
 * Decision Theory
 * Bayesian Decision Theory
 * Information Theory
 * Entropy
 * Kullback-Leibler (KL) Divergence
 * Information bottleneck
 * Algorithmic Information Theory
 * Kolmogorov Complexity / Algorithmic Complexity
 * Rademacher Complexity
 * Universality probability
 * Universal Turing Machine
 * Game Theory
 * Minimax Theorem
 * Blackwell's Approachability
 * Occam's razor / Occam Learning
 * Empirical Inference
 * Solomonoff's Theory of Inductive Inference
 * No Free Lunch Theorem
 * Principle of Maximum Entropy
 * Maximum Entropy (Maxent) Models / Entropic priors
 * Multinomial logistic regression / Softmax regression
 * Online Learning and Online Convex Optimization
 * Regret Bounds
 * Bregman Divergence
 * No-regret Learning
 * Online Gradient Descent
 * Online Subgradient Descent
 * Mirror Descent
 * Stochastic Gradient Descent (SGD)
 * Mini-batch Gradient Descent
 * Follow The Regularized Leader (FTRL)
 * Multi-Armed Bandit (MAB)
 * Regularization
 * L2-regularization / Tikhonov regularization / Ridge regression
 * L1-regularization / Least absolute shrinkage and selection operator (LASSO)
 * Matrix Regularization
 * Reinforcement Learning
 * Mistake bounds
 * Theory of Artificial Neural Networks
 * Representation Theorem
 * Universal Approximation Theorem

Video Lectures

 * Learning Theory by Reza Shadmehr
 * Statistical Learning Theory and Applications by Tomaso Poggio and Lorenzo Rosasco
 * Statistical Learning Theory by John Shawe-Taylor - MLSS 2004 VideoLectures.NET
 * Learning Theory by John Shawe-Taylor - MLSS 2009 VideoLectures.NET
 * Statistical Learning Theory by Olivier Bousquet - MLSS 2003 VideoLectures.NET
 * Statistical Learning Theory by Olivier Bousquet - MLSS 2007 VideoLectures.NET
 * Advanced Statistical Learning Theory by Olivier Bousquet - MLSS 2004 VideoLectures.NET
 * Introduction to Learning Theory by Olivier Bousquet - MLSS 2006 VideoLectures.NET
 * Online Learning with a Memory Harness by Shai Shalev-Shwartz - NIPS 2005 VideoLectures.NET
 * Multi-Task Learning and Matrix Regularization by Andreas Argyriou - VideoLectures.NET

Lecture Notes

 * CS229T/STATS231: Statistical Learning Theory by Percy Liang
 * CS 281B / Stat 241B: Statistical Learning Theory by Peter Bartlett and Wouter Koolen
 * Machine Learning Theory by Wouter M. Koolen, Rianne de Heide and Peter D. Grunwald
 * Statistical Learning Theory by Peter Bartlett
 * Statistical Learning Theory by Prof. Dmitry Panchenko
 * Statistical Learning Theory and Sequential Prediction by Alexander Rakhlin and Karthik Sridharan
 * Machine Learning Theory by Karthik Sridharan
 * Comp 236: Computational Learning Theory by Roni Khardon
 * CS 7545: Machine Learning Theory by Maria Florina Balcan
 * Foundations of Machine Learning by Mehryar Mohri
 * Computational and Statistical Learning Theory by Nati Srebro
 * Foundations of Machine Learning by Rob Schapire
 * Learning Theory by Sham Kakade and Ambuj Tewari
 * Introduction to Machine Learning by Shai Shalev-Shwartz
 * Statistical Learning Theory by Maxim Raginsky
 * Computational Learning Theory by Sally A Goldman
 * Introduction to Machine Learning by Amnon Shashua
 * Mathematics of Machine Learning by Philippe Rigollet
 * Introduction to Online Optimization by Sebastien Bubeck
 * Machine Learning Theory by Nicholas Harvey

Books and Book Chapters

 * Kearns, M. J. (1990). The Computational Complexity of Machine Learning. MIT Press.
 * Natarajan, B. K. (1991). Machine Learning: A Theoretical Approach. Morgan Kaufmann.
 * Kearns, M. J., & Vazirani, U. V. (1994). An Introduction to Computational Learning Theory. MIT Press.
 * Hassoun, M. H. (1995). Fundamentals of Artificial Neural Networks. MIT Press.
 * Devroye, L., Gyorfi, L., & Lugosi, G. (1997). A Probabilistic Theory of Pattern Recognition. Springer Science & Business Media.
 * Anthony, M. H. G., & Biggs, N. (1997). Computational Learning Theory. Cambridge University Press.
 * Mitchell, T. M. (1997). "Chapter 7: Computational Learning Theory". Machine Learning. McGraw Hill.
 * Vapnik, V. N. (1998). Statistical Learning Theory. New York: Wiley.
 * Vapnik, V. (1999). The Nature of Statistical Learning Theory. Springer Science & Business Media.
 * Devroye, L., & Lugosi, G. (2001). Combinatorial methods in density estimation. Springer Science & Business Media.
 * Cesa-Bianchi, N., & Lugosi, G. (2006). Prediction, Learning, and Games. Cambridge University Press.
 * Vapnik, V. (2006). Estimation of dependences based on empirical data. Springer Science & Business Media.
 * Rissanen, J. (2007). Information and complexity in statistical modeling. Springer Science & Business Media.
 * Anderson, D. R. (2008). "Section 3.2: Linking Information Theory to Statistical Theory". Model Based Inference in the Life Sciences. Springer New York.
 * Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning. 2nd Ed. New York: Springer.
 * Shalev-Shwartz, S. (2011). Online Learning and Online Convex Optimization. Foundations and Trends® in Machine Learning, 4(2), 107-194.
 * Sridharan, K. (2012). Learning From An Optimization Viewpoint. arXiv preprint arXiv:1204.4145.
 * Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2012). Foundations of Machine Learning. MIT Press.
 * Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press.
 * Du, K. L., & Swamy, M. N. (2014). Neural networks and statistical learning. Springer Science & Business Media.
 * Hazan, E. (2015). Introduction to online convex optimization. Foundations and Trends® in Optimization, 2(3-4), 157-325.
 * Odense, S. (2015). Universal approximation theory of neural networks. MSc Diss. University of Victoria.
 * Blum, A., Hopcroft, J., & Kannan, R. (2015). Foundations of Data Science.
 * Goldman, S. A. (2017). "Computational learning theory". Atallah, M. J., & Blanton, M. (Eds). Algorithms and Theory of Computation Handbook, Volume 1: General Concepts and Techniques. Chapman & Hall/CRC.

Scholarly Articles

 * Vapnik, V. N. (1999). An overview of statistical learning theory. IEEE Transactions on Neural Networks, 10(5), 988-999.
 * Webb, A. R. (2002). Statistical Pattern Recognition. 2nd Ed. John Wiley & Sons.
 * Bousquet, O., Boucheron, S., & Lugosi, G. (2004). Introduction to Statistical Learning Theory. In Advanced Lectures on Machine Learning (pp. 169-207). Springer Berlin Heidelberg.
 * Boucheron, S., Bousquet, O., & Lugosi, G. (2005). Theory of classification: A survey of some recent advances. ESAIM: Probability and Statistics, 9, 323-375.
 * Ying, Y., & Pontil, M. (2008). Online gradient descent learning algorithms. Foundations of Computational Mathematics, 8(5), 561-596.
 * Shalev-Shwartz, S. (2011). Online learning and online convex optimization. Foundations and Trends® in Machine Learning, 4(2), 107-194.
 * Sridharan, K. (2012). Learning from an optimization viewpoint. arXiv preprint arXiv:1204.4145.
 * Villa, S., Rosasco, L., & Poggio, T. (2013). On Learning, Complexity and Stability. arXiv preprint arXiv:1303.5976.
 * Bubeck, S. (2014). Convex optimization: Algorithms and complexity. arXiv preprint arXiv:1405.4980.

Tutorials

 * Stochastic Optimization by N. Srebro and A. Tewari - ICML 2010

Software

 * Minibatch learning for large-scale data, using scikit-learn - Python

Other Resources

 * Statistical Learning Theory - Metacademy
 * Learning Theory (Formal, Computational or Statistical) - Notebook
 * Statistical Learning Theory with Dependent Data - Notebook
 * Statistical Learning Theory - Notes
 * OnlinePrediction wiki
 * Introduction to Online Machine Learning : Simplified
 * I'm a bandit - Blog