Bayesian Machine Learning

This page contains resources about Bayesian Machine Learning and Bayesian Learning, including Bayesian Inference and Computational Methods for Bayesian Inference (Bayesian Computational Methods).

Bayesian Networks do not necessarily follow Bayesian Methods, but they are named after Bayes' Rule. Either Bayesian or Non-Bayesian (Frequentist) Methods can be used with them.

A distinction should be made between Models and the Methods that are applied to, or make use of, these Models.

Bayes' Rule can be used at both the parameter level (inferring the parameters of a given model) and the model level (comparing competing models).
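
As a minimal sketch of these two levels, assume a coin-flip (Bernoulli) data set, which is not part of this page. At the parameter level, a Beta prior over the heads probability is updated to a Beta posterior. At the model level, the marginal likelihoods of two candidate models are combined with prior model probabilities to give posterior model probabilities. The function names, the uniform Beta(1, 1) prior and the two models compared here are illustrative assumptions.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def posterior_params(heads, tails, a=1.0, b=1.0):
    """Parameter level: Beta(a, b) prior + Bernoulli data -> Beta posterior."""
    return a + heads, b + tails


def log_evidence_free(heads, tails, a=1.0, b=1.0):
    """log p(D | M1): heads probability unknown, integrated out under Beta(a, b)."""
    return log_beta(a + heads, b + tails) - log_beta(a, b)


def log_evidence_fair(heads, tails):
    """log p(D | M0): heads probability fixed at 0.5 (no free parameter)."""
    return (heads + tails) * log(0.5)


def posterior_model_probs(heads, tails, prior_m1=0.5):
    """Model level: Bayes' Rule over the two models M0 (fair) and M1 (free)."""
    log_m0 = log_evidence_fair(heads, tails) + log(1.0 - prior_m1)
    log_m1 = log_evidence_free(heads, tails) + log(prior_m1)
    shift = max(log_m0, log_m1)  # subtract the max for numerical stability
    w0, w1 = exp(log_m0 - shift), exp(log_m1 - shift)
    return w0 / (w0 + w1), w1 / (w0 + w1)


if __name__ == "__main__":
    print(posterior_params(9, 1))       # Beta posterior parameters
    print(posterior_model_probs(9, 1))  # (P(M0 | D), P(M1 | D))
    print(posterior_model_probs(5, 5))
```

With 9 heads and 1 tail, the posterior is Beta(10, 2) and the free-bias model receives a posterior probability of about 0.90. With 5 heads and 5 tails, the simpler fair-coin model is preferred (about 0.73), an instance of the effect often called the Bayesian Occam's Razor.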

Subfields and Concepts

 * Bayesian Model Selection / Bayesian Model Comparison
 * Bayes Factor
 * Marginal likelihood / (Bayesian Model) evidence
 * Bayesian Model Averaging (in Ensemble Learning)
 * Bayesian Information Criterion (BIC)
 * Bayesian Parameter Estimation
 * Bayesian Parametric Models
 * Bayesian Linear (Regression) Model
 * Bayesian Multivariate Linear (Regression) Model
 * Bayesian Nonparametric Models
 * Gaussian Process Regression Model
 * Bayesian Smoothing Splines
 * Bayesian Decision Theory
 * Bayesian Feature Selection / Bayesian Variable Selection
 * Spike and Slab Method
 * Kuo & Mallick
 * Gibbs Variable Selection (GVS)
 * Stochastic Search Variable Selection (SSVS)
 * Adaptive shrinkage with Jeffreys' prior or Laplace prior
 * Reversible jump MCMC
 * Approximate Inference
 * Deterministic: Variational Bayesian Inference (as Optimization)
 * Stochastic: Monte Carlo Methods in Bayesian Inference
 * Laplace Approximation
 * Black-box alpha
 * Approximate Bayesian Computation (ABC)
 * Automatic Variational ABC
 * Variational Bayes with Intractable Likelihood (VBIL)
 * Bayesian Information Theory
 * The Principle of Maximum Entropy
 * Bayesian Occam's Razor
 * Minimum Message Length (MML)
 * Minimum Description Length (MDL) principle
 * Bayesian Compression (in Deep Learning)
 * Bayesian Naive Bayes
 * Bayesian Mixture Models
 * Sparse Bayesian Models / Sparsity inducing priors / Sparsity promoting priors
 * The Spike and Slab Model (similar to L0-regularization) / Bernoulli-Gaussian prior
 * Bayesian LASSO (similar to L1-regularization) / Laplace prior
 * Bayesian Ridge Regression (similar to L2-regularization) / Gaussian prior
 * Sparse Bayesian Learning (similar to Lp-regularization, but smoother)
 * Relevance Vector Machine (RVM) - using example selection
 * Automatic Relevance Determination (ARD) - using variable selection
 * Bayesian State Space Models
 * Bayesian Linear Dynamical System
 * Bayesian Time Series
 * Bayesian Structural Time Series (BSTS)
 * Kalman filter
 * Spike and Slab Method
 * Bayesian Model Averaging
 * Probabilistic Matrix Factorization
 * Bayesian Multitask Learning
 * Bayesian Optimization
 * Bayesian Reinforcement Learning
 * Bayesian Neural Network
 * Bayesian Deep Learning
 * Bayesian Deep Reinforcement Learning
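
The correspondence listed above between Bayesian Ridge Regression, L2-regularization and a Gaussian prior can be checked numerically. The following is a minimal sketch, assuming a one-feature linear model without intercept, a known noise variance and a zero-mean Gaussian prior on the weight; the function names and toy data are illustrative. Under these assumptions the posterior mean (which is also the MAP estimate) coincides with the ridge estimate with penalty lam = noise_var / prior_var, while the Bayesian treatment additionally returns a posterior variance.

```python
def ridge_estimate(xs, ys, lam):
    """Minimiser of sum((y - w*x)**2) + lam * w**2 for a scalar weight w."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)


def gaussian_posterior(xs, ys, noise_var, prior_var):
    """Posterior N(mean, var) of w for y = w*x + noise, with w ~ N(0, prior_var)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    precision = sxx / noise_var + 1.0 / prior_var  # posterior precision
    mean = (sxy / noise_var) / precision           # posterior mean = MAP estimate
    return mean, 1.0 / precision


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]   # toy inputs (assumed)
    ys = [2.1, 3.9, 6.2]   # toy targets (assumed)
    noise_var, prior_var = 0.25, 1.0
    mean, var = gaussian_posterior(xs, ys, noise_var, prior_var)
    print(mean, var)
    print(ridge_estimate(xs, ys, lam=noise_var / prior_var))
```

A Laplace prior in place of the Gaussian would give the L1 (LASSO) penalty in the same way, although the posterior is then no longer Gaussian and has no closed form of this kind.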

Video Lectures

 * Bayesian Methods for Machine Learning - Coursera
 * Bayesian Learning by Zoubin Ghahramani - VideoLectures.Net
 * Graphical modelling and Bayesian structural learning by Peter Green - VideoLectures.Net
 * Bayesian Learning (Part 1, Part 2) and Bayesian Optimization by Nando de Freitas

Lecture Notes

 * CSC 2541: Topics in Machine Learning: Bayesian Methods for Machine Learning by Radford Neal
 * CSE 515T: Bayesian Methods in Machine Learning by Roman Garnett
 * Bayesian Methods by Francesca Dominici
 * Bayesian Inference and Analysis by Sujit K. Ghosh
 * Bayesian Statistics for Engineers by Brani Vidakovic
 * Introduction to Bayesian Analysis by Bradley P. Carlin
 * Computational methods for Bayesian statistics by Petter Mostad

Books and Book Chapters

 * Davidson-Pilon, C. (2015). Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference. Addison-Wesley Professional.
 * Koduvely, H. M. (2015). Learning Bayesian Models with R. Packt Publishing.
 * Theodoridis, S. (2015). "Section 12: Bayesian Learning". Machine Learning: A Bayesian and Optimization Perspective. Academic Press.
 * Congdon, P. (2014). Applied Bayesian Modelling. 2nd Ed. John Wiley & Sons.
 * Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2014). Bayesian Data Analysis. 3rd Ed. Chapman & Hall/CRC.
 * Lee, P. M. (2012). Bayesian Statistics: An Introduction. 4th Ed. John Wiley & Sons.
 * Barber, D. (2012). Bayesian Reasoning and Machine Learning. Cambridge University Press.
 * Duda, R. O., Hart, P. E., & Stork, D. G. (2012). Pattern Classification. John Wiley & Sons.
 * Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
 * Barber, D., Cemgil, A. T., & Chiappa, S. (2011). Bayesian time series models. Cambridge University Press.
 * Berger, J. O. (2010). Statistical decision theory and Bayesian analysis. 2nd Ed. Springer Science & Business Media.
 * Koller, D., & Friedman, N. (2009). Probabilistic Graphical Models. MIT Press.
 * Hoff, P. D. (2009). A first course in Bayesian statistical methods. Springer Science & Business Media.
 * Albert, J. (2009). Bayesian computation with R. 2nd Ed. Springer Science & Business Media.
 * Carlin, B. P., & Louis, T. A. (2008). Bayes and empirical Bayes methods for data analysis. 3rd Ed. Chapman and Hall/CRC.
 * Robert, C. (2007). The Bayesian choice: from decision-theoretic foundations to computational implementation. Springer Science & Business Media.
 * Congdon, P. (2007). Bayesian Statistical Modelling. 2nd Ed. John Wiley & Sons.
 * Koop, G., Poirier, D. J., & Tobias, J. L. (2007). Bayesian Econometric Methods. Cambridge University Press.
 * Bishop, C. M. (2006). "Section 2.2: The beta distribution". Pattern Recognition and Machine Learning. Springer.
 * Congdon, P. (2005). Bayesian Models for Categorical Data. John Wiley & Sons.
 * MacKay, D. J. (2003). "Chapter 37: Bayesian Inference and Sampling Theory". Information Theory, Inference and Learning Algorithms. Cambridge University Press.
 * Mitchell, T. M. (1997). "Chapter 6: Bayesian Learning". Machine Learning. McGraw Hill.
 * Pearl, J. (1988). "Chapter 2: Bayesian Inference". Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann.

Scholarly Articles
See also Further Reading.
 * Osawa, K., Swaroop, S., Jain, A., Eschenhagen, R., Turner, R. E., Yokota, R., & Khan, M. E. (2019). Practical Deep Learning with Bayesian Principles. arXiv preprint arXiv:1906.02506.
 * Ghosh, S., Yao, J., & Doshi-Velez, F. (2018). Structured Variational Learning of Bayesian Neural Networks with Horseshoe Priors. In Proceedings of the 35th International Conference on Machine Learning (pp. 1739-1748).
 * Louizos, C., Ullrich, K., & Welling, M. (2017). Bayesian Compression for Deep Learning. arXiv preprint arXiv:1705.08665.
 * Serra, P., & Krivobokova, T. (2017). Adaptive empirical Bayesian smoothing splines. Bayesian Analysis, 12(1), 219-238.
 * Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., & de Freitas, N. (2016). Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1), 148-175.
 * Depeweg, S., Hernández-Lobato, J. M., Doshi-Velez, F., & Udluft, S. (2016). Learning and policy search in stochastic dynamical systems with Bayesian neural networks. arXiv preprint arXiv:1605.07127.
 * Ghahramani, Z. (2015). Probabilistic machine learning and artificial intelligence. Nature, 521(7553), 452-459.
 * Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D. (2015). Weight Uncertainty in Neural Networks. In Proceedings of the 32nd International Conference on Machine Learning (pp. 1613-1622).
 * Snoek, J., Rippel, O., Swersky, K., Kiros, R., Satish, N., Sundaram, N., & Adams, R. P. (2015). Scalable Bayesian optimization using deep neural networks. In Proceedings of the 32nd International Conference on Machine Learning (pp. 2171-2180).
 * Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems (pp. 2951-2959).
 * Brochu, E., Cora, V. M., & De Freitas, N. (2010). A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599.
 * Geweke, J. (1999). Using simulation methods for Bayesian econometric models: inference, development, and communication. Econometric Reviews, 18(1), 1-73.

Tutorials

 * Heckerman's Bayes Net Learning Tutorial
 * A Brief Introduction to Graphical Models and Bayesian Networks
 * A brief introduction to Bayes' Rule
 * An Introduction to Graphical Models by Michael Jordan
 * Bayesian Modeling and Inference by Michael Jordan
 * Bayesian Modelling in Machine Learning: A Tutorial Review
 * Bayesian Methods for Machine Learning - NIPS 2004
 * Bayesian Machine Learning by Ian Murray
 * Bayesian Machine Learning by Zoubin Ghahramani
 * Dynamical Systems, Stochastic Processes and Bayesian Inference - NIPS 2016 workshop

Software

 * Bayesian Probabilistic Matrix Factorization - MATLAB
 * Bayesian Modeling and Monte Carlo Methods - MATLAB
 * Bayesian Optimization (Statistics and Machine Learning Toolbox) - MATLAB
 * Bayesian Methods for Hackers - Python
 * Infer.NET - Developed by Microsoft Research
 * OpenBUGS - Bayesian Inference Using Gibbs Sampling
 * Spearmint (older version) - Bayesian Optimization in Python
 * Edward: A library for probabilistic modeling, inference, and criticism - Python with TensorFlow
 * Edward2 - Python with TensorFlow
 * InferPy - Python with Edward
 * Pyro - Python with PyTorch
 * PyMC3 - Python
 * botorch - Python with PyTorch (Bayesian Optimization)
 * Ax - Python (Bayesian Optimization)

Other Resources

 * Bayesian Inference with PyMC3 (Part 1, Part 2, Part 3) - Python
 * A Bayesian Approach to Monitoring Process Change (Part 1, Part 2, Part 3) - Python
 * Bayesian Inference in R
 * Bayesian machine learning - Introduction
 * Bayesian machine learning - FastML
 * Tuning hyperparams automatically with Spearmint - FastML
 * Bayesian machine learning - Metacademy
 * Bayesian Statistics - Scholarpedia
 * Are "Bayesian networks" Bayesian? - No, Bayesian and Frequentist approaches can both be used.
 * Bayesian Deep Learning - NIPS 2018 workshop
 * Bayesian Methods Research Group
 * Bayesian Learning for Statistical Classification - blog post
 * Deep Learning Is Not Good Enough, We Need Bayesian Deep Learning for Safe AI - blog post
 * Deep Bayesian Neural Networks - blog post
 * “Fully Bayesian” vs “Bayesian” - Stack Exchange
 * Sparse Bayesian Models
 * MNIST For ML Beginners: The Bayesian Way
 * PPAML Summer School 2016
 * Neural Networks from a Bayesian Perspective - blog post
 * BayesianOptimization (GitHub) - code
 * Making Your Neural Network Say “I Don’t Know” — Bayesian NNs using Pyro and PyTorch
 * Estimating Probabilities with Bayesian Modeling in Python
 * Bayesian Neural Networks with Random Inputs for Model Based Reinforcement Learning - blog post