# Bayesian Parameter Estimation

## Subfields and Concepts

• Statistical Signal Processing / Estimation Theory
• Bayesian Decision Theory
• Bayesian Point Estimation
• Bayesian Score
• Maximum Likelihood Estimation (MLE)
• Asymptotics of Maximum Likelihood
• Cramér-Rao bound / Cramér-Rao lower bound
• Fisher information
• For complete (fully observed) data:
• Dirichlet distribution (or other priors)
• For incomplete (hidden/missing) data:
• Bayesian Parametric Models
• Bayesian Linear (Regression) Model
• Bayesian Multivariate Linear (Regression) Model
• Bayesian Nonparametric Models
• Bayesian Smoothing Splines
• Posterior Risk / Bayes Risk
• Posterior variance (when MSE is used)
• Bayes Risk Function / Posterior Expected Loss (i.e. Posterior Expectation Value of Loss Function)
• Posterior mean / Minimum MSE (MMSE) estimator / Bayes least squared error (BLSE) estimator / Squared error loss
• Posterior median / Median-unbiased estimator / Absolute error loss
• Posterior mode
• Bayes estimator
• MMSE / BLSE estimator
• Median-unbiased estimator
• Bayes estimator for conjugate priors (e.g. exponential family)
• Bayesian Hierarchical Modelling / Hierarchical Bayes
• Hyperparameter
• Hyperprior
• Empirical Bayes / Maximum marginal likelihood estimator (MMLE) / Evidence Approximation
• Nonparametric Empirical Bayes (NPEB)
• Parametric Empirical Bayes Point Estimation
• Full Bayes
• Uninformative priors / Noninformative priors / Maximum entropy priors
• Jeffreys prior
• Maximum Entropy (Maxent) Models / Entropic priors
• Exponential family
• Beta distribution
• Bayesian Online Parameter Estimation
• Iterative proportional fitting (IPF)
• Density Estimation (i.e. the unknown parameter is probability density itself)
• Risk Function / Expected Loss (i.e. Expectation Value of Loss Function)
• Mean integrated squared error (MISE)
• Parametric Density Estimation
• Maximum likelihood estimator (MLE)
• Bayes estimator / Bayesian Density Estimation (i.e. a distribution over distributions)
• Nonparametric Density Estimation
• Rescaled Histogram (i.e. the oldest and most naive approach)
• Parzen window / Kernel Density Estimation (KDE) / Parzen-Rosenblatt estimator
• k-Nearest Neighbors Density Estimation
• Bayesian Nonparametric Density Estimation
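
As a small illustration of several entries above (Bayes estimator for conjugate priors, Beta distribution, posterior mean, posterior mode, posterior variance), the following is a minimal sketch in Python. It assumes a Bernoulli likelihood with a Beta prior; the class name `BetaPosterior`, the Beta(2, 2) prior, and the counts of 7 successes and 3 failures are all hypothetical choices made for this example, not taken from any of the references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPosterior:
    """Beta(alpha, beta) distribution over a Bernoulli success probability."""

    alpha: float
    beta: float

    def update(self, successes: int, failures: int) -> "BetaPosterior":
        # Conjugacy: a Beta prior combined with Bernoulli data gives a Beta posterior.
        return BetaPosterior(self.alpha + successes, self.beta + failures)

    def mean(self) -> float:
        # Posterior mean: the Bayes estimator under squared error loss (MMSE).
        return self.alpha / (self.alpha + self.beta)

    def mode(self) -> float:
        # Posterior mode (MAP estimate); this closed form needs alpha > 1 and beta > 1.
        if self.alpha <= 1 or self.beta <= 1:
            raise ValueError("mode formula requires alpha > 1 and beta > 1")
        return (self.alpha - 1) / (self.alpha + self.beta - 2)

    def variance(self) -> float:
        # Posterior variance: the posterior risk of the mean under squared error loss.
        total = self.alpha + self.beta
        return self.alpha * self.beta / (total ** 2 * (total + 1))


# Hypothetical data: 7 successes and 3 failures, with an assumed Beta(2, 2) prior.
prior = BetaPosterior(2.0, 2.0)
posterior = prior.update(successes=7, failures=3)

# Maximum likelihood estimate for comparison: the raw success frequency.
mle = 7 / (7 + 3)

print(f"posterior: Beta({posterior.alpha}, {posterior.beta})")
print(f"posterior mean (MMSE): {posterior.mean():.4f}")
print(f"posterior mode (MAP):  {posterior.mode():.4f}")
print(f"posterior variance:    {posterior.variance():.4f}")
print(f"MLE:                   {mle:.4f}")
```

In this sketch the posterior mean and mode are both pulled from the MLE toward the prior mean of 0.5, which is the usual shrinkage effect of an informative prior. The posterior median (the Bayes estimator under absolute error loss) has no closed form for the Beta distribution and is omitted here.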

## Books and Book Chapters

• Theodoridis, S. (2015). "Chapter 12: Bayesian Learning". Machine Learning: A Bayesian and Optimization Perspective. Academic Press.
• Aster, R. C., Borchers, B., & Thurber, C. (2012). "Chapter 11: Bayesian Methods". Parameter Estimation and Inverse Problems. Academic Press.
• Barber, D. (2012). "Section 9.1: Learning as Inference". Bayesian Reasoning and Machine Learning. Cambridge University Press.
• Barber, D. (2012). "Chapter 18: Bayesian Linear Models". Bayesian Reasoning and Machine Learning. Cambridge University Press.
• Duda, R. O., Hart, P. E., & Stork, D. G. (2012). "Chapter 2: Bayesian Decision Theory". Pattern Classification. John Wiley & Sons.
• Murphy, K. P. (2012). "Section 5.7: Bayesian Decision Theory". Machine Learning: A Probabilistic Perspective. MIT Press.
• Koller, D., & Friedman, N. (2009). "Chapter 17: Parameter Estimation". Probabilistic Graphical Models. MIT Press.
• Theodoridis, S., Pikrakis, A., Koutroumbas, K., & Cavouras, D. (2008). "Chapter 2: Bayesian Decision Theory". Pattern Recognition. 4th Ed. Academic Press.
• Bishop, C. M. (2006). "Chapter 2: Probability Distributions". Pattern Recognition and Machine Learning. Springer.
• MacKay, D. J. (2003). "Chapter 24: Exact Marginalization". Information Theory, Inference and Learning Algorithms. Cambridge University Press.
• MacKay, D. J. (2003). "Chapter 36: Decision Theory". Information Theory, Inference and Learning Algorithms. Cambridge University Press.
• Bretthorst, G. L. (1998). Bayesian spectrum analysis and parameter estimation. Springer Science & Business Media.
• Berger, J. O. (1993). Statistical decision theory and Bayesian analysis. 2nd Ed. Springer Science & Business Media.

## Scholarly Articles

• Caticha, A. (2010). Entropic inference. arXiv preprint arXiv:1011.0723.
• Caticha, A., & Preuss, R. (2004). Maximum entropy and Bayesian data analysis: Entropic prior distributions. Physical Review E, 70(4), 046127.
• Ghahramani, Z. (2003). "Graphical models: Parameter learning". In Handbook of Brain Theory and Neural Networks.
• Malouf, R. (2002). A comparison of algorithms for maximum entropy parameter estimation. In Proceedings of the 6th Conference on Natural Language Learning, Volume 20 (pp. 1-7). Association for Computational Linguistics.