Difference between revisions of "Probability and Statistics"

From Ioannis Kourouklides
 

Latest revision as of 18:44, 10 December 2018

This page contains resources about Probability Theory and Statistics in general.

More specific information is included in each subfield.

A distinction should be made between Models and the Methods that are applied to them or that make use of them.

Subfields and Concepts

See Category:Probability and Statistics for all its subfields.

Statistical Inference / Inferential Statistics

  • Frequentist Inference
    • Statistical Hypothesis Testing / Statistical Tests
      • Fisher's Null Hypothesis Testing
      • Neyman-Pearson Theory
      • Analysis of Variance (ANOVA)
      • Analysis of Covariance (ANCOVA)
      • Multivariate Analysis of Variance (MANOVA)
      • T-test
      • F-test
      • Tests of Goodness-of-Fit
    • Confidence Intervals
    • Bootstrapping
  • Bayesian Inference
    • Bayesian Testing: Bayes Factor
    • Bayesian Confidence Sets: Credible Intervals
    • Hierarchical Bayes
    • Empirical Bayes
    • Full Bayes
  • Computational Methods for Bayesian Inference (i.e. using Algorithmic Methods)
  • Inductive inference
  • Empirical Inference
  • Causal Inference
  • Interval Estimation
  • Estimation Theory / Point Estimation
  • Sufficiency, Minimality, Completeness and Variance Reduction Techniques (VRT)
    • Gauss-Markov Theorem
    • Lehmann–Scheffé Theorem
    • Factorization Theorem
    • Complete statistic
    • Minimal sufficient statistic
    • Ancillary statistic
    • Fisher information
    • Fisher information metric / Fisher–Rao metric
    • Scoring algorithm / Fisher's scoring
    • Score function
    • Cramér–Rao bound (CRB) / Cramér–Rao lower bound (CRLB)
    • Rao–Blackwell Theorem
      • Rao–Blackwellization
      • Rao–Blackwell estimator
    • Exponential family
    • Conjugate prior family
  • Decision Theory
    • Neyman-Pearson Theory
    • The Expected Loss Principle
    • Optimal decision rules
    • Bayesian Decision Theory / Bayes estimator
    • Cost function / Loss function
    • Risk function
    • Admissibility
    • Unbiasedness
    • Minimaxity
  • Algorithmic Information Theory
    • Kolmogorov Complexity / Algorithmic Complexity
    • Algorithmic Probability / Solomonoff Probability
    • Universal Search (by Levin)
    • Algorithmic Randomness (by Martin-Löf)
    • Solomonoff's Theory of Inductive Inference
    • Epicurus' Principle of Multiple Explanations
    • Occam's Razor
    • Bayes' rule
    • Minimum Description Length (MDL) principle
    • Minimum Message Length (MML)
    • Algorithmic Statistics
  • Model Selection and Evaluation
    • Akaike Information Criterion (AIC)
    • Bayesian Information Criterion (BIC)
    • Deviance Information Criterion (DIC)
    • Bayesian Predictive Information Criterion (BPIC)
    • Focused Information Criterion (FIC)
    • Minimum Description Length (MDL)
    • Minimum Message Length (MML)
    • Akaike Final Prediction Error (FPE)
    • Parzen's Criterion Autoregressive Transfer Function (CAT)
    • Bayesian Model Selection / Bayesian Model Comparison
    • Cross-Validation
    • Statistical Hypothesis Testing (for Multilevel Models / Nested Models only)
      • Lagrange multiplier test / Score test / Score Method
      • Likelihood-ratio test
      • Wald test
    • Model Evaluation Metrics (for Classification)
      • Confusion Matrix
      • Accuracy
      • F-measure / F1-score / F-score
      • Precision
      • Recall / Sensitivity / True Positive Rate
      • Specificity / True Negative Rate
      • False Positive Rate
      • False Negative Rate
    • Model Evaluation Metrics (for Regression)
      • Mean Square Error (MSE)
      • Root MSE (RMSE)
      • Mean Absolute Error (MAE)
      • R-Squared
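
The evaluation metrics listed above all have short closed forms. The following is a minimal sketch in plain Python, assuming a binary confusion matrix given as four counts and regression targets given as equal-length lists. The function names `classification_metrics` and `regression_metrics` are hypothetical and chosen only for illustration.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Metrics derived from the four cells of a binary confusion matrix.

    Assumes every denominator is nonzero; no guarding is done here.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)           # sensitivity / true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "false_positive_rate": fp / (fp + tn),   # equals 1 - specificity
        "false_negative_rate": fn / (fn + tp),   # equals 1 - recall
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE and R-squared for paired observations and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(r) for r in residuals) / n,
        "r_squared": 1 - ss_res / ss_tot,
    }
```

For example, `classification_metrics(tp=40, fp=10, fn=20, tn=30)` gives an accuracy of 0.7 and a precision of 0.8.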

Statistical Models

  • Regression Analysis
    • Linear Regression Model
    • Simple Linear Regression
    • Multiple Linear Regression (not to be confused with Multivariate Linear Regression)
    • General Linear Model / Multivariate Linear Model
    • Generalized Linear Model (GLM or GLIM)
    • Poisson Regression
    • Negative Binomial Regression
    • Logistic Regression Model / Logit Model
    • Multinomial Logistic Regression / Softmax Regression
    • Probit Model
    • Fixed Effects Model
    • Hierarchical Linear Models / Multilevel Models / Nested Data Models
      • Random Effects Model / Variance Components Model
      • Mixed Effects Models (not to be confused with Mixture Models)
    • Nonparametric Regression Models
    • Semi-parametric Regression Models
    • Nonlinear Regression Models
    • Robust Regression Models
    • Random sample consensus (RANSAC)
    • Least Squares Methods
      • Ordinary Least Squares / Linear Least Squares
      • Weighted Least Squares
      • Nonlinear Least Squares
      • L1-regularization / Least absolute shrinkage and selection operator (LASSO) / Laplace prior
      • L2-regularization / Ridge Regression / Tikhonov Regularization / Gaussian prior
  • Probabilistic Models
  • State Space Models
    • Time Series Models
  • Reliability Engineering / Reliability Modelling
    • Survival Analysis
    • Reliability Theory
    • Risk Assessment
    • Hazard Function
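
To make the relation between Ordinary Least Squares and L2-regularization concrete, here is a minimal sketch for Simple Linear Regression with one predictor. It assumes the intercept is left unpenalized, so the ridge slope is S_xy / (S_xx + lam), and setting lam = 0 recovers OLS. The name `fit_line` and the parameter `lam` are hypothetical.

```python
def fit_line(xs, ys, lam=0.0):
    """Fit y = intercept + slope * x by least squares.

    lam is an L2 (ridge) penalty on the slope only; lam = 0 gives OLS.
    Returns (intercept, slope).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Centered sums of squares and cross-products.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / (s_xx + lam)
    intercept = y_bar - slope * x_bar
    return intercept, slope


xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
ols = fit_line(xs, ys)             # exact fit: intercept 1, slope 2
ridge = fit_line(xs, ys, lam=5.0)  # slope shrunk towards 0
```

With these data the penalty lam = 5 halves the slope, which illustrates the shrinkage that the Gaussian-prior interpretation of ridge regression describes.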

Probability Theory

  • Random Variables
    • Continuous Random Variables
      • Probability Density Function
    • Discrete Random Variables
      • Probability Mass Function
    • Jointly Distributed Random Variables
      • Joint Density Function
    • Independent Random Variables
    • Uncorrelated Random Variables
  • Moments of a distribution
    • First Moment / Mean
    • Second Central Moment / Variance
    • Third Standardized Moment / Skewness
    • Fourth Standardized Moment / Kurtosis
  • Probabilistic Models
  • Stochastic Convergence
  • Probability Space
  • Measure Space
  • State Space
  • Theorem of Total Probability
  • Central Limit Theorem
  • Conditional Probability
  • Bayesian Probability Theory
  • Frequentist Probability Theory
  • Queueing Theory
  • Martingale Theory
  • Ergodic Theory
  • Decision Theory
  • Measure Theory
  • Utility Theory
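
As a small illustration of the moments listed above, the sketch below computes them for a finite sample. It assumes the usual convention: the mean is the first raw moment, the variance is the second central moment, and skewness and kurtosis are the third and fourth central moments divided by the matching power of the standard deviation. It uses population (divide by n) formulas, and the name `sample_moments` is hypothetical.

```python
def sample_moments(xs):
    """Mean, variance, skewness and kurtosis using divide-by-n formulas.

    Assumes the data are not all equal, so the variance is nonzero.
    """
    n = len(xs)
    mean = sum(xs) / n

    def central(k):
        # k-th central moment: average of (x - mean)^k
        return sum((x - mean) ** k for x in xs) / n

    variance = central(2)
    std = variance ** 0.5
    return {
        "mean": mean,
        "variance": variance,
        "skewness": central(3) / std ** 3,
        "kurtosis": central(4) / variance ** 2,  # not excess kurtosis
    }
```

For a symmetric sample such as `[1, 2, 3, 4, 5]` the skewness is 0. The kurtosis returned here is the plain fourth standardized moment, so a normal distribution would give 3 rather than 0.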

Online Courses

Video Lectures


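Lecture Notes

  • ETC5410: Nonparametric smoothing methods by Rob J Hyndman (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.171.321&rep=rep1&type=pdf)
  • Regression III: Advanced Methods by William Jacoby (http://polisci.msu.edu/jacoby/icpsr/regress3/)
  • Statistical Theory I by Richard Lockhart (http://people.stat.sfu.ca/~lockhart/richard/830/13_3/)
  • Class Notes in Statistics and Econometrics by Hans G. Ehrbar (http://content.csbs.utah.edu/~ehrbar/ecmet.pdf)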

Books

Statistical Inference and Theory of Statistics

  • Bruce, P., & Bruce, A. (2017). Practical Statistics for Data Scientists: 50 Essential Concepts. O'Reilly Media.
  • Imbens, G. W., & Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press.
  • Ross, S. M. (2014). Introduction to probability models. 11th Ed. Academic Press.
  • Smith, R. C. (2013). Uncertainty quantification: theory, implementation, and applications. SIAM.
  • Gentle, J. E. (2013). Theory of statistics. (link)
  • DeGroot, M. H., & Schervish, M. J. (2012). Probability and statistics. 4th Ed. Pearson.
  • Abu-Mostafa, Y. S., Magdon-Ismail, M., & Lin, H. T. (2012). Learning From Data. AMLBook.
  • Diez, D. M., Barr, C. D., & Cetinkaya-Rundel, M. (2012). OpenIntro Statistics. CreateSpace.
  • Ramachandran, K. M., & Tsokos, C. P. (2012). Mathematical Statistics with Applications in R. Elsevier.
  • Liero, H., & Zwanzig, S. (2012). Introduction to the theory of statistical inference. CRC Press.
  • Wasserman, L. (2013). All of statistics: a concise course in statistical inference. Springer Science & Business Media.
  • Gentle, J. E. (2007). Matrix algebra: theory, computations, and applications in statistics. Springer Science & Business Media.
  • Rice, J. (2006). Mathematical statistics and data analysis. 3rd Ed. Duxbury Press.
  • Cox, D. R. (2006). Principles of statistical inference. Cambridge University Press.
  • Lavine, M. (2005). Introduction to Statistical Thought. Michael Lavine.
  • Young, G. A., & Smith, R. L. (2005). Essentials of statistical inference. Cambridge University Press.
  • Lehmann, E. L., & Casella, G. (2003). Theory of point estimation. Springer.
  • Bertsekas, D. P., & Tsitsiklis, J. N. (2002). Introduction to Probability. Athena Scientific.
  • Casella, G., & Berger, R. L. (2002). Statistical inference. Cengage Learning.
  • Garthwaite, P. H., Jolliffe, I. T., & Jones, B. (2002). Statistical inference. Oxford University Press.
  • Shao, J. (2000). Mathematical Statistics. Springer.
  • Mukhopadhyay, N. (2000). Probability and statistical inference. CRC Press.
  • Schervish, M. J. (1995). Theory of statistics. Springer Science & Business Media.

Regression Analysis, Reliability and Generalized Linear Models

  • Greene, W. H. (2018). Econometric analysis. 8th Ed. Pearson.
  • Harrell, F. (2015). Regression modeling strategies. 2nd Ed. Springer.
  • Kroese, D. P., & Chan, J. C. (2016). Statistical modeling and computation. Springer.
  • Chatterjee, S., & Hadi, A. S. (2012). Regression analysis by example. 5th Ed. John Wiley & Sons.
  • Kaminskiy, M. P. (2012). Reliability models for engineers and scientists. CRC Press.
  • Goldstein, H. (2010). Multilevel statistical models. 4th Ed. John Wiley & Sons.
  • Tobias, P. A., & Trindade, D. (2011). Applied reliability. 3rd Ed. CRC Press.
  • Freedman, D. A. (2009). Statistical models: theory and practice. Cambridge University Press.
  • Dobson, A. J., & Barnett, A. (2008). An introduction to generalized linear models. 3rd Ed. CRC Press.
  • Davison, A. C. (2003). Statistical models. Cambridge University Press.
  • Fox, J. (2008). Applied regression analysis and generalized linear models. 2nd Ed. Sage Publications.
  • Stapleton, J. H. (2007). Models for probability and statistical inference: theory and applications. John Wiley & Sons.
  • Li, Q., & Racine, J. S. (2007). Nonparametric Econometrics: Theory and Practice. Princeton University Press.
  • Birolini, A. (2007). Reliability engineering: theory and practice. 5th Ed. Springer.
  • Gelman, A., & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
  • Faraway, J. J. (2005). Extending the linear model with R: generalized linear, mixed effects and nonparametric regression models. CRC Press.
  • Rausand, M., & Høyland, A. (2004). System reliability theory: models, statistical methods, and applications. John Wiley & Sons.
  • Bazovsky, I. (2004). Reliability theory and practice. Courier Corporation.
  • Ruppert, D., Wand, M. P., & Carroll, R. J. (2003). Semiparametric regression. Cambridge University Press.
  • Faraway, J. J. (2002). Practical regression and ANOVA using R. (link)
  • O'Connor, P., & Kleyner, A. (2002). Practical reliability engineering. 4th Ed. John Wiley & Sons.
  • Hayashi, F. (2000). Econometrics. Princeton University Press.
  • Elandt-Johnson, R. C., & Johnson, N. L. (1999). Survival models and data analysis. John Wiley & Sons.
  • Draper, N. R., & Smith, H. (1998). Applied regression analysis. 3rd Ed. John Wiley & Sons.
  • Long, J. S., & Freese, J. (1997). Regression models for categorical dependent variables. Sage Publications.
  • Leemis, L. M. (1995). Reliability: probabilistic models and statistical methods. Prentice Hall.
  • McCullagh, P., & Nelder, J. A. (1989). Generalized linear models. CRC Press.

Counting and Probability

  • Shu, Z. (2016). Probability and Expectation (Volume 14). World Scientific.
  • Zhou, X. (2015). Counting: Math for Gifted Students. CreateSpace.
  • Hollos, S. & Hollos, J. R. (2013). Probability Problems and Solutions. Abrazol Publishing.
  • Patrick, D. (2007). Introduction to Counting and Probability. 2nd Ed. AoPS Incorporated.
  • Hamming, R. W. (1993). The Art of Probability for Scientists and Engineers. CRC Press.

Software

See List of Statistical packages for a complete list.

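See also

  • Machine Learning
  • Statistical Learning Theory
  • Statistical Signal Processing
  • Information Theory
  • Optimization
  • Computational Finance
  • Combinatorics
  • International Mathematical Olympiad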

Other Resources