# Information Theory


More specific information can be found on the pages for the individual subfields.

## Subfields and Concepts

See Category:Information Theory for some of its subfields.

• Shannon entropy / Information entropy
• Cross entropy / Joint entropy
• Conditional entropy
• Differential entropy
• Information content
• Mutual Information
• Relative entropy / Kullback–Leibler divergence / Information gain
• Perplexity
• Entropy encoding
  • Huffman coding
  • Arithmetic coding
• Algorithmic Information Theory
  • Kolmogorov Complexity / Algorithmic Complexity
  • Algorithmic Probability / Solomonoff Probability
  • Universal Search (by Levin)
  • Algorithmic Randomness (by Martin-Löf)
  • Solomonoff's Theory of Inductive Inference
    • Epicurus' Principle of Multiple Explanations
    • Occam's Razor
    • Bayes' rule
  • Universality probability
  • Universal Turing Machine
• Minimum Description Length (MDL) principle
• Minimum Message Length (MML)
• Algorithmic Statistics
• Principle of Maximum Entropy
• Hamming distance
• Hamming code
• Wavelets
• Information bottleneck
• Neural Network Compression / Model Compression
  • Node pruning
  • Weight pruning / Connection pruning
  • Quantization of weights
  • Deep Compression
  • Dynamic Network Surgery
  • SqueezeNet Architecture
  • Structured Sparsity Learning
  • Soft weight-sharing
  • Bayesian Compression
  • Variational Dropout
• Coding Theory
  • Data Compression / Source Coding
    • Lossy Compression
    • Lossless Compression
    • Shannon's Source Coding Theorem / Noiseless Coding Theorem
  • Error Correction / Channel Coding
  • Cryptographic Coding
  • Shannon–Hartley Theorem
  • Noisy-Channel Coding Theorem
  • Shannon Limit / Shannon Capacity
• Applications
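As a quick illustration of the first few entries above (Shannon entropy, cross entropy, and relative entropy), the basic quantities can be computed in a few lines. This is a minimal sketch over finite discrete distributions; the function names are illustrative, not from any particular library.

```python
import math

def shannon_entropy(probs):
    """H(p) = -sum_i p_i * log2(p_i), in bits; the 0 * log(0) term is taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log2(q_i), in bits."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """Relative entropy D(p || q) = sum_i p_i * log2(p_i / q_i), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A fair coin carries exactly 1 bit of entropy; a biased coin carries less.
print(shannon_entropy([0.5, 0.5]))   # 1.0
print(shannon_entropy([0.9, 0.1]))   # ≈ 0.469

# Identity: cross entropy = entropy + relative entropy, i.e. H(p, q) = H(p) + D(p || q).
p, q = [0.5, 0.5], [0.9, 0.1]
print(cross_entropy(p, q) - (shannon_entropy(p) + kl_divergence(p, q)))  # ≈ 0.0
```

The identity printed at the end is why cross entropy is minimized exactly when the model distribution q matches the data distribution p.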
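Huffman coding, listed under entropy encoding above, builds a prefix-free code by repeatedly merging the two least frequent symbols into a binary tree, so common symbols get shorter codewords. A minimal sketch (the function name and tree representation are illustrative):

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Map each symbol in text to a prefix-free bit string, shorter for frequent symbols."""
    freq = Counter(text)
    # Heap entries are (weight, tiebreak, tree); a tree is a symbol or a (left, right) pair.
    # The unique tiebreak integer prevents Python from ever comparing trees directly.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    if count == 1:  # degenerate case: a single distinct symbol still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

print(huffman_codes("aaaabbc"))
```

Because the code is prefix-free, a bit stream can be decoded greedily left to right; Shannon's source coding theorem bounds the achievable average codeword length from below by the entropy of the source.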
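On the channel-coding side, the Hamming code listed above is the classic example of error correction: Hamming(7,4) protects 4 data bits with 3 parity bits and can correct any single flipped bit. A compact sketch (function names and the bit-position layout are illustrative):

```python
def hamming74_encode(data):
    """Encode 4 data bits as 7 bits: parity bits at positions 1, 2, 4 (1-based)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the flipped bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

code = hamming74_encode([1, 0, 1, 1])
code[3] ^= 1  # simulate a single-bit channel error
print(hamming74_correct(code))  # [1, 0, 1, 1]
```

The syndrome trick works because each parity bit covers the positions whose binary index has the corresponding bit set, so the three parity checks spell out the error position directly.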

## Books

### Introductory

• Moser, S. M., & Chen, P. N. (2012). A Student's Guide to Coding and Information Theory. Cambridge University Press.
• Gray, R. M. (2011). Entropy and Information Theory. Springer.
• Yeung, R. W. (2008). Information Theory and Network Coding. Springer.
• Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory. John Wiley & Sons.

### Specialized

• El Gamal, A., & Kim, Y. H. (2011). Network Information Theory. Cambridge University Press.
• Merhav, N. (2010). Lecture Notes on Information Theory and Statistical Physics. Foundations and Trends® in Communications and Information Theory, 6(1–2), 1–212.
• Anderson, D. R. (2008). "Chapter 3: Information Theory and Entropy". Model Based Inference in the Life Sciences. Springer New York.
• MacKay, D. J. (2003). Information Theory, Inference and Learning Algorithms. Cambridge University Press.

## Scholarly Articles

• Louizos, C., Ullrich, K., & Welling, M. (2017). Bayesian Compression for Deep Learning. In Advances in Neural Information Processing Systems (pp. 3290-3300).
• Ullrich, K., Meeds, E., & Welling, M. (2017). Soft Weight-Sharing for Neural Network Compression. arXiv preprint arXiv:1702.04008.
• Molchanov, D., Ashukha, A., & Vetrov, D. (2017). Variational Dropout Sparsifies Deep Neural Networks. arXiv preprint arXiv:1701.05369.
• Wen, W., Wu, C., Wang, Y., Chen, Y., & Li, H. (2016). Learning Structured Sparsity in Deep Neural Networks. In Advances in Neural Information Processing Systems (pp. 2074-2082).
• Han, S., Mao, H., & Dally, W. J. (2015). Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. arXiv preprint arXiv:1510.00149.
• Steinruecken, C. (2014). Lossless Data Compression. PhD Diss., University of Cambridge.
• Alajaji, F., & Chen, P. N. (2013). Lecture Notes in Information Theory: Part I.
• Tishby, N., Pereira, F. C., & Bialek, W. (2000). The Information Bottleneck Method. arXiv preprint physics/0004057.