
Reinforcement Learning
This page contains resources about Reinforcement Learning.
Subfields and Concepts
- Action space
- Discrete
- Continuous (usually handled by Actor-Critic Methods)
- Multi-Armed Bandit
- Finite Markov Decision Process (MDP)
- Partially Observable MDP (POMDP)
- Model-based RL (i.e. model the environment)
- Model-free RL
- Value-based Methods
- Temporal-Difference (TD) Learning
- SARSA
- Q-Learning
- Deep Q-learning / Deep Q Network (DQN)
- Double Q-learning / Double DQN
- Dueling DQN
- Policy-based methods / Policy Optimization
- (Pure) Policy Gradient Methods
- Score function estimator / REINFORCE
- Monte Carlo Policy Gradient
- Bayesian Policy Gradient
- Trust Region Policy Optimization (TRPO)
- Actor-Critic Methods (i.e. combination of Value-based and Policy-based Methods)
- Advantage-Actor-Critic (A2C)
- Asynchronous Advantage-Actor-Critic (A3C)
- Soft Actor Critic (SAC)
- Neural Fitted Q Iteration with Continuous Actions (NFQCA)
- Deterministic Policy Gradient (DPG)
- Deep DPG (DDPG)
- Twin Delayed DDPG (TD3)
- Proximal Policy Optimization (PPO)
- Evolutionary Algorithms
- Cross Entropy Method (CEM)
- Covariance Matrix Adaptation (CMA)
- Genetic Algorithms
- Adaptive Dynamic Programming
- Deep Reinforcement Learning
- Deep Q-learning / Deep Q Network (DQN)
- Deep Recurrent Q-Network (DRQN)
- Deep Soft Recurrent Q-Network (DSRQN)
- Double Q-learning / Double DQN (DDQN)
- Proximal Policy Optimization (PPO)
- Multi-Agent Reinforcement Learning (MARL)
- Connectionist Reinforcement Learning
- Score function estimator / REINFORCE
- Variance Reduction Techniques (VRT) for gradient estimates
- Inverse Reinforcement Learning
- On-policy Learning
- Temporal-Difference (TD) Learning
- SARSA
- (Pure) Policy Gradient Methods
- Off-policy Learning
- Q-Learning
- Exploration vs. Exploitation problem
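The on-policy / off-policy split and the exploration vs. exploitation trade-off listed above can be illustrated with a minimal tabular sketch. It is not taken from any of the resources on this page: the chain environment, the function names (`epsilon_greedy`, `td_target`, `step`, `train`) and all hyperparameter values are hypothetical choices made for illustration. The only difference between the two learners is the bootstrapped target: Q-Learning uses the maximum over next actions, SARSA uses the next action actually selected by the behaviour policy.

```python
import random

def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])

def td_target(q, reward, next_state, next_action, gamma, done, off_policy):
    """One-step TD target: max over actions (Q-Learning) or the sampled action (SARSA)."""
    if done:
        return reward
    if off_policy:
        return reward + gamma * max(q[next_state])
    return reward + gamma * q[next_state][next_action]

def step(state, action, n_states):
    """Hypothetical chain: action 1 moves right, action 0 moves left.
    Reaching the last state yields reward 1 and ends the episode."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done

def train(off_policy, n_states=5, episodes=500, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    """Tabular TD control; returns the learned Q-table as a list of [left, right] rows."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        action = epsilon_greedy(q[state], epsilon, rng)
        while not done:
            next_state, reward, done = step(state, action, n_states)
            # The behaviour policy is epsilon-greedy for both learners.
            next_action = epsilon_greedy(q[next_state], epsilon, rng)
            target = td_target(q, reward, next_state, next_action,
                               gamma, done, off_policy)
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
    return q

if __name__ == "__main__":
    for name, flag in (("Q-Learning", True), ("SARSA", False)):
        table = train(off_policy=flag)
        print(name, [[round(v, 3) for v in row] for row in table])
```

Running the script prints both Q-tables, so the effect of the two targets can be compared side by side. Setting `epsilon` to 0 removes exploration entirely, which is a quick way to see why some exploration is usually needed on less trivial problems.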
Online Courses
Video Lectures
- Practical Reinforcement Learning - Coursera
- Machine Learning and Reinforcement Learning in Finance Specialization by Igor Halperin - Coursera
- Reinforcement Learning by Pascal Poupart
Lecture Notes
- Reinforcement Learning by David Silver
- Reinforcement Learning by Michael Herrmann
- Deep Reinforcement Learning by Sergey Levine
- Deep Reinforcement Learning and Control by Katerina Fragkiadaki and Ruslan Salakhutdinov
- Reinforcement Learning by Mario Martin
Books and Book Chapters
- Lapan, M. (2018). Deep Reinforcement Learning Hands-On. Packt Publishing.
- Dutta, S. (2018). Reinforcement Learning with TensorFlow. Packt Publishing.
- Ravichandiran, S. (2018). Hands-On Reinforcement Learning with Python. Packt Publishing.
- Russell, S. J., & Norvig, P. (2010). "Chapter 21: Reinforcement Learning". Artificial Intelligence: A Modern Approach. Prentice Hall.
- Alpaydin, E. (2010). "Chapter 18: Reinforcement Learning". Introduction to Machine Learning. MIT Press.
- Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press. (draft)
- Mitchell, T. M. (1997). "Chapter 13: Reinforcement Learning". Machine Learning. McGraw Hill.
Scholarly Articles
- Bard, N. ... (2019). The Hanabi Challenge: A New Frontier for AI Research. arXiv preprint arXiv:1902.00506.
- Arulkumaran, K., Deisenroth, M. P., Brundage, M., & Bharath, A. A. (2017). A brief survey of deep reinforcement learning. arXiv preprint arXiv:1708.05866.
- Lanctot, M., Zambaldi, V., Gruslys, A., Lazaridou, A., Perolat, J., Silver, D., & Graepel, T. (2017). A unified game-theoretic approach to multiagent reinforcement learning. In Advances in Neural Information Processing Systems (pp. 4193-4206).
- Foerster, J., Assael, I. A., de Freitas, N., & Whiteson, S. (2016). Learning to communicate with deep multi-agent reinforcement learning. In Advances in Neural Information Processing Systems (pp. 2137-2145).
- Ghavamzadeh, M., Mannor, S., Pineau, J., & Tamar, A. (2015). Bayesian reinforcement learning: A survey. Foundations and Trends® in Machine Learning, 8(5-6), 359-483.
- Szepesvari, C. (2010). Algorithms for Reinforcement Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 4(1), 1-103.
- Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 38(2).
- Panait, L., & Luke, S. (2005). Cooperative multi-agent learning: The state of the art. Autonomous agents and multi-agent systems, 11(3), 387-434.
- Bowling, M., & Veloso, M. (2000). An analysis of stochastic game theory for multiagent reinforcement learning (No. CMU-CS-00-165). Carnegie Mellon University, School of Computer Science.
- Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research, 4, 237-285.
- Harmon, M. E., & Harmon, S. S. (1996). Reinforcement Learning: A Tutorial.
- Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8(3-4), 229-256.
Tutorials
Software
- PyBrain - Python
- OpenAI Gym - A toolkit for developing and comparing Reinforcement Learning algorithms
- Reinforcement-Learning-Toolkit - Python
- DeepRL - Python
- coach - Python
- Griduniverse - Python
- Retro (formerly Universe) - Python
- RoomAI - Python
- ViZDoom - Python, C++, Lua, Java and Julia
- tensorforce - Python and TensorFlow
- keras-gym - Python, TensorFlow and Gym
- spinningup - Python, TensorFlow and Gym
See also
Other resources
- Awesome-RL (GitHub) - A curated list of Reinforcement Learning resources
- awesome-deep-rl (GitHub) - A curated list of Deep Reinforcement Learning resources
- awesome-rl (GitHub) - A curated list of Deep Reinforcement Learning resources
- Practical_RL (GitHub) - A course in reinforcement learning in the wild
- Deep RL Bootcamp - Videos and slides
- Software Tools for RL, ANNs and Robotics - Python and MATLAB
- Reinforcement Learning - blog post
- Learning Diverse Skills via Maximum Entropy Deep Reinforcement Learning - blog post
- I’m a bandit - Random topics in optimization, probability, and statistics. By Sébastien Bubeck - blog
- Simple Reinforcement Learning with Tensorflow (Part 0 to 9) - blog posts
- Paper Collection of Multi-Agent Reinforcement Learning (MARL)
- Practical_RL - GitHub
- AgentNet - GitHub
- DataLab Cup 5: Deep Reinforcement Learning
- Reinforcement learning tutorial using Python and Keras - blog post
- Reinforcement Learning w/ Keras + OpenAI: Actor-Critic Models - blog post
- Deep Q-Learning with Keras and Gym - blog post
- deep-q-learning - GitHub
- keras-rl - GitHub
- Reinforcement Q-Learning from Scratch in Python with OpenAI Gym - blog post with code
- A (Long) Peek into Reinforcement Learning - blog post
- trfl (GitHub) - code
- multiworld (GitHub) - code
- Deep Reinforcement Learning Workshop at NIPS 2018
- Spinning Up in Deep RL - educational resource produced by OpenAI
- Horizon (GitHub) - code
- Unsupervised Neural Networks Fight in a Minimax Game - blog post
- Cases for Applying Multi-Agent Reinforcement Learning - blog post
- surreal (GitHub) - code
- robotics-rl-srl (GitHub) - code
- Deep Reinforcement Learning with TensorFlow 2.0 - blog post
- rllab (GitHub) - code
- garage (GitHub) - code
- metaworld (GitHub) - code
- RLBench (GitHub) - code
- stable-baselines (GitHub) - code
- RLzoo (GitHub) - code
- RLlib - code