Preface
Chapter 1: What is Reinforcement Learning?
Learning - supervised, unsupervised, and reinforcement
RL formalisms and relations
Reward
The agent
The environment
Actions
Observations
Markov decision processes
Markov process
Markov reward process
Markov decision process
Summary
Chapter 2: OpenAI Gym
The anatomy of the agent
Hardware and software requirements
OpenAI Gym API
Action space
Observation space
The environment
Creation of the environment
The CartPole session
The random CartPole agent
The extra Gym functionality - wrappers and monitors
Wrappers
Monitor
Summary
Chapter 3: Deep Learning with PyTorch
Tensors
Creation of tensors
Scalar tensors
Tensor operations
GPU tensors
Gradients
Tensors and gradients
NN building blocks
Custom layers
Final glue - loss functions and optimizers
Loss functions
Optimizers
Monitoring with TensorBoard
TensorBoard 101
Plotting stuff
Example -GAN on Atari images
Summary
Chapter 4: The Cross-Entropy Method
Taxonomy of RL methods
Practical cross-entropy
Cross-entropy on CartPole