Course Notes: UCL Reinforcement Learning

Posted on 2017-09-09 | Category: Course Notes

Notes on David Silver's reinforcement learning course at UCL.

Mind maps
Intro to RL
MDP
Planning by DP
Model-Free Prediction
Model-Free Control
Value Function Approximation
Policy Gradient
Integrating Learning and Planning
Exploration and Exploitation

References
Reinforcement learning course
Reinforcement learning textbook