Reinforcement Learning by David Silver, Notes 2: Markov Decision Processes
2017-12-11 17:00
Markov Process
Markov Reward Process
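Solving the MRP Bellman equation directly has time complexity O(N^3) in the number of states N, so direct computation is practical only for small MRPs. For large MRPs, use iterative methods instead: dynamic programming, Monte-Carlo evaluation, and temporal-difference learning.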
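As a rough illustration of the trade-off between direct and iterative solutions, the sketch below solves a tiny MRP both ways. It assumes the standard matrix form of the Bellman equation, v = R + gamma * P v. The function names, the 2-state transition matrix `P`, the reward vector `R`, and the discount `gamma` are all made up for the example, and the iterative routine is only the simplest dynamic-programming style backup, not Monte-Carlo or TD learning.

```python
import numpy as np

def solve_mrp_direct(P, R, gamma):
    """Solve (I - gamma * P) v = R with a dense linear solve, O(N^3) in the number of states."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, R)

def solve_mrp_iterative(P, R, gamma, tol=1e-10, max_iters=10_000):
    """Repeatedly apply the Bellman backup v <- R + gamma * P v until it stops changing."""
    v = np.zeros_like(R, dtype=float)
    for _ in range(max_iters):
        v_new = R + gamma * P @ v
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v

# Hypothetical 2-state MRP (values chosen only for illustration).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
R = np.array([1.0, -1.0])
gamma = 0.9

v_direct = solve_mrp_direct(P, R, gamma)
v_iter = solve_mrp_iterative(P, R, gamma)
```

Each backup costs O(N^2) for a dense `P`, which is why iterating can be preferable to a single O(N^3) solve when N is large.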
Markov Decision Process (Markov reward process with decisions)
A policy is a distribution over actions given states. Given an MDP and a policy, the state sequence is a Markov process, and the state and reward sequence is a Markov reward process.
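To make the reduction from an MDP plus a policy to an MRP concrete, here is a minimal sketch that averages the MDP dynamics and rewards over the policy. The array layout (`P[a, s, s']`, `R[s, a]`, `pi[s, a]`), the helper name `induced_mrp`, and the 2-state, 2-action numbers are assumptions made for this example.

```python
import numpy as np

def induced_mrp(P, R, pi):
    """Average an MDP over a policy to get the MRP it induces.

    P  : shape (A, S, S), P[a, s, t] = probability of moving s -> t under action a
    R  : shape (S, A),    R[s, a]    = expected reward for taking a in s
    pi : shape (S, A),    pi[s, a]   = probability of choosing a in s
    """
    # P_pi[s, t] = sum_a pi[s, a] * P[a, s, t]
    P_pi = np.einsum("sa,ast->st", pi, P)
    # R_pi[s] = sum_a pi[s, a] * R[s, a]
    R_pi = np.sum(pi * R, axis=1)
    return P_pi, R_pi

# Hypothetical MDP: action 0 stays in place, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 1.0],
              [2.0, 0.0]])
pi = np.full((2, 2), 0.5)  # uniform random policy

P_pi, R_pi = induced_mrp(P, R, pi)
```

Under the uniform policy, every row of `P_pi` is still a probability distribution, so the pair `(P_pi, R_pi)` can be handed to any MRP solver.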
The state-value function of an MDP is the expected return starting from state s and then following policy π.
The action-value function is the expected return starting from state s, taking action a, and then following policy π.
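The two value functions are linked: averaging the action values over the policy gives back the state value. The sketch below checks this numerically on a toy MDP. It assumes a discount factor `gamma` and the array layout `P[a, s, s']`, `R[s, a]`, `pi[s, a]`; the function name `evaluate_policy` and all numbers are made up for the example, and the exact linear solve is only one of several ways to evaluate a policy.

```python
import numpy as np

def evaluate_policy(P, R, pi, gamma):
    """Return (v, q) for a fixed policy by solving the induced MRP exactly.

    P  : shape (A, S, S), P[a, s, t] = transition probability s -> t under a
    R  : shape (S, A),    R[s, a]    = expected immediate reward
    pi : shape (S, A),    pi[s, a]   = probability of choosing a in s
    """
    n_states = R.shape[0]
    P_pi = np.einsum("sa,ast->st", pi, P)   # policy-averaged transitions
    R_pi = np.sum(pi * R, axis=1)           # policy-averaged rewards
    # State values: v = R_pi + gamma * P_pi v
    v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
    # Action values: q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * v[t]
    q = R + gamma * np.einsum("ast,t->sa", P, v)
    return v, q

# Hypothetical MDP: action 0 stays in place, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 1.0],
              [2.0, 0.0]])
pi = np.array([[0.2, 0.8],
               [0.7, 0.3]])
gamma = 0.9

v, q = evaluate_policy(P, R, pi, gamma)
```

Here `v[s]` equals `sum_a pi[s, a] * q[s, a]` for every state, which is the consistency condition between the two definitions.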