Discounted Semi-Markov Decision Processes with Nonnegative Costs

Yong Hui HUANG, Xian Ping GUO

Acta Mathematica Sinica, Chinese Series, 2010, Vol. 53, Issue 3: 503-514. DOI: 10.12386/A2010sxxb0056

Abstract

This paper studies discounted semi-Markov decision processes with countable state spaces and nonnegative costs. We first construct a continuous-time semi-Markov decision process under a given semi-Markov decision kernel and policy. Then, using a minimal nonnegative solution approach, we prove that the value function satisfies the optimality equation and that ε-optimal stationary policies exist. We further give conditions for the existence of optimal policies, together with some of their properties. Finally, we develop a value iteration algorithm for computing the value function and give a numerical example.
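The abstract names a value iteration algorithm but does not spell it out. As a rough illustration only, the sketch below shows value iteration for a finite discounted semi-Markov decision model with nonnegative costs. It is not the authors' algorithm. Every name here (`value_iteration`, `exponential_model`, `cost`, `beta`, the two-state data) is hypothetical. It assumes an optimality equation of the common form V(i) = min_a { c(i,a) + Σ_j β(i,a,j) V(j) }, where c(i,a) is the expected discounted cost over one sojourn and β(i,a,j) is the expected discount factor on a jump to j. Iteration starts from V = 0, so with nonnegative costs the iterates increase monotonically.

```python
from typing import Dict, List, Tuple

State = int
Action = int
Pair = Tuple[State, Action]


def q_value(i: State, a: Action, v: Dict[State, float],
            cost: Dict[Pair, float],
            beta: Dict[Pair, Dict[State, float]]) -> float:
    """One-step lookahead: c(i,a) + sum_j beta(i,a,j) * v(j)."""
    return cost[(i, a)] + sum(b * v[j] for j, b in beta[(i, a)].items())


def value_iteration(states: List[State],
                    actions: Dict[State, List[Action]],
                    cost: Dict[Pair, float],
                    beta: Dict[Pair, Dict[State, float]],
                    tol: float = 1e-10,
                    max_iter: int = 10_000):
    """Iterate v <- T v from v = 0 (assumed form of the optimality equation)."""
    v = {i: 0.0 for i in states}  # start at zero: iterates are nondecreasing
    for _ in range(max_iter):
        new_v = {i: min(q_value(i, a, v, cost, beta) for a in actions[i])
                 for i in states}
        diff = max(abs(new_v[i] - v[i]) for i in states)
        v = new_v
        if diff < tol:
            break
    # Greedy stationary policy with respect to the final iterate.
    policy = {i: min(actions[i], key=lambda a: q_value(i, a, v, cost, beta))
              for i in states}
    return v, policy


def exponential_model(alpha: float,
                      rate: Dict[Pair, float],
                      prob: Dict[Pair, Dict[State, float]],
                      cost_rate: Dict[Pair, float]):
    """Assumed special case: Exp(rate) sojourn times and constant cost rates."""
    cost, beta = {}, {}
    for (i, a), lam in rate.items():
        discount = lam / (alpha + lam)  # E[exp(-alpha*T)] for T ~ Exp(lam)
        cost[(i, a)] = cost_rate[(i, a)] / (alpha + lam)  # E of discounted cost
        beta[(i, a)] = {j: discount * p for j, p in prob[(i, a)].items()}
    return cost, beta


# Hypothetical two-state, two-action data (not the paper's numerical example).
states = [0, 1]
actions = {0: [0, 1], 1: [0, 1]}
rate = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.5, (1, 1): 1.5}
prob = {(0, 0): {0: 0.5, 1: 0.5}, (0, 1): {0: 0.9, 1: 0.1},
        (1, 0): {0: 0.2, 1: 0.8}, (1, 1): {0: 0.7, 1: 0.3}}
cost_rate = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 4.0, (1, 1): 6.0}

cost, beta = exponential_model(0.1, rate, prob, cost_rate)
v, policy = value_iteration(states, actions, cost, beta)
```

In this sketch the stopping rule relies on the discount factors being strictly below 1, which holds for the exponential special case with a positive discount rate. A countable state space, as treated in the paper, would need truncation or a different convergence argument.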

 

Keywords

Semi-Markov decision processes / discounted cost / optimal policy

Cite this article

Yong Hui HUANG, Xian Ping GUO. Discounted Semi-Markov Decision Processes with Nonnegative Costs. Acta Mathematica Sinica, Chinese Series, 2010, 53(3): 503-514 https://doi.org/10.12386/A2010sxxb0056


Funding

Supported by the National Natural Science Foundation of China (Grant Nos. 60874004, 10925107)
