
Discounted Semi-Markov Decision Processes with Nonnegative Costs
Semi-Markov decision processes / discounted cost / optimal policy
Supported by the National Natural Science Foundation of China (60874004, 10925107)