1) unbounded vector-valued reward
无界报酬向量
2) unbounded reward
无界报酬
1.
This paper investigates a non-stationary discounted Markov decision model with unbounded rewards, in which the discount factor β_t depends on the state and the action taken at the previous stage, thereby generalizing the Markov decision model with a constant discount factor. Under certain assumptions, the optimality equations are established and the existence of an ε-optimal Markov policy is proved.
讨论了无界报酬非时齐折扣马氏决策模型,且折扣因子βt依赖于前一阶段所处的状态和采取的行动,从而推广了常数折扣因子的马氏决策模型,在一定的假设下,得到了最优方程,证明了存在ε-最优马氏策略。
2.
This paper first investigates continuous-time Markov decision processes with unbounded rewards and non-uniformly bounded transition rates under the discounted criterion.
文中引入了一类新的无界报酬函数,在一类新的马氏策略中,讨论了最优策略的存在性及其结构,除证明了在有界报酬和一致有界转移速率族下成立的主要结果外,本文还得到一些重要结论。
3) vector of return
报酬向量
5) gratuitous bailment
无报酬委托
6) no cure no pay
无效果无报酬
补充资料:发光地寄色界无色界天乘
【发光地寄色界无色界天乘】
谓三地菩萨,明修八禅定行,同于色界四禅,无色界四空处,故云发光地寄色无色界天乘。(八禅定者,色界、无色界各四禅定也。四禅者,初禅、二禅、三禅、四禅也。四空者,即空处、识处、无所有处、非非想处也。)