Optimally solving Markov decision processes with total expected discounted reward function: Linear programming revisited |
Abstract:
We compare the computational performance of linear programming (LP) and the policy iteration algorithm (PIA) for solving discrete-time infinite-horizon Markov decision process (MDP) models with total expected discounted reward. We use randomly generated test problems as well as a real-life health-care problem to empirically show that, unlike previously reported, barrier methods for LP provide a viable tool for optimally solving such MDPs. The dimensions of comparison include transition probability matrix structure, state and action size, and the LP solution method.
|
Keywords: |
Markov decision process; MDP; Linear programming; Policy iteration; Total expected discounted reward; Treatment optimization |
Author(s): |
Oguzhan Alagoz, Mehmet U.S. Ayvaci, Jeffrey T. Linderoth |
Source: |
Computers & Industrial Engineering, Volume 87, September 2015, Pages 311–316 |
Subject: |
Operations research |
Category: |
Translated articles - Download article translation |
Release Date: |
2015 |
No. of Pages: |
6 |
Price (Tomans): |
0 |