Tutte Seminar Series, Winter 2013

Friday, February 1, 2013 3:30 pm, MC 5158	Tutte Seminar Series Combinatorics & Optimization Winter 2013
Ian Post University of Waterloo
The performance of the simplex method on deterministic Markov decision processes
Motivated by the search for strongly-polynomial algorithms for linear programming (an algorithm whose runtime depends only on the number of constraints and variables, not the sizes of the numbers involved), we analyze the performance of the simplex method on Markov decision processes (MDPs), a class of LPs that is "close" to being strongly-polynomial but beyond the reach of existing techniques. We prove that for deterministic MDPs the simplex method with the highest gain/most-negative-reduced cost pivoting rule converges in strongly-polynomial time, regardless of the discount factor of the MDP and even when each action has a distinct discount. Previously the simplex method was known to run in polynomial time only for discounted MDPs where the discount was bounded away from 1. Unlike in the discounted case the algorithm does not greedily converge to the optimum, and we require a more complex measure of progress. We identify a set of layers in which the values of variables must lie and show that the simplex method always makes progress optimizing the variables in one layer. Joint work with Yinyu Ye.

Ian Post University of Waterloo