POMDP and MDP

Algorithms:

  • Structure Policy Iteration
    • (Boutilier, Dearden & Goldszmit,195)
  • Value Directed Compression (VDC)
    • This algorithm is based on the generation of compressed state space which is still a sufficient statistic for optimal control. Starting from the transition and reward matrices it produces a reformulation in terms of the compressed beliefes state which can then be solved using other approaches.
  • Boundend Policy Iteration (BPI)
  • Bounded Policy Iteration over Valuer Directed Compression (VDCBPI)
  • Point-based Value Iteration (PBVI)
  • Gradient Ascent (GA)
  • Stochastic Local Search (SLS)
  • RT-Bel
  • Real-time Belief Space Search, RTBSS (Paquet et al., 2005)
    • Online algorithm that uses branch and bound to search the state space, The lower bound is computed offline using some approximate method and used to discard sub-optimal actions.
  • Witness
  • AEMS (Ross & Chaib-draa,2007)
  • Perseus
  • Heuristic belief space search, Satia And Lave, 1973
  • BI-POMDP (Washington,1997)
  • Pointbased Policy Iteration (PBPI) 2007
  • Mohan Enhumeration Algorithm

Typical Benchmark Tasks

  • RockSample
    • Smith & Simmons 2004
  • Tag
    • Pineau, J et al. 2003
  • Maze
    • Poupart, P. & Boutilier, C.. 2004
  • tiger-grid,
  • hallway
  • hallway2
  • Widget processing
    • Hansed & Feng 2000
  • coffee problem
    • Hansed & Feng 2000
    • Boutiler & Poole 1996
  • Mountain Car ( Munos and Moore, 2002)
  • DAP
  • Sys Admin ( Guestrin et al. 2003)

Keywords

  • sequential decision making
  • partial observability
  • stochastic environment
  • policy
  • belief state
  • markov decision process
  • probabilistic planning
  • perceptual aliasing
  • Factored representation
  • decomposition
  • abstraction
  • use of prior knowledge
  • context
  • mode
  • hidden state
  • macro-action
  • priorization
  • backup
  • contraction
  • value function

People in the field:

Conferences And Workshops:

Links:

Open Problems:

  • Computational complexity
  • Generalization
  • Efficient representation of uncertainty