POMDP and MDP
Online source code:
Online source code:
- Pascal Poupart symbolic perseus (approximate, point based,
- Approximate POMPD Planning toolkit
- Trey Smith's ZMDP Software for POMDP and MDP Planning
- Toni Cassandra pomdp solver
- Matthijs Spaan Perseus
- Tarek Taha POMDP tools
- Caylus open source library for MDP
- RL-POMDP
- POMDPToolBox matlab
- Symbolic HSVI
- POMDP in continuos space (Porta, Vlassis et al.)
- Stochastic Planning using Decision Diagrams (Hoey)
Algorithms:
Algorithms:
- Structure Policy Iteration
- (Boutilier, Dearden & Goldszmit,195)
- Value Directed Compression (VDC)
- This algorithm is based on the generation of compressed state space which is still a sufficient statistic for optimal control. Starting from the transition and reward matrices it produces a reformulation in terms of the compressed beliefes state which can then be solved using other approaches.
- Boundend Policy Iteration (BPI)
- Bounded Policy Iteration over Valuer Directed Compression (VDCBPI)
- Point-based Value Iteration (PBVI)
- Gradient Ascent (GA)
- Stochastic Local Search (SLS)
- RT-Bel
- Real-time Belief Space Search, RTBSS (Paquet et al., 2005)
- Online algorithm that uses branch and bound to search the state space, The lower bound is computed offline using some approximate method and used to discard sub-optimal actions.
- Witness
- AEMS (Ross & Chaib-draa,2007)
- Perseus
- Heuristic belief space search, Satia And Lave, 1973
- BI-POMDP (Washington,1997)
- Pointbased Policy Iteration (PBPI) 2007
- Mohan Enhumeration Algorithm
Typical Benchmark Tasks
Typical Benchmark Tasks
- RockSample
- Smith & Simmons 2004
- Tag
- Pineau, J et al. 2003
- Maze
- Poupart, P. & Boutilier, C.. 2004
- tiger-grid,
- hallway
- hallway2
- Widget processing
- Hansed & Feng 2000
- coffee problem
- Hansed & Feng 2000
- Boutiler & Poole 1996
- Mountain Car ( Munos and Moore, 2002)
- DAP
- Sys Admin ( Guestrin et al. 2003)
Keywords
Keywords
- sequential decision making
- partial observability
- stochastic environment
- policy
- belief state
- markov decision process
- probabilistic planning
- perceptual aliasing
- Factored representation
- decomposition
- abstraction
- use of prior knowledge
- context
- mode
- hidden state
- macro-action
- priorization
- backup
- contraction
- value function
People in the field:
People in the field:
Simple source code for students, mainly MDP:
Simple source code for students, mainly MDP:
- Eric A. Hansen
- Milos Hauskrecht
- Jesse Hoey
- Shiao Ji
- Leslie Kaelbling
- Bert Kappen
- d koller
- Lagoudakis, M.G
- Micheal Littman
- Xuejun Liao
- David A. McAllester
- Sébastien Paquet
- Ron Parr
- Joelle Pineau
- Josep M. Porta
- Pascal Poupart
- Nicholas Roy
- Stephan Ross
- Scott Sanner
- Jurgen Schmidhuber
- Kevin D. Seppi
- Guy Shani
- R. Simmons
- Bill Smart
- T. Smith
- Satinder Singh
- Matthijs T.J. Spaan
- Csaba Szepesvári
- Marc Toussaint
- Nikos Vlassis
- David Wingate
- Dit-Yan Yeung
- Kevin Murphy mdp tool box for matlab
- Terran Lane Java MDP library
Conferences And Workshops:
Conferences And Workshops:
- UAI
- IJCAI
- NIPS
- AISTAT
- ICML
- AAAI 2008 Workshop on Advancements in POMDP Solvers
- POMDP Practitioners Workshop: solving real-world POMDP problems (Workshop at ICAPS 2010)
Links:
Links:
Open Problems:
Open Problems:
- Computational complexity
- Generalization
- Efficient representation of uncertainty