POMDP and MDP

Online source code:

Structure Policy Iteration
- (Boutilier, Dearden & Goldszmit,195)
Value Directed Compression (VDC)
- This algorithm is based on the generation of compressed state space which is still a sufficient statistic for optimal control. Starting from the transition and reward matrices it produces a reformulation in terms of the compressed beliefes state which can then be solved using other approaches.
Boundend Policy Iteration (BPI)
Bounded Policy Iteration over Valuer Directed Compression (VDCBPI)
Point-based Value Iteration (PBVI)
Gradient Ascent (GA)
Stochastic Local Search (SLS)
RT-Bel
Real-time Belief Space Search, RTBSS (Paquet et al., 2005)
- Online algorithm that uses branch and bound to search the state space, The lower bound is computed offline using some approximate method and used to discard sub-optimal actions.
Witness
AEMS (Ross & Chaib-draa,2007)
Perseus
Heuristic belief space search, Satia And Lave, 1973
BI-POMDP (Washington,1997)
Pointbased Policy Iteration (PBPI) 2007
Mohan Enhumeration Algorithm