Important Note: You must have the RL-Glue core installed to use any of the codecs.
This document should be your first resource if you are new to RL-Glue. RL-Glue is not particularly complicated or tricksy, but you will be much more effective using it if you have a high-level understanding of how the project works, and how it interacts with related projects (RL-Glue Extensions).
- RL-Glue what? learning about RL-Glue at an abstract level
- Compatibility: making existing C/C++ agents and environments work with RL-Glue
- Plugging agents and environments together: how to write experiment programs
- What function do I use to... quick function reference for RL-Glue
- High-level changes from RL-Glue 2.x to 3.x
This manual will help you get RL-Glue and install it on your local machine.
After that, if you decide that you want to write agents, environments, or experiments in C or C++, all of the details you need are here also.
- Where to download RL-Glue
- How to install RL-Glue
- C/C++ Changes since RL-Glue 2.x
- Details about C/C++ data types and function prototypes
- Memory Management Pointers and Suggestions
No. RL-Glue is designed for single agent reinforcement learning.
At present we are not planning a multi-agent extension of RL-Glue. We
envision that this would be a separate project with a different
audience and different objectives.
Update: There has recently been a proposal (from Gabor Balazs) for some simple updates to RL-Glue that would allow it to work with multiple agents. If you would really like RL-Glue to support multiple agents, please let us know on the discussion list (http://groups.google.com/group/rl-glue).
RL-Glue is meant to be a low-level protocol for connecting agents, environments, and experiments. These interactions can
easily be described by the simple, flat function calls of RL-Glue. We don't feel that it is useful to overcomplicate
things in that respect.
However, there is no reason that an implementation of an agent or environment shouldn't be designed using an object-oriented
approach. In fact, many of the contributors to this project have their own object-oriented libraries of agents that
they use with RL-Glue. Some of the codecs even have an OO flavor (Python, Java, Lisp).
Some might argue that it makes sense to create codecs that take object orientation very seriously, with a hierarchy of observation and action types, where you create an instance of RL-Glue instead of calling static methods on it, and so on. This would not be hard; it's just a matter of someone interested picking up the project and doing it. Personally,
we've found it easy enough to write a small bridge between the existing codecs and our personal OO hierarchies.
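A bridge of this kind can be very small. The sketch below is purely illustrative: the class and method names (MyAgent, FlatAgentBridge, agent_start, agent_step) are hypothetical stand-ins, not the actual codec API, but they show the idea of adapting a personal OO agent hierarchy to RL-Glue's flat callback style.

```python
# Hypothetical sketch: names below are illustrative, not the real codec API.

class MyAgent:
    """A user's own object-oriented agent."""
    def start(self, observation):
        self.last_obs = observation
        return 0  # this toy agent always chooses action 0

    def step(self, reward, observation):
        self.last_obs = observation
        return 0

class FlatAgentBridge:
    """Adapts an OO agent to flat agent_start/agent_step callbacks."""
    def __init__(self, agent):
        self.agent = agent

    def agent_start(self, observation):
        return self.agent.start(observation)

    def agent_step(self, reward, observation):
        return self.agent.step(reward, observation)

bridge = FlatAgentBridge(MyAgent())
action = bridge.agent_start([1.0, 2.0])
```

The bridge owns the OO agent and simply forwards each flat call, so the agent hierarchy never needs to know about RL-Glue at all.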
If the state of an environment is fully observable, then you can often use the terms ``state'' and ``observation'' interchangeably. However, observation is the more general term: it refers to the perceptions
that the agent receives. This can be different
from the concept of state, which corresponds to some truth
about the environment. For example, in partially observable environments, the observations
may be aliased: the environment may be in different states, but the agent receives the same observation.
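Aliasing is easy to see in a toy example. The sketch below (illustrative only, not RL-Glue code; the state names are made up) shows two distinct underlying states producing the same observation, so the agent cannot tell them apart:

```python
# Illustrative sketch (not RL-Glue code): two distinct underlying
# states yield the same observation -- the observation is "aliased".

def observe(state):
    # Both hallway positions look identical to the agent's sensors.
    aliased_view = {
        "hall_west": "corridor",
        "hall_east": "corridor",
        "goal": "goal",
    }
    return aliased_view[state]

# The agent sees "corridor" in both hallway states:
assert observe("hall_west") == observe("hall_east")
assert observe("goal") != observe("hall_west")
```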
The environment in RL-Glue is responsible for keeping track of the
current ``state'' and computing the next ``state'' given an action. The
old state does not need to be passed
outside of the environment; the state stays within the environment. The
corresponding method in CLSquare is basically the same as env_step.
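The pattern can be sketched as follows. This is a minimal illustration under assumed names (GridEnvironment and its reward structure are invented, and the method signatures are simplified relative to the real codecs): env_step receives only an action and returns the reward and next observation, while the state itself never leaves the environment.

```python
# Sketch with hypothetical names: the state (_position) stays private
# to the environment; callers only ever see observations and rewards.

class GridEnvironment:
    def env_start(self):
        self._position = 0           # internal state, never passed out
        return self._position        # the initial observation

    def env_step(self, action):
        # Compute the next internal state from the action alone.
        self._position += 1 if action == 1 else -1
        reward = 1.0 if self._position == 3 else 0.0
        terminal = self._position == 3
        return reward, self._position, terminal

env = GridEnvironment()
obs = env.env_start()
reward, obs, terminal = env.env_step(1)
```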
This can be done in RL-Glue by using env_message, agent_message, and coordinating the two with your own experiment program. It's not trivial, and there are many different ways and reasons that you might want to do this, so it's hard to come up with a very clear example. If you are interested in this, please contact us and we would be happy to make an example that we can share with everyone else.
The functionality of Freeze can easily be replicated through RL_agent_message.
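One way to do this can be sketched as follows. The agent class and its bookkeeping here are hypothetical (only the agent_message convention itself comes from RL-Glue): the experiment sends a string message, and the agent stops performing learning updates while frozen.

```python
# Hypothetical sketch: a "freeze" switch implemented over the message
# channel, rather than as a dedicated RL-Glue function.

class MessageAwareAgent:
    def __init__(self):
        self.frozen = False
        self.updates = 0

    def agent_message(self, message):
        if message == "freeze":
            self.frozen = True
            return "frozen"
        if message == "unfreeze":
            self.frozen = False
            return "unfrozen"
        return "unknown message"

    def agent_step(self, reward, observation):
        if not self.frozen:
            self.updates += 1   # stand-in for a real learning update
        return 0

agent = MessageAwareAgent()
agent.agent_step(0.0, None)          # learning: one update happens
agent.agent_message("freeze")
agent.agent_step(1.0, None)          # frozen: no further updates
```

Because the message payload is just a string, the experiment program can define whatever small command vocabulary it needs without any change to the glue itself.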
There are literally a hundred similar methods that would be desirable
to one person or another. To avoid the RL-Glue interface becoming
bloated, we are trying to avoid adding too many redundant functions for
the sake of convenience.