This work comprises two phases for teaching the chameleon-like robot to balance itself: reinforcement learning (RL) in continuous domains, and the design of a transfer learning (TL) strategy.
This section briefly describes the work. Since we are also interested in developing new RL and TL strategies based on artificial hydrocarbon networks (AHN), this supervised learning method is explained first. Then, RL for continuous domains using AHN is introduced, followed by a general proposal for TL using AHN. Finally, the procedure for transferring knowledge from a well-known robot to the chameleon-like robot is described. Several resources (code, videos, etc.) related to this work are listed at the end of the section.
Artificial Hydrocarbon Networks (AHN) are a supervised learning method inspired by chemical hydrocarbon compounds. The method allows modularity and organization of information, inheritance of packaged information, and structural stability.
The fundamental unit of information is known as a molecule. As in nature, molecules are formed by linking two elements in different arrangements; in the AHN model, only hydrogen and carbon atoms can compose those molecules. By joining two or more molecules, compounds are formed; then, by combining compounds, mixtures are created. In this way, AHN model nonlinear relationships through structural decomposition of data.
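To make the molecule/compound hierarchy concrete, the following is a minimal sketch in Python. It assumes a common simplified form of the CH-molecule behavior, phi(x) = v_c * prod(x - h_i), where v_c is the carbon value and h_i are hydrogen values, and a compound that assigns one molecule per interval of the input domain; the class names and interval handling are illustrative, not the exact formulation used in this work.

```python
class Molecule:
    """CH-type molecule: phi(x) = v_c * prod(x - h_i)  (assumed simplified form)."""
    def __init__(self, carbon, hydrogens):
        self.carbon = carbon        # carbon (weight) value v_c
        self.hydrogens = hydrogens  # list of hydrogen values h_i

    def behavior(self, x):
        out = self.carbon
        for h in self.hydrogens:
            out *= (x - h)
        return out


class Compound:
    """A chain of molecules; each molecule covers one interval of the domain,
    so the compound acts as a piecewise model."""
    def __init__(self, molecules, bounds):
        self.molecules = molecules  # n molecules
        self.bounds = bounds        # n + 1 interval edges, ascending

    def behavior(self, x):
        for mol, lo, hi in zip(self.molecules, self.bounds[:-1], self.bounds[1:]):
            if lo <= x < hi:
                return mol.behavior(x)
        return self.molecules[-1].behavior(x)  # fall back to last interval
```

A mixture would then be a weighted sum of compound behaviors, giving the nested molecule-compound-mixture organization described above.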
When applying AHN to regression and classification problems, some notable properties emerge, such as:
Detailed information about AHN can be found in the Artificial Organic Networks Website.
Several factors hinder the study of reinforcement learning (RL) in continuous spaces: the determination of the transition function, issues arising in continuous-to-discrete relaxation, the difficulty of generalizing complex states, among others.
Nevertheless, there is constant interest in RL for continuous domains, since real-world problems such as robotics require these methods to learn complex tasks, and their states and actions are normally continuous.
A common approach is to use model-free or model-based policy search in continuous domains. However, most existing research discretizes either the states or the actions. Thus, this project models the dynamics of the continuous domain (both states and actions) using the supervised AHN learning method.
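The idea of using a supervised learner as the dynamics model can be sketched in a few lines. The toy environment, the k-nearest-neighbor regressor (standing in for AHN), and the one-step greedy planner below are all illustrative assumptions, not the method of the cited paper; they only show the model-based pattern: collect continuous transitions, fit a supervised model of the dynamics, then plan over continuous actions with it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy continuous system (unknown to the agent): next_state = state + 0.1 * action.
def env_step(s, a):
    return s + 0.1 * a

# 1) Collect transitions with random continuous actions (exploration phase).
S = rng.uniform(-1, 1, size=200)
A = rng.uniform(-1, 1, size=200)
S_next = env_step(S, A)

# 2) Fit a supervised dynamics model; a k-NN regressor stands in for AHN here.
def predict_next(s, a, k=5):
    d = (S - s) ** 2 + (A - a) ** 2
    idx = np.argsort(d)[:k]
    return S_next[idx].mean()

# 3) Greedy one-step planning: pick the continuous action whose predicted
#    next state is closest to the goal state.
def act(s, goal=0.0):
    candidates = np.linspace(-1, 1, 41)
    preds = np.array([predict_next(s, a) for a in candidates])
    return candidates[np.argmin(np.abs(preds - goal))]

# Roll out the learned policy from a displaced initial state.
s = 0.8
for _ in range(30):
    s = env_step(s, act(s))
```

After the rollout the state is driven near the goal, without ever discretizing the state space: only the candidate actions are sampled, and that sampling could itself be replaced by a continuous optimizer over the learned model.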
We developed a continuous RL strategy using AHN, as described in our paper entitled A Reinforcement Learning Method for Continuous Domains Using Artificial Hydrocarbon Networks (WCCI-IJCNN 2018). This and other applications can be found here.
Transfer of learning is the process in which learning in one context, or with one set of materials, affects performance in another context or with other related materials. Humans use past knowledge to improve on new tasks: we recognize similarities and apply relevant experience.
In machine learning, transfer learning aims to improve the learning performance and efficiency of a target agent or learner based on the knowledge of a previous agent or knowledge base: the target predictive function is improved by using related information from the source domain and task in the target domain and task.
Since the goal of this work is to transfer learning between robots, we propose to use continuous RL with AHN for learning the task on the teacher robot. The AHN model is then transferred to the learner agent. Assuming that the state domains of the source and target are the same, only a transition function between actions has to be learned. Moreover, the organizational molecular structure of AHN makes it possible to divide this problem by learning a transition sub-function for each molecule, accelerating the process.
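The per-molecule decomposition of the action transition function can be sketched as follows. The linear action mapping between robots, the paired action samples, and the interval edges are all assumed for illustration; the point is that each "molecule" (interval of the action space) gets its own sub-function, fitted independently, instead of one global model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed (unknown) action mapping between robots with the same state domain:
# the target actuator needs roughly twice the command plus an offset.
def true_transition(a_src):
    return 2.0 * a_src + 0.5

# Paired samples: source actions and the target actions that reproduce the
# same effect (e.g., collected by matching observed next states).
a_src = rng.uniform(-1, 1, 100)
a_tgt = true_transition(a_src) + rng.normal(0, 0.01, 100)

# Learn one linear sub-function per "molecule" (interval of the action space),
# mirroring the per-molecule decomposition of AHN.
edges = np.linspace(-1, 1, 5)  # 4 molecules
subfuncs = []
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (a_src >= lo) & (a_src <= hi)
    w, b = np.polyfit(a_src[mask], a_tgt[mask], 1)  # fit on that interval only
    subfuncs.append((lo, hi, w, b))

def transfer(a):
    """Map a source-robot action to the learner robot's action space."""
    for lo, hi, w, b in subfuncs:
        if lo <= a <= hi:
            return w * a + b
    lo, hi, w, b = subfuncs[-1]  # clamp out-of-range actions to the last molecule
    return w * a + b
```

Because each sub-function is fitted only on its own interval, the fits can be run independently (even in parallel), which is what makes the decomposition accelerate the transfer step.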
This proposal is still in progress. Advances will be available soon.
The following steps will be implemented for transferring knowledge from an inverted-pendulum-like robot (source) to the chameleon robot (target):
This implementation is still in progress and subject to change. Advances will be available soon.
These resources will be updated before August 2018.