The command "go around the map" does not appear in the dataset. However, commands like "send one peasant to scout the map" or "go around north if you have to" do exist in the dataset. These give our model the concept of going around and exploring.
By utilizing a language encoder, our model is able to generalize to the unseen command "go around the map" and send out units to explore the map.
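As a toy illustration of why such generalization is plausible, the sketch below measures word overlap between the unseen command and a few commands quoted above. It is not our model's encoder: the bag-of-words representation, the cosine score, and the names `bow`, `cosine`, and `rank_seen` are all assumptions made for this example. A learned encoder can exploit the same kind of shared vocabulary, but does so through trained embeddings rather than raw counts.

```python
from collections import Counter
from math import sqrt

# Commands quoted in the text as present in the dataset (a tiny hand-picked subset).
SEEN_COMMANDS = [
    "send one peasant to scout the map",
    "go around north if you have to",
    "build a blacksmith",
]

def bow(command):
    """Bag-of-words counts for a lowercased, whitespace-split command."""
    return Counter(command.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for missing words
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_seen(unseen, seen=SEEN_COMMANDS):
    """Return (score, command) pairs for the seen commands, highest overlap first."""
    query = bow(unseen)
    scored = [(cosine(query, bow(c)), c) for c in seen]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

if __name__ == "__main__":
    for score, command in rank_seen("go around the map"):
        print(f"{score:.3f}  {command}")
```

In this toy setting the two exploration-related commands share words with "go around the map" ("the", "map" and "go", "around" respectively), while "build a blacksmith" shares none.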
In the dataset, humans issue commands like "we need more minerals", "we need more soldiers" or "we need another archer asap". In addition, humans sometimes use the word "tower" to refer to a guard tower. When our model is given the unseen command "we need more towers", it builds a group of guard towers.
In the dataset, there are commands like "retreat" or "run away from the enemy", which demonstrate fleeing from enemies. Our model is able to generalize to the unseen command "run back to your base". As shown in the image below, the army units are indeed running away from the enemies and approaching the base.
When we issue the commands "build a blacksmith" and "build swordman", our policy builds another type of army unit, which is effective against invading enemies.