6 - Simple utility learning

In an ACT-R model, more than one production can match the current buffer conditions. When this happens, the production with the highest utility is chosen. Utilities are learned through experience and reward, and different forms of reinforcement learning have been used to model this in ACT-R. Several of these are available in Python ACT-R, including Q-learning, which is used in the Clarion architecture. Although these learning algorithms differ, they all adjust the utility of a production to reflect how often it leads to a reward state, discounted by how long it took to get from the production firing to the reward. Noise added to the utility values is also very important, since it controls how much exploration (i.e., choosing a production other than the one with the highest utility) occurs. The different Python ACT-R learning algorithms can be found on the Python ACT-R site (http://ccmsuite.ccmlab.ca/?q=node/19) or on this site under Reference Material.
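
To make the mechanism concrete, here is a bare-bones sketch of utility learning with noisy selection. This is not the Python ACT-R API; the names (Production, choose, apply_reward) and the parameter values are made up for illustration. The key ideas it shows are the ones described above: a production's utility is nudged toward the reward it receives, that reward is discounted by the time elapsed between the production firing and the reward, and logistic noise on the utilities produces exploration.

```python
import math
import random

ALPHA = 0.2    # learning rate (illustrative value)
NOISE_S = 0.5  # noise parameter s; larger s -> more exploration

class Production:
    def __init__(self, name, utility=0.0):
        self.name = name
        self.utility = utility
        self.fired_at = None  # time this production last fired

def logistic_noise(s):
    """Sample from a logistic distribution with scale s (ACT-R-style utility noise)."""
    u = random.random()
    return s * math.log(u / (1.0 - u))

def choose(matching):
    """Pick the matching production with the highest noisy utility."""
    return max(matching, key=lambda p: p.utility + logistic_noise(NOISE_S))

def apply_reward(fired_productions, reward, now):
    """Move each fired production's utility toward the reward it earned.

    The reward is discounted by the time between the production's firing
    and the reward, so productions that fired long before the reward
    receive less credit.
    """
    for p in fired_productions:
        effective_reward = reward - (now - p.fired_at)
        p.utility += ALPHA * (effective_reward - p.utility)

# Toy run: two competing productions; only one reliably leads to a reward.
good = Production("press-correct-key")
bad = Production("press-wrong-key")
for step in range(50):
    p = choose([good, bad])
    p.fired_at = float(step)
    reward = 10.0 if p is good else 0.0   # only the "correct" production is rewarded
    apply_reward([p], reward, now=step + 0.05)

print(good.name, round(good.utility, 2))
print(bad.name, round(bad.utility, 2))
```

Running this, the rewarded production's utility climbs toward the (time-discounted) reward while the other stays near zero, so it is chosen more and more often; raising NOISE_S keeps the model exploring the lower-utility option for longer.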