The probabilities and utilities specified in the effects of probabilistic rules can be automatically optimised from dialogue data. OpenDial relies on Bayesian learning to perform this parameter estimation. Each (univariate or multivariate) parameter is therefore associated with a specific distribution that is progressively narrowed down as more data points are observed in order to provide the best "fit" for the observed data.
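The narrowing effect described above can be illustrated with the textbook conjugate update of a Dirichlet distribution. The following is a minimal sketch and not OpenDial's own implementation: it assumes that every outcome is fully observed, and the outcome labels `same` and `different` are hypothetical.

```python
from typing import Dict, List


def dirichlet_update(alphas: Dict[str, float], observations: List[str]) -> Dict[str, float]:
    """Return the posterior Dirichlet counts after a series of fully observed outcomes."""
    posterior = dict(alphas)
    for outcome in observations:
        posterior[outcome] += 1.0  # each observation adds one pseudo-count
    return posterior


def dirichlet_mean(alphas: Dict[str, float]) -> Dict[str, float]:
    """Expected probability of each outcome under the Dirichlet."""
    total = sum(alphas.values())
    return {k: v / total for k, v in alphas.items()}


def dirichlet_variance(alphas: Dict[str, float]) -> Dict[str, float]:
    """Marginal variance of each outcome probability under the Dirichlet."""
    total = sum(alphas.values())
    return {k: v * (total - v) / (total ** 2 * (total + 1.0)) for k, v in alphas.items()}


# Hypothetical uniform prior over two outcomes, then ten observed data points.
prior = {"same": 1.0, "different": 1.0}
posterior = dirichlet_update(prior, ["same"] * 8 + ["different"] * 2)

print(dirichlet_mean(posterior))
print(dirichlet_variance(prior), dirichlet_variance(posterior))
```

The variance of the posterior is smaller than that of the prior, which is what "narrowing down" means here.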
OpenDial comes bundled with specific functions to export and import dialogues right from the user interface. To record a particular interaction, click on Interactions -> Save Dialogue As.... The function will record the sequence of dialogue turns (from the user and from the system) in the form of an XML file. Every turn contains a particular utterance (encoded as a probability distribution in the case of uncertain user inputs) as well as other optional contextual variables. The recorded XML file has the following skeleton:
<interaction>
  <userTurn>
    <variable id="u_u">
      <!-- distribution for the user utterance -->
    </variable>
    <!-- other contextual variables -->
  </userTurn>
  <systemTurn>
    <variable id="u_m">
      <!-- System utterance -->
    </variable>
    <!-- other contextual variables -->
  </systemTurn>
  <!-- etc. -->
</interaction>
The order of user and system turns is arbitrary (user turns can follow other user turns without any system utterance, and vice versa). The (domain-specific) contextual variables to include in each turn are specified in the system settings.
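A recorded file of this shape can be inspected outside OpenDial with any XML parser. The sketch below uses Python's standard library and relies only on the element names shown in the skeleton (`interaction`, `userTurn`, `systemTurn`, `variable`). The content of each `variable` element is left empty because its exact encoding is not described here, and the helper name `list_turns` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Stand-in for a recorded dialogue; two user turns in a row are allowed.
SAMPLE = """<interaction>
  <userTurn>
    <variable id="u_u"></variable>
  </userTurn>
  <userTurn>
    <variable id="u_u"></variable>
  </userTurn>
  <systemTurn>
    <variable id="u_m"></variable>
  </systemTurn>
</interaction>"""


def list_turns(xml_text: str) -> List[Tuple[str, List[str]]]:
    """Return (turn type, variable ids) pairs in the order they were recorded."""
    root = ET.fromstring(xml_text)
    turns = []
    for turn in root:
        if turn.tag not in ("userTurn", "systemTurn"):
            continue  # ignore anything that is not a dialogue turn
        variable_ids = [v.get("id", "") for v in turn.findall("variable")]
        turns.append((turn.tag, variable_ids))
    return turns


turns = list_turns(SAMPLE)
for turn_type, variable_ids in turns:
    print(turn_type, variable_ids)
```

Because the parser walks the children of `interaction` in document order, it makes no assumption that user and system turns alternate.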
OpenDial can be paused/resumed at any time during the data collection using the Interactions -> Pause/Resume button.
Wizard-of-Oz interactions can also be recorded through the OpenDial interface. A Wizard-of-Oz interaction is an interaction in which a human user is asked to interact with a system that is remotely operated by an unseen human agent. One can easily execute such Wizard-of-Oz experiments via the "remote connection" functionality integrated in the latest versions of OpenDial. To connect two OpenDial systems with one another, proceed as follows:
If one wishes to conduct more advanced Wizard-of-Oz experiments (where e.g. the wizard can only choose among the set of actions available from a predefined dialogue domain), the module WizardControl can be used. Using this module, OpenDial will trigger the domain models as usual upon the reception of new user inputs, but will not select the highest-utility actions. Instead, the system will show a list of possible actions on the right side of the chat window. The wizard is then expected to select the most appropriate action in this list. Note that the action selection box only displays actions that are relevant in the current situation (i.e. for which at least one utility rule specifies a utility). Using the domain from the step-by-step example, three actions will therefore be available to the wizard after observing the user input a_u=Request(Left), namely Move(Left), AskRepeat, and the default None (representing the absence of any action).
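The relevance criterion for the action selection box can be sketched as follows. This is only an illustration of the filtering logic, not the WizardControl code: the rule representation (a function from a dialogue state to the utilities it assigns) and the utility values are assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical representation: a utility rule maps a dialogue state to the
# utilities it specifies, and returns an empty dict when it does not apply.
UtilityRule = Callable[[Dict[str, str]], Dict[str, float]]


def move_rule(state: Dict[str, str]) -> Dict[str, float]:
    """Rates Move(X) when the user dialogue act is Request(X) (made-up utility)."""
    a_u = state.get("a_u", "None")
    if a_u.startswith("Request(") and a_u.endswith(")"):
        direction = a_u[len("Request("):-1]
        return {f"Move({direction})": 1.0}
    return {}


def repeat_rule(state: Dict[str, str]) -> Dict[str, float]:
    """Rates AskRepeat whenever a user dialogue act is present (made-up utility)."""
    if state.get("a_u", "None") != "None":
        return {"AskRepeat": 0.5}
    return {}


def relevant_actions(state: Dict[str, str], rules: List[UtilityRule]) -> List[str]:
    """Actions rated by at least one rule, followed by the default None."""
    actions: List[str] = []
    for rule in rules:
        for action in rule(state):
            if action not in actions:
                actions.append(action)
    actions.append("None")  # absence of any action is always available
    return actions


choices = relevant_actions({"a_u": "Request(Left)"}, [move_rule, repeat_rule])
print(choices)
```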
Previously recorded dialogues can be imported via Interactions -> Import Dialogue From.... This function does more than simply display the dialogue history in the chat window - it actually "replays" the full interaction, performing dialogue update at each step (including the update of parameter distributions). This import function is therefore a crucial tool to estimate model parameters based on previously recorded dialogue data.
Two alternative import modes are available:
Some parameters can be directly learned from raw, unannotated dialogues. Take for instance the dialogue domain described in the step-by-step example. The parameter theta_repeatpredict (which reflects the probability that the user will comply with the system request to repeat the instruction) can be automatically estimated through repeated interactions with users. Each observed dialogue act a_u after a system request AskRepeat will thus trigger a Bayesian update of the parameter.
One can easily test this learning mechanism by starting OpenDial with the domain domains/examples/example-step-by-step_params.xml and entering a few instructions with a low probability (in order to trigger the AskRepeat response). The state viewer shows how the prior prediction a_u^p and the actual dialogue act a_u' are combined.
The parameter theta_repeatpredict is automatically refined as a result of this update.
Once the interaction is complete, the posterior parameter distributions can be exported by clicking on Domains -> Export -> Parameters. As the posterior distribution of theta_repeatpredict may not be a Dirichlet distribution anymore due to the partial observability of the graphical model, the posterior distribution is encoded as a multivariate Gaussian distribution (with a diagonal covariance).
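One common way to summarise a posterior as a Gaussian with a diagonal covariance is moment matching over weighted samples. The text above does not state how OpenDial derives the Gaussian, so the following should be read as a sketch under that assumption. The sample values and weights are made up.

```python
from typing import List, Sequence, Tuple


def fit_diagonal_gaussian(
    samples: Sequence[Sequence[float]], weights: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return the per-dimension weighted mean and variance of the samples."""
    if not samples or len(samples) != len(weights):
        raise ValueError("samples and weights must be non-empty and of equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dims = len(samples[0])
    means = [
        sum(w * s[d] for s, w in zip(samples, weights)) / total for d in range(dims)
    ]
    # Only the diagonal is kept: correlations between dimensions are discarded.
    variances = [
        sum(w * (s[d] - means[d]) ** 2 for s, w in zip(samples, weights)) / total
        for d in range(dims)
    ]
    return means, variances


# Hypothetical weighted samples of a two-dimensional parameter.
samples = [(0.6, 0.4), (0.8, 0.2), (0.7, 0.3)]
weights = [1.0, 1.0, 2.0]
means, variances = fit_diagonal_gaussian(samples, weights)
print(means, variances)
```

Dropping the off-diagonal terms keeps the exported representation compact, at the cost of ignoring dependencies between the dimensions of the parameter.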
The parameter theta_repeatpredict could be directly estimated from unannotated dialogues. This is not the case for the utility theta_repeat. Indeed, there is no way the system could learn the utility of the AskRepeat action without receiving some feedback on the desirability of this action.
One simple way to estimate such utility parameters is to collect Wizard-of-Oz data (cf. explanations above) and estimate posterior parameter distributions from them. The most likely parameter values are in this case those that provide the best "fit" for the wizard decisions - in other words, the parameter values that best imitate the wizard's conversational behaviour in similar situations.[1]
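The idea of fitting a utility to wizard decisions can be sketched with a grid approximation of the posterior. Everything specific in this example is an assumption made for illustration: the utility model for Move, the softmax likelihood with its temperature, the uniform prior over the grid, and the recorded decisions. It is not the estimation procedure used by OpenDial.

```python
import math
from typing import Dict, List, Tuple


def expected_utilities(p_request: float, theta_repeat: float) -> Dict[str, float]:
    """Hypothetical utilities: moving pays 1.0 if the request was understood, -0.5 otherwise."""
    return {
        "Move": 1.0 * p_request - 0.5 * (1.0 - p_request),
        "AskRepeat": theta_repeat,
    }


def wizard_likelihood(action: str, p_request: float, theta_repeat: float,
                      temperature: float = 0.1) -> float:
    """Probability that the wizard picks `action` (assumed softmax over utilities)."""
    utils = expected_utilities(p_request, theta_repeat)
    top = max(utils.values())
    exps = {a: math.exp((u - top) / temperature) for a, u in utils.items()}
    return exps[action] / sum(exps.values())


def grid_posterior(decisions: List[Tuple[float, str]], grid: List[float]) -> List[float]:
    """Posterior over theta_repeat on a grid, starting from a uniform prior."""
    log_post = [
        sum(math.log(wizard_likelihood(a, p, theta)) for p, a in decisions)
        for theta in grid
    ]
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]


# Made-up wizard decisions: (probability of the recognised request, chosen action).
decisions = [(0.9, "Move"), (0.4, "AskRepeat"), (0.7, "Move"), (0.45, "AskRepeat")]
grid = [i / 100 for i in range(-100, 101)]
posterior = grid_posterior(decisions, grid)
best_theta = grid[posterior.index(max(posterior))]
print(best_theta)
```

Under these assumptions, the posterior concentrates on values of theta_repeat that lie above the utility of Move in the situations where the wizard asked for a repetition, and below it in the situations where the wizard moved.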
In practice, you can simply import a previously recorded Wizard-of-Oz dialogue (for instance the one in domains/examples/woz-dialogue.xml). At the end of the interaction, the parameters will be automatically updated to reflect the Wizard-of-Oz decisions, as illustrated in the screenshots below.
Parameter distributions before learning:
Parameter distributions after learning:
(Notice that the theta_repeat distribution has narrowed down and that its mean is now centered around 0.2-0.3)
Finally, the last possible method for parameter estimation is via simulation. A simulator automatically generates user inputs in accordance with an internal model of the user behaviour.
The easiest way to build a user simulator in OpenDial is to use the Simulator module. The simulator module takes as parameter a dialogue domain specification in the same format as a standard dialogue domain. The simulator is triggered after each system action and automatically updates its internal state and generates new user inputs in accordance with the specified model.
An example of such a simulator domain is provided in domains/examples/example-simulator.xml. As we can see, this simulator domain is composed of:
Some state variables in this domain specification are labelled with an ^o suffix. This suffix indicates which variables are intended to be part of the generated output of the simulator (the remaining variables are internal to the simulator). In this particular example, the only output variable is the user dialogue act. More complex domains may however include other contextual variables. One should also note the use of a Dirichlet parameter called error. This Dirichlet parameter is used for the simulator's error model. The first dimension of this parameter determines the confidence probability for the dialogue act actually selected by the simulator, the second dimension the confidence probability for another, erroneous dialogue act, and the third dimension the probability of no dialogue act. At runtime, the simulator samples a particular (multivariate) value from this Dirichlet and uses it to construct the final N-best list for the dialogue act.
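The error model described above can be approximated in a few lines. This is a sketch rather than the Simulator module's code: the Dirichlet counts, the alternative dialogue acts and the choice of picking the erroneous act uniformly at random are all assumptions, and the function names are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def sample_dirichlet(alphas: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one sample from a Dirichlet by normalising independent gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def noisy_nbest(true_act: str, other_acts: Sequence[str],
                alphas: Sequence[float], rng: random.Random) -> Dict[str, float]:
    """Build an N-best list from a three-dimensional error parameter.

    Dimensions: (selected act, one erroneous act, no dialogue act).
    """
    p_correct, p_error, p_none = sample_dirichlet(alphas, rng)
    wrong_act = rng.choice(list(other_acts))  # assumption: uniform choice of the error
    return {true_act: p_correct, wrong_act: p_error, "None": p_none}


rng = random.Random(42)
nbest = noisy_nbest(
    "Request(Left)",
    ["Request(Right)", "Request(Forward)"],  # hypothetical alternative acts
    (5.0, 1.0, 1.0),                         # hypothetical Dirichlet counts
    rng,
)
print(nbest)
```

Larger counts in the first dimension make the simulated recognition more reliable on average, since the sampled confidence of the selected dialogue act then tends to be higher.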
In order to start the simulator, simply add the parameter -Dsimulator=path/to/the/simulator/domain on the command line. A simulated dialogue will then start. The simulation can be interrupted at any time by clicking on Interactions -> Pause/Resume.
More advanced simulators can be constructed as separate modules. The section External modules describes in more detail the implementation of such modules.
[1] See Lison (2014), chapter 5 for details.