Configure Parameters & Generator Configurations

Generator configurations, especially configure parameters set by lambda expressions (lambda configuration), give users a flexible and customizable way to control the stochastic expansion process over the grammar. Different configurations instantiate the parametrized item type differently, for example with different rule probabilities for the item grammar or different values for other parameters used in the generation process. Configure parameters are the main part of generator configurations.

Configure Parameters

- Configure parameters parameterize the item grammar and the associated generation process (i.e. they are parameters of the item type), and they are set in generator configurations. The syntax for declaring them is much like declaring the parameters of a C++ function, except that the list is wrapped in curly braces rather than parentheses and an additional specifier, dynamic, is allowed. See [Here] for the syntax of the configure parameter list declaration.
- Rule probabilities are configure parameters by default; additional ones can be declared in the parts of the item type definition that are wrapped in { and }.
  - Item level configure parameters can be declared after the item type name (before the : symbol), which can be used anywhere in the item type. The syntax is at [Here]. Examples include the "n", "unit_len", and "max_wid" in the Tree demo.
  - Rule level configure parameters can be declared after the rule name (before the : symbol), which can be used in the scope of the rule. The syntax is at [Here]. Examples include the "min_int" and "max_int" in the Quiz demo, and the "branch_prob" and "branch_deg" in the Tree demo.
  - User-defined nonterminal level configure parameters (i.e. those shared by all rules expanding a nonterminal type) are not currently implemented, but may be added in the future.
- By default, each configure parameter is set in the configuration with a lambda expression (in other words, a pointer to a function defined inline), which is evaluated when the item is generated, not when the configuration is set (this makes no difference if the expression is just a constant). This important feature is called lambda configuration. (Using special symbols such as # in the expression that sets a configure parameter allows users to (partially) change this convention.)
  - Item level configure parameters are evaluated at the start of executing item generators, unless declared with the dynamic keyword.
  - Rule probabilities are evaluated at the start of executing node generators when calling with nonterminal names (see the figure at [Here]). A rule probability can therefore get a different value for each node. For example, the probability for the "mulExpr" rule in the Quiz demo is set with (depth < diffic ? 0.3 : 0.0), which may differ from node to node because the value of "depth" differs from node to node.
  - Other rule level configure parameters are evaluated at the start of the rule part of the node generator (after the rule is selected, or when calling with rule names, see the figure at [Here]), unless declared with the dynamic keyword. Like rule probabilities, they can get a different value for each node.
  - When declared with the dynamic keyword, a configure parameter is evaluated at each place it appears, and can therefore get a different value at each appearance even within the same scope (we call this a dynamic configure parameter). For example, "branch_deg" in the Tree demo is declared with the dynamic keyword, and it appears twice in the implementation of the node generators in the rule "ntTree", once for the left branch and once for the right branch. Without the dynamic keyword, the left and right branch turning angles would always be the same within each node (each expansion of the LSystem rule). With the dynamic keyword, the two can take different values even in the same node when set by a lambda expression like GetRandFloat(30.0, 60.0) (a random number is rolled each time the lambda expression is evaluated), which helps break the symmetry.
- The arguments to the lambda expression that sets a configure parameter are determined by whether, and with which nonterminal type, the configure parameter is associated.
  - For an item level configure parameter, the lambda expression takes no arguments.
  - For a rule level configure parameter (including a rule probability), the arguments to its lambda expression are the arguments of the nonterminal type that the rule expands. For example, the probability for the "mulExpr" rule in the Quiz demo can be set with (depth < diffic ? 0.3 : 0.0) even though "depth" is not available in the current scope, because "depth" is an argument to the lambda expression, as determined by the argument declaration of the nonterminal type "Expr".
- When a variable name appears both in the current scope and in the arguments to the lambda expression, the argument takes priority over the in-scope variable, unless # symbols are used. Note that this kind of name collision is discouraged and should be avoided.
- Variables in the lambda expression that appear only in the current scope are bound by reference rather than by value, unless # symbols are used. This is useful when the configuration is stored and reused multiple times, or when certain in-scope variables are intended to receive feedback from the generation process (e.g. a counter). An example is "diffic" in the Quiz demo, which is updated in the loop every time the configuration is used.
- The # symbol, called the force configuration time modifier, can modify an expression (or sub-expression) for setting configure parameters so that it is forced to be evaluated at configuration time and passed by value. As arguments to lambda expressions are generally not available at configuration time, they should not appear inside any expression marked with a # symbol. Note that this feature is rarely needed in simple examples.
  - Regarded as a unary operator, this symbol is right associative (i.e. it operates on the expression to its right), and its precedence is the same as that of the type-casting operation.
  - It can be applied to a single variable or to a bigger expression, e.g. a + #b forces only "b" to be passed by value at configuration time, while #(a + b) forces the sum to be passed by value at configuration time.
  - #a + #b + c is automatically optimized to #(a + b) + c, but expressions like a + #b + #c cannot be automatically optimized in the current implementation (as + is left associative), because in languages with side effects, associativity and commutativity are not guaranteed and are hard to check.
  - One usage is to force passing by value. For example, in the Quiz demo, if we want to generate a sequence of quizzes with the same difficulty, the smallest modification is to add # in front of the three occurrences of the "diffic" variable in the configuration. This passes the value of "diffic" at the time the configuration is constructed, so it does not change across later iterations of the loop.
  - Another usage is to optimize runtime performance by implicitly pre-computing computationally expensive parts. This can alternatively be achieved by storing the computation result in a variable and then passing it in.
  - As mentioned above, it can also be used to resolve a name collision between arguments to lambda expressions and in-scope variables (by forcing the name to refer to the in-scope variable), but again, such collisions should not occur in the first place, and this is not a recommended use of the feature.
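
The binding and evaluation rules above can be loosely mirrored in plain C++. The following is a sketch only, not GIGL code and not how GIGL is implemented: ProbFn, SampleMulExprProb, and UseTwice are hypothetical names introduced for illustration. Capture by reference stands in for the default lambda configuration, capture by value stands in for the # modifier, and calling the parameter once or twice stands in for a non-dynamic versus a dynamic configure parameter.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a configured rule probability: a callable that
// receives the nonterminal argument "depth" at generation time.
using ProbFn = std::function<double(int depth)>;

// Evaluates the "mulExpr" probability expression from the text while the
// in-scope variable diffic runs through 1, 2, 3.
std::vector<double> SampleMulExprProb(bool force_config_time, int depth) {
    int diffic = 1;
    // Default behavior: diffic is bound by reference, so every evaluation
    // sees its current value.
    ProbFn by_reference = [&diffic](int d) { return d < diffic ? 0.3 : 0.0; };
    // Analogue of writing #diffic: the value is copied when the
    // configuration is built and never changes afterwards.
    ProbFn by_value = [diffic](int d) { return d < diffic ? 0.3 : 0.0; };

    const ProbFn& prob = force_config_time ? by_value : by_reference;
    std::vector<double> seen;
    for (; diffic <= 3; ++diffic) {
        seen.push_back(prob(depth));  // evaluated at "generation time"
    }
    return seen;
}

// A parameter used twice in one scope. A non-dynamic parameter is evaluated
// once and the value is reused; a dynamic one is evaluated at each use.
std::pair<int, int> UseTwice(const std::function<int()>& param, bool dynamic) {
    assert(param && "the configure parameter must be set");
    if (dynamic) {
        int left = param();
        int right = param();
        return {left, right};
    }
    int value = param();
    return {value, value};
}
```

With depth fixed at 1, the by-reference version follows the updates to diffic, while the by-value version keeps the probability that corresponds to diffic = 1 for all three evaluations.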

Generator Configuration

- A generator configuration is a data structure that stores all the information used to set the configure parameters for an item type.
- Generator configurations can be stored in variables declared with the giglconfig specifier (see [Here] for syntax). An example is the "quiz_config" variable in the Quiz demo.
- More importantly, generator configurations need to be constructed from scratch using GIGL syntax, which is required in almost every GIGL source file. The syntax for constructing generator configurations can be found at [Here].
  - The constructions for generator configurations are wrapped in <* and *>.
  - There are two major parts, firstly the item (wrapper) level configuration, and then the node level configuration.
    - The item level configuration starts with the item type name, followed by the configuration for the item level configure parameters.
    - The node level configurations are organized by nonterminals and rules, similar to how they are declared in the item type. After each rule name, the first part is the configuration for the custom configure parameters of this rule (which may be omitted if the rule has no custom configure parameters), and the second part (following the @ symbol) is the configuration for the rule probability (which may be omitted in some cases, see below).
    - When the rule probability field is omitted (including the @ symbol), the rule probability takes an implicit value. The implicit value is determined by evenly distributing the probability remaining from 1.0 (after subtracting the explicitly specified ones) over the implicit (omitted) ones. If there is only one rule for the nonterminal type and its probability is implicit, then it takes 1.0, as expected. If there are two rules, both with implicit probabilities, then both take 0.5, which is the case in the HelloWorld demo. If there are three rules, one explicitly specified to have probability 0.6 and the other two with implicit probabilities, then the other two both take 0.2. This feature is also used in the Tree demo, which shows another common usage: omitting the probability field of one of the rules ("termTree") so that it takes the remaining probability.
    - If a rule does not appear in the configuration at all (that is, not just the probability field but the whole rule, including the rule name, is omitted), then the probability of this rule is set to zero, i.e. the rule is never selected during the whole generation process. This also means losing the opportunity to set the custom configure parameters for the rule. That is often not an issue because the rule would not be selected anyway, but care is still needed if there are other ways to generate nodes with the rule directly (e.g. calling the node generator with the rule name, as shown in the figure at [Here]). This feature appears in the Monster demo, where the configuration in the "GenerateEasyRoomMonsters" function prevents "flaiWeapon" from being selected by omitting that rule entirely.
  - The configurations for custom declared configure parameters are grouped and ordered the same way as they are declared in the item type (except that the order of nonterminals and rules can be changed without affecting the result when side effects are not considered, as long as the configure parameters properly follow their corresponding rules; this also applies to rule probabilities). The configurations for configure parameters, including probabilities, are wrapped in { and }, similar to how custom ones are declared.
  - Any parts of configure parameters that need to be evaluated at configuration time are evaluated in the order they appear in the configuration, regardless of the order of nonterminals and rules. For example, the following two configurations (note that rule probabilities are implicitly normalized by default) give different results because of the evaluation order and side effects:
    - Expr := mulExpr @ {#(i++)} | addExpr @ {#(i++)} | intExpr{0, 9} @ {1}
    - Expr := addExpr @ {#(i++)} | mulExpr @ {#(i++)} | intExpr{0, 9} @ {1}
  - However, this kind of overuse or unnatural use of side effects is discouraged, and it may trigger further bugs in the current implementation when combined with some of the features described in the smart rule probability configuration section.
  - The configuration for each configure parameter uses the lambda configuration feature, which is covered in the Configure Parameters section of this page.
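
The rule for implicit probabilities described above amounts to simple arithmetic, which can be sketched in plain C++. This is an illustration only, not the GIGL implementation: ResolveRuleProbs is a hypothetical helper, std::nullopt is used here to mark an omitted probability field, and the sketch neither models the implicit normalization of explicit probabilities nor rules that are left out of the configuration entirely (those simply get probability zero).

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical helper. Each entry is one rule of a single nonterminal that
// appears in the configuration; std::nullopt marks an omitted "@ {...}" field.
std::vector<double> ResolveRuleProbs(
    const std::vector<std::optional<double>>& fields) {
    double explicit_sum = 0.0;
    int implicit_count = 0;
    for (const auto& field : fields) {
        if (field) {
            explicit_sum += *field;
        } else {
            ++implicit_count;
        }
    }
    // Assumption of this sketch: when some fields are omitted, the explicit
    // probabilities leave a non-negative remainder to distribute.
    assert(implicit_count == 0 || explicit_sum <= 1.0);

    // The probability left over from 1.0 is shared evenly by the omitted fields.
    const double share =
        implicit_count > 0 ? (1.0 - explicit_sum) / implicit_count : 0.0;

    std::vector<double> resolved;
    for (const auto& field : fields) {
        resolved.push_back(field ? *field : share);
    }
    return resolved;
}
```

For instance, passing one explicit 0.6 and two omitted fields yields approximately 0.2 for each omitted rule, and passing two omitted fields yields 0.5 for each, matching the cases listed above.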

Discussion: How to Design Customized Generator Configuration and Configure Parameters

- The general principle for deciding which configure parameters to add is to consider what needs to be parametrized. In other words, what is associated with the item type (type-wise) and does not need to change across instances, and what is associated with a particular instance of the item type (instance-wise) and can differ across instances? Type-wise properties do not need configure parameters to support them, but instance-wise properties do. The answer to this kind of question may differ for the same item type when the design intention differs. A different discussion following a similar principle can also be seen at [Here].
- For example, in the Quiz demo, we added the configure parameters "min_int" and "max_int" to the rule "intExpr", because we consider that in every possible configuration we want to use for this problem, each integer in the arithmetic expression we want to generate is a random number uniformly selected from a range (which is made a type-wise property), but the range might be different for each instance (which is made an instance-wise property).
- If we consider the range to be always the same and known, say 0 to 99, then we do not need those two configure parameters. We can directly set "val" to GetRandInt(0, 99) in the generator (implementation) of the rule "intExpr", as that is a type-wise property.
- If we do not assume that the integers in the arithmetic expressions come from a uniform range, but instead allow them to be a constant in one configuration and to come from some other arbitrary (programmable) distribution in another, then we should no longer use those two configure parameters. Now we know little about the type-wise properties of those integers (all we know is that they are integers). A good approach in this case is therefore to use just one configure parameter, say "input_val", and to set "val" to input_val in the generator of the rule "intExpr". If we want the value to be a constant in some configuration, we pass in the constant, say 60, in the generator configuration (it could also be a constant stored in a variable, as this is a statistical constant rather than a C++ constant). If we still want it to come from a uniform range, we pass in GetRandInt(0, 99) in the generator configuration (note: not in the generator implementation of the rule). If we want it to come from some other distribution, we pass in other expressions, possibly containing function calls, in the generator configuration. Note that the lambda configuration feature ensures that this works properly, i.e. those random numbers, although set in the configuration, are evaluated only when the corresponding part of each item is generated.
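
The three designs discussed above can be compared side by side in a plain C++ sketch. This is not GIGL code: the GetRandInt defined here is only a stand-in for the helper named in the text, and RangeConfig, ValueConfig, and the three GenInt... functions are hypothetical names. Each std::function member plays the role of a configure parameter set by a lambda expression, so it is evaluated when a value is generated rather than when the configuration is built.

```cpp
#include <cassert>
#include <functional>
#include <random>

// Stand-in for the GetRandInt helper named in the text: a uniform integer in
// [lo, hi]. The helper used by the demos may be implemented differently.
int GetRandInt(int lo, int hi) {
    assert(lo <= hi);
    static std::mt19937 rng(std::random_device{}());
    return std::uniform_int_distribution<int>(lo, hi)(rng);
}

// Design 1: the range is a type-wise property, so there is no configure
// parameter and the range is fixed in the generator itself.
int GenIntFixedRange() { return GetRandInt(0, 99); }

// Design 2: drawing uniformly is type-wise, but the range is instance-wise,
// so the configuration supplies "min_int" and "max_int".
struct RangeConfig {
    std::function<int()> min_int;
    std::function<int()> max_int;
};
int GenIntFromRange(const RangeConfig& config) {
    return GetRandInt(config.min_int(), config.max_int());
}

// Design 3: only "it is an integer" is type-wise, so the configuration
// supplies the whole value expression as "input_val".
struct ValueConfig {
    std::function<int()> input_val;
};
int GenIntFromValue(const ValueConfig& config) { return config.input_val(); }
```

Under the third design, a configuration such as ValueConfig{[]() { return 60; }} yields a constant, while ValueConfig{[]() { return GetRandInt(0, 99); }} rolls a new number on every call to GenIntFromValue, because the callable runs at generation time.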