Debugging and Runtime Monitoring

Overview

Warnings

Pausing and Resuming Execution

Real-time Monitoring

Tripwires

Catching Undefined Predicates and Methods

Type Checking

Analyzing Non-termination with Terminyzer

Other Tools

Considerations for Improving Performance

Overview

In addition to the basic tools for explanations and for reporting errors and warnings about syntax and semantics, Ergo includes a number of more advanced tools that help with debugging knowledge bases and support real-time monitoring of query execution. These tools are especially suitable for expert users. Runtime monitoring lets the user watch how memory consumption grows over time and see other statistics. Some of these statistics may indicate that the query is in an infinite loop or that the knowledge base's rules are written inefficiently, which degrades performance, sometimes dramatically.

Debugging tools enable other kinds of analysis. For instance, the Terminyzer tool analyzes the execution and helps detect infinite loops ("non-termination") in query execution. It may even find the exact sequence of rule applications that constitutes the infinite loop. Another tool shows which recursive subgoals (predicates or frames) are being worked on at the moment. Yet other tools perform type checking or identify calls to undefined predicates or frame methods. Such calls would normally fail, but their existence may indicate a mistyped predicate or method name. Here we give only an overview of what is available.

Details can be found in section "Debugging User Knowledge Bases" in the ErgoAI Reasoner User's Manual.

Warnings

We begin with the importance of addressing warnings and reiterate that it is a very bad idea to ignore any kind of warning.

By issuing a warning, Ergo's compiler is trying to draw the user's attention to an idiom that is questionable either semantically or syntactically. In addition, Ergo provides various easy-to-use facilities to suppress specific warnings. By explicitly using these facilities, the user tells the system that the issue is known and that the use of the questionable idiom is intended, and the system stops issuing the warnings that were explicitly addressed.

The three most common warnings are:

Singleton variable. This is a named variable that appears exactly once in a rule or a fact. It is all too easy to mistype a variable name and produce a singleton variable, which leads to a hard-to-find logical error.
- If a singleton variable was intended, it is better to just put ? in its place.
- Sometimes the user might want to give such a variable a proper name as a reminder of what it stands for, for instance, a user name. In this case, one can suppress the singleton variable warning by prefixing the variable name with an underscore:

?_UserName.
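To illustrate how a typo produces a singleton variable and a logical error, here is a minimal sketch. The predicates logged_in, enabled, and active are hypothetical names chosen only for this example, and the two rules are alternatives, not meant to be loaded together.

```ergo
// Hypothetical facts, used only for illustration.
logged_in(alice).
enabled(bob).

// Version with a typo: ?Usr was meant to be ?User. ?Usr occurs exactly
// once, so the compiler flags it as a singleton variable. Logically, the
// rule now derives active(bob), because logged_in(?Usr) is satisfied by alice.
active(?User) :- logged_in(?Usr), enabled(?User).

// Corrected version: both subgoals constrain the same variable,
// so nothing is derived from the two facts above.
active(?User) :- logged_in(?User), enabled(?User).
```

Without the warning, the only symptom of the typo would be an unexpected answer to the query active(?X).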

Unsafe variable. This is a named variable that appears in a rule head, but not in its body. An unsafe variable rarely makes sense because it means that every possible constant or term can be substituted for such a variable. For instance, in

q(1).

p(?X,?Y) :- q(?X).

p(1,1), p(1,p(q,p(q))), p(1,p(?X)), p(1,?X(p(?Y),?Z)), p(1,q(?X(q,p(?Y)))), ... are all valid answers, which is unlikely to be the intent here. Although unsafe variables can be useful in some cases, their use is typically reserved for experts who clearly understand the consequences. To suppress such a warning, the don't care variable ? or a silent variable ?_Y should be used.
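If the unsafe variable was not intended, the usual repair is to bind it in the rule body instead of silencing the warning. A minimal sketch, in which the predicate r is a hypothetical addition to the example above:

```ergo
q(1).
r(2).    // hypothetical fact added for illustration

// ?Y now occurs in the rule body, so it is no longer unsafe:
// it can take only those values for which r(?Y) holds.
p(?X,?Y) :- q(?X), r(?Y).
```

With these facts, the only answer to the query p(?X,?Y) is ?X = 1, ?Y = 2.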

Symbols that appear in multiple contexts. Ergo recognizes several contexts for the symbols used in a knowledge base:
- HiLog function symbol of arity N (different arities mean different contexts; constants have arity 0)
- HiLog predicate symbol of arity N (again, the arity matters and propositions have arity 0)
- UDF (user defined function) symbol of certain arity
- Prolog symbol (predicate or function) of arity N.

If the same symbol occurs in more than one context, Ergo may either

- view this with suspicion and issue a warning (this type of warning can be suppressed with the :- symbol_context compiler directive), or
- issue an error.

Example. Compiling the following set of rules

p(?X,?Y,foo) :- q(?X).

q(?X) :- r(?X,?Y).

r(1,foo(2)).

foo(bar).

foo(bar,moo).

will produce the following warnings:

++Warning[Ergo]> [tmp$user.ergo] <Compiler> near line(1)/char(7) `?Y'

                  singleton variable: use ?_Var instead of ?Var, if not an error

++Warning[Ergo]> [tmp$user.ergo] <Compiler> near line(1)/char(7) `?Y'

                  unsafe variable in rule head (does not occur in rule body): use ?_Var, if not an error

++Warning[Ergo]> [tmp$user.ergo] <Compiler> near line(2)/char(16) `?Y'

                  singleton variable: use ?_Var instead of ?Var, if not an error

++Warning[Ergo]> [tmp$user.ergo] <Compiler> near line(3)/char(5) `foo'

                  HiLog function symbol was also used with a different number of arguments on line 1 in the same file

 If this is not an error, use the :- symbol_context directive to suppress this warning

++Warning[Ergo]> [tmp$user.ergo] <Compiler> near line(4)/char(1) `foo'

                  predicate symbol was also used as a HiLog function symbol on line 3 in the same file

 If this is not an error, use the :- symbol_context directive to suppress this warning

++Warning[Ergo]> [tmp$user.ergo] <Compiler> near line(5)/char(1) `foo'

                  predicate symbol was also used with a different number of arguments on line 4 in the same file

 If this is not an error, use the :- symbol_context directive to suppress this warning

Warnings 1 and 3 are singleton warnings, Warning 2 refers to an unsafe variable in rule 1, and the last three warnings flag symbols that occur in more than one context. The first three warnings can be handled by adding underscores or by using the don't care variable, while the last three call for the :- symbol_context directive. The full explanation of this directive is given in section "Controlling Context of Symbols" of the ErgoAI Reasoner User's Manual; here we just give an example. The general principle is that the offending occurrences of the symbols (i.e., the occurrences reported on the first line of each of the warnings 4-6) must be declared in the aforesaid directive. The following version corrects all these problems and yields a clean compilation:

:- symbol_context{

     foo/1,       // the functional occurrence of foo/1 reported in Warning 4;

                  // conflicts with the functional occurrences of foo/0 on line 1

     foo(?)@\@,   // the predicate occurrence of foo(bar) reported in Warning 5;

                  // conflicts with functional occurrence of foo/1 on line 3

     foo(?,?)@\@  // the predicate occurrence of foo(bar,moo) reported in Warning 6;

                  // conflicts with the predicate occurrence on line 4

}.

p(?X,?,foo) :- q(?X).  // takes care of Warnings 1, 2

q(?X) :- r(?X,?).      // takes care of Warning 3

r(1,foo(2)).

foo(bar).

foo(bar,moo).

Pausing and Resuming Execution

One can pause the execution by hitting Ctrl-C when the keyboard focus is over the terminal window (if Ergo runs as a command line application) or over the input text area in the Ergo Listener (in the studio mode). In the studio mode, execution can also be paused by clicking the Pause button. When the execution pauses, the following is displayed:

+++ quick summary:

       cpu: 77

       memory: 4.62GB

       calls made: 93119224

       derived facts: 35531

       total subgoals:  1853067

       active subgoals: 49093

       active recursive components: 2

       ratio of active components to active subgoals: 0.0000

+++ current operation is paused:

       \resume            - resumes

       \toplevel            - aborts,

       setruntime{...}   - sets time, memory, and term-depth limits

       showgoals{...}   - displays all subgoals currently being computed

Typing \resume. (don't forget the period '.' at the end!) will resume the computation; \toplevel. will abort the query and skip to the top-level prompt. While in the pause mode, one can use various debugging tools described in section "Debugging User Knowledge Bases" in the ErgoAI Reasoner User's Manual, but one cannot run regular Ergo queries. For instance, showgoals{}. will show all incomplete subgoals that have over 1000 calls or over 50 answers. This means that the engine is still trying to evaluate these goals and if it is taking unexpectedly too much time one can see the reasons. For one, one might discover subgoals that were not expected to have ever been called -- this is likely a bug in the user knowledge base. Even if all subgoals are expected, their presence among the incomplete subgoals after such a long time likely means that this is where most of the computational resources are consumed, so take another look at these subgoals: maybe they are written suboptimally.

As can be seen above, several important statistics are also shown during pauses in the execution. The ratio of the number of active recursive components to the number of active subgoals (the last statistic) is especially useful. This ratio is normally ≤ 1, but if it is very small (say, under 3%) and keeps falling during subsequent pauses, it is an indication that the rules are poorly structured. Consider restructuring the rules so that the recursive components form some kind of a hierarchy. The use of Ergo modules is strongly recommended in this case. Also, try to mix HiLog and frame-based representation. If the ratio stays approximately constant and both the number of active recursive components and the number of active subgoals climb steadily, it is an indication of an infinite recursion. More details are given in the User's Manual.
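The ratio on the last line of the summary can be recomputed by hand from the two lines that precede it. The sketch below uses the numbers from the sample pause output shown earlier; it assumes the query is typed at the regular Ergo prompt, since regular queries cannot be run in pause mode.

```ergo
// active recursive components / active subgoals, from the sample summary
?Ratio \is 2.0 / 49093.
```

The quotient is about 0.00004, which is why the summary displays 0.0000 at four decimal places. Expressed as a percentage, this is far below the 3% mark mentioned above.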

A related tool is showtables{}. It is similar to showgoals{} and produces similar statistics, but is used after the query finishes. This can give two kinds of hints: a possible bug or a possible inefficiency. As before, a bug may be lurking if unexpected subgoals show up in the output of showtables{}. Inefficiency should be suspected if some subgoals have an unusually high number of calls and/or answers.
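Put together, a pause-and-inspect session might look like the following sketch. The comments are annotations for the reader, and the form showtables{}. (with a trailing period, like showgoals{}.) is an assumption based on the other commands in this section.

```ergo
// Typed while the query is paused (after Ctrl-C or the Pause button):
showgoals{}.     // list the incomplete subgoals with many calls or answers
\resume.         // let the query continue

// Typed at the top-level prompt after the query has finished:
showtables{}.    // similar statistics for the completed query
```

If the pause reveals a runaway computation, \toplevel. can be typed instead of \resume. to abort the query.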

Real-time Monitoring

Ergo allows the user to set up three different monitors:

heartbeat
performance
extended

The heartbeat monitor just shows the time elapsed since the query started:

cpu = 26

This monitor lets one see if the query makes progress and, say, is not waiting for input. The performance monitor is more useful, as it shows memory consumption, the number of calls made, etc.:

cpu: 20s; memory: 1.38GB; calls made: 13965.4K; derived facts: 14.8K

The most useful is the extended monitor; it continuously shows the statistics similar to what we saw during the pauses in execution:

cpu: 17s

memory: 1.66GB

calls made: 22402.3K

derived facts: 15.9K

total subgoals: 466027

active subgoals: 33584

active recursive components: 2

ratio of active components to active subgoals: 0.0001

These statistics are analyzed the same way as in the previous section, with more details in section "Debugging User Knowledge Bases" in the ErgoAI Reasoner User's Manual.

The following command can be used to request runtime monitoring. In ErgoAI Studio, monitoring is also conveniently accessible from the Debug menu of the Ergo Listener.

setmonitor{Secs,Type}.

where Secs is the interval (measured in seconds) at which the statistics are displayed and Type can be heartbeat, performance, or extended. For instance:

setmonitor{3,extended}.

This will open a window in which the statistics are displayed, and subsequent queries will be monitored from now on. To turn monitoring off, type

setmonitor{0,?}.

One can also type simply

setmonitor{}.

and a window will pop up to enable the user to specify the monitoring parameters.
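A typical monitoring session, using only the command forms shown above, might look like the sketch below. The query long_query(?X) is a hypothetical placeholder for whatever query is being investigated.

```ergo
setmonitor{5,performance}.   // display statistics every 5 seconds
long_query(?X).              // hypothetical query to be monitored
setmonitor{0,?}.             // turn monitoring off again
```

Starting with the performance monitor and switching to extended only when memory or call counts look suspicious keeps the output easy to read.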

Tripwires

A tripwire is a condition that will cause the execution to pause, abort, or do something else. In Ergo, tripwires are set using the command setruntime{...} in the terminal or in the Studio Listener. In Studio, tripwires can also be set via the menu Debug > Set tripwire. The details of this command depend on the type of the tripwire. Tripwires can be set to do something after a set period of time, if memory consumption exceeds a set limit, if a subgoal call is made whose arguments contain too deeply nested function terms, or if an answer is produced and the depth of one of its arguments exceeds the set limit.

Here are some of the most useful timeout-based tripwires. In the studio, these tripwires can be conveniently set via the Debug menu.

setruntime{timeout(max(7,pause))}.

setruntime{timeout(max(44,abort))}.

The first will cause the execution to pause after 7 seconds and the second will abort it after 44 seconds. More complex timeouts are described in the User's Manual.

An example of a tripwire that is invoked if a deeply nested subgoal is called is:

setruntime{goaldepth(100,1000,abort)}.

This tripwire gets tripped if either a subgoal of depth 100 is called or a subgoal that contains a list of length 1000. In either case, the computation is aborted. A similar tripwire is

setruntime{answerdepth(100,1000,abort)}.

It is tripped if an answer of term depth 100 is generated or an answer that contains a list of length 1000. Finally, this tripwire

setruntime{memory(12)}.

will abort the computation if memory usage reaches 12 GB.
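As a sketch, several tripwires can be set up before running a suspicious query. The limits below are arbitrary illustrative values, long_query(?X) is a hypothetical query, and the example assumes that tripwires of different kinds can be in effect at the same time; the User's Manual should be consulted for how they interact.

```ergo
setruntime{timeout(max(30,pause))}.     // pause after 30 seconds
setruntime{memory(8)}.                  // abort when memory usage reaches 8 GB
setruntime{goaldepth(50,500,abort)}.    // abort on subgoals of depth 50 or with lists of length 500
long_query(?X).                         // hypothetical query being investigated
```

When the timeout tripwire pauses the query, the commands available in pause mode, such as showgoals{}., \resume., and \toplevel., can be used to decide what to do next.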

If Ergo runs on command line but in a "smart" terminal window (which is usually the case), a more convenient command is provided:

setruntime{}.

This pops up a graphical widget, shown below, which lets one specify the above parameters so one won't need to remember the exact syntax. The same widget pops up if the user chooses to specify tripwire limits using the Debug menu.

Catching Undefined Predicates and Methods

Suppose the knowledge base has the facts and the rules like this:

person({Bob,Kate}).

student({Mary,Jane}).

person(?X) :- student(?X).

If we now ask the query person(?X). we expect four answers, while for the query person(Mike). we expect the answer No.

What about the query homosapiens(?X)? If you try this query against the above information, the answer would also be No. But is this the intended answer? What if the knowledge engineer never intended to define the predicate homosapiens and this notion is not even in the ontology of the application domain? Or maybe the intended concept was HomoSapiens or homo_sapiens? If we just let such queries fail with no indication to the user, finding such mistakes in large knowledge bases would be an extremely costly and frustrating exercise. To make this less of a problem, Ergo allows the user to request strict checks for "undefinedness". A predicate (or method) foo is considered undefined if there are no rules or (asserted) facts that define foo, i.e., there are no rule heads or facts that mention foo. A class bar that has no members and no subclasses is also considered to be undefined.

In ErgoAI Studio, the Debug menu provides a convenient way to set the undefinedness checks. On the command line, these checks can also be set using the following command, if Ergo runs in a graphical mode (which is usually the case):

mustDefine{}.

This command will pop up a graphical widget, which will let the user enter the parameters for checking undefinedness without the need to remember the exact syntax of the commands explained below. On a dumb terminal, to request strict undefinedness checks, the user issues the query

Method[mustDefine(on)]@\sys.

This query can be issued interactively or be embedded in a file. To request strict undefinedness checks only in a certain module, MyMod, use

Method[mustDefine(on(MyMod))]@\sys.

If undefinedness checks are in effect (globally, as in the first case, or in a particular module, as in the second) and a query about an undefined concept (like homosapiens(?X)) is issued, Ergo will abort the query with an error and thus help the user locate the offending concept.

To turn these checks off, use one of the following queries (again, interactively or embedded in a file with ?- ...):

Method[mustDefine(off)]@\sys.

Method[mustDefine(off(MyMod))]@\sys.
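The following sketch shows how the checks might be used with the small knowledge base from the beginning of this section. The outcomes in the comments restate what is described above; the exact wording of the error message is not reproduced here.

```ergo
// --- In the knowledge base file ---
?- Method[mustDefine(on)]@\sys.    // embedded form, with ?-

person({Bob,Kate}).
student({Mary,Jane}).
person(?X) :- student(?X).

// --- Typed interactively after loading the file ---
person(Mike).        // person is defined, so the answer is simply No
homosapiens(?X).     // homosapiens is undefined: aborted with an error
Method[mustDefine(off)]@\sys.    // switch the checks off again
```

Embedding the mustDefine(on) query at the top of a file during development, and removing it before deployment, keeps the checks tied to the debugging phase.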

Sometimes we might want to perform the strict checks, but also to exempt some of the predicates. For instance, we might have a class, student(cse505), where cse505 is the code of a particular course, but because this course is not currently being offered, the class student(cse505) has no members and thus is undefined. Naturally, we don't want a query like ?X:student(cse505). to abort with an error, since we do know that this is a bona fide class in our ontology and not a mistake. To deal with such situations, Ergo provides the following form of the above method:

Method[mustDefine(off, ?:student(cse505))]@\sys.

Further details about this facility are found in section "Checking for Undefined Methods and Predicates" in the ErgoAI Reasoner User's Manual.

We should mention that strict undefinedness checking is turned off by default and must be requested explicitly. This is because these checks have adverse effect on performance and are considered a debugging feature. Once the knowledge base goes to production, any previously enabled undefinedness checks should be turned off.

Type Checking

In Ergo, frames can be optionally typed -- see Specifying Types in the Frame Syntax. Apart from various ontological uses, types allow us to verify that the data represented in the frame-based fashion is well-typed. To do this, Ergo provides the following method:

Type[check(?Frame,?Result)]@\typecheck.

What exactly is going to be checked depends on the exact form of the ?Frame argument (for which see section "Type Checking" in the ErgoAI Reasoner User's Manual). The argument ?Result will then be bound to pairs (data-frame,type-frame) that show examples of type violations. If there are no violations (the happy case), the above query fails. For concreteness, consider the file typecheck.ergo. This file has several type violations with respect to the type specification

employee[|salary(\integer){0..1}=>\integer|].

that appears in the file. What are they? To find out, let us ask the query (the exact forms that this query can have are described in the aforesaid section "Type Checking"):

Type[check(?[?->?],?Result)]@\typecheck.

The answer is below (where we formatted it for the sake of clarity):

?Result = [(${Jim[salary(2013)->big]@main}, ${Jim[salary(2013)=>'\decimal']@main}),

           (${Jim[salary(2013)->big]@main}, ${Jim[salary(2013)=>'\integer']@main}),

           ${Bill[salary(2015,bla)->50000]@main},

           ${James[salary(wrongyear)->50000]@main}]

The first two pairs state that Jim[salary(2013)->big]@main violates the two type constraints paired with it, salary(2013)=>'\decimal' and salary(2013)=>'\integer'. Indeed, big is neither a decimal nor an integer. The last two frames are singled out not because they violate some type but because there are no types given for them at all. (Note that neither Bill nor James is an employee, while the cardinality constraint was given only for the employee class.)

In addition, note that the method salary has not only an explicitly given type, but also a cardinality constraint stating that in each year there can be only one salary. Is it violated? ErgoAI provides a method to check this as well:

Cardinality[check(?E[salary(2015) => ?])]@\typecheck.

and the answer is ?E = Mary. Indeed, the Mary-object violates the cardinality constraint, as there are two salaries: 50000 and 60000. Details on cardinality checking are given in section "Checking Cardinality of Methods" in the ErgoAI Reasoner User's Manual.
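The contents of typecheck.ergo are not reproduced in this section. As an illustration only, the following hypothetical knowledge base contains data of the kind that the two checks above are meant to catch; the objects and values are assumptions modeled loosely on the answers discussed above, not the actual file.

```ergo
// Hypothetical data, NOT the contents of typecheck.ergo.
employee[|salary(\integer){0..1}=>\integer|].

Jim:employee.
Mary:employee.

Jim[salary(2013)->big].               // the value big is not an integer
Mary[salary(2015)->{50000,60000}].    // two salaries for the same year

// Queries in the forms shown above:
?- Type[check(?[?->?],?Result)]@\typecheck.
?- Cardinality[check(?E[salary(2015)=>?])]@\typecheck.
```

One would expect the first query to report the Jim frame and the second to return Mary, but the exact form of the output depends on the details described in the User's Manual.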

Analyzing Non-termination with Terminyzer

Infinite execution loops in evaluating complex logic queries are a fact of life and are unavoidable. In fact, even the very question of whether a given query is going to terminate is undecidable, meaning that no algorithm exists to tell the user whether the query terminates or not. Some ErgoAI queries (for instance, if no function symbols, arithmetic, or updates are used) are known to terminate, but this cannot be guaranteed in general. In all cases, non-termination happens due to the user's mistakes of one sort or another. Fortunately, ErgoAI comes with a powerful tool for detecting non-termination called Terminyzer, which helps the user find those mistakes.

ErgoAI Reasoner User's Manual has a chapter called "Non-termination Analysis," which explains the causes of non-termination in detail and shows how to use Terminyzer.

Here we will just give a brief summary and highlight the concepts, such as call abstraction and answer abstraction, which are further explained in the manual.

First, one must understand the two reasons for non-termination in logic queries:

Non-termination due to an infinite chain of calls of increasing size: This happens, for example, if the knowledge base has a rule and a query of the form

p(?X) :- p(f(?X)).

?- p(?X).

Clearly, this query is going to cause a chain of calls p(?X), p(f(?X)), p(f(f(?X))), ..., of increasing size, and this chain of calls will never terminate.

Non-termination due to an infinite number of answers of increasing size: This happens if the knowledge base has rules of the form

p(f(?X)) :- p(?X).

p(a).

?- p(?X).

Here the query will get an infinite number of answers p(a), p(f(a)), p(f(f(a))), etc.

Similar problems may arise due to the use of arithmetic operators. For instance:

p(?X) :- ?Y \is ?X+1, p(?Y).

?- p(1).

q(?X) :- q(?Y),  ?X \is ?Y+1.

q(1).

?- q(?X).

In the above cases, it is easy to find the mistakes just by eyeballing the rules, but usually the infinite loops are spread around many rules and many subgoals are involved in generation of infinite chains of calls or infinite sequences of answers. Terminyzer comes to the rescue.
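For the two arithmetic examples above, one common repair is to bound the recursion explicitly. The sketch below is one possible fix, not the only one; the bound 10 is an arbitrary value chosen for illustration.

```ergo
// Bounded variant of the first example: the guard ?X < 10 stops the
// chain of calls, and the fact p(10) ends the recursion.
p(10).
p(?X) :- ?X < 10, ?Y \is ?X+1, p(?Y).

// Bounded variant of the second example: the guard ?Y < 10 ensures
// that only finitely many answers can be produced.
q(1).
q(?X) :- q(?Y), ?Y < 10, ?X \is ?Y+1.
```

With these rules, the query p(1) involves only the calls p(1) through p(10), and the query q(?X) has exactly the ten answers 1 through 10.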

The general workflow is as follows:

The user starts Terminyzer either before issuing the query or while a suspicious query is paused. This is done via the command terminyzer{} issued from the terminal or the listener command line. It can also be started from the Studio Listener's menu Debug > Use Terminyzer. This will pop up a window, shown below, where the user can select the limits for the various tripwires; execution of the query will stop when one of the limits is reached. In that window, the user can also select call abstraction (see the manual for the explanations). As explained in section "Non-termination Analysis" of the ErgoAI Reasoner User's Manual, this choice may sometimes ensure termination, thereby solving the problem.

Next, the user should resume the query if Terminyzer was started from within a paused query; if Terminyzer was not started from within a pause, the user should issue the query to be analyzed.
After a while, the execution of the query will pause and Terminyzer will start the analysis. If it finds no infinite loops, the user will be given an option to give Terminyzer more time to detect non-terminating behavior. To this end, the user should tell Terminyzer to stick around and then resume the query. In the lucky case when Terminyzer finds a problem, a report will be shown in a pop-up window. In that case, the user should stop Terminyzer, probably also abort the query, and examine the report. The following picture shows a report that Terminyzer issued in response to query p(?X) for the following set of rules and facts:

p(a).

q(b).

p(f1(?X)) :- q(?X).

q(f2(?X)) :- p(?X).

As seen from the above picture, Terminyzer has detected an infinite answer-producing pattern caused by the rules on lines 3 and 4 in the knowledge base. Namely, answers to subgoal q(?A) in the rule on line 3 will cause larger answers to be generated in the head of that rule, p(f1(?X)). At the same time, that head is also a subgoal in the body of the rule on line 4, so a bigger answer to p(...) will cause the production of a still bigger answer q(f2(f1(?X))) for the head of that rule.

But that answer will cause the rule on line 3 to produce an even bigger answer for p(...), and so on.

Keep in mind that since the problem is undecidable, Terminyzer may not find a non-terminating cycle even if one exists, and it may report a cycle that in reality is a terminating one. In the latter case, however, the cycle is likely to be a performance bottleneck.
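The workflow described above can be sketched as the following two alternative sessions, using the query p(?X) from the example. The comments are annotations, and the trailing period after terminyzer{} is an assumption based on the other commands in this document.

```ergo
// Option 1: start Terminyzer first, then issue the query.
terminyzer{}.    // opens the window for choosing the tripwire limits
p(?X).           // the query to be analyzed

// Option 2: the query is already running and was paused with Ctrl-C.
terminyzer{}.    // start Terminyzer from within the pause
\resume.         // continue; the query pauses again when a limit is reached
```

In both cases, the analysis starts only after the query pauses at one of the selected limits.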

Further details can be found in the ErgoAI Reasoner User's Manual.

Other Tools

Additional tools are described in the ErgoAI Reasoner User's Manual, section "Debugging User Knowledge Bases." These include Table Dump, a tool similar to showtables{} that provides much more information and can be used for analysis of performance bottlenecks; a cardinality checker; two kinds of tracing; etc.

Considerations for Improving Performance

Although Ergo is always trying to automatically optimize performance of queries, this is a hard problem to solve in general. The user thus must be aware of a number of things, in order to achieve best reasoning performance. Section "Considerations for Improving Performance of Queries" of the ErgoAI Reasoner User's Manual discusses a number of such issues, including

Left-to-right evaluation of subgoals in queries and rule bodies
Nested frames and path expressions.
Unbound vs. bound variables in queries (the more variables are bound, the better). In this regard, the delay quantifiers must and wish can be helpful, as they can cause a delay in subgoal evaluation until certain variables get bound. Delay quantifiers are described in section "Rearranging the Order of Subgoals at Run Time" of the ErgoAI Reasoner User's Manual.
Check out also the production=on compiler option described in section "Miscellaneous Compiler Options" of that manual.
The section "Notes on Style and Common Pitfalls" is also well worth reading.
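To make the point about left-to-right evaluation and bound variables concrete, here is a minimal sketch. The predicates small, big_relation, result1, and result2 are hypothetical, and the sketch assumes that small has few facts while big_relation has many.

```ergo
// Hypothetical predicates: small/1 has few facts, big_relation/2 has many.

// Less efficient: big_relation is called with ?Y still unbound,
// so many candidate pairs are produced and then filtered by small(?Y).
result1(?X) :- big_relation(?X,?Y), small(?Y).

// Usually better: small(?Y) binds ?Y first, so each call to
// big_relation has its second argument bound.
result2(?X) :- small(?Y), big_relation(?X,?Y).
```

Both rules have the same answers; only the order in which the subgoals are evaluated differs. Whether the second ordering is actually faster depends on the data.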
