A brief introduction to expert system rules in Drools

Expert system

An expert system, or knowledge-based system, is a computer system that makes decisions or solves problems in a particular field by applying knowledge and analytical rules defined by experts. It is made up of three parts: a knowledge base (the rules of the expert system, that is, the codified expert knowledge); a working memory, which stores the data received at the start of a problem, then the intermediate conclusions and the final results; and an inference engine, which models the human reasoning process. The next diagram represents the basic structure of an expert system.

Three examples of very well-known expert systems are CLIPS, JESS and DROOLS.

- Clips: In the mid-eighties, NASA needed expert systems to support its projects. A number of prototypes emerged, but their results were not good enough to fulfill internal requirements. Consequently, a new expert system prototype was developed: CLIPS (C Language Integrated Production System), whose main characteristic was its interoperability with other existing systems. Subsequent improvements and extensions have turned CLIPS into a point of reference for the development of other expert systems. Even though CLIPS has proven its productive capacity as an expert system and is now in the public domain, its interface with Java through JNI (Java Native Interface) is still in an experimental 0.2 beta phase.
- Jess: The JESS rule engine is a project that had its origin in CLIPS but was written entirely in Java. It was developed during the nineties at Sandia National Laboratories, and it shares several design concepts and syntax similarities with CLIPS.
- Drools: Like CLIPS and JESS, DROOLS is an implementation and extension of the Rete algorithm [17], designed by Dr. Charles L. Forgy at Carnegie Mellon University. The algorithm consists of a network of interconnected nodes whose characteristics depend on the rules that define them; each node evaluates its inputs and propagates the results to the next node when there is a match. DROOLS offers tools for integration with Java, scalability and a clear separation between data and logic. The IJA project incorporates DROOLS Expert and defines its rules in the MVEL scripting language.
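
The idea of nodes that test an input and pass it on only when it matches can be illustrated with a deliberately simplified sketch. This is not the Rete algorithm and not Drools code: a real Rete network shares nodes between rules and remembers partial matches. The class name, the two tests and the map-based facts are hypothetical.

```java
import java.util.Map;
import java.util.function.Predicate;

// Simplified illustration only: a linear chain of test nodes, NOT a real Rete network.
class NodeChainSketch {

    /** One node: a test plus a link to the node that receives matching facts. */
    static final class Node {
        final Predicate<Map<String, Object>> test;
        final Node next; // null marks the last node of the chain

        Node(Predicate<Map<String, Object>> test, Node next) {
            this.test = test;
            this.next = next;
        }

        /** Returns true only if the fact passes this node and every node after it. */
        boolean propagate(Map<String, Object> fact) {
            if (!test.test(fact)) {
                return false; // no match: propagation stops here
            }
            return next == null || next.propagate(fact);
        }
    }

    /** Builds a two-node chain: type == "Person", then age >= 18 (hypothetical tests). */
    static Node adultPersonChain() {
        Node ageNode = new Node(
                f -> f.get("age") instanceof Integer && (Integer) f.get("age") >= 18,
                null);
        return new Node(f -> "Person".equals(f.get("type")), ageNode);
    }

    public static void main(String[] args) {
        Node root = adultPersonChain();
        System.out.println(root.propagate(Map.of("type", "Person", "age", 30)));
        System.out.println(root.propagate(Map.of("type", "Person", "age", 12)));
    }
}
```

A fact that fails the first test never reaches the second node, which is the propagation behaviour described above in its most basic form.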

Rules

Knowledge is stored in the knowledge base as rules. Rules are a form of knowledge representation that expresses propositional and first-order logic in a concise, unambiguous and declarative manner. The brain of a Production Rules System is an Inference Engine that is able to scale to a large number of rules and facts. The Inference Engine matches facts and data against Production Rules (also called Productions or just Rules) to infer conclusions which result in actions. A Production Rule is a two-part structure, a condition and an action, that uses first-order logic for reasoning over knowledge representation.

Basic structure of a rule
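
The two-part shape of a production rule, and the way facts are matched against rules, can be sketched in plain Java. This is a minimal illustration under stated assumptions: it loops naively over all rules and facts, whereas a real inference engine uses an algorithm such as Rete. The class names, the sample rule and the integer facts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

class ProductionRuleSketch {

    /** A production rule: a condition (the "when" part) and an action (the "then" part). */
    static final class Rule<F> {
        final String name;
        final Predicate<F> condition;
        final Consumer<F> action;

        Rule(String name, Predicate<F> condition, Consumer<F> action) {
            this.name = name;
            this.condition = condition;
            this.action = action;
        }
    }

    /**
     * Naive matching loop: tests every fact against every rule and runs the
     * action on each match. Returns the names of the fired rules in firing order.
     */
    static <F> List<String> fireAll(List<Rule<F>> rules, List<F> facts) {
        List<String> fired = new ArrayList<>();
        for (F fact : facts) {
            for (Rule<F> rule : rules) {
                if (rule.condition.test(fact)) {
                    rule.action.accept(fact);
                    fired.add(rule.name);
                }
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        List<String> conclusions = new ArrayList<>();
        List<Rule<Integer>> rules = new ArrayList<>();
        // Hypothetical rule: when a temperature reading exceeds 38, record an alert.
        rules.add(new Rule<Integer>("high-temperature",
                t -> t > 38,
                t -> conclusions.add("alert at " + t)));

        List<String> fired = fireAll(rules, List.of(36, 40));
        System.out.println(fired);
        System.out.println(conclusions);
    }
}
```

The list of fired rule names doubles as a very small record of which decisions were taken and in what order.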

Advantages of a Rule Engine

- Declarative Programming:
  - Rule engines allow you to say "What to do", not "How to do it".
  - The key advantage of this point is that using rules can make it easy to express solutions to difficult problems and consequently have those solutions verified. Rules are much easier to read than code.
  - Rule systems are capable of solving very hard problems, providing an explanation of how the solution was arrived at and why each "decision" along the way was made (not so easy with other AI systems like neural networks, or with the human brain: I have no idea why I scratched the side of the car).
- Logic and Data Separation
  - Your data is in your domain objects, the logic is in the rules. This is fundamentally breaking the OO coupling of data and logic, which can be an advantage or a disadvantage depending on your point of view. The upshot is that the logic can be much easier to maintain as there are changes in the future, as the logic is all laid out in rules. This can be especially true if the logic is cross-domain or multi-domain logic. Instead of the logic being spread across many domain objects or controllers, it can all be organized in one or more very distinct rules files.
- Speed and Scalability
  - The Rete algorithm, the Leaps algorithm, and their descendants such as Drools' ReteOO provide very efficient ways of matching rule patterns to your domain object data. These are especially efficient when you have datasets that change in small portions, as the rule engine can remember past matches. These algorithms are battle-proven.
- Centralization of Knowledge
  - By using rules, you create a repository of knowledge (a knowledge base) which is executable. This means it's a single point of truth, for business policy, for instance. Ideally rules are so readable that they can also serve as documentation.
- Tool Integration
  - Tools such as Eclipse (and in future, Web based user interfaces) provide ways to edit and manage rules and get immediate feedback, validation and content assistance. Auditing and debugging tools are also available.
- Explanation Facility
  - Rule systems effectively provide an "explanation facility" by being able to log the decisions made by the rule engine along with why the decisions were made.
- Understandable Rules
  - By creating object models and, optionally, Domain Specific Languages that model your problem domain, you can set yourself up to write rules that are very close to natural language. Such rules are understandable to domain experts, who may be nontechnical, because they are expressed in the experts' own language, while the program plumbing and technical know-how stay hidden away in the usual code.

DROOLS

The Drools Rete implementation is called ReteOO, signifying that Drools has an enhanced and optimized implementation of the Rete algorithm for object-oriented systems. It combines entities from the object-oriented (OO) paradigm with rules so that they can interact transparently, as the next example illustrates.

Basic interactivity between (OO) Entities and Rules

Rule-1 analyzes Object-A and returns an Object-R with related attributes.

Sequence:

1. An Object-A is assigned to Rule-1.
2. Rule-1 analyzes the attributes of Object-A.
3. Rule-1 then returns an Object-R instance whose attributes are related to Object-A.

In Java source code:
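
A minimal sketch of this sequence in plain Java, without the Drools API. The class names mirror Object-A, Object-R and Rule-1 from the sequence; the attributes and the 500-line threshold are hypothetical and chosen only for illustration.

```java
class InteractionSketch {

    /** Object-A: the entity handed to the rule (hypothetical attributes). */
    static final class ObjectA {
        final String name;
        final int linesOfCode;

        ObjectA(String name, int linesOfCode) {
            this.name = name;
            this.linesOfCode = linesOfCode;
        }
    }

    /** Object-R: the result, with attributes derived from Object-A. */
    static final class ObjectR {
        final String sourceName;
        final String sizeCategory;

        ObjectR(String sourceName, String sizeCategory) {
            this.sourceName = sourceName;
            this.sizeCategory = sizeCategory;
        }
    }

    /** Rule-1: analyzes the attributes of an Object-A and returns a related Object-R. */
    static ObjectR rule1(ObjectA a) {
        // Hypothetical condition: more than 500 lines counts as "large".
        String category = a.linesOfCode > 500 ? "large" : "small";
        return new ObjectR(a.name, category);
    }

    public static void main(String[] args) {
        ObjectA a = new ObjectA("Parser", 800);   // step 1: Object-A is given to Rule-1
        ObjectR r = rule1(a);                     // steps 2 and 3: analyze and return Object-R
        System.out.println(r.sourceName + " is " + r.sizeCategory);
    }
}
```

In Drools itself the condition would live in a rule file rather than in a Java method, which is what keeps the logic separate from the domain objects.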

Drools rules

A rule has the following rough structure:

    rule "name"
        attributes
        when
            LHS
        then
            RHS
    end

The LHS (left-hand side) is the conditional part of the rule and follows a specific syntax. The RHS (right-hand side) is a block of dialect-specific semantic code that is executed when the conditions are met.

Decision tables for building Rules

Consider decision tables as a course of action if rules exist that can be expressed as rule templates and data: each row of a decision table provides data that is combined with a template to generate a rule.

Many businesses already use spreadsheets for managing data, calculations, etc. If you are happy to continue this way, you can also manage your business rules this way. This also assumes you are happy to manage packages of rules in .xls or .csv files. Decision tables are not recommended for rules that do not follow a set of templates, or where there are a small number of rules (or if there is a dislike towards software like Excel or OpenOffice.org). They are ideal in the sense that there can be control over what parameters of rules can be edited, without exposing the rules directly.

Decision tables also provide a degree of insulation from the underlying object model.

The key point to keep in mind is that in a decision table each row is a rule, and each column in that row is either a condition or action for that rule.
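
The "one row, one rule" idea can be sketched in plain Java. This is an illustration under stated assumptions rather than how Drools compiles spreadsheets: the two condition columns, the action column and the sample rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class DecisionTableSketch {

    /** One row = one rule: two condition columns and one action column. */
    static final class Row {
        final int minAge;          // condition column: age >= minAge
        final String category;     // condition column: category equals this value
        final int discountPercent; // action column: discount to grant

        Row(int minAge, String category, int discountPercent) {
            this.minAge = minAge;
            this.category = category;
            this.discountPercent = discountPercent;
        }
    }

    /** Hypothetical table contents; in practice these would come from .xls or .csv data. */
    static List<Row> sampleTable() {
        List<Row> table = new ArrayList<>();
        table.add(new Row(25, "comprehensive", 10));
        table.add(new Row(60, "comprehensive", 5));
        table.add(new Row(25, "third-party", 0));
        return table;
    }

    /** Checks every row against the input and collects the action of each matching row. */
    static List<Integer> evaluate(List<Row> table, int age, String category) {
        List<Integer> discounts = new ArrayList<>();
        for (Row row : table) {
            if (age >= row.minAge && row.category.equals(category)) {
                discounts.add(row.discountPercent);
            }
        }
        return discounts;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(sampleTable(), 65, "comprehensive"));
        System.out.println(evaluate(sampleTable(), 30, "comprehensive"));
    }
}
```

Because the matching logic is fixed and only the rows vary, a business user can change thresholds or add rows without touching the rule logic itself.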

Rule templates

Related to decision tables (but not necessarily requiring a spreadsheet) are "Rule Templates" (in the drools-templates module). These use any tabular data source as a source of rule data, populating a template to generate many rules. This allows for more flexible spreadsheets, and also for rules driven by data held in existing databases, for instance (at the cost of developing the template up front to generate the rules).

With Rule Templates the data is separated from the rule and there are no restrictions on which part of the rule is data-driven. So whilst you can do everything you could do in decision tables you can also do the following:

- store your data in a database (or any other format)
- conditionally generate rules based on the values in the data
- use data for any part of your rules (e.g. condition operator, class name, property name)
- run different templates over the same data

As an example, a more classic decision table is shown, but without any hidden rows for the rule meta data (so the spreadsheet only contains the raw data to generate the rules).

If this were a regular decision table, there would be hidden rows before row 1 and between rows 1 and 2 containing rule metadata. With rule templates the data is completely separate from the rules. This has two handy consequences: you can apply multiple rule templates to the same data, and your data is not tied to your rules at all. So what does the template look like?

Annotations to the preceding program listing:

- Line 1: All rule templates start with template header.
- Lines 2-4: Following the header is the list of columns in the order they appear in the data. In this case we are calling the first column age, the second type and the third log.
- Line 5: An empty line signifies the end of the column definitions.
- Lines 6-9: Standard rule header text. This is standard rule DRL and will appear at the top of the generated DRL. Put the package statement and any imports and global and function definitions into this section.
- Line 10: The keyword template signals the start of a rule template. There can be more than one template in a template file, but each template should have a unique name.
- Lines 11-18: The rule template - see below for details.
- Line 20: The keywords end template signify the end of the template.

The rule templates rely on MVEL to do substitution using the syntax @{token_name}. There is currently one built-in expression, @{row.rowNumber}, which gives a unique number for each row of data and enables you to generate unique rule names. For each row of data a rule will be generated with the values in the data substituted for the tokens in the template. With the example data above the following rule file would be generated:
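
The substitution step can be approximated in plain Java. This is a sketch of the idea only: it uses a regular expression instead of MVEL and is not the drools-templates implementation. The column names age, type and log come from the annotations above; the rule body, the sample row values and the row numbering starting at 1 are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateSketch {

    // Matches tokens of the form @{name}, including the built-in @{row.rowNumber}.
    private static final Pattern TOKEN = Pattern.compile("@\\{([A-Za-z_.]+)\\}");

    // Assumed rule body; only the column names are taken from the text.
    static final String RULE_TEMPLATE =
            "rule \"Cheese fans_@{row.rowNumber}\"\n"
            + "when\n"
            + "    Person(age == @{age})\n"
            + "    Cheese(type == \"@{type}\")\n"
            + "then\n"
            + "    list.add(\"@{log}\");\n"
            + "end\n";

    /** Replaces each @{token} with the matching value from one row of data. */
    static String expand(String template, Map<String, String> row, int rowNumber) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String token = m.group(1);
            String value = token.equals("row.rowNumber")
                    ? String.valueOf(rowNumber)
                    : row.get(token);
            if (value == null) {
                throw new IllegalArgumentException("No value for token: " + token);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical data rows, one generated rule per row.
        List<Map<String, String>> rows = List.of(
                Map.of("age", "42", "type", "stilton", "log", "Old man stilton"),
                Map.of("age", "21", "type", "cheddar", "log", "Young man cheddar"));
        for (int i = 0; i < rows.size(); i++) {
            System.out.println(expand(RULE_TEMPLATE, rows.get(i), i + 1));
        }
    }
}
```

Each row yields one rule whose name ends in the row number, which is what keeps the generated rule names unique.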

References

- "Enhancing Source Code Metrics Scope Through Artificial Intelligence", Agüero, Madou, Esperón, De Luise. CITSA 2010.
- Drools Expert User Guide: http://docs.jboss.org/drools/release/5.3.0.Final/drools-expert-docs/html_single/index.html
