
RETE Business Rule Engines



This page examines the fundamental functional / architectural differences between a Transaction Logic Engine and a Rete Decision Rules Engine.


It illustrates that the former is best suited for transaction processing, and suggests that a combined approach can be advantageous.





Abstract

At first glance, Rete engines and Business Logic Engines seem to have significant overlap:

  • They both provide derivation (inference) rules
  • They both operate on Java Beans (Domain Objects)

While the two are similar, this document explores the fundamental differences that suit them to different problems.

Rete-based inference engines provide valuable services that enable Business Users to manage (safely selected) elements of their system without needing to contact IT. While valuable in that sense, Rete engines are quite inappropriate for the transaction processing elements of Business Logic:

  1. Inadequate aggregate processing
    Rete engine interfaces do not presume a transaction whose rows can be compared to an existing database to compute changes, so they cannot use those changes to prune processing logic. To define a sum, they must therefore bring all the child rows into memory. This can be prohibitively expensive, particularly when such aggregates are Forward Chained on other aggregates.
  2. Inadequate integrity
    Unlike Spreadsheet Business Logic where rules are automatically invoked, Rete rules require explicit calls. Integrity that is elective is not reliable, and does not meet the requirements of regulatory compliance or system integrity
  3. Inadequate Expressive Power
    Rete engines do not have concepts like old values, so there is no natural way to express state transition logic. They also do not provide advanced logic such as copy and allocation
  4. Difficult to debug
    It can be difficult to debug Rete logic, since many implementations do not provide rule tracing / debugging

A powerful approach is to utilize both: RETE rules for (controlled) End User Access, and Business Logic to manage transactions. Coupled with a Process Engine for Workflow, and advanced User Interface tools to build screens, organizations can achieve 10-fold improvements in agility.


Performance

The sub-sections below explain why aggregates are vital to Business Logic, the performance challenges they present, and how the Business Logic Optimizer uniquely addresses the requirement for enterprise-class performance.


Aggregates are core business functions

Those experienced in transaction processing applications are aware that aggregates (sums, counts) are core to transaction processing - the "+" sign of Business Logic algebra. These are often called "rollups". To survey a few patterns / examples:

  1. See the data
    The simplest case is just to see the resultant data in real time. For example, you might want to see the total sales for each sales rep for each month.
  2. Constraint use
    More commonly, aggregates are referenced in Constraint processing. The classic example (illustrated below) is that
    • Customer Balance must not exceed the Credit Limit
    • the Balance is the Sum of the Order Totals
    • The Order Total is itself a Sum of the Item Amounts
      Balance is a Cascade Aggregate: an aggregate that depends on other aggregates
  3. Existence checks
    Counts are often used as existence checks, enabling you to constrain that Orders must have Items, or provide special handling for Orders that include promoted or restricted Products

  4. Presumed elements of more complex logic

    Logic that creates rows, such as Payment Allocation or the Bill of Materials Explosion, is dependent on aggregates to supply the remaining elements of logic


Aggregates and Business Logic

This diagram illustrates that aggregates are the heart of Business Logic.

The following sections describe

  • This is a database problem (not a CPU problem)

  • SQL optimization is...
    • Unaddressed by Rete Engines
    • Fully addressed by the Business Logic Optimizer



Database performance (more than CPU) is key

It is commonly presumed that RETE engines, in providing valuable flexibility, incur significant CPU penalties. While this can be an issue in some instances, a far more serious matter is database performance. Performance is particularly important in aggregate processing, which can involve significant amounts of data.

The following sub-sections introduce some typical database patterns, via some specific Use Cases. We will then investigate the SQL performance characteristics inherent in the alternative technologies.


Aggregate challenges: breadth, depth

Consider the familiar Customer / Order / Item example, as shown below. The diagram illustrates many Orders for a Customer, and many Items for an Order. The callouts reveal the challenges:

  1. Breadth: when the Customer Balance is required, it will not perform well to bring excessive Orders into memory, or to issue large aggregate queries
  2. Depth: even worse, this example illustrates a common Cascade Aggregate pattern where one aggregate (Customer Balance) depends on another (Order Total). Now, we need to consider the prospect of issuing N Item aggregate queries, where N is the number of Orders.

Aggregates in Typical Use Cases

Consider the diagram below, which must enforce the following underlying Business Logic:

  1. Customer Balance = Σ Order Totals
  2. Order Total = Σ Item Amounts
  3. Balance cannot exceed Limit
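
The three rules above can be written out as a minimal Java sketch. The class and method names (`CascadeAggregateSketch`, `balance()`, `withinLimit()`) are hypothetical and exist only for illustration; they are not part of any product API. The sketch deliberately uses the naive approach, recomputing each aggregate from its child rows on every call, which is exactly the cost the following sections discuss.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain objects for illustration only.
class CascadeAggregateSketch {

    static class Item {
        final long amount;
        Item(long amount) { this.amount = amount; }
    }

    static class Order {
        final List<Item> items = new ArrayList<>();

        // Rule 2: Order Total = sum of Item Amounts
        long total() {
            long sum = 0;
            for (Item item : items) sum += item.amount;
            return sum;
        }
    }

    static class Customer {
        final long creditLimit;
        final List<Order> orders = new ArrayList<>();

        Customer(long creditLimit) { this.creditLimit = creditLimit; }

        // Rule 1: Balance = sum of Order Totals.
        // This is a cascade aggregate: each Order Total is itself a sum.
        long balance() {
            long sum = 0;
            for (Order order : orders) sum += order.total();
            return sum;
        }

        // Rule 3: Balance cannot exceed Limit
        boolean withinLimit() { return balance() <= creditLimit; }
    }

    // Builds a customer with limit 100 and two orders totalling 50 and 40.
    static Customer sampleCustomer() {
        Customer customer = new Customer(100);
        Order first = new Order();
        first.items.add(new Item(30));
        first.items.add(new Item(20));
        Order second = new Order();
        second.items.add(new Item(40));
        customer.orders.add(first);
        customer.orders.add(second);
        return customer;
    }
}
```

Note that every call to `balance()` walks every Item of every Order. In memory this is cheap; against a database, each level of the walk becomes a query.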


Now, consider the following Use Cases:


Each Use Case is listed with its proper SQL handling and its poor SQL handling:

  1. Adjust Limit
    • Proper SQL handling: No aggregate – compare to the maintained Balance
    • Poor SQL handling: Recompute all data in the rule; for the Balance:
      • Balance: Read Orders, sum Totals
      • But, Total depends on Items, so read Items for each Order to compute Total (cascade aggregate)
  2. Adjust Order Date
    • Proper SQL handling: No aggregate – no data has been altered that affects the Balance, so the application can “prune” all related processing
    • Poor SQL handling: Same as above
  3. Insert New Item
    • Proper SQL handling: No aggregates; instead, adjust the Order Total and Customer Balance
    • Poor SQL handling: Same as above

Of special note is the contrast between 1 update SQL vs. N+1 aggregate SQLs. Such "cascade aggregate" cases are quite common. There are even worse examples - consider a Department rollup of budgets. Without proper aggregate handling, this would entail N queries, resulting in loading the entire Department table into memory.
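
To make the difference in statement counts concrete, here is a small sketch that simulates both approaches against an in-memory stand-in for a database. Everything in it is an assumption made for illustration: the class `AggregateCostSketch`, its maps, and the `statements` counter are hypothetical, and each counter increment merely represents one SQL statement that a real system would issue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a database; each simulated SQL statement bumps a counter.
class AggregateCostSketch {
    final Map<Integer, List<Long>> itemAmountsByOrder = new HashMap<>(); // Item rows
    final Map<Integer, Long> storedOrderTotal = new HashMap<>();         // persisted aggregate
    long storedBalance = 0;                                              // persisted aggregate
    int statements = 0;

    void addOrder(int orderId) {
        itemAmountsByOrder.put(orderId, new ArrayList<Long>());
        storedOrderTotal.put(orderId, 0L);
    }

    // Poor handling: re-read everything.
    // One query for the Orders, plus one Item aggregate query per Order (N+1).
    long recomputeBalance() {
        statements++; // select the Orders for the Customer
        long balance = 0;
        for (List<Long> amounts : itemAmountsByOrder.values()) {
            statements++; // select sum(amount) from the Items of this Order
            for (long amount : amounts) balance += amount;
        }
        return balance;
    }

    // Proper handling: adjust the stored aggregates by the inserted amount.
    void insertItemWithAdjustment(int orderId, long amount) {
        itemAmountsByOrder.get(orderId).add(amount); // the insert itself, not counted as overhead
        statements++; // update the Order: total = total + amount
        storedOrderTotal.put(orderId, storedOrderTotal.get(orderId) + amount);
        statements++; // update the Customer: balance = balance + amount
        storedBalance += amount;
    }
}
```

With five Orders, each inserted Item costs two single-row updates under adjustment, while one recomputation of the Balance costs six queries (one for the Orders, five for their Items). The gap widens as the number of Orders grows.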


Core Underlying Difference: Rete / Logic Engines

For All Changes

This fundamental distinction is represented in our logo: for all changes.  Rete engines do not incorporate the concept of changes to an existing database.

Before we discuss how the Business Logic Engine provides such optimizations, we must briefly consider the fundamental differences in how Rete / Logic engines are invoked, and interact with a database. The diagram below depicts the core underlying difference between a Rete Engine and a Logic Engine:

  • Rete - rows independent of a database

    1. Applications explicitly invoke Rete Engines by passing a set of data, and the name of a Rule Set
      As further discussed below, this raises an integrity issue, since the application might forget, or call the wrong Rule Set
    2. The Rete Engine updates the supplied data per rule execution
      • As noted in the callout, there is no concept of a database or a transaction. This provides flexibility, but results in significant loss of functionality for database processing applications, since the engine cannot persist the data, nor can it detect changes relative to the existing data


  • Logic - rows, as changes to a presumed database

    1. The input is only the changed data
      In the case of Spreadsheet Business Logic, there is not really a direct API - programs submit changes to Hibernate (as usual), and Hibernate events automatically invoke the logic as shown here. This declarative encapsulation assures re-use, so that the proper logic is applied to every transaction
    2. The Logic Engine (through Hibernate) implicitly understands there is a backing database / transaction, so can provide state transition logic based on comparing proposed / existing ("old") values
      In this example, we might want to take special actions if an Order increases by a given amount, or, a more popular example, assuring raises are always at least 10%:
      Constraint error when: isChanged("salary") && employee.salary < employee_old.salary * 1.1
    3. The Logic Engine can also use these old values to process changes to data that is related to supplied data, such as adjusting the customers' balance
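
The salary constraint above can be sketched in plain Java once both the old and the proposed row are available. This is an illustrative assumption about how such a check might look, not the engine's implementation; `StateTransitionSketch`, `Employee`, and `violatesRaiseRule` are hypothetical names.

```java
// Hypothetical sketch of a state transition check using old and new row values.
class StateTransitionSketch {

    static class Employee {
        final long salary;
        Employee(long salary) { this.salary = salary; }
    }

    // Mirrors: isChanged("salary") && employee.salary < employee_old.salary * 1.1
    static boolean violatesRaiseRule(Employee oldRow, Employee newRow) {
        boolean salaryChanged = newRow.salary != oldRow.salary;
        // Integer form of newSalary < oldSalary * 1.1, avoiding floating-point rounding.
        return salaryChanged && newRow.salary * 10 < oldRow.salary * 11;
    }
}
```

The key point is the second parameter: without access to the old row, the rule cannot be expressed at all, which is why an engine that receives only a set of current objects has no natural place for it.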


Business Logic Optimizer Reduces / Eliminates SQLs

Business Logic provides the proper SQL handling noted above. The underlying enabling technology is that Business Logic does not simply respond to sets of objects (as a Rete engine does), but analyzes the submitted changes relative to an existing database, as illustrated above.

Business Logic Optimizer provides this analysis, including:

  1. Change Based Pruning
    In the first two Use Cases, the system detects there are no changes to data used for aggregates, so existing stored values are safe to use without recomputation, for transactions both in the aggregate source (changing Order Date) and target (changing Customer Limit). The entire aggregate query (or queries) is eliminated.
  2. Adjustment
    Since the system knows the old values, it can simply make a single row update to adjust the aggregate by the difference. So, inserting an Item adjusts the Order, whose Business Logic detects this change and adjusts the Customer Balance. Again, multiple aggregate queries are eliminated, reduced to 2 single row updates.
  3. Single-Pass Rule Processing
    Rete engines rely on optimizing multiple rule evaluations - for all the rules in a rule set.  Transaction engines can organize rules by domain object and pre-determine their dependencies, enabling them to make a single (pruned) execution of rules.

In short, the system automatically Minimizes / Eliminates SQL overhead, based on transaction analysis.
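
Change Based Pruning and Single-Pass Rule Processing can be sketched together. The sketch below assumes that each derived attribute declares which attributes it reads, and that the rules are already listed in dependency order; both the metadata format and the names (`PruningSketch`, `DEPENDS_ON`, `rulesToRun`) are hypothetical and are not taken from the Business Logic Optimizer itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: decide which rules must run for a given set of changed attributes.
class PruningSketch {

    // Rule metadata: derived attribute -> the attributes it reads.
    // Assumption: entries are listed in dependency order (a rule appears after its inputs).
    static final Map<String, Set<String>> DEPENDS_ON = new LinkedHashMap<>();
    static {
        DEPENDS_ON.put("Order.total",
                new HashSet<>(Arrays.asList("Item.amount")));
        DEPENDS_ON.put("Customer.balance",
                new HashSet<>(Arrays.asList("Order.total")));
        DEPENDS_ON.put("Customer.creditCheck",
                new HashSet<>(Arrays.asList("Customer.balance", "Customer.creditLimit")));
    }

    // Returns the rules that must run, in order, for the changed attributes.
    static List<String> rulesToRun(Set<String> changed) {
        Set<String> dirty = new HashSet<>(changed);
        List<String> toRun = new ArrayList<>();
        // A single pass suffices because the rules are in dependency order.
        for (Map.Entry<String, Set<String>> rule : DEPENDS_ON.entrySet()) {
            if (!Collections.disjoint(rule.getValue(), dirty)) {
                toRun.add(rule.getKey());
                dirty.add(rule.getKey()); // the derived value may change in turn
            }
        }
        return toRun;
    }
}
```

Under these assumptions, changing only the Order Date selects no rules at all, changing the Credit Limit selects only the credit check against the stored Balance, and changing an Item Amount selects the full chain in one ordered pass.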


Like a Query Optimizer

The examples above leverage the decision to persist the aggregate values - physically store them in the database. In fact, this is not a requirement. You can specify that aggregates are transient, in which case the system – specifically the Business Logic Optimizer – concludes that the aggregate query must be run whenever the value is needed.

Our Best Practice recommendation is to persist aggregates, since they eliminate costly queries. We provide both alternatives because performance implications might not be clear at the start of a project, and because database schemas are sometimes "locked down" since they may be accessed by other applications. In any case, one important requirement is that you be able to make a choice, and then change your mind later as conditions unfold...

The result is analogous to a relational database index. You build your logic (or database retrieval) at a logical level, with the ability to change the physical structure as necessary without affecting programs / logic already written ("data independence").

Contrast this to hand-specified logic, where the decision to adjust or perform aggregation might not be clear until the system is built and in performance testing, and affects hundreds of thousands of lines of code. With Business Logic, you simply make the Hibernate transient attribute persistent, without altering your logic. The Business Logic Optimizer will automatically utilize adjustment logic.


Integrity

Rete: Integrity depends on explicit calls

The diagram below shows that utilization of a Rete engine requires specific calls to be introduced into your application. For tools that can directly utilize Hibernate, this is an unfortunate element of additional work. But worse than that, it makes the assurance of database integrity dependent on the proper coding – every time – of Rete invocations. Elective logic is not integrity.



Business Logic encapsulated to guarantee integrity

By contrast, the diagram below illustrates that Business Logic is not called directly by your code, but by Hibernate (events) as transactions are processed. This means

  1. Code reduction – not only is your logic automated by rules, but the Business Logic Framework assures they are automatically invoked

  2. Guaranteed Integrity – perhaps even more important, you can be certain your logic is executed, since it is no longer the responsibility of each Developer. In a very fundamental sense, this Declarative Encapsulation provides Compliance Assurance.
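
The encapsulation idea can be illustrated with a small plain-Java sketch. It does not use the Hibernate event API; the `Store` class below is a hypothetical stand-in for a persistence layer, and `SaveListener`, `Account`, and `storeWithCreditRule` are likewise invented for illustration. The point is only that the logic is attached to the save path once, so individual callers cannot forget to invoke it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: logic attached to the save path rather than called by each application.
class EncapsulatedLogicSketch {

    interface SaveListener {
        void beforeSave(Account row);
    }

    static class Account {
        final long balance;
        final long creditLimit;
        Account(long balance, long creditLimit) {
            this.balance = balance;
            this.creditLimit = creditLimit;
        }
    }

    // Stand-in for a persistence layer (not a real ORM class).
    static class Store {
        private final List<SaveListener> listeners = new ArrayList<>();
        final List<Account> saved = new ArrayList<>();

        void register(SaveListener listener) { listeners.add(listener); }

        // Every save passes through the listeners; the caller cannot skip them.
        void save(Account row) {
            for (SaveListener listener : listeners) listener.beforeSave(row);
            saved.add(row);
        }
    }

    // Registers the credit limit constraint once, for all callers.
    static Store storeWithCreditRule() {
        Store store = new Store();
        store.register(new SaveListener() {
            @Override
            public void beforeSave(Account row) {
                if (row.balance > row.creditLimit) {
                    throw new IllegalStateException("Balance exceeds credit limit");
                }
            }
        });
        return store;
    }
}
```

Application code simply calls `save`. A row that violates the constraint is rejected before it is stored, regardless of which screen, service, or batch job submitted it.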



Expressive Power

Important as aggregates are, there are other critical common patterns. These are provided as part of Business Logic, as part of its focus on transaction processing.


State Transition Logic

As noted above, Business Logic is driven by detecting changes to the existing database state. This information is made available to your Business Logic, so you can specify constraints or derivations based on specified changes (everyone’s favorite: Employee Salary must be at least 10% higher than old Employee Salary).


Deep Copy

Services are provided for Deep Copy. So, it takes a single rule to clone an order and its items. In fact, a full Bill of Materials explosion is implemented with a half-dozen rules.
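
For readers unfamiliar with the term, the following sketch shows what a deep copy of an Order and its Items means when written by hand. It is not the Deep Copy service itself, whose rule syntax is not shown here; `DeepCopySketch` and its members are hypothetical names used only to illustrate the behavior a single rule is said to replace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written equivalent of cloning an Order together with its Items.
class DeepCopySketch {

    static class Item {
        final String product;
        int quantity;

        Item(String product, int quantity) {
            this.product = product;
            this.quantity = quantity;
        }

        Item copy() { return new Item(product, quantity); }
    }

    static class Order {
        final String customer;
        final List<Item> items = new ArrayList<>();

        Order(String customer) { this.customer = customer; }

        // Copies the Order and each child Item, so the copy shares no mutable state.
        Order deepCopy() {
            Order copy = new Order(customer);
            for (Item item : items) copy.items.add(item.copy());
            return copy;
        }
    }
}
```

Changes made to the copied Order or its Items leave the original untouched, which is the property that distinguishes a deep copy from copying only the parent row.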


Allocation

A very common pattern is to allocate goods to a set of recipients. For example:

  • Allocate a Payment to a set of outstanding orders
  • Allocate a Bonus to a set of Employees

Allocation logic is provided that automates these examples with 2-3 rules.
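
As an illustration of the pattern (not of the provided allocation rules, whose syntax is not shown here), the sketch below allocates a Payment across outstanding Orders in list order. The names `AllocationSketch`, `Order`, and `allocate` are hypothetical, and the "first in the list gets paid first" policy is an assumption; other policies, such as proportional allocation, are equally plausible.

```java
import java.util.List;

// Hypothetical sketch of allocating a payment to a set of outstanding orders.
class AllocationSketch {

    static class Order {
        final long amountOwed;
        long amountPaid;

        Order(long amountOwed) { this.amountOwed = amountOwed; }

        long outstanding() { return amountOwed - amountPaid; }
    }

    // Allocates the payment across orders in list order.
    // Returns the remainder that could not be allocated.
    static long allocate(long payment, List<Order> orders) {
        long remaining = payment;
        for (Order order : orders) {
            if (remaining == 0) break;
            long applied = Math.min(remaining, order.outstanding());
            order.amountPaid += applied;
            remaining -= applied;
        }
        return remaining;
    }
}
```

Each allocated amount changes an Order's paid amount, which in a rule-based system would in turn adjust any aggregates that depend on it. This is why allocation is described as depending on aggregates for its remaining logic.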


Integrated Constraint Handling

Constraint processing is a first-class citizen in Business Logic. The system provides services to identify multiple constraint violations for a single transaction, and produce an exception that can be sent along for End Users including the values that resulted in the violations.
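
A minimal sketch of the idea follows: all constraints are evaluated, and a single exception carries every violation together with the offending values. The exception class and method names here (`ConstraintException`, `check`) are hypothetical and are not the product's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: report every constraint violation for a transaction at once.
class ConstraintCollectionSketch {

    static class ConstraintException extends RuntimeException {
        final List<String> violations;

        ConstraintException(List<String> violations) {
            super(violations.size() + " constraint violation(s): " + violations);
            this.violations = Collections.unmodifiableList(new ArrayList<>(violations));
        }
    }

    // Evaluates all constraints before failing, instead of stopping at the first one.
    static void check(long balance, long creditLimit, int itemCount) {
        List<String> violations = new ArrayList<>();
        if (balance > creditLimit) {
            violations.add("balance " + balance + " exceeds credit limit " + creditLimit);
        }
        if (itemCount == 0) {
            violations.add("order has no items");
        }
        if (!violations.isEmpty()) {
            throw new ConstraintException(violations);
        }
    }
}
```

Collecting violations before throwing lets the End User correct everything in one round trip rather than discovering the problems one at a time.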


Ease of Debugging

A common lament associated with Rete use is "what is it doing? how did it get that answer?". This can result in frustration, and loss of time.

Since these concerns relate to any automation service, Business Logic was designed to address Logic Debugging directly:

  1. Logic Logging - all rules are logged, including the state of the Domain Objects
  2. Logic Debugging - you can use a Java debugger to stop in your rule, inspect your Domain Objects, and step through if-conditions
  3. Logic Events - you can collect the logic events and log them for subsequent analysis


Logic and Rete

In the final analysis, the best choice is to employ both technologies, utilizing each for the elements of your system to which it is best suited:

  1. Business Logic: the vast majority of transactional processing systems will greatly benefit from Business Logic
    • Transactional processing systems accept data (e.g., an interactive application, or a message) to update a database
    • Typical examples:
      • Financial systems
      • Inventory management
      • Project management
      • Order processing
      • etc...
  2. Rete Decision Logic: can complement Business Logic
    Provides value in End User enablement, and when decision results are not necessarily saved into a database
    • Typical examples
      • Invoking a Decision Table
      • Insurance underwriting
      • Credit rating
      • Complex tax calculations 