Business System

Global Context: Within a global context, all interactions happen without involving any IO. In the .NET world you can think of it as an AppDomain. In plain Java it is the boundary of a class loader. If you are using OSGi it gets slightly muddy, but it can still be defined as the boundary within which method calls take place without involving IO. Whether applications in different global contexts should share any "developed" source code is arguable, but we would argue against it. Any interaction with a global context from outside happens via the endpoints it exposes.

Business Component (or component): A business component is a deployment unit which provides business value and has a single global context. It can have uniquely addressable endpoints through which other components interact with it to exchange data.

Service: A service is a business component providing one or more endpoints to other components, which are not end users.

Application: An application is a business component providing value directly to business users by handling their requests and responding to them. An application can additionally perform the role of a service, exposing endpoints for other business components to interact with. An application can also be used by cross-business-component utilities such as monitoring, search and load balancers, which we will refer to as supporting functions.

A business system, logically, consists of one or more business components. In order to provide value to their consumers, components internally leverage (interact with) other business components. But reusing another business component also makes the consumer dependent on it. Hence looser coupling between consumer and provider components is very desirable. This coupling is really two-fold. Firstly, each subsequent delivery of the provider to production depends on the version of the consumer running in production. Secondly, at runtime the consumer requires the provider to be available to perform any dependent function. Here we will look at an architecture which reduces both couplings significantly.

Most people in software development gather very soon that loose coupling is a good idea. But in spite of that we do see tightly coupled business components. What are the common reasons for which this coupling is created?

Database based integration

Sometimes two business components interact with each other via an integration database, by reading data from and writing data to this shared database. Any change to the underlying schema, or to the semantics of the data stored for one component, can affect the other components as well. Through careful coordination or governance such cascading effects can be avoided, but even then the coupling can be a significant drag on development. Most often development teams just avoid making changes to shared data, which leads to sub-optimal database design and duplication.

Services as remote methods

The selling point of technologies like RMI, SOAP and CORBA was that accessing remote objects is as simple as invoking methods in local memory. Designing systems without understanding the implications of a network call is one downside, but it is not the only one. Thinking of services as remote methods implies that we do not think any differently about the contract between client and service than we would about a local method signature.

Synchronous communication

Applications which use synchronous communication are the easiest to understand, but this style also demands that the called component is available at all times. A system in which all the components talk to each other synchronously cannot easily afford any localized downtime.
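
To put a rough number on this, here is a minimal sketch of how availability compounds along a synchronous call chain. It assumes that component failures are independent, which real systems only approximate, and the figures used (five components at 99.9% each) are purely illustrative and do not come from any measured system.

```python
import math
from typing import Iterable


def chain_availability(availabilities: Iterable[float]) -> float:
    """Availability of a request that needs every listed component to be up.

    Assumes failures are independent, which real systems only approximate.
    """
    return math.prod(availabilities)


# Hypothetical figures: five components, each available 99.9% of the time.
combined = chain_availability([0.999] * 5)
print(f"{combined:.4%}")
```

Under these assumptions the chain as a whole is available roughly 99.5% of the time, noticeably less than any single component in it.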

To take us through the rest of this section of the book, we will take an example system and propose an architecture for it. Then we will look at the salient aspects of this architecture. In all fairness, I will not be presenting any new idea. Ideas like asynchronous processing, document-based contracts, decentralization of data, the end-to-end principle, late binding (of validation and serialization mechanisms) and small components have been around for a very long time. We will look at how these ideas can be used to design a system which has the properties that we value.

Let's take the example of a customer system for railways. Such a system can be distilled down to the following core activities:

Book seats

Cancel seats

Check reservation status (before and after booking)

Lookup timetable and fares

Get status of trains

Deliver tickets

There are other activities like accounting, discounts, payment processing and so on which are also important for this enterprise, but we have excluded these from the core activities for simplicity. Let's break this system into multiple business components, as services and applications.

Services

Booking Service: book, reserve, cancel seats and get reservation status

Timetable Service: train routes and station information

Fare service: manage and get fares

Schedule Service: arrival, departure, train cancellation status

Delivery Service: ticket formatting, send ticket via email

Customer Service: booking, delivery, refund status, preferences and payment

Payment Service: performs payment transactions

Applications

Public facing website

IVR

Call center

Let's look at the responsibilities of a service, using the booking service as an example. Let's say this service has the following three operations: book, cancel and enquire. It owns booking information in its own data store, which is not accessible to any other service. It queries the fare service to get the fares for a given journey route. It locally caches, in its database, the train routes published by the timetable service, in order to validate journeys against them. It commands the payment service to perform the payment. Lastly, it publishes booking information in its data feed as events. The delivery service is interested in this feed in order to deliver the ticket to the customer.

A diagram of booking service and timetable service. (show where consumed data feed is stored)

Every service in the system works on the same principles: a few coarse-grained operations, an owned database, a local cache of external data, command invocations and a data feed. Let's look at the important aspects of this architecture.

Component owned database

The database contains objects which are used by only one component. It also doesn't hold any referential relationship with other components' data objects. This implies that the only way to reach this data is by calling the owning component's programmatic interface. This exclusive ownership gives the component the flexibility to change the schema of its database without any worry of affecting other components in the system.

Query and Command

This is a traditional synchronous call made to another service to get some data from it or to issue a command to it. Ideally it should be avoided by replacing it with a read from the local cache, explained below, because it introduces a dependency between the provider service and its consumers, which slows down the evolution of the provider. Whether it can be avoided depends on the business requirements, though. Certain queries must be performed in real time so that current values are used, and similarly certain commands need to be executed in real time to ensure that they succeed.

Business Component Feed and Local Data Cache

A business component publishes events in the form of a chronological feed. Interested business components can read the data from this feed and save it to their own database, as a local data cache. They need not save all the fields or events, just the parts important to them. This allows every consumer to pick and choose the data that it needs.

The advantage of this approach is that the provider doesn't need to support multiple requested formats. It publishes in a single format, and the consumers can choose to consume it in whatever fashion they find appropriate. The dependent business components aren't affected by downtime of a service as long as its feed is available.
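
The feed and local cache described above can be sketched as follows. This is a minimal in-memory illustration, not a definitive implementation: the class names (Feed, RouteCache), the event type "route_published", the field names and the use of a sequence number as feed position are all assumptions made for the example. A real feed would be served over the network and the cache would live in the consumer's own database.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Feed:
    """Append-only, chronological list of events published by one component."""
    events: List[Dict[str, Any]] = field(default_factory=list)

    def publish(self, event_type: str, data: Dict[str, Any]) -> int:
        seq = len(self.events) + 1  # position in the feed, starting at 1
        self.events.append({"seq": seq, "type": event_type, "data": data})
        return seq

    def read_after(self, seq: int) -> List[Dict[str, Any]]:
        """Return every event published after the given sequence number."""
        return self.events[seq:]


class RouteCache:
    """Local cache kept by a consumer (here, hypothetically, the booking service)."""

    def __init__(self) -> None:
        self.last_seq = 0  # how far into the feed this consumer has read
        self.stops_by_train: Dict[str, List[str]] = {}

    def sync(self, feed: Feed) -> None:
        """Read only the events not seen yet and keep the fields we care about."""
        for event in feed.read_after(self.last_seq):
            if event["type"] == "route_published":
                data = event["data"]
                # Other fields in the event are deliberately not stored.
                self.stops_by_train[data["train"]] = list(data["stops"])
            self.last_seq = event["seq"]

    def serves(self, train: str, origin: str, destination: str) -> bool:
        """Validate a journey against the cache, without any remote call."""
        stops = self.stops_by_train.get(train, [])
        return (
            origin in stops
            and destination in stops
            and stops.index(origin) < stops.index(destination)
        )


timetable_feed = Feed()
timetable_feed.publish(
    "route_published",
    {"train": "T1", "stops": ["Alpha", "Beta", "Gamma"], "notes": "not cached"},
)
booking_routes = RouteCache()
booking_routes.sync(timetable_feed)
print(booking_routes.serves("T1", "Alpha", "Gamma"))
```

Because the consumer remembers how far it has read, it can catch up after its own downtime by calling sync again, and journey validation keeps working from the cache even while the provider is unavailable.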

Important aspects of this architecture

If we step back and look at it, we have basically created multiple copies of the same data within a system and defined architectural rules by which this can be made to work. One would like this data to be synchronized in near real time. But even when the data is not in sync, this isn't a terrible thing. In the real world we are used to things being slightly out of step with each other, as long as we see that they do synchronize eventually (think of changing DNS settings on the Internet). The word eventual here really means milliseconds in most cases.

In my experience, in most systems the number of query/command requirements is really small. Most integration can happen via the business component feed. For example, we can argue whether the booking service really needs to query the fare service or whether it can depend on its feed instead.

Minimized remote calls

Since most of the data required by a business component is available in its own database, most requests to it can be satisfied without calling out to other components. This helps in reducing the number of blocked threads in the system and hence increases the utilization of the resources of the platform on which it runs.

No centralized orchestration

The most obvious way of performing business logic which involves multiple business components is to have an orchestrator responsible for querying and updating them centrally. In this architecture there is no central orchestrator, as the individual business components are responsible for updating themselves based on events raised by others on their feeds. While this may sound like an enterprise service bus (ESB), there is a crucial distinction: the business logic is not centralized in an ESB but stays with the business components. There is a lot more here.
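
As a rough sketch of what the absence of an orchestrator looks like, the example below has the booking service publish an event and the delivery service react to it on its own. Nothing sits in between telling the delivery service what to do. The class names, the "seat_booked" event type, its fields and the plain list standing in for the feed are all hypothetical.

```python
from typing import Dict, List, Tuple


class BookingService:
    """Owns bookings and publishes what happened; knows nothing about delivery."""

    def __init__(self, feed: List[Dict[str, str]]) -> None:
        self.feed = feed  # a plain list stands in for the published feed
        self.bookings: Dict[str, Dict[str, str]] = {}

    def book(self, booking_id: str, email: str, train: str) -> None:
        self.bookings[booking_id] = {"email": email, "train": train}
        self.feed.append(
            {"type": "seat_booked", "booking_id": booking_id,
             "email": email, "train": train}
        )


class DeliveryService:
    """Decides by itself which events matter and what to do about them."""

    def __init__(self) -> None:
        self.position = 0  # number of feed entries already processed
        self.sent: List[Tuple[str, str]] = []  # (recipient, ticket text)

    def poll(self, feed: List[Dict[str, str]]) -> None:
        for event in feed[self.position:]:
            if event["type"] == "seat_booked":
                ticket = f"Ticket {event['booking_id']} for train {event['train']}"
                self.sent.append((event["email"], ticket))
        self.position = len(feed)


booking_feed: List[Dict[str, str]] = []
booking = BookingService(booking_feed)
delivery = DeliveryService()

booking.book("B1", "traveller@example.com", "T1")
delivery.poll(booking_feed)
print(delivery.sent)
```

The rule "a booked seat results in a delivered ticket" lives inside the delivery service, not in a central workflow, so adding another interested component means adding another reader of the feed rather than changing the booking service.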

Implementing idempotence

Rule object

Fine grained background job

Business component contract

How old events would be available in the feed?

Explain via code sample:

Starvation caused when implementing idempotent operations

Batch operations

No Data Services

Source code for: Reading the feed, contract for how long an item would remain in the feed, maintaining the local cache
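
One of the topics listed above, idempotent handling of feed events, might be sketched like this. It is only an illustration under stated assumptions: each event is assumed to carry a unique "id", the consumer is a hypothetical seat availability view, and the set of processed ids is kept in memory, whereas a real component would store it in its own database together with the data it updates.

```python
from typing import Any, Dict, Set


class AvailabilityView:
    """Hypothetical consumer keeping available seats per train from booking events."""

    def __init__(self, seats_by_train: Dict[str, int]) -> None:
        self.available = dict(seats_by_train)
        self.seen_event_ids: Set[str] = set()

    def apply(self, event: Dict[str, Any]) -> bool:
        """Apply an event once; return False if it was already applied."""
        if event["id"] in self.seen_event_ids:
            return False  # duplicate read of the same feed entry, ignore it
        if event["type"] == "seat_booked":
            self.available[event["train"]] -= event["seats"]
        elif event["type"] == "seat_cancelled":
            self.available[event["train"]] += event["seats"]
        self.seen_event_ids.add(event["id"])
        return True


view = AvailabilityView({"T1": 100})
booked = {"id": "evt-1", "type": "seat_booked", "train": "T1", "seats": 2}
view.apply(booked)
view.apply(booked)  # reading the same event again changes nothing
print(view.available["T1"])
```

With this in place a consumer can safely re-read a stretch of the feed after a crash, since applying the same event twice has the same effect as applying it once.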

Identifying applications and services

A typical application interacts with its users, uses a data store and interacts with other services. Each of these three characteristics of an application provides some clue about partitioning a system. Defining boundaries based on the recommendations below can be contradictory at times, and resolving the contradictions is more of an art.

Users

If an application caters to different classes of business users then maybe it is doing too much. Here we should differentiate between logins and users. We might create administrators who can manage users and their access. In such a case the user administrators are really operating the system and are not business users. But if they do perform business functions then they should be treated like regular users.

User workflows

An enterprise user might use multiple applications based on their needs. But on a regular basis these users perform a repeated set of tasks (a workflow). If a user needs to jump between various applications in order to perform these tasks, then it is really a smell, as it involves logging into multiple applications, entering or querying duplicate data, and maybe waiting for data to replicate between applications. While some of these problems can be alleviated using technology (single sign-on, serving the applications from the same base URL), others are rather difficult. More importantly, it points at an underlying issue: the scope of these applications can be changed to make the user experience better.

Interdependency

Quite often services are responsible for providing functionality over a set of data. When a service fetches data from another service and provides functionality over it, that functionality can be delegated to the other service. The "tell, don't ask" principle can be applied to services as well, implying that the service which has the data should provide the functionality on it. This principle helps in creating services which are not mere data services but business services.

There are some natural cues for breaking up a system, like the user base, their work processes, existing data stores and legacy systems. One is not always in control of how a system is broken up, as there are organizational (IT or enterprise) reasons which dictate this. The applications and services making up a system need to interact with each other.

Service Interface Design

Defining the schema and its validation in a declarative language helps in exposing the contract in the same language that is also used at runtime.
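
A minimal sketch of this idea follows. The schema is plain data that can be shown to consumers as the contract, and the same data drives validation at runtime. The field names, the rule vocabulary ("type", "required", "min") and the validate function are all invented for this example. In practice one would more likely use an established schema language than a hand-rolled one.

```python
from typing import Any, Dict, List

# Hypothetical contract for a booking request, expressed as data.
BOOKING_REQUEST_SCHEMA: Dict[str, Dict[str, Any]] = {
    "train": {"type": str, "required": True},
    "origin": {"type": str, "required": True},
    "destination": {"type": str, "required": True},
    "seats": {"type": int, "required": True, "min": 1},
    "email": {"type": str, "required": False},
}


def validate(document: Dict[str, Any], schema: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of problems; an empty list means the document is valid.

    Fields not mentioned in the schema are ignored rather than rejected.
    """
    errors: List[str] = []
    for name, rule in schema.items():
        if name not in document:
            if rule.get("required", False):
                errors.append(f"{name}: missing required field")
            continue
        value = document[name]
        if type(value) is not rule["type"]:
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: must be >= {rule['min']}")
    return errors


request = {"train": "T1", "origin": "Alpha", "destination": "Gamma", "seats": 0}
print(validate(request, BOOKING_REQUEST_SCHEMA))
```

Ignoring unknown fields is a deliberate choice in this sketch: it lets a provider add fields to a document without breaking consumers that validate against an older version of the contract.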

What is the right way to partition a system into application/services? How do we know whether applications/services are too small or too large?

Should we share a data store between multiple services/applications or not?

How should the applications and services interact with each other? Should we use synchronous or asynchronous communication between them?

How do we ensure data consistency across services and applications? When dealing with duplicate data how is data synchronized between different services?

How can we evolve a service without affecting all its clients?

....

Business Logs

Business logs are structured logs which track business events. These logs, containing success and failure messages from a component, capture only information which makes sense from the business perspective. Any business person should be able to look at these logs and understand them. As a result, the technical proceedings within a component are not logged here.
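
A business log entry could look something like the sketch below, which writes each event as one JSON line. The function name, the field names and the choice of JSON are assumptions for illustration. The point is only that every entry has the same structure and speaks in business terms (a booking, a train, a reason a customer would recognize) rather than in stack traces or thread names.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def business_log_entry(
    component: str,
    event: str,
    outcome: str,
    details: Dict[str, Any],
    timestamp: Optional[datetime] = None,
) -> str:
    """Build one structured business log line as a JSON string."""
    if outcome not in ("success", "failure"):
        raise ValueError("outcome must be 'success' or 'failure'")
    moment = timestamp or datetime.now(timezone.utc)
    entry = {
        "timestamp": moment.isoformat(),
        "component": component,
        "event": event,
        "outcome": outcome,
        "details": details,
    }
    return json.dumps(entry, sort_keys=True)


print(
    business_log_entry(
        "booking-service",
        "seat_booking",
        "failure",
        {"train": "T1", "seats": 2, "reason": "no seats available"},
    )
)
```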