Introduction

Note: Please see Appjangle.com for the latest version of what the Nx Framework has become.

Welcome to this introduction to network-oriented design and the Nx Framework!

Both network-oriented design and the linnk.it cloud are based on my PhD thesis. My thesis started out with the question of how to design better systems for knowledge-intensive work; specifically, how to design systems that help us connect the dots of fluid, collective knowledge in one integrated space rather than isolating them in incompatible applications.
 
Connecting and sharing diverse pieces of information between distributed partners is actually quite a fundamental task in software development. I therefore believe that most of the developed tools, techniques and services can be a valuable addition to many development projects (and not just those for specific knowledge-intensive tasks).

To support the vision of more interconnected applications, I provide a Java framework which is intertwined with a powerful and high-performance cloud service. I am happy to invite you to use this service and the associated framework and explore the concepts portrayed in this introduction. You could do me a great favor (and help my research tremendously) by trying the framework out.

This introduction is organized in four sections, each discussing one level of network-oriented design. For each level, I'll discuss the motivating problems, the proposed network-oriented solution, as well as possible advantages and tradeoffs of adopting the proposed solution. The four levels are:
 
Level 1: Loose Coupling
Level 2: Seamless Sharing
Level 3: Emergent Validation
Level 4: Collaborative Interfaces

If you want to dive right into the code, you can also check out the examples or download the portable linnk.it cloud library.

Level 1: Loose Coupling

Motivation

Loose coupling is one of the core principles of good software design. It is argued both to increase reusability and to reduce complexity in large software systems. Essentially, components are loosely coupled if they can be changed without affecting each other. Although this principle has been around for a long time, modern practices like test-driven development and the increasing complexity of systems have emphasized the necessity of loose coupling.

Loose coupling, however, is often difficult to practice. I'll illustrate this with a small sample scenario. I'll expand on this scenario in the later parts of this tutorial, but please bear with the fact that it is very simple for now. Let's assume a user drops by and poses the following problem.

"I often get deliveries sent back from the shippers our customers contract because they are too heavy for them to load on their truck. Especially when an order contains both our TC17-D CNC Lathe, weighing just under 2300 kg, along with the TC280 Heavy Duty Grinder, weighing about 1500 kg. But there are also other combinations of products which end up being too heavy. So I would really like some kind of system that tells me the total weight of an order so that I know which orders I need to split."

Although there are many ways to design an application to fulfil this user's needs, I'll do a quick object-oriented analysis and implement a small, trivial application based on this analysis.

Analysis: From the user's description, I understand that we have to consider the entities 'orders', 'order items' and 'products'. To keep it simple here, I will stick with 'orders' and 'order items' for now. For 'orders', we need to keep a list of 'order items' (since an order can have more than one item) and be able to calculate the total weight of the order. For order items, we need to keep track of the product as well as its weight. See the simple class diagram below. However, please note that this is not a perfect model and I'm sure there are many other ways to model this application!

Initial class diagram

Implementation: Implementing the application based on the developed model is rather straightforward and requires only a few lines of code. The source code given below is slightly shortened but you can check out the full source code here.

class Order {
  List<OrderItem> items; // [1]

  int getTotal() {
    int total = 0;
    for (OrderItem item : items) { // [3]
      total = total + item.weight; // [2]
    }
    return total;
  }
}

class OrderItem {
  String productId;
  int weight;
}

To come back to the concept of loose coupling, we can even in this trivial example observe a number of possibly undesirable tight couplings:
  1. Order maintains a list of order items. For this purpose, it has a direct dependency on the class OrderItem.
  2. The method getTotal() accesses the weight of each order item and is therefore also dependent on the class OrderItem.
  3. The method getTotal() also accesses the field items, which holds the list of order items for the class Order. It is therefore dependent on the class Order (apart from also being a member of this class). This last 'coupling' is indeed a desired property of object-oriented programming.
The figure below illustrates these dependencies between three essential components of the given example application: the class Order, the class OrderItem and the method getTotal(). One might in addition argue that there is a bi-directional dependency between the class Order and its method getTotal(). We omit this dependency here, since it can also be argued that the class Order can be instantiated without the definition of the enclosed method.
Couplings in Scenario
There are a number of techniques we can employ to 'loosen' the couplings between these classes. First and foremost, it seems appropriate to introduce an interface abstracting the access to the class OrderItem. Moreover, techniques like dependency injection and inversion of control enable us to manage the coupling between classes more dynamically (Spring framework, Google Guice, JSR-299, ...). However, introducing interfaces and using dynamic dependency injection does not necessarily reduce the complexity of our system. In fact, as illustrated below, the introduction of interfaces and dependency injection actually increases the number of dependencies between our classes to a total of four.
Couplings using Dependency Injection
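
To make the interface-based variant a little more tangible, here is a minimal sketch in plain Java. It uses manual constructor injection instead of a container such as Google Guice, and it is not the source code measured later in this introduction. Only the names Order, OrderItem and OrderItemImpl come from the text; getWeight(), the constructors and the class InjectionSketch are assumptions made for illustration.

```java
import java.util.List;

// Abstraction that Order depends on instead of a concrete item class.
interface OrderItem {
    int getWeight();
}

// Concrete item; the fields mirror the original OrderItem class.
class OrderItemImpl implements OrderItem {
    private final String productId;
    private final int weight;

    OrderItemImpl(String productId, int weight) {
        this.productId = productId;
        this.weight = weight;
    }

    String getProductId() {
        return productId;
    }

    @Override
    public int getWeight() {
        return weight;
    }
}

// Order only knows the OrderItem interface; the items are handed in from outside.
class Order {
    private final List<OrderItem> items;

    Order(List<OrderItem> items) {
        this.items = items;
    }

    int getTotal() {
        int total = 0;
        for (OrderItem item : items) {
            total = total + item.getWeight();
        }
        return total;
    }
}

public class InjectionSketch {
    public static void main(String[] args) {
        // Manual wiring stands in for what a dependency injection container would do.
        List<OrderItem> items = List.of(
                new OrderItemImpl("Flower Pot Set", 150),
                new OrderItemImpl("Old Ship Anchor", 100));
        System.out.println(new Order(items).getTotal()); // prints 250
    }
}
```

Note that Order no longer mentions OrderItemImpl, but the interface and the wiring code are additional components that have to be maintained.
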
Although the application in the figure given above has looser coupling between the classes Order and OrderItemImpl, it is not necessarily less complex. If we measure complexity as the number of involved components and the connections between them, the complexity has indeed increased. One cause for this increase in complexity might be that many approaches to reduce coupling do not directly address two fundamental sources of tight couplings between classes in object-oriented design:
  1. Dependencies between objects are established using attributes. For instance, using the class OrderItem for the definition of the attribute items of the class Order tightly couples the classes Order and OrderItem.
  2. Data and operations are tightly coupled, since they are encapsulated together in a class. For instance, the method getTotal() and its implementation are tightly coupled with the attributes of the class Order. Indeed, the tight coupling between data and methods is one of the foremost principles of object-oriented design.
Inspired by these observations, one objective of network-oriented design is to find a way to develop loosely coupled applications without increasing the complexity of these systems.

Network-Oriented Approach to Loose Coupling

I have argued that object-oriented design has a number of properties that make it difficult to build loosely coupled systems. True, one might contend that loose coupling is not necessary on the level of classes and objects, and should rather be pursued in the realm of components, modules and services. However, if it were possible to build loosely coupled classes without significantly impacting the complexity and performance of our system, there would be very few reasons not to do so.

The starting point for the network-oriented approach to loose coupling is that many complex systems can essentially be seen as a number of components and their interconnections. In consequence, from the perspective of network-oriented design, systems are seen as a collection of nodes (representing components) and connections between these nodes. Nodes are represented by atomic entities (text, numbers, bytes, objects, ...); connections are managed by dynamic networks.

An important difference between traditional object-oriented analysis and modelling a scenario from a network-oriented perspective is that object-oriented analysis requires us to begin with the abstract (classes), while in network-oriented analysis we begin with the concrete (entities). This might require a bit of adjustment at first, since we are trained to jump automatically from the plain facts presented to useful abstractions. This can often be helpful in designing systems but can also be detrimental; for instance, speaking in abstractions can complicate communication with end users (who are not as trained in thinking in software abstractions)!

If we take a naive look at the facts supplied in the scenario, we can identify the following network-oriented model: First, there are a number of straightforward atomic values: the weight values (150 kg, 100 kg) and the products ("flower pot set", "old ship anchor"). The weight values can be expressed as numbers (node type [I]), while the product names can be expressed as text (node type [S]).
First simple version of network model
After having established the atomic values, connections must be established involving these values. In this example, we use a very simple but common form of connection: aggregation. In order to aggregate two nodes, a third generic node (a node without value but just with an identification, type [ ]) must be introduced and the aggregated nodes connected to this node. For instance, we can introduce the generic node with the (arbitrary) identification pot and attach the value nodes [S] "Flower Pot Set" and [I] 150. 
Network model with aggregation
Following the described pattern of aggregation, we introduce a generic node anchor aggregating the remaining two atomic values. Both pot and anchor can be aggregated into the generic node order. As in any form of abstract modelling, the resulting network model shown below is not the one and only way to model the scenario described by the user. However, we will use this model as a valid starting point.
Naive network design
Sticking with this possibly imperfect model for now, we can easily implement it using the Network Extension (Nx) language. This language is provided as part of the linnk.it framework and allows us to write plain old Java code (it only requires linnk.it.micro-core.jar on the classpath):

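Network n = Nx.newNetwork();

// Generic nodes: identification only, no value
Object order = Nodes.node();
Object pot = Nodes.node();
Object anchor = Nodes.node();

// Place the order node in the network
Nx.put(order).in(n);

// Aggregate both items under the order
Nx.append(pot).to(order).in(n);
Nx.append(anchor).to(order).in(n);

// Attach the product names (text nodes)
Nx.append("Flower Pot Set").to(pot).in(n);
Nx.append("Old Ship Anchor").to(anchor).in(n);

// Attach the weights (number nodes)
Nx.append(150).to(pot).in(n);
Nx.append(100).to(anchor).in(n);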

This implementation is what I would call a 'pure' network-oriented solution. However, what makes the network-oriented approach so interesting (from my point of view!) is that it can be seamlessly mixed with object-oriented code. Such a 'hybrid' implementation of the scenario could look as follows (with the unmodified original OrderItem class):

Network n = Nx.newNetwork();
        
Object order = Nodes.node();
        
OrderItem pot = new OrderItem();
pot.productId = "Flower Pot Set";
pot.weight = 150;
        
OrderItem anchor = new OrderItem();
anchor.productId = "Old Ship Anchor";
anchor.weight = 100;
        
Nx.put(order).in(n);
Nx.append(pot).to(order).in(n);
Nx.append(anchor).to(order).in(n);

An important difference to object-oriented design that can be observed in the examples above is that the dependencies between nodes are managed externally. So while in the original example, Orders are aware of which OrderItems are associated with them (through the attribute items), the order node does not 'know' that the nodes pot and anchor are attached to it. Only the Network class is aware of these dependencies. This external management of dependencies between objects makes it possible to build very lightweight and independent classes. For instance, using a network-oriented design approach, the initial order example can be rewritten as follows:

class Order { // [1]

}

class OrderItem {
  String productId;
  int weight;
}

static int getTotal(Network network,
                    Object order) {  // [2]

  int total = 0;
  List<OrderItem> items = Nx.getAll(OrderItem.class)
                            .from(order).in(network); // [3]

  for (OrderItem item : items) {
    total = total + item.weight;
  }
  return total;
}

There are a number of notable changes to the original code:
  1. The class Order has been significantly simplified. Both the list of OrderItems and the method for calculating the total have been removed from this class. This removes the tight coupling between the classes Order and OrderItem as well as the tight coupling between the class Order and the algorithm to calculate the total weight of all order items.
  2. The method to calculate the total weight of all order items, getTotal(), has been changed into an independent component. (Note that in a production system, this routine would most likely be embedded in a simple class rather than being a static method.) As a result of this change, the tight coupling between the class Order and the method getTotal() has been removed.
  3. In order to access the list of items, which was previously accessed as an attribute of the class Order, a statement in the Network Extension (Nx) language has been added. Essentially, the statement Nx.getAll(OrderItem.class).from(order).in(network) retrieves a list of all children of the node order of type OrderItem that are defined in the network passed as a parameter of the method.
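
To illustrate what 'managing dependencies externally' can mean in code, here is a small self-contained sketch built on plain Java collections. It is emphatically not the Nx implementation: the class SimpleNetwork and its methods put, append and getAll are hypothetical stand-ins that only mimic the idea that the network, and not the nodes, knows which nodes are connected.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a network; NOT the Nx implementation.
class SimpleNetwork {
    // Parent node -> attached child nodes, keyed by object identity.
    private final Map<Object, List<Object>> children = new IdentityHashMap<>();

    void put(Object node) {
        children.computeIfAbsent(node, k -> new ArrayList<>());
    }

    void append(Object child, Object parent) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // Returns all children of parent that are instances of the given type.
    <T> List<T> getAll(Class<T> type, Object parent) {
        List<T> result = new ArrayList<>();
        for (Object child : children.getOrDefault(parent, List.of())) {
            if (type.isInstance(child)) {
                result.add(type.cast(child));
            }
        }
        return result;
    }
}

// Order holds no reference to its items; the network does.
class Order {
}

class OrderItem {
    final String productId;
    final int weight;

    OrderItem(String productId, int weight) {
        this.productId = productId;
        this.weight = weight;
    }
}

public class ExternalDependenciesSketch {
    // Independent routine: depends on OrderItem and the network, not on Order.
    static int getTotal(SimpleNetwork network, Object order) {
        int total = 0;
        for (OrderItem item : network.getAll(OrderItem.class, order)) {
            total = total + item.weight;
        }
        return total;
    }

    public static void main(String[] args) {
        SimpleNetwork n = new SimpleNetwork();
        Order order = new Order();
        n.put(order);
        n.append(new OrderItem("Flower Pot Set", 150), order);
        n.append(new OrderItem("Old Ship Anchor", 100), order);
        System.out.println(getTotal(n, order)); // prints 250
    }
}
```

The sketch uses an identity-based map so that two distinct nodes are never confused with each other, even if they happen to be equal by value; whether Nx makes the same choice is not something this introduction states.
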

Advantages and Tradeoffs

In the revised version of the order example, the static dependencies between the domain classes could be significantly reduced. The figure below illustrates that the changes described above remove the dependencies between Order and OrderItem as well as the dependency between getTotal() and Order. These are precisely the dependencies we identified above as resulting from fundamental principles of object-oriented design.
Couplings using linnk.it
In comparison, the number of direct dependencies between the components of the simple system is lower using a network-oriented design approach (linnk.it) than in vanilla Java or using dependency injection (Google Guice).
Another rough measure of complexity we can employ is the number of lines of code (LoC) necessary to implement the same application. The naive Java implementation (with tight couplings) has 34 LoC. The implementation using the linnk.it framework is slightly larger at 43 LoC. The implementation using dependency injection is significantly larger at 75 LoC. For fairness' sake, all source files have been formatted using Eclipse's source code formatting (Ctrl+Shift+F). However, please feel free to check out the source code yourself. I am happy about any recommendations to reduce the source size of any of the examples.
Reducing the complexity of an application, in terms of fewer tight couplings between classes and small, lean code, often comes with a tradeoff: a significant decrease in application performance. A classic example of such a tradeoff was the discussion of Java vs. compiled languages (C) performance in the early days of Java development.

To check that the network-oriented implementation does not come with a significant performance penalty for the given application, I have conducted a simple performance test. For this test, an order is created, two items are attached and the total weight of these items is calculated. This is repeated 1,000,000 times. Unsurprisingly, the vanilla Java implementation performs significantly better than the dynamic implementations using the linnk.it framework and Google Guice. Therefore, the plot below only shows the execution time in seconds for 1,000,000 iterations using the Google Guice framework and the linnk.it framework. Feel free to check out the detailed results as well as the source code used to perform the measurements.

Chart1
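
For readers who want to reproduce a comparable measurement, the procedure described above could be sketched roughly as follows for the vanilla Java variant. This is not the measurement code linked above; the class WeightBenchmarkSketch, the warm-up run and the checksum are my assumptions for illustration, and a simple timing loop like this only gives a coarse indication.

```java
import java.util.ArrayList;
import java.util.List;

public class WeightBenchmarkSketch {

    static class OrderItem {
        final String productId;
        final int weight;

        OrderItem(String productId, int weight) {
            this.productId = productId;
            this.weight = weight;
        }
    }

    static class Order {
        final List<OrderItem> items = new ArrayList<>();

        int getTotal() {
            int total = 0;
            for (OrderItem item : items) {
                total = total + item.weight;
            }
            return total;
        }
    }

    // Creates an order, attaches two items and sums their weights, repeated
    // 'iterations' times. The returned checksum ensures the result is used.
    static long run(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            Order order = new Order();
            order.items.add(new OrderItem("Flower Pot Set", 150));
            order.items.add(new OrderItem("Old Ship Anchor", 100));
            checksum += order.getTotal();
        }
        return checksum;
    }

    public static void main(String[] args) {
        run(100_000); // warm-up run before timing
        long start = System.nanoTime();
        long checksum = run(1_000_000);
        long elapsedNanos = System.nanoTime() - start;
        System.out.printf("checksum=%d, elapsed=%.3f s%n", checksum, elapsedNanos / 1e9);
    }
}
```

For more reliable numbers, a dedicated benchmarking harness would be preferable to a hand-written loop.
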

I hope the discussion above gives a good overview of how network-oriented design, and linnk.it in particular, can help to build loosely coupled systems with only little impact on the complexity and performance of an application. Of course, in the given trivial example, I would wholeheartedly opt for an implementation in vanilla Java; it offers not only the most compact source code but also unmatched performance. However, the implementation using linnk.it allowed us to remove two of the three tight couplings between the domain-specific components of the application. At the scale of a large system, this reduction in tight couplings can significantly increase the flexibility, reusability and maintainability of our system.

Apart from reducing coupling, the network-oriented approach offers a number of further advantages, which I will discuss in the following sections. Since the discussion in this introduction focuses mostly on the motivation and possible advantages of network-oriented design and the linnk.it cloud, I have written a number of complementary short tutorials, which explain how to get started following simple step-by-step instructions.

Tutorial 1: Getting Started

Tutorial 2: Network-oriented Design with linnk.it

Level 2: Seamless Sharing

Motivation

At its core, the linnk.it cloud library makes it possible to establish and manage dynamic connections between classes and objects using the simple Network Extension (Nx) language. However, enabled by the simplicity of the network-oriented design approach, the framework offers a number of higher-level features. These higher levels all follow the same guiding principle employed for the first level: to identify the essence of the problems we deal with in software design and to offer simple solutions to these problems.

While the first level focuses on the problem of building loosely coupled systems in order to reduce complexity and increase re-usability, the second level of network-oriented design aims at enabling the seamless sharing of complex data structures in distributed systems.

To illustrate the motivation on this level, I'll first expand the sample scenario described in the first part of this introduction. It is likely that a number of systems are involved in the scenario the user described. Let's assume we are given the following IT infrastructure:

Systems and their roles
Order Management Server
  • We receive most orders electronically from our customers. Each order contains the external customer id, the product ids of the ordered items as well as the requested quantity. The order management server processes these text files and stores them in a database. All orders are initially defined as 'pending'.
  • The order management server further provides a service, which will return a list of all currently pending orders.
  • Finally, the order management server allows customers to request the current status of their orders.
Material Management Server
  • The material management server maintains a list of all products and materials handled by our organization.
  • This system provides a service to check whether a product id is valid.
  • This system further provides a service, which allows us to determine the weight of a product given a valid product id.
Weight-check User Client
  • This command line client outputs a list of all currently pending orders. For each order, the client calculates the total weight of all ordered items.
Customer Client
  • This command line client outputs a list of pending orders for a given customer id.

Both the systems and the associated data are, of course, still not nearly as complex as we would expect them to be in a real-world application. However, the scenario has become significantly more challenging to implement.

Again my main focus will lie on the complexity of the implementations we can devise for this scenario. Whereas in the first part, I considered classes and objects, for this level, I will focus on the dependencies between systems. There are a number of dependencies as illustrated below:
Dependencies between systems
  1. The customer client requires the order management server in order to place orders and to check their status.
  2. The order management server requires the material management server to check the validity of product ids provided by clients.
  3. The weight-check client needs to request the currently pending orders from the order management server.
  4. And, the weight-check client needs to inquire the weight of products from the material management server.
These dependencies of course depend to some degree on the way we chose to implement our requirements. For instance, there would be fewer dependencies if we merged the order management server and material management server into one system. However, for this scenario we take it as given that it is an external restriction from our organization that these servers are separated.
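
To make dependencies 3 and 4 concrete, here is a minimal sketch of how the weight-check client might depend on both servers. All names in it (MaterialService, OrderService, PendingOrder, OrderLine, WeightCheckClient and their methods) are hypothetical, the servers are replaced by in-memory stand-ins, and the product weights are rounded from the user's description.

```java
import java.util.List;
import java.util.Map;

// Hypothetical contract of the material management server.
interface MaterialService {
    boolean isValidProductId(String productId);
    int getWeightInKg(String productId);
}

// Hypothetical contract of the order management server.
interface OrderService {
    List<PendingOrder> getPendingOrders();
}

// One ordered item: product id plus requested quantity.
class OrderLine {
    final String productId;
    final int quantity;

    OrderLine(String productId, int quantity) {
        this.productId = productId;
        this.quantity = quantity;
    }
}

class PendingOrder {
    final String customerId;
    final List<OrderLine> lines;

    PendingOrder(String customerId, List<OrderLine> lines) {
        this.customerId = customerId;
        this.lines = lines;
    }
}

// In-memory stand-in for the remote material management server.
class InMemoryMaterialService implements MaterialService {
    private final Map<String, Integer> weights;

    InMemoryMaterialService(Map<String, Integer> weights) {
        this.weights = weights;
    }

    @Override
    public boolean isValidProductId(String productId) {
        return weights.containsKey(productId);
    }

    @Override
    public int getWeightInKg(String productId) {
        Integer weight = weights.get(productId);
        if (weight == null) {
            throw new IllegalArgumentException("Unknown product id: " + productId);
        }
        return weight;
    }
}

// The weight-check client needs both services to do its job.
class WeightCheckClient {
    private final OrderService orders;
    private final MaterialService materials;

    WeightCheckClient(OrderService orders, MaterialService materials) {
        this.orders = orders;
        this.materials = materials;
    }

    int totalWeight(PendingOrder order) {
        int total = 0;
        for (OrderLine line : order.lines) {
            total += materials.getWeightInKg(line.productId) * line.quantity;
        }
        return total;
    }

    void printPendingOrders() {
        for (PendingOrder order : orders.getPendingOrders()) {
            System.out.println(order.customerId + ": " + totalWeight(order) + " kg");
        }
    }
}

public class SystemDependenciesSketch {
    public static void main(String[] args) {
        MaterialService materials =
                new InMemoryMaterialService(Map.of("TC17-D", 2300, "TC280", 1500));
        PendingOrder order = new PendingOrder("customer-1",
                List.of(new OrderLine("TC17-D", 1), new OrderLine("TC280", 1)));
        OrderService orders = () -> List.of(order);
        new WeightCheckClient(orders, materials).printPendingOrders(); // prints "customer-1: 3800 kg"
    }
}
```

In a real deployment, each interface would hide a remote call, so every method signature shown here is part of a contract between separately deployed systems.
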

While I have outlined in the first part that dependencies between classes should be avoided if possible to keep the complexity of systems in check, I want to note here that each of the dependencies between these systems carries much higher complexity costs than a mere dependency between classes. Two important reasons for this (among others) are:
  • If we need to change a class or a dependency between classes, we can usually recompile our application easily to reflect the change in all parts of the application. Distributed systems, in contrast, can usually not be changed in one go. For instance, if we change the order management server, we cannot assume that we can change the customer client at the same time (and vice versa).
  • There is an intrinsic complexity embedded in each connection in a distributed system. It is much more difficult to send information between distributed systems than between objects in an application. This is reflected in the complexity of communication standards such as SOAP (although it says it is a "lightweight protocol" ...).
Consequently, while it is often relatively simple to adapt an independent application to changing requirements, it is often much more difficult for a distributed system.

For instance, after implementing a first version of the distributed system outlined above, we might be requested to implement a 'simple' change:

The user has found out that being able to calculate the total weight of orders alone is actually of little help. Since the customers contract the shippers (EX-Works), only the customer knows the maximum weight for the shipment. Therefore, the user suggests that the customer should be able to specify a maximum shipping weight (which can be seen on the weight-check client). Furthermore, to be safe, the customer should also be able to check the total weight of orders in case they forgot to specify the maximum shipping weight. Due to various firewall restrictions, customers can only be granted access to the order management server.

To implement this simple change, a number of changes are necessary in the original distributed system:
  1. The customer client needs to be changed in order to enable the customer to specify a maximum weight as well as to check the total weight of orders they have submitted.
  2. In order to provide the required functionality, the customer client needs to request more information from the order management server. Consequently, the dependency between customer client and order management server needs to be changed (e.g. change service interface).
  3. The order management server was initially ignorant of the weight of orders and items. In order to supply this information to the customer client, the order management server needs to be changed.
  4. Previously, the order management server only communicated with the material management server to check the validity of product ids. Now, it will also have to request the weight of items.
  5. The weight-check client needs to be changed in order to output the weight limits possibly provided by customers.
  6. These limits must also be requested from the order management server, requiring an adjustment of the dependency between the weight-check client and order management server.
Necessary Changes due to Changed Requirements

I have observed before that there are a number of intrinsic properties of object-oriented design which prevent us from building truly loosely coupled systems. I would argue that there are also a number of factors in the way we design distributed systems which make it difficult to build them to be adaptive and of reasonable complexity:
  • Firstly, we often think of systems from a service-centric rather than a data-centric perspective. As a result, the data of integrated systems is often poorly integrated.
  • As a result of this, not unlike objects, which encapsulate behavior and data, most systems (servers in particular) hold both logic and data. 'No!', you might say, 'I store my data in the database.' However, from an outside perspective, the application server abstracts the data away from us, so it does not matter much whether they are one system or not.

Most applications require that information is available in a number of locations. For instance, if we built an application to display orders, their items and their weight, the information describing these orders could be required in the following locations (see figure below):
  1. To display the order information to the user, most likely as some kind of text or other visualization.
  2. To manipulate information or perform computations with the information, it needs to be held in a local memory.
  3. To store information persistently, it needs to be placed on a local hard disk or similar device.
  4. To exchange information with remote partners, it needs to be sent to other locations, for instance a server.
Replication of Information
Usually, we employ tailor-made technologies to shift data from one of these places to another:
  • MVP frameworks are employed to exchange information between application memory and user interface.
  • Persistence technologies such as relational database management systems are used to store information from the application's memory on a hard disk.
  • Technologies such as remote procedure invocation or remote method invocation are used to send information from a local application to a remote location such as a server.
Although these technologies are arguably very different in nature, they, in essence, fulfill a similar task: taking a particular piece of information in one location and replicating it in another location. For instance, if we have an application managing orders, the information these orders entail would first be defined in the application's memory on a user's workstation. If we chose to persist this information on a central database server, we would most likely first call a service on an application server (Servlet, REST, SOA, ...) and submit the order information. The application server, in turn, would send a request to the database, which will persist the order.
When we save one order to the hard disk, some of the order information managed in memory (not necessarily all) will be replicated on the hard disk. The figure below illustrates this principle.

The motivation for the second level of network-oriented design, seamless sharing, is the observation that although the sharing or replication of information is a fundamental common task regardless of the involved media, current technologies offer specialized solutions depending on the involved media; possibly unnecessarily increasing the complexity of applications.

Network-Oriented Approach to Seamless Sharing

The motivation for network-oriented sharing is that many current approaches to shifting information from one location to another are message-based. Network-oriented sharing, in contrast, is synchronization-based.

Distributed System using Nx


Essential Replication of Information
Extra:

Advantages and Tradeoffs

low tech: Java RMI
high tech: Java Message Service, SOA, REST
super-high tech: Amazon S3, Google Cloud Storage

Level 3: Emergent Validation

Motivation

Validation comes in many flavors in modern software systems. Essentially, validation tells us if some piece of information is correct or incorrect. However, validation has a strict brother: typing; most commonly known by its controversial extremes 'static typing' and 'dynamic typing'/'weak typing'.

How is typing related to validation? While validation tells us in general if some information is correct or incorrect, typing can tell us whether a piece of source code is correct or incorrect; if we understand a piece of source code as a special case of information, typing is a special case of validation.
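
The relation between typing and validation can be sketched in a few lines of Java. The example below expresses a type check as one validation rule among others and then adds a domain rule that a type alone cannot express. The class ValidationSketch and the rule names are assumptions made for illustration and are not part of the linnk.it framework.

```java
import java.util.function.Predicate;

public class ValidationSketch {

    // A type check expressed as a validation rule: is the value an Integer at all?
    static final Predicate<Object> IS_INTEGER = value -> value instanceof Integer;

    // A stricter domain rule: the value must be an Integer and greater than zero.
    // The second test only runs if the first one passed, so the cast is safe.
    static final Predicate<Object> IS_POSITIVE_WEIGHT =
            IS_INTEGER.and(value -> (Integer) value > 0);

    public static void main(String[] args) {
        System.out.println(IS_INTEGER.test(150));          // true
        System.out.println(IS_INTEGER.test("150 kg"));     // false
        System.out.println(IS_POSITIVE_WEIGHT.test(-5));   // false
        System.out.println(IS_POSITIVE_WEIGHT.test(150));  // true
    }
}
```

With a statically typed field such as int weight, the compiler performs the first check for us before the program runs; the second check remains a matter of validation at runtime.
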

Maybe the discussions about static vs. dynamic typing are fought with such intensity because they reflect fundamentally different perspectives on software design. Static typing, and validation in general, are tools to exercise control over a system; dynamic typing, or not specifying exactly what is valid and what is not, in contrast allows systems to grow organically. Both approaches, of course, have their own advantages and disadvantages.

I'll first discuss the problems around static typing/strict validation in the context of the already exercised example, then discuss possible problems with the dynamic approach.