The Autonomous Component Architecture

March 28, 2005


This document introduces the autonomous component architecture (ACA). We begin with our technical motivation for proposing yet another component-based architecture and then delve into the details. Specifically, we introduce the notions of components, ports, and contracts, and discuss how components and ports are identified, how components import and export information at run time, and an implementation of ACA for simulation, called J-Sim. In the document entitled "Abstract Network Model and J-Sim," we focus on network simulation and present a generalized packet-switched network model, the Internetworking Simulation Platform (INET). A Spanish translation of this document is also available.

1 Technical Motivation

In modern digital-circuit design, a hardware system is assembled from integrated circuit (IC) chips on a printed circuit board. An IC chip is a black box fully specified by the function specification and the input/output signal patterns in the databook. Changes in input signals trigger an IC chip to perform a certain function and to change its outputs, after a certain delay, according to the chip specification. The fact that an IC chip is interfaced with other chips, modules, and systems only through its pins (and is otherwise shielded from the rest of the world) allows IC chips to be designed, implemented, and tested independently of everything else. We believe this is one of the dominant reasons why the IC industry is so successful.

Our proposal of an autonomous component architecture is a direct attempt to mimic the IC design and manufacturing model in terms of how components are specified, designed, and assembled. One may argue that the object-oriented (OO) programming paradigm was proposed for exactly the same purpose. It was, but it does not quite achieve the objective. Let's take a look at how a function that calculates a^x is implemented, first in the procedural programming paradigm and then in the OO programming model. In the procedural programming paradigm, the code may look like:

double exp_a(double a, double x)
{
    double tmp = x * ln(a);
    double ans = exp(tmp);
    return ans;
}

In the OO programming paradigm, one may package the function in an ExtendedMath class (which in turn calls ln() and exp() in a BasicMath class):

class ExtendedMath
{
    BasicMath bMath;

    double exp_a(double a, double x)
    {
        double tmp = x * bMath.ln(a);
        double ans = bMath.exp(tmp);
        return ans;
    }

    .....

}

 

Figure 1 Relation of two classes, ExtendedMath and BasicMath.

 

As shown in Figure 1, the OO programming model does bear some similarity to the IC design and manufacturing model: a software system is composed of components that interact with one another. As data is given to the inputs, a and x, of the ExtendedMath component, a signal a is generated and sent to the BasicMath component. When the result is ready, another signal tmp is generated and sent to the ExtendedMath component. Finally, the result a^x is generated at the output shortly after e^tmp is returned.

The most interesting question here is then: if software behavior can be described in exactly the same way as hardware behavior, why can software not achieve modularity as good as that of hardware, and why does it still suffer from the hyperspaghetti subsystems phenomenon [1]?

[1] The hyperspaghetti subsystems phenomenon refers to the situation in which modules in a system are so tightly coupled that it is impossible to extract some of the code modules for reuse, or to debug a code module without referring to other modules. See [Bruce F. Webster, "Pitfalls of Object-Oriented Development," M&T Books, New York, 1995] for details.

Component Binding

We claim that software design cannot achieve the same level of modularity as IC design because the OO programming paradigm is fundamentally different from hardware design in component binding. In the OO programming paradigm, a class holds direct references to other class instances and calls the functions that those instances expose, e.g., ExtendedMath contains an instance of BasicMath. This implies:

  1. The binding is unseen in the component interface (i.e., the definition of a component). One cannot see, for example, that exp_a() makes use of ln() and exp() by looking only at the function definition. One must look into the code in order to discover the dependence and interaction among callers and callees.
  2. The binding is too strong in the sense that the caller (exp_a()) has to know the exact names of the callees (ln() and exp()). This information makes type checking possible at compilation time, but it also introduces unnecessary linkage between software modules. As a consequence, software code is prone to the hyperspaghetti subsystems phenomenon and is difficult to reuse in different contexts.

Because of the above characteristics, it is difficult to develop and maintain an object-oriented software system with a large collection of functions and classes. In the course of debugging, one cannot obtain a clear view of the binding relations without delving into the implementation details and tracing the code line by line. This yields unpredictability in software development and high maintenance cost, and is usually termed the software crisis.

Separation of Contract Binding from Component Binding

So component binding is the problem. How, then, is it done in IC design? In IC design, the signals that flow in and out of an IC chip are specified in the interface definition. That is, at design time, an IC chip is bound to a certain contract (or, in the jargon of IC design, the IC specification in the databook) instead of being bound to the components that interact with it. Component binding is deferred to the time when a system (e.g., an ALU) is being built.

A contract specifies how an initiator (caller) and a reactor (callee) fulfill a certain function. It specifies only the causality of information exchange between components, not the components that may participate in the exchange. Two components, acting respectively as the initiator and the reactor, are bound at system integration time to fulfill the contract. A system with all its components bound to one another is said to be complete if the initiators of all involved contracts are fulfilled.

In the previous example, if exp_a(), ln() and exp() are encapsulated as components, the exp_a component is bound to three different contracts: the calculation of a^x, ln(x) and e^x, respectively. It is a reactor of the a^x contract and an initiator of the other two. Similarly, the ln component is bound to the ln(x) contract and the exp component to the e^x contract, both as reactors. The concept is illustrated in Figure 2 below.


Figure 2 Three components (exp_a, ln and exp) and the contracts they are bound to. Thick lines between the contracts indicate that the contracts are matched to each other. The three components form a complete system with the ln(x) and e^x contracts fulfilled, leaving the a^x contract unfulfilled with a missing initiator. Note that the ln or exp component alone is also considered a complete system.


We claim that binding contracts at design time and components at system building time eliminates the hyperspaghetti phenomenon. The information needed to bind contracts is defined in the interface of a component.  In this way, how components interact  is well specified and programmers do not have to delve into implementation details in order to find that information. This prevents the hyperspaghetti phenomenon from happening at the component level and enables composition of components in a very similar way to IC design.
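The separation of contract binding from component binding can be sketched in plain Java. This is only an illustration under stated assumptions, not ACA or J-Sim code: the class ExpAComponent, the methods bindLn, bindExp and react, and the use of DoubleUnaryOperator as a stand-in for a contract are all hypothetical names chosen for this example.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch: exp_a declares the contracts it initiates, not the components it calls. */
class ExpAComponent {
    // Initiator side of the ln(x) and e^x contracts. No concrete component is named here;
    // DoubleUnaryOperator merely stands in for "takes a double, returns a double".
    private DoubleUnaryOperator lnPort;
    private DoubleUnaryOperator expPort;

    // Component binding happens here, at system integration time.
    void bindLn(DoubleUnaryOperator reactor)  { lnPort = reactor; }
    void bindExp(DoubleUnaryOperator reactor) { expPort = reactor; }

    // Reactor side of the a^x contract.
    double react(double a, double x) {
        if (lnPort == null || expPort == null) {
            throw new IllegalStateException("system is not complete: an initiator is unbound");
        }
        return expPort.applyAsDouble(x * lnPort.applyAsDouble(a));
    }
}

class ContractBindingDemo {
    public static void main(String[] args) {
        ExpAComponent expA = new ExpAComponent();
        expA.bindLn(Math::log);   // any reactor that honors the ln(x) contract would do
        expA.bindExp(Math::exp);  // likewise for the e^x contract
        System.out.println(expA.react(2.0, 4.0)); // approximately 16, up to rounding
    }
}
```

The point of the sketch is that ExpAComponent can be compiled and tested against any pair of reactors, and that the dependency on ln and exp is visible in its interface rather than buried in its body.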

An interface in RPC, CORBA, or COM/COM+ is similar to a contract in our discussion, but an interface is not as flexible as a contract, and the actual function binding in those standards is not as straightforward as that in a component-based architecture.

With the above discussion as the motivation, we will present the autonomous component architecture (ACA) below.

 

2 The Autonomous Component Architecture (ACA)

2.1 Components and Ports

In the autonomous component architecture, the basic entity is a component. Each component owns one or more end points, called ports. The component where a port resides is called the host component of the port. Two components are connected by "wiring" their ports together. When a component sends data at one of its ports, the port relays the data to the port(s) connected to it. When data arrives at a port, the component that owns the port processes the data immediately in a new execution context (or, in modern computing jargon, a thread) and may generate outputs at certain ports as specified in the bound contract.

The output and the input of a port are wired separately. When data is sent at a port, it is sent on the output wire of the port and arrives at the ports that have that wire as their input wire. Figure 3 depicts some possible wirings in ACA. In particular, in (c), the blue wire is the output wire of ports A, C and D, and the input wire of ports B, D and E. The red wire is the output wire of ports B and E, and the input wire of port C. So, for example, data sent at port A will arrive at ports B, D and E. ACA does not allow self-loops. For example, data sent at port D will arrive at ports B and E, but not at D itself.

Figure 3 Possible wiring between ports.

 
(a) One to one. (b) One to many. (c) Many to many.  
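The many-to-many wiring of Figure 3(c) can be mimicked with a small sketch. The classes Wire and Port and their methods are hypothetical and are not the J-Sim API. For simplicity, delivery here is a synchronous append to a list, whereas ACA would process each arrival in a new execution context.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical wire: it reaches every port that uses it as an input wire. */
class Wire {
    final List<Port> receivers = new ArrayList<>();
}

/** Hypothetical port with separately wired output and input. */
class Port {
    final String name;
    final List<Object> received = new ArrayList<>(); // data that arrived at this port
    private Wire out;                                 // output wire; null if not wired

    Port(String name) { this.name = name; }

    void setOutput(Wire w) { out = w; }
    void setInput(Wire w)  { w.receivers.add(this); }

    void send(Object data) {
        if (out == null) return;
        for (Port p : out.receivers) {
            if (p != this) p.received.add(data); // ACA does not allow self-loops
        }
    }
}

class WiringDemo {
    public static void main(String[] args) {
        Wire blue = new Wire(), red = new Wire();
        Port a = new Port("A"), b = new Port("B"), c = new Port("C"),
             d = new Port("D"), e = new Port("E");
        // Blue: output wire of A, C, D; input wire of B, D, E.
        a.setOutput(blue); c.setOutput(blue); d.setOutput(blue);
        b.setInput(blue);  d.setInput(blue);  e.setInput(blue);
        // Red: output wire of B, E; input wire of C.
        b.setOutput(red);  e.setOutput(red);  c.setInput(red);

        a.send("from A"); // arrives at B, D and E
        d.send("from D"); // arrives at B and E, but not at D itself
        System.out.println(b.received + " " + d.received + " " + e.received);
    }
}
```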

A port can be connected to another port in a simplex or a duplex manner. In the simplex manner, the output wire of the first port is wired to the input wire of the second port; we say that the two wires are joined by the connection. In the duplex manner, the input wire of the first port and the output wire of the second port are joined as well. Figure 4 demonstrates a duplex wiring example.

Figure 4 A duplex wiring example. Connecting port B (or E) and C in (a) results in joining of the dark blue wire and the blue wire as well as the dark red wire and the red wire, as shown in (b).



 

2.2 Contract

A contract specifies how an initiator (caller) and a reactor (callee) fulfill a certain task. In particular, it specifies the causality of data sent/received between components but not the components that participate in the communication. Contracts can be further classified into two categories: port contract and component contract.  A port contract is bound specifically to a port of a component, while a component contract describes how a component responds to data that arrives at each of its ports (e.g., how the component processes the data, updates certain data structures, and generates outputs at certain ports).

2.3 Composite Component

ACA also supports the notion of a composite component: a component may be composed of several components, and the entire system forms a component hierarchy. A composite component is the parent component of the components it encapsulates, which are called child components. Composite components make it possible to organize a software system at the desired granularity.

Figure 5 illustrates how the three-component system shown in Figure 2 can be organized into a composite component. Note that the composite component has a port A connected to port B of the encapsulated component exp_a. When data arrives at port A, it actually arrives at port B. Similarly, when data is sent from port B, it is sent via port A to the outside of the composite component. Such a port in a composite component is called a shadow port. What actually happens here is that ports A and B share the same output wire and the same input wire. So wiring to a shadow port causes wiring to the ports that are encapsulated inside the composite component and are connected to the shadow port. In terms of behavior, a parent component does not receive data from a shadow port and should not send data through one. The behavior is undefined if a parent component sends data through a shadow port.

 

Figure 5 Encapsulation of the three-component system in Figure 2.


2.4 Server Port

A component may provide a common service at one of its ports to other components in the system. When a component sends a request to this component for the service, ideally, the serving component completes the service and sends a reply back to the requesting component only. However, the architecture introduced so far has the reply arrive at all the components connected to the same port, which is not the desired behavior. To solve this problem, ACA defines a special type of port, called a server port.

Formally, a server port is for a component to provide a common service to other components, and, specifically, to send back a reply only to the requesting component, no matter how many components are connected to this port. In addition, ACA specifies that the service be conducted, and the reply be sent back, in the same context as the request sending context.  One can say that the sending (of the request) is "blocked" until the reply returns.

Port A of the component exp_a in Figure 5 is a typical example of a server port. Imagine that two components are connected to port A for the service of calculating a^x. If one component sends a=2, x=4, exp_a should reply 16 only to the requesting component rather than to both components.
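The request/reply behavior of a server port can be illustrated with a minimal sketch. ServerPortSketch and its sendReceive method are hypothetical names for this example. The sketch captures only two properties stated above: the service runs in the requester's own context (the call blocks until the reply is ready), and the reply is returned to that requester alone.

```java
import java.util.function.Function;

/** Hypothetical server port: the reply goes only to whoever sent the request. */
class ServerPortSketch<Q, R> {
    private final Function<Q, R> service;

    ServerPortSketch(Function<Q, R> service) { this.service = service; }

    // Runs in the caller's thread, so the sender is "blocked" until the reply returns.
    // The reply is the return value, so no other connected component ever sees it.
    R sendReceive(Q request) {
        return service.apply(request);
    }
}

class ServerPortDemo {
    public static void main(String[] args) {
        // Port A of exp_a, offering the a^x service; the request is the pair {a, x}.
        ServerPortSketch<double[], Double> portA =
                new ServerPortSketch<>(req -> Math.pow(req[0], req[1]));

        // Two clients may hold a reference to the same port; only the one that asks gets 16.
        double replyToFirstClient = portA.sendReceive(new double[] {2.0, 4.0});
        System.out.println(replyToFirstClient);
    }
}
```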

2.5  How Components and Ports are Identified

Every component and every port in a software system has to be uniquely identified. As the component hierarchy in the autonomous component architecture is similar to the file system in a modern operating system, we adopt a naming method similar to that of the UNIX file system. That is, a component is identified by its path, which is formed recursively by concatenating the path of the parent component, a "/" separator, and the identification of the component.

A software system forms a component hierarchy with itself as root. The root component has path "/". Every component and port in the hierarchy can then be uniquely identified. Suppose the parent component in Figure 5 is identified by the path

	/some_prefix/exp_a

Then the path of the child component exp_a is

	/some_prefix/exp_a/exp_a

Ports in a component are further categorized into different groups. Each group has a unique identification in the component. Each component has a default port group with a null identification. The identification of a port in its host component is the concatenation of the identification of the port, a "@" separator, and the group identification. The path of a port is similarly defined as the concatenation of the path of its host component, a "/" separator, and its identification in the host component. For example, if port A in Figure 5 is in the default port group and port B is in group "a_x", then the path of port A is

	/some_prefix/exp_a/A@

and the path of port B is

	/some_prefix/exp_a/exp_a/B@a_x

Note that the separator "@" used in the identification of a port ensures that a child component of a composite component can be differentiated from a port of the same composite component.
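The naming rules above amount to simple string concatenation. The helper class AcaPaths below is a hypothetical illustration, not part of J-Sim; it reproduces the example paths given in this section, with the default port group represented by an empty string.

```java
/** Hypothetical helpers that follow the path rules described above. */
class AcaPaths {
    // Path of a child = path of the parent + "/" + identification of the child.
    static String childPath(String parentPath, String id) {
        return parentPath.endsWith("/") ? parentPath + id : parentPath + "/" + id;
    }

    // Identification of a port in its host = port identification + "@" + group identification.
    // The default group has a null identification, passed here as "".
    static String idInHost(String portId, String groupId) {
        return portId + "@" + groupId;
    }

    // Path of a port = path of its host component + "/" + its identification in the host.
    static String portPath(String hostPath, String portId, String groupId) {
        return childPath(hostPath, idInHost(portId, groupId));
    }
}

class AcaPathsDemo {
    public static void main(String[] args) {
        String parent = "/some_prefix/exp_a";
        String child = AcaPaths.childPath(parent, "exp_a");
        System.out.println(child);                                // /some_prefix/exp_a/exp_a
        System.out.println(AcaPaths.portPath(parent, "A", ""));   // /some_prefix/exp_a/A@
        System.out.println(AcaPaths.portPath(child, "B", "a_x")); // /some_prefix/exp_a/exp_a/B@a_x
    }
}
```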

2.6  Exporting Information at Run Time

For the purposes of diagnosis and configuration, a component in ACA may import and export information at run time through several designated ports. The architecture defines a designated port in each component, called the infoport, to export diagnosis information at run time. Also, a component may be equipped with one or more designated event ports, each of which exports a specific type of event at run time. We describe these ports and the formats of the exported information below.

Infoport (Information Port)

Every component is equipped with an information port, called the infoport. Four types of information may be exported spontaneously at this port: error messages, garbage messages, debug messages and trace messages. All exported information shares a similar format:

  1. Time of export (double)
  2. Path of the component/port in subject (string)
  3. Data in subject (could be any type)
  4. Detailed description (string)

The four types of information that can be exported are listed below.

Error Message: Exported when a component cannot handle incoming data (probably because the data cannot be recognized).

Format:
  1. Time when the error occurs (double)
  2. Path of the port where the data came in (string)
  3. The incoming data (could be any type)
  4. Implementation information (string), such as the place where the error is detected
  5. Detailed description (string)

Garbage Message: Exported when a component discards data (probably because the capacity limit is reached or a certain policy is violated).

Format:
  1. Time when the message is discarded (double)
  2. Path of the component where the data is discarded (string)
  3. The discarded data (could be any type)
  4. Detailed description (string)

Debug Message: Exported when the component writer would like to export debugging information.

Format:
  1. Time when the message is exported (double)
  2. Which component or port this debug message is about (string)
  3. Detailed information (string)

Trace Message: A special debug message that is exported for all incoming and outgoing data.

Format:
  1. Trace type, "DATA" (incoming) or "SEND" (outgoing) (string)
  2. Time when the message is logged (double)
  3. Path of the port where the data comes in or goes out (string)
  4. The data (could be any type)
  5. Detailed information (string)

Event Message and Event Ports

In addition to the above information that can be exported at infoport, a component can also export events at designated ports, called event ports. The format of an event message is: 

Event Message Format:
  1. Time when the message is exported (double)
  2. Path of the port from which the message is exported (string)
  3. Event name (string)
  4. Event object (could be any type)
  5. Detailed information (string)

Control of Information Exportation

One may specifically ask a component to, or not to, export certain types of information. This is done by sending a 6-bit flag at the infoport of a component. The first bit of the flag is a binary indicator that specifies enabling or disabling exporting the specified types of information. The remaining bits form a mask that specifies which type of information is requested. The following diagram gives an example of enabling exporting garbage and debug messages.

  Bit           Value  Meaning
  1 (first)     1      Action (turn on or off)
  2             0      Error message
  3             1      Garbage message
  4             1      Debug message
  5             0      Trace message
  6             0      Event exportation
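A flag of this shape can be assembled with a few shift operations. The sketch below is illustrative only: the names InfoType and ExportFlag are hypothetical, and it assumes that the "first bit" is the most significant of the six bits and that 1 means "turn on", neither of which is fixed by the text beyond the example above.

```java
import java.util.EnumSet;

/** The five mask positions, in the order listed above. */
enum InfoType { ERROR, GARBAGE, DEBUG, TRACE, EVENT }

/** Hypothetical encoder for the 6-bit export-control flag. */
class ExportFlag {
    // Assumption: the action bit is the most significant bit, followed by the mask bits.
    static int encode(boolean turnOn, EnumSet<InfoType> types) {
        int flag = turnOn ? 1 : 0;
        for (InfoType t : InfoType.values()) {
            flag = (flag << 1) | (types.contains(t) ? 1 : 0);
        }
        return flag;
    }

    // Renders the flag as six binary digits, first bit on the left.
    static String toBits(int flag) {
        return String.format("%6s", Integer.toBinaryString(flag)).replace(' ', '0');
    }
}

class ExportFlagDemo {
    public static void main(String[] args) {
        int flag = ExportFlag.encode(true, EnumSet.of(InfoType.GARBAGE, InfoType.DEBUG));
        System.out.println(ExportFlag.toBits(flag)); // 101100, matching the example above
    }
}
```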

Component Property

Each component may expose a collection of properties. A property of a component is defined by a name and a value. One may query all the properties of a component, or a specific one, by sending a null signal, or a signal that contains the name of the property, to the infoport of the component. The component then replies with a property message that contains the value of the requested property.

3 Features of the Autonomous Component Architecture

  1. Loosely coupled component model: The most notable feature of the autonomous component architecture is its loosely coupled component model: contracts are bound at design time and components are bound at system integration time.  With the separation of contract binding from component binding, a component can be individually implemented and tested independent of the rest of the system, can be incrementally deployed in a system, and can be easily extracted from a system for reuse.
  2. Independent execution context for handling data: Another important feature of the architecture is its ability to handle data in independent execution contexts. Whenever data arrives at a port of a component, the component processes the data immediately in an independent execution context. The interference between different pieces of data handled by the same component at the same time is minimal. Consequently, a component writer does not have to be concerned about the order in which different pieces of data that arrive simultaneously are handled. The only case to which a component writer has to pay attention is when different execution contexts access shared data. Synchronization among the contexts must then be ensured to maintain the integrity of the shared data.
  3. Component-level diagnosis and debugging: As elaborated in the section on Exporting Information at Run Time, each component is equipped with an information port (infoport) and possibly some designated event ports. One may connect an "instrument" component to the infoport and the event ports of a component to inspect, or collect information about, the component at run time. This closely mimics the IC debugging and testing process.
  4. Realization of software IC design: The ability to handle data in independent execution contexts, along with the fact that components are loosely coupled and only bound to one another at system integration/implementation time, enables a component to be autonomous (hence the name of the architecture). A component can therefore be designed, tested, and reused in other software systems with the same contract context, in much the same way as an IC chip is in IC design. Moreover, with the built-in information and event ports available, one may test and debug a software system at the component level (rather than at the line-by-line code level). An analogy between the autonomous component architecture and IC design is illustrated in Figure 6.

Figure 6: Analogy between an IC chip and a component.

				

 

4 J-Sim

J-Sim is an implementation of ACA in Java. Java was chosen as the programming language for its many desirable features, such as platform independence, pure object orientation, clean language syntax, built-in thread execution, reflection, and automatic garbage collection at run time, all of which make the realization of ACA easier.

The greatest challenge in developing J-Sim is to efficiently provide independent execution contexts, or threads, for components to handle incoming data. J-Sim introduces a background thread manager, called the runtime, which is key to the performance of J-Sim. In the following sections, we first discuss the runtime in general and then the current runtime implementation in J-Sim.

Simulation is implemented in J-Sim as an extension to the runtime. In particular, the simulation time is globally observed by all the active threads, instead of each thread keeping a local time axis as a logical process (LP) does in parallel and distributed event-based simulation. The details of how the extension is done are provided in the section on Real-time Process-based Simulation.

4.1 Runtime

To provide independent execution contexts for data that arrive at different ports of a component in the autonomous component architecture, special support at run time is required. In particular, the runtime (in general) [2] has to create a new execution context when data arrives at a port of a component. We depict the process in Figure 7 and outline the steps as follows:

  1. Component C1 is about to send data at one of its ports to component C2 in the execution context x.
  2. To have C2 receive the data, the execution context x sends (to the runtime) the data and the request for a new execution context y.
  3. The runtime takes over the sending process. The execution context x returns from the sending process and proceeds in component C1.
  4. The runtime creates and activates the execution context y to process the data in component C2.

[2] A runtime is a collection of background processes (unseen by running applications), such as automatic garbage collection in modern programming languages.

Figure 7 How the runtime handles data delivery.

 

Note that the runtime has full control over the creation of new execution contexts. To ensure that the software system runs in a well-controlled manner, the runtime imposes an upper bound on the number of execution contexts that can be active simultaneously. When the upper bound is reached, the runtime delays creating any new context until some existing context completes its execution. This prevents an excessive number of contexts from depleting system resources or, in the worst case, shutting down the system unexpectedly.

In J-Sim, execution contexts are implemented by Java threads, with the thread scheduler in the Java Virtual Machine (JVM) scheduling thread execution.  In the current implementation, the runtime consists of two classes, WorkerThread and ACARuntime:

  1. The WorkerThread class wraps the Java Thread class together with the execution context information.
  2. The ACARuntime class manages the creation and recycling of WorkerThreads and implements the mechanism that controls the number of WorkerThreads that can be active simultaneously.

When data is sent to a port, the WorkerThread that is currently working creates a task and hands it to ACARuntime (as in step 2 above). Upon receipt of such a task, ACARuntime either creates or wakes up a WorkerThread for the task, or puts the task in the ready queue until a WorkerThread becomes ready and fetches it.
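The hand-off described above (the sender submits a task and returns, a bounded set of threads executes the tasks, and surplus tasks wait in a ready queue) resembles what a fixed-size thread pool from java.util.concurrent provides. The sketch below uses that standard pool as a stand-in; SketchRuntime and its methods are hypothetical, and it should not be read as the actual WorkerThread/ACARuntime implementation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the runtime, built on a standard fixed thread pool. */
class SketchRuntime {
    private final ExecutorService pool;
    private final AtomicInteger active = new AtomicInteger(); // contexts running right now
    private final AtomicInteger peak = new AtomicInteger();   // highest value seen so far

    SketchRuntime(int maxContexts) {
        // The pool caps the number of live threads; extra tasks wait in its internal queue.
        pool = Executors.newFixedThreadPool(maxContexts);
    }

    /** Steps 2 and 3: hand the task to the runtime and return at once to the sender. */
    void deliver(Runnable receiverTask) {
        pool.execute(() -> {
            int now = active.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            try {
                receiverTask.run(); // step 4: process the data in the receiving component
            } finally {
                active.decrementAndGet();
            }
        });
    }

    int peakActive() { return peak.get(); }

    /** Stops accepting tasks and waits for queued ones; returns false on timeout or interrupt. */
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With new SketchRuntime(2), twenty calls to deliver return immediately, all twenty tasks eventually run, and peakActive() never exceeds 2, which is the upper-bound behavior described in this section.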
 

4.2  Inter-Component Communication Overhead

As mentioned above, when data is delivered to a component, the runtime (ACARuntime) creates a thread as the new execution context to process the data at the receiving component. The time overhead incurred in this process reflects the overhead of the inter-component communication. The factor that contributes to the major part of the overhead is how threads are created and handled. A naïve approach is to create a new thread each time a new execution context is needed.  As it is usually expensive to create and start a new thread, this approach does not perform well. Instead of creating threads anew, we implement in ACARuntime a thread pool in which threads are recycled after their execution and kept alive (but in the sleep state). Whenever needed, threads are awakened to serve as new execution contexts.

To further improve the performance, we enable a thread to announce its readiness (to the runtime) in advance of the end of the execution. It is fairly common for a component to send out some result at the end of processing data. Since the thread at the sending component will be recycled after it finishes processing the data, it is natural to have this thread continue to serve at the receiving component. In order to implement this in the runtime, the thread at the sending component must notify ACARuntime in advance so that ACARuntime does not create/wake up another thread to process the data in the sending process. When the thread finishes up at the sending component, it then obtains the data from ACARuntime and serves as a new execution context at the receiving component.  This, in some sense, achieves the "one thread per message" paradigm as recommended in the x-kernel implementation. 

4.3 Real-time Process-based Simulation

Simulation is implemented as an extension to the runtime in J-Sim. Basically, it makes sure that the system is always busy running (with active WorkerThreads) by carefully manipulating the simulation time. In particular, it handles simulation time as follows:

  1. The simulation time elapses in proportion to the wall time as long as at least one WorkerThread is active.
  2. When none of the WorkerThreads is active, the runtime advances the simulation time to the nearest "future" time point at which at least one WorkerThread can be awakened and become active.
  3. If no such WorkerThread exists, the simulation stops.

To achieve this, we keep three variables:

  • last_time_updated: the last (wall) time on which the calculation of the current simulation time is based.
  • time_scale: the ratio of the wall time elapsed to the simulation time elapsed.
  • time_advances: the total amount by which the simulation time has been advanced so far.

The current simulation time is then calculated as follows:

	current_simulation_time = (current_wall_time - last_time_updated) / time_scale + time_advances;

When simulation time advances, the variables are updated as follows:

	time_advances += nearest_simulation_future_time - current_simulation_time;
	last_time_updated = current_wall_time;
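The time computation can be tried out with a small sketch. SimClockSketch is a hypothetical class, not the J-Sim implementation, and the wall clock is injected so that the arithmetic can be checked without real waiting. One simplifying assumption is made: the sketch keeps last_time_updated fixed as the wall-time origin and only accumulates time_advances. With the formula for current_simulation_time as written above, this is sufficient for the clock to land exactly on the target time; if the origin were reset at the same moment, time_advances would have to be set to the target time itself instead of being incremented.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical simulation clock following the formula for current_simulation_time. */
class SimClockSketch {
    private final DoubleSupplier wallClock; // wall time in seconds, injected for testing
    private final double timeScale;         // wall seconds per simulated second
    private final double lastTimeUpdated;   // wall-time origin (kept fixed in this sketch)
    private double timeAdvances = 0.0;      // simulated time skipped so far

    SimClockSketch(DoubleSupplier wallClock, double timeScale) {
        this.wallClock = wallClock;
        this.timeScale = timeScale;
        this.lastTimeUpdated = wallClock.getAsDouble();
    }

    /** current_simulation_time = (current_wall_time - last_time_updated) / time_scale + time_advances */
    double now() {
        return (wallClock.getAsDouble() - lastTimeUpdated) / timeScale + timeAdvances;
    }

    /** Called when no thread is active: jump to the nearest future wake-up time. */
    void advanceTo(double nearestFutureTime) {
        double current = now();
        if (nearestFutureTime > current) {
            timeAdvances += nearestFutureTime - current;
        }
    }
}

class SimClockDemo {
    public static void main(String[] args) {
        double[] wall = {100.0};                               // fake wall clock, in seconds
        SimClockSketch clock = new SimClockSketch(() -> wall[0], 2.0);
        wall[0] = 104.0;                                       // 4 wall seconds pass
        System.out.println(clock.now());                       // 2.0 simulated seconds
        clock.advanceTo(10.0);                                 // all threads idle: skip ahead
        System.out.println(clock.now());                       // 10.0
    }
}
```

The sketch is single-threaded; a real runtime would have to guard these fields, since many threads observe the simulation time concurrently.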

With the above mechanism, a simulation runs in the same manner as a real system does, in the sense that event executions are carried out in real time as opposed to at fixed time points as in discrete event simulation (hence the term "real-time process-based" simulation). The interactions and interferences among event executions hence take place naturally, as in real systems. When no thread is currently active, the runtime advances the time to the nearest future point at which at least one thread can be activated. This preserves the behavior of real systems, and hence enhances the fidelity of the simulation, while always keeping the simulation running.

The time_scale variable plays an important role in this simulation methodology. For example, a thread that sleeps for 1 microsecond in simulation actually sleeps for (1.0e-6 * time_scale) seconds in wall time. By giving time_scale an appropriate value, for example 1.0e6, the actual sleep time becomes feasible within the time resolution of the computer on which the simulation runs. On the other hand, we can adjust time_scale so that the processing delay of an event in simulation falls within the range that we want to model. For example, if the actual processing delay of an event is 1 second on average, and we want to model the processing delay as 1 millisecond, then we set time_scale = 1.0e3.

Note that discrete-time event-based simulation is a special case of real-time process-based simulation with time_scale set to infinity (all processing delays become zero in simulation).

~ END ~