Parallel Course (July 1994)

{This material originally appeared in the July, 1994 issue of Byte magazine, a McGraw-Hill publication} 


The Taos operating system uses objects from the ground up to enable

processors based on different architectures to work together on the

same problem


Nowadays many computers can generate pretty ray-traced images, and

some of them can even do it fast. What impressed me about the

demonstration I was witnessing was that it was running on a no-name

486 PC clone with some co-processor boards in it, not a Silicon

Graphics workstation. On the other hand, parallel-processing

accelerator boards for PCs are not exactly new - no, what made the

event REALLY special was that the PC contained an Intel 486 CPU, four

Inmos T800 transputers and four MIPS R3000s, and it was running the

same binary ray-tracing program in parallel on all of them at once.

What made this feat possible was Taos, a radically different,

object-oriented parallel operating system.

Most operating systems are created either by large hardware

manufacturers or by university researchers, but Taos came from neither

- it's the product of a devoted group of enthusiasts with an idea that

was well ahead of its time. The principal architect of Taos is Chris

Hinsley, who was a professional games programmer with hit titles for

the Atari ST and the Commodore Amiga to his credit. Although writing

solely in assembler, Hinsley devised his own object-oriented

development style based on macros, which sparked the original idea for

Taos. (Incidentally, you pronounce its name 'dowse', like the Chinese

religion rather than the New Mexico town.) Fired by the launch of

Inmos's Transputer, Hinsley wanted to create a real-time operating

system that could harness the parallel-processing power he believed

would be needed for future multimedia systems.

When I first wrote about Taos in Byte International some four years

ago it seemed outrageously far removed from the mainstream, but the

rest of the commercial operating system world is catching up. Everyone

wants a micro-kernel now, but Taos is already a nano-kernel system,

with its tiny 12 Kbyte kernel running on each processor in a parallel

network. Taligent promises objects-from-the-ground-up with dynamic

binding; Taos has had them from the start. However, Taos doesn't really

aspire to mainstream desktop status, but is rather a fast-and-skinny

system for embedded applications, and Tao Systems is now promoting it

into the multimedia and games console markets.


VPCODE

By far the most radical aspect of Taos is its hardware independence.

Taos programs are all written in the machine code of a virtual

processor (VP), which is called VPcode. The Taos kernel translates

VPcode into the native machine code of each real processor immediately

before running it - there is little or no runtime penalty, unlike

earlier interpreted systems like the UCSD p-system which were very

slow. Taos's fine-grained object orientation and dynamic binding make

this translation strategy feasible since VPcode modules are always

small (typically a few hundred bytes) and so can be translated

on-the-fly as they load from disk into memory. Huge monolithic

applications like Excel or WordPerfect wouldn't lend themselves to

this approach, though Tao Systems' translator supremo Andy Henson did

stress to me that a fast modern CPU can actually translate VPcode

faster than the hard disk can transfer data.

The imaginary VP processor is a 32-bit little-endian RISC machine with

16 registers. It supports data types from 8-bit bytes up to 64-bit

double integers and 32- or 64-bit IEEE floats. Hence the VP machine is

a reasonably good match to most real RISC chips like the Alpha, MIPS,

ARM and PowerPC, if somewhat short of registers by today's standards.

It supports around 60 simple RISC-like arithmetic, logical and

branching instructions and a few special pseudo-instructions, like TAO

which calls Taos kernel routines, and LIT which marks literal data

that needs to be translated (e.g. from little- to big-endian).
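
To see why literal data has to be marked, consider what a translator for a big-endian target must do with a 32-bit constant stored in VPcode's little-endian order. The following is a minimal Python sketch of that byte-order conversion only; it is not Tao Systems' translator, and the function name swap_literal is a hypothetical one chosen for illustration.

```python
import struct


def swap_literal(le_bytes: bytes) -> bytes:
    """Re-encode a 32-bit little-endian literal in big-endian byte order."""
    (value,) = struct.unpack("<I", le_bytes)   # read as little-endian
    return struct.pack(">I", value)            # write as big-endian


literal = struct.pack("<I", 0x12345678)        # as stored in little-endian form
print(literal.hex())                           # 78563412
print(swap_literal(literal).hex())             # 12345678
```

Ordinary instructions need no such treatment because the translator regenerates them in native form; only raw data embedded in the code has to be flagged.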

The Taos assembler VPASM outputs VPcode, which you can run directly or

you can invoke the appropriate translator manually to convert it to

native code (Text box 1 shows a sample of VPASM source code).

Currently Tao Systems has translators for the Intel 286, 386 and 486,

the Inmos T8 and T9000 transputers, MIPS R3000 and ARM 601. PowerPC

and DEC Alpha are next in the pipeline; it takes around 6 man-months

to produce a new translator.


THE TAOS OBJECT MODEL

Taos is a message-passing operating system whose software model is

based on objects, processes, and messages. An object is a bundle of

data and code that consumes memory, while a process executes an object

and consumes processor time. The Taos hardware model involves multiple

processors each with a local memory, connected by a network of

communication links. Every processor in this network runs a copy of

the Taos kernel and the translator from VPcode to its own native code.

Whenever Taos creates a new object, it allocates the object to a

processor and then starts a process to execute the object.

All Taos objects are constructed from variably-sized blocks of

contiguous memory called 'nodes' which contain two link fields so that

the kernel can manage them in doubly-linked lists. Nodes can contain

data or code, and they have a type field that identifies the type of

object they hold. Taos itself doesn't type-check the application of

operations to data, though you can implement such type-checking at a

higher level within an object-oriented programming language.

While stored on disk, or in transit over a communications link, nodes

exist as unbound 'templates' but once loaded into memory they are

converted to 'process-ready' form, and it's at this time that any

translation of VPcode to native code takes place. The Taos kernel on a

particular processor inserts a process-ready node onto a list of other

process-ready objects, from where it can be processed according to the

type of object it holds.
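
The node structure described above can be pictured with a short Python sketch. This is only an illustration of a typed node with two link fields held in a doubly-linked list; the class names, the type strings 'TOOL' and 'CONTROL', and the payloads are assumptions, not Taos's actual memory layout.

```python
from typing import Iterator, Optional


class Node:
    """A block of memory with two link fields and a type field."""

    def __init__(self, node_type: str, payload: bytes) -> None:
        self.node_type = node_type          # what kind of object is held
        self.payload = payload              # data or code
        self.prev: Optional["Node"] = None  # link to the previous node
        self.next: Optional["Node"] = None  # link to the next node


class NodeList:
    """A doubly-linked list such as a kernel's list of process-ready nodes."""

    def __init__(self) -> None:
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None

    def append(self, node: Node) -> None:
        node.prev, node.next = self.tail, None
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node

    def remove(self, node: Node) -> None:
        # Unlinking needs no search because each node carries both links.
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None

    def __iter__(self) -> Iterator[Node]:
        current = self.head
        while current is not None:
            yield current
            current = current.next


ready = NodeList()
tool = Node("TOOL", b"native code")
control = Node("CONTROL", b"component names")
ready.append(tool)
ready.append(control)
ready.remove(tool)
print([n.node_type for n in ready])   # ['CONTROL']
```

The two link fields are what make removal cheap: a node can be taken off a list without walking the list to find its predecessor.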


TOOLS, CONTROL OBJECTS AND CLASSES

Taos's pre-defined system object types are Tools, Control Objects,

Bitmaps, Graphical Objects and Class Objects, but programmers are free

to define new types. A Tool is a node containing executable code that

can act upon the data contained in an object, to perform calculations

or send and receive messages. A Control Object is the Taos equivalent

of a program, consisting of one or more component tools which are

executed in sequence. Control objects are the smallest unit of

parallel distribution and execution under Taos, but not the smallest

unit of memory management since individual tools can be retrieved from

disk and made process-ready. The kernel which creates a new control

object distributes its template (using a special load-balancing

algorithm) onto some processor which starts a process to execute the

object. When the last component of the control object is finished, the

control object closes and its process terminates. Every component has

at least two tools associated with it, one that executes it and one to

clean up after it dies.

A control object's template contains only the text names of its

component tools, not their actual code. When the kernel creates a new

control object, it first checks whether any of these specified

components are already in memory, and if so just points to them -

otherwise it fetches them from disk and makes them process-ready

(first translating them if necessary). Binding under Taos is thus

fully dynamic, so that no module gets loaded until it's needed and

only one copy is ever present in memory. The Taos processes which

execute control objects are lightweight, equivalent to 'threads' in

operating systems like OS/2, and more than one process can share the

same tool's code in multi-threaded fashion.
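
The bind-by-name behaviour described above might be modelled as follows. This is a hedged sketch rather than the kernel's real logic: the ToolCache class, its method names, and the fake 'disk' dictionary are all hypothetical, and translation is stood in for by a simple function.

```python
from typing import Callable, Dict, List


class ToolCache:
    """Hypothetical per-processor table of process-ready tools, keyed by name."""

    def __init__(self, fetch: Callable[[str], bytes],
                 translate: Callable[[bytes], bytes]) -> None:
        self._fetch = fetch            # reads a tool template from disk
        self._translate = translate    # VPcode -> native code
        self._ready: Dict[str, bytes] = {}
        self.loads = 0                 # how many tools were actually loaded

    def bind(self, name: str) -> bytes:
        # Load and translate only on first use; afterwards share the one copy.
        if name not in self._ready:
            self._ready[name] = self._translate(self._fetch(name))
            self.loads += 1
        return self._ready[name]

    def open_control(self, component_names: List[str]) -> List[bytes]:
        # A control object's template holds only the names of its tools.
        return [self.bind(name) for name in component_names]


disk = {"DESK/BACKDROP": b"vp1", "LIB/PA_SEND": b"vp2"}   # illustrative contents
cache = ToolCache(fetch=disk.__getitem__,
                  translate=lambda vp: b"native:" + vp)

first = cache.open_control(["DESK/BACKDROP", "LIB/PA_SEND"])
second = cache.open_control(["LIB/PA_SEND"])
print(cache.loads)             # 2
print(second[0] is first[1])   # True
```

The second control object triggers no disk access at all, because the only tool it names is already in memory and is simply pointed to.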

Class objects provide the highest level of organization under Taos. A

class encapsulates a group of message-passing objects which can run in

parallel, hiding them behind an OOP method interface. Users of classes

like Window or PolygonWorld make method calls to the class object, for

example to open a new window, and are shielded from the complexity of

the underlying parallelism that's generated by the execution of the

objects hidden within the class.


THE KERNEL AND MEMORY MANAGEMENT

The simplest version of the Taos kernel is just 12 Kbytes in size and

is responsible for multi-tasking (via a time-sliced process

scheduler), memory-management, and the mail and naming systems. Tao

Systems is currently working on a POSIX compliant version of the

kernel which implements virtual memory and memory protection on

microprocessors that have suitable MMU hardware, but the 12 Kbyte

version does not offer these features.

All executable code in Taos is contained in tools (apart from the

small bootstrap loader on each processor which must be in native

code). Even the kernel itself is built from tools and is largely

written in VPcode. Device drivers are simply processes like any other,

running outside and in parallel with the kernel. All message I/O is

handled by link drivers running outside the kernel, though the kernel

handles some local I/O support mechanisms such as data caching.

The lifetime of a Taos tool in memory is determined by its status,

according to four different degrees of volatility:

1) VIRTUAL tools are only loaded, translated and bound when called by

another tool, the translated code remaining in memory until the tool's

reference count (kept by the kernel) falls to zero, after which it may

be flushed whenever the kernel needs memory. The kernel may relocate a

virtual tool at any time;

2) NON-VIRTUAL tools get loaded and bound at the same time as the tool

that references them, and they remain in memory, exempt from

relocation, for at least the lifetime of their caller;

3) SEMI-VIRTUAL tools are only loaded and bound when called, like

virtual tools, but they then remain in memory like non-virtual tools;

4) Non-virtual tools can also be flagged as EMBED, which causes the

translator to embed them as inline code in their caller's code. This

is a speed optimization which is extensively used in the kernel code.
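
The reference-counted lifetime of a virtual tool (item 1 above) can be sketched in a few lines of Python. The method names below echo the OPENTOOL, CLOSETOOL and FLUSHTOOLS kernel calls, but their exact semantics here are an assumption made for illustration, as are the class names.

```python
from typing import Dict, List


class VirtualTool:
    def __init__(self, name: str) -> None:
        self.name = name
        self.refs = 0                  # reference count kept by the kernel


class ToolList:
    """Hypothetical local tool list with reference counting."""

    def __init__(self) -> None:
        self.tools: Dict[str, VirtualTool] = {}

    def open_tool(self, name: str) -> VirtualTool:
        # Load on first call, then just count another user.
        if name not in self.tools:
            self.tools[name] = VirtualTool(name)
        tool = self.tools[name]
        tool.refs += 1
        return tool

    def close_tool(self, name: str) -> None:
        tool = self.tools[name]
        if tool.refs == 0:
            raise ValueError(f"{name} is not open")
        tool.refs -= 1                 # the tool stays in memory for now

    def flush_tools(self) -> List[str]:
        # Only when memory is needed are unreferenced tools discarded.
        unreferenced = [n for n, t in self.tools.items() if t.refs == 0]
        for name in unreferenced:
            del self.tools[name]
        return unreferenced


tools = ToolList()
tools.open_tool("GUI/GI_BROWSE")
tools.open_tool("GUI/GI_BROWSE")
tools.close_tool("GUI/GI_BROWSE")
print(tools.flush_tools())     # []
tools.close_tool("GUI/GI_BROWSE")
print(tools.flush_tools())     # ['GUI/GI_BROWSE']
```

Note that a count of zero does not evict the tool immediately; it merely makes the tool eligible for flushing, so a tool that is called again soon afterwards need not be reloaded or retranslated.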

A process called the Migrator, running outside the kernel, is

responsible for actually relocating objects in memory and for

incremental garbage collection.


MESSAGES

Since Taos does not support shared memory, the only way for objects

existing in the address spaces of different processors to interact is

by exchanging messages. The lightweight asynchronous mail system works

through just two kernel operations, SENDMAIL and READMAIL, and it's

non-blocking so that the sender continues executing without waiting

for a reply.

All Taos messages are sent to 'mailboxes' belonging to processes,

which act as queues for incoming messages. When a control object is

created and executed it automatically receives a default mailbox,

whose mail address is simply the ID of the child process which

executes the object. The new control object can then send mail to any

other object whose mailbox address it knows, which will always include

its own parents and children, and named resources like disk drives and

VDU displays. Messages may contain a whole list of successive

destination addresses for forwarding, along with the address of their

sender in case a reply is requested.

Taos messages are typed, with 16 reserved types used by the kernel

that include arrays, streams and executable code; error and debugging

data; screen refresh, mouse and keyboard events. A further 16 types

are free for programmers' use. The kernel on each processor handles

all incoming mail for its local objects, all outgoing mail, and mail

to be forwarded to another processor. The typing system enables the

kernel to trap system messages (e.g. executable code) and also allows

user-defined objects to prioritize the way they read their waiting

mail. Objects can employ the READMAIL kernel call to read messages

from their mailbox, adding a list of the desired message types as a

parameter. The result of such a call might be a message of the

required type, or the news that there are no such messages - if the

mailbox contains no messages at all then READMAIL suspends the calling

process until some mail arrives, so that you can use mail messages to

awaken sleeping objects.
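
The mailbox behaviour just described (non-blocking send, read filtered by type, suspension only when the mailbox is empty) can be modelled with a small Python sketch. It uses Python threads in place of Taos processes; the Mailbox class, its method names and the numeric message types are assumptions for illustration, not the kernel's interface.

```python
import threading
from collections import deque
from typing import Any, Deque, Iterable, Optional, Tuple

Message = Tuple[int, Any]    # (message type, body)


class Mailbox:
    """A queue of typed messages belonging to one process."""

    def __init__(self) -> None:
        self._queue: Deque[Message] = deque()
        self._cond = threading.Condition()

    def send_mail(self, msg_type: int, body: Any) -> None:
        # Non-blocking: queue the message and return without awaiting a reply.
        with self._cond:
            self._queue.append((msg_type, body))
            self._cond.notify_all()

    def read_mail(self, wanted: Iterable[int]) -> Optional[Message]:
        wanted_types = set(wanted)
        with self._cond:
            # Suspend the caller only while the mailbox is completely empty.
            while not self._queue:
                self._cond.wait()
            for index, (msg_type, body) in enumerate(self._queue):
                if msg_type in wanted_types:
                    del self._queue[index]
                    return (msg_type, body)
            return None      # mail is waiting, but none of the wanted types


box = Mailbox()
box.send_mail(1, "keyboard event")
box.send_mail(2, "array data")
print(box.read_mail([2]))    # (2, 'array data')
print(box.read_mail([3]))    # None
print(box.read_mail([1]))    # (1, 'keyboard event')
```

Passing a list of types is what lets an object prioritize its mail: it can ask for urgent types first and fall back to the rest only when none of those is waiting.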

Taos's link drivers hide the details of the physical transport

mechanism that implements the communication links from user programs

(though real-time performance issues may sometimes intrude). In the PC

demonstration I mentioned at the start of this article the transputers

were connected via their serial links, while the MIPS R3000 chips were

connected together through FIFO chips, and all of them talked to the

486 host CPU via the PC's VL local-bus.


PARALLEL PROCESSING

Taos is a fully distributed operating system which doesn't attempt to

exert central control over the execution of parallel applications.

Obviously in practice you must pick out one processor from which to

boot the system, but once all the kernels are booted Taos programs

tend to spread out over the network of processors in an almost organic

manner, controlled by a distributed load-balancing algorithm. Text Box

2 lists some of the Taos kernel calls, and if you examine the

subsection on 'control object management' you'll see the kind of

services that are available for spawning remote processes. These

kernel calls use the mail system to transfer executable VPcode from

one processor to another.

Information about the system's performance and current loading is

stored in the link drivers that control each communication channel. At

boot-time each link driver benchmarks the processors to which it's

attached (by timing the VPcode translator) and this number is divided

by the number of processes currently running to give a measure of

available power for each processor. The automatic load-balancing

algorithm uses these power figures in the allocation of new processes.

When a tool object arrives at a processor, the local kernel inspects

all the links leading outwards and asks "is there one of my nearest

neighbours who's got more spare power than I have?" - if so the object

is passed on, if not it executes here. Applications that dynamically

spawn many parallel processes spread out like water running down a

mountain, the flow seeking out the 'gullies' or lowest points in

processor-loading space.
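
The 'water running downhill' behaviour is easy to demonstrate with a toy model. The Python sketch below assumes that spare power is the benchmark figure divided by the number of running processes (an idle processor is treated as having its full benchmark, which is my assumption), and that an object always moves to the best neighbour if that neighbour beats the local figure. The names Processor and place, and the three-processor network, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Processor:
    name: str
    benchmark: float                  # measured at boot time
    processes: int = 0                # processes currently running
    neighbours: List[str] = field(default_factory=list)

    def spare_power(self) -> float:
        # Assumption: an idle processor counts as having full power.
        return self.benchmark / max(self.processes, 1)


def place(network: Dict[str, Processor], start: str) -> str:
    """Pass a new object from neighbour to neighbour until it settles."""
    here = network[start]
    while True:
        best = max((network[n] for n in here.neighbours),
                   key=Processor.spare_power, default=here)
        if best.spare_power() <= here.spare_power():
            here.processes += 1       # no better neighbour: execute here
            return here.name
        here = best                   # pass the object on


# A chain of three processors: A - B - C
network = {
    "A": Processor("A", benchmark=10, processes=5, neighbours=["B"]),
    "B": Processor("B", benchmark=10, processes=2, neighbours=["A", "C"]),
    "C": Processor("C", benchmark=20, processes=2, neighbours=["B"]),
}
print([place(network, "A") for _ in range(3)])   # ['C', 'C', 'B']
```

Each hop strictly increases spare power, so the walk always terminates, and as the fastest processor fills up, later objects settle on its neighbours instead. No processor ever needs a global view of the network.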

Each link driver also maintains a table of encoded information about

the network topology, used by the kernel to route messages. These

tables are dynamically updated at runtime so that if a new processor

is added to the system, news of its existence spreads outwards like a

wave. The nature of the routing algorithm reduces the probability of

deadlock due to circular message paths, and it can usually find

multiple paths between two processors (if they exist) which provides a

degree of fault tolerance if a link fails.

A programmer can always override the automatic load-balancing and

allocate objects to specified processors, by using the OPENDEVICE or

OPENGLOBAL calls, while OPENREMOTE invokes a partially automatic

mechanism where you explicitly send a number of objects to a

particular processor but let Taos distribute them automatically from

there. For example, you could specify that a 1000-process ray-tracing

calculation should be run by sending groups of 100 objects to 10

different processors, with Taos completing the distribution.


TAOS ON A PC

Though Taos can support its own file and display systems, the current

release version is PC-hosted, using the MS-DOS file system and a

SuperVGA graphics adaptor for display. I received Taos on six 1.44

Mbyte floppies, though more than three of these were filled with

bitmaps and MPEG animations. I was able to run Taos quite happily on

my 486DX2/66 Elonex PC as a single processor operating system,

coexisting on the same hard disk with Windows (though hardly

surprisingly it would not run under Windows).

Taos comes with a very simple GUI whose look-and-feel is loosely

modelled on Motif (fig.2). Control objects which you store in the

taos/control directory automatically appear on a pop-up menu from

where you can execute them with a mouse click. To supplement this GUI

you can open a shell window and use a command line interface, with a

syntax that resembles DOS. However, unlike DOS, Taos command lines

represent genuine pipelines in which each successive command launches

a separate process whose output is fed to the next.

The most immediately striking attribute of Taos is its blazing

graphics speed; you can grab a window in which an image is being

ray-traced and whirl it vigorously around the screen while tracing

continues unhindered. The GUI, which is packaged as a Taos class

object, works to a device-independent virtual screen with only two

hardware-dependent primitives to put and get bitmaps to the real

screen. Apart from SVGA adaptors Tao Systems currently implements the

GUI for several of Inmos's graphical TRAMs (Transputer Modules).

Processes running on remote processors can open screen windows by

sending messages to the processor running the GUI, rather like a

lightweight version of the X Window system.

Taos also encapsulates the MS-DOS filing system within its own object

model, so that DOS disk drives are mapped into Taos servers to which

you can send messages about the objects they hold. For example, a control

object called TRACE.CTL which is referenced in Taos by the message

\@PC1\CONTROL\TRACE is just the DOS file C:\TAOS\CONTROL\TRACE.CTL;

@PC1 is a Taos server object that aliases my C:\TAOS directory.
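
The mapping in this example amounts to replacing the server name with the directory it aliases. A minimal Python sketch, assuming that the .CTL extension is appended for control objects and using a hypothetical alias table and function name, would be:

```python
from typing import Dict

# Hypothetical alias table: Taos server object -> DOS directory
ALIASES: Dict[str, str] = {"@PC1": r"C:\TAOS"}


def to_dos_path(taos_name: str, extension: str = "CTL") -> str:
    """Resolve a Taos object name such as \\@PC1\\CONTROL\\TRACE to a DOS file."""
    parts = taos_name.strip("\\").split("\\")
    server, rest = parts[0], parts[1:]
    root = ALIASES[server]            # KeyError if the server is unknown
    return "\\".join([root, *rest]) + "." + extension


print(to_dos_path(r"\@PC1\CONTROL\TRACE"))   # C:\TAOS\CONTROL\TRACE.CTL
```

In Taos itself the resolution is done by sending a message to the server object rather than by a table lookup, so the same name could equally be served by a processor with no DOS disk at all.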


THE FUTURE OF TAOS

At present Taos is very deficient in the sort of development tools

that programmers under UNIX or DOS expect to find - the small Taos

team has devoted most of its time over the last two years to getting

the kernel and VPcode translation system robust, and to building a

variety of graphical tools for manipulating and displaying ray-traced

images and MPEG animations, all written directly in VPASM assembler.

There's a Basic compiler which uses a QBasic-like dialect but as yet

no C compiler - there is however a library called the Taos HLL

Toolset, accessible from VP assembler or Basic, which provides the

functionality of the ANSI C library including malloc, sprintf, fscanf

and all the rest. Work is underway on an in-house C++ implementation.

The much-hyped 'multimedia revolution' which puts a new premium on

cheap but high-performance graphics may prove to be a window of

opportunity for Taos. SGS-Thomson/Inmos has made a technology sharing

agreement with Tao Systems to use Taos on its next generation

processors (code-named 'Chameleon') in the games, visualisation and

multimedia markets. Tao Systems is presently negotiating with a large

Japanese communications corporation which is evaluating Taos as an

operating system for the TV 'set-top boxes' that will control the new

domestic multimedia services. These units will have to decrypt,

decompress, decode and otherwise mangle real-time data streams for

'video on demand', videophone communications and other services yet to

be invented - this will require large amounts of processing power, which

must be delivered at domestic electrical appliance prices. A small,

hardware-independent parallel operating system begins to look very

attractive; you can shop around for this week's best processor deals

and issue cheap upgrade cards to provide more processing power.


Dick Pountain - 08/03/94 15:04

--------------------------------------------------------------------


CONTACT:

Tao Systems

PO Box 2320,

London NW11 6PW,

England

-------------------------------------------------------------------


TEXT BOX 1.

An example tool written in VP assembly language. This tool changes the

backdrop picture shown on the Taos GUI desktop to another bitmap file

selected by the user from a browser window.


include 'tao.inc'

node bproc,CONTROLTP,0,TEMPLATE

    control 0,1024,DATATP,0,0,0,0

    component plmthd2-bproc

    compend

    tstring plmthd2,'DESK/BACKDROP'

nodeend bproc

node tl_b,TOOLTP,VP,TEMPLATE

    tool 'DESK/BACKDROP'

    ;inputs

    ;r6=control object pointer

    allocstruct 80,r7

    loop

        ;request filename from user

        cpy r7,r0

        lea bpath,r1

        qcall GUI/GI_BROWSE

        breakif r8=0

        ;gui will be parent

        enquire 'BACKDROP',0

        breakif r8=0

        ;send name to backdrop

        cpy r7,r1

        qcall LIB/PA_SEND

    endloop

    freestruct 80

    ret

    string bpath,'BITMAPS/*.TBM'

    toolend tl_b

nodeend tl_b

end

-------------------------------------------------------------------


TEXT BOX 2. TAOS KERNEL CALLS

This selection from among the 64 Taos kernel calls gives some

impression of the kind of services that the kernel provides.


Mailbox Management

SENDMAIL      Send a mail message
COPYMAIL      Copy a mail message then send the copy
READMAIL      Read a mail message from a mailbox
READTYPE      Read a mail message from a mailbox; wait until
              specified type arrives


Control Object Management

STARTCONTROL  Start a control object locally
OPENCONTROL   Create a control object and start locally
OPENCHILD     Create, distribute and start a control object in the
              network
OPENARRAY     Create, distribute and open a number of different
              control objects
OPENFARM      Create, distribute and open multiple instances of a
              control object
OPENDEVICE    Create and transport a control object to a specified
              processor and start it
OPENGLOBAL    Create, distribute and open multiple instances of a
              control object. Guarantee one control object on every
              processor
OPENREMOTE    Create and transport a control object to a specified
              processor for distribution from that processor, then
              start it
OPENPIPE      Create, distribute and open a pipeline of control objects


Tool Object Management

VCALL         Virtual Call tool object
OPENTOOL      Request tool object load
CLOSETOOL     Close tool object
FLUSHNAMES    Flush named tools from local tool list
FLUSHTOOLS    Flush un-referenced tools from local tool list
UNCLOSETOOL   Increments a tool object's reference count


General Object Management

VADDR         Obtain address of embedded object
OBJPROC       Process an object using the existing thread
LISTPROC      Process a linked list of objects using the existing
              thread
LISTTEST      Search list for types of node


Processor Type Identification & Mailbox ID

FINDTYPE      Enquire processor id of a processor node of
              specified processor type and with a minimum memory
              requirement
GETMYID       Enquire mailbox id of own control object
GETPARENT     Enquire mailbox id of parent control object
GETSERVER     Enquire server mailbox id for an object
NETINFO       Enquire processor and network information