Back in 1987 I wrote a user manual for the Occam concurrent programming language for Inmos, and I've always kept up an interest in parallel processing. Occam more or less ceased to be used after Inmos stopped production of the Transputer and then closed down, but there has been a recent revival from an interesting source. A group of teachers and students at Allegheny College in Meadville, Pennsylvania and the University of Kent, UK, have started teaching Occam for robotics control and other embedded systems (www.concurrency.cc). They're using an interpreter they wrote themselves called the Transterpreter (www.transterpreter.org), and they've ported this interpreter to Atmel's ATmega328 processor, which forms the core of the Arduino control board used by many artists and hobbyists (www.arduino.cc), where it runs in a skinny 20KB of flash memory.
As for myself, nowadays I experiment with a simulated system of my own design called PriPar (for Primitive Parallel computers), a software simulation of networks of parallel Turing machines. My PriPar system is implemented as a Ruby class of Turing machine objects, each of which can run its own local program that writes simple marks onto a simulated "tape" or erases them. I chose to employ "half-infinite" tapes with a fixed starting point on the left, running off (in theory) to infinity on the right: in practice of course PC memory eventually limits their size. And unlike the classic Turing machine, my PriPar machines are able to send messages to one another.
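To make the idea concrete, here is a minimal Ruby sketch of one such machine - not the real PriPar source, just my own illustrative names and structure: a half-infinite tape of marks and blanks whose head cannot move left of the fixed starting point.

```ruby
# A minimal sketch of one PriPar-style machine (illustrative, not the real API).
class TapeMachine
  attr_reader :tape, :head

  def initialize
    @tape = []   # half-infinite: index 0 is the fixed left-hand end
    @head = 0
  end

  def read
    @tape[@head] || " "   # unwritten cells read as blank
  end

  def write(mark)
    @tape[@head] = mark
  end

  def right
    @head += 1            # the tape grows on demand to the right
  end

  def left
    @head -= 1 if @head > 0   # can't move past the fixed starting point
  end
end

m = TapeMachine.new
m.write("/")   # mark the first cell
m.right
m.write("/")   # mark the second cell
m.left
m.left         # a second move left is simply ignored at the boundary
```

In practice the tape only consumes as much memory as has actually been written, which is what lets a "theoretically infinite" tape live comfortably inside a PC.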
My version 1 machines used a communication protocol based on Occam/CSP, in which ! means send a message and ? means wait for a message before proceeding - messages were contentless events that either happened or didn't. This worked well but sacrificed a lot of parallelism while machines waited for output. My latest version 2 machines instead send an actual mark, "/" or a blank " ", which gets written directly onto the other machine's tape. This is far more efficient (compare the programs sequential add.rb and parallel add.rb in the two versions): in effect the message channel becomes a shared tape.
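A rough sketch of the version 2 idea (my own simplification, not the actual PriPar code): the channel behaves like a shared tape that the sender deposits marks onto without waiting for a rendezvous, and the receiver consumes at its own pace. A number n travels in unary as n marks followed by a blank.

```ruby
# Sketch of a version-2 channel: a shared tape with independent read/write
# positions, so the sender never blocks as it would under the v1 !/? protocol.
class MarkChannel
  def initialize
    @tape = []       # the shared tape backing the channel
    @write_pos = 0
    @read_pos = 0
  end

  def put(mark)      # deposit "/" or " " and carry on at once
    @tape[@write_pos] = mark
    @write_pos += 1
  end

  def take           # consume the next mark; unwritten cells read as blank
    mark = @tape[@read_pos] || " "
    @read_pos += 1
    mark
  end
end

ch = MarkChannel.new
3.times { ch.put("/") }   # the number 3 in unary: three marks...
ch.put(" ")               # ...terminated by a blank
```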
One odd feature of the PriPar system is that rather than avoid deadlock it actually courts and exploits it. The Ruby harness in which a parallel PriPar program is embedded terminates the running program when it detects that all processes are simultaneously waiting - that is, deadlocked. (I even thought about calling them "Deadlock Machines"...) This is a sort of cheat, because each PriPar machine does have a proper halting state 0, and I could have insisted that programs terminate with all processes in state 0. However I've found that the deadlock alternative simplifies programming enormously, at the cost of sometimes producing a spurious result which can be fixed by introducing a synchronisation barrier (see below). The halting problem for parallel programs is a very deep one, and I find this an acceptable compromise.
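A toy version of that harness might look like this - a round-robin scheduler that steps every process in turn and stops the whole program the first time all of them report they are waiting. SimProcess and the :ran/:waiting step results are my own illustrative stand-ins for the real Ruby harness.

```ruby
# Sketch of deadlock-as-termination: run every process one step per round,
# and halt when all processes are simultaneously blocked.
class SimProcess
  def initialize(steps)
    @steps = steps              # each step is a lambda returning :ran
  end

  def step
    @steps.empty? ? :waiting : @steps.shift.call
  end
end

def run_until_deadlock(processes)
  rounds = 0
  loop do
    results = processes.map(&:step)
    rounds += 1
    # Terminate when every process is waiting at once: that's the deadlock.
    break if results.all? { |r| r == :waiting }
  end
  rounds
end

log = []
p1 = SimProcess.new([-> { log << :a; :ran }, -> { log << :b; :ran }])
p2 = SimProcess.new([-> { log << :c; :ran }])
rounds = run_until_deadlock([p1, p2])
```

Note that a process which merely finishes early just reports :waiting every round; the program as a whole only stops when nobody can make progress, which is exactly the behaviour described above.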
Here's a flow diagram for the PriPar program whose source code is in the box above - it calculates the values of a quadratic expression. Note that two PriPar machines can read from and write to the same tape. The source for this program is included among the attachments below as quadratic.rb.
Although each PriPar machine is a full-blown Turing machine, capable in theory of running any sequential program, in practice I discovered a small repertoire of commonly used routines that I employ as a higher-level language for writing programs. These routines resemble those in dataflow languages: "Inject" reads a tape and sends its contents to another machine; "Tally" receives a stream of messages and writes them to a tape; various other routines implement simple integer arithmetic and logic functions.
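Sketched in plain Ruby (the lambda-over-array form is my own simplification - the real routines run as PriPar machine programs), Inject and Tally are essentially the two ends of a copy pipeline:

```ruby
# Inject: read a tape and stream its marks out to another machine.
inject = ->(tape, channel) { tape.each { |mark| channel << mark } }

# Tally: receive a stream of marks and write them down onto a tape.
tally = ->(channel, tape) { channel.each { |mark| tape << mark } }

source = ["/", "/", "/", " "]  # unary 3, terminated by a blank
wire   = []                    # stands in for the shared message tape
dest   = []

inject.call(source, wire)      # source machine streams its tape out...
tally.call(wire, dest)         # ...and the destination machine records it
```

Chaining a handful of such routines - an Inject feeding an adder feeding a Tally, say - is how the higher-level PriPar programs are put together.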
I've also added Barrier Synchronisation, which enables complex programs to be divided into shorter sections: at the barrier, all processes have to catch up with one another before the next section is started. This scheme also enables dynamic programming, in which the program running on one or more machines gets changed at the barrier - it's even possible to create new machines at the barrier and start them running. You observe the results of a PriPar calculation using the built-in trace function, which animates activity on the tapes.
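In skeleton form (an assumption of mine about the structure, not the real PriPar scheme), the barrier amounts to phased execution: every machine finishes the current section before any machine starts the next, and the program installed for the next phase can differ from the last.

```ruby
# Sketch of barrier synchronisation as phased execution.
machines = [{ tape: [] }, { tape: [] }]

sections = [
  # Section 1: every machine appends a mark to its tape.
  ->(m) { m[:tape] << "/" },
  # Section 2: a different program, installed "at the barrier".
  ->(m) { m[:tape] << " " }
]

sections.each do |program|
  machines.each { |m| program.call(m) }
  # This point is the barrier: no machine starts the next section until
  # all have finished the current one. New machines could be appended to
  # the machines array here before the next pass begins.
end
```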
I have no idea where the limits on PriPar computation are yet. So far I can compute polynomial expressions, perform a crude kind of sort using either a tree of PriPars, or dynamic program changes at each pass, and I can add up terms of an arithmetic series. I can't so far compute the factorial function properly, though I've written a "cheat" version just for timing purposes which uses external Ruby code to put the successive integers into a series of tapes. I think that PriPar might need to be extended to allow each machine to output to multiple others to avoid this cheat: then one machine could set up the tapes for all the others. Multiple communication channels per machine is the next obvious enhancement.