Home‎ > ‎RFC 0 Blog‎ > ‎

Sally take my hand…

posted Dec 8, 2011, 3:09 AM by Michael Richter   [ updated Dec 8, 2011, 5:05 AM ]
There are two types of programming language that cover 99.44% (the Chambers Constant) of programming language design.
  1. Languages designed and promulgated primarily by academics like Prolog, Lisp, Haskell, Pascal, etc.
  2. Languages designed and promulgated primarily by corporate entities (factoring government as a corporate entity) such as C, Java, Erlang, Ada, etc.
The subject of today's blog is neither of these.  Indeed Forth (for that is the language I'm covering today) has a history as colourful as its creator, Chuck Moore, a very colourful (and opinionated!) character with a very colourful career.  He's also somewhat of a personal hero of mine, although I've never met the man (nor even interacted via email).  This is mostly because of his laser-sharp focus on "simplicity" and "correctness" in software over almost any other concern.  His opinions are, to me, a balm that soothes the rash caused by modern bugware and bloatware.

History of the language

Forth's history goes back into the late '50s and began with a very simple observation: most assembler code that Moore was writing at the time, and that he carried around with him as a programming toolkit, looked something like this (in a pseudo-assembler syntax that corresponds to no real machine):


The actual object code from the above would be the hex code for the CALL instruction (A9, say) followed by an address as a constant.  Moore's first discovery, in 1958, on the road to Forth was that you could get equivalent code in a much smaller space by simply providing a list of addresses and a small "interpreter" loop that just indexed into the function table, calling indirectly on each one.  It was, truth be told, a small amount less efficient in execution, but it was, on the machines of the time, often a 25-50% savings in space, depending on the specifics of the architecture!  (Keep in mind that back in those days 64 kilowords of memory would have been viewed as an unthinkable luxury!)

In 1961 Moore made his second discovery: the use of an implied data stack as the sole mechanism for passing parameters to and from functions led to very nifty unified evaluation semantics and with it to code that was very regular and very easy to reason about.  Four years later he added to the notions of his the idea of making the environment interactive, shifting programming from a mass of paperwork and thought (the maths approach) to more of an empirical system (the sciences approach).  With this he anticipated the current trend toward "agile" methodology by about thirty-five to forty years.

In 1968 Moore's travels down the road to Forth progressed further with the addition of the interactive compiler (and its attendant dictionary) and the return stack.  This gave his code very high performance for its day, with speeds somewhere between classic interpreters (like BASIC) and pure compilers.  The combination of exploratory programming, small size and high speeds made his environment particularly useful and the addition, in 1971, of indirect threaded code was the final feature that led to the Forth we know and love (and, incidentally, is the year the toolkit was finally named and declared a language).

The first real Forth was coded in an IBM 360 and ported to the Honeywell DDP-116 and DDP-316, as well as to the DEC PDP-11 later on.  The first major application to be built under the Forth name was software for a radio telescope.  The NRAO, however, was not interested in commercializing Forth at the time so Moore decided in 1973 to launch Forth, Inc., a company that is still running to this day, to commercialize Forth.

Since then Forth has been used in a bewildering variety of applications including: CAD systems, financial systems, space exploration (!), embedded systems (where it thrives to this day) and a host of other applications which can be found in various places on the web.  Indeed even today Chuck Moore is selling Forth solutions that go right down to the bare metal.  Despite its death being announced with monotonous regularity in many circles, Forth is still going strong, albeit often invisibly so.

That's all very interesting, Michael, but what is Forth?

Answering the question "what is Forth?" is harder that it may at first appear it should be.  I mean it's a programming language, right?  We're all programmers here, right?  We know what a language is.  What's the scoop on Forth?

Well, the scoop is that Forth isn't a programming language in the conventional sense that most programmers think of.  Among other things, Forth has no real grammar and barely any syntax at all.  Worse (or better!) yet, what little it has of either of these can be (and frequently is!) replaced completely.  All while still being considered "Forth".  This is not your uncle Olaf's programming language…

There are two ways to best describe Forth:
  1. It is an abstraction of an abstract machine.  (Yes, indirect abstraction.)
  2. It is an enforced approach to programming.

The abstract machine view

When adopting the abstract machine view, Forth is a data stack (a.k.a. parameter stack), a return stack, a dictionary of run-time routines (called "words"), a small loop that steps through these words (called the "interpreter") and a small routine that accepts input and uses it to build new words (the "compiler") or execute existing ones.  Notice what's missing in that description?  There's nothing there about lexing, parsing, abstract syntax trees, code generation, optimization, etc.  This is because a classic Forth has none of these.  The "lexer", such as it is, simply takes input and creates a list of whitespace-delimited words out of it.  This is tokenization so simple that even C's anaemic strtok() could handle it (indeed it would be overkill!).  There is no real parsing: the input routine executes words as it encounters them, switching into a compilation mode when certain words are executed.  There is no AST.  Instead words are built into lists of words that are themselves made of lists of words that at some point ultimately point to real code.  These words are then placed into the dictionary, a data structure that permits quick look-up of word definitions.  The dictionary, in effect, replaces the AST and the code generation phase of more normal languages.

Making it work

So how does this all work then, if there's none of the usual trappings of programming languages?  Well, the key is that there is a set of words provided by the Forth environment already.  These pre-defined words form the language and its semantics.  (You can take a look at the ANS Forth specification if you want an idea of what words a Forth environment will provide.)  How are these words defined?  Well most of them (including perennial programming favourites like loop constructs and conditional statements!) are written in Forth itself.  Indeed there are only a very small number of primitive words which must be implemented by the runtime in assembler.  After that the rest of a Forth system can be bootstrapped and written in pure Forth.

The key to how Forth works lies in "indirect threaded code", a concept which can be a bit mind-mangling when first encountered.  Rather than wasting blog space explaining how this works, I direct interested parties to this implementation/tutorial (search down to "INDIRECT THREADED CODE") for full, excruciating detail.  In brief, however, the way it works is that each word has a pointer to the code that executes it.  In most cases that pointer is to an interpreter in the Forth runtime that will take a pointer from the word's data and recursively calls itself.  In the case of primitives the pointer points to actual executable code embedded in the word definition.  Most of the work done in a Forth interpreter consists of calling the interpreter to get the next word.

So how does this work?  Well, let's look at the following session:

1    \ Portable Forth Environment 0.33.71 (Dec  4 2011 21:19:36)
2    Copyright (C) Dirk Uwe Zoller  1993 - 1995.
3    Copyright (C) Tektronix, Inc.  1998 - 2003.
4    Copyright (C) Guido U. Draheim 2005 - 2008.
5    ANS/ffa ITC Forth - Please enter LICENSE and WARRANTY. 
6    Running on x86_64 linux-gnu - to quit say BYE. ok 
7    5 6 + ok 
8    . 11 ok 
9    : double dup + ; ok 
10   5 double ok 
11   . 10 ok 
12   100 double ok 
13   . 200 ok 

Material in italic is just the header of the Forth run-time being used.  Text in bold is the interpreter's output.  (Note that ok is a prompt.)  Regular text is text typed in at the console.

The first actual line of code, line 7, has three words: 5, 6 and +.  The first one is recognized by the interpreter as a number and when executed simply pushes the number 5 onto the data (or parameter) stack.  The second is similarly recognized and pushes the number 6 onto the same stack.  After execution the stack looks like this:
6 ←

The + word is a pre-defined word supplied by the Forth environment.  It will take the top two items on the data stack and return their sum.  After its execution the stack will now look like this:
11 ←

Because of the stack-based, implied nature of parameters, math (and most things) in Forth is done in so-called "reverse Polish notation".  This is a format HP and some TI calculator users will be intimately familiar with.  (Note that future examples will show the stack in-lined with the top of the stack to the right, so the first stack diagram above will look like … 5 6.)

Lines 8, 11 and 13 simply execute the word . which takes the topmost entry in the stack off and displays it.  The 11 ok from line 8 is the interpreter's output.

These examples show the interactivity of the Forth environment.  Words are executed as they are encountered.  Executed, that is, unless the environment is switched into the compiler mode.  Line 9 demonstrates this.  The code : double dup + ; does the following:
  • : switches the run-time into compile mode.
  • double gives the name of the word being compiled and starts building the dictionary entry for it.
  • dup takes the…
OK, here we're going to have to back up a bit.  Ordinarily, at run-time, dup will take the top value of the stack and push it on again.  Thus 5 dup leaves a stack of … 5 5.  Since, however, we're in compile mode, what happens instead is that the dictionary is searched for the definition of dup and a pointer to dup's code is placed into the word definition we're building, double in this case.  So going back to the list:
  • dup places the address of the dup word's code in the definition of double.
  • + places the address of the + word's code in the definition of double.
  • ;
Dammit!  Here we go again!  We're in compile mode, but the ; word ends a definition.  How is this done?  The way things are now we should be putting the address of ; into the definition of double, but this is not what happens.  The reason is that ; is defined as an "immediate" word.  Immediate words are executed right away no matter what mode the run-time is in.  So back to the list:
  • ; ends the definition of the double word and puts the rest of the stuff needed in the dictionary, switching the system back to interpreter mode.
There is now a new word in the dictionary which can be executed at will.  Line 10 demonstrates this.  By executing 5, the stack contains … 5.  By executing double, the following occurs (resulting stack in parentheses):
  • dup is executed (… 5 5).
  • + is executed (… 10).
The rest of the session should be clear at this point.

Of note here is that any word can be redefined.  Any word.  I could decide later that double is better expressed as : double 2 * ;, for example.  More dangerously I can decide that 3 is better expressed as : 3 5 ;.  When a word is redefined, any subsequent use of the word uses the new definition.  (Uses of the word previous to the redefinition still use the old definition, one of the nice features of the way the Forth compiler works.)  Even more to the point, any component of the Forth run-time can, with a full environment (which typically features an embedded assembler), be replaced.  Read that again.  Any component of the Forth run-time can be replaced.  This is a level of flexibility that rivals (and possibly even surpasses) the Lisps!

The programming methodology viewpoint

As an almost-inevitable side effect of the way Forth works a very specific kind of methodology is subtly (or not-so-subtly) enforced by the language. The implied parameters living on a ubiquitous stack, for example, subtly encourage the creation of many simple words that operate on two or three stack items at most.  (Any more than that results in some very ugly stack tangling.)  The fact that words must be defined before they are used encourages a bottom-up design approach in the main.  (It's actually a bit more nuanced than that, but for the nuanced view I encourage reading Leo Brodie's Thinking Forth book.  It's useful for more than Forth developers, despite the title.)

In addition, the ability to quickly and easily write new words and test them on a live system immediately lends itself to an exploratory style of programming.  It is common, when developing Forth code, to write and rewrite code dozens of times over in the same time that it takes other languages to compile once.  This rapid feedback cycle lends itself well to finding algorithmic solutions to problems because less time is spent on ceremony and more on the actual problem domain.

Given the above observations, it is tempting to view Forth not only as a language (or, rather, an abstract machine abstraction) but also as a programming methodology.  (Given Charles Moore's open preaching for "simplicity" and "correctness" and his open disdain of most modern software's bloat and unreliability, it's likely that this is an intentional feature of the language.)

The death of Forth

Forth is often referred to (by the ignorant</flamebait>) as a dead language.  To be honest there is some (albeit limited) truth to this viewpoint.  Because Forth isn't one of the 99.44% mentioned at the beginning (corporate or academic originating languages) it doesn't get much fair press.  The language is too pragmatic to be of interest to most academics.  It's not based on increasingly abstract (and irrelevant) maths</more-flamebait>.  It has no type discipline whatsoever.  It features nothing that academics have their papered wet dreams over.

The view from the corporate side is equally bleak.  Forth not only doesn't really support the fancy processes that "enterprise" computing seems to insist upon, it almost, but not quite, actively resists it.  Forth is a language for creative programmers doing creative things.  It is not a language for code monkeys churning out almost, but not quite, identical functions and objects to almost, but not quite, identical specifications according to rituals laid out in law.

Where Forth still lives, however, and indeed thrives, is the embedded systems world.  Forth is in many more places than you think.  It's in the Space Shuttle (or, rather, was before NASA killed that program).  It's in many satellites.  It's in embedded display systems.  It's in the bootstrap loaders of many machines and/or peripherals.  Anywhere that space is at a premium and performance still counts is a candidate for Forth development.

So if it's so great…

…why don't more people use it?  Well, aside from the reasons given above, Forth has one more deep, dark problem.  One that is perhaps insurmountable.  Forth suffers from a syndrome very similar to "the Lisp curse" with a vengeance.  It is such a powerful and uniquely flexible language that it comes with almost nothing out of the box.  There is no standard Forth object system, for example, for those who want to do OOP.  Of course it's only an afternoon's work to make such an object system, but as is usual for such afternoon hacks the result is going to be incomplete and buggy.  It also likely won't play well with other people's object systems, thus making Forth harder to reuse across projects.  Forth is, more than most, an intensely personal programming language.

Worse, however, than this variant curse is Forth's status as a "programming skills amplifier".  A good programmer with good discipline and good habits can write positively great code in Forth; paragons of succinctness, readability and grace.  (Great programmers can write mind-blowing code!)  Most programmers, however, bluntly put, are not good programmers.  Most programmers are the aforementioned code monkeys.  And bad programmers with bad discipline and bad habits write positively horrific code in Forth; code full of unreadable and undecodable stack mangling, inconsistent interfaces, ridiculous names, poor refactoring and generally useless and worthless code.  It is this aspect of Forth that has given it a very unfair reputation as a write-only programming language.

So why should I learn Forth?

Learning Forth does several things.  First, it's good practice in a more familiar world (imperative) for learning the so-called "points free style" of many modern functional programming languages.  Second, it is, by virtue of being a programming amplifier, a good way to test your habits.  If you're finding your Forth code running away from your control and becoming an unreadable, unmaintainable mess it may be time for you to check your habits and see what's wrong.  Third, it's a great language if you want to explore everything from the lowest of the low in levels (you could implement a Forth environment in a weekend that would allow you to dynamically explore a PC at the bit-flipping level) to the highest of the high (the abstraction heights you can reach in Forth can easily match Lisp's).  Finally it will change the way you think of programming and what programming is, which any decent language worth learning will do.

I'm not a code monkey and I want to learn more!

Well that's easy then.  Pick up a Forth environment.  (I favour PFE myself, although GForth is decent as well.)  Skip on over, then, to Leo Brodie's Starting Forth and start working through it.  (Be aware, however, that you might have to take into account differences in environments while working through the book.)  You'll be up to productive speed in no time.  Afterwards, if you want to understand how a Forth works, take a look over at jonesforth, probably the best tutorial/implementation I've seen since Byte Books' Threaded Interpretive Languages that got me hooked way back when.  (I ported the code from that book into the 80286 because I wanted to explore the chipsets in my brand new PC-AT clone.)

If you want to get serious about Forth development (or, indeed, any development), Brodie's Thinking Forth is the place to go after that.  Although the information in it is a bit dated in terms of Forth implementations (it predates the ANSI/ISO standard), if you get this far you'll know enough to know how to take the differences into account.  If you're interested in the historical perspective on Forth's development, the Forth Dimensions archive is a blast to read.  You might even be able to find the article that describes a full Modula-style module system for Forth written in three lines of code.  Three short lines of code…

Also very instructive for learning Forth is looking at existing Forth implementations and programs.  The Forth Interest Group contains links to much of this kind of stuff.  (A particular favourite of mine is the Forth code for making slide rules because I'm perverse.)  If you're gonzo raving loony—like me—you can even take a look at ForthOS!

In addition to the above, if you want something that has many of the virtues of Forth but is better-suited to the more modern way of thinking, the Factor language is a good place to look.  Factor's documentation, however, to put it bluntly, sucks, especially at the conceptual level.  Learning Forth is a good way to get into learning Factor.

There is a lot to learn in Forth.  A lot more than I can show in a reasonably-sized blog entry (or even this unreasonably-sized one!).  It is a full programming environment with a colourful history of over half a century behind it.  It is also, if you know where to look, a vibrant and living language that's well worth the effort it takes to learn it.
Michael Richter,
Dec 8, 2011, 5:40 AM