Joseph Weizenbaum's Original ELIZA

Most people who know anything about Joseph Weizenbaum's ELIZA at the level of the program code think that it was written in Lisp and that it has been floating around since the publication of Weizenbaum's 1966 CACM paper. However, surprisingly, the ELIZA program itself was never published, and it wasn't written in Lisp, but in a now-obscure language called MAD-SLIP running on an IBM 7094 computer at MIT. Until now, the ELIZA source code has been missing, presumed by many to be lost, and because many alternate versions have been created over the years, the original source code remained undiscovered.

Weizenbaum's ELIZA was intended as a general conversational agent. It interpreted a separate, domain-specific script that determined its style of conversation. The best-known script is called "DOCTOR". This is the one that carries on the "Rogerian therapist" conversations that appear in Weizenbaum's 1966 paper, and which are most commonly associated with the name "ELIZA." Indeed, the name "ELIZA" has basically come to mean "the ELIZA agent running the DOCTOR script." ELIZA can be seen as the precursor of many of the conversational interfaces and chatbots that we have become so familiar with today, such as Siri, but it worked over a much clunkier typewriter-based console.

As Weizenbaum (1967: 475) explains, "From one point of view, an ELIZA script is a program and ELIZA itself an interpreter. From another perspective, ELIZA appears as an actor who must depend on a script for his [sic] lines. The script determines the contextual framework within which ELIZA may be expected to converse plausibly." In "Contextual Understanding by Computers" (CACM, 1967), he writes: "The first program to which I wish to call attention is a particular member of a family of programs which has come to be known as DOCTOR. The family name of these programs is ELIZA. This name was chosen because these programs, like the Eliza of Pygmalion fame, can be taught to speak increasingly well. DOCTOR causes ELIZA to respond roughly as would certain psychotherapists (Rogerians). ELIZA performs best when its human correspondent is initially instructed to 'talk' to it, via the typewriter, of course, just as one would to a psychiatrist."

However, only the DOCTOR script appears in the 1966 paper, not the ELIZA code that interprets that script, although the algorithm is described in great detail. The common misconception that ELIZA was originally written in Lisp arose because shortly after the publication of the 1966 paper, Bernie Cosell wrote a version in Lisp, based upon the description of the algorithm in the 1966 paper. Cosell's version used a version of the published script, but he never saw the original ELIZA.

The language of the original ELIZA was MAD-SLIP. MAD (“Michigan Algorithm Decoder”) was an Algol-like programming language, available on many common large computers at that time. Weizenbaum added list-processing functionality to MAD, creating MAD-SLIP ("Symmetric LIst Processor," Weizenbaum, 1963), a Lisp-like language embedded in MAD. Indeed, adding to the Lisp/MAD-SLIP confusion, the published DOCTOR script, an appendix to the 1966 paper, is parenthesized exactly like a Lisp S-expression, the way one would if one were writing a Lisp program.** Here is a copy of the MAD-SLIP manual (extracted from this University of Michigan manual, which in turn attributes the SLIP writeup to Yale.)

(Re)Discovery of the Original ELIZA

In hopes of discovering the original source code for ELIZA, I recently went [remote] spelunking in Weizenbaum's archives, held by MIT. I was aided by MIT archivist Myles Crowley. This exploration succeeded spectacularly! We found a set of files labeled "Computer Conversations," and the first file folder we opened included a complete source code listing of ELIZA in MAD-SLIP, with the DOCTOR script attached!

I contacted Dr. Weizenbaum's estate for permission to open-source this code, and they granted this permission under a Creative Commons CC0 public domain license.

Here is the real, true, original ELIZA (reading notes below). You are among the first to see this code in over half a century!

ORIGINAL_ELIZA_IN_MAD_SLIP_CC0_For_Resease.pdf

Anthony Hay has started to transcribe the core code, here, and I started a translation, based on Anthony's transcription, here. Anthony has also created an annotated version of the core code, here. If others would like to participate in either transcribing or translating the code, let me know and I'll add it to the site and/or link to yours. (Or, I guess, since it's open source, feel free not to let me know, but it'd be great if you made your work public.)

I want to thank Myles Crowley, Special Archival Librarian, MIT Libraries, who was my spelunking guide, as well as (alphabetically) David M. Berry (d.m.berry@sussex.ac.uk), Anthony Hay (anthony.hay.1@gmail.com), and Peter Millican (peter.millican@hertford.ox.ac.uk), who studied the code, participated in discussion of the project, and helped to create this page. We all thank the estate of Dr. Joseph Weizenbaum for permitting the original ELIZA to be open-sourced! Pm Weizenbaum was especially helpful in managing the open-sourcing of the code with Dr. Weizenbaum's estate, and in her careful editing of this page. Alex Moss and others at the Electronic Frontier Foundation helped us think about how to open source the code.

Jeff Shrager, May 23, 2021

References and Resources

Weizenbaum, J. (1963). Symmetric List Processor. Communications of the ACM, 6(9), 524-536.

Weizenbaum, J. (1966). ELIZA: A computer program for the study of natural language communication between man and machine. Communications of the ACM, 9, 36-45.

This paper remains under ACM copyright, and so is inaccessible except to members or by purchase. However, there are numerous versions of the PDF online.

Weizenbaum, J. (1967). Contextual Understanding by Computers, Communications of the ACM, Volume 10, Number 8, August, 1967, rtro.de/eliza61

Weizenbaum, J. (1984). Computer Power and Human Reason: From Judgment to Calculation, London: Penguin.

Here is a 1961 computer primer for the MAD language, by Elliott Organick. Here is the CTSS MAD manual. And here's a copy of the MAD-SLIP manual from the 7090 user's guide (extracted from this University of Michigan IBM 7090 executive manual, which in turn attributes the SLIP writeup to Yale).

An interesting discussion of Weizenbaum's 1976 "Computer Power and Human Reason" that touches many times on ELIZA can be found here: https://dl.acm.org/doi/pdf/10.1145/1045264.1045265

FORTRAN ASSEMBLY PROGRAM (FAP) for the IBM 709/7090 (Computer History Museum)

** Interesting additional context can be found here: https://computerhistory.org/blog/the-promise-of-the-doctor-program-early-ai-at-stanford/

Real Conversations with the Original ELIZA (Discovered in the Garfinkel Archive)

Anne Warfield Rawls (Bentley University, US & University of Siegen) and Clemens Eisenmann (University of Konstanz & University of Siegen, Germany) are curating the archives of Harold Garfinkel, the father of ethnomethodology.[1] They recently published an amazing paper in AI & Society about Garfinkel's exploration of ELIZA and other early conversational AIs (Eisenmann, et al, 2023).

That paper mentions conversations between real people and ELIZA, but only reproduces transcripts of interactions with LYRIC (a similar program used by Garfinkel at UCLA). Anne and Clemens provided Team ELIZA with access to the archive, and we found a whole series of printouts of original conversations between the original ELIZA and real people who were subjects in the experiment described in Quarton, et al. (1967). [2]

As far as we know these are the only examples of conversations between real people and ELIZA! [3]

IMPORTANT PROVENANCE AND COPYRIGHT INFORMATION: These materials are reproduced with permission of The Harold Garfinkel Archive, Newburyport, MA; Anne Warfield Rawls, Director, Intellectual Executor, and copyright holder of Garfinkel materials.

References:

Eisenmann, C., Mlynář, J., Turowetz, J., Rawls, AW. (2023) "Machine Down": making sense of human-computer interaction-Garfinkel's research on ELIZA and LYRIC from 1967 to 1969 and its contemporary relevance. AI and Soc. https://doi.org/10.1007/s00146-023-01793-z

Quarton, GC, McGuire, MT, Lorch S (1967) Man-machine natural language exchanges based on selected features of unrestricted input. I. The development of the time-shared computer as a research tool in studying dyadic communication. J Psych. Res. 5(2),165-177. PMID:6056818 DOI:10.1016/0022-3956(67)90029-5.

Notes:

[1] Thanks to Andrei Korbut for making the connection between the ELIZA and Garfinkel archive teams.

[2] This experiment appears to have been conducted with Weizenbaum's original ELIZA, but using a slightly modified script, called "YapYap", which is included in these materials.

[3] The only other known "real" ELIZA conversation is the one published in Weizenbaum's CACM paper, but that conversation was almost certainly constructed for the paper, or at least reconstructed from a real conversation and greatly cleaned up.

20220705: ELIZA Code Discovery Featured on CoRecursive Podcast

Adam Gordon Bell, who creates the CoRecursive Podcast, recently interviewed Jeff Shrager (founder of ElizaGen.org) about the discovery of the ELIZA code. He did a great job of turning what started out as two hours of conversation into an episode that makes our work fascinating and mysterious!

For fun, here's a short list of mostly minor errors that we've noticed:

Adam confused The University of Pennsylvania (UPenn) with Penn State at one point, but mostly got it right. (This is a common confusion; UPenn used to have t-shirts that said "Not Penn State"! :-)
Jeff called the author of the educational version of ELIZA "Paul Howard". His name is actually "Paul Hayward".
Adam says at one point that Bernie Cosell worked at Raytheon, and at another point that he worked at BBN. Bernie worked at BBN at the time of the events under discussion; BBN was acquired by Raytheon much later.
Jeff calls one of the missing functions "MATCH"; it's actually "YMATCH".
Adam says that SLIP was written in MAD. Actually, SLIP was what we would now call a package that added list processing functionality to MAD. It was written (in this case) in a language (unfortunately) called FAP: the "Fortran Assembly Program".
Jeff claims that MAD weirdly allows you to abbreviate "END" as "E'D". It's true that MAD permits this sort of abbreviation, but there is no "END" keyword; Jeff appears to be referring to the abbreviation "E'L", which is used where conditional blocks end, and so could be misinterpreted as an abbreviation for "END", but it's really an abbreviation for "End Conditional". (Not that that makes the language any less weird!)
Adam calls the ENIAC "the first general purpose electric computer". ENIAC actually stands for "Electronic Numerical Integrator and Computer", so it was probably best described as the first general purpose ELECTRONIC computer. (I know -- picky picky picky! :-)
Adam implies that ELIZAGen.Org can AUTOMATICALLY tell you something about ELIZA lineages. Would that it were so! There are some notes here and there about what came from what, but the problem that I set out to work on -- building a phylogeny of ELIZA implementations (whether manually or automatically) -- hasn't been achieved. (If you want to work on this problem, lmk!)
Adam implies that Anthony Hay started working on his C++ ELIZA in order to understand the rediscovered MAD-SLIP code. Actually, Anthony had been working on his version for some time before he and Jeff even met. It's true that he improved his version based on the discovered code, and that he was instrumental in understanding what the discovered code did. [Jeff did not make this clear in the interview.]
At the very beginning, Jeff says that in a party of philosophers, he'd probably be a linguist. In fact, Jeff has very little reason to fairly call himself a linguist. It would have made much more sense to call himself a molecular biologist in that setting.
Jeff hypothesizes that Bernie Cosell's ELIZA would have been written "at BBN in Maclisp, I guess would be… Well, it was BBN LISP at the time." This is a bit confused (and confusing). Both MacLisp and BBN Lisp were forks from PDP-6 Lisp 1.5, but whatever BBN used at the time (mid 1960s), it apparently wasn't called BBN Lisp until the early 1970s. These details are just messily confusing, and irrelevant. Let's just say that it was written at BBN in some non-MacLisp branch of Lisp 1.5, whatever it was called at the time. (Note that Bernie's original code is over on the "Commonly Known ELIZA Clones" page on this site.)
Adam says: "everybody thought ELIZA was written in LISP [because] Snippets of it were published in the original paper, and they looked like LISP S-expressions". This is slightly misleading. What Adam calls "snippets" is the DOCTOR conversation driver script, which is published in full in the CACM paper, and, indeed, looks exactly like a Lisp S-expression. And this almost certainly had a part to play in folks thinking that ELIZA was written in Lisp. However, that's not a snippet of ELIZA; it's the ELIZA conversation driver script. Another, probably more important, reason that people thought that ELIZA was written in LISP was that the only known version, until we rediscovered the MAD-SLIP version, was Bernie Cosell's LISP version.
There is a tiny confusion about Geoff Hinton, who was indeed a newly-minted assistant professor at CMU when Jeff Shrager was a grad student. In the next sentence, however, Adam lumps Hinton together with Simon, Newell, etc., saying "This group, they were the leaders of symbolic AI, what we now call GOFAI. Good old fashioned AI." This is an unintentional mis-classification; Hinton is definitely not among the leaders of GOFAI; he's one of the leaders of neo-connectionism, which is the antithesis of GOFAI.
Jeff implies that Simon won his Nobel based on the work that he and Newell did on proving mathematical theorems. This is a gross (and unintentional) minimization of Simon and Newell's contribution to cognitive science and AI (as well as behavioral economics, which is what Simon actually won the Nobel for). If one had to put a shorthand description to what they were working on it would be "human problem-solving", not merely mathematical theorem-proving.
Speaking of which, it is implied that Simon and Newell were Jeff's advisors at CMU. They were indeed among his advisors, but his committee chair was David Klahr, who built computational models of child development, and the committee included developmental psychologist Bob Siegler and cognitive scientist John Anderson.
In the outro, Adam makes an interesting connection, pointing out that Ron Garret, who was interviewed on a previous CoRecursive, helped Jeff "retrieve an old LISP version [of ELIZA] from an Apple II floppy." It's true that Ron did this quite amazing piece of vintage hacking, but he did it well before Jeff was collecting ELIZAs, so the causality didn't work in the implied direction; Ron didn't do it on Jeff's behalf.

SOME NOTES ON READING THE CODE

Note that the ELIZA main program actually starts on page 9 of the PDF (the 4th page of actual MAD-SLIP code), and continues to the end. It is only around 230 lines long. The code before that appears to be a set of sub-functions.

To my eye, the code originally looked like gibberish. There are no comments and only a few helpful labels that provide signposts to what's going on. However, once you learn a little about how to read MAD and SLIP, it becomes a bit easier.

Here is a 1961 computer primer for the MAD language, by Elliott Organick. And here's a copy of the MAD-SLIP manual from the 7090 user's guide.

And just knowing a few abbreviations will help a lot. Most importantly, keywords are abbreviated as follows:

W'R            WHENEVER                  (if)
O'E            OTHERWISE                 (else)
E'L            END OF CONDITIONAL        (endif)
T'O            TRANSFER TO               (goto)
OR W'R         OR WHENEVER               (else if)
T'H <label>    THROUGH <label>           (loop until label)
F'N <var>      FUNCTION RETURN <var>     (return <var>)
E'N            END OF FUNCTION

Boolean expressions read like Fortran:

.E.     equal
.NE.    not equal
.L.     less than
.LE.    less than or equal to
.G.     greater than
.GE.    greater than or equal to

Dollar signs ($) indicate string constants.

So, for example, this:

W'R WORD .E. $.$ .OR. WORD .E. $,$ .OR. WORD .E. $BUT$

Is saying, essentially:

When WORD == "." or "," or "BUT"
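As a reading aid, the abbreviations and operators listed above can be expanded mechanically. The following is a minimal Python sketch, not a MAD parser: the function name `expand` and the two lookup tables are made up for illustration, the tables contain only the entries given on this page (plus the `.OR.` and `.AND.` connectives that appear in the examples), and anything else is passed through unchanged.

```python
import re

# Keyword abbreviations from the list above ("OR W'R" falls out naturally,
# because "OR" is passed through and "W'R" is expanded).
KEYWORDS = {
    "W'R": "WHENEVER",
    "O'E": "OTHERWISE",
    "E'L": "END OF CONDITIONAL",
    "T'O": "TRANSFER TO",
    "T'H": "THROUGH",
    "F'N": "FUNCTION RETURN",
    "E'N": "END OF FUNCTION",
}

# Fortran-style operators, rendered with familiar modern symbols.
OPERATORS = {
    ".E.": "==", ".NE.": "!=",
    ".L.": "<", ".LE.": "<=",
    ".G.": ">", ".GE.": ">=",
    ".OR.": "or", ".AND.": "and",
}


def expand(line):
    """Rewrite one line of MAD into a more readable form, token by token."""
    # A token is either a $...$ string constant or a run of non-blank characters.
    tokens = re.findall(r"\$[^$]*\$|\S+", line)
    out = []
    for tok in tokens:
        if len(tok) >= 2 and tok.startswith("$") and tok.endswith("$"):
            out.append('"' + tok[1:-1] + '"')   # $BUT$ becomes "BUT"
        elif tok in OPERATORS:
            out.append(OPERATORS[tok])
        elif tok in KEYWORDS:
            out.append(KEYWORDS[tok])
        else:
            out.append(tok)                      # names, numbers, calls: unchanged
    return " ".join(out)


print(expand("W'R WORD .E. $.$ .OR. WORD .E. $,$ .OR. WORD .E. $BUT$"))
# prints: WHENEVER WORD == "." or WORD == "," or WORD == "BUT"
```

Running the listing through something like this line by line will not make it executable, but it may make the control flow easier to follow on a first read.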

To take a more interesting example, one of the things that the 1966 paper is vague about is the memory mechanism: it says only that a stored memory is used when a "certain counting mechanism is in a particular state." We found that this refers to LIMIT, which cycles from 1 to 4 and then starts at 1 again. When there is no keyword match (IT equals 0) and LIMIT is 4 and there are stored memories, then you get the oldest previously stored memory:

W'R IT .E. 0

W'R LIMIT .E. 4 .AND. LISTMT.(MYLIST) .NE. 0

OUT=POPTOP.(MYLIST)

TXTPRT.(OUT,0)

IRALST.(OUT)

T'O START
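The logic of that fragment can be sketched in Python as follows. This is only an illustration of the behaviour described above, not a translation of the listing: the class name `Memory` and its method names are hypothetical, the assumption that LIMIT advances once per user input is ours, and a `deque` stands in for the SLIP list MYLIST.

```python
from collections import deque


class Memory:
    """Illustrative model of the LIMIT counter and the MYLIST memory list."""

    def __init__(self):
        self.limit = 1          # LIMIT: cycles 1, 2, 3, 4, 1, ...
        self.mylist = deque()   # MYLIST: oldest memory at the left

    def tick(self):
        """Advance LIMIT by one step (assumed here to happen once per input)."""
        self.limit = self.limit % 4 + 1

    def remember(self, text):
        """Store a memory at the end of the list (storage policy is an assumption)."""
        self.mylist.append(text)

    def recall(self, keyword_found):
        """Return the oldest memory only if no keyword matched (IT == 0),
        LIMIT is 4, and the list is not empty; otherwise return None."""
        if not keyword_found and self.limit == 4 and self.mylist:
            return self.mylist.popleft()   # plays the role of POPTOP.(MYLIST)
        return None


memory = Memory()
memory.remember("FIRST MEMORY")
memory.remember("SECOND MEMORY")
for _ in range(4):
    memory.tick()
    print(memory.limit, memory.recall(keyword_found=False))
```

With LIMIT starting at 1, the loop visits the values 2, 3, 4, 1, and only the step where LIMIT is 4 yields a memory (the oldest one); the other steps return None, which in the real program would mean falling through to some other response.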

The SLIP augmentation to MAD (i.e., MAD-SLIP) provides many list processing functions, such as POPTOP and LISTMT. MAD-SLIP also includes the pattern matching and reassembly functions, YMATCH and ASSMBL, that form the core of ELIZA's sentence parsing and generation capabilities. YMATCH and ASSMBL are not described in Weizenbaum's 1963 SLIP paper, but are described in the MAD-SLIP manual from the University of Michigan Computing Center (1965).
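To give a feel for what pattern matching and reassembly involve, here is a small Python sketch that follows the description of decomposition and reassembly rules in the 1966 paper, where `0` in a pattern stands for any number of words and a number in a reassembly rule refers to the corresponding matched component. It is not the YMATCH or ASSMBL code: the function names `match` and `assemble` and the list-of-words representation are assumptions made for illustration.

```python
def match(pattern, words):
    """Split `words` into components according to `pattern`.

    Pattern elements: "0" matches any number of words (including none),
    a positive integer n matches exactly n words, and anything else must
    match one word literally. Returns a list of components, or None.
    """
    if not pattern:
        return [] if not words else None
    head, rest = pattern[0], pattern[1:]
    if head == "0":
        # Try giving the wildcard 0, 1, 2, ... words (shortest first).
        for n in range(len(words) + 1):
            tail = match(rest, words[n:])
            if tail is not None:
                return [words[:n]] + tail
        return None
    if head.isdigit():
        n = int(head)
        if len(words) < n:
            return None
        tail = match(rest, words[n:])
        return None if tail is None else [words[:n]] + tail
    if words and words[0] == head:
        tail = match(rest, words[1:])
        return None if tail is None else [[head]] + tail
    return None


def assemble(rule, parts):
    """Build a reply: numbers in `rule` are replaced by that component."""
    out = []
    for token in rule:
        if token.isdigit():
            out.extend(parts[int(token) - 1])   # components are numbered from 1
        else:
            out.append(token)
    return " ".join(out)


parts = match("0 YOU 0 ME".split(), "IT SEEMS THAT YOU HATE ME".split())
print(parts)
# prints: [['IT', 'SEEMS', 'THAT'], ['YOU'], ['HATE'], ['ME']]
print(assemble("WHAT MAKES YOU THINK I 3 YOU".split(), parts))
# prints: WHAT MAKES YOU THINK I HATE YOU
```

The recursive, shortest-first search is just the simplest way to write this down; the original functions operate on SLIP list structures and may well proceed differently.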

Here's a copy of Weizenbaum's 1963 SLIP paper (as embedded in Fortran)
Here is a 1961 computer primer for the MAD language, by Elliott Organick
Here's a copy of the MAD-SLIP manual from the 7090 user's guide

Interestingly, there is a GNU SLIP for C++, in use as recently as 2014 (although the language seems to have been modernized). It appears to be based on the 1963 SLIP, and does not include YMATCH or ASSMBL. (Although the GNU SLIP manual mentions that SLIP was used to implement ELIZA, the version here could not be used to do that because it is missing these important functions; at least not without re-programming those functions.)

Regarding the DOCTOR script, note that it is not pretty-printed and so is hard to follow. It appears to be very slightly different from the published script. For example, PERHAPS is missing the (DON'T YOU KNOW) clause, but there seems to be a (DON'T YOU KNOW) clause dangling free after the MAYBE rule. So it's likely that we have a very slightly different version from the one that created the published conversations.
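Because the script is parenthesized like a Lisp S-expression, a few lines of code are enough to re-indent a transcribed copy so that dangling clauses of this kind are easier to spot. The Python sketch below is one way to do that; the function names `parse` and `pretty` are hypothetical, and the sample text is a made-up fragment written in the style of the script, not a quotation from it.

```python
import re


def parse(text):
    """Parse parenthesised text into nested Python lists of string tokens."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced closing parenthesis")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced opening parenthesis")
    return stack[0]


def pretty(node, depth=0):
    """Render a nested list: flat lists on one line, nested ones indented."""
    pad = "  " * depth
    if isinstance(node, str):
        return pad + node
    if all(isinstance(item, str) for item in node):
        return pad + "(" + " ".join(node) + ")"
    inner = "\n".join(pretty(item, depth + 1) for item in node)
    return pad + "(\n" + inner + "\n" + pad + ")"


# Made-up fragment in the style of the script (not taken from the listing).
sample = "(HELLO ((0) (HOW DO YOU DO) (PLEASE STATE YOUR PROBLEM)))"
for rule in parse(sample):
    print(pretty(rule))
```

A clause that has come loose from its rule would then show up at the wrong indentation level, or as a separate top-level item, rather than being buried in a run of unbroken text.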

Anthony Hay has done some further analysis on the FAP HASH function, which is used in ELIZA's memory.

On 2014-12-20, Jeff Barnett provided this interesting historical note about the original ELIZA:

The original Eliza was moved to the ANFS Q32 at SDC (one of the (D)ARPA block grant sites) in the mid 1960's. The programmer responsible was John Burger who was involved with many early AI efforts. Somehow, John talked to one of the Playboy writers and the next thing we knew, there was an article in Playboy much to Weizenbaum's and everybody else's horror. We got all sorts of calls from therapists who read the article and wanted to contribute their "expertise" to make the program better. Eventually we prepared a stock letter and phone script to put off all of this free consulting.

The crisis passed when the unstoppable John Burger invited a husband and wife, both psychology profs at UCLA, to visit SDC and see the Doctor in action. I was assigned damage control and about lost it when both visitors laughed and kept saying the program was perfect! Finally, one of them caught their breath and finished the sentence: "This program is perfect to show our students just exactly how NOT to do Rogerian* therapy." *I think Rogerian was the term used but it's been a while.

A little latter [sic] we were involved in the (D)ARPA Speech Understanding Research (SUR) Program and some of the group was there all hours of day and night. Spouses and significant others tended to visit particularly in the crazy night hours and kept getting in our way. We would amuse them by letting them use Eliza on the Q32 Time Sharing System. One day, the Q32 became unavailable in those off hours for a long period of time. We had a Raytheon 704 computer in the speech lab that I thought we could use to keep visitors happy some of the time. So one weekend I wrote an interpretive Lisp system for the 704 and debugged it the next Monday. The sole purpose of this Lisp was to support Eliza. Someone else adopted the Q32 version to run on the new 704 Lisp. So in less than a week, while doing our normal work, we had a new Lisp system running Eliza and keeping visitors happy while we did our research.

The 704 Eliza system, with quite a different script, was used to generate a conversation with a user about the status of a computer. The dialogue was very similar to one with a human playing the part of a voice recognition and response system where the lines are noisy. The human and Eliza dialogues were included/discussed in A. Newell, et al., "Speech Understanding Systems; Final Report of a Study Group," Published for Artificial Intelligence by North-Holland/ American Elsevier (1973). The content of that report was all generated in the late 1960s but not published immediately.

The web site, http://www.softwarepreservation.org/projects/LISP/, has a little more information about the Raytheon 704 Lisp. The SUR program was partially funded and on-going by 1970.

[Ed. note: I think that this is the Playboy article referred to: http://blog.modernmechanix.com/computers-their-scope-today/ -- Jeff@20210619]

20220519: ELIZA Scriptwriter's Manual (Computer History Museum)

The Computer History Museum has generously created a PDF of the 1968 ELIZA Scriptwriter's Manual, by Paul R. Hayward. This document helps to fill out a little-known branch in the genealogy of ELIZA: soon after it was initially created (in 1965), teachers at MIT, especially Dr. Hayward, sought to apply it in education. The relevant CHM archive entry is here: https://www.computerhistory.org/collections/catalog/102683842. Interestingly, the version of ELIZA described in this document is greatly extended from the original, including what appears to be a MAD-SLIP interpreter built into the ELIZA scripting language.