Gödel, Escher, Blog

Stevey's Drunken Blog Rants™

Over the holiday I spent many pleasant hours reading Gödel, Escher, Bach by Douglas R. Hofstadter. It's a book I've attempted to read several times in the past, but (like many people) I've never managed to finish it. Despite Hofstadter's best efforts to keep it readable for the layperson, the math would eventually grind my progress to a halt. This time around, I've been pleasantly surprised to find that I can now work my way through all of his exercises, reach some of his conclusions before he spells them out, and generally follow his reasoning. My "math every day" program has paid off at least that much so far.

According to Hofstadter, his book started out as an essay explaining Gödel's Incompleteness Theorem, which staggered the mathematical world in 1931 by proving that any consistent formal system powerful enough to express arithmetic contains true statements that cannot be derived within that system. This rather put a dent in the work of Bertrand Russell and Alfred North Whitehead, who in their massive Principia Mathematica were attempting to establish a complete framework for mathematics and logic from first principles. As if that weren't bad enough, Gödel did it by "cheating": he basically established a formal system for describing formal systems, and then demonstrated that at its core there was a paradox, the very thing Russell and others had been trying to avoid.

Amusingly, our "Better Together" promotion (a.k.a. "buy X and Y together for the same price as you'd have paid for them separately") pairs the Principia Mathematica with Kurt Gödel's On Formally Undecidable Propositions of Principia Mathematica and Related Systems. And Gödel's book is only 1/100th the price of Bertrand's. The irony is palpable. Poor Bertrand.

Anyway, Hofstadter, yet another one of those pesky mathematicians, had been seeing the core ideas of recursion, self-reference, contradiction and paradox as a sort of -- well, a recurring theme, if you will, running through a great many interdisciplinary studies, including in art, music, literature, and artificial intelligence research. So he started writing a blog about it, even though they hadn't been invented yet. His exposition of Gödel's Proof grew from an essay into a grand Pulitzer Prize-winning work that everyone knows and loves, but few people have been able to finish. I think it may have won the Pulitzer Prize just based on the Achilles/Tortoise dialogs at the beginning of each chapter.

GEB is one of those books that's filled with wonderful ideas, and it encourages you to be filled with ideas, so I suppose it's no surprise that I had a mild insight while reading it, one that I'll outline a bit in my blog today.

Recursion and Smarts

Hofstadter's book elevates recursion (in the broadest sense of self-reference) to exalted heights, leading to Andrew Plotkin's recursive definition of recursion: "If you already know what recursion is, just remember the answer. Otherwise, find someone who is standing closer to Douglas Hofstadter than you are; then ask him or her what recursion is." For my blog today, I'll be using the slightly more restrictive meaning of recursion as a self-referential function that eventually terminates, because it always operates on a "smaller" version of itself.

When you write a loop or a recursive function, you need some sort of check to see if the loop or recursion should continue. I'll use "recursion" from now on to include recursion and iteration, since for plain old repetition they're formally equivalent, but recursion has broader and deeper consequences when talking about intelligence and intelligent systems. Rather than back that statement up here, I'll just refer you to Gödel, Escher, Bach, which explains it far better than anyone else has done.

For our part, I'm interested in that base-case check: the part of your function that says, on every invocation: "Mom, are we there yet?" I think this little base case, possibly very small compared to the rest of the computation, is maybe more interesting than we usually give it credit for, because it's the part of your function that's actually smart. The rest of the computation occurring is just mechanical work, but the base-case checks are different. I think of them as the part of the code that's the most self-aware, and hence the most intelligent. That's what I'll be talking about in this essay, so if it doesn't make much sense now, hopefully it will by the end.
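Here's a minimal sketch of what I mean, in Python. The function names and the countdown task are made up for illustration; the point is only to show where the base-case check lives in the recursive version, and that the same check turns up as the loop condition in the iterative one.

```python
def countdown_recursive(n):
    """Return [n, n-1, ..., 1], built recursively."""
    # Base case: the "are we there yet?" check, asked on every invocation.
    if n <= 0:
        return []
    # Everything else is mechanical work on a strictly smaller problem.
    return [n] + countdown_recursive(n - 1)


def countdown_iterative(n):
    """Same result, written as a loop."""
    result = []
    # The same check, now playing the role of the loop condition.
    while n > 0:
        result.append(n)
        n -= 1
    return result
```

Both versions spend one line on the check and the rest on the work, which is roughly the proportion I have in mind when I call the base case "possibly very small."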

Programming Yourself

When you're getting ready to write a program, or extend an existing one, you don't just dive in and start coding. Whether you're aware of it or not, you come up with a plan, at least in your head, for how you're going to approach the problem. You choose how you're going to solve it, you decide what exact outputs will satisfy the problem and/or make your boss happy, you estimate how long you think it'll take, you decide how long its lifespan will be and whether you'll ever need to re-use it (and hence, whether you'll need to find a permanent home for it and possibly document it), you guess which libraries or external dependencies it'll need, and so on.

For a small function, you may decide all these things in a matter of seconds, but you still (hopefully) think about them. For a large system, the complexity or size of the effort may inspire you to write your plan down on paper, and possibly share it with others. But you always do some sort of intelligent planning before you start writing code.

I'd argue that in doing this planning and decision-making, you've programmed yourself to write a program. You've established the base cases (e.g. "what output does it need to produce, and how will I verify it?"), and you've set up a bunch of criteria for determining, at various points along the way, whether you're getting closer to your goal of producing the right program.

As you execute your self-program, writing your code, you're probably stopping occasionally, if not constantly, to examine your progress and ask questions such as "am I still on schedule?" and "am I still convinced that my proposed solution is going to solve the problem?" This kind of constant self-appraisal and minor course-correction is clearly one of the things that differentiates humans from machines. As Hofstadter points out, machines can go on adding 1 to a number over and over without ever complaining, or feeling bored. People, however, can't help but "step out of the system" occasionally to ask questions about it, and look for patterns.

I think Hofstadter is actually talking about smart people, since obviously there are plenty of people who set a course for themselves and then rarely bother to think twice about it. You might say they rarely do any "meta-thinking" about their work; they just do it. They may very well be thinking about their work, but only at the base level of how to get things done, and how to execute on their initial plans. But they're not stepping out and thinking about the plan or the work itself, and its place in the grander scheme of things.

So while Hofstadter optimistically assumes that all people are smart and curious and reflective and eager to solve hard problems, I find myself being a bit more skeptical. You see, having conducted well over a thousand technical interviews, I've found that it's far more usual for people to be so challenged by their base-level technical problems that they often charge headlong into them without doing any meta-thinking at all. That's why my project-related questions always include meta-questions like "how many lines of code was your project or program", "how many people were working on the project and for how long", "who was the customer of this project", "how did you know if you were on track", and so on.

I find that smart people always have good answers to these questions: they're always wondering why they're doing what they're doing, and they always have a good grasp on the size, scope, goals, and direction of their efforts. Unsmart people (who, alas, comprise the majority of our interview candidates) not only haven't ever thought about these things, they're usually quite surprised, if not openly hostile, about being asked. They evidently think of themselves as cogs in a machine, one that's being piloted by someone else.

One of the best programmers I've worked with works at Amazon. Aside from just being really smart, she also has a personality characteristic that helps make her a great programmer: she is pathologically averse to being "at fault" for anything that might go wrong. So she writes the most robust code-fortresses I've ever seen; it's the programming equivalent of a survivalist laying tripwires and traps all around his tent, to make absolutely certain he won't be surprised. This programmer I'm talking about is always the first to know if one of her many assumptions in planning or writing the program was incorrect. And her stuff basically never breaks. Or if it does, well, it's not her fault, and she finds out about it before anyone else (and fixes it).

That's a quality that I see in all of the good engineers I know. As they're writing programs, they litter their code with checks, double-checks, validations, assertions, and so on. They never assume anything blindly. And their stuff is generally rock-solid as a result.

The very best engineers (including the one I mentioned above) go a step further than that: they litter their "self-programming" (i.e. the plans they intend to execute) with the same sorts of checks as they put in their code. It's usually not so much explicit as it is a matter of habit, of routine. When you're driving to the mall, you may consciously plan out which roads you're going to take (depending on traffic, weather conditions, etc.), but you're unlikely to add "check rear-view mirror every 7 seconds" into your conscious plan. It's really part of a meta-meta-program that you've built up for yourself, and all your meta-programs inherit from it.

I think that's an important point: self-aware programming happens at several levels, including in the code and in the coding. At the base level, you're doing something straightforward: let's say you're driving to the mall, which is somewhat analogous to writing a program. Driving involves manipulating the car controls, looking for other cars, etc. If you're not thinking at a meta-level, then if you happen to be stuck in a six-hour traffic jam, you may never notice, and the mall could be closed by the time you get there. Similarly, if your code is taking far longer than you originally anticipated, but you're not paying attention at that level, then you could wind up at your deadline, with your manager standing in your office, and nothing to show for it.

Well, your meta-meta-level of consciousness (one of them, anyway) is taking care of looking in the rear-view mirror occasionally, to make sure there isn't an out-of-control logging truck about to plow into you, or a cop with his lights on, or something equally noteworthy. Initially, before it becomes habitual, this check has to be at your first meta-level: part of your plan, and something you need to remind yourself to do. Well, technically your meta-meta-meta-level needs to remind you that it's time to run through all the things you promised to remind yourself about. Consciousness is actually a fairly deep tower of self-awareness, and it's for the most part utterly absent in computer programs.

But the very best programmers manage to embed a little bit of "consciousness" into their programs, and they remain highly self-aware during the process of writing them.

The Smartest Chess Program

Hofstadter mentions a certain Canadian computer-chess championship in which one chess program really managed to impress the judges and experts, even though it wasn't particularly good at playing chess.

What it was good at was knowing when it was licked. It had a good set of meta-heuristics that took stock of the situation and decided: "hey, I'm gonna lose, so I'll just quit now rather than prolong the agony". None of the other chess programs in the tournament had this level of self-awareness built in, so even though they were generally better at playing chess, they would still go through the boring, ritual motions of being checkmated when they'd lost. Everyone considered the self-aware program to be unusually "smart".
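I don't know how that program actually decided to resign, so what follows is purely a hypothetical sketch in Python of what such a meta-heuristic could look like. The class name, the threshold, and the "patience" count are all my own assumptions. The idea is that the monitor sits one level above the move search and watches the program's own evaluation of its position.

```python
class ResignationMonitor:
    """Hypothetical meta-level check that sits above the move search."""

    def __init__(self, threshold=-9.0, patience=3):
        # threshold: evaluation (in pawns, from our side) at or below which
        # the position counts as hopeless. Assumed value, not from the book.
        self.threshold = threshold
        # patience: how many consecutive hopeless evaluations we tolerate.
        self.patience = patience
        self.hopeless_streak = 0

    def should_resign(self, evaluation):
        """Record the latest evaluation and report whether to give up."""
        if evaluation <= self.threshold:
            self.hopeless_streak += 1
        else:
            # Any sign of life resets the count.
            self.hopeless_streak = 0
        return self.hopeless_streak >= self.patience
```

Note that none of this makes the program any better at chess. It only makes it better at noticing how the chess is going.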

I would argue that the smartest programs -- the kind we want running on our machines at Amazon -- are programs that are self-aware enough to know when their fundamental assumptions have been violated, and that have at least some decision-making ability built in for handling such situations. I think it goes without saying that you'd rather have your program tell you when things are going wrong than have your customers tell you, perhaps days later.

Or does it really go without saying? Actually, I think most programmers at Amazon and elsewhere do not want to build that kind of intelligence into their programs. The reason is simple: intelligence is expensive. The more checks and assertions you build into a program, particularly in a loop, the more resources you need in order to execute the code. It'll take more memory, or more CPU, or even network/database/filesystem resources as it goes out to validate assumptions like "is this customer still around", "is the logging system I'm supposedly publishing to still available", etc.

Most programmers are pathologically concerned about the runtime performance characteristics of their code, so much so that they will discard virtually all other measures of code-quality (including intelligence) in order to maximize performance. If any intelligence at all is put into the code, it's typically "protected" by compile-time or runtime flags that allow you to turn it off, so your program can run in Stupid Mode. This is of course the default in production, where the code is seeing the most load, and also where the majority of the assumptions will be tested.

As you may have noticed, I've been drawing parallels between the mental processes that you go through as you write code and the actual code that you're producing. Smart programs are written by smart programmers. I leave it as an exercise to the reader to prove whether the converse is also true.

Hierarchical Intelligence

My parallel between "intelligent self-awareness" as a quality metric for code and "intelligent self-awareness" as a quality metric for engineers is, as you may have guessed, just an isomorphism between two levels in a deeper hierarchy. Engineers work on teams, teams work in organizations, organizations in companies, and companies in industries. At each level, I think it becomes harder to build in systematized self-awareness.

Intelligence itself is a tower of meta-levels of self-awareness, and I think people tend to equate the word "intelligence" with "human intelligence", without allowing for simpler versions. We're pretty far away (at Amazon and in the world at large) from building a software system that would pass the Turing Test. But that doesn't mean we need to give up altogether! You can build systems that simply operate at one meta-level above the base "get stuff done" level, and they'll be significantly more robust. They won't necessarily be smart, but they'll be smarter.

At the level of program code, it can be as simple as adding assertions into your code that test your assumptions, and not turning those assertions off at runtime, unless you can pretty well prove that a particular module will never fail. We don't have many of those: it's difficult to prove such a thing about systems or modules that contain side-effects, which is why functional programming enjoys such acclaim in academic settings, although that's a separate essay. For our purposes, if you can't have a function or service call that never fails, the next-best thing is to have it say "Ouch!" very loudly if its internal diagnostics show that it's not working properly. That, of course, requires having internal diagnostics in the first place.
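As a rough sketch of the "say Ouch! very loudly" idea, here's one way it might look in Python. Everything named here (require, AssumptionViolated, average_order_value) is invented for the example. The one real detail worth knowing is that Python's built-in assert statement is stripped out when the interpreter runs with -O, so a check you want left on in production has to be an ordinary function call.

```python
import logging

logger = logging.getLogger("diagnostics")


class AssumptionViolated(RuntimeError):
    """Raised when one of the code's own assumptions turns out to be false."""


def require(condition, message):
    """Always-on check: unlike `assert`, it is not removed by `python -O`."""
    if not condition:
        # Say "Ouch!" loudly: log it, then refuse to carry on.
        logger.error("Ouch! %s", message)
        raise AssumptionViolated(message)


def average_order_value(order_totals):
    """Hypothetical example function with its assumptions spelled out."""
    require(len(order_totals) > 0, "no orders to average")
    require(all(total >= 0 for total in order_totals), "negative order total")
    return sum(order_totals) / len(order_totals)
```

The checks cost a pass over the data on every call, which is exactly the kind of expense I was complaining about people refusing to pay.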

At the next level up, when writing your base-level code, it can be as simple as establishing some habitual checks before, during, and after you write the code. Before you begin, for instance, you may well ask: "Should I really write this code in C++?" That depends on how intelligent you want your system to be. C++ has virtually no introspection capabilities, at least not in any standard way supported by the majority of compilers; this (among many other things) makes it very difficult to write C++ code that would qualify as intelligent or self-aware. You can do it, certainly, but it may be too much effort on both levels we've discussed so far. At the source level, your supporting code may far exceed your base code in both size and complexity, and at the programmer level, the extra effort involved in writing the "intelligent" parts may overwhelm the work required to write the base parts. In the heat of battle, even if you have the best of intentions, you may triage the intelligence in order to meet a schedule. And then at runtime, you (and your code) will have very little insight into what's happening as it runs, and at crash-time, you'll have very little diagnostic information to tell you what happened.

The Meeting Treadmill

At the team-level or organization-level, a whole new set of issues arises for which we rarely build in self-awareness checks. For example, teams often schedule ongoing recurring meetings. A "war team" meeting is a typical example. When you schedule an ongoing meeting, you're setting up an infinite organizational loop. There's nothing in the meeting proposal itself that checks, at each recurrence, whether we should still be having the meeting. The meeting proposal relies on intelligence outside the meeting framework to determine whether it should continue; hence, a recurring meeting is by definition unintelligent. I'm not saying the organizer(s) are unintelligent; I'm saying the meeting itself is unintelligent, because it never stops to ask if it should continue.

It would be nice if your meeting-organizing framework (e.g. Microsoft Outlook) allowed you to program intelligent base-case checks into a meeting as part of its infinite recurrence. But scheduling a meeting isn't as flexible as writing a program, and you're stuck with whatever the framework gives you: for instance, setting an end-date on the meeting. But that's just a guess; you don't usually know when the meeting will really not be needed anymore. It would be better if somehow you could program in some checks that detect whether the initial criteria for having the meeting are all still valid. For a 1:1, it might check whether both invitees are still Amazon employees, and whether one still works for the other. For a war-team meeting, it might try to figure out whether we're still at war. Obviously this requires more organizational metadata than we typically have at our disposal. And it would make scheduling meetings even more of a pain than it is now. But it would have the decided advantage that we would no longer automatically continue to attend recurring meetings, even if we're not sure if they're still needed.
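To make that a little more concrete, here's a hypothetical Python sketch of a recurring meeting that carries its own base-case checks. No calendar software I know of works this way; RecurringMeeting, make_one_on_one, and the little org dictionary are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RecurringMeeting:
    title: str
    # Each check answers one question: "is this reason for meeting still valid?"
    still_needed: List[Callable[[], bool]] = field(default_factory=list)

    def should_occur(self) -> bool:
        # With no checks, all() of an empty list is True: the meeting recurs
        # forever, which is the unintelligent infinite loop described above.
        return all(check() for check in self.still_needed)


def make_one_on_one(org, report, manager):
    """Build a 1:1 that re-validates its own premises at every recurrence."""
    return RecurringMeeting(
        title=f"1:1 {manager}/{report}",
        still_needed=[
            lambda: report in org["employees"] and manager in org["employees"],
            lambda: org["manager_of"].get(report) == manager,
        ],
    )
```

The hard part, of course, is not the dozen lines of code but having something like that org dictionary available and up to date.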

The problem of self-awareness is a very difficult one at the organizational level. It's really hard for a company to course-correct, because they've invested so much in getting onto their course in the first place. Jobs have been created and staffed, plans have been put into place, people have been trained, meetings have been scheduled... the larger an organization is, the harder it is for them to re-evaluate whether they're still doing the right thing. That's why so many companies filled with smart people go down like ships at sea. They build up too much steam, and their momentum carries them straight into the reef or iceberg.

I think this actually happens for a lot of reasons, not just momentum. For one thing, I think there's an issue of "saving face" that sometimes comes into play. You see this with government committees that perpetuate themselves; they find ways to justify their existence, and they'll fight to keep their charter even when any reasonable person can see that the committee had the wrong goals, or has met a point of diminishing returns, or whatever. There's also simple inertia at play: people get comfortable in their jobs, and they don't want to see their jobs go away, even if the job isn't the Right Thing overall.

I think the lack of institutionalized self-awareness is also due, in some part, to modeling technical organizations on military organizations, because the military just seems so darned squared away, at least to those outside of it. My brother and I each spent five years in the U.S. Navy -- he was a jet mechanic for F15s, and I was a nuclear reactor operator -- and we saw pretty clearly that even if the military is effective, it's far from efficient. You're trained from the very beginning to follow orders unquestioningly, to be a cog in a big machine piloted by someone else, which is exactly how unintelligent machines are constructed. So there's no dearth of famous stories of military inefficiency, in which everyone at the bottom knew some process was silly, but there was enough built-in resistance to communicating this upwards that it would be years or even decades before a money-squandering inefficiency was noticed and corrected.

I don't think it makes sense to model a software or technical organization on the military. You don't want your employees to be cogs; you want them to be constantly vigilant for schedule slips, organizational malfunctions, and so on. I think you want them to test the organization's assumptions, just like you want your program code to test your programmers' assumptions. And you don't want your organization to act like a machine, happily adding 1 to a number over and over (especially if the number in question is "the number of SDEs employed here") without ever questioning whether it's the right thing.

But again -- all this is really hard to do, especially with larger organizations, for all the reasons I've outlined, and others as well. For today, I guess I'd be happy if we all agreed to write smarter programs, taking the performance hit in favor of robustness and intelligence.

Conclusion and Lessons

I'm in an "action items" kind of mood, so I'll summarize with a few lessons I think we can take away about creating intelligent systems. All of these takeaways are about intelligence, but I've set them up so you can conveniently substitute "self-aware" wherever you see "intelligent" below.

1) Write intelligent programs. You don't need to build HAL 9000, but you should put checks into your code wherever you're making assumptions. Recognize that writing in higher-level languages automatically makes your programs somewhat more self-aware, and that you'll wind up doing it yourself if you write in C++, so you're spending the CPU cycles either way. You might as well not spend the engineering cycles on top of that.

And leave those checks on in production. That's where you need them the most.

2) Program intelligently. Come up with concrete plans for every function you write or modify, even if only in your head. Monitor your own progress, and validate the initial assumptions as you go. If you start getting behind, course correct and/or tell your manager. If you think the light at the end of the tunnel is an oncoming train, definitely tell your manager. This isn't the military, and you're allowed and encouraged to speak up.

3) Schedule intelligent meetings. Your meeting-organizing software is unlikely to provide this facility programmatically, so at the beginning of every ongoing/recurring meeting, you should have a little meta-meeting and determine whether it should continue. Make sure you have good reasons for continuing, and recognize that human nature tends to favor keeping things going long after they're no longer strictly necessary or even useful.

Note that this is different from saying "schedule meetings intelligently." You should do that, too, of course. But this takeaway is specifically about building self-awareness and introspection into the meeting itself (even if only as part of the agenda), so you have a formal way of course-correcting down the road.

4) Hire intelligent people. Ask people meta-questions about their work and their profession. Don't hire people who've obviously never thought about this kind of thing. Self-aware people tend to be less "epistemologically challenged", to quote our own Jay C., and they tend to learn things faster and be smarter, because they know the limits of their own knowledge, and how to push the boundaries.

5) Improve your own intelligence. Everyone can always get smarter. Here's the algorithm I use, expressed (of course) recursively.

function get_smarter():

    1. Am I smart enough yet? (default answer: of course not!)

    2. If yes, go make a grillion dollars, or go fishing, or something. (Note: this statement is unreachable).

    3. Otherwise:

        1. Do various important things, until I have some free time.

        2. Pick a book from my book list and read it.

        3. If I've read it already, check all the references at the end, and add the ones that look good to my book list.

        4. get_smarter()

If your book list is empty, add Gödel, Escher, Bach to it before calling the function for the first time.
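For the literal-minded, here's a rough Python sketch of the same routine. Since the "smart enough" check never says yes, a straight transcription would recurse until the interpreter gave up, so this version adds a free_sessions budget as a stand-in base case. That parameter, the references dictionary, and the placeholder titles you'd feed it are all assumptions of the sketch, not part of the algorithm above.

```python
from collections import deque


def get_smarter(book_list, references, free_sessions, read=None):
    """Recursive sketch of the reading loop.

    book_list:     deque of titles waiting to be read
    references:    dict mapping a title to the titles it cites (made-up data)
    free_sessions: remaining reading sessions; added so the sketch terminates
    """
    if read is None:
        read = []
    # Base case. "Am I smart enough yet?" is never true, so the only ways
    # out are running short of free time or running out of books.
    if free_sessions <= 0 or not book_list:
        return read
    # Pick a book from the list and read it.
    book = book_list.popleft()
    read.append(book)
    # Check its references and queue the ones not already read or queued.
    for ref in references.get(book, []):
        if ref not in read and ref not in book_list:
            book_list.append(ref)
    # Recurse on a smaller problem: one fewer free session.
    return get_smarter(book_list, references, free_sessions - 1, read)
```

Calling it with a one-book list and a generous budget returns the titles in the order they were read, with each book's references queued behind it.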

And with that, I'm off to go finish the book!

(Published Dec 28th 2005)