Alfalfa Batmagoo

I wanted to build and maintain my own server, and not have to rely on Amazon or Google or Oracle clouds. This was my goal, not out of any philosophical objections to those platforms, but because I am a geek, and I wanted control over the machine from boot to shutdown and everything in-between.

My server is for hosting an AI player of the game of go. Yours is likely for providing content or services of some other type, but the principles are the same.

Click here to see how I did it.

A Behavior Classification Scheme for the Game of Go

Being: 1) A mechanism for categorizing and characterizing behavioral styles of play, and

2) Its implementation as a "bot", via machine learning (training neural networks).

Thousands of games between hundreds of pros are observed; then the "bot" can not only play go, but it can also:

Emulate the playing styles of other players whose games have been observed.
"Guess the Pro", even when presented with a previously unseen game by that pro.
"Guess the Next Move" when observing a previously unseen game where the identity of the players is known.

Best of all, this system can observe a few games by an individual, and then generate a unique identifier (somewhat like a "fingerprint") for that player. This identifier also serves as a descriptor of some features and aspects of the player's style. As such, it can help human players pinpoint behaviors that may be weaknesses for them to strive to overcome, and for their opponents to try to exploit; we all have such weaknesses. Similarly, if a bot knows against whom it is playing, it may adjust its own behavior so as to attempt to exploit the opponent's weaknesses, as described in that foe's identifier. It is closer to the truth to say that individual moves possess such a fingerprint, and that the player identifier is an integrated, cumulative amalgamation of all the behaviors in which this player has ever been observed to indulge.

Since a player's behavior changes over time, the identifier we describe below provides a snapshot outlining, or "summing up", the nature of the player's behavior so far. Thus, the ID matrix for a specific player may change over time, as the player's level of skill changes. This player identifier is an amalgam of that player's history as we know it. In any case, the terms "fingerprint" and "snapshot" are offered only as figures of speech, presented here to explicate the following specific analogy:

A fingerprint is to behavior (individual move), as a snapshot is to identifier (a player's ID matrix).

According to Interpol, "There are three main fingerprint patterns, called arches, loops and whorls. The shape, size, number and arrangement of minor details in these patterns make each fingerprint unique." Similar remarks apply to the moves, the plays, in a go game; that is, to individual behaviors in go. There are moves that attach to the foe's stones, moves that attach to friendly stones, and moves that attach to no other stones. "The shape, size, number and arrangement of minor details in these patterns make each" behavior (i.e., move played) "unique." Our opinion is that go moves should be at least as easy to classify as actual fingerprints are. It is also devoutly to be hoped that our opinion stands to reason.

As for the notion of a snapshot, the "player ID matrix" introduced below is, like a photograph, the recording of a moment in time, captured along the path of one's progress, representing one's cumulative efforts up until that moment. The identification mechanism presented below provides an ID matrix that is a summation, or rather an integration, of all the moves (behaviors, or "fingerprints") that we have observed a particular player make during many games. Having generated the ID, the individual games we have observed may be discarded; yet we will still have an identifier describing the player's style. This allows us to compare, contrast, and categorize the various styles of many players, indeed those of any player, between and among each other.

"I don't know what it is, but I know it when I see it."¹

It may be said, of moves selected by experts to play in a game of go, that they have a quality called "balance" that is lacking in the play of amateurs, and even more lacking in that of beginners. But... "It may be said" that the moon is made of unripe cheese. "It may be said" that I am Marie, Queen of Romania. Saying so does not make those statements true.

Saying "go players should seek balance in their moves" could be a meaningless platitude, or it could be the truth, or it could be somewhere in-between, e.g., true in general with some exceptions and so on. A proof would require a precise definition of balance. In the absence of such rigor, we ask for the reader's patience as we present at least some evidence of the truth of the following assertion (which we ourselves regard as not only obvious, but also as axiomatic):

Most behaviors in which pros actually indulge have a high degree of "balance".²

¹ Potter Stewart (1915–1985), associate justice of the U. S. Supreme Court from 1958 to 1981; frequently remembered for his famous non-definition of obscenity.
² Whatever that may be, and however we may measure it.

The Adjectives of Behavior — In Go, and in Life

The Dimension of Attitude: Pessimism versus Optimism, or, from the Meek to the Bold

In go, as in life, thought and behavior range "from the sublime to the ridiculous." Thought and behavior are two forms of action. The former is focused inwardly, while the latter is focused outwardly. Sometimes people think one thing, but do a different thing, causing a conflict between the inward thought and the outward behavior. Generally speaking, thought and deed should be in harmony, or balance, with one another. "Balanced" is one of several adjectives (properties, really) that we use without defining rigorously. Several adjectival phrases also come to mind now: "Inwardly focused." "Outwardly focused." "In harmony." "In conflict."

Cowardly ones, please line up to the left, to the "Meek" side of the mean line. Bullies (hey, Buddy, I'm talking to you!), line up to the right, to the "Bold" side of the mean line. The more of a coward you are, the more leftward you go, and conversely, the more of a bully you are, the more rightward you go. And what does the "mean line" mean? It's the sublime, stable behavior in the middle: the behavior than which no other behavior has a greater degree of our confidence in its correctness. So the y-axis above represents the percent win-rate, or the "strength" of the behavior, where strong means a high degree of balance. That balance is learned by the network from training data; it is the output of the network's evaluation of a particular behavior. Every behavior has such a win-rate; the question, of course, is how to measure it.

It cannot be stressed too strongly that we are not speaking of "left" and "right" as in political ideologies, but to the left or right of the bell-curve shown in the graphic above. There are cowards on both sides of the political aisle, just as there are bullies on both sides.

The measuring occurs during the feature-extraction phase of training the neural network.

Students of statistics will remember that the width of a bell curve is determined by the standard deviation: 68% of the data points are within one standard deviation of the mean, 95% are within two standard deviations, and 99.7% are within three standard deviations of the mean. So, in the figure above, the Pathologically Defensive moves comprise 0.15% of the data, and the Pathologically Aggressive moves represent 0.15%. These extremes represent Ridiculous behaviors that comprise only 0.3% of the data: pros and strong amateurs rarely play such moves, but it happens.

The remaining data points lie between these two extremes, comprising the 99.7% of behaviors that lie within three standard deviations from the Sublime mean. Assuming a Gaussian distribution, as depicted above, we can cover the various intervals with the categories shown:

Pros: [Light blue above] Cautious, but not overly so, hopeful but not overly so. Pro moves are at times somewhat reserved, but rarely timid, and never (overly) generous. At other times they are somewhat ambitious, but rarely eager and never (overly) greedy. (68% of the data.)
Strong Amateurs: [Light blue and purple above] Can range from moderately reserved, often timid, and at times even generous, to moderately ambitious, often eager, and at times even greedy. (Another 27% of the data; along with the Pro data, that sums to 95% of the data.)
Beginners: [Light blue, purple, and maroon above] Beginner behavior ranges from defensive and generous all the way to greedy and aggressive. (Another 4.7% of things that Pros and Strong Amateurs almost never do, bringing us to an accumulation of 99.7% of the data.)
Trolls, wing-nuts, and mistakes: Troll moves include even the pathologically defensive and pathologically aggressive behaviors, that is the entire bell curve, including the ridiculous 0.3% described above. (The remaining 0.3% of the data.)
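The percentages quoted for these categories follow from the Gaussian assumption and can be checked directly. The sketch below is only an illustration: the function names are hypothetical, and `band_for` assumes that a behavior is labeled with the narrowest category whose interval contains it, which the text does not state outright.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mass_within(k: float) -> float:
    """Share of a Gaussian's data lying within k standard deviations of the mean."""
    return normal_cdf(k) - normal_cdf(-k)


def band_for(x: float) -> str:
    """Name the narrowest category whose interval contains x.

    x is a position on the attitude axis, in standard deviations from the
    mean (negative = meek, positive = bold). Assumption: narrowest band wins.
    """
    distance = abs(x)
    if distance <= 1.0:
        return "Pro"
    if distance <= 2.0:
        return "Strong Amateur"
    if distance <= 3.0:
        return "Beginner"
    return "Troll"


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"within {k} standard deviation(s): {mass_within(k):.2%}")
    print(f"each pathological tail: {normal_cdf(-3.0):.3%}")
    print(band_for(-0.4), band_for(1.5), band_for(-2.7), band_for(3.2))
```

Because 68%, 95%, and 99.7% are themselves rounded, the computed values differ slightly from the figures in the text; each tail, for instance, comes out nearer 0.135% than 0.15%.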

An alternative labeling for the ranges shown. Each Chinese zodiac animal spans 1/3 of a standard deviation. For example, the Pig's range is from "minus two standard deviations from the mean", to "minus one-and-two-thirds standard deviations from the mean", while the Rat's is from "zero standard deviations from the mean", to "one-third of a standard deviation from the mean", the Ox's from 1/3 to 2/3, and so on.

The dimension of attitude, with Chinese Zodiac animals labeling various ranges within 2 standard deviations from the mean.
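As a rough sketch of this labeling, the lookup below maps a position on the attitude axis to an animal. The positions of the Pig, Horse, Rat, Ox, and Snake, and the Sheep/Ox and Monkey/Tiger pairings, come from the text; the slots for the Dog, Rooster, Rabbit, and Dragon are an assumption, filled in from the traditional zodiac order and the rule that each animal sits opposite its traditional foe. All names in the code are hypothetical.

```python
import math

# Left-to-right order along the attitude axis, from -2 to +2 standard deviations.
# Dog, Rooster, Rabbit and Dragon are placed by assumption (see the lead-in).
ANIMALS = [
    "Pig", "Dog", "Rooster", "Monkey", "Sheep", "Horse",
    "Rat", "Ox", "Tiger", "Rabbit", "Dragon", "Snake",
]
SLOTS_PER_SD = 3   # each animal spans 1/3 of a standard deviation
LEFT_EDGE = -2.0   # the Pig's range starts at minus two standard deviations


def animal_for(x: float):
    """Return the animal whose range contains x, or None outside [-2, +2)."""
    index = math.floor((x - LEFT_EDGE) * SLOTS_PER_SD)
    if 0 <= index < len(ANIMALS):
        return ANIMALS[index]
    return None


def range_of(animal: str):
    """Return the (low, high) bounds of an animal's range, in standard deviations."""
    index = ANIMALS.index(animal)
    return (LEFT_EDGE + index / SLOTS_PER_SD,
            LEFT_EDGE + (index + 1) / SLOTS_PER_SD)


def opposite(animal: str) -> str:
    """Return the animal whose range mirrors this one across the mean."""
    return ANIMALS[len(ANIMALS) - 1 - ANIMALS.index(animal)]


if __name__ == "__main__":
    print(animal_for(0.1), animal_for(-0.1), animal_for(1.9))
    print(range_of("Pig"), opposite("Sheep"))
```

Positions beyond two standard deviations return `None` here, since the zodiac labels only cover that interval.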

A few observations: The Horse "errs on the side of caution", and the animals to its left become increasingly more cautious. The Pig is "generous to a fault" while its traditional enemy the Snake is greedy, verging upon pathological aggression. Notice that each animal's range is opposite its traditional foe's range. The Horse and its traditional foe the Rat, are the most stable of the animals, which is entirely appropriate, considering that that's where they live (in a stable).

The Sheep is opposite the Ox, the Monkey opposite the Tiger, and so on. Say hello to some of my little friends!

It's not our intent to discriminate against nor to denigrate any person, living or dead, regardless of the year of their birth!

The degree to which each player/animal "believes" the proverb: "The enemy's key point is my own."

For each available move (behavior), each zodiac animal places a different amount of emphasis on its foe's win-rate (should the foe play there) and on its own win-rate, for playing at the same point. The Pig, for example, is so self-absorbed that it almost entirely ignores the foe, placing 91% confidence in its own win-rate, and only 9% confidence in the foe's win-rate, if the foe were to play there. At the other extreme, the Snake is busy looking for points his foe wants to play. The Snake almost ignores its own win-rate, placing only 9% confidence in that, but 91% confidence in the foe's win-rate. So, where we use the scare-quotes around "believes" what we mean to convey is really "the degree to which each player/animal has confidence in the proverb."
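One way to picture this weighting is as a blend of two win-rates for the same point. The sketch below assumes a simple linear blend, which the text does not specify; only the 91%/9% figures for the Pig and the Snake come from the text, and the candidate points and their win-rates are invented purely for illustration.

```python
# Confidence each animal places in its OWN win-rate; the remainder goes to
# the foe's win-rate for the same point. Only these two values are from the text.
OWN_CONFIDENCE = {"Pig": 0.91, "Snake": 0.09}


def blended_score(own_win_rate: float, foe_win_rate: float,
                  own_confidence: float) -> float:
    """Blend own and foe win-rates for one point (assumed linear weighting)."""
    if not 0.0 <= own_confidence <= 1.0:
        raise ValueError("own_confidence must lie between 0 and 1")
    return own_confidence * own_win_rate + (1.0 - own_confidence) * foe_win_rate


def pick_point(candidates: dict, animal: str) -> str:
    """Choose the candidate point with the highest blended score for an animal.

    candidates maps a point label to (own_win_rate, foe_win_rate).
    """
    confidence = OWN_CONFIDENCE[animal]
    return max(
        candidates,
        key=lambda point: blended_score(*candidates[point], confidence),
    )


if __name__ == "__main__":
    # Hypothetical win-rates: point A is better for us, point B is the
    # point the foe most wants.
    candidates = {"A": (0.60, 0.40), "B": (0.45, 0.70)}
    print("Pig plays", pick_point(candidates, "Pig"))
    print("Snake plays", pick_point(candidates, "Snake"))
```

With these made-up numbers, the self-absorbed Pig takes its own best point while the Snake takes the point its foe wanted, which is the contrast the paragraph describes.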

As for "the Divine Move", tradition holds that it's the most sublime, wise, strong, and skillful move of all time. If we accept as axiomatic that its data point is, theoretically, the unique behavior which has a degree of balance of exactly 50-50, then by definition the Divine Move possesses that balance and exhibits the qualities of deep understanding and supreme confidence that there is no better move than it. We'll go ahead and assert that it's the (unique) point at the "top" of the curve above. A line tangent to the curve, and parallel to the x-axis, intersects the curve at a unique point and blah-blah-blah. In theory. Thus all other behaviors fall to one side or the other of the mean, where x=0.

We do not presume to speak for the Divine.

In our limited understanding, the so-called "Hand of God" or "God's Move" is the best move (or "hand" in Asian languages, similar to a "hand of poker" in English). It's the likeliest point upon which both players most urgently want to place a stone. It is neither too defensive, nor too aggressive, neither too generous nor too greedy, neither too meek nor too bold, yet may be tinged with both a bit of caution and a bit of hope. It is as reserved as it must be, and at the same time it is as ambitious as it can be. The Divine move is neither pessimistic, nor is it optimistic: It is realistic. (And its timing — a topic about which we write elsewhere — is, was, or will be, perfect. Theoretically.) Therefore:

The y-axis above could be labeled "Realism", or "Objectivity", or "Detachment", or "Confidence", but in any case y varies from 0% upward to 100%. [Look! More adjectives for our psych glossary: Realistic. Objective. Detached. Confident. Measured as a percentage.]
Other than the Divine Move, all data points are either pessimistic or optimistic to some degree. [In other words, the data is subject to the constraint that, for all moves which are not the Divine Move, x ≠ 0. Only the Divine Move, the one data element which has never really been observed, is 100% realistic. Irony? You're soaking in it!]

Footnote about distributions. For serious math geeks only!

Note that the percentages shown above are just points on a slider. Ideally, in theory, we should integrate across each animal's range to get more precise percentages for that range. Here's how to do that: https://www.mathpages.com/home/kmath045/kmath045.htm . [Whew!]

We just aim for the middle of that animal's section of the curve, and throw a dart at that value. It's sufficiently accurate for our purposes and we do not really require the precision that integrating would provide. But here is a really cool calculator, that does the same thing! https://www.hackmath.net/en/calculator/normal-distribution .
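For the curious, the comparison between integrating and aiming for the middle can be sketched in a few lines. This assumes a standard normal curve and reads "aim for the middle" as a midpoint approximation (curve height at the center of the range times the range's width); that reading is our assumption for illustration, not a statement of how the slider values were actually chosen.

```python
import math

SLOT_WIDTH = 1.0 / 3.0   # each animal spans 1/3 of a standard deviation
LEFT_EDGE = -2.0         # leftmost range starts at minus two standard deviations
SLOT_COUNT = 12          # twelve animals


def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z: float) -> float:
    """Height of the standard normal curve at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def slot_bounds(index: int):
    """Return the (low, high) bounds of the index-th range, counted from the left."""
    low = LEFT_EDGE + index * SLOT_WIDTH
    return low, low + SLOT_WIDTH


def integrated_mass(low: float, high: float) -> float:
    """Exact share of the data between low and high."""
    return normal_cdf(high) - normal_cdf(low)


def midpoint_mass(low: float, high: float) -> float:
    """Approximate share: curve height at the middle times the width."""
    return normal_pdf(0.5 * (low + high)) * (high - low)


if __name__ == "__main__":
    for index in range(SLOT_COUNT):
        low, high = slot_bounds(index)
        print(f"[{low:+.2f}, {high:+.2f})  "
              f"integrated={integrated_mass(low, high):.4f}  "
              f"midpoint={midpoint_mass(low, high):.4f}")
```

For ranges this narrow the two numbers differ by less than 0.001 in every slot, which supports the claim that the extra precision is not really needed.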

We could have used a different distribution to set the values, as long as the pair of values sums up to 100%. However, the Gaussian distribution seems more closely to match the actual data, than does, for example, a linear distribution such as the one shown in the table and graph below:

End of footnote about distributions.

So, given all of that, what, precisely is "balance" in the context of go? By definition:

Balance is achieved when a behavior is neither too aggressive, nor too submissive.

If one plays too aggressively, one may find oneself overextended, leaving faults that may be exploited by the foe in a later stage of the game.

Conversely, if one plays too submissively, one will find oneself over-concentrated, occupying less than half the territory at the game's end.

That is the very essence of balance.

TIME

An important way in which said balance reveals itself is in the timing of a move. The best players always seem to play moves neither "too early" nor "too late", but "right on time", "just in time", or "not a moment too soon". For example, endgame moves that could conceivably be played at any time during the game are left until a later time, after attending to other, more urgently pressing matters. So, it matters not only where a move is played, but when it is played, amidst a global context.

To put it another way, a move that is either too early or too late, is either too aggressive or too submissive for that time, for that moment, of the game. Thus the timeliness, or punctuality, of the better players' moves is (again) balanced.

The Endgame

We use the term "endgame" here in a metaphorical sense, to refer to our goal, our objective. What do we hope to prove?

Ideally, there would be a numeric indicator for each possible play which indicated the degree of balance. Then one may simply "play by the numbers" to win the game.

Is there a way to measure the degree of balance of go moves? The degree to which a move is indicative of a winning strategy? The best move?

Ultimately, might there also be a way in which to capture, to describe, to emulate, the behavioral style of a particular individual? To view go through the unique lens through which that individual views go, and to play go according to that style?

We think so.

As we stated at the beginning, our goals include not only "a bot that plays go", but also bots which are able to:

Emulate the playing styles of other players whose games have been observed.
"Guess the Pro", even when presented with a previously unseen game by that pro.
"Guess the Next Move" when observing a previously unseen game where the identity of the players is known.

Toward that end, we present the following machine-learning architecture for classifying go-behavior.

Above: Top-level overview of Gozillago program architecture.

Q. How many styles of go-playing are there?

The trivially true answer, not to be glib or dismissive of the question, really is:

A. There are at least as many styles of playing go, as there are go-players.

Every go-player has his or her own unique style. The odds against any game being identical to another, that is, comprising exactly the same series of individual moves, become astronomically huge after about forty moves. Furthermore, because an individual's style is always changing, there are actually more styles of play than there are players! It is not too far off the mark to suggest that every individual move, by each and every player, in every game, is a behavior which, likewise, characterizes a style of play.

We assert "at least" in our answer to the question above, because we may also include imaginary players, such as "bots", whose characteristics may be set to arbitrary values, by means of "sliders". For example, think of potentiometers and tone-controllers in audio equipment, which are used to control electrical resistance, measured in Ohms (Ω).

We may set the values arbitrarily, measured instead in "horsiness" or "rattiness" or "dragoniness", etc. to create an infinite number of player profiles. These may be considered "random variables" for purposes of statistical analysis.

Of course, these behaviors do not occur in a vacuum, but rather within a context that has been created by previous behaviors by Black and White. How a player reacts in a given situation is also indicative of that player's style, or type of behavior, within that specific context. A part of that context can include "What do I, the bot, know about this specific opponent's past behavior?"

And time marches on.

We further -- somewhat arbitrarily -- define eight stages of the game:

Early fuseki, Deep fuseki, Late fuseki, Early middle-game, Deep middle-game, Late middle-game, Early endgame, and Late endgame.

Are seven enough? Are nine too many? At this point, the reader may be wondering, "where is all this leading?"

Time series: 8 phases of the game, 12 animals. Ninety-six pieces of pie.

Now, as promised, we present the "Player Identification Matrix", which describes the "style" of a player at a specific point in that player's history, which is to say that it comprises, or encapsulates, "what we know about this player so far". Not only (like a fingerprint) is it unique to that player, but it is also unique to the game itself, and as such can also be useful as a hash-value to identify a specific game (indeed, to index a specific play within a specific game by a specific player and a specific opponent!). The odds that two of these matrices will share identical values (after a few moves into a game) are infinitesimal for, to astronomical against. [As Dave Dyer has shown, a go game quickly -- that is, after about forty moves -- becomes unique, not being required to share its hash-value with any other game of go.]

Eight pies × twelve pieces per pie = 96 pieces of pie

Each pie is a 12-dimensional vector, and there are eight of them. The size of the slices may vary wildly. Since their sum within the pie must always be 1 (that is, 100% of the pie), their sizes may be expressed as percentages, to the degree of precision desired. Each pie represents one of the eight stages of the game listed above, from early fuseki through late endgame.

So, for example, a player could play like a tiger in the fuseki stages, tending toward horse-like play in the early middle-game and to some other zodiac animals for the endgame stages. There are as many styles of play as there are players. The values 0 through 7 represent each stage, but notice that some games may end before entering the endgame phases at all, while every game of go has an early fuseki (zero) stage.
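To make the eight-pies picture concrete, here is a minimal sketch of how such a matrix could be assembled from classified moves. It assumes each observed move has already been labeled with a stage (0 through 7) and an animal; the function name, the animal ordering, and the choice to mark a stage that was never reached as `None` are all assumptions for illustration, not a description of the actual Gozillago code.

```python
from collections import Counter

# The eight stages, indexed 0 through 7.
STAGES = [
    "Early fuseki", "Deep fuseki", "Late fuseki",
    "Early middle-game", "Deep middle-game", "Late middle-game",
    "Early endgame", "Late endgame",
]
# The twelve slices of each pie (column order is an arbitrary choice here).
ANIMALS = [
    "Pig", "Dog", "Rooster", "Monkey", "Sheep", "Horse",
    "Rat", "Ox", "Tiger", "Rabbit", "Dragon", "Snake",
]


def build_id_matrix(observations):
    """Build an 8 x 12 matrix of slice sizes from (stage_index, animal) pairs.

    Each row is one pie whose slices sum to 1. A stage with no observed
    moves is stored as None (an assumption: the text does not say how an
    unreached stage should be represented).
    """
    counts = [Counter() for _ in STAGES]
    for stage_index, animal in observations:
        counts[stage_index][animal] += 1

    matrix = []
    for stage_counts in counts:
        total = sum(stage_counts.values())
        if total == 0:
            matrix.append(None)
        else:
            matrix.append([stage_counts[animal] / total for animal in ANIMALS])
    return matrix


if __name__ == "__main__":
    # Hypothetical observations: tiger-like in the early fuseki,
    # horse-like in the early middle-game.
    moves = [(0, "Tiger"), (0, "Tiger"), (0, "Horse"), (3, "Horse")]
    for name, pie in zip(STAGES, build_id_matrix(moves)):
        print(name, pie)
```

Keeping the raw counts around rather than only the percentages would let the matrix be updated as new games arrive, at the cost of storing a little more than the 96 values themselves.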


This section is still under construction.

More amazing developments to follow.

Please stay tuned for further results.