Alfalfa Batmagoo
I wanted to build and maintain my own server, and not have to rely on Amazon or Google or Oracle clouds. This was my goal, not out of any philosophical objections to those platforms, but because I am a geek, and I wanted control over the machine from boot to shutdown and everything in-between.
My server is for hosting an AI player of the game of go. Yours is likely for providing content or services of some other type, but the principles are the same.
A Behavior Classification Scheme for the Game of Go
Being: 1) A mechanism for categorizing and characterizing behavioral styles of play, and
2) Its implementation as a "bot", via machine learning (training neural networks).
Thousands of games between hundreds of pros are observed; then the "bot" can not only play go, but it can also:
Emulate the playing styles of other players whose games have been observed.
"Guess the Pro", even when presented with a previously unseen game by that pro.
"Guess the Next Move" when observing a previously unseen game where the identity of the players is known.
Best of all, this system can observe a few games by an individual, and then generate a unique identifier (somewhat like a "fingerprint") for that player. This identifier serves also as a descriptor of some features and aspects of the player's style. As such, it can be helpful for human players in pinpointing behaviors that may be weaknesses for them to strive to overcome, and for their opponents to try to exploit; we all have such weaknesses. Similarly, if a bot knows against whom it is playing, it may adjust its own behavior so as to attempt to exploit the opponent's weaknesses, as described in that foe's identifier. It is closer to the truth to say that individual moves possess such a fingerprint, and that the player identifier is an integrated, cumulative amalgamation of all the behaviors in which this player has ever been observed to indulge.
Since a player's behavior changes over time, the identifier we describe below provides a snapshot outlining, or "summing up", the nature of the player's behavior so far. Thus, the ID matrix for a specific player may change over time, as the player's level of skill changes. This player identifier is an amalgam of that player's history as we know it. In any case, the terms "fingerprint" and "snapshot" are offered only as analogies, presented here to explicate the following specific comparison:
A fingerprint is to behavior (individual move), as a snapshot is to identifier (a player's ID matrix).
According to Interpol, "There are three main fingerprint patterns, called arches, loops and whorls. The shape, size, number and arrangement of minor details in these patterns make each fingerprint unique." Similar remarks apply to the moves, the plays in a go game. That is, to individual behaviors in go: There are moves that attach to the foe's stones, moves that attach to friendly stones, and moves that attach to no other stones. "The shape, size, number and arrangement of minor details in these patterns make each" behavior (i.e., move played) "unique." Our opinion is that go-moves should be at least as easy to classify, as are actual fingerprints. It is also devoutly to be hoped that our opinion stands to reason.
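The three attachment patterns named above can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the board is a dictionary mapping (row, column) to "B" or "W", the function names are hypothetical, and a fourth label, "both", is added for a move that touches stones of both colors, a case the text does not address.

```python
from typing import Dict, Iterator, Tuple

Point = Tuple[int, int]


def neighbors(p: Point, size: int = 19) -> Iterator[Point]:
    """Yield the on-board points orthogonally adjacent to p."""
    r, c = p
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < size and 0 <= nc < size:
            yield (nr, nc)


def classify_attachment(board: Dict[Point, str], move: Point,
                        color: str, size: int = 19) -> str:
    """Label a move by the stones it touches: 'foe', 'friend', 'both' or 'none'."""
    touches_foe = False
    touches_friend = False
    for n in neighbors(move, size):
        stone = board.get(n)
        if stone is None:
            continue  # empty point
        if stone == color:
            touches_friend = True
        else:
            touches_foe = True
    if touches_foe and touches_friend:
        return "both"  # assumed extra category, not named in the text
    if touches_foe:
        return "foe"
    if touches_friend:
        return "friend"
    return "none"
```

A real classifier would presumably look at far more than adjacency (the "shape, size, number and arrangement of minor details"), so this only shows the coarsest level of the scheme.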
As for the notion of a snapshot, the "player ID matrix" introduced below is, like a photograph, the recording of a moment in time, captured along the path of one's progress, representing one's cumulative efforts up until that moment. The identification mechanism presented below provides an ID matrix that is a summation, or rather an integration, of all the moves, or behaviors ("fingerprints"), that we have observed a particular player make over many games. Having generated the ID, we may discard the individual games we have observed; yet we will still have an identifier describing the player's style. This allows us to compare, contrast, and categorize the styles of many players (indeed, those of any player) against one another.
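To make the "integration" idea concrete, here is a minimal sketch of a running accumulator. It is only an illustration under stated assumptions: each move's fingerprint is taken to be a fixed-length list of numbers, the identifier is a plain average of those lists (a vector rather than a full matrix, for brevity), and the class and method names are hypothetical.

```python
from typing import List, Sequence


class PlayerID:
    """Cumulative summary of per-move fingerprints (illustrative only)."""

    def __init__(self, n_features: int) -> None:
        self.total: List[float] = [0.0] * n_features  # running sum per feature
        self.moves_seen = 0

    def observe(self, fingerprint: Sequence[float]) -> None:
        """Fold one move's fingerprint into the running totals."""
        if len(fingerprint) != len(self.total):
            raise ValueError("fingerprint has the wrong number of features")
        for i, value in enumerate(fingerprint):
            self.total[i] += value
        self.moves_seen += 1

    def snapshot(self) -> List[float]:
        """Return the average fingerprint observed so far."""
        if self.moves_seen == 0:
            return list(self.total)
        return [t / self.moves_seen for t in self.total]
```

Because only the running totals and a count are stored, the game records themselves are not needed once their moves have been observed, and each new game simply moves the snapshot a little.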
"I don't know what it is, but I know it when I see it."¹
It may be said, of moves selected by experts to play in a game of go, that they have a quality called "balance" that is lacking in the play of amateurs, and even more lacking in that of beginners. But... "It may be said" that the moon is made of unripe cheese. "It may be said" that I am Marie, Queen of Romania. Saying so does not make those statements true.
Saying "go players should seek balance in their moves" could be a meaningless platitude, or it could be the truth, or it could be somewhere in-between, e.g., true in general with some exceptions and so on. A proof would require a precise definition of balance. In the absence of such rigor, we ask for the reader's patience as we present at least some evidence of the truth of the following assertion (which we ourselves regard as not only obvious, but also as axiomatic):
Most behaviors in which pros actually indulge have a high degree of "balance".²
¹ Potter Stewart (1915–1985), associate justice of the U. S. Supreme Court from 1958 to 1981; frequently remembered for his famous non-definition of obscenity.
² Whatever that may be, and however we may measure it.
The Adjectives of Behavior — In Go, and in Life
The Dimension of Attitude: Pessimism versus Optimism, or, from the Meek to the Bold
In go, as in life, thought and behavior range "from the sublime to the ridiculous." Thought and behavior are two forms of action. The former is focused inwardly, while the latter is focused outwardly. Sometimes people think one thing, but do a different thing, causing a conflict between the inward thought and the outward behavior. Generally speaking, thought and deed should be in harmony, or balance, with one another. "Balanced" is one of several adjectives (properties, really) that we use without defining rigorously. Several adjectival phrases also come to mind now: "Inwardly focused." "Outwardly focused." "In harmony." "In conflict."
Cowardly ones, please line up to the left, to the "Meek" side of the mean line. Bullies (hey, Buddy, I'm talking to you!), line up to the right, to the "Bold" side of the mean line. The more of a coward you are, the more leftward you go, and conversely, the more of a bully you are, the more rightward you go. And what does the "mean line" mean? It's the sublime, stable behavior in the middle: the behavior than which no other behavior has a greater degree of our confidence in its correctness. So the y-axis above represents the percent win-rate, or the "strength" of the behavior, where strong means a high degree of balance. That balance is learned by the network from training data. The win-rate is the output of the network's evaluation of a particular behavior. Every behavior has such a win-rate; the question, of course, is how to measure it.
It cannot be stressed too strongly that we are not speaking of "left" and "right" as in political ideologies, but of the left or right of the bell curve shown in the graphic above. There are cowards on both sides of the political aisle, just as there are bullies on both sides.
The measuring occurs during the feature-extraction phase of training the neural network.
Students of statistics will remember that the width of a bell curve is determined by the standard deviation: 68% of the data points are within one standard deviation of the mean, 95% are within two standard deviations, and 99.7% are within three standard deviations. So, in the figure above, the Pathologically Defensive moves comprise 0.15% of the data, and the Pathologically Aggressive moves represent 0.15%. These extremes represent Ridiculous behaviors that comprise only 0.3% of the data: Pros and strong amateurs rarely play such moves, but it happens.
The remaining data points lie between these two extremes, comprising the 99.7% of behaviors that lie within three standard deviations from the Sublime mean. Assuming a Gaussian distribution, as depicted above, we can cover the various intervals with the categories shown:
Pros: [Light blue above] Cautious, but not overly so, hopeful but not overly so. Pro moves are at times somewhat reserved, but rarely timid, and never (overly) generous. At other times they are somewhat ambitious, but rarely eager and never (overly) greedy. (68% of the data.)
Strong Amateurs: [Light blue and purple above] Can range from moderately reserved, often timid, and at times even generous, to moderately ambitious, often eager, and at times even greedy. (Another 27% of the data; along with the Pro data, that sums to 95% of the data.)
Beginners: [Light blue, purple, and maroon above] Beginner behavior ranges from defensive and generous all the way to greedy and aggressive. (Another 4.7% of things that Pros and Strong Amateurs almost never do, bringing us to an accumulation of 99.7% of the data.)
Trolls, wing-nuts, and mistakes: Troll moves include even the pathologically defensive and pathologically aggressive behaviors; that is, the entire bell curve, including the ridiculous 0.3% described above. (The remaining 0.3% of the data.)
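The band percentages quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a standard normal distribution, as the text does; the function name `within` and the band labels are ours, chosen for illustration.

```python
import math


def within(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


# Share of the data added by each successive band of the bell curve.
bands = {
    "within 1 SD (Pros)": within(1),
    "between 1 and 2 SD (adds Strong Amateurs)": within(2) - within(1),
    "between 2 and 3 SD (adds Beginners)": within(3) - within(2),
    "beyond 3 SD (both tails together)": 1.0 - within(3),
}

for label, share in bands.items():
    print(f"{label}: {share:.2%}")
```

Computed this way, the bands come out near 68.3%, 27.2%, 4.3%, and 0.3%, so the rounded 68/95/99.7 figures are close approximations rather than exact values.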
An alternative labeling for the ranges shown. Each Chinese zodiac animal spans 1/3 of a standard deviation. For example, the Pig's range is from "minus two standard deviations from the mean", to "minus one-and-two-thirds standard deviations from the mean", while the Rat's is from "zero standard deviations from the mean", to "one-third of a standard deviation from the mean", the Ox's from 1/3 to 2/3, and so on.
The dimension of attitude, with Chinese Zodiac animals labeling various ranges within 2 standard deviations from the mean.
A few observations: The Horse "errs on the side of caution", and the animals to its left become increasingly cautious. The Pig is "generous to a fault" while its traditional enemy the Snake is greedy, verging upon pathological aggression. Notice that each animal's range is opposite its traditional foe's range. The Horse and its traditional foe, the Rat, are the most stable of the animals, which is entirely appropriate, considering that that's where they live (in a stable).
The Sheep is opposite the Ox, the Monkey opposite the Tiger, and so on. Say hello to some of my little friends!
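The layout of the twelve ranges can be written down as a small lookup table. This is a sketch under assumptions: the text states the Pig, Rat, and Ox ranges explicitly, while the positions of the other animals are inferred here from the "opposite its traditional foe" rule and the traditional zodiac order. The names `RIGHT`, `LEFT`, `zodiac_ranges`, and `animal_for` are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Optional, Tuple

WIDTH = Fraction(1, 3)  # each animal spans one third of a standard deviation

# Animals to the right of the mean, nearest first (bold side).
RIGHT = ["Rat", "Ox", "Tiger", "Rabbit", "Dragon", "Snake"]
# Their traditional foes, mirrored to the left of the mean (meek side).
LEFT = ["Horse", "Sheep", "Monkey", "Rooster", "Dog", "Pig"]


def zodiac_ranges() -> Dict[str, Tuple[Fraction, Fraction]]:
    """Map each animal to its (low, high) range in standard deviations."""
    ranges = {}
    for i, name in enumerate(RIGHT):
        ranges[name] = (i * WIDTH, (i + 1) * WIDTH)
    for i, name in enumerate(LEFT):
        ranges[name] = (-(i + 1) * WIDTH, -i * WIDTH)
    return ranges


def animal_for(x: float) -> Optional[str]:
    """Return the animal whose half-open range [low, high) contains x, if any."""
    for name, (low, high) in zodiac_ranges().items():
        if low <= x < high:
            return name
    return None  # outside the interval from -2 to +2 standard deviations
```

For example, `animal_for(-0.1)` falls in the Horse's range and `animal_for(0.1)` in the Rat's, which matches the description of those two as the animals closest to the mean.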
It's not our intent to discriminate against nor to denigrate any person, living or dead, regardless of the year of their birth!
The degree to which each player/animal "believes" the proverb: "The enemy's key point is my own."
For each available move (behavior), each zodiac animal places a different amount of emphasis on its foe's win-rate (should the foe play there) and on its own win-rate, for playing at the same point. The Pig, for example, is so self-absorbed that it almost entirely ignores the foe, placing 91% confidence in its own win-rate, and only 9% confidence in the foe's win-rate, if the foe were to play there. At the other extreme, the Snake is busy looking for points his foe wants to play. The Snake almost ignores its own win-rate, placing only 9% confidence in that, but 91% confidence in the foe's win-rate. So, where we use the scare-quotes around "believes", what we mean to convey is really "the degree to which each player/animal has confidence in the proverb."
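One way to picture this weighting is as a simple blend of the two win-rates. The sketch below is an assumption, not a description of the actual program: it combines the numbers linearly, takes the 91%/9% figures for the Pig and the Snake from the text, and uses hypothetical names (`blended_score`, `pick_move`) and made-up win-rates for two candidate points.

```python
from typing import Dict, Tuple


def blended_score(own_win_rate: float, foe_win_rate: float,
                  own_weight: float) -> float:
    """Weighted mix of my win-rate and the foe's win-rate at the same point.

    The two weights sum to 1, as the paired percentages in the text do.
    """
    foe_weight = 1.0 - own_weight
    return own_weight * own_win_rate + foe_weight * foe_win_rate


def pick_move(candidates: Dict[str, Tuple[float, float]],
              own_weight: float) -> str:
    """Return the candidate point with the highest blended score."""
    return max(candidates,
               key=lambda p: blended_score(*candidates[p], own_weight))


# Hypothetical data: point -> (own win-rate, foe's win-rate if the foe played there).
candidates = {"A": (0.60, 0.40), "B": (0.45, 0.70)}

pig_choice = pick_move(candidates, own_weight=0.91)    # trusts its own win-rate
snake_choice = pick_move(candidates, own_weight=0.09)  # trusts the foe's win-rate
```

With these numbers the Pig prefers point A, which is best for itself, while the Snake prefers point B, the point its foe most wants. An evenly weighted player would sit between the two.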
As for "the Divine Move", tradition holds that it's the most sublime, wise, strong, and skillful move of all time. If we accept as axiomatic that its data point is, theoretically, the unique behavior which has a degree of balance of exactly 50-50, then by definition the Divine Move possesses that balance and exhibits the qualities of deep understanding and supreme confidence that there is no better move than it. We'll go ahead and assert that it's the (unique) point at the "top" of the curve above. A line tangent to the curve, and parallel to the x-axis, intersects the curve at a unique point and blah-blah-blah. In theory. Thus all other behaviors fall to one side or the other of the mean, where x=0.
We do not presume to speak for the Divine.
In our limited understanding, the so-called "Hand of God" or "God's Move" is the best move (or "hand" in Asian languages, similar to a "hand of poker" in English). It's the likeliest point upon which both players most urgently want to place a stone. It is neither too defensive, nor too aggressive, neither too generous nor too greedy, neither too meek nor too bold, yet may be tinged with both a bit of caution and a bit of hope. It is as reserved as it must be, and at the same time it is as ambitious as it can be. The Divine move is neither pessimistic, nor is it optimistic: It is realistic. (And its timing — a topic about which we write elsewhere — is, was, or will be, perfect. Theoretically.) Therefore:
- The y-axis above could be labeled "Realism", or "Objectivity", or "Detachment", or "Confidence", but in any case y varies from 0% upward to 100%. [Look! More adjectives for our psych glossary: Realistic. Objective. Detached. Confident. Measured as a percentage.]
- Other than the Divine Move, all data points are either pessimistic or optimistic to some degree. [In other words, the data is subject to the constraint that, for all moves which are not the Divine Move, x ≠ 0. Only the Divine Move, the one data element which has never really been observed, is 100% realistic. Irony? You're soaking in it!]
Footnote about distributions. For serious math geeks only!
Note that the percentages shown above are just points on a slider. Ideally, in theory, we should integrate across each animal's range to get more precise percentages for that range. Here's how to do that: https://www.mathpages.com/home/kmath045/kmath045.htm . [Whew!]
We just aim for the middle of that animal's section of the curve, and throw a dart at that value. It's sufficiently accurate for our purposes and we do not really require the precision that integrating would provide. But here is a really cool calculator, that does the same thing! https://www.hackmath.net/en/calculator/normal-distribution .
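To see how little the shortcut costs, the two approaches can be compared directly. This is a sketch under an assumption: the text does not say exactly which quantity is read off the curve, so here we compare the probability mass of a range (integrating the standard normal curve) with the midpoint estimate (curve height at the middle of the range times its width). The function names are ours, and the sketch does not attempt to reproduce the specific percentages quoted earlier.

```python
import math


def cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pdf(x: float) -> float:
    """Standard normal probability density (height of the bell curve)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def exact_mass(low: float, high: float) -> float:
    """Integrate the curve across [low, high]."""
    return cdf(high) - cdf(low)


def midpoint_mass(low: float, high: float) -> float:
    """Estimate the same area from the curve's height at the middle of the range."""
    return pdf((low + high) / 2.0) * (high - low)


# Twelve ranges, each one third of a standard deviation wide, from -2 to +2.
edges = [k / 3.0 for k in range(-6, 7)]
for low, high in zip(edges, edges[1:]):
    exact = exact_mass(low, high)
    rough = midpoint_mass(low, high)
    print(f"[{low:+.2f}, {high:+.2f}]  exact={exact:.4f}  midpoint={rough:.4f}")
```

For ranges this narrow the two columns agree to within a fraction of a percentage point, which is consistent with the remark that the dart-throwing approach is accurate enough.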
We could have used a different distribution to set the values, as long as each pair of values sums to 100%. However, the Gaussian distribution seems to match the actual data more closely than does, for example, a linear distribution such as the one shown in the table and graph below:
End of footnote about distributions.
So, given all of that, what, precisely, is "balance" in the context of go? By definition:
Balance is achieved when a behavior is neither too aggressive, nor too submissive.
If one plays too aggressively, one may find oneself overextended, leaving faults that may be exploited by the foe in a later stage of the game.
Conversely, if one plays too submissively, one will find oneself over-concentrated, occupying less than half the territory at the game's end.
That is the very essence of balance.
TIME
An important way in which said balance reveals itself is in the timing of a move. The best players always seem to play moves neither "too early" nor "too late", but "right on time", "just in time", or "not a moment too soon". For example, endgame moves that could conceivably be played at any time during the game are left until a later time, after attending to other, more urgently pressing matters. So, it matters not only where a move is played, but when it is played, amidst a global context.
To put it another way, a move that is either too early or too late, is either too aggressive or too submissive for that time, for that moment, of the game. Thus the timeliness, or punctuality, of the better players' moves is (again) balanced.
The Endgame
We use the term "endgame" here in a metaphorical sense, to refer to our goal, our objective. What do we hope to prove?
Ideally, there would be a numeric indicator for each possible play which indicated the degree of balance. Then one may simply "play by the numbers" to win the game.
Is there a way to measure the degree of balance of go moves? The degree to which a move is indicative of a winning strategy? The best move?
Ultimately, might there also be a way in which to capture, to describe, to emulate, the behavioral style of a particular individual? To view go through the unique lens through which that individual views go, and to play go according to that style?
We think so.
As we stated at the beginning, our goals include not only "a bot that plays go", but also bots which are able to:
Emulate the playing styles of other players whose games have been observed.
"Guess the Pro", even when presented with a previously unseen game by that pro.
"Guess the Next Move" when observing a previously unseen game where the identity of the players is known.
Above: Top-level overview of the Gozillago program architecture.
This section is still under construction.
More amazing developments to follow.
Please stay tuned for further results.