python code golf

I've never really gone in for any code-golf games, but one recently caught my eye: word frequency chart.

First, my entry was, as pointed out in the comments, not fully conformant with some additional clarifications in the question. I know, and I accept that ;)

Second, Python doens't seem to be a very good language for this kind of thing.

So my mind wandered, and before I'd even finished my Python entry, I was wondering at the conciseness of golfscript.

Can we make a briefer Python dialect for golf games? My mind raced ahead to a 'golf' Python module. What tricks can we play with a prototyped language to make it briefer?

  • simply make all the functions and methods one or two character names
    • (although I don't like making everything one character; if the convention is that L is length, overloading it to mean lower somewhere feels wrong; however, Lw says lower to me; so we ought sometimes have two or more character names even if strictly we could avoid it)
  • make useful (for golf) functions like re.findall() top-level
  • if a function has no parameters, we can actually use a property instead and compute the value to return it
  • relax the checks of the language
    • subtract a list from a set? what would you expect it to do? So do liberal implicit casting
    • multiply a string by a float? liberal implicit cast to int first
  • add useful functions that aren't around today
    • joining several iterables
    • borrow bits from golfscript like dividing a string (or any iterable?) to make a list of the comonent parts

To my thinking, abusing Python in this way makes a domain-specific language for golf and the total characters of the script are what count, not the characters in the Python module.

So, first what I came up with in pure Python:

import re,sys
t=re.findall("\w+","".join(sys.stdin).lower())
W=sorted((-t.count(w),w)for w in set(t)-set(sys.argv[1:]))[:22]
Z,U=W[0],lambda n:"_"*int(n*(76.-len(Z[1]))/Z[0])
print"",U(Z[0])
for(n,w)in W:print"|"+U(n)+"|",w
(223 chars)

And this was the first version of my Golf Python:

t=R.fa("\w+",I.Lw)
W=St((-t.C(w),w)for w in t.U-Sa[1:])[:22]
Z,U=W[0],lambda n:S("_")*(n*(76.-Z[1].L)/Z[0])
print"",U(Z[0])
for(n,w)in W:print"|"+U(n)+"|",w
(156 chars)

Lets take that apart. First this no import statements (although there could be if you use any other libraries; this is still Python). This is because my 'golf' module has the following code to chain the script into it's context:


if __name__ == "__main__":    
    exec open(Sa[1]) in {'__builtins__': {
        "R":R, # regex wrapper
        ...

So you'd run your script with python -m G yourscript.py args...

We can't monkeypatch the built-in classes like list, dict, set and str. set is special since you always get a set by using the set function and we can rewrite that. But we have no way of intercepting when someone creates an empty list with [] and such.

So we create our own List and Set and S (for string, but we use it a lot explicitly in code so we want the name to be short). That's what the S("_") is doing in there still - its converting a classic boring string into our superstring type S. There is a lot of wrapping and proxying utilities in the G module to ensure that when standard functions return normal lists and strings and such, they get promoted to our supertypes automatically.

The R.fa is an abbrievation for re.findall. R is not a simple alias, however, its a wrapper. It converts the return from a normal list into a List. St means sort.

I.Lw is a input - I for input, and it returns a string (supertype S) that represents all of sys.stdin (read only once, lazy on demand). An S has a Lw property that is the lowercase copy of the string.

Sa is an abbrievation for system arguments (sys.argv). The variable t is a List, which has a U function to get the unique (Set) of the items, and the supertype Set can do subtraction of any iterables and not just other sets.

The full code for my 'G' module is here.

As I was quickly typing this module, I was struck by how little luck it was going to have against golfscript. I started to wonder:

  • If we are execing the script, perhaps we can parse it and inject all the wrapping there - intercept the strings and lists that way?
  • Can we have functions that replace for and even lambda or print? The former, prehaps, but how to get the names of the fields? the latter, harder?

And of course my mind is made up - we need to invent another better-than-golfscript golf-script!

Actually, I plan to go learn golfscript instead :)

Comments