We follow the PEP8 style guide for Python. Docstrings follow PEP257. The rest of the document describes additions and clarifications to the PEP documents that we follow at Khan.
You can use
Use 4 spaces -- never tabs -- for indentation.
This is a strict rule: ignoring it can cause (and has caused) bugs.
Unless an exception is explicitly allowed by a codebase owner (Kamens or Jason, for now), __init__.py files should be empty.
Rationale: when you do
If you have code that you think every user of every function inside this directory needs to run first, __init__.py may be appropriate, but you should also consider just creating a function that executes that code, and running the function at the top level (that is, not indented) inside each file in your directory. This makes it more obvious what's going on, and also makes it easier to special-case certain files if the need ever arises.
Using __init__.py to bring variables from sub-modules into the main module space totally defeats the point of having sub-modules in the first place; don’t do it.
For more discussion, see http://stackoverflow.com/questions/1944569/how-do-i-write-good-correct-init-py-files.
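As a rough sketch of the "function instead of __init__.py" idea, the setup code could live in an ordinary module and be called explicitly by each file that needs it. The module name `setup_env` and the function `ensure_setup` are hypothetical, chosen only for illustration.

```python
"""Sketch of a hypothetical mypackage/setup_env.py."""

_setup_done = False


def ensure_setup():
    """Run the package's one-time setup; return True if it ran just now."""
    global _setup_done
    if _setup_done:
        return False
    # ... the code that would otherwise have gone into __init__.py ...
    _setup_done = True
    return True


# Each file in the directory would then say, at the top level (not indented):
#     import mypackage.setup_env
#     mypackage.setup_env.ensure_setup()
```

Because the call is explicit, a file that must not run the setup can simply leave it out.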
Exception: for third-party code, where the module documentation explicitly says to import individual symbols.
If the basename of a sub-module is generic, prefer the
Rationale: This is the single best -- and easiest -- way to avoid the circular-import problem. To simplify, when you say
Another way to think about it is saying
Side note: While this rule helps with most circular-import problems, it doesn’t help with all: python may still need to look up symbols from x even at parse time. For instance, if you say
The downside of this rule is that code gets more verbose: where before you could do
now you have to do
I argue, though, this verbiage is beneficial: in the same way that
Rationale: When I see autocomplete.foo() in the code, and I want to know what it does, it’s helpful to know if I should be looking on the web (because autocomplete is part of the python distribution), or in the local source tree (because autocomplete is written by us). It’s also helpful to know if it’s code we wrote (and the barrier to hacking on it is low) or code someone else wrote (which means it may be easier to just work around any problems I have with it). The three sections tell me that with just a quick glance at the top of the file. Plus, since each section is alphabetical, it’s easy for me to find the import within the section.
Alphabetical sorting is by the main module name (so second word of the line), and ignores case:
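One way to picture this ordering is as a sort key over the import lines. This is only a sketch: the helper `import_sort_key` is hypothetical and the import lines are ordinary strings chosen for illustration.

```python
def import_sort_key(line):
    """Return the lower-cased main module name of an import line."""
    # The module name is the second word of both "import x"
    # and "from x import y".
    return line.split()[1].lower()


import_lines = [
    "import sys",
    "from xml.etree import ElementTree",
    "import os",
    "import StringIO",
]

sorted_lines = sorted(import_lines, key=import_sort_key)
```

Here `sorted_lines` puts `os` before `StringIO`; a case-sensitive sort would have put `StringIO` first.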
Here are some constructs that are not consistent with this style rule:
We are planning (as of 13 April 2012) on moving to a world where third-party (aka ‘vendor’) code all lives in a
PEP257. For more examples, see the Google style guide around docstrings.
To summarize: There are two types of docstrings, long-form and short-form.
A short-form docstring fits entirely on one line, including the triple quotes. It is used for simple functions, especially (though by no means exclusively) ones that are not part of a public API:
Note that the text is specified as an action (“return”) rather than a description (“returns”). This has the added advantage of taking less space, so the comment is more likely to fit on a single line. :-)
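A minimal illustration of a short-form docstring phrased as an action; the function `is_even` is made up for this example.

```python
def is_even(n):
    """Return True if n is divisible by 2."""
    return n % 2 == 0
```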
If the description spills past one line, you should move to the long-form docstring: a summary line (one physical line) starting with a triple-quote (
A function (including methods and generators) must have a docstring, unless it meets all of the following criteria:
The docstring should end with the following special sections (see the Google style guide for more details).
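A sketch of what a long-form docstring might look like. The function `clamp` is hypothetical, and the section names (`Args`, `Returns`, `Raises`) are taken from the Google style guide; treat them as an assumption about the exact headings this guide requires.

```python
def clamp(value, low, high):
    """Return value limited to the inclusive range [low, high].

    Values below low are raised to low and values above high are
    lowered to high; anything in between is returned unchanged.

    Args:
        value: the number to clamp.
        low: the smallest allowed result.
        high: the largest allowed result; must be >= low.

    Returns:
        value, low, or high, whichever lies within the range.

    Raises:
        ValueError: if low is greater than high.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))
```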
Modules (files) should have a docstring too, at the top of the file, starting with the usual one-line summary:
Rationale: People will read a piece of code many more times than they will write it. Time spent documenting at write-time more than pays off at read time. What is obvious to you as the code-author, well versed in the module where this function lives, may not be at all obvious to a code reader, who is possibly jumping into this function from some unrelated part of the codebase.
The rules here may seem like overkill, especially the need to document every argument and return value. I can say from experience two things: it often does seem like overkill when writing it (especially when the docstring is longer than the function!) but I've almost never thought it was overkill when reading unfamiliar code. You may find, as you write the docstring, you're putting down something that wasn't as obvious as you thought it was:
Even though the meaning of
Exception: if the python file is meant to be executable, it should start with the following shebang line:
Rationale: a shebang line is useless for non-executable files. An
TODO(csilvers): should we put in a line indicating licensing?
Using PEP8 as a guideline for Python formatting runs us headlong into a great debate: the 79-character line limit. For better or worse, the PEP8 limit is part of the lint check for Khan Academy's Python code.
Python statements end with a newline, not a semicolon, unlike many C-based languages. The trick is that lines can be continued within parentheses, brackets, and braces, or following a backslash. Parentheses are recommended. Backslashes should be avoided.
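A small sketch of the two continuation styles, using made-up variable names. Both forms compute the same value; the parenthesized one is the recommended style.

```python
first_quarter = 10
second_quarter = 20
third_quarter = 30
fourth_quarter = 40

# Preferred: the open parenthesis lets the expression span several lines.
yearly_total = (first_quarter + second_quarter +
                third_quarter + fourth_quarter)

# Works, but discouraged: a backslash continues the line, and breaks
# if any character (even a trailing space) follows it.
yearly_total_backslash = first_quarter + second_quarter + \
    third_quarter + fourth_quarter
```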
Notably, splitting string literals doesn't require use of the
This makes splitting long messages easy.
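For instance, adjacent string literals inside parentheses are joined into a single string when the code is compiled. The message text here is invented for illustration.

```python
message = ("This message is too long to fit on one line, "
           "so it is written as two adjacent literals.")
```

Note the trailing space inside the first literal: nothing is inserted between the pieces when they are joined.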
Because Python's indentation style is unlike many C-based languages, your editor might need some cajoling to support it.
There are cases where line splitting doesn't feel nice. Let's look at a few of them, sigh, and move on.
This long method reference needs surrounding parens and splits the line before the dot operator.
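A sketch of that kind of split, using a hypothetical chainable class `QueryBuilder` that exists only for this example. The surrounding parentheses are what allow each line to start with the dot.

```python
class QueryBuilder(object):
    """Hypothetical chainable helper, used only for illustration."""

    def __init__(self):
        self.steps = []

    def filter_by_owner(self, owner):
        """Record an owner filter and return self for chaining."""
        self.steps.append(("owner", owner))
        return self

    def order_by_creation_date(self):
        """Record an ordering step and return self for chaining."""
        self.steps.append(("order", "created"))
        return self


query = (QueryBuilder()
         .filter_by_owner("some_user")
         .order_by_creation_date())
```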
This long string path needs to be split.
Sometimes, the best way to avoid long lines is to use temporary variables. This can improve readability in any case.
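As an illustration with invented data, naming an intermediate value shortens the final line and also documents what the lookup means.

```python
settings = {"users": {"alice": {"profile": {"display_name": "Alice"}}}}

# Written as one expression, the nested lookup plus the formatting
# would run well past the line limit. A temporary variable splits it:
profile = settings["users"]["alice"]["profile"]
greeting = "Hello, %s!" % profile["display_name"]
```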
When a logical statement is split into multiple lines and is followed by an indented line, the continued lines should be indented further to set them apart from the next logical statement.
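A minimal sketch of that extra indentation, with a made-up function and arguments. The continued condition is indented eight spaces so it does not line up with the body of the `if`.

```python
def can_edit(user_is_admin, user_is_owner, doc_is_locked):
    """Return True if the user is allowed to edit the document."""
    if (user_is_admin or
            (user_is_owner and not doc_is_locked)):
        return True
    return False
```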
In the following example taken from http://www.python.org/dev/peps/pep-0008/#maximum-line-length, notice how the second and third lines of the