Code reviews at Khan

Author: csilvers@khanacademy.org

Last modified: 20 February 2012

Developers at Khan Academy are expected to have all code reviewed. This page describes how we do code reviews, and why. Sub-pages give specific advice and insight about parts of the code review process.


What are code reviews?

Code reviews are when another person looks at a change to the codebase that you’re making (or have made). It’s basically another pair of eyes on each change that happens to the code. Code reviews can happen in person or over the interwebs, and can happen as the code is being written (as part of pair programming), before the code is pushed to the repository (“pre-commit”), or after the code has been pushed (“post-commit”).

At Khan, the policy is that all changes, no matter how minor, go through a “post-commit” review. We have tooling in place to make this as low-friction as possible.

Why have code reviews?

Code reviews serve several purposes:
  1. They keep code maintainable. (This is, in my opinion, the main advantage of the code review process.) It ensures you are not only writing code that works, you’re writing code that others can understand.
  2. They keep code clean. Sometimes you start to implement a feature one way, end up implementing it another, and forget to fully delete everything related to the first implementation. A code reviewer will notice that: “Why do you have an unused parameter ‘ui’ here”?
  3. They catch big-picture issues early. A reviewer might say, “This won’t work with IPv6” and you’ll be able to rearchitect the change appropriately, before you’ve built a lot of other stuff on top of it. Or the reviewer might say, “You’ll need to support UTF-8 here, because we’re going to be launching this in Chinese soon.” The best reviewers understand not only the code but the requirements behind the code.
  4. They keep you honest. When you know nobody but you will be looking at the code, you may be tempted to take shortcuts that you know will come back to bite you -- for instance, delaying necessary refactoring. Code reviewers will call you on that, saving you time overall.
  5. They ensure that at least two people are familiar with every bit of code Khan has. This makes it easier to brainstorm solutions to problems, since there’s someone with all the necessary knowledge you can talk to. It has ancillary benefits as well: when there’s a bug found in the wild, at least two people are qualified to diagnose and fix it; and it makes it easier to hand off responsibility when you want to move on to another project.
  6. They socialize new engineers into coding styles and practices. Among other benefits, this leads to more consistency in the codebase. On the flip side, it also makes it easy for new hires and interns to contribute: even experienced Khan engineers can learn things from them via a code review.
Catching bugs is not a primary purpose of code reviews, though they often do. (Tests are a better way of catching bugs.) Nor are code reviews meant to be an opportunity to rethink every decision the code author has ever made. Rather, they are an essential part of code hygiene, making sure that a change is understandable, maintainable, and a good fit with the direction of the project.

How does the code review process work?

At Khan, we use a “post-commit” code review strategy for Mercurial code. This means you write and commit your code as normal. It is only when you push your code to the repository that the code review process starts.

We have provided a Mercurial plug-in that wraps the hg push command, providing extra arguments for specifying reviewers and review options (such as text to send the reviewer). It will fail if the user does not specify a reviewer. The command does a normal push, but also automatically opens a review on kiln, assigning it to the person or people you specified.

The plug-in provides instructions on how to use it; you basically will need to edit your .hgrc file.

You will be notified via the normal kilnhg process when reviews have been assigned to you. You can then use the normal kiln process to do the review. The review may be very quick, for trivial changes, or may require some back and forth.

We will send out daily nag-mail to remind you of the reviews that have been assigned to you that you have not yet completed.

Feel free to say, in your weekly snippets, how many code reviews you did! :-)

GIT

For git code, we have a "pre-commit" review strategy, or more accurately, "pre-push" -- you check in your code with 'git commit', then use Phabricator to send the review out.  Once it's been approved, you do 'git push' so others can see it.

If you wish to use Phabricator for Mercurial code, you are encouraged to use the pre-commit style there as well.  (You can do post-commit reviews in Phabricator using the "Audit" tool, but Phabricator is intended to be used pre-commit.)


What makes for a successful code review?

For code authors:
  1. Push early and often. Do not let giant masses of code pile up that then need to be reviewed. It is much easier to do ten 100-line reviews than one 1000-line review.
  2. Pick appropriate reviewers. If you’re editing existing code, find out who wrote the code originally and have them review it. If you’re writing new code, find someone who will be willing to continue to review the code as you develop it. Two reviewers is often the sweet spot, since a single reviewer might be busy/sick/away from email and not get to your review quickly.
  3. “Pre-review” your code. Take a quick glance over your code and see if there’s anything you would push back on if you were reviewing this code. Either fix it before sending for review, or add a TODO to fix it later.
For code reviewers:
  1. Figure out if you're needed. If the review has multiple reviewers, and someone else is already looking at the code, you don't have to! Feel free to just ignore it unless you're particularly interested in (a part of) the review, or the code author specifically called out a piece of the review for you to look at.
  2. Handle reviews promptly. Either review the code as soon as you see the request, or tell the author when you will be able to get to it, or tell the author to find another reviewer (because, say, you don’t feel qualified to review the change).
  3. Stop early if you don’t understand something. If you can’t understand a piece of code, don’t spend a long time puzzling over it. Just write back saying what you don’t understand, and let the author rewrite (or recomment) it so it’s clearer. If you can’t understand it as reviewer, the next casual visitor to the code definitely won’t understand it.
  4. Respect the author’s code. Don’t push back on something just because you would have done it a different way. If the code is readable and maintainable, let it go. On the other hand, definitely do push back if there’s a way to rewrite code that’s cleaner, faster, or better designed.
  5. Don’t require perfection. At Khan we prefer getting stuff out early to having it be perfect. If there’s ways the code could be improved, definitely make sure they’re noted in the code (with a TODO), but if the code is still an improvement on the status quo, let it go. If you and the code author can’t agree on this point, bring in a third person to mediate.

Also check out Ben Kamen's thoughts about what makes for a successful reviewer and a successful code author and Tom Yedwab's reflections on how review's changed the way he coded.

FAQ

Won’t reviews slow us down?

The review process is designed to be as lightweight as possible. There’s very little process put in place. It’s definitely the case that you will be spending time doing code reviews that before would have been spent doing something else, but not that much, and the time is well spent.

Do we really need to code review everything? What about a one-line comment fix?

Yes, all new code should be reviewed, even the most trivial. You’ll want someone to verify that the comment fix is correct and that the new comment is understandable by others. Sure, for trivial changes the common case will be a quick approval without any need for comment, but on the other hand, that interaction also has a very low cost. In fact, the cost for trivial reviews is likely smaller than the cognitive cost of deciding if something is “complex” enough to need a review or not.

However, trivial merges (when resolving a local commit against changes upstream) do not need to be reviewed; they do not involve the introduction of new code.