Give your code some space!

Post date: Jun 01, 2014 1:10:58 PM

Recently I have seen quite a few different conventions for the use of white spaces across different development teams. This has intrigued me, as I have always thought that there ought to be a "best convention" for white spaces, and for code format in general. In practice, this seems not to be the case. But this raises the question: why does every team, if given the option, seem to come up with different rules (or a lack of rules) for what they consider to be the best use of white spaces?

Status quo

To start with, I would like to discuss the Java code convention used by default in most Java IDEs.

Here's an example:

public class Example {

    public static void main(String[] args) {
        int answer = 2 + 4 * 6;
        for (int i = 0; i < 5; i++) {
            doSomething();
        }
        System.out.println("The answer is " + answer);
    }

    private static void doSomething() {
        // something
    }
}

Note that this is probably the most common style (though far from the only one) in the Java community and in other C-like languages.

There are no spaces inside parentheses. A space is used before an opening parenthesis only if it comes after a keyword such as for or while, but not for method invocations. However, a space is always used before an opening curly brace (which usually comes as the last character in a line). Mathematical operators always have a space on each side. There is no space before a semicolon, though.

This may seem natural if you're really used to reading and writing code, but if you are not a programmer, you might think that the rules seem to be quite arbitrary, with many, arguably unjustified, special cases.
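The special cases become quite visible if you try to write the rules down as code. Below is a rough sketch of a line checker for the style described above. The class name CommonStyleCheck, the method violations and the keyword list are my own inventions for illustration, not something taken from any IDE, and the checks are naive (string literals and comments are scanned as if they were code).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker for illustration only; it is not part of any IDE or formatter.
final class CommonStyleCheck {

    // Special case: these keywords take a space before "(" (partial list, an assumption).
    private static final Set<String> KEYWORDS = new HashSet<>(
            Arrays.asList("if", "for", "while", "switch", "catch", "return"));

    private static final Pattern WORD_BEFORE_PAREN =
            Pattern.compile("\\b([A-Za-z_]\\w*)(\\s*)\\(");
    private static final Pattern SPACE_INSIDE_PARENS =
            Pattern.compile("\\(\\s|\\S\\s+\\)");
    private static final Pattern SPACE_BEFORE_SEMICOLON =
            Pattern.compile("\\S\\s+;");
    private static final Pattern BRACE_WITHOUT_SPACE =
            Pattern.compile("\\S\\{");

    // Returns one message per offending word before "(",
    // plus at most one message for each of the other rules.
    static List<String> violations(String line) {
        List<String> found = new ArrayList<>();

        Matcher m = WORD_BEFORE_PAREN.matcher(line);
        while (m.find()) {
            boolean keyword = KEYWORDS.contains(m.group(1));
            boolean spaced = !m.group(2).isEmpty();
            if (keyword && !spaced) {
                found.add("missing space after keyword '" + m.group(1) + "'");
            } else if (!keyword && spaced) {
                found.add("unexpected space before '(' after '" + m.group(1) + "'");
            }
        }
        if (SPACE_INSIDE_PARENS.matcher(line).find()) {
            found.add("space inside parentheses");
        }
        if (SPACE_BEFORE_SEMICOLON.matcher(line).find()) {
            found.add("space before semicolon");
        }
        if (BRACE_WITHOUT_SPACE.matcher(line).find()) {
            found.add("no space before opening curly brace");
        }
        return found;
    }
}
```

Calling CommonStyleCheck.violations on a line from the example above returns an empty list, while a line such as "doSomething ();" is reported. Notice that the only way to tell the two uses of the opening parenthesis apart is a hard-coded list of keywords.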

More spaces, please

In my current employment, we have two main teams in the office, the Load Test Products team and the Functional Test Products team (not the official names). We both used to use a somewhat unconventional style where white spaces are used much more generously, as the following example shows:

public class Example {

    public static void main( String[] args )
    {
        int answer = 2 + 4 * 6;
        for( int i = 0; i < 5; i++ )
        {
            doSomething();
        }
        System.out.println( "The answer is " + answer );
    }

    private static void doSomething()
    {
        // something
    }
}

This style makes the code look much less dense and might be considered less arbitrary (e.g. there is never a space before an opening parenthesis, and there are always spaces inside parentheses, square/curly brackets, etc.)! I am not saying it is better or worse than the previous style... but it is certainly easier to distinguish words, at the cost of fitting less code into the same number of pixels.

Regardless of the advantages and disadvantages of each style, what surprised me was that the FTP team actually decided to switch from this style to the more common style mentioned earlier. They thought it was worth the effort to make the change, and even to discuss it in the first place, so this must be something they consider important. The justification was not that one style was better than the other. I believe it focused mostly on what people are used to: as many of our products are open source, the more familiar style would perhaps make it easier for the community to contribute.

In my team, the LTP team, we decided that our style was just fine and did not consider making changes. At least for me, this decision was based on technicalities: how can we justify changing our style if the new style cannot be shown to be superior in any way? (Community contributions are also much less of an issue for us.)

Forget about rules

I contribute to open-source projects, such as the Ceylon programming language, and was really surprised that they don't actually adopt any style at all. You can write code using whatever style you want to.

While this may bring some relief to a developer who has been really frustrated by the multitude of rules adopted in some companies, it seemed to me to be a recipe for making your project's code a huge stylistic mess! Not surprisingly, contributors have put their personal preferences to use in every commit, resulting in a hugely irregular code base where, in just a few lines of code, you might see all sorts of different styles:

    while (exists cell = iter) {
        if (exists elem = cell.element,
        elem==element) {
            last = cell;
        }
        iter = cell.rest;
    }
}
if (exists cell=last){
    cell.element=replacement;
    return true;
}

assert (0<=index<length);

But here's the main question I would like to consider: can a style (or lack thereof) be superior to another by any criteria? What are the criteria that should be considered? Can this affect our productivity as Software Developers?

I believe that the answer lies in the answer to a similar question: can code format affect code readability?

Code format impact on readability

In my opinion, there is an easy answer! Of course it can. Or do you think it's just as easy to read the two examples below (adapted from my own Ceylon project - CeylonCreate)?

function validModuleNameChar(Character c)=>c.letter||c.digit||c in ['_','.' ];
if (!trimmedName.empty,
    validModuleNameFirstChar(trimmedName.first else 'X'),
    trimmedName.every(validModuleNameChar),
    !(trimmedName.split('.'.equals,true,false)).containsAny(ceylonKeywords.chain {""})) {
    return trimmedName;
}

function validModuleNameChar( Character c ) => c.letter || c.digit || c in ['_', '.' ];
if ( ! trimmedName.empty,
    validModuleNameFirstChar ( trimmedName.first else 'X' ),
    trimmedName.every ( validModuleNameChar ),
    ! ( trimmedName.split ( '.'.equals, true, false ) ).containsAny ( ceylonKeywords.chain { "" } ) )
{
    return trimmedName;
}

It may take just a few seconds longer to read the first example, but when you spend most of your day reading code, this difference counts a lot!

Lessons from written language

Instead of focusing on code, perhaps we should look at it from the more general ability of reading text, which is something people have done for thousands of years, so perhaps they have come up with a good solution for that problem?!

In "normal" text, there seem to be well-established rules for white spaces, so you don't see writers arguing about whether there should be white spaces inside parentheses or not!

What would code written in the "normal" text style look like?

Back to our simple example:

public class Example {

    public static void main (String[] args) {
        int answer = 2 + 4 * 6;
        for (int i = 0; i < 5; i++) {
            doSomething ();
        }
        System. out. println ("The answer is " + answer);
    }

    private static void doSomething () {
        // something
    }
}

Most things in code are just like normal text. You "group" certain words inside parentheses if needed. You have white spaces between words and symbols, and here we notice a distinction from normal coding practice: why don't we use a white space before the opening parenthesis of a method invocation? Or after the dot operator (as there would be in "normal" text)?

I can see the idea behind not adding spaces in those locations probably being that it emphasizes that the expression before the dot or parenthesis is the "owner" of the expression following it, or something to that effect. But do you really need to squeeze them together to be able to see that? That seems clear enough to me when looking at the above example.

Other things do not exist in normal text, like nested expressions... so we need to try to expand the rules a little bit. But that does not seem like a big challenge: just add spaces between symbols as a general rule.

Does the above code look weird to you? Well, I have to confess it does look slightly weird to me, but I am willing to assume it is only because I am not used to reading code like this.

But even without being used to this, I can certainly read the code easily.

And I can definitely say that if all code were regular like this, it would make my life much easier when reading code (which is something I do much more often than writing code).
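For the curious, the two changes relative to the common style (a space before the opening parenthesis of an invocation, and a space after the dot operator) are mechanical enough to sketch in a few lines. This is only a rough illustration under my own assumptions: the class name ProseSpacing and its apply method are made up for this post, and the regular expressions are naive (they would also rewrite the contents of string literals and comments).

```java
// Hypothetical helper for illustration only; a real formatter would work on
// the parsed syntax tree, not on raw text.
final class ProseSpacing {

    // Rewrites one line of commonly formatted Java into the "normal text" style.
    static String apply(String line) {
        return line
                // a space before "(" when it directly follows a word character
                .replaceAll("(\\w)\\(", "$1 (")
                // a space after "." when it joins two names (decimals like 1.5 are left alone)
                .replaceAll("(\\w)\\.(?=[A-Za-z_])", "$1. ");
    }
}
```

Applied to the println line of the first example, ProseSpacing.apply gives the same line as in the "normal" text version above, and a line such as the for loop, which already has its space before the parenthesis, comes back unchanged.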

Final remarks

I hope that no one will argue that imposing the use of a single, well-defined code format rule (at least regarding the use of white spaces) will actually make it harder for developers to write code!

First of all, any IDE (and even some simpler text editors) can be configured to automatically apply the code format rules to everything you write, so you can be really sloppy when writing stuff, as long as you remember to use the shortcut to format your code before any commits/saves. Also, I don't think novel writers complain that the written language format rules are too restrictive or that they impair their creativity in any way!

A good thing is that the currently most commonly used convention on the use of white spaces is not too bad. But a few more white spaces could certainly be introduced to make it clearer where the boundaries between symbols lie... a small modification to bring written code a little nearer to general written language could fix that.

What do you think?

Comments on Reddit.