The North Star
The most maintainable software in the world would be one where you could swap the entire development team for a totally fresh set of engineers every two weeks. The new engineers could contribute code comfortably, so that the software doesn’t regress and new features are added at a consistent pace without slowing down.
Maintainable software is not just about what you end up building; it’s also about how you build it, and how you enable others to build on top of it. It is not just about code, tests, or object oriented principles. It involves people, future people, documentation, minimizing tribal knowledge, automation, and consistent, disciplined knowledge of software engineering.
Obviously, the real world is not ideal. Any continuous improvement should still move in this ideal direction, even though the perfect world will never be reached. This is like a “North Star” – something we will never reach, but that will guide our direction.
Real World
So what are some of the things we see in reality that make software not maintainable? Below I outline some continuous improvement themes to work towards, which will make software more maintainable. These can also be considered guidelines of maintainability. They are backed with real world situations based on experience.
Theme/Guideline 1: Follow Object Oriented Principles
Following object oriented principles is one way to make software more maintainable. There are the SOLID Principles, concepts like High Cohesion and Low Coupling, and Design Patterns which serve as a good guide.
In general, improving the modularity and single-responsibility-ness of your software will help with maintainability. Software can be considered maintainable in this regard if:
Adding a new feature means simply plugging in a new class without having to modify any other class, except for dependency injection code. This allows any new engineer to easily add features.
Updating a feature means only having to make changes in one class, and that class doesn’t serve any other feature. This allows any new engineer not to worry about breaking something else when updating the class.
A good rule of thumb I like to follow is having only one public method per class. I’ve found that it provides good maintainability and avoids the temptation to overload classes. It also keeps your tests simple and effective.
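As a minimal sketch of the points above, assume a hypothetical report-export feature. The names `ReportExporter`, `CsvExporter`, `JsonExporter`, `EXPORTERS` and `export_report` are invented for illustration. Each class has one public method, and adding the JSON feature means adding one class plus one line of wiring:

```python
import json
from abc import ABC, abstractmethod


class ReportExporter(ABC):
    """Hypothetical extension point with a single public method."""

    @abstractmethod
    def export(self, rows: list[dict]) -> str:
        ...


class CsvExporter(ReportExporter):
    """Existing feature. It is not modified when new exporters are added."""

    def export(self, rows: list[dict]) -> str:
        if not rows:
            return ""
        header = ",".join(rows[0].keys())
        lines = [",".join(str(value) for value in row.values()) for row in rows]
        return "\n".join([header, *lines])


class JsonExporter(ReportExporter):
    """The new feature: plugged in as a new class."""

    def export(self, rows: list[dict]) -> str:
        return json.dumps(rows)


# The wiring (standing in for dependency injection code) is the only
# existing code that changes when a feature is added.
EXPORTERS: dict[str, ReportExporter] = {
    "csv": CsvExporter(),
    "json": JsonExporter(),
}


def export_report(fmt: str, rows: list[dict]) -> str:
    """Look up the exporter registered for fmt and delegate to it."""
    return EXPORTERS[fmt].export(rows)
```

An engineer updating the CSV output only has to read and test `CsvExporter`, since no other feature depends on it.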
It is almost impossible to create a perfect OO design. No one can perfectly anticipate the direction in which software will change, and requirements can evolve in numerous directions. Hence, part of making your software maintainable is constantly refactoring your system to conform to the direction it’s going. Martin Fowler has two articles which I really like: YAGNI and Preparatory Refactoring.
YAGNI states that we should not build presumptive features. Make the software modular and easy to change, but don’t actually build the features you don’t need, since we don’t know the future. I have had several experiences where unused features slowed us down, especially when it was unclear whether they were actually unused.
Preparatory Refactoring says to “Make the change easy, then make the easy change”. Before adding a new feature to the system, we first restructure the system so that adding the feature becomes easy, and then we add it. From personal experience, this is usually faster and requires less cognitive load than the typical “Hack the feature in and refactor later when we have time” approach.
Theme/Guideline 2: Eliminate Tribal Knowledge – Put the Knowledge in the System
Putting tribal knowledge in the system is another way of making software more maintainable. How many times have you read code and thought “Why the hell did they do this? Does this number have any meaning? What will break if I change this number?” This hinders the ability of someone new to change the system. Two techniques to put tribal knowledge in the system:
Add an automated test – If there is some reason for a value to be set in a certain way, put in an automated test that will fail if that value is changed or the constraint is violated. In the error message, include why the value was set that way and what to consider when changing it. This way, an engineer who attempts to change the value will know about the constraint and can make an informed choice about whether and how to change it. For example, if the number of retries must match the number of values in an enum, put in a test that will fail if they don’t match. That way, if anyone changes either the enum or the number of retries, this test will prevent them from releasing a regression, in line with our definition of maintainability.
Add documentation in the code – If it is not possible to put in an automated test for whatever reason, then add documentation. You can also add documentation in addition to the tests. In the above example, documentation can be added both where you define the number of retries and where you define the enum. That way, if anyone changes either one, they will come across the documentation. In the documentation, include why the value is set that way and what to consider when changing it. Documentation with just the “what” and no “why” is seldom useful. Use documentation for tribal knowledge which is not expressed in the code. Don’t just say “this value must be 3”. Also say why it must be 3 and why it must match the enum, because that reason may no longer be true in the future, and knowing it will empower future engineers to make an informed decision about how to change it. If it no longer needs to match, the engineer can safely remove the test and update the documentation. If it still needs to match, the engineer can safely make changes accordingly. This again is in line with the definition of maintainability.
In general, any knowledge about the software that is in your head and not codified in the system goes against the definition of maintainability. If you and your entire team are swapped out, the new engineers will have no way to get that knowledge from the system, resulting in regressions when they make changes. Always find a way to put tribal knowledge in the system. Work backwards from the ways a new engineer could break it. What file will they go to? If they unknowingly change something, how can it be caught with a test?
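The two techniques above can be sketched together as follows. This is only an illustration: the `Endpoint` enum, the `MAX_RETRIES` constant and the stated reason for the constraint are all hypothetical, since the real reason depends on your system.

```python
from enum import Enum


class Endpoint(Enum):
    """Hypothetical enum: the servers a request can be sent to.

    MAX_RETRIES below must equal the number of members here, because
    (in this made-up example) each retry targets the next endpoint.
    """

    PRIMARY = "primary"
    SECONDARY = "secondary"
    TERTIARY = "tertiary"


# Why 3 (hypothetical reason): every retry goes to the next Endpoint in
# order, so there must be exactly one attempt per endpoint. If you add or
# remove an Endpoint, change this value too.
MAX_RETRIES = 3


def test_retries_match_endpoints() -> None:
    """Fails with an explanation if the constraint is violated."""
    assert MAX_RETRIES == len(Endpoint), (
        f"MAX_RETRIES is {MAX_RETRIES} but Endpoint has {len(Endpoint)} members. "
        "Each retry targets the next endpoint, so the two must match. "
        "If retries no longer map one-to-one to endpoints, remove this test "
        "and update the comments on MAX_RETRIES and Endpoint."
    )
```

The failure message carries the “why”, so an engineer who trips the test learns the constraint without having to ask anyone.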
Theme/Guideline 3: Conform to widely known standards
Going back to our definition of maintainability, if you totally swap out your team for a fresh set of engineers, where will they start? How do they go about getting things set up? Is there a common vocabulary we share? Stick to widely followed practices and standards, and don’t invent your own. Many problems are common across development teams, so look around for widely known standards. A few examples:
README.md – When you have a software repository, it is typical to have a README.md file in the root directory, which gives all the instructions on how to set up your workspace, how files are structured, links to any documentation or wikis, and so on. This is a widely known standard, and open source repositories follow this pattern. Don’t invent a new pattern such as creating an important_file.txt in a directory called not_code/docs, or putting the instructions in a doc file in your company’s drive. It won’t be easily reachable, and the fresh set of engineers will be confused about how to get to any instructions. Moreover, make sure to keep the README updated with everything a new engineer needs to start contributing. In code reviews, look out for any changes that require README updates, and ask for them. When reviewing code, keep in mind the future engineer who doesn’t know what you know about the system.
Naming Convention – There is a set of known software engineering terms, like abstraction, encapsulation, inheritance, parent class, child class etc. There are also design pattern terms such as factory, decorator, visitor, delegate etc. Use these terms in your documentation and your class/variable naming, because this is shared software engineering vocabulary that speaks a thousand words. If a class is responsible for selecting different subclasses of a DatabaseReader interface based on input parameters, calling it DatabaseReaderFactory will communicate that, and new engineers will instantly know what that class does and what to put in it. Calling it something like DatabaseReaderNewObjectCreator might have a similar effect, but it will require additional cognitive load to figure out what exactly you mean. Questions like “Is it a factory? Then why is it not called Factory? Does it differ from a factory, since the author chose not to call it a factory?” will go through the reader’s mind. Similarly, if the class is slightly different from a traditional factory, don’t call it a factory; call it something else. In the documentation, add comments on why it is not a factory and how it differs from one. This will put the reader at ease. Also, don’t name a DatabaseReaderFactory DatabaseReaderBuilder, because Builder means something else in design patterns, and this will confuse readers.
These are just a few examples. Don’t mistake this for “Everyone must know design patterns”; I am not suggesting one way or the other on that. The general idea is to use widely known standards, use them correctly, and not reinvent the wheel.
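To make the naming example concrete, here is a minimal sketch of what a reader would expect to find behind the name DatabaseReaderFactory. The concrete subclasses (`PostgresReader`, `InMemoryReader`), the `backend` parameter and the `create` method name are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod


class DatabaseReader(ABC):
    """The interface whose subclasses the factory selects between."""

    @abstractmethod
    def read(self, key: str) -> str:
        ...


class PostgresReader(DatabaseReader):
    """Hypothetical subclass; a real one would query a database."""

    def read(self, key: str) -> str:
        return f"postgres:{key}"


class InMemoryReader(DatabaseReader):
    """Hypothetical subclass, for example for local development."""

    def read(self, key: str) -> str:
        return f"memory:{key}"


class DatabaseReaderFactory:
    """Selects a DatabaseReader subclass based on an input parameter.

    The name tells the reader exactly this and nothing more: no
    step-by-step construction (that would be a Builder).
    """

    def create(self, backend: str) -> DatabaseReader:
        if backend == "postgres":
            return PostgresReader()
        if backend == "memory":
            return InMemoryReader()
        raise ValueError(f"Unknown backend: {backend!r}")
```

A new engineer adding another reader knows from the name alone that the selection logic belongs in `create` and nowhere else.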
Theme/Guideline 4: Keep Learning about Software Engineering
The fourth guideline on maintainable software is to continuously learn and improve your knowledge of software engineering. This knowledge is the shared vocabulary and discipline required for keeping software maintainable.
As you can see in the above examples, shared vocabulary is critical for fresh engineers to be able to come in and contribute effectively to an existing system. If I had never read about design patterns, I might unknowingly name a factory “DatabaseReaderBuilder”, which would confuse the engineers after me because Builder means something else. Similarly, if I name something correctly as DatabaseReaderFactory, and the engineer coming after me doesn’t know design patterns, they might end up saying “WTF is this? A factory? Wow creative naming, what does it do?”.
As a software engineer, write your systems expecting your future engineers to know as little as possible about the system. However, do expect them to know and learn about software engineering basics and standards. Requiring them to know as few standards as possible is good, but not at the expense of increasing the cognitive load on someone who does know those standards. For example, if using a stream is more readable than using a for loop, don’t use a for loop just because your future engineer might not know streams; expect them to learn that. Another example is deciding whether to write your caching logic using Aspect Oriented Programming (AOP) or OOP. AOP might sound cool, and is a standard, but it is not so intuitive, and if you can gain equal readability by using traditional OOP, go ahead with that instead. Caching is also something that might require frequent debugging and changing, so OOP might be a better choice since it will enable future engineers to change it more comfortably. Using AOP for things like metrics or logging is less impactful, since it won’t impede their ability to change other parts of the software. New engineers can learn about AOP in their own time, and won’t have to change the logging code as much anyway, since all it’s doing is logging.
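As one possible illustration of the “traditional OOP” option for caching, the sketch below puts the cache in an explicit wrapper class. The names `UserReader` and `CachingUserReader` are hypothetical, and the lookup is a stand-in for a slow call. With AOP, the same behavior would instead be woven in by an aspect defined elsewhere, which is the indirection the paragraph above warns about.

```python
class UserReader:
    """Hypothetical slow lookup; counts calls so the effect is visible."""

    def __init__(self) -> None:
        self.calls = 0

    def read(self, user_id: int) -> str:
        self.calls += 1
        return f"user-{user_id}"


class CachingUserReader:
    """Explicit caching wrapper around a UserReader.

    All caching logic lives here, in plain sight, so an engineer
    debugging the cache has exactly one class to read and change.
    """

    def __init__(self, delegate: UserReader) -> None:
        self._delegate = delegate
        self._cache: dict[int, str] = {}

    def read(self, user_id: int) -> str:
        # Only call the delegate on a cache miss.
        if user_id not in self._cache:
            self._cache[user_id] = self._delegate.read(user_id)
        return self._cache[user_id]
```

Usage is ordinary object composition: `CachingUserReader(UserReader()).read(7)` hits the underlying reader once, and repeated reads of the same id are served from the cache.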
Theme/Guideline 5: Good Tests
Good tests can help mitigate some of the pitfalls above, but having good tests is not an excuse to neglect the remaining guidelines. Tests are like your guardians: if someone unknowingly breaks something because of a lack of knowledge or documentation, your tests will stop them from releasing that change. The best case is to prevent new engineers from breaking things in the first place, but if they do happen to break something, good tests will block it from reaching your customers. I have a separate article on good testing. The key part of a good test is that it covers all the functional use cases of the class or subject under test. A good test treats the subject under test as a black box, and knows no more about it than a customer or regular user of that black box does. Test Driven Development (TDD) is a discipline that forces you to write good tests.
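A small sketch of what black-box testing looks like in practice, using a hypothetical `Cart` class invented for this example. The tests only call the public methods and never look at the internal dictionary, so the internals can be rewritten without touching the tests.

```python
class Cart:
    """Hypothetical subject under test."""

    def __init__(self) -> None:
        # Internal detail; the tests below never touch this attribute.
        self._items: dict[str, int] = {}

    def add(self, name: str, price_cents: int) -> None:
        self._items[name] = price_cents

    def total_cents(self) -> int:
        return sum(self._items.values())


def test_empty_cart_totals_zero() -> None:
    assert Cart().total_cents() == 0


def test_total_is_sum_of_added_items() -> None:
    cart = Cart()
    cart.add("pen", 150)
    cart.add("book", 1200)
    assert cart.total_cents() == 1350


def test_adding_same_item_again_replaces_its_price() -> None:
    cart = Cart()
    cart.add("pen", 150)
    cart.add("pen", 200)
    assert cart.total_cents() == 200
```

Each test is named after a functional use case a user of `Cart` would care about, which is also what makes a failure easy for a new engineer to interpret.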