Improving Code Readability
Greetings, friends of programming, this is Marco from 2GuyGames. Today I thought we should talk about code readability. Readability is one of several measures used to estimate code quality. As I explained in my last post, I think readability is the single most important criterion for code quality. Most, if not all, other measures of quality rely on readability in one way or another. Take changeability and maintainability as an example. Both are concerned with architecture at their core, yet without readability they are meaningless. Code that cannot be read can neither be changed nor maintained. As a less obvious example, let us look at execution speed. Optimizing for execution speed means (in most cases) that you want your algorithm to run in as few CPU cycles as possible. To get there, you have to restructure your code, which brings us back to the aforementioned changeability. If you cannot read the code, you cannot understand it, and then you cannot change and improve it. Makes sense? Good! So without further ado, let us dig into readability!
What does “readability” even mean? In a non-programmer context, one would describe messy handwriting as “unreadable”. Of course, text on a PC is always readable in this sense, because it is designed to be. Readability in a programming sense is more concerned with whether or not the text can be understood (easily). Code that you scratch your head over for hours, trying to understand it, would be called unreadable. On the other hand, code that is clear at a glance is considered readable. Thinking about it, maybe the term “readability” is not well chosen.
Anyway, to start with we should ask ourselves: What makes code unreadable, meaning hard to understand? I think readability exists on three different scopes: the system, the module and the method. Of course, all of these are interrelated.
The system’s readability is determined by the architecture that was chosen and how well it was executed. A convoluted architecture that looks like patchwork is hard to understand and to extend. To improve a system’s readability, Design Patterns (https://en.wikipedia.org/wiki/Software_design_pattern) are very useful. They each have very specific and well documented advantages and disadvantages. This gives us the opportunity to choose early on which aspects are important for our software and which are not.
A good example of this is the Abstract Factory Pattern (https://en.wikipedia.org/wiki/Abstract_factory_pattern). Let’s look at Sid Meier’s Civilization to illustrate it. In Civilization, several players play different countries. These countries have different quirks, including slightly altered units that belong to the same archetype. For instance, a Japanese knight is a Samurai that has improved stats compared to the standard knight unit. Now, Civilization is a game with a long lifetime, which means we want to make money over a long period of time. Of course, this means DLC! In DLC we want to be able to easily introduce new countries with differing units. The Abstract Factory Pattern lets us do exactly that. If we implement countries as factories and units as products, it is a simple task to add a new country to the game. All you have to do is extend the default factory (or implement the interface from scratch, whatever you prefer) and implement the necessary methods that create your units. Easy as pie!
However, implementing it like this makes it very cumbersome to create a new unit archetype. This is because a new product requires you to implement a new method in each factory, meaning for each country that already exists. So you have to touch every factory class that already exists, which goes against the Open-Closed Principle (https://en.wikipedia.org/wiki/Open%E2%80%93closed_principle). Still, this exemplifies how Design Patterns can give us a good starting point for our software: we know up front what will be easy to extend and what will not. This ensures that we have a clear vision of what our software should look like, making the architecture clear and readable!
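To make the countries-as-factories idea a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the names Unit, Country, DefaultCountry, Japan and raise_army, the archer archetype and the strength values are my assumptions, not anything taken from the actual game.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Unit:
    """A product: one concrete unit with illustrative stats."""
    name: str
    strength: int


class Country(ABC):
    """The abstract factory: one creation method per unit archetype."""

    @abstractmethod
    def create_knight(self) -> Unit: ...

    @abstractmethod
    def create_archer(self) -> Unit: ...


class DefaultCountry(Country):
    """Default factory that produces the standard units."""

    def create_knight(self) -> Unit:
        return Unit("Knight", strength=10)

    def create_archer(self) -> Unit:
        return Unit("Archer", strength=6)


class Japan(DefaultCountry):
    """A new country only overrides the units that differ."""

    def create_knight(self) -> Unit:
        return Unit("Samurai", strength=12)


def raise_army(country: Country) -> List[Unit]:
    # Client code depends only on the Country interface,
    # so a new country plugs in without any changes here.
    return [country.create_knight(), country.create_archer()]


army = raise_army(Japan())
print([unit.name for unit in army])
```

Adding another country is one small subclass. Adding a new archetype, say a hypothetical create_catapult, means changing Country and every factory that implements the interface directly, which is exactly the trade-off described above.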
Now let us turn to the module scope. As I mentioned, all levels of readability are interrelated. Having a good architecture, i.e. system readability, is a good first step to having readable modules. In OO programming, classes constitute our modules. Of course there are different kinds of modules, but for this article let’s talk about classes specifically. A sound architecture alone doesn’t make your classes well written, but it does mean that all classes should already have a well defined purpose. This is important, because classes without a clear-cut purpose are probably the single biggest impediment to readability! A class should have one, and only one, well defined and necessary purpose. A class without a purpose is obviously superfluous, as it adds nothing to the software.
On the other hand, the more common mistake I see is classes with multiple purposes. A class with multiple purposes is often easily identified by the number of member variables. If a class has 10 or more variables, it probably does more than it should. The next indicator is how often these variables are used. If a variable is only used by one or two out of 10 methods, it probably doesn’t belong. The same might be the case for the methods using it. This property is usually called the “cohesion” of the class. A class with high cohesion uses all its variables in all its methods. Of course, in actual practice this is hard to achieve. In reality you will often find a set of five or so variables, out of which two are used in almost all methods. The remaining variables tend to be used in about half of all methods. In my book, this is a cohesive class for practical purposes.
What makes classes with multiple purposes so hard to read is the fact that you always have to ask yourself which method belongs to which purpose. You have to build a mental model of each purpose and each method that belongs to it and how they are related, if at all. It should be quite obvious that this can get out of hand very quickly, as the number of possible relationships grows rapidly with each added purpose and method. Of course, this makes working with the class very hard, leading back to the points I made at the beginning.
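Here is a small Python sketch of what a class with two purposes can look like, and how it reads once it is split. The classes PlayerBefore, Health and ScoreBoard are hypothetical examples of mine, chosen only to show the variable-usage pattern described above.

```python
from typing import List


class PlayerBefore:
    """Two purposes in one class: combat state and score bookkeeping."""

    def __init__(self, health: int) -> None:
        self.health = health          # touched only by the combat methods
        self.scores: List[int] = []   # touched only by the score methods

    def take_damage(self, amount: int) -> None:
        self.health = max(0, self.health - amount)

    def is_alive(self) -> bool:
        return self.health > 0

    def add_score(self, points: int) -> None:
        self.scores.append(points)

    def best_score(self) -> int:
        return max(self.scores, default=0)


class Health:
    """One purpose: track hit points. Every method uses self.points."""

    def __init__(self, points: int) -> None:
        self.points = points

    def take_damage(self, amount: int) -> None:
        self.points = max(0, self.points - amount)

    def is_alive(self) -> bool:
        return self.points > 0


class ScoreBoard:
    """One purpose: collect scores. Every method uses self.scores."""

    def __init__(self) -> None:
        self.scores: List[int] = []

    def add(self, points: int) -> None:
        self.scores.append(points)

    def best(self) -> int:
        return max(self.scores, default=0)
```

In PlayerBefore each variable is used by only half of the methods, and the two halves never touch each other. After the split, every method in a class uses that class's single variable, so there is no question of which method belongs to which purpose.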
The last scope in my division of readability is the method scope. It is probably the one most people think about when they consider readability. Readability on the method scope is often attributed to the complexity of the algorithm that was implemented. The reasoning goes that if your algorithm is complex, then your code must be hard to read. I disagree. What I see most commonly in hard to read methods is that they are too damn long. Methods that have more than 10 lines quickly become classes in themselves, with tons of variables, scopes, branches and so on. Just like a class, a method should have one specific purpose. The implementation of one algorithm is not necessarily one purpose. An algorithm often has multiple parts, which can be split into different methods (sometimes even classes). Several short methods (with appropriate names, mind you) are far easier to understand than one method of 50 or so lines. Another quite common mistake is jumping between levels of abstraction in one method. For instance, if a method does a detailed calculation inline and calls another method in between, the reader has to switch back and forth between low-level details and the higher-level step hidden behind the call. However, if you also wrap the calculation into another method and call that, you even out the levels of abstraction. In most cases, this will make the method easier to read.
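As a rough illustration of evening out the levels of abstraction, here is a Python sketch. The damage calculation, the ARMOR_FACTOR constant and all function names are invented for this example.

```python
from typing import List

ARMOR_FACTOR = 0.5  # hypothetical: armor halves incoming damage


def apply_armor(damage: float) -> float:
    return damage * ARMOR_FACTOR


def average_damage_mixed(hits: List[int]) -> float:
    # Mixed levels: hand-written loops sit right next to a helper call.
    valid = []
    for hit in hits:
        if hit >= 0:
            valid.append(hit)
    if not valid:
        return 0.0
    total = 0
    for hit in valid:
        total += hit
    return apply_armor(total / len(valid))


def discard_invalid(hits: List[int]) -> List[int]:
    # Negative values are treated as invalid readings.
    return [hit for hit in hits if hit >= 0]


def mean(values: List[int]) -> float:
    return sum(values) / len(values) if values else 0.0


def average_damage(hits: List[int]) -> float:
    # One level: every line is a named step of the same granularity.
    valid = discard_invalid(hits)
    return apply_armor(mean(valid))
```

Both versions compute the same result, but average_damage reads as a short list of steps, and each helper is small enough to check on its own.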
So that’s it for today. Readability is a very complex subject and I only scratched the surface, but I hope I gave you something to think about. Next time Stefan will be back to talk about our hybrid business model in more detail. Hope to see you then!