Lately, I see an excessive use of code-formatting tools. The basic argument is that if the code follows a strict standard, it is easier to read. Yet, what is easier to read for a computer is not necessarily easier to read for humans, and this article will present situations where understanding the code style is more important than just following a simple recipe that can make things really weird.
Introduction
What code do you think is easier to read and will pass the formatting validation tool? - I am using images in this article because I don't want lines to be broken at random places because of HTML issues.
If you said the first option is more readable, you are right!
If you said the second option is more readable, you think like me and are "mostly wrong" according to a lot of new coding practices. I am not kidding... according to some people and the tools they use, the first code is the most readable code. It uses opening and closing braces on the if
. It correctly aligns one argument below the other and, because the ))) {
would span to a new line assuming the indent, it did an extra line-break aligned right to left, to put all the missing )) {
braces.
The second code is clearly unreadable because there is no use of {
and }
for the if
. How would anyone know what was meant there when there isn't the {
and }
braces on the if
?
If you think what I just said looks like a joke... well, to me it is a very bad joke, yet I am seeing more and more places using "auto-formatting tools", trying to impose good coding-style and quality, but making easy to read code be unacceptable while cryptic code is just good.
Not Understanding the Problem
My main pet-peeve right now is that people are completely ignoring the purpose of the programming practices, including styling.
Instead, they are just putting rules that need to be blindly followed saying that it creates "standardized code" and that's just better and more readable.
Yet, what is easier to read for a computer is not easier to read for a human. Writing an entire app on a single line is perfectly readable for a compiler in many programming languages. That simply doesn't match how humans read code.
Using formatting tools is great when they work. Having a consistent code style is great when it really makes things more readable. But are the standards companies are enforcing really making things more readable?
I will give another example now. This is how I would write this fictional piece of code:
Yet, I know that many people will immediately complain about the lack of braces on the if
s. Also, because the braces will waste too many lines, they will remove the blank lines, and even the method declaration will use (
and )
in a different manner. They would end-up writing something like the following:
Apparently that's "more readable" and "less error prone". Notice that on the method declaration, the line break was not put around the parenthesis. It was just done at the last possible moment.
But things don't end there. Too many people were told (or somehow just believe) that using var
is a problem, as we don't know the type of the variable. So, we would need to write it explicitly, like this:
And, independently if I consider this much less readable right now, many people will start avoiding declaring local variables if the types are messy. Strangely, using var
to have a local to reference obj.SomeProperty
is bad... but writing obj.SomeProperty
all over the place (which also doesn't show the type immediately) is not. So, they would end-up with something like this:
And, from all the presented solutions, only the first solution is seen as wrong. All of the others are accepted as the "good and readable ones".
Are they, though?
The Braces Problem is also a Left to Right Problem
The "always use of braces" is a problem on itself. Yet, I think it is actually based on a huge misunderstanding of the main problem: We read left to right.
When talking about always using braces vs using braces only when necessary, my point is that if the opening one is put at the end of the line, we need to always use them because we don't know if readers are going to read the entire line before assuming an end-brace is there. So, we need to consistently use {
and }
at every statement that might use one (if
s, while
s, etc).
On the other hand, if they are put on their own lines (and correctly indented) we could just avoid them when we have inner single-line statements. In fact, I always read the {
as "begin multiple", and it makes no sense to use a "begin multiple" to have just one inner statement.
I am not trying to give arguments about always using braces or not, what I want to show about those two styles is that if we know it is either one or the other, we can read just half the line and still get a good clue of what's happening.
Yet, in some of my examples, that was not the case. Take another look:
vs:
In the first case, we see an if
with a method call, the beginning of an argument, and we don't know why we have apparently some blank lines before the DoSomethingHere
.
On the second case, because we are using just indentations to show the relationship between things (and then a single line when the group ends), we can more easily assume the relationship is there. Of course, that if somebody purposely misindents things, we might get things wrong, but assuming that's not the case, and the indentation is correct, we can get a better idea of what's happening just looking at the beginning of the line.
Unfortunately, many tools assume that indenting items one below the other is "easier" to read, which means we cannot use tabs or everything breaks quite easily (and it also happens if we rename a method to a different length using a refactoring tool that doesn't also reformat the code).
Many tools will only break the line at the latest possible moment, not when it makes more sense, as a way to allow more "compact" code. But this means the next line of code might be a continuation of an inner call, not a continuation of the main call... and if the main call also spans multiple lines, everything needs to be reformatted.
Then, because some code practices mean code spans too many lines, new resources are added to the language to avoid spanning too many lines... and that, to me, is a real issue, as people are writing cryptic code to try to avoid some of the enforced practices.
Cryptic Code to Avoid Multiple Lines
I love how new language constructs allow things to be single-liners, while some coding practices span 4 or more lines for simple things. Then, developers come up with tricks to avoid multiple lines, like:
That is just a single line of code, yet it is doing three things: Checking inputArgument
and then either throwing an exception or assigning inputArgument
to "discard". Well, the use of discard actually means it is only doing two things, yet the code is written like that because the ?? throw
only works in expressions (like an assignment expression). That trick is used to avoid writing the "unacceptable and unreadable":
I am pretty sure for most English readers and even for C# developers that still didn't get used to the new features (like the ?? throw
), the second option should be quite easy to read, and will not span as many lines as the other accepted solution:
Yet, "code style is there to help".
Other Consequences
The code-style actually has other consequences aside from how easy it is to read code. For example, some code coverage tools only evaluate "touched lines" of code. That is, if I do unit tests passing only non-null
values, this code will still have 100% coverage:
Yet, any of the other two alternatives will only have 100% coverage when a null
is passed.
I am not going to discuss if having code coverage for input argument validation is a good or bad thing. I could write an entire article just about code coverage. The thing is, the code style can affect the results of some code coverage evaluations. As well, depending on the debugger, it will be very hard to put a breakpoint only on the throw
using the single-liner, while it will not be an issue to do the same in the other styles.
So, as a general rule, I consider that enforcing a tool generated formatting that can be "made to pass" by using more cryptic code is a bad idea.
What we can do, though, is try to avoid things like:
- Doing more than one method call in a single statement - This will more naturally avoid having to break statements in multiple lines, making it overall easier to understand.
- Use more local variables, even with
var
, instead of doing calls like obj.SomeProperty.Item1
then obj.SomeProperty.Item2
just to avoid having to type the entire type again and again. - Just indent and unindent things, do not align items one below the other, as that can break easily if using tabs or if things get renamed.
- If your statement needs to span into multiple lines, break at the first moment that makes sense, not at the last. The second line of the statement should be a continuation of what we saw at the left of the first line, not of a possible inner-call we didn't see at the end of the line.
And, well, using common sense instead of hard rules is definitely much better.
History
- 24th May, 2022: Initial version