Friday

One liners for readability

(For clarity, when I say one liner, I mean single statement. For various reasons, including readability, it may be desirable to split statements across multiple lines)

Which is easier to read? Is it this multi-statement snippet:
def params = request.params
def config = whatever.getMyConfig()
def request = MyLongTypeNameRequestTransformer.toMyRequest(params, config)
def result = myLongNamedServiceThatDoesSomethingCool.execute(request)
or this single statement:
def result = myLongNamedServiceThatDoesSomethingCool.execute(
MyLongTypeNameRequestTransformer.toMyRequest(
request.params, whatever.getMyConfig ) )
// line breaks added to protect the page-width. 80 columns
Now your answer to that question probably depends on how you interpreted 'easier to read'. Most of you probably thought that I meant "Which format makes it easier to understand what this snippet does?" Typically people are going to say that the first one is more clear in that respect. However, I want to consider a different question, "Which format makes it easier to understand/follow/debug/grok/refactor the overall codebase?"

I would argue that the second form is better in that respect, and that overall manageability of a codebase is generally more important than a small readability improvement for a section of code. I see one liners as an implied abstraction. It hints to the reader that, despite the complexity involved, we are really only interested in one result (either a return value or a side effect), so they can mentally collapse all that code into "get X" or "do Y". It says, "Skim over me, the details are not important."

There are two alternatives to one liners. The first is to use a bunch of local variables. While this can certainly make an algorithm more readable, each one adds one more thing that I have to keep track of when I'm reading your code. I have to remember that config came from whatever.getMyConfig(), and I have to consider that it may be used later in the method. If config is inlined, I don't even have to mentally register its existence unless I'm trying to parse the one liner.

The second alternative is to encapsulate the snippet as a function. This gets rid of the mental scope pollution from stray variables, but it introduces two new problems. Firstly, if I want to know what the function does, I have to go to another place in the file (or another file), which breaks the continuity. Often times one liners also need several of the variables in the current scope, so you end up with a lot of parameters, which adds even more clutter. Secondly, you end up poluting the scope of the class or script with yet another function that is only used in one place. You've just moved the clutter up a level.

Now, there are certainly times and places where you don't want to inline everything all willy nilly. If an expression appear multiple times in a function (e.g., whatever.getConfig()), don't repeat yourself, use a local variable. If you can extract a function that can be used in more than one place, go right ahead. If you have code that is several indentation levels deep, first consider a redesign and/or refactoring that simplifies, but if that doesn't work, extract a function to make it more readable. However, declaring variables and functions should be things that you do out of a clear need, not just because you think you'll need it, or because your CS professor always told you to (CS professors are concerned with algorithms and minutia, not codebase managability). Remember, premature optimization is the devil.

Most modern languages offer syntax that can assist in creating readable one-liners. If your language has map/has and/or list literals, you should avoid building those constructs procedurally. e.g., in JavaScript:
var array = new Array();
array[0] = 'foo';
array.push('bar');
myMethod(array);
is lame. Write that as
myMethod(['foo','bar']);
Groovy's map literal:
myMethod([foo:'bar',baz:'quxx']);
Even Java can be slightly improved thanks to anonymous inner classes and initializer blocks:
myMethod(new HashMap() {
{
put("foo","bar");
put("baz","qux");
}
});
Hopefully you have the idea by now.

No comments: