arei.net

by arei [11-Aug-2022 20:59] in code, java, javascript, nodejs, uncategorized, web, work

Writing code is hard. Getting code to work in a repeatable, consistent fashion that addresses all of the possible use cases and edge cases takes deliberate time and fortitude. Yet getting code working is only half the battle of writing good code. See, good code not only solves the problem, it does so in a manner that other people can use your solution to not just solve a problem, but use it to understand the problem, use it to understand the solution, and use it to understand how to go beyond it. Good code works, but great code teaches.

So, with that in mind, how do we as engineers, write great code? What separates code that merely gets the job done from code that empowers people who read it? How do we transform our code from simply functional to deliberately empowering?

Below I present my ideas on what makes code approachable to others. Following these techniques will make your code more readable by others, more understandable by others, and more reusable by others. To meet these goals requires four specific areas of your attention: readability, context, understandability, and re-usability.

Readability: Write Beautiful Code

The first step in writing great code is to make it readable to others. By the very nature of using a higher order programming language, one would think that readability is solved, but a programming language alone is not enough. A programming language, like any communication system, like any language, is merely a means to express an idea. It is how we use that language and how we structure that usage that makes it beautiful or not, that gives it real understanding.

1. Indentation Conveys Hierarchy

Most modern programming languages do not require indentation. You could simply write all your code like this:

function add(x,y) {
return x + y;
}

and it would compile and execute just fine. But clearly, there is a reason why most modern programming languages do allow for liberal indentation. This code with indentation is infinitely more readable:

function add(x,y) {
    return x + y;
}

We use indentation to indicate hierarchy. This works because of how we visual scan things with our eyes. Western language readers are taught to scan from top to bottom, left to right. Indentation plays into that. As you scan down the code from top to bottom, the left to right indentation allows you to easily ascertain hierarchy within the code.

The return statement in the above code clearly is subordinate to the function definition, merely by that act of indenting it.

In fact, beyond even most computer languages, most descriptive languages like HTML, XML, CSS, JSON, etc, all support indentation to imply hierarchy. One of the hardest things of all when inheriting someone else’s code or data is it not being well formatted. Reformatting and format commands work mostly, but not always, and that is where thing can rapidly fall apart.

2. Meaningful Whitespace

Indentation is not the only way we convey meaning in our code using whitespace. Consider this code:

const FS = require('fs');
const contents = FS.readFileSync(process.argv[2], 'utf8');
let lines = contents.split(/\r\n|\n/);
lines = lines.map(line => {
    line = line.toLowerCase();
    line = line.replace(/[^\sA-Za-z0-9-]/g, '');
    line = line.replace(/\s\s|\t/g, ' ');
    return line;
});
lines = lines.map(line => line.split(/\s/));
const words = lines.reduce((words, line) => {
    line.forEach(word => {
        words[word] = words[word] + 1 || 1;
    });
    return words;
},{});
Object.keys(words).forEach(word => {
    console.log(word + ' ' + words[word]);
});

Like our indentation, this is perfectly executable code that will run and do the several “things” it is intended to do. But understanding what those “things” are is not so clear.

Code runs in an organized top to bottom fashion, and we mostly structure that code into logical steps. The code reads a file, the code breaks the file contents up by newlines, etc. These are the steps that your code performs.

If you look at this code you can even begin to see each of these steps. It might take a minute or two, but the steps are there.

Now, look at this code again:

const FS = require('fs');

const contents = FS.readFileSync(process.argv[2], 'utf8');

let lines = contents.split(/\r\n|\n/);
lines = lines.map(line => {
    line = line.toLowerCase();
    line = line.replace(/[^\sA-Za-z0-9-]/g, '');
    line = line.replace(/\s\s|\t/g, ' ');
    return line;
});
lines = lines.map(line => line.split(/\s/));

const words = lines.reduce((words, line) => {
    line.forEach(word => {
        words[word] = words[word] + 1 || 1;
    });
    return words;
},{});

Object.keys(words).forEach(word => {
    console.log(word + ' ' + words[word]);
});

The logical steps of our code are much clearer to see, much cleaner to read. The only change between the two examples is whitespace.

This process of visually splitting logical steps of code up is called meaningful whitespace. Meaningful whitespace is the art of using well placed line breaks to chunk code into discreet steps. Just like indentation, we use whitespace, new lines, to organize the logic of what our code does for other people to easily understand. And this is about more than just breaking up large chunks of text. Smaller blocks of code are easier to understand when examined than larger ones. Visually when reading this code one can easily understand that each of these steps does something in common and works together to achieve some result.

This is another reason that most programming languages ignore whitespace.

When inheriting someone else’s code it is not uncommon to find a complete lack of meaningful whitespace. The burden this places on the developer to go through the code and chunk it up correctly is significant and slows down their rate of understanding. By grouping your code into chunks that are digestible and understandable to someone else you aid them in time to understanding.

Context and Comments

Up to now we’ve talked about how to make our code more readable. Indentation and Meaningful Whitespace provides readers of your code super easy cues to help the scan and breakdown your code and this makes the code more readable. Readability is the first key to great code.

The next step to great code: context. That is, once my code can be easily read, how do I give it meaning so it can be easily understood. And the keys to understanding code is understanding the context of what any section operates within and on, and how the logic flow of any section operates.

3. Contextual Naming

Fun fact, excluding comments and data, the only things in your code that you get to name are classes and variables and functions. Everything else is a pre-defined keyword or operator of some type. This is why class and variable and function naming is so important, because it is the only part of your actual code where you can provide context to what your code does.

Consider the following code:

function x(y) {
   return y < 1 ? 0
        : y <= 2 ? 1
        : x(y - 1) + x(y - 2);
}

What this code does is not obvious by just reading the code. Providing a contextual name to the function changes everything here:

function fibonacci(y) {
   return y < 1 ? 0
        : y <= 2 ? 1
        : fibonacci(y - 1) + fibonacci(y - 2);
}

Class names, function names, variable names, these are opportunities to describe what they represent (classes), what they do (functions), and what they hold (variables). The name provides context to the role in the system.

Now, there are a few special rules around this, for sure. If iterating a number in a loop, we use i for the index. This is a well-known convention. But even this convention can stand breaking some times. If the context is more important than the convention, always go for more context.

4. Convey Logic through Comments

Class names, function names, and variables names all provide context as to their roles, but program logic is dictated by keywords and in almost all programming languages there is no way to change keywords. So how do we provide context and meaning around program logic?

That is where comments come in, and specifically inline comments within the code. I view this differently than documentation. Documentation answers the bigger question about the code purpose and how people use the code. Comments, instead help to describe the logic and the context around the logic in the code. (We will talk about documentation in greater detail in a little bit.)

Comments within your code are simple useful hints as to how the logic within the following section of code works. Have you ever read a block of code and then wondered to yourself what is going on? That is a failure of commenting. The initial author failed to recognize that another developer would understand the code, and that stems from either hubris or laziness, neither of which are particularly good habits to have as a developer.

Now, this is not an advocation for writing a comment before every line of code. I am a firm believer in writing code that is simple to follow and does not require comments to understand, but that is not always the case. And over-commenting creates a whole separate level of problem. Instead, comment where it is needed. When considering any section of code (which you nicely formatted with whitespace), ask yourself “is it readily obvious what it does?” If the answer is even slightly a no, take a second to add a comment or two of context.

This code from an earlier example is a lot more understandable for a handful of comments, and adding these comments as the logic is written takes almost nothing:

// Bring in the standard filesystem library
const FS = require('fs');

// Read the file
const contents = FS.readFileSync(process.argv[2], 'utf8');

// Split our file content into lines and
// convert each line to an array of simple words
let lines = contents.split(/\r\n|\n/);
lines = lines.map(line => {
    line = line.toLowerCase();
    line = line.trim();
    line = line.replace(/[^\sA-Za-z0-9-]/g, '');
    line = line.replace(/\s\s|\t/g, ' ');
    return line;
});
lines = lines.map(line => line.split(/\s/));

// Count the occurance of each word in all the content.
const words = lines.reduce((words, line) => {
    line.forEach(word => {
        words[word] = words[word] + 1 || 1;
    });
    return words;
},{});

// Output our word cloud and frequency 
Object.keys(words).forEach(word => {
    console.log(word + ' ' + words[word]);
});

There are two different uses of comments. First is what we covered above, to add logic understanding to your code. The second use of comments is to add documentation around how your code is used. This second type is covered in a later section, so please keep reading.

For logic comments, the general rule of thumb is to use // instead of /* */ in your code. /* */ block style comments are really more for providing documentation than doing quick one-off logic comments.

Also, when doing logic comments, do not hide them on the end of a line. Put them on their own line so that they are more visible and easier to see.

const superWeirdThing = !superWeirdThing; // DON'T COMMENT LIKE THIS

Organize Your Code

Understanding about your Code is done partially through providing context about what that code does, but equally important to understanding is organizing and structuring your code in a repeatable, understandable, and accepted fashion.

5. Organized from Top to Bottom

The first step in organizing your code is to physically lay the code out in logical sections. Almost all modern programming languages have a recommended approach to organizing your code. Different languages have different approaches to this, and even different approaches within that. JavaScript, is one of the more chaotic languages with little formal agreement on the best way to do this. However, there are still little things you can do without a formal community accepted standard. For example, import/require statements are expected to be first in the file.

The important thing here is to keep like with like. Do not put two class definitions separated by some variable assignments. Put the class definitions together. This is an implied structural organization for your code and it applies at the file level, the class level, and the function level.

A common layout pattern for a file in many languages is:

Require Libraries
Declare Constants
Declare Variables
Declare Classes
Declare Functions

A common layout pattern for a class in many languages is:

Declare Static Members
Declare Members
Declare Static Methods
Declare Constructor
Declare Methods

There is a common layout pattern for writing functions in some languages, but JavaScript is a little more lose here, so I am not going to offer a formal pattern. Instead, I will tell you how I like to structure things for a function:

Arguments and Argument Checking at the top
Declare variables as I need them, but nearer to the top if possible.

6. Separation of Concerns

In computer science the concept of Separation of Concerns (SOC) allows one to divide a computer program into logical sections based on what each section of that computer program does. In Web Development, HTML, CSS, and JavaScript is an example of Separation of Concerns where HTML is responsible for the content, CSS is responsible for how the content looks, and JavaScript is responsible for how the content behaves. Model-View-Controller (MVC) is yet another Separation of Concerns system that is often used.

But beyond a fancy CS Degree, separation of concerns means creating one thing to do that thing well without side effects. It means writing many functions that each do something well, instead of one single function that handles all the cases. It also makes more sense to break things up into smaller discreet units in terms of reusable code.

It also means letting the various pieces do exactly what they are good at and not trying to force them to do tasks that are better suited elsewhere. It is this reason that we should not write style information in JavaScript when a perfectly good CSS rule will do. CSS is optimized for dealing with these things and JavaScript is not, so why would you try to outwit it? Of course, there are always times when you have to break these rules, but they are very, very rare.

Related to SOC, in Computer Science there are three other principals to be aware of:

Coupling – Coupling is the measurement of how interdependent two modules or sections of code are to one another. There are a lot of types of coupling between two sections of code with a lot of fancy names and descriptions. For now it is best to just understand it this way: How dependent on A is B and vice-versa? Can A run without B? Can B run without A? Two modules are said to be Tightly Coupled, if they are very dependent on each other. Conversely, they are said to be Weakly Coupled it they are only slightly dependent on one another. Coupling in code is necessary because code executes linearly. However, designing classes and functions to be independent provides for better re-use and more understandable code. Consider some code you are looking at and that moment when you say, “Well, this calls function X, now where is function X and what does it do?” This is an example of coupling.
Cohesion – Is somewhat the opposite of coupling as it measures the degree to which things within a section or module of code belong together. This is usually described as having High Cohesion, meaning things within the module belong together, or Low Cohesion, meaning that they do not. In Object Oriented Programming, high cohesion of what is in any given class is a design goal. The class should do what it sets out to do and only what it sets out to do. If I have a Class called Car and inside that class there’s a function called juggle() it probably means my class has poor cohesion (and interestingly probably also implies tight coupling with something else). Side Effects from running a module or function implies poor cohesion. If function juggle() also fills the car with gasoline, either this is an amazing juggling trick, or there is a side-effect in the code.
Cyclomatic Complexity – Cyclomatic Complexity is the measure of how complex a section of code is, based on the number of paths through the code. That is to say, the more non-linear outcomes the code generates, the higher its Cyclomatic Complexity score. This is often judge by the number of conditional statements in your code. A function that has no conditionals, is said to have a Cyclomatic Complexity of 1. If it had a single conditional with two out comes (a typical if/else for example) it would have a Cyclomatic Complexity of 2, and so on. Cyclomatic Complexity is not bad, in and of itself, but the more complex the code, the higher the possibility of failure, and the more testing required. A function should strive for a lower complexity score if possible.

So, we talk about these three measures of code in order to reinforce the point of Separation of Concerns. A Separated module is one that is not coupled directly to another module, is cohesive in what it does, and reduces complexity.

When you find yourself in code that is hard to follow, it is almost always because it fails on one of these three factors.

One way to reduce complexity and structure your code better is to identify the cohesive sections of your code and isolate them into Logical Blocks. A Logical Block is a portion of your code that requires multiple lines to complete a single task. As we talked about earlier, breaking your code up with meaningful whitespace makes these logical blocks easier to identify. Adding comments makes them clearer in their purpose.

But a Logical Block also is a good hint as to how to abstract logic into small, more cohesive behaviors. For example, if your logical block needs to get reused, isolate it out into a new function.

Also, it is worth mentioning about Code Folding. Code Folding is a feature of most modern IDEs that allow you to hide (or fold) blocks of code. Code Folding in your IDE can be a powerful code reading tool. However, Code Folding requires certain syntax structures to work. By thinking of your code as Logical Blocks you can see where Code Folding opportunities could exist, if the logical block was in an acceptable structure.

Thinking about your code as a Logical Block allows you to more easily see other ways to do the same thing, ways that may be cleaner and more cohesive. It illustrates where there may be room for improvement in your code as well.

For example, this code is a Logical Block.

let sum = 0;
for (const num of numbers) {
    sum += num;
}
const avg = sum / numbers.length

You could change this logical block to be more readable by…

Abstracting it out to a function, especially if it is going to be reused.

const avg = calculateAvg(numbers);

Wrapping it in a IIFE (Immediately Invoked Function Expression)

const avg = (() => {
    let sum = 0;
    for (const num of numbers) {
        sum += num;
    }
    return sum / numbers.length; 
}());

Using a built-in library to do it simpler.

const avg = numbers.reduce((sum, num) => sum + num, 0) / numbers.length;

Using a third party library helper function.

const sum = _.avg(numbers);

All of these are better than the first because they create an isolated block that cleanly identifies what it does. With proper whitespacing and a comment above the block, makes this tight, readable, well reasoned code.

Code Usability

At some level, everything you see, smell, taste, touch, or interact with is a User Interface. An API, for example, is a User Interface between your backend and your frontend. This goes for the code you write as much as anything else. At some point, another person is going to read your code, run it, and even have to make sense of it. So, build your code like you would an API. Take into consideration how another person is going to interact with that code, how they are going to try and reason about it, how they are going to try and work within it.

Build everything with another user’s experience in mind.

7. Defensive Coding

Software is a piece of code that produces some sort of output, usually based on some sort of input. Now, assume someone is going to run your code at some point. Assume that that person is not going to read the documentation. What would happen? Would it work if the first argument is a null? Would it work if the second argument was a negative zero or a NaN? How your code behaves when executed under uncertain conditions is a testament to the quality of engineering employed in building it.

An approach to handling these questions is called Defensive Programming. In defensive programming you assume that users will maliciously try to cause problems with your code and therefore your code must guard against their attack. This means making the assumption that all data is tainted and must be verified to be correct.

Here’s an example:

if (this.onClose) this.onClose();

The assumptions made by this code are not defensive. The snippet assumes that onClose is a function and then executes it as a function. A corrected version of this code might look like this:

if (this.onClose && this.onClose instanceof Function) this.onClose();

But there is another assumption here as well, that executing this.onClose() will work? So even more defensive code might do it thus:

if (this.onClose && this.onClose instanceof Function) {
    try {
        this.onClose();
    }
    catch (ex) {
        ... do something about the error ...
    }
}

There is even an argument, that another defensive point might be that if this.onClose() is a long running process, it could cause your thread execution to grind to a halt and an even more defensive posture might be thus:

if (this.onClose && this.onClose instanceof Function) {
    setTimeout(()=>{
        try {
            this.onClose();
        }
        catch (ex) {
            ... do something about the error ...
        }
    },0);
}

That last form might not work for all situations, but it is extremely defensive.

The key to writing code defensively, is to checking your inputs early in your code. Make sure what you are getting is what you are expecting. If your function only works with even numbered integers, you better make sure it takes only even numbered integers.

The same is true for classes. Write your classes to assume that users are going to try to harm them and break your code. This means writing defensive methods but also, preventing unwanted or unknown class state mutation. Setters are a great way to protect your class from bad user input and should be used whenever possible instead of exposed public members. Consider this class:

class Car {
    make: 'Chevy';
    model: 'Malibu';
    mpg: 25;
    tankSizeInGallons: 14;

    toString() {
        return `The ${this.make} ${this.model} can go ${this.mpg * this.tankSizeInGallons} miles on a single tank of gas.`;
    }
}

The publicly exposed make, model, mpg, and tankSizeInGallons are all mutable without any thought for defensiveness. A malicious coder could cause quite a problem with these variables. More defensive style would be to use private variables (or Symbols) to hold the values, and provide public getter and setter functions to expose them. The setter functions, in particular, can do error checking and ensure type and values match acceptable inputs.

8. Code is a User Interface

Your code, whether a component, a utility, or an API, is going to get used by someone else. This is even more true when one considers that sharing of code online has become pervasive. And if you work in a team setting at all, this compounds the likelihood someone else will need your code. I do not care how throw away you believe your code is, assume that someone is going to use it.

To that end, and as stated before, treat your code like a User Interface. This means designing your code to be read is important, but also designing your code to be used by others is equally important. Ask yourself how another user is going to come in contact with your code. If they see the function signature, does it provide enough information as to what it does and how it works or does it require additional information? If they read your code, is it obvious how they would step through it from line to line or does it require more contextual logic?

A lot of what we have already covered in this document describes how to address writing code toward usage by others, but it bears repeating here:

Design your code to be read by others.
Highlight hierarchy with indentation.
Separate discreet sections of behavior with whitespace.
Name classes, functions, and variables contextually.
Provide comments around programming logic.
Clearly Separate your application concerns.

Following these steps is the first part of writing code like a user interface.

The second part is to always be asking yourself, how will others use my code? Challenge yourself to always be mindful of this question and your code will be the better for it.

Most professional coding work is a shared environment where everyone is always working on each other’s code; any team experience really is. As such it is critical to always be considerate of how others are going to use your code. Whenever you write a new component you should be consciously aware that someone else may very well need what you are building for their own purposes. Write your component with that reuse in mind. But also, be insatiable in your desire to make your component as accessible to another developer as possible. The more we reuse one another’s components, the better our product becomes.

9. Documentation

Finally, let us talk about documentation. To many software engineers the act of writing documentation is seen as an after thought, something to be farmed out to some loser English Literature person desperate for a job. But the reality of software engineering is that it is only 50% about writing code. The remaining part of Software Engineering is communicating with other people.

As we have stated above: someone else is going to use your code. It is inevitable. Given that, telling those users how to use your code is critical. If you fail to communicate how your code runs, you might as well not have written the code at all.

We talked about comments in a prior section. Comments are used to describe the logic your code follows. Documentation, on the other hand, describes how a user of your code interacts with the code and gets it to do what is expected.

Unlike inline comments to describe logic blocks, documentation is about describing the interfaces of your code, the places where others will interface with your code. We write documentation to tell others whom are using the code what it does and how it is used. This does not mean you document every single function and variable. Instead this means you document every public facing part of your code, every place other will interact with it.

Fortunately, in this day and age, you do not even have to lift your eyes out of your IDE to write documentation. Any modern language today has an associated documentation language built into it. Java has JavaDoc, Typescript has TSDoc, JavaScript has JSDoc, etc. These documentation languages assist you in describing your class, function, variables, every aspect of your code as much as you want.

Even better, most modern IDEs have these documentation languages built right into them and have tooling that makes writing that documentation easier.

Here’s an example of using JSDoc:

/**
 * Move the selection to the "first" section.
 * @return {void}
 */
selectFirst() {
    const first = this.sections.find(section => section.startHere);
    this.select(first || this.sections[0] || null);
},

In this example the comment block that precedes the selectFirst() function describes what it is and what it returns.

Documentation languages vary by programming language, so please read up on your particular one.

One note about documenting that to address: Your code is not self-documenting. Many, many people have said this over the years and every single one of them has been wrong. What they are really saying is their code is Readable and that is enough. In some case this is true, but mostly it is not. Please add documentation to your code about what it does and how it is used.

Denouement

So that’s it. Above we outlined nine (9) separate ways to make your code more readable, more descriptive, more defensive, and more user friendly. It’s a lot, to be sure, but each of these approaches is a small step towards the larger goal of writing better code that is more than just runnable.

We recognize that this could be seen as a lot. We all know that software is a demanding job full of demanding bosses and customers who think it is super easy. And these best practices they take additional time, they cost cycles, and adding them can seem overwhelming. However, starting small is the key here. When considered as a whole, these rules can seem daunting. But individually, they are not so rough. Try adopting one or two of these at first, then a few weeks later add another, until they are all second nature. It’s a good bet you are probably already doing one or two of these without even thinking about it. So, add a few more, then a few more, and next thing you know you will be shipping code you are proud to have others to read.

permalink: http://www.arei.net/archives/302