How to effectively comment code

It's important that code has good comments accompanying it, because comments expose the higher-level design and intentions behind the code and allow the programmer to leave notes for future developers. The purpose of writing a comment is to explain something about the code in a way that makes it easier to understand; comments exist to be read not by the computer, but by people.

Writing good comments

The hardest part of writing comments is actually remembering and taking the time to put them in. When writing them, though, the following general guidelines should be kept in mind:

Comment location matters

Comments describing a single line of code should go either on the line before it or at the end of that line, but never on the line after. This is because code is read from top to bottom, and comments are generally going to be read before the code they describe. For example, some code that uses the Pythagorean Theorem:

Good comment positions:
# Calculate length of hypotenuse
c = square_root((a * a) + (b * b))

c = square_root((a * a) + (b * b))  # Calculate length of hypotenuse

Bad comment position:
c = square_root((a * a) + (b * b))
# Calculate length of hypotenuse

Comments that describe multiple lines of code should go on the line before the first line of code. Further, because the lines of code are meant to be logically grouped together, they should also be visually grouped together. Code that doesn't have a comment should be visually separated from code that does, to make it clear that it is not covered by any of the other comments.

# Comment describing operation A
code for operation A
more code for operation A
even more code for operation A

# Something else
code not part of operation A

code that doesn't have a comment associated with it

# Comment describing operation B
code for operation B
still more code for operation B
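
As a concrete illustration of the same layout, here is a minimal Python sketch. The function name, its parameters, and the tax calculation are hypothetical and chosen only to show the grouping: each comment sits above the block it describes, and the uncommented return line is set apart by a blank line.

```python
def summarize_order(prices, tax_rate):
    # Add up the pre-tax cost of every item
    subtotal = 0.0
    for price in prices:
        subtotal += price

    # Apply sales tax to get the amount owed
    tax = subtotal * tax_rate
    total = subtotal + tax

    return round(total, 2)
```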

Don't state the obvious

Comments that parrot back what the code says are not useful.

Here are two simple examples of comments that do nothing more than repeat what the code says:

# Get list average
list_average = CalculateAverage(list)

# Get the difference between how many bytes
# have currently been read and how many bytes
# were previously read, and divide by how
# much time has passed, to get the read speed
read_speed = (bytes_read_now - bytes_read_previous) / time_difference

The first comment isn't needed at all: the code very clearly calculates a list average. If a comment has to be put in, it should say why the average of the list is needed, but that will likely be made clear by the surrounding code. The second comment is far too verbose and doesn't contain any additional information useful to the reader. A comment should still be provided, and in this case a comment such as "Get read speed, in bytes per second" would be very helpful. This tells the reader the unit of the variable, so that when, farther down, they see the expression read_speed / 1024, they know that the result is in kilobytes per second.
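
With the shorter comment in place, the calculation might look like the following Python sketch. The wrapper function read_speed_kb is hypothetical, and the sketch assumes time_difference is measured in seconds.

```python
def read_speed_kb(bytes_read_now, bytes_read_previous, time_difference):
    # Get read speed, in bytes per second
    read_speed = (bytes_read_now - bytes_read_previous) / time_difference

    return read_speed / 1024
```

Because the comment names the unit, the division by 1024 on the last line needs no comment of its own.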

Here is another example of a comment which, at first glance, appears to be non-obvious, but on closer inspection is barely telling more than what the code says.

# Check if a user is an administrator
# before deleting the file. Give a message
# to non-admins letting them know that they
# can't delete the file
if not IsAdministrator(user):
    ShowMessage("You can't delete this file")

The comment is obvious only because these function names are so clear. In this case, it would be better to have a comment that says "Only administrators can delete files", and even that is debatable: such a comment provides a very concise summary of what the code is intended to do, but it's also somewhat redundant after looking at the code. It is worth emphasizing that this is an example of "self-documenting code", where the code is self-explanatory when read. It is also worth emphasizing that most code is not actually self-explanatory, and that most code should have at least a few comments explaining what is going on. It's better to have a few concise and somewhat redundant comments than to have no comments, because they not only help confirm the reader's understanding of what the program is doing, they also help the reader group operations together more rapidly.
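
A minimal Python sketch of the concise version follows. The names is_administrator and delete_file, and the idea of a user being a dictionary with a "role" key, are hypothetical stand-ins for whatever the real program uses; the message is returned only to users who are not administrators.

```python
def is_administrator(user):
    # Hypothetical check: a user is a dict with a "role" key
    return user.get("role") == "admin"


def delete_file(user, files, name):
    # Only administrators can delete files
    if not is_administrator(user):
        return "You can't delete this file"

    files.remove(name)
    return "Deleted " + name
```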

Have transition comments between distinct operations

Sometimes, particularly in chunks of code that are especially complex or long, it's easy for the reader (or developer) to get lost, more so when there is a sudden transition from one set of operations to another. To help combat this, brief comments may be written that simply state where the function or program currently stands. For example: "Failed to create file, bail out", "Loaded header, start reading records", or "Secure connection established, send data". These comments help guide the reader through the flow of the program, especially when the flow changes.
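
The following Python sketch shows transition comments in a small loader. The file layout is an assumption made up for this example: a 4-byte little-endian record count followed by that many 2-byte records. The function name load_records is likewise hypothetical.

```python
import io
import struct


def load_records(stream):
    header = stream.read(4)
    if len(header) < 4:
        # Header is incomplete, bail out
        return None

    # Loaded header, start reading records
    (count,) = struct.unpack("<I", header)
    records = []
    for _ in range(count):
        records.append(stream.read(2))

    # All records read, hand them back to the caller
    return records


data = struct.pack("<I", 2) + b"abcd"
print(load_records(io.BytesIO(data)))  # prints [b'ab', b'cd']
```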

When dealing with if statements that have long bodies, it might be helpful to have a quick comment at the top of the else body (if there is one) that states the condition that is true if the else body is executed. For example:

if(IsAdministrator(user)) {
    20 lines of code later...
} else {
    # User is not an administrator
    more code
}

Comments like these are nice because they help the reader verify that they understand the program flow correctly.

Comments must be accurate

At their simplest, inaccurate comments give the wrong impression of what's going on. If there's a comment describing any given line or section of code, there's a good chance, depending on what the reader's goals are, that the reader will not really look at the code, and will take the comment at face value and move on. If the comment does not accurately convey what the code is doing, then the reader may come away with the wrong idea or a flawed understanding of the code.

What happens, though, when a comment says "save email to file, it will be sent later", but the code that follows actually sends the email immediately? Which is correct: the code or the comment? Obviously what's written in the code is what will be executed, but which is the intended operation? Did the code not get updated to reflect what the comment says, or is the comment a holdover from a previous iteration of the project? Questions like these are why it's best to make sure that comments are always as accurate and up to date as possible.

Don't leave code commented out

During the coding process, sometimes it's necessary to comment out a few lines of code. Perhaps it's to see if removing the code fixes a bug, or to keep it around for reference while rewriting a section of code. Regardless of the reason, during active development, there's nothing wrong with this.

Where it becomes a problem, however, is when code is left commented out longer than needed. Not only does this create bloat, there's great potential for confusion. The bloat comes from the simple fact that dead code is being carried around, taking up space both on disk and on screen. As to the confusion, when there's commented out code next to non-commented-out code, the reader is left wondering which version is correct, whether the two chunks of code are supposed to be equivalent, and why the commented code is left in. While obviously only the non-commented-out code will be executed, the reader has to work harder than necessary to figure out what's going on.

Another important reason not to leave code commented out is that, depending on what precisely the commented-out code is supposed to do, it may no longer function as originally intended by the time it is uncommented. Referenced variable or function names may have changed, or even the flow of the surrounding code, and it's extremely unlikely that the commented-out code was kept up to date. If the code won't function as intended once it's uncommented, there is very little reason to keep it around.

It's worth noting that proper use of version control means that old code can always be recovered, even after it's deleted. It is for this reason that commented-out code shouldn't be committed into the history of a project backed by version control.

Leave useful notes for future developers

Sometimes bugs, or areas where bugs may exist, are discovered that, for whatever reason, can't be fixed at the time. While there's nothing wrong with this, these bugs should be noted immediately, before they are forgotten. These comments should include the label FIXME or XXX, with FIXME used to indicate a particular bug and XXX generally used to indicate an area of code that needs attention. Adequate detail should be provided so that future developers have the information needed to identify and fix the bug.

There are also times when incomplete functions or future features need to be noted. In this case, such comments should include the label TODO and enough detail that the feature or modification (and possibly the rationale) is clearly understandable.

These labels are not arbitrary. Many code editors recognize these labels in comments and highlight or otherwise color the labels differently, making them stand out to anyone reading the code. Further, consistent use of these labels allows one to search for them in a codebase, allowing developers to easily identify areas that need improvement.
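
To show how consistent labels make a codebase searchable, here is a minimal Python sketch that scans source text for the three labels. The function find_labels, the regular expression, and the sample source are all assumptions made for illustration. The pattern is deliberately naive: it only looks at "#" comments and would also match a "#" that appears inside a string.

```python
import re

# Matches a "#" comment containing one of the labels, then captures the note
LABEL_PATTERN = re.compile(r"#.*?\b(FIXME|XXX|TODO)\b:?\s*(.*)")


def find_labels(source):
    """Return (line_number, label, note) for each labeled comment in source."""
    results = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = LABEL_PATTERN.search(line)
        if match:
            results.append((line_number, match.group(1), match.group(2).strip()))
    return results


example = '''\
def parse(data):
    # FIXME: crashes on empty input
    first = data[0]
    # TODO: support more than one field
    return first
'''

for line_number, label, note in find_labels(example):
    print(line_number, label, note)
```

In practice the same search is often done with an editor's search feature or a command-line tool; the point is only that a fixed set of labels makes such a search possible.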

Putting it all together

When comments are written effectively, they allow a reader to more easily understand and navigate a codebase. Comments should be written to maximize (correct) understanding; after all, comments are meant for people, not computers. When in doubt, include a few more comments than might be strictly necessary, so long as they are accurate and useful.