Accelerate your software engineer career by managing scope creep in your pull requests

After submitting code for review, you will often receive requests to include fixes or improvements not directly related to the primary goal of your change. You can recognize them easily as they usually sound like: “Because you are touching this code, could you…?” and typically fall into one of the following categories:

  • Random bug fixes
  • Code refactoring
  • Fixing typos in code you didn’t touch
  • Design changes
  • Increasing test coverage in unrelated tests

While most of these requests are made in good faith and aim to improve the codebase, I strongly recommend against incorporating them into your current change for a couple of reasons:

You will trade an important change for a lower-priority one

Your main change was the reason why you opened the pull request. By agreeing to additional, unrelated changes, you sacrifice the more important change for minor improvements. How? The additional changes you agreed to will need at least one more round of code reviews, which can trigger further feedback and more iterations that will delay merging the main change.

I learned this lesson the hard way when I nearly missed an important deadline after agreeing to fix a ‘simple bug’ that was unrelated to the feature I was working on. Fixing the bug turned out much harder than initially perceived and caused unexpected test issues that took a lot of time to sort out. Just when I thought I was finished, I received more requests and suggestions. By the time I merged my pull request, I realized how close I had come to not delivering what I promised on time, only because I agreed to focus on an edge case that ultimately had minimal impact on our product.

You will dilute the purpose of the pull request

Ideally, each pull request should serve a single purpose. If it’s about refactoring, only include refactoring. If it’s fixing a bug, just fix the bug. This approach makes everyone’s job easier. It makes pull requests easier and quicker to review. Commits are less confusing when looked at months later and if you need to revert them, no other functionality is lost. All these benefits diminish if your pull request becomes a mixed bag of random code changes.

How to deal with incidental requests?

You have a few ways to proceed with suggestions you agree with. Usually, the best approach is to propose addressing them in separate pull requests. If your plate is full, log tasks or add them to the project’s backlog to tackle them later. Occasionally, a request may suggest a significant redesign that may fundamentally alter your work. If the idea resonates with you and won’t jeopardize your timelines, you might want to consider implementing it first and redo your changes on top of it.

How not to ruin your code with comments

The road to the programmer’s hell is paved with code comments

Many developers (and their managers) think that comments improve their code. I disagree. I believe that the vast majority of comments are just clutter that makes code harder to read and often leads to confusion and, in some cases, bugs.

A heavily commented code base can significantly slow down the development. Reading code requires developers to constantly filter out comments, which is mentally taxing. Over time, developers learn to ignore them to focus better on code. As a result, comments are not only not read, but also forgotten when related code changes.

Moreover, because all comments look similar, it is hard to distinguish between important and not-so-important comments.

Unhelpful comments

In my experience, unhelpful comments fall into a few categories.

Pointless comments

Pointless comments are comments that tell exactly what the code does. Here is an example:

// check if x is null
if (x == null)

These comments add no value but have a great potential to turn into plainly wrong comments. There is no harm in simply removing them.

Plainly wrong comments

My all-time favorite in this category is:

// always return true
return false;

The most common reason for these comments is to miss updating the comment when changing the code. But even if these comments were correct when written, most were not useful. Similarly to “pointless comments”, these comments should just go.

TODO/FIXME comments

When I see a // TODO: or a // FIXME: comment, I cannot help but check when they were added. Usually, I find it was years before.

Assuming these comments are still relevant, they only point to problems that were low priority from the very beginning. This might sound extreme, but these comments were written with the intention of addressing the problem “later”. Serious problems get fixed right away.

Let’s be honest – these low-priority issues are never going to get fixed. Over the years the product had many successful releases and the code might already be considered legacy, so there is no incentive to fix these problems.

Instead of using comments in the code to track issues, it is better to use an issue tracking system. If an issue is deprioritized, it can be closed as “Won’t Fix” without leaving clutter in the code.

Comments linking to tasks/tickets

Linking to tasks or tickets from code is similar to the TODO/FIXME comments but there is a twist. Many tasks will be closed or even auto-closed due to inactivity, but no one will bother to remove the corresponding comments from the code. It could get even more problematic if the company changes its issue tracking system. When this happens, these comments become completely irrelevant as the tasks referred to are hard or impossible to find.

Referencing tasks in the code is not needed – using an issue tracker is enough. Linking the task in the commit message when fixing an issue is not a bad idea though.

Commented out code

Checking in commented code doesn’t help anyone. It probably already doesn’t run as expected, lacks test coverage, and soon won’t compile. It makes reading the code annoying and can cause confusion when looked at if syntax coloring is not available.

Confusing or misleading comments

Sometimes comments make understanding the code harder. This might happen if the comment contradicts the code, because of a typo (my favorite: missing ‘not’), or due to poor wording. If the comment is important, it should be corrected. If not, it’s better to leave it out.

Garbage comments

Including a Batman ASCII Art Logo in the code for a chat app might be amusing but is completely unnecessary. And no, I didn’t make it up.

Another example comes from the “Code Complete” book. It shows an assembly source file with only one comment:

MOV AX, 723h     ; R.I.P. L. V.B.

The comment baffled the team, as the engineer who wrote it had left. When they eventually had a chance to ask him, he explained that the comment was a tribute to Ludwig van Beethoven who died in 1827 which is 723h in hexadecimal notation.

Useful comments

While comments can often be replaced with more readable and better-structured code, there are situations when it is not possible. Sometimes, comments are the only way to explain why seemingly unnecessary code exists or was written unconventionally. Here is one example from a project I worked on a while ago:

https://github.com/SignalR/SignalR-Client-Cpp/blob/4dd22d3dbd6c020cca6c8c6a1c944872c298c8ad/src/signalrclient/connection.cpp#L15-L17

// Do NOT remove this destructor. Letting the compiler generate and inline the default dtor may lead to
// undefined behavior since we are using an incomplete type. More details here:  <http://herbsutter.com/gotw/_100/>
connection::~connection() = default;

This comment aims at preventing the removal of what might appear to be redundant code, which could lead to hard-to-understand and debug crashes.

Even the most readable code is often insufficient when working on multi-threaded applications. As the environment can change unpredictably in unintuitive ways, comments are the best way to explain assumptions under which the code was written.

Finally, if you are working on frameworks, libraries, or APIs meant to be used outside of your team you may want to put comments on public types or methods. They are often used to generate documentation and can be shown to users in the IDE.

How to write good comments

Here are a few tips when it comes to writing good code comments.

  • Comment sparingly. Make your code speak for itself – use meaningful variable names and, break down large functions into smaller ones with descriptive names. Reserve comments for clarifying non-obvious logic or hard-to-guess assumptions.
  • Focus on “why”, not on “what”. Readers can see what the code does but often struggle to understand why. Good comments explain the assumptions, intentions, and nuances behind the code.
  • Be concise and relevant. Avoid including unnecessary details that could lead to confusion.
  • Ensure comments are clearly written and free from typos or mistakes that make them hard to understand.
  • Use abbreviations cautiously. Expand them on the first use. An abbreviation that is obvious now may become incomprehensible after a few years even for you.

The downsides of an inconsistent codebase and what you can do about it.

Code consistency is essential but arguing about coding style is not productive.

Why does code consistency matter?

A consistent code base makes developers’ lives better.

Reading and understanding code that follows a coding standard is a lot easier. Even the parts of code you rarely, if ever, work with are less intimidating.

Code that is easier to read is code that is easier to review. A consistent code style makes code reviews more efficient and thorough because it allows focusing on the important aspects of the change.

Writing code is more effortless because you don’t need to use your mental energy to find the best way to format it. In larger code bases, a consistent style allows applying extensive automated code transformations (a.k.a. codemods) without many misses.

Searching the code is more effective. The consistent structure makes it easier to create search terms that work well. Plus, you can quickly tell if what you found is what you were looking for.

What happens if there are no coding guidelines in place?

Without clear coding guidelines, teams waste time arguing over coding styles. I have yet to see two software developers agree on the style, let alone witness one convince another to adopt their style. What I have seen, however, was diff reviews where obvious bugs or design issues were missed because the diff was dominated by heated (and sometimes personal) arguments about code style.

I once worked in a codebase where a class had four different conventions for naming private variables: underscore prefixed (_name), “m” prefixed (mName), “m_” prefixed (m_Name), no prefix (name). But that was just the tip of the iceberg. The reason for this madness was that four developers, each with their preferred style, who actively worked on this file wouldn’t compromise. Everyone dreaded modifying this file because they knew that the review would take ages and spark conflicting comments (occasionally turning into flamewars) over the chosen coding convention.

How to implement coding guidelines?

If your team doesn’t have coding style guidelines, consider setting them up to unlock the benefits discussed above. However, be warned – it won’t be easy. The main issue is that everyone has their preferred style and, it is tough to create guidelines that satisfy everyone. Since reaching an agreement can be difficult, adopting coding guidelines from other companies or using the default settings from your IDE might be the easiest practical solution. Securing support from your manager and/or a senior engineer on your team before you begin can play a key role in ensuring its success.

Ensuring that style violations are easy to detect and fix is crucial. They should be highlighted both in the IDE and during code reviews, and fixing them should not require more than a few keystrokes. Fortunately, all popular IDEs offer now excellent support for coding guidelines, including project-specific coding guidelines.

The role of tooling cannot be overstated. I witnessed first-hand how inadequate tooling caused an attempt to implement coding guidelines to fail. In my team, we introduced a standalone linter that had to be run from the command line. Reported issues had to be fixed manually. This process was not only tedious but often required several iterations. Because style issues weren’t shown in the code review tool, many of them slipped through, burdening the next person who touched the file. Before long, fewer and fewer team members followed this process and eventually, the initiative was abandoned.

How to deal with code written before coding guidelines were implemented?

Once you have implemented coding guidelines there is one more problem to solve: what do to with the existing code that doesn’t follow the guidelines? Generally, there are two approaches: fixing the code to match new guidelines or leaving it as is. Each option has its pros and cons.

Updating all your code to comply with new coding standards can quickly get you to a clean state. Typically, it involves sending a huge diff that re-formats a significant portion, if not all, of the code. This approach has two major drawbacks. Firstly, it can be highly disruptive for the team. The formatting diff can interfere with ongoing work and may make familiar code hard to recognize. If you decide to go with this approach, choose a time that doesn’t put important deadlines at risk, giving developers time to adjust. Secondly, the big re-formatting will make the change history hard to follow. Finding what changed between commits from before and after “the big diff” will be difficult as it would seem like everything has changed.

You can choose to keep the existing code unchanged and enforce applying new guidelines only to new code. This approach won’t give you an instantly consistent code base, but over time, your code will gradually move towards the ideal state. Frequently modified files will comply with the new guidelines sooner, while others may never get there. This approach is much less disruptive than the previous one.

You can also combine the two approaches. Start by enforcing the new guidelines only on new code to minimize disruption. Then, after some time, update all the remaining code to match the guidelines. The impact of the second phase should be relatively small because it would only involve code that isn’t frequently looked at.

Future proof code

“Because in the future we might need…” triggers me a lot.

When I hear “In the future we might need…” I am hearing: “I am adding this complex code for no particular reason, and we will have to maintain it for a long time only to spend weeks to remove it”.

It is like saying “In the future I might want to build a pool so let me dig a giant hole somewhere in my backyard right now.”

Digging this hole will create a bunch of problems. It’s a safety hazard, so you’ll have to fence it. During a storm, a lot of water will pool there, and you’ll need to pump it out. Your guests will ask about your hole or comment on it each time they visit you. Your or your friends’ kids won’t be able to play in the backyard, and if they try, you’ll be the one to get the ball from the hole. Eventually, you may decide to sell the house or come to the conclusion that a pool in the backyard was not that splendid of an idea, and your giant hole is now a giant problem. If you do happen to move forward with the pool project years later, your pool hole will likely not be where you want the pool now. It will probably also be either too small, too big, too deep, or too shallow.

What do you do if you consider building a pool next to your house sometime in the future? You don’t make decisions that will make it impossible to build it when the time comes: you don’t sell this land, you don’t build a bunker or mother-in-law unit there. You just keep the land maintained. You can even build some lightweight structures like a playground or a shed. If you give up on the pool idea, decide to sell the house, or use the land differently, you can easily do so at no additional cost. If you decide to build the pool and are ready for the project end-to-end, only then do you dig the hole. This time, you will know exactly what hole to dig and where.

Software is not much different. Integrating frameworks and adding unnecessary abstractions or superfluous extensibility because we might need them in the future is like digging a pool hole for a future pool. It makes code hard to understand and modify and creates a huge maintenance burden. It is a source of bugs, on-call toil, and performance issues. Removing gets harder with each incoming change. And when you finally try to use this code (if it ever happens) you will find it impossible. The assumptions you made will have changed. The abstractions won’t be quite right, and the extensibility points won’t be usable.

The best way to make the code ready for the future is to keep it as simple as possible with good test coverage. Then, when you’re ready to implement your feature, you can change it safely and easily.

Originally published at https://growingdev.substack.com/p/future-proof-code

Code is tax – have a good reason to write it

I once told software engineers on my team: “We’re not here to write code”. They were flabbergasted. But I strongly believe that our job is not to write code but to solve problems. It just so happens that for many problems, code is the best, if not the only solution.

We often get this backwards. We identify as software developers, we love writing code so we will write code whether it is needed or not.

Years of experience taught me to think about a problem before writing a single line of code. I usually try to answer two questions:

  • does this problem need to be solved now or, even at all?
  • can this problem be solved without writing code?

Surprisingly often, this exercise allows me to discover alternative ways of solving a problem that are more effective than writing code.

Not all problems need to be solved

Sometimes, assessing the importance of a problem is difficult, especially when done in isolation. The issue might seem significant, prompting us to solve it with code, only to realize after the fact (or maybe even not) that this work didn’t yield any noticeable improvement.

Consider a service making blocking I/O calls. Such a service won’t scale well. However, if it receives only a few requests per minute rewriting it to improve its throughput won’t have significant impact and is not necessary.

While many such problems will never require further attention, some might need to be revisited. If the service mentioned above needs to be integrated with a system generating many requests per second, fixing the blocking I/O calls could be a top priority. So, it is important to stay on guard and address issues that were initially punted if they resurface again.

Not all problems need to be solved with code

Many problems can be solved without writing any code.

There is an entire class of problems that can be mitigated by changing the configuration of the software. For example, if your streaming service occasionally crashes due to the Out-Of-Memory exception when processing spikes of data, reducing the maximum allowed batch the service fetches for processing could be a perfectly valid solution. Processing the spikes may take a little longer, but the overall processing time may not be affected in a noticeable way and the issue is fixed without changing the code.

Often, you can solve problems making careful design choices. Take user accounts on a website, for example. When deciding how users log in, there are two common options: using their email or letting them create a unique username. Coming up with unique user names can be challenging for applications with millions of users, so many of them suggest an available user name. Using an email as the user name is an elegant way of solving the problem of unique user names without writing any code. It also simplifies the account verification and password recovery flows.

What’s the problem with writing code anyway?

In my perspective, code should be treated like a tax. Just as taxes, once code is added, it is rarely removed. Every person on the team pays this tax day in and day out because each line of code needs maintenance and is a potential source of alerts, bugs or performance issues. This is why I believe that, as software engineers, we should opt for no-code solutions when possible, and write code only when absolutely necessary.