Do Unit Tests Find Bugs?

I’ve been writing software for over 20 years and don’t believe unit tests find bugs.

Yet, I wouldn’t want to work in a code base without unit tests.

Why unit tests don’t find bugs?

To understand why unit tests don’t find bugs, we can look at how they are created. Here are the three main ways to handle unit tests:

  • developers write the tests along with writing the code
  • Test Driven Development (TDD)
  • unit tests are considered a waste of time, so they don’t exist

When the same software developer writes unit tests and code simultaneously, the tests tend to reflect closely what the code does. Both tests and code follow the same logic, stemming from the same understanding of the problem. As a result, the tests won’t find major implementation issues. If they find small typos or bugs, it’s usually only by chance.

Test-driven development calls for writing unit tests before implementing product changes. Because no product code exists, the unit tests are expected to fail initially or even not compile. The goal is to write product code to make the tests pass. In TDD, new unit tests are added mostly to drive the implementation of new scenarios. An unsupported scenario could be considered a bug, but it’s far-fetched. As a result, TDD rarely finds existing bugs.  

If unit tests don’t exist, they cannot find any bugs.

If unit tests don’t find bugs, why do we write them?

While unit tests are not great at finding bugs, they are extremely effective at preventing new ones. Unit tests pin the program’s behavior. Any change that visibly modifies this behavior should make the tests fail. The developer whose changes caused the failures should examine them and either fix the tests—if the change in the behavior was intentional—or fix the code. Many test failures indicate assumptions that the developer unknowingly broke. Without tests, they would turn into customer-impacting bugs.

Other important advantages of unit tests include:

  • Documentation – comprehensive unit tests can serve as product specification
  • More modular and maintainable code – writing unit tests for tightly coupled code is difficult. Unit tests drive writing more modular and loosely coupled code because it is much easier to test.
  • Automated testing – unit tests are much faster to run and more comprehensive than testing changes manually.

If unit tests don’t find bugs, what does?

There are many ways to find bugs in the code. Integration testing, fuzz testing, and stress testing are just some examples. However, the three below are my favorite because they require little to no additional effort from the developers:

  • Exploratory testing: Try using the product you’re working on. See what happens if you combine a few features or try less common scenarios.
  • Code reviews: One weakness of unit tests is that they are implemented with the same perspective as the code. Code reviews offer the ability to look at the change from a different angle, which often leads to discovering issues.
  • Paying attention: Whenever you code, debug, or troubleshoot an issue, have your eyes open. Many bugs are hiding in plain sight. Carefully reading error messages, logs, or stack traces can lead to identifying serious problems.

The Curious Case of Bugs that Fix Themselves

I can’t count how many times I had this conversation:
  “Good news! The bug is fixed!”
  “Did you fix it?”
  “No.”
  “Did anyone else fix it?”
  “No.”
  “How is it fixed then?”
  “I don’t know. I could reproduce it last week, but I cannot reproduce it anymore, so it’s fixed.”
  “Hmmm, can you dig a bit more to understand why it no longer reproduces?”
  [Two hours later]
  “I have a fix. Can you review my PR?”

Can bugs fix themselves?

I grow extremely skeptical whenever a fellow software developer tries to convince me that an issue they were assigned to fix magically fixed itself. Software defects are a result of incorrect program logic. This logic has to change for the defect to be fixed.

Can bugs disappear? Yes, they can and do disappear. But this doesn’t mean that they are fixed. Unless the root cause has been identified and addressed, the issue exists and will pop up again.

The most common reasons bugs disappear

Seeing an issue disappear might feel like a stroke of luck – no bug, no problem. But if the bug was not fixed, it is there – always lurking, ready to strike. Here are the most common reasons a bug may disappear:

Environment changes

A change in the environment no longer triggers the condition responsible for the bug. For instance, a bug that could be easily reproduced on February 29th is not reproducible on March 1st.

Configuration changes

The code path responsible for the bug may no longer be exercised after reconfiguring the application.

Data changes

Many bugs only manifest for specific data. If this data is removed, the bug disappears until the next time the same data shows up.

Unrelated code changes

Someone modified the code, changing the condition that triggers the bug.

Concurrency (threading) bugs

Concurrency bugs are among the hardest to crack because they can’t be reproduced consistently. Troubleshooting is difficult: even small modifications to the program (e.g., adding additional logging) can make reproducing the issue much harder, which is why concurrency bugs are a great example of Heisenbugs. And the worst part: when the fix lands, there is always the uncertainty of whether it worked because the bug could never be reproduced consistently, to begin with.

The bug was indeed fixed

A developer touching the code fixed the bug. This fix doesn’t have to be intentional – sometimes, refactoring or implementing a feature may result in deleting or fixing the buggy code path.

The bug didn’t disappear

The developer tasked with fixing the bug missed something or didn’t understand the bug in the first place. We’ve all been there. Dismissing a popup without reading what it says or ignoring an error message indicating a problem happens to everyone.

Fixing a bug can be easier than figuring out why it stopped manifesting. But understanding why a bug suddenly disappeared is important. It allows for re-assessing its severity and priority under new circumstances.

Conclusion

If nobody fixed it, it ain’t fixed.

“This code is s**t!” and Other Mistakes Code Reviewers Make

Code reviews are a standard practice in software development. Their purpose is to have another pair of eyes examine the code to catch issues before they affect users and to provide feedback that helps make the code cleaner and easier to understand.

While the goals of code reviews are noble, the experience for many developers is often less than stellar. There are many contributing factors, but one that stands out is how code reviewers approach and conduct code reviews.

Here are the top 5 mistakes I’ve seen code reviewers make (and I had made myself) that left fellow, often junior, developers discouraged and frustrated with the code review process.

Approving a PR without understanding the change

Many developers approve PRs very quickly, looking at the code only barely or not at all. While the speed matters, this attitude leads to problems:

  • bugs that could be identified during code reviews are missed, reach production, and impact users
  • the quality of the code deteriorates over time, making implementing new features more challenging
  • both the code reviewer and the author miss the opportunity to learn something new from the PR

Proper code review requires time, effort, and, often, additional context. I sometimes realize that a change I started reviewing requires more time than I can afford or that I don’t have enough context to understand it fully. If this happens, I will still review the change as best as I can, but I will let the author know that I can’t sign off on it.

“If you approve it, you’re responsible for it” was one piece of advice I received that completely changed my perspective on carelessly approving PRs.

Unprofessional feedback

Code reviews should be all about code. Sadly, they sometimes become personal attacks with harsh or condescending comments. This kind of “feedback” usually extends beyond code reviews and leads to a toxic team culture.

Even if the code sent for review has multiple issues, comments like: “this code is s**t!” are not helpful. Explaining the problems and suggesting solutions is a much more effective approach. Talking to the author is even more effective.

Too much focus on less important details

Flooding a PR with nitpicky comments is not good feedback. Not only is it borderline passive-aggressive behavior, but these comments can also drown out ones that raise important issues.

One great example is comments about code formatting. Asking the author to adhere to the Coding Style Guidelines adopted by the team is one thing, but commenting on each single incorrect indentation or misplaced parenthesis is not OK. Fortunately, this entire class of arguments can be easily avoided by integrating a code formatting tool. Not only will the tool end the petty arguments about code formatting, but it will also allow developers to focus on what’s important.

(I wrote about this in more detail here: The downsides of an inconsistent codebase and what you can do about it.)

Unclear or unactionable feedback

Comments like: “I am sure it can be done better” are not useful. They leave the author clueless about the reviewer’s expectations and the improvements they expect. In the best case, the author will ignore the feedback. In the worst case, they will try guessing what the reviewer meant and iterate on the code, often unnecessarily.

From my experience, illustrating comments with code suggestions is one of the clearest and most effective code review feedback.

Delaying code reviews

One of the most common complaints about code reviews is that they significantly slow software development. The most common reasons are:

  • reviewers are not picking up PRs for review
  • reviewers are not responding after the author addressed the feedback
  • changes are flooded with comments on minor issues, and resolving them requires many iterations

Assuming a PR is not intentionally blocked due to a serious concern, delaying reviewing it can be frustrating for the author. Often, reviewers are simply busy with their work and don’t have time to review someone else’s changes. But this is a double-edged sword—eventually, they will want someone to review their changes, and they shouldn’t expect quick reviews if they don’t review PRs promptly.

Sometimes, it is a matter of being better organized. Blocking half an hour daily on the calendar for code reviews should help team members move faster.

That being said, if your PRs are not getting reviewed, there may be something you can do about this. Check out my post here: 7 Tips To Accelerate Your Code Reviews.

Use Test Plans to become a more effective Software Developer

Shipping high-quality software is the responsibility of each software developer. Not only are the days when handing untested code to the QA team for validation a common practice long gone, but many companies have also moved away from QA-based testing, making developers fully own the quality of the product. This creates a problem: how can you ensure developers do their due diligence and validate their changes? At Meta (a.k.a. Facebook), we use “test plans.”

What are test plans?

Test plans describe how authors tested their changes. They are an integral part of Meta’s code review process: the code review tool does not allow submitting code for review if the test plan is empty.

While it is common for test plans to say: “unit tests,” many are much more interesting and often include:

  • screenshots showing the UI before and after the change
  • videos showing the change in action
  • API requests and corresponding responses (e.g., JSON payloads)
  • dashboard snapshots from canary runs
  • terminal printouts
  • Funny memes – especially if the testing was not comprehensive or applicable (e.g., auto-generated code, changes to unit tests only, etc.)

Benefits of requiring Test Plans in commits

The most obvious benefit of requiring test plans is forcing developers to think about validating their changes. While not all test plans are comprehensive, most are decent.

Occasionally, test plans reveal that the code change does not work as the author intended. I once proudly included a graph hovering a little below 100% as my test plan, only for a reviewer to point out that my graph represented not the success rate, as I claimed, but the error rate.

However, one often overlooked benefit of requiring test plans is that they can be lifesavers when working with unfamiliar code.

Imagine you are tasked with building a new feature in a mobile app. While working on the feature, you discover that the API powering your app doesn’t return all the necessary information. You can ask the API team to add it, but they may not be able to accommodate your request on short notice. Perhaps it would be faster if you implemented this change yourself. The only problem is that you are not familiar with the backend code. You don’t know how to test it to ensure you didn’t break anything. In these situations, test plans can come in extremely handy. You can check the commit history of the code you want to change and see how developers who regularly contribute to it test it. In addition, checking past test plans may help you discover edge cases you need to consider. I successfully used this strategy multiple times to change an unfamiliar codebase I would otherwise be afraid to touch.

“My unit test coverage is 100%”

Test plans should not be considered a replacement for unit tests. As great as unit testing is, it is not always sufficient. Additional end-to-end validation helps confirm that the changes worked as intended outside of the isolation provided by unit tests and that they didn’t introduce unwanted behavior. I have seen (and caused) situations where my application wouldn’t start even though all unit tests were passing.

Call To Action

Including test plans in pull requests is not a common practice, let alone a requirement, in most companies or teams. Despite that, I encourage you to follow this practice. Your team members will notice it and may start doing the same if they find it useful. And even if they don’t, you will still benefit from this habit. Going the extra mile can help you find issues before they impact users. With time, it will get easier because you can reuse past test plans to validate some of your current changes. You may need to tweak them a little, but you won’t have to start from scratch each time.

Code is tax – have a good reason to write it

I once told software engineers on my team: “We’re not here to write code”. They were flabbergasted. But I strongly believe that our job is not to write code but to solve problems. It just so happens that for many problems, code is the best, if not the only solution.

We often get this backwards. We identify as software developers, we love writing code so we will write code whether it is needed or not.

Years of experience taught me to think about a problem before writing a single line of code. I usually try to answer two questions:

  • does this problem need to be solved now or, even at all?
  • can this problem be solved without writing code?

Surprisingly often, this exercise allows me to discover alternative ways of solving a problem that are more effective than writing code.

Not all problems need to be solved

Sometimes, assessing the importance of a problem is difficult, especially when done in isolation. The issue might seem significant, prompting us to solve it with code, only to realize after the fact (or maybe even not) that this work didn’t yield any noticeable improvement.

Consider a service making blocking I/O calls. Such a service won’t scale well. However, if it receives only a few requests per minute rewriting it to improve its throughput won’t have significant impact and is not necessary.

While many such problems will never require further attention, some might need to be revisited. If the service mentioned above needs to be integrated with a system generating many requests per second, fixing the blocking I/O calls could be a top priority. So, it is important to stay on guard and address issues that were initially punted if they resurface again.

Not all problems need to be solved with code

Many problems can be solved without writing any code.

There is an entire class of problems that can be mitigated by changing the configuration of the software. For example, if your streaming service occasionally crashes due to the Out-Of-Memory exception when processing spikes of data, reducing the maximum allowed batch the service fetches for processing could be a perfectly valid solution. Processing the spikes may take a little longer, but the overall processing time may not be affected in a noticeable way and the issue is fixed without changing the code.

Often, you can solve problems making careful design choices. Take user accounts on a website, for example. When deciding how users log in, there are two common options: using their email or letting them create a unique username. Coming up with unique user names can be challenging for applications with millions of users, so many of them suggest an available user name. Using an email as the user name is an elegant way of solving the problem of unique user names without writing any code. It also simplifies the account verification and password recovery flows.

What’s the problem with writing code anyway?

In my perspective, code should be treated like a tax. Just as taxes, once code is added, it is rarely removed. Every person on the team pays this tax day in and day out because each line of code needs maintenance and is a potential source of alerts, bugs or performance issues. This is why I believe that, as software engineers, we should opt for no-code solutions when possible, and write code only when absolutely necessary.