Top 5 Unit Test Problems That Haunt Software Developers

Well-written unit tests are one of the most effective tools for ensuring product quality. Unfortunately, not all unit tests are well written, and the ones that are not are often a source of frustration and lost productivity. Here are the most common unit test issues I encountered during my career.

Flaky unit tests

Flaky tests pass most of the time, but not always. They may randomly fail even though no code has changed. The quickest and most common “fix” developers employ is to re-run them. With time, the number of flaky tests grows, and even multiple re-runs are insufficient.

Flaky tests are caused primarily by the following:

  • shared state
  • dependency on external systems

A shared state is the number one cause of test flakiness. Static variables could be one example. If one test sets a static variable and another passes only if this variable is set, the second test will fail if the order of execution changes.

Debugging flakiness caused by shared state is usually tricky because sharing state is rarely intentional.

Tests that depend on external systems tend to be flaky because the systems they rely on are outside their control. Any deployments, crashes, or throttling will cause test failures. Network, which is inherently unreliable, is yet another contributor. The best fix is to mock external dependencies.

Multithreaded applications deserve special mention. Race conditions in the product code could make tests for these applications flaky, and finding the root cause is often challenging.

Slow tests

Slow tests are a productivity killer. If running tests for a code change takes more than a few seconds, developers will use it as an excuse to find a distraction.

One of the most common reasons tests are slow is their dependency on external systems: network calls and the time to process the requests initiated by tests add up.

But tests that depend on external systems are also flaky, so slowness and flakiness go hand-in-hand.

Again, mocking external dependencies is the best fix to make tests fast and reliable.

If relying on external systems is intentional (e.g., end-to-end testing), it is worth separating end-to-end tests into a dedicated suite executed separately, for instance, as part of the nightly build.

I was once on a team where running all the tests took more than two hours because most of them communicated with a database. These tests were also flaky, so merging more than one Pull Request a day was virtually impossible.

Bugs in unit tests

Tests are there to ensure the quality of the product, but nothing is there to ensure the quality of tests. As a result, tests may fail to do their job due to bugs. Unfortunately, identifying these bugs is not easy. Paying attention can help. For instance, if all tests continue to pass after changing the product code, it usually indicates either bugs in tests or missing test coverage.

Hard to maintain tests

Tying tests and implementation details closely usually causes numerous test failures after even simple product code changes. Keeping tests focused on functionality instead of on the implementation can significantly reduce the number of unnecessary test failures.

Writing “tests” only to hit the code coverage number

Test code written solely to meet code coverage goals is usually low quality. Assertions in such code are often missing because they don’t contribute to the coverage goal but can cause failures. Test coverage reported by tools can make the manager look good, but this test code is useless as it can’t prevent bugs. What’s worse, the high coverage hides areas that do need attention.

This is my list of the top 5 unit test issues. What’s yours?

“Think big” or make progress?

I am being told to “think big.”

But I don’t know what this means.

And I doubt that most people who tell others to think big can do this themselves.

It is easy to come up with big ideas that are not realistic. “Inhabit Venus” sounds like a big idea, but I can do nothing meaningful to implement it. Finding big ideas that are also realistic is hard.

Big ideas

For the sake of the argument, let’s define a Big Idea as follows:

An idea is big if its implementation spans multiple teams or requires substantially altering business-critical systems. By definition, implementing such an idea takes quarters or even years to complete.

Given the risk and the funding Big Ideas require, pitching them is not easy. In many cases, even Staff Software Engineers do not have enough credibility to get such an idea funded. Instead, it takes a team of Product Managers, Engineering Managers, and Software Engineers to prepare and propose the idea.

Big Ideas promise high rewards but are inherently risky. Many won’t bring the promised benefits, and some will flop completely. Because of how long it takes to implement a Big Idea, not everyone who started it will be there to witness its completion and get the reward.

Earlier in my career, I spent most of my time trying to find the “big thing.” I was not successful. I was obsessed with “thinking big” but couldn’t come up with anything. It took me a while to realize that time was passing, but my career was not progressing. This realization led me to an alternative path.

If not Big Ideas, then what?

When I found that trying to “think big” was not helping my career, I shifted my focus to finding and solving problems I and my team or users faced. No problem was too small. I simplified code that was hard to modify, added missing test coverage, or sped up the build. These were simple changes that I could tackle immediately, and they usually didn’t take more than a day to finish. But they helped push my career on a growing trajectory. Here are the most important reasons why:

  • I learned how to take the initiative and propose improvements no one asked me to do
  • I got better at identifying problems
  • I improved life for our users, my team, and myself

The nice thing about these small bets is that they can boost a career even though they are easy to find and not risky:

  • due to the smaller scope, they are much easier to complete
  • if something goes wrong, it usually is not a big deal
  • delivering them consistently helps build the credibility
  • they sometimes lead to bigger ideas

Some of these small ideas occasionally had a much bigger impact than I expected. Here is one example. In response to an incident, I needed to write a tool that inspected data in our data store and deleted stale records. The store my team used was one of the most common infra used by many teams across the company. When researching how to complete my task, I noticed that some other teams already built similar one-off solutions. Instead of creating another specialized tool, I wrote a framework to make these tools rapidly. Because my framework saved weeks of development time, a few teams adopted it. I moved to a different team a few years ago, but my framework is still in use. Some developers even extended it to handle more scenarios.

I observed that, with time, my ideas started to grow. I started noticing bigger problems that needed more time and people to be solved. They still are not in the “Big Ideas” category, but many are noticeable as they span multiple teams.

Parting words

Please note that I am not saying never to “think big.” If you can propose or help drive a “Big Idea,” by all means, do so. But don’t dismiss smaller problems, especially if your “Big Idea” is not quite there yet. At the end of the day, when the review time comes, it’s always better to show a few completed small ideas than a “Big Idea” that hasn’t or couldn’t be implemented.

Do Unit Tests Find Bugs?

I’ve been writing software for over 20 years and don’t believe unit tests find bugs.

Yet, I wouldn’t want to work in a code base without unit tests.

Why unit tests don’t find bugs?

To understand why unit tests don’t find bugs, we can look at how they are created. Here are the three main ways to handle unit tests:

  • developers write the tests along with writing the code
  • Test Driven Development (TDD)
  • unit tests are considered a waste of time, so they don’t exist

When the same software developer writes unit tests and code simultaneously, the tests tend to reflect closely what the code does. Both tests and code follow the same logic, stemming from the same understanding of the problem. As a result, the tests won’t find major implementation issues. If they find small typos or bugs, it’s usually only by chance.

Test-driven development calls for writing unit tests before implementing product changes. Because no product code exists, the unit tests are expected to fail initially or even not compile. The goal is to write product code to make the tests pass. In TDD, new unit tests are added mostly to drive the implementation of new scenarios. An unsupported scenario could be considered a bug, but it’s far-fetched. As a result, TDD rarely finds existing bugs.  

If unit tests don’t exist, they cannot find any bugs.

If unit tests don’t find bugs, why do we write them?

While unit tests are not great at finding bugs, they are extremely effective at preventing new ones. Unit tests pin the program’s behavior. Any change that visibly modifies this behavior should make the tests fail. The developer whose changes caused the failures should examine them and either fix the tests—if the change in the behavior was intentional—or fix the code. Many test failures indicate assumptions that the developer unknowingly broke. Without tests, they would turn into customer-impacting bugs.

Other important advantages of unit tests include:

  • Documentation – comprehensive unit tests can serve as product specification
  • More modular and maintainable code – writing unit tests for tightly coupled code is difficult. Unit tests drive writing more modular and loosely coupled code because it is much easier to test.
  • Automated testing – unit tests are much faster to run and more comprehensive than testing changes manually.

If unit tests don’t find bugs, what does?

There are many ways to find bugs in the code. Integration testing, fuzz testing, and stress testing are just some examples. However, the three below are my favorite because they require little to no additional effort from the developers:

  • Exploratory testing: Try using the product you’re working on. See what happens if you combine a few features or try less common scenarios.
  • Code reviews: One weakness of unit tests is that they are implemented with the same perspective as the code. Code reviews offer the ability to look at the change from a different angle, which often leads to discovering issues.
  • Paying attention: Whenever you code, debug, or troubleshoot an issue, have your eyes open. Many bugs are hiding in plain sight. Carefully reading error messages, logs, or stack traces can lead to identifying serious problems.

The Curious Case of Bugs that Fix Themselves

I can’t count how many times I had this conversation:
  “Good news! The bug is fixed!”
  “Did you fix it?”
  “No.”
  “Did anyone else fix it?”
  “No.”
  “How is it fixed then?”
  “I don’t know. I could reproduce it last week, but I cannot reproduce it anymore, so it’s fixed.”
  “Hmmm, can you dig a bit more to understand why it no longer reproduces?”
  [Two hours later]
  “I have a fix. Can you review my PR?”

Can bugs fix themselves?

I grow extremely skeptical whenever a fellow software developer tries to convince me that an issue they were assigned to fix magically fixed itself. Software defects are a result of incorrect program logic. This logic has to change for the defect to be fixed.

Can bugs disappear? Yes, they can and do disappear. But this doesn’t mean that they are fixed. Unless the root cause has been identified and addressed, the issue exists and will pop up again.

The most common reasons bugs disappear

Seeing an issue disappear might feel like a stroke of luck – no bug, no problem. But if the bug was not fixed, it is there – always lurking, ready to strike. Here are the most common reasons a bug may disappear:

Environment changes

A change in the environment no longer triggers the condition responsible for the bug. For instance, a bug that could be easily reproduced on February 29th is not reproducible on March 1st.

Configuration changes

The code path responsible for the bug may no longer be exercised after reconfiguring the application.

Data changes

Many bugs only manifest for specific data. If this data is removed, the bug disappears until the next time the same data shows up.

Unrelated code changes

Someone modified the code, changing the condition that triggers the bug.

Concurrency (threading) bugs

Concurrency bugs are among the hardest to crack because they can’t be reproduced consistently. Troubleshooting is difficult: even small modifications to the program (e.g., adding additional logging) can make reproducing the issue much harder, which is why concurrency bugs are a great example of Heisenbugs. And the worst part: when the fix lands, there is always the uncertainty of whether it worked because the bug could never be reproduced consistently, to begin with.

The bug was indeed fixed

A developer touching the code fixed the bug. This fix doesn’t have to be intentional – sometimes, refactoring or implementing a feature may result in deleting or fixing the buggy code path.

The bug didn’t disappear

The developer tasked with fixing the bug missed something or didn’t understand the bug in the first place. We’ve all been there. Dismissing a popup without reading what it says or ignoring an error message indicating a problem happens to everyone.

Fixing a bug can be easier than figuring out why it stopped manifesting. But understanding why a bug suddenly disappeared is important. It allows for re-assessing its severity and priority under new circumstances.

Conclusion

If nobody fixed it, it ain’t fixed.

Generating Ideas and Driving them to Completion

It is impossible to achieve a successful and fulfilling career in Software Engineering only by following someone else’s orders. A significant part of taking the lead is coming up with ideas to innovate and move the team or the company forward.

The two models of idea generation

Over the years, I’ve witnessed companies using two main ways to generate ideas: on-demand and organic.

On-demand idea generation

The on-demand idea generation model works as follows: the manager shows up out of nowhere and demands, “We need some ideas for X!” They organize a brainstorming meeting where the team members try to invent some ideas. After the meeting, everyone returns to their work, feeling they have fulfilled their idea generation duty until the next time.

My experience with on-demand idea generation has been mixed. This setting usually seeks big ideas, which are generally quite hard to come up with on the spot and under pressure. In the end, only a few ideas are proposed, and barely any are implemented.

The most important reason why on-demand idea generation is ineffective is that most interesting ideas strike at unexpected times and not during a scheduled meeting.

Organic idea generation

Organic idea generation is the opposite of the on-demand model. Here, the ideas stem from observations made when working on daily tasks:

  • writing or reviewing code
  • investigating issues reported by users
  • struggling with tools or infrastructure
  • mitigating incidents
  • discussing issues with co-workers

All these activities are great opportunities to identify problems and propose improvements.

One of the biggest advantages of organic idea generation is that it is a continuous process. As a result, it allows for the generation of many ideas.

Most ideas generated organically are small: refactor some code, add test coverage, or fix a non-critical but annoying bug. Some are medium, e.g., redesigning a component for better extensibility. Occasionally, you will stumble upon a big idea that may lead to revamping your entire architecture and unlocking previously unthinkable possibilities.

Executing ideas

Even the best idea is not worth much if not acted upon. However, careless execution may have negative consequences. For example, failing to deliver a promised feature on time due to working on unplanned and non-critical refactoring is hard to justify. Here is my approach to avoiding these problems.

I start by noting the idea in my work log. This way, I rest assured that I won’t forget about it and will consider it when planning my work for the next week. Implementing small ideas is usually a matter of finding time to work on it. If I don’t have the bandwidth, I may ask a fellow developer working on related code to pick it up or use it to ramp up a new team member.

Medium and big ideas require more thinking. When planning my week, I block a couple of hours to write a one-pager describing the idea in more detail. Writing allows me to get more clarity on my idea, understand its feasibility, and weigh the costs and benefits.

Most ideas never reach the execution stage. Some are just not great, and external circumstances may block others. Over time, when these circumstances change, an infeasible idea may become viable. I recently revived an idea I had a year ago when I learned that our partner team had fixed a long-standing issue in their system.

I share ideas I believe are worth pursuing with my team and my manager to gather feedback. Depending on this feedback, I either shelve the idea or continue working on it until completed, often with other teammates.

But there is one more step after successful implementation: spreading the word about what we did, how we did it, and who made it possible. This step is especially important for bigger projects that take a while to implement and involve other team members. Everyone who contributed deserves to get the credit.

Conclusion

The most successful software developers generate many ideas because they understand that only some will come to fruition. But ideas are only the first step. The key is execution. Successfully executing an idea, letting the right people know, and sharing the credit is a huge career booster.