When talking about code, we usually assume that we've got a nice, clean slate and everything is done from scratch. And this is understandable - when you're starting a new project, you've got the opportunity to do things just right, and you apply all your accumulated experience and wisdom. It's a chance to realize your vision in code.
The problem is we don't often get to work on something brand new. Most of the time, we maintain something already built before us. When we're presented with a large and unfamiliar system, we call it legacy code, and working with it always takes more effort - invention is like going downhill, analysis is like climbing uphill.
There are excellent books on dealing with existing legacy code. How does it come about, though? Can we prevent it from emerging altogether? To answer, we'll start by defining legacy code.
Definitions
Let's start with the obvious and most used definitions. Legacy code is
code that uses technology that is not supported anymore
or
code whose author isn't around.
However, the word legacy is also often used in a pejorative sense. It conveys the feeling we get when dealing with a tangled mess of lines, and if we go with just this feeling, legacy code is
code we don't want to touch
or even
code we're afraid to touch.
Such an approach might be criticized for being too subjective - just tantrums of over-sensitive people who badmouth stuff that doesn't follow the latest fad in programming. But there is an actual problem here - legacy code is
code we don't understand
and
code that is risky and/or difficult to change.
That last one sums up the economic side of things very nicely. If the technology is obsolete, trying to change something might require major upgrades and refactoring parts of the system; if the author isn't around, we might break stuff so bad we'll be in a mess for weeks. And this is also the reason why we don't want to touch the code.
That definition - code that is difficult to change - points us to a larger problem. Why do we want to change it then? Because:
Legacy code is useful code
If code is useless, it will end up in the garbage disposal - thus, it will never become legacy. On the other hand, if code is useful, people will likely be adding new functionality to it. That means more dependencies, more interconnectedness between parts of the system, more complexity, and more temptations to take shortcuts.
This kind of growth is called active rot - as people work on a system, they add entropy; it takes extreme discipline and diligence to keep reversing that entropy.
The implication is that all useful code (=all code that people work on) tends to get sucked into the state of legacy.
That is an important point. Dealing with legacy code isn't just a one-time thing you do when you inherit someone else's code base. Unless you take specific measures to prevent it, the code you and your team have written will also become legacy.
And if we take that point of view, we can no longer explain away the problems with legacy by calling the author an idiot or saying we can't talk to them directly. So, how do we prevent our code from becoming an unchangeable mess?
Here, another definition of legacy code can help us, one provided by Michael Feathers:
Legacy code is code without tests
Isn't that a bit too specific? We were just talking about general stuff like the cost of change.
Well, tests prevent your code from becoming legacy - provided they've been properly written. Let's explore how they do it and how to write them.
Dealing with creeping legacy
OOP, functional programming, and everything in between teach us how to avoid spaghetti code. The general principles here are well-known: promote loose coupling and high cohesion, and avoid redundancy; otherwise, you'll end up with a bowl of pasta.
Spaghetti code means any attempt to change something causes cascading changes all over the code, and high cost of change means legacy code.
The point we want to make is that tests provide an objective indicator of how well you adhere to those well-known best practices.
It makes sense, right? To test a piece of code, you need to run it in isolation. This is only possible if that piece is sufficiently modular. If your code has high coupling, running a part of it in a test will be difficult - you'll need to grab a lot of stuff from all over your code base. If code has low cohesion, then tests will be more expensive because each tested piece will have different dependencies.
If object-oriented programming is your jam, you know the value of polymorphism and encapsulation. Well, both of those enhance testability. Encapsulation means it's easier to isolate a piece of code; polymorphism means you're using interfaces and abstractions, which are not dependent on concrete implementations - making it easier to mock them.
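To make that concrete, here's a minimal Python sketch (the `RateProvider` interface, the `convert` function, and the `FixedRates` double are all invented for illustration). Because the function depends on an abstraction rather than a concrete service, testing it needs no network and no mocking framework:

```python
from typing import Protocol


class RateProvider(Protocol):
    """An abstraction: callers depend on this interface, not a concrete service."""

    def rate(self, currency: str) -> float: ...


def convert(amount: float, currency: str, provider: RateProvider) -> float:
    """Convert an amount to USD using whatever rate provider is injected."""
    return amount * provider.rate(currency)


class FixedRates:
    """A hand-rolled test double - no network calls, no real exchange service."""

    def __init__(self, rates: dict[str, float]) -> None:
        self._rates = rates

    def rate(self, currency: str) -> float:
        return self._rates[currency]


# Encapsulation keeps the rates behind an interface; polymorphism lets a
# test substitute a fake without touching convert() itself.
assert convert(10.0, "EUR", FixedRates({"EUR": 1.5})) == 15.0
```

In production, the same `convert` would receive a provider backed by a real service; the tests never need to know.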
The modularity of your code impacts the cost of testing it. If it is sufficiently modular, you don't need to run your entire test suite for every change; you can limit yourself to tests that cover a part of the code.
For example, if database access only happens from inside one package dedicated to that purpose, you don't need to re-run all tests for that package whenever changes happen elsewhere.
In other words, testability is an objective external indicator of code quality; writing tests makes your code more reliable, modular, and easy on the eyes. If you want an example in code, we've written in depth on this subject elsewhere.
Dealing with knowledge problems
Rotating developers
The turnover rate of software engineers is a significant problem - they rarely keep one job for more than two years. This means the code's author often isn't around to talk to; they usually don't experience the consequences of their design decisions, so they don't see how a model performs outside an ideal scenario.
There's a vicious cycle going on. An engineer arrives at a greenfield project or convinces people that the current system just won't do: we absolutely have to rewrite it from scratch. That engineer is smart, so they don't want to work with the legacy of the half-brained individual who worked on the code before them.
Two years pass, the project gets just as twisted and crooked as the one before it, the engineer leaves for a higher-paying position, and when a new programmer arrives, all the problems become the previous idiot's fault. Again. This cycle is something we've witnessed many times.
Dungeon masters
Things might take a different road, and the author of the new system - usually a genuinely talented developer - might turn into a dungeon master (Alberto Brandolini's term). In this case, the developer stays, and only they know the system in and out.
The result is that any change in the system goes through that single developer. They are always the bottleneck; consequently, change is costly, and processes are mostly manual.
The fault is not with that person; they might not want to be in that position. The problem is systemic, and paradoxically, it's the same problem as with rotation: knowledge about code is only in its creator's head.
Spreading knowledge around
How do we get knowledge about code out into the world? We use tools such as code reviews, comments, documentation - and tests. Actually, tests are a great way of documenting your code.
Python's docstring conventions (PEP 257) say that a docstring for a function or method should describe what we should feed it, what we should get from it, its side effects, exceptions, and restrictions. This is more or less what you can read from a well-written arrange-act-assert test, provided the assert is informative.
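As a sketch of what that looks like in practice (the `apply_discount` function is hypothetical), a well-written arrange-act-assert test reads much like such a docstring - inputs, outputs, and restrictions are all spelled out:

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_apply_discount_reduces_price():
    # Arrange: what we feed the function
    price, percent = 200.0, 15.0
    # Act: the call under test
    result = apply_discount(price, percent)
    # Assert: what we should get back from it
    assert result == 170.0


def test_apply_discount_rejects_invalid_percent():
    # Restrictions and exceptions are documented by tests, too
    try:
        apply_discount(100.0, 150.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")
```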
Java's guidelines for Javadoc distinguish between two types of documentation: programming guides and API specs. The guides can contain quite a bit of prose; they describe concepts and terms, but they're not what we're looking for here. The API specs, however, are similar in purpose to tests - they are even supposed to contain assertions.
That does not mean you can do away with documenting altogether and just rely on tests (at the very least, you'll want your tests to be documented, too). If you're asking yourself why some gimmick was used in a function, you'll need documentation. But if you want to know how that function is used, a test will provide an excellent example, simulate all dependencies, etc.
In other words, if the test base for your code is well-written (if the author isn't just testing what is easiest to test), then the code is much easier to understand because you've got a whole bunch of examples of how it should be used. Thus, it won't become legacy if the person who knows the code leaves.
Dealing with obsolete technology and changing requirements
No matter how well-written your code is or how good the documentation is, sooner or later you'll need to migrate to a new technology, the requirements will change, or you'll want to refactor everything. XKCD has the truth of it: all code is rewritten at some point. And again, this is where tests can make your life easier.
There's this thing called Ilizarov surgery, in which a metal frame is installed around a damaged limb; the frame both preserves the limb and helps direct the growth of the fractured bones.
Essentially, that's what tests are. They fix the system in a particular state and ensure that whatever break you make inside doesn't affect functionality. With that external skeleton, you can safely improve design without changing behavior - i.e., do refactoring, as Michael Feathers defined it.
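A minimal sketch of that frame in Python (the `slugify` function is invented for illustration): before touching a convoluted implementation, pin its current behavior with assertions, then refactor freely for as long as they stay green.

```python
def slugify(title: str) -> str:
    """Convoluted implementation we want to refactor."""
    out = ""
    for ch in title.lower():
        if ch.isalnum():
            out = out + ch
        elif ch == " " and not out.endswith("-"):
            out = out + "-"
    return out.strip("-")


def test_slugify_behavior_is_fixed():
    # These assertions fix the system in its current state;
    # whatever we do to the internals, they must keep passing.
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Legacy   Code!  ") == "legacy-code"
```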
When you introduce a new technology, tests allow you to transition. When requirements change, tests preserve the rest of the functionality that hasn't changed and make sure that whatever you do to the code doesn't break what's already working.
Big changes will always be painful, but with tests and other supporting practices, the pain is manageable, and your code doesn't become legacy.
What this all means for writing tests
So, your tests can do many things: they push you to write cleaner code, present examples of how the code is supposed to be used, and serve as a safety net in transitions. How do you write such tests?
Testing should be a habit
First and foremost, writing tests should be a habit. If you wrote a piece of code five months ago, and a tester tells you today it lacks testability - that's just an annoyance. Now you've got to weigh your headache from digging into old code against their headache from making it testable.
On the other hand, if you're covering your code with tests right away, it's all fresh in your mind, so zero headaches. And if you're doing this as a habit, you'll write testable code from the get-go, no changes necessary.
Run tests often
If your tests are to serve as an Ilizarov frame for changes, you'll need to be able to run your tests often.
First and foremost, this means having a good unit test base that you can run at the drop of a hat. Ideally, it means having a proper TestOps process where all tests can be run automatically.
Running tests as a habit goes hand in hand with covering code with tests as a habit, and it means it's easier to localize errors.
Have lightweight tests
Of course, your tests need to run quickly for that to work. In addition to hardware requirements, this also means you should keep down the size of the test base. Here is where code modularity also helps, as it allows running just a part of your code base, provided that you're certain that changes won't propagate beyond that part.
What also helps is keeping tests at their appropriate level of the pyramid - don't overcrowd the e2e level. It is only for testing business logic, so move everything lower if possible. If something can be run as a unit test without invoking costly dependencies, do that. We've shown how to write lightweight unit tests elsewhere.
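For instance (a sketch with a made-up validation rule): instead of driving the whole application end-to-end to check input handling, extract the rule into a pure function and cover it with cheap unit tests.

```python
def is_valid_username(name: str) -> bool:
    """A pure validation rule - no web server, no database, no browser."""
    return 3 <= len(name) <= 20 and name.isalnum()


def test_username_rules():
    # Milliseconds to run, trivial to debug when it fails
    assert is_valid_username("alice")
    assert not is_valid_username("ab")         # too short
    assert not is_valid_username("a" * 21)     # too long
    assert not is_valid_username("bad name!")  # non-alphanumeric
```

The e2e suite then only needs one happy-path check that the rule is wired in, not a test for every corner case.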
Together, these things mean that a developer has to have QA as part of their work cycle: keep tests in mind when writing code, follow the code with tests, and run the test base after finishing a chunk of work.
Cover the entire pyramid
It's also important that you've got the entire pyramid of tests because its different levels reinforce each other.
Generally speaking, if you need the tests to be this frame that helps your code heal after a break, then the e2e tests are the ones you'll rely on most because unit tests tend to die along with the old code you're replacing.
However, if you're refactoring the front end, the unit tests and other tests on the back end will still be of use, and writing new e2e tests will be easier if you use the old ones as a guideline instead of inventing everything from scratch.
Think about readability
Finally, if the test base is to provide information about the purpose of your code, then the assertion statements should be informative - as precise as possible.
Generally speaking, keep in mind that readability should be at the forefront of your concerns. Your tests will be read far more often than they are written. And they will likely be read by people other than you - people who have to read many other tests once they're finished with this one.
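A quick sketch of the difference (the `cart_total` function is made up): a vague assertion tells a future reader nothing, while a precise one doubles as documentation.

```python
def cart_total(subtotal: float, tax_rate: float) -> float:
    """Hypothetical function: subtotal plus tax, rounded to cents."""
    return round(subtotal * (1 + tax_rate), 2)


# Vague - passes for almost any value, and a failure explains nothing:
assert cart_total(100.0, 0.07) > 0

# Precise - states the expected value and the intent behind it:
assert cart_total(100.0, 0.07) == 107.0, \
    "total should be subtotal (100.00) plus 7% tax"
```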
Conclusion: use the right tools
The traditional approach to testing is to change something and then poke around to see if everything is all right. The more you poke, the more you care. But preventing creeping system rot is not so much about care as about using the right tools. You need a test base that
- covers every module (rather than being haphazard) and promotes modularity
- informs you about the function of each module
- can be run on demand to ensure the changes you're introducing don't change functionality
Legacy is a team problem
Legacy code is a problem that arises when a team has to deal with code that was written "selfishly" - without following best practices regarding documentation, comments, tests, etc. How a team implements the best practices we've just discussed is also important.
Writing tests should be accepted practice and part of the deadlines
As well as being a personal habit for developers, writing tests should also be accepted practice at the organization level. In particular, the definition of done when setting deadlines should include documentation and tests.
A deadline often means two completely different things for the programmer and the manager. The manager expects something polished and shiny; the programmer is happy if they can run it once and it doesn't crash. Will tests be part of their expectations?
The manager might also be afraid to bother the programmer too much, or they might think the programmer's time is more valuable than the tester's, so let's not waste it by making programmers write tests.
If the testers also live by the same deadline, then 1 month of developing + 1 month of testing will usually turn out to be 1.5 months of developing and then overtime for testers.
Whatever the reason is, the only way you get a stable workflow without crunch time is when deadlines are set with the knowledge that code must be shipped along with tests and documentation, maybe even Architectural Decision Records (ADRs) and Requests for Comments (RFCs) - those are excellent at conveying the context and backstory to the future reader of the code.
Without these tools, even the author will forget their code once a few months have passed, and the code will become legacy.
Testers and devs should talk to each other
Promoting testability is important for testers, but the ones who can improve it are developers. If the two departments are separate worlds, testing will always be a chore for developers, something that happens once the real work is done. And code is going to be a foreign country for testers. Everyone annoys each other, nobody is happy, and tests do little to prevent system rot.
The thing is, developers do care about writing modular code. There is no fundamental conflict here. It's just that testers need to know how to "speak code," developers need to write tests, and both of them need to talk to each other.
It's important to share expertise and have software and QA engineers participate in decisions. This can be done through code reviews and meetings - you can learn each other's practices and things that annoy the other team.
There are also the RFCs we've just talked about - as a dev, write down what you're planning to do before you do it, and request comments from other teams; if there is no issue, this won't take up much time, but if QA has other ideas, you might want to hear them before getting elbow-deep in code. In any case, you'll have a record of your thought process for the future, making your code less likely to become legacy.
Poor team communication leads to a high cost of change
Having a well-structured pyramid of tests requires all the things mentioned before.
Let's talk about excessive test coverage. Of course, an individual programmer or tester can write too many tests all by themselves - but very often, excessive coverage masks poor team structure and a lack of expertise sharing.
Say a developer has written a new piece of code. They'll also write unit tests for that code if they're diligent. Then, the testers will add integration and API tests on top of that. The unit test might have already checked all the corner cases and such in the new function, but the tester doesn't know about that, so they will redo all that work in their own tests.
Then, as a final layer, you've got the end-to-end tests, which call everything from start to finish. Ultimately, the function gets called a dozen times every test run.
Now, what if we need to change that function? If its correct behavior changes, each of the dozen tests that cover it must also be changed. And, of course, you'll forget one or two of them; those tests will fail, and the tester who gets the failing result probably won't know why - and so on.
In addition, if there is a separate team for test automation that also doesn't know anything about what the other teams are doing, the problems multiply even more.
A personal bugbear of ours is tickets that say - hey, we've got some changes in the code, so please change the tests accordingly. And you'll be lucky if the ticket arrives before the test suite has run and failed because of the changes.
This example shows how poor team structure and lack of visibility into each other's work can lead to cumbersome code (in this case, tests) that increases the cost of change.
Conclusion
Legacy code isn't a result of some accidental event, like a programmer leaving or technology becoming deprecated.
A lot of software starts in the head of one individual; if it's used, it inevitably outgrows that head. A large enterprise or public project might have millions of lines of code (e.g., in 2020, the size of Linux's kernel was 27.8 million lines of code). It doesn't matter how big your head is or how much you care; no single programmer can ever keep that much knowledge in.
Legacy code is what you get when the hobby project approach is applied to large systems. The only way such a system doesn't collapse is if you use tools to control it and share your knowledge with your peers. Automated tests are among the best tools for that purpose.
A well-written test base will promote sound code structure, keep a record of how to use the code, and help you improve design without affecting functionality. All of this helps achieve one thing: reducing the cost and risk of change, thus preventing your code from becoming legacy.
Providing such a test base isn't a job for one person (although individual responsibilities and habits are important building blocks). It is the team's responsibility and should be part of deadline estimations. Finally, the team should have clear communication channels; they should be able to talk about the code with each other.
That is how code can make a healthy journey from a vision in its creator's head to a large system supported by a team or a community.