If you’re a member of the tech industry then you’ve probably had to work with legacy code: those ancient systems that just hold everything together. Not every project is greenfield, so “Working Effectively with Legacy Code” by Michael Feathers, a book with a strong reputation, seemed like a good source of insight into how we could improve our relationship with these necessary systems and provide a better service to our customers. This blog post shares some of the key takeaways from the book.
What is Legacy Code?
In a world where continuous delivery is on every company’s radar, it’s important to accept that legacy code exists and you’re going to be required to bring it into the fold. One of the key aspects of a successful continuous delivery model is the feedback loop. Your developers and your business build confidence in your ability to make changes through testing. Static analysis, unit testing, integration testing, and performance testing make all the difference between a continuously deployed application and a high-risk, rarely changed legacy codebase.
To quote the book, legacy code can be simply defined as ‘code without tests’. On its surface this definition may seem too simple, lacking the nuance that comes with the anxiety of working with legacy code. Underneath, however, it points to a practical way to shed some of the negativity that comes with legacy code: bring it under test.
The transformation of untested legacy code into a low-risk, well-understood codebase is the focus of this book.
Making Changes & Minimizing Risk
“Anyone can open a text editor and spew the most arcane non-sense into it.”
Many of the changes to software can be simplified to: understanding the design, making changes and poking around to ensure you didn’t break anything. It’s not a very effective way to manage the risk that is associated with code changes. As developers, we all want to follow best practice: to write clean, well-structured and heavily tested code. Unfortunately, this Utopian culture is a rare find and we must learn to live with real-world deliverable code.
Changes are made to code for one of four reasons: new features, bug fixing, design improvements or optimization. Working on any codebase involves risk, but poorly understood software makes it impossible to gauge the risk of any given change. This leads to a mentality of risk mitigation where developers follow the ‘If it ain’t broke, don’t fix it’ mantra. I’m not talking about developers leaving obvious problems to fester when working on a piece of code. Rather, when a change is being committed to a legacy codebase, developers tend to take the quick and painless approach: ‘let’s just hack it in right here’.
There are three questions the author suggests that you can use to mitigate risk:
- What changes will we have to make?
- How will we know that we’ve done it correctly?
- How will we know that we haven’t broken anything?
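To make the last two questions concrete, here is a minimal sketch in Python. It is my own illustration rather than an example from the book: the `shipping_cost` function, its pricing rules and the `express` flag are all invented. One test pins down the behavior that existed before the change, and a second test describes the change itself.

```python
# Hypothetical legacy function; the name and pricing rules are invented.
def shipping_cost(weight_kg, express=False):
    cost = 5.0 + 1.5 * weight_kg
    if express:  # the change we were asked to make
        cost *= 2
    return cost


def test_existing_behavior_is_unchanged():
    # "How will we know that we haven't broken anything?"
    # These values describe what the code returned before the change.
    assert shipping_cost(0) == 5.0
    assert shipping_cost(2) == 8.0
    assert shipping_cost(10) == 20.0


def test_express_doubles_the_cost():
    # "How will we know that we've done it correctly?"
    assert shipping_cost(2, express=True) == 16.0


# Plain functions, so they can be called directly or collected by a test runner.
test_existing_behavior_is_unchanged()
test_express_doubles_the_cost()
```

The first test is written before touching the code, so any failure after the change points at something that used to work.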
Testing is one of those concepts in software development that is known by many but understood by few. Any developer will tell you that code should be tested, but for a variety of reasons it’s either never done or abandoned early. This comes from a pattern of poor understanding and laziness, on the part of both managers and developers. It’s as much cultural as technical, if not more.
Tests are a way to detect change and improve the speed at which you receive feedback in your workflow. Software is complicated and so is testing. Have you ever wondered why there are so many types of testing? We’ve mentioned a few already, but try to imagine each type as a way of localizing problems. Each type of testing focuses its feedback on a different layer of your application, so you can detect changes in functionality at any level.
Take unit tests for example. We know they’re supposed to run in isolated environments, be fast (really fast!), and provide very localized error information. In a situation where you’re continuously integrating (i.e. making frequent commits) your unit tests should be able to rapidly provide feedback on your changes. You’ll know exactly which change caused the error and you’ll be able to integrate more quickly.
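As a rough sketch of what those properties look like in practice, here is a small test case using Python’s standard `unittest` module. The `apply_discount` helper is hypothetical and only there to give the tests something to exercise. Nothing touches a database, network or file system, and each test checks one behavior, so a failure names the exact rule that broke.

```python
import unittest


def apply_discount(price_cents, percent):
    """Return the price in cents after a percentage discount (hypothetical helper)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents - price_cents * percent // 100


class ApplyDiscountTest(unittest.TestCase):
    # Everything runs in memory, which keeps the tests isolated and fast.

    def test_zero_percent_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(1000, 0), 1000)

    def test_quarter_off(self):
        self.assertEqual(apply_discount(1000, 25), 750)

    def test_rejects_percent_above_100(self):
        with self.assertRaises(ValueError):
            apply_discount(1000, 101)
```

Saved as a module, this can be run with `python -m unittest`, and the name of any failing test tells you where to look.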
Regression tests come in all shapes and sizes and go by plenty of names: integration tests, smoke tests and acceptance tests, to name a few. The key point to understand is that with each layer of testing you’re moving further away from localized feedback and into the realm of testing for regression. It’s a hard concept, so the author gives an interesting anecdote describing a situation where your unit tests pass but some higher-level functionality has changed unintentionally: each piece of code works correctly, but there are unintended consequences. With each layer of testing you’re moving away from precise error localization and checks on individual pieces of code, toward testing whether your program as a whole is working correctly.
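Here is a small, invented Python illustration of that situation (it is not the author’s anecdote, and every name in it is hypothetical). Suppose `net_price` was recently changed to return dollars instead of cents and its unit test was updated to match, while `format_receipt` still expects cents. Each function passes its own checks, yet the two no longer fit together.

```python
# Hypothetical example; all function names are invented.

def net_price(gross_cents):
    """Recently changed to return dollars instead of cents."""
    return gross_cents / 100


def format_receipt(amount_cents):
    """Still assumes its input is in cents."""
    return f"${amount_cents / 100:.2f}"


def checkout(gross_cents):
    # Higher-level behavior that combines the two units.
    return format_receipt(net_price(gross_cents))


# Unit-level checks: each function does what its own test expects.
unit_tests_pass = net_price(1999) == 19.99 and format_receipt(1999) == "$19.99"

# Higher-level check: the customer should be charged $19.99.
end_to_end_passes = checkout(1999) == "$19.99"

print(unit_tests_pass, end_to_end_passes, checkout(1999))
# prints: True False $0.20
```

The unit checks give precise feedback about each function, but only the higher-level check notices that the program as a whole is now wrong.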
Working with legacy code presents a bit of a conundrum: how do you add tests to a codebase without first making changes? You cannot, so the author proposes tackling the technical debt of adding tests at the same time as refactoring the code.
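One way this can play out, sketched in Python under my own assumptions (the `is_overdue` function and its `today` parameter are invented, and this is not necessarily the technique the book would prescribe here): the original function read the system clock internally, so no test could control its input. A small change that keeps the old default behavior is enough to let a test in.

```python
import datetime

# Before (hard to test): the function read the real clock internally.
# def is_overdue(due_date):
#     return datetime.date.today() > due_date


def is_overdue(due_date, today=None):
    """Behaves as before when `today` is omitted; tests can pass a fixed date."""
    if today is None:
        today = datetime.date.today()
    return today > due_date


def test_overdue_the_day_after():
    due = datetime.date(2020, 1, 1)
    assert is_overdue(due, today=datetime.date(2020, 1, 2))


def test_not_overdue_on_the_due_date():
    due = datetime.date(2020, 1, 1)
    assert not is_overdue(due, today=due)


test_overdue_the_day_after()
test_not_overdue_on_the_due_date()
```

Existing callers keep calling `is_overdue(due_date)` unchanged, so the risk of the change stays small while the function comes under test.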
“We want to make functional changes that deliver value while bringing more of the system under test.”
The key to successful refactoring is picking off a chunk of work that you can manage. It’s very easy as a developer to go down a rabbit hole of refactoring and adding tests when you initially only needed to fix a simple bug. Sure, you may succeed in your hare adventures of refactoring an entire interface. It’ll pass all the tests, then suddenly it’s failing in integration and you’re burning countless hours making it worse than before. When you’re working on a system that will need refactoring, try to limit your scope to only the functionality that requires change in the first place. Then, when you succeed and your colleagues are cheering, move on to the next.
“Let’s face it, working with legacy code is surgery and doctors never operate alone.”
The author made a small reference to Extreme Programming (XP) and I feel it deserves more attention. If you’re not already, you should try pair programming when working on difficult problems and especially when refactoring. It’s an excellent way to reduce risk and spread knowledge around.
Overall, “Working Effectively with Legacy Code” was a great read. Every experienced developer has something to learn from the techniques illustrated. Additionally, the breadth of topics provides useful patterns for transforming a dreary legacy system into one where Continuous Delivery is possible. It’s a fun read that helps you find enjoyment in refactoring and writing tests.
There’s a ton more information in the text on how to implement changes, ranging from object-oriented design and dependency management to cultural guidance.
Interested in working someplace that gives all employees an impressive book expense budget? We’re hiring.