Taking over the Codebase, Solving the Spaghetti Crisis

We’ve all been there. Somebody asks if you can take a look at their website that has been stagnant for a while. Something small needs to be changed. You feel up for a challenge, so you dive in. What you find is a mess. It’s really nobody’s fault. Things evolved over time, and different developers and designers have done their thing along the way. Nobody meant any harm; everybody just did their best. But here you are.

These problems come in many forms, but the one that pops up the most is the shared server that’s running a website for that shop next door. Over the years the owner became more and more reliant on the site. Maybe it contains his master inventory, maybe his contact database. It started out as a nice novelty that nobody depended on, but now it’s quickly becoming mission critical.

So there you are: you’ve just opened the public HTML folder in your FTP client and you’ve got phpMyAdmin open in the browser.

This is my current action plan. Feel free to add to it and suggest tools.

Back up the entire site

For the love of god, make a backup now! Chances are good that you are looking at the only copy of the site in existence. If anything happens with the server, if you mess up something small, you’re going to be very very sorry.

Both cPanel and Plesk, the most popular domain control panels, offer backup solutions out of the box. They are not perfect, but they allow you to create a full dump of the entire site. If you can schedule a daily backup, that’s a plus. If you can send the backup somewhere off site, that’s another plus.

If you have shell access to the server, there is a whole slew of other tools available that may or may not be easier to use than the above.

Whatever you do, also check the backup. Does it contain a database dump? Does it contain all files? If you’re going to be messing with DNS and e-mail, you may want to check if that’s backed up too.
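Checking the backup by hand gets old quickly, so here is a minimal sketch of how that check could be scripted. It assumes the backup is a tar archive (optionally compressed), that the site files live under a folder called public_html/ inside it, and that the database dump ends in .sql. The function name and all three of those conventions are my assumptions, so adjust them to whatever your control panel actually produces.

```python
import tarfile


def check_backup(archive_path, required_prefixes=("public_html/",), dump_suffix=".sql"):
    """Return a list of problems found in a tar backup.

    An empty list means the archive passed these (very basic) checks.
    The folder prefix and dump suffix are assumptions, not a standard.
    """
    # tarfile.open() detects gzip/bzip2/xz compression on its own.
    with tarfile.open(archive_path) as tar:
        files = [(m.name, m.size) for m in tar.getmembers() if m.isfile()]

    problems = []

    # Is there at least one non-empty database dump?
    if not any(name.endswith(dump_suffix) and size > 0 for name, size in files):
        problems.append("no non-empty database dump found")

    # Is there at least one file under each folder we expect?
    for prefix in required_prefixes:
        if not any(name.startswith(prefix) for name, _ in files):
            problems.append(f"no files under {prefix}")

    return problems
```

Calling check_backup("backup.tar.gz") returns an empty list when both checks pass. It only proves the pieces are present, not that the dump can be restored, so a trial restore on another machine is still worth doing. Some archives store paths with a leading "./", in which case the prefix needs to be "./public_html/".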

You’re now at a point where you can start developing/debugging with some confidence. It’s not perfect, but at least you have something to fall back on. I’d take it at least one step further.

Get the files under version control

If you’re going to be making multiple changes for multiple different tasks, you’re going to want to have all the code under version control. The easiest way: just put the entire public HTML folder under version control. You may be versioning too many files there, but at least you’re not missing anything.

One typical issue that pops up is the fact that not everyone is going to be using version control. For instance, even the simplest WordPress blog can cause issues, because it is possible to edit some of the files from within the administration console.

If you have shell access, you could install and use version control on the server itself. But that doesn’t work for shared hosting.

I haven’t found the perfect automated solution, but there are a few tools out there that allow you to view the difference between an FTP directory and a local one. Beyond Compare 3 is a pretty good one, once you get past its archaic interface.
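If you would rather script the comparison, a rough sketch could look like the one below. It does not talk FTP itself. It assumes you have already mirrored the remote folder to a local directory with your FTP client, and then compares that mirror against your working copy by content hash. The function names manifest and diff_manifests are made up for this example.

```python
import hashlib
from pathlib import Path


def manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(root).as_posix()] = digest
    return result


def diff_manifests(local, remote):
    """Report files that exist on one side only, or whose content differs."""
    shared = local.keys() & remote.keys()
    return {
        "only_local": sorted(local.keys() - remote.keys()),
        "only_remote": sorted(remote.keys() - local.keys()),
        "changed": sorted(p for p in shared if local[p] != remote[p]),
    }
```

Anything listed under "only_remote" or "changed" is a candidate for an edit somebody made directly on the server, for example through the WordPress administration console. Reading every file fully into memory is fine for a small site but would need streaming for large uploads folders.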

You’re now at a pretty good place. Major disasters will be solved by the backup and smaller issues can be resolved by rolling back the change that caused them.

There’s still one wildcard: the database. Especially if your work involves structural changes to the database, you may want to look into …

Version control for the database

Few people do database version control and when it happens, it doesn’t always work quite right. But if you want to feel safe doing that normalization operation on a few tables, there’s no way around it. You have to get the database in your version control system.

Start here and if you want to take it further, there have been many tools written since that post that will make your life easier.
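To make the idea concrete, here is a minimal sketch of one common approach: numbered migration scripts that live in version control, plus a small table in the database that records which ones have been applied. This is my own illustration, not the method from any particular tool. It uses Python’s built-in sqlite3 module as a stand-in so the example is self-contained, while a typical shared-hosting site would be on MySQL. The table and column names (schema_version, customers) are hypothetical.

```python
import sqlite3

# Each migration is (version number, SQL). In a real project these would
# be files in the repository; the statements here are placeholders.
MIGRATIONS = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customers ADD COLUMN email TEXT"),
]


def migrate(conn, migrations=MIGRATIONS):
    """Apply every migration not yet recorded; return the versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}

    newly_applied = []
    for version, sql in sorted(migrations):
        if version in applied:
            continue  # already run against this database
        conn.execute(sql)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied
```

Running migrate() twice against the same database applies nothing the second time, which is what makes it safe to run on every deploy. The sketch only moves forward. Rolling a structural change back still means restoring from the backup or writing a reverse migration by hand.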

(Unit) Test the code

Depending on the language of the application, you may just automatically be writing tests, even before you considered creating a backup. If it’s a PHP site however, chances are nobody has thought of this before.

You may think testing isn’t important for your particular application, but do me a favor: get at least one test in there, so that you’ve got the structure set up. If you ever add new code, you will be much more likely to add more tests.

Start the test suite with a single test and let it grow from there.
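As an illustration of how small that first test can be, here is a sketch using Python’s built-in unittest module. The function format_price is a hypothetical stand-in for some small, pure function in the site’s code. On a PHP site the same idea carries over to whatever test framework you pick there.

```python
import unittest


def format_price(cents):
    """Hypothetical helper standing in for a function from the site's code."""
    return f"{cents // 100}.{cents % 100:02d}"


class FormatPriceTest(unittest.TestCase):
    def test_pads_cents_to_two_digits(self):
        # 1905 cents should render as 19.05, not 19.5
        self.assertEqual(format_price(1905), "19.05")
```

Saved in a file whose name starts with test_, this runs with python -m unittest. The value of this first test is less the assertion itself than the fact that the test runner, folder layout and command are now in place for the next one.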

Integration/GUI testing & Continuous integration

If you get through the previous steps, you’re doing better than most. Automated GUI testing, a continuous integration server or even a continuous deployment environment: it’s all icing on the cake. But if you’ve got the time and budget to set this up, you’re going to be a very happy developer down the road.

Conclusion

Many sites out there are still alive only because nothing bad has happened to them yet. If you’re going to be updating such a site, chances are good that you will be held responsible if anything goes wrong, even if it is completely beyond your control. The above steps will make sure that you are prepared and that you can start refactoring the code without worrying.

This is an evolving article, and I will be updating it as I go along. Tips are more than welcome.
