Managing disasters: A quick checklist

Controlling the future.

When the unthinkable happens, there is no point shouting or screaming or crying. You should be dusting off your emergency plan.

“Hope for the best, plan for the worst” applies in the digital world more than most.

The democratisation of web platforms such as WordPress might have wrested control away from IT and put it into the hands of the people who use them, but take care.

IT might have been seen as a slow creature, inflexible to the needs of the business, but it was usually very good at managing risk and recovering from disasters.

Last week, I watched a client fall into a nightmare of distributed suppliers, unlinked/closed thinking, penny-pinching and poor management. The web-site went down with a huge crash and no-one knew why, or how to recover it.

It transpired that supplier A, an outsourced server management company in Eastern Europe, had made some dramatic back-end changes that overloaded the server, killing it. They had no backup, and effectively washed their hands of the problem. Useless.

Supplier B, this time looking after SEO, had also made server-level changes recently. They too had no backup, despite assuring everyone of their “best practices” that purported to “backup, test” etc.

The client was pulling their hair out. The site had been live for a couple of months and it seems that they hadn’t taken backups or made a recovery plan.

They too had made many changes, often deep inside the WordPress installation, without taking any backups or testing a recovery plan.

Disaster recovery should have taken 5 minutes, maybe 15 minutes tops. Instead it has taken 20 hours of work so far, rolling back to the last working version (taken from a staging server used before the site went live) and then incrementally re-making changes by hand.

A nightmare that cost thousands in resource and lost nearly 4 days of service.

What do you do when you have no recovery plan?

We use phrases such as “all hands to the pumps” or “leaning in”, but these really only describe what happens when you don’t have a plan: you “make it up as you go along”.

It’s very expensive, and you do need access to skilled folks to carry out very stressful work under huge pressure. It’s interesting that this is often the first time you hear from the MD, writing “this is unacceptable” and “why did/who is” type emails.

“Red” Adair, an American oil-well fire fighter, once said, “If you think it’s expensive to hire a professional to do the job, wait until you hire an amateur.”

Whilst “Red” was talking about 300ft high oil fires and the loss of life, the parallel to digital services is there. Too often these days, companies are heavily reliant on their web-site yet penny-pinch and hire amateurs to do professional work.

Fire fighting has benefits: you can be innovative, and in many cases clients are forced into thinking about “minimum service requirements”, which arguably produces a better end-product.

Planning is a better way

Everyone should have a disaster plan. A “what happens when stuff stops working” list. It starts with who steps up to the plate and how much authority they have.

If the disaster is a predicted one (they often are not) then it’s down to simple operational delivery. More often it’s a hybrid of known problems with a little dose of the unknown chucked in for good measure. Skills and experience fix this, through a mixture of the operational plan and some ad-hoc thinking.

Basic stuff such as backups, and testing restores, allows most of us to recover from technical glitches and hardware failures. But there are other risks we should be planning for; that is another subject, but worth touching on here.
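
To make “backups, and testing restores” a little more concrete, here is a minimal sketch in Python. It is an illustration only, not what any of the suppliers above were running: the function names (checksums, backup, restore_matches) are made up for this example, and it covers only the files on disk. A WordPress site also keeps its content in a database, which would need dumping and restore-testing in the same spirit.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def checksums(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def backup(site_dir: Path, archive: Path) -> None:
    """Write a gzip-compressed tar archive of everything under site_dir.

    Assumption: the archive path lives outside site_dir.
    """
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(site_dir, arcname=".")


def restore_matches(site_dir: Path, archive: Path) -> bool:
    """Restore the archive into a scratch directory and compare it with site_dir.

    Returns True only if the restored copy has the same files with the same contents.
    """
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)
        return checksums(Path(scratch)) == checksums(site_dir)
```

The point of the second function is the habit rather than the code: take the backup, restore it somewhere harmless straight away, and check that what came back is what went in. A backup that has never been restored is only a hope.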

A market and technology landscape can change at the drop of a hat. Consider the number of major disruptions in the last 5 years that couldn’t have been predicted 10 years ago: tablets, smartphones, Netflix, the Cloud, micro-payments, Amazon, pervasive internet, Twitter, Facebook, geo-location and the “internet of things”, just to name a few.

With disruption not just being physical failure, I wonder how much resource companies spend tracking market and technology changes. Not enough, I’d bet. Maybe you should allocate 10% of your resources to plan for the future.