When a working server is critical to your business success, you need to have a plan to deal with server failure. There are a number of approaches you can take.
Beef it up
One approach is to build as resilient a system as possible: redundant disks (RAID) and redundant power supplies go some way towards mitigating hardware failure.
But you can’t guard against every failure. You can have multiple CPUs on a system board, but if the system board itself fails, all bets are off.
Modern hardware – like modern cars – is reliable, but not immune to failure. The reality is that where one server may run for ten years without a hiccup, its sibling may fail twice in the first two months.
One solution is to put your systems “in the cloud”, with a provider such as Amazon Web Services. This can be a very cost-effective solution for the right servers, but be aware that the design of a cloud infrastructure is very different from that of a conventional hardware infrastructure.
A cloud solution isn’t always appropriate, and an alternative is to have two (or more) servers configured as a High Availability Cluster. Here, the two servers have all the user data (website configuration, user files, etc.) replicated between them. At any one time, one server is “live” and the other is “standby”. Each server monitors the other, and if a problem is detected on the live server, all services automatically switch to the standby server, which then becomes the live server.
The switchover is transparent to users, save for the few seconds it takes to complete. A cluster like this offers a number of advantages:
- it provides a degree of hardware fault tolerance
- maintenance, configuration or other work may be done on the off-line server without impacting the business
- the servers can be of a lower specification (for example, dual power supplies are not so important)
- if more than two servers are used, a degree of load-sharing may be implemented as well
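The live/standby monitoring described above can be illustrated with a small simulation. This is only a minimal sketch, not how any particular cluster product works: the `Node` and `Cluster` classes, the server names and the `max_missed` threshold of three missed checks are all hypothetical, chosen purely for illustration. Real clusters rely on dedicated cluster software rather than hand-written logic like this.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True  # stands in for "responds to a heartbeat check"


@dataclass
class Cluster:
    live: Node
    standby: Node
    max_missed: int = 3  # hypothetical: failed checks tolerated before failover
    missed: int = 0

    def tick(self) -> str:
        """Run one monitoring interval and return the name of the live node."""
        if self.live.healthy:
            self.missed = 0
            return self.live.name
        self.missed += 1
        # Only fail over once the threshold is reached and the standby is usable.
        if self.missed >= self.max_missed and self.standby.healthy:
            self.live, self.standby = self.standby, self.live
            self.missed = 0
        return self.live.name


# Simulate a failure of the live server and watch the roles swap.
cluster = Cluster(live=Node("server-a"), standby=Node("server-b"))
cluster.live.healthy = False
history = [cluster.tick() for _ in range(4)]
print(history)  # ['server-a', 'server-a', 'server-b', 'server-b']
```

Waiting for several missed checks before switching is a deliberate choice in this sketch: it avoids failing over on a single dropped heartbeat, at the cost of a slightly longer outage when the failure is real.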
Not all roses
The downside, of course, is that you need two servers. However, they need not be twice the cost of one because the specification may be lower, as mentioned above. You’ll also need to accommodate both servers in your rack or data centre.
The bottom line
Like so many IT decisions, this is essentially a business decision rather than a technical one, and it should be approached on that basis.
If you’d like some help with High Availability Linux Systems, contact us today:
- call us on 01600 483 484 or
- email firstname.lastname@example.org