Effective system monitoring, with automated alarm notifications to support staff, can preempt many problems and improve system availability. Such monitoring will typically include:
- Free disk space
- CPU utilisation
- Memory utilisation
- RAID health
- Checking for disk errors
- Checking system logs for potential problems
- Ensuring essential services are running
- The status of the backups
- Whether security updates need to be installed
- Routine system security checks
- Checking the validity and lifetime of any SSL certificates
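Two of the checks above, free disk space and SSL certificate lifetime, can be sketched in a few lines of Python. This is a minimal illustration rather than a complete monitoring tool: the function names, the 10% free-space threshold, the 14-day certificate threshold and the alarm message format are all assumptions made for the example. A real deployment would pass any alarm to whatever notification system the support staff use.

```python
import shutil
import socket
import ssl
from datetime import datetime, timezone


def check_disk_space(path="/", min_free_fraction=0.10):
    """Return an alarm message if free space on `path` is below the
    (assumed) threshold, otherwise None."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_fraction = usage.free / usage.total
    if free_fraction < min_free_fraction:
        return f"ALARM: only {free_fraction:.0%} of disk space free on {path}"
    return None


def fetch_not_after(host, port=443, timeout=5):
    """Read the 'notAfter' field of the certificate served by a live host."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Whole days between `now` and a certificate 'notAfter' string
    such as 'Jan  5 09:34:43 2018 GMT'."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


def check_certificate(not_after, min_days=14, now=None):
    """Return an alarm message if the certificate expires within
    `min_days` (an assumed threshold), otherwise None."""
    days = days_until_expiry(not_after, now)
    if days < min_days:
        return f"ALARM: certificate expires in {days} days"
    return None
```

A scheduled job could call `check_disk_space("/")` and `check_certificate(fetch_not_after("shop.example.com"))`, where the hostname is a placeholder, and forward any non-None result to the support team.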
Business Process Monitoring
Extending the monitoring to cover key business processes is relatively easy. An online shop, for example, may know that it takes around 10 orders an hour. One check could look at the time of the last update to the “sales” table in the database and raise an alarm if it was more than 10 minutes ago.
This is by no means the only test that should be run on such a server, and it wouldn’t be very helpful in diagnosing the cause of a problem, but it would alert staff, who could then check that all is well. If an issue is found, it may be appropriate to add more specific checks that would alert staff to similar issues in future.
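The “sales” table check described above might look something like the following sketch. It uses SQLite from the Python standard library purely as a stand-in for the shop's real database, and it assumes a hypothetical `created_at` column holding UTC timestamps in ISO 8601 text form. The column name, the function names and the message format are illustrative, not taken from any particular system.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Threshold from the example in the text: alarm after 10 quiet minutes.
MAX_SILENCE = timedelta(minutes=10)


def last_sale_time(conn):
    """Return the timestamp of the most recent row in `sales`, or None."""
    row = conn.execute("SELECT MAX(created_at) FROM sales").fetchone()
    if row[0] is None:
        return None
    return datetime.fromisoformat(row[0])


def check_sales(conn, now=None, max_silence=MAX_SILENCE):
    """Return an alarm message if no sale has been recorded recently,
    otherwise None."""
    now = now or datetime.now(timezone.utc)
    last = last_sale_time(conn)
    if last is None:
        return "ALARM: sales table is empty"
    silence = now - last
    if silence > max_silence:
        minutes = int(silence.total_seconds() // 60)
        return f"ALARM: no sales for {minutes} minutes"
    return None
```

At around 10 orders an hour the average gap between orders is 6 minutes, so a 10-minute threshold may well fire during ordinary quiet spells. The right value is something to tune against real order data, and it will probably need to differ between peak hours and overnight.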
IT is there to support the business, and monitoring its effectiveness in doing so is most certainly a worthwhile approach.