Server Monitoring

Summary

Businesses rely on IT to such an extent that many of them almost cease to function if there's a major IT problem. Effective system monitoring, with automated alarm notifications to support staff, can pre-empt many problems and thus improve system availability.

Types of monitoring

There are three types of monitoring: status monitoring, trend monitoring and log monitoring.

Tiger Computing's server monitoring service uses all three methods, ensuring that your Linux servers are lean, fit and secure.

Status monitoring

Status monitoring looks at the current state of the server. Many parameters may be monitored, for example how full each disk is.

In reality, many more parameters are measured.

Each parameter has defined acceptable limits. For example, a disk may be considered "OK" if less than 80% full, in a "Warning" state if between 80% and 90% full, and in a "Critical" state if more than 90% full.

Once a parameter leaves the "OK" state, support staff are notified and corrective action can be taken.
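As an illustration only, the disk example above could be expressed roughly as follows. This is a minimal sketch, not a description of any particular monitoring tool: the function names are hypothetical, and the 80% and 90% limits are simply the example figures quoted above.

```python
import shutil


def classify_disk_usage(percent_full, warning_at=80.0, critical_at=90.0):
    """Map a disk usage percentage to "OK", "Warning" or "Critical".

    The default limits are the example figures from the text, not fixed rules.
    """
    if percent_full > critical_at:
        return "Critical"
    if percent_full >= warning_at:
        return "Warning"
    return "OK"


def needs_notification(state):
    """Support staff are notified once a parameter leaves the "OK" state."""
    return state != "OK"


def current_disk_percent(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100.0
```

A check might then call classify_disk_usage(current_disk_percent("/")) at regular intervals and raise an alert whenever needs_notification returns True.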

Trend monitoring

System trend monitoring looks at various system parameters over a longer period of time.

Many of the same parameters are measured, but are displayed as graphs, which allows reasonable predictions to be made.

The graph below shows a server's disk usage over time:

It can be seen that one part of the disk, represented by the top line, was filling up between June and early October. Once it breached the 80% mark, the system status monitor alerted support staff. By looking at the trend graph, it is clear that, unless something changed, the disk would be full in a couple of months.

The client was informed, and some files that were no longer required were deleted, resulting in the drop in mid-October.

Other possible courses of action would have been to schedule the fitting of an additional or larger disk, or to archive old data. The important point is that action was taken proactively rather than reactively.
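The kind of prediction described above can be approximated by fitting a straight line to past measurements and extending it forward. The sketch below assumes disk usage grows roughly linearly. The function name and the sample figures are invented for illustration and are not read from the graph.

```python
def days_until_full(samples, full_at=100.0):
    """Estimate the days remaining until a disk reaches `full_at` percent.

    `samples` is a list of (day_number, percent_full) pairs. A least-squares
    straight line is fitted and extended forward. Returns None when there is
    too little data or usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        return None  # all samples were taken on the same day
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx  # percentage points gained per day
    if slope <= 0:
        return None  # flat or shrinking usage: no predicted fill date
    intercept = mean_y - slope * mean_x
    day_full = (full_at - intercept) / slope
    last_day = max(x for x, _ in samples)
    return max(0.0, day_full - last_day)


# Invented monthly readings: 40% full on day 0, rising to 80% by day 120.
readings = [(0, 40.0), (30, 50.0), (60, 60.0), (90, 70.0), (120, 80.0)]
remaining = days_until_full(readings)
```

With these invented readings the estimate comes out at about 60 days. Real usage is rarely this regular, so a figure like this is a prompt for a conversation with the client rather than a firm deadline.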

Log monitoring

Every server event is logged: a user logs in, a mail is sent, another is received, the server corrects its internal clock by 12 milliseconds, the anti-virus utility is updated, someone tries to break into the system.

Searching logs for malicious actions or signs of an impending problem is like looking for a needle in a haystack. In these times of ever bigger haystacks and ever smaller needles, automation is required.

One approach is to look for anything suspicious. The problem with that approach is that one must define in advance what constitutes “suspicious” in order for an automated process to find it. An alternative is to use a human being, which is time-consuming and expensive, not to mention error-prone.

A far better approach is to have an automated process that has been told what all the "expected" events are. It discards them and reports what's left to the support staff for further analysis. That way, anything that is unexpected will be found.
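A minimal sketch of this "discard the expected, report the rest" approach is shown below. The log messages and patterns are hypothetical and deliberately simplified. Real log formats vary from service to service, and a real list of expected events would be far longer.

```python
import re

# Hypothetical patterns describing events that are expected and can be ignored.
EXPECTED_PATTERNS = [
    re.compile(r"user \w+ logged in"),
    re.compile(r"mail (sent to|received from) \S+"),
    re.compile(r"clock corrected by \d+ ms"),
    re.compile(r"anti-virus definitions updated"),
]


def unexpected_lines(lines, expected=EXPECTED_PATTERNS):
    """Return only the log lines that match none of the expected patterns."""
    return [
        line
        for line in lines
        if not any(pattern.fullmatch(line.strip()) for pattern in expected)
    ]


# Invented sample log lines, in the spirit of the events listed above.
sample_log = [
    "user alice logged in",
    "mail sent to bob@example.com",
    "clock corrected by 12 ms",
    "failed login for root from 203.0.113.9",
    "anti-virus definitions updated",
]
report = unexpected_lines(sample_log)
```

Here only the failed login line is left in the report, even though no rule was ever written to describe a break-in attempt. Nobody has to guess in advance what "suspicious" looks like.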

Contact us today to find out how we can help keep your servers healthy too.


© 2012 Tiger Computing Ltd, Wyastone Business Park, Wyastone Leys, Monmouth, NP25 3SR