Thresholds and Alerts

The other day I mentioned how you can use the SNMP Manager to tell you when your devices are about to go off the rails. Every networked device that exposes SNMP measurements gives you a way to focus on specific items and monitor their 'Normal' behavior. If all the subsystems are functioning normally, the device will continue to function (assuming that all the auxiliary devices it depends on are working properly).

When a service or function begins to have issues, the related measurement (status, reading) on that device will move outside of its 'Normal' range. When this occurs, the service/status/reading is said to 'exceed a threshold', and you want some sort of notification/alert from the device: a message on the screen, a text to your phone, an email, or even an audible alert. A well-programmed SNMP Manager will also let you set the severity of the alert.

There can be thousands of these measurements/statuses for just one device. How can you find the most important ones and connect those to an alert? As mentioned earlier, some software has templates that bring the most important ones to the surface for each 'device type'. Sometimes, after a particularly bad incident, you tell your SNMP Manager which specific measurement to watch so that it alerts you if that behavior shows up again. This type of alert is your way of knowing that something MAY happen if you don't do something.

As I relayed earlier, I have used OpManager to monitor many different device types. Below is an example from their demo site that shows an alert.
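Separate from the OpManager example, the threshold idea itself can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how OpManager or any particular SNMP Manager does it: the names `Threshold`, `evaluate`, and `check_readings`, the measurement names, and the two severity levels are all made up for the example, and the readings are passed in as a plain dictionary instead of being polled over SNMP.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Threshold:
    """Hypothetical limits for one measurement (not from any real product)."""
    name: str
    warning: float   # at or above this value, raise a warning
    critical: float  # at or above this value, raise a critical alert


def evaluate(threshold: Threshold, value: float) -> Optional[str]:
    """Return the alert severity for a reading, or None if it is 'Normal'."""
    if value >= threshold.critical:
        return "critical"
    if value >= threshold.warning:
        return "warning"
    return None


def check_readings(thresholds: Dict[str, Threshold],
                   readings: Dict[str, float]) -> List[str]:
    """Compare each reading to its threshold and collect alert messages."""
    alerts = []
    for name, value in readings.items():
        threshold = thresholds.get(name)
        if threshold is None:
            continue  # nobody asked to watch this measurement
        severity = evaluate(threshold, value)
        if severity is not None:
            alerts.append(f"[{severity.upper()}] {name} = {value}")
    return alerts


if __name__ == "__main__":
    # Assumed example values; a real manager would poll these from the device.
    watched = {
        "cpu_percent": Threshold("cpu_percent", warning=80, critical=95),
        "disk_percent": Threshold("disk_percent", warning=85, critical=95),
    }
    latest = {"cpu_percent": 97, "disk_percent": 40, "fan_rpm": 3000}
    for message in check_readings(watched, latest):
        print(message)
```

Only the measurements you list are checked, which is the same idea as a template picking out the important items from the thousands a device offers. In practice the alert messages would be handed to whatever sends the screen message, text, or email.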
Tomorrow I will discuss monitoring device events that are sudden and can be bad, and what to do about them.