System Health Event Monitor Reference Guide

System Health Event Monitor

Monitors key health values for servers and workstations.

Overview

The System Health Event Monitor collects CPU, memory, disk, bandwidth, and ping response time data in a single run. You can set alert thresholds for each metric. The monitor also populates the system health dashboard and the system health reports.

Use Cases

  • Starting out with FrameFlow and setting up your first event monitor
  • Monitoring multiple metrics for multiple systems
  • Reducing the total number of event monitors needed to monitor key metrics

Monitoring Options

This event monitor provides the following options:

Alert With [Info/Warning/Error/Critical] if the Device Cannot Be Contacted

Use this option to get alerts if FrameFlow cannot contact the selected device.

Alert based on CPU usage

Use this option to get alerts when CPU usage exceeds the thresholds that you specify.
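
The threshold idea can be illustrated with a short sketch. This is not FrameFlow's implementation: the function name, the threshold values, and the assumption that each alert level (Info, Warning, Error, Critical) can carry its own threshold are hypothetical, chosen only to show how a measured value might map to the most severe level it exceeds.

```python
# Hypothetical sketch: map a metric value to the most severe threshold it exceeds.
# The level names come from the alert levels listed in this guide; everything
# else (function name, threshold values) is an illustrative assumption.
SEVERITY_ORDER = ["Info", "Warning", "Error", "Critical"]


def severity_for(value, thresholds):
    """Return the highest severity whose threshold is exceeded, or None."""
    result = None
    for level in SEVERITY_ORDER:
        limit = thresholds.get(level)
        # "Exceeds" is treated as strictly greater than the threshold.
        if limit is not None and value > limit:
            result = level
    return result


# Example thresholds for CPU usage, in percent (assumed values).
cpu_thresholds = {"Warning": 80, "Critical": 95}
```

With these assumed thresholds, a CPU reading of 85 maps to Warning, a reading of 97 maps to Critical, and a reading of 50 produces no alert.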

Alert if the directory contains [more than / less than] a specified number of files

Use this option to set alerting thresholds based on the number of files found in the directory.

Alert based on disk space used

Use this option to get alerts about systems whose disk or partition space is running low. In the option to ignore specified drives, use the format "deviceName(C)" (without the quotes). You can use the * character to match multiple devices or drives. For example, use "devicename(*)" to ignore all drives on a device, or "x*(D)" to ignore the D: drive on all devices whose names start with x. To ignore multiple drives, separate the entries with commas, for example, "device1(G), device2(F)".
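
To make the ignore format concrete, here is a minimal sketch of how such patterns could be evaluated. It is not FrameFlow's own matching code: the function names are hypothetical, the use of Python's fnmatch module stands in for whatever matcher the product uses, and case-insensitive comparison is an assumption.

```python
from fnmatch import fnmatchcase


def parse_ignore_list(spec):
    """Split a comma-separated ignore string into individual patterns."""
    return [part.strip() for part in spec.split(",") if part.strip()]


def is_ignored(device, drive, spec):
    """Return True if device(drive) matches any pattern in the ignore string.

    Matching is case-insensitive here, which is an assumption of this sketch.
    """
    target = f"{device}({drive})".lower()
    return any(
        fnmatchcase(target, pattern.lower())
        for pattern in parse_ignore_list(spec)
    )
```

Under these assumptions, "x*(D)" matches the D: drive on a device named xray but not on one named alpha, and "device1(G), device2(F)" matches only those two drives.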

Alert based on the percentage of memory used

Use this option to generate alerts based on how much physical memory is in use on each network device.

Alert based on ping response times

Use this option to alert based on the observed ping response time for the device.

Alert based on total bandwidth rate

This option lets you generate alerts based on the total bandwidth rate for the device. This rate is the sum of all incoming and outgoing bandwidth on all interfaces.

Alert based on outgoing bandwidth rate

This option lets you generate alerts based on the combined outgoing bandwidth rate of all interfaces.

Alert based on incoming bandwidth rate

This option lets you generate alerts based on the combined incoming bandwidth rate of all interfaces.
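
The relationship between the three bandwidth rates can be sketched as follows. The class, field names, and units are assumptions made for illustration; the only thing taken from this guide is that incoming and outgoing rates are combined across all interfaces and that the total is the sum of both.

```python
from dataclasses import dataclass


@dataclass
class InterfaceRate:
    """Hypothetical per-interface reading; field names and units are assumed."""
    name: str
    incoming_bps: int
    outgoing_bps: int


def bandwidth_totals(interfaces):
    """Combine per-interface rates into incoming, outgoing, and total rates."""
    incoming = sum(i.incoming_bps for i in interfaces)
    outgoing = sum(i.outgoing_bps for i in interfaces)
    return {
        "incoming": incoming,
        "outgoing": outgoing,
        "total": incoming + outgoing,
    }


# Two example interfaces with made-up rates.
sample = [
    InterfaceRate("eth0", incoming_bps=1000, outgoing_bps=400),
    InterfaceRate("eth1", incoming_bps=250, outgoing_bps=50),
]
```

For the sample readings, the combined incoming rate is 1250, the combined outgoing rate is 450, and the total is 1700.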

SSH Port Number

The default port number for SSH connections is 22. If your servers use a non-standard port, you can specify it here. This option applies only if SSH is the selected protocol.

Authentication and Security

PDH: The account used for authentication must be a member of the Performance Monitor Users group or have admin rights.

WMI: The account used for authentication must be a member of the Performance Monitor Users group and the Distributed COM Users group, or have admin rights.

SSH: The account used for authentication must have the ability to run df, netstat, sysctl, and similar commands.

SNMP: For SNMPv1 and SNMPv2c, a community string for the device being monitored is required. For SNMPv3, a username and other SNMPv3 parameters are required.

Protocols

This event monitor can be configured to use one of four protocol options:

PDH
WMI
SSH
SNMP

Data Points

This event monitor generates the following data points:

Data Point              Description
Average Response Time   The calculated average ping response time.
CPU Usage               The total CPU used, by percent.
Used Memory             The percentage of total memory used.
Space used: C;          The percentage of space used in the C drive.
