Dell iDRAC Event Monitor Reference Guide

Dell iDRAC Event Monitor

Connects to the iDRAC interface on Dell servers and monitors hardware-related metrics.

Overview

The Dell iDRAC Event Monitor connects to the iDRAC (Integrated Dell Remote Access Controller) found on many Dell systems. iDRAC provides hardware-related data about a system, including details about fans, temperatures, and power supplies.

This event monitor uses the "winrm" Windows command line tool to connect to the iDRAC interface and retrieve the data that it requires. Before using the event monitor for the first time, you may need to run "winrm quickconfig" on the monitoring server or remote node.

When configuring the event monitor, be sure to select the IP address or host name of the iDRAC interface, not that of the system's own network interface.

Use Cases

  • Monitoring the hardware health status of your Dell servers
  • Alerting based on high power consumption
  • Warning about fan and temperature status

Monitoring Options

This event monitor provides the following options:

Alert with [Info/Warning/Error/Critical] if the iDRAC system is unreachable

Use this option to get alerts if the iDRAC system could not be contacted.

Alert with [Info/Warning/Error/Critical] if the system's health status is not OK

Checks the system's overall rollup status. A value of "Degraded" indicates that one or more system components are in a failed state but the system is still operational. A value of "Error" indicates a critical failure of one or more system components.
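As an illustration of how the rollup values described above could be turned into an alert, here is a minimal Python sketch. The function name rollup_alert and its parameters are hypothetical and are not part of the event monitor; only the status values "OK", "Degraded", and "Error" come from the description above.

```python
from typing import Optional


def rollup_alert(status: str, alert_level: str = "Error") -> Optional[str]:
    """Return an alert message if the rollup status is not OK, else None.

    Hypothetical helper: alert_level stands in for the
    Info/Warning/Error/Critical choice made when configuring the option.
    """
    normalized = status.strip().lower()
    if normalized == "ok":
        return None
    if normalized == "degraded":
        detail = "one or more components failed, system still operational"
    elif normalized == "error":
        detail = "critical failure of one or more components"
    else:
        # Any value other than the three documented ones is reported as-is.
        detail = f"unrecognized rollup status {status!r}"
    return f"[{alert_level}] System health is not OK: {detail}"
```

Any status other than "OK" produces an alert in this sketch, which matches the "not OK" wording of the option.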

Alert with [Info/Warning/Error/Critical] if the chassis health status is not OK

Checks the chassis health and warns about intrusion status and CPU health status.

Alert with [Info/Warning/Error/Critical] if fan health status is not OK

Checks the status of each fan and warns if the fan's primary status is not 'OK'.

Alert with [Info/Warning/Error/Critical] if temperature probe health status is not OK

Checks the system board inlet temperature and CPU temperatures. Alerts if any temperature is outside the system-configured thresholds. Records a data point for each temperature.
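The threshold check described above might look roughly like the following Python sketch. The TemperatureProbe class, its field names, and the example values are assumptions made for illustration; the real thresholds are the ones configured on the system and reported by iDRAC.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TemperatureProbe:
    # Hypothetical record for one probe; field names are illustrative only.
    name: str
    reading_c: float
    lower_threshold_c: float
    upper_threshold_c: float


def out_of_range(probes: List[TemperatureProbe]) -> List[str]:
    """Return the names of probes whose reading is outside its thresholds."""
    return [
        p.name
        for p in probes
        if not (p.lower_threshold_c <= p.reading_c <= p.upper_threshold_c)
    ]
```

A reading exactly equal to a threshold is treated as in range here; whether the monitor treats the boundary the same way is not stated in this guide.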

Alert with [Info/Warning/Error/Critical] if power supply health status is not OK

Checks power supply status. Alerts about failed power supplies and if the redundancy status is not OK.

Alert with [Info/Warning/Error/Critical] if RAID and/or drive status is not OK

Checks the primary status and RAID status of all physical and virtual disks. Alerts if the primary status is not "OK" or if the RAID status is not "Online".
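The two conditions above can be sketched in Python as follows. The dictionary keys name, primary_status, and raid_status are hypothetical and chosen for this example; they are not the property names returned by iDRAC.

```python
from typing import Dict, List


def disk_alerts(disks: List[Dict[str, str]]) -> List[str]:
    """Return one message per failed check across physical and virtual disks."""
    alerts = []
    for disk in disks:
        name = disk.get("name", "unknown disk")
        # Condition 1: primary status must be "OK".
        if disk.get("primary_status") != "OK":
            alerts.append(f"{name}: primary status is {disk.get('primary_status')}")
        # Condition 2: RAID status must be "Online".
        if disk.get("raid_status") != "Online":
            alerts.append(f"{name}: RAID status is {disk.get('raid_status')}")
    return alerts
```

The checks are independent, so a single disk can produce two messages if both statuses are bad.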

Alert if the average consumed power for a chassis is greater than specified thresholds

Checks power consumption of the chassis and alerts if it exceeds thresholds that you define. Records current power consumption as a graph data point.
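One way to picture the threshold comparison is the Python sketch below. It assumes one wattage limit per alert level, which this guide does not spell out; the function name power_alert_level and the example limits are likewise illustrative.

```python
from typing import Dict, Optional


def power_alert_level(avg_watts: float, thresholds: Dict[str, float]) -> Optional[str]:
    """Return the level with the highest limit that avg_watts exceeds, or None.

    thresholds maps an alert level name to a limit in watts (assumed layout).
    """
    exceeded = [(limit, level) for level, limit in thresholds.items() if avg_watts > limit]
    if not exceeded:
        return None
    # The tuple with the largest limit wins, so the most severe level is reported.
    return max(exceeded)[1]
```

For example, with limits of 400 W for "Warning" and 600 W for "Critical", an average of 450 W yields "Warning" and a value exactly at a limit does not trigger that level, following the "greater than" wording of the option.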

Authentication and Security

The account used for authentication must have access to the iDRAC interface.

Protocols

Data Points

This event monitor generates the following data points:

Data Point           Description
Temp                 The temperature of your hardware.
Power Consumption    The amount of power consumed by your hardware.

Sample Output

Tutorial

To view the tutorial for this event monitor, click here.
