Events & Alerts

An event is GridNMS telling you that something happened: a device went down, an interface saturated, a trap arrived, or a log pattern matched. The Events page is where you triage all of it — acknowledge what you’ve seen, close what’s resolved, and search the full history. This page covers everything from daily triage to the rules that quiet the noise.

The Events page: a live, severity-colored feed with a histogram and time-range controls.

Open Events to see your live event feed. Each row is one event, showing its timestamp, severity, the device it relates to, a message, a tag (a short label like PING_DOWN that groups related events), and its status. The feed is color-coded by severity so the most serious items stand out.

Events are ranked by severity, from most to least urgent. See severity levels for the full reference, but in short:

Severity    Meaning
Critical    Major impact; needs attention now.
Major       Significant problem.
Minor       A smaller issue worth knowing about.
Warning     Early or low-impact signal.
Info        Informational; no action required.
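
If you process events in your own scripts, it can help to encode this ranking explicitly. The following is a minimal sketch, not GridNMS code: the `Severity` enum, its numeric values, and the `most_urgent_first` helper are illustrative assumptions. Only the five level names and their order come from the table above.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical numeric ranks: a lower number means more urgent,
    # following the order of the severity table.
    CRITICAL = 0
    MAJOR = 1
    MINOR = 2
    WARNING = 3
    INFO = 4


def most_urgent_first(events):
    """Sort (severity, message) pairs so the most urgent come first."""
    return sorted(events, key=lambda e: e[0])


# Made-up sample events for illustration.
events = [
    (Severity.INFO, "config saved"),
    (Severity.CRITICAL, "core-sw1 down"),
    (Severity.WARNING, "fan speed high"),
]
ranked = most_urgent_first(events)
for severity, message in ranked:
    print(severity.name, message)
```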

Above the feed, a histogram shows event volume over time, broken down by severity. Use the time-range picker to focus the histogram and the feed on a window — the last hour during an active incident, or the last week for a review. The histogram makes spikes obvious: a sudden tall bar of Critical events usually means a real outage just started.
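
Conceptually, the histogram is a count of events per time bucket and severity, limited to the selected window. Here is a rough sketch of that idea, assuming hourly buckets and a simple `(timestamp, severity)` tuple per event. Both are assumptions made for illustration; the bucket size GridNMS actually uses is not stated here.

```python
from collections import Counter
from datetime import datetime


def histogram(events, start, end):
    """Count events per (hour, severity) inside the window [start, end)."""
    counts = Counter()
    for ts, severity in events:
        if start <= ts < end:
            # Truncate the timestamp to the start of its hour (assumed bucket size).
            bucket = ts.replace(minute=0, second=0, microsecond=0)
            counts[(bucket, severity)] += 1
    return counts


# Made-up sample data: two Critical events in the same hour form a "tall bar".
events = [
    (datetime(2024, 1, 1, 9, 5), "Critical"),
    (datetime(2024, 1, 1, 9, 40), "Critical"),
    (datetime(2024, 1, 1, 10, 15), "Warning"),
    (datetime(2024, 1, 2, 9, 0), "Critical"),  # outside the window below
]
counts = histogram(events, datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 11, 0))
```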

Events move through three states:

  • Open — new and unhandled.
  • Acknowledged — someone has seen it and is on it. Acknowledging signals to the rest of the team that the event is being worked, without removing it.
  • Closed — resolved. Closed events drop out of the live feed but remain in history.

To handle an event, click it and choose Acknowledge or Close, optionally adding a note. Many events that come from monitoring, such as a device that has recovered or an interface that has dropped back under its threshold, close themselves automatically when the underlying condition clears.
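
The three states and the auto-close behavior can be pictured as a small state machine. This is a sketch under assumptions, not the GridNMS implementation: the `Event` class, the `ALLOWED` table, and the rule that a closed event cannot be reopened are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    OPEN = "open"
    ACKNOWLEDGED = "acknowledged"
    CLOSED = "closed"


# Assumed transitions: Closed is treated as final in this sketch.
ALLOWED = {
    Status.OPEN: {Status.ACKNOWLEDGED, Status.CLOSED},
    Status.ACKNOWLEDGED: {Status.CLOSED},
    Status.CLOSED: set(),
}


class Event:
    def __init__(self, message):
        self.message = message
        self.status = Status.OPEN  # every event starts as Open
        self.notes = []

    def transition(self, new_status, note=None):
        """Move to a new state, optionally recording a note."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot go from {self.status.name} to {new_status.name}"
            )
        self.status = new_status
        if note:
            self.notes.append(note)

    def condition_cleared(self):
        """Auto-close when monitoring reports that the condition cleared."""
        if self.status is not Status.CLOSED:
            self.transition(Status.CLOSED, "condition cleared")


event = Event("edge-rtr2 down")
event.transition(Status.ACKNOWLEDGED, "checking power at the site")
event.condition_cleared()  # the device recovered, so the event closes itself
```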

During a storm of related events, select multiple rows with the checkboxes and acknowledge or close them all at once. This is the fastest way to clear a batch of events from a single failing device once you know the cause.

The Problems view groups related events by device so you see one row per affected device instead of dozens of individual events. When a single device is generating a flood of events, Problems gives you the concise “these devices are unhealthy” summary, while the raw Events feed gives you the detail. Use Problems for a quick health read; drill into a device for the underlying events.
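
The grouping behind a per-device summary can be sketched in a few lines. The event dictionaries, the `problems` function, and the "worst severity" and "count" columns are assumptions for illustration; the actual columns in the Problems view may differ.

```python
from collections import defaultdict

SEVERITY_ORDER = ["Critical", "Major", "Minor", "Warning", "Info"]


def problems(events):
    """Collapse events into one summary row per device."""
    by_device = defaultdict(list)
    for ev in events:
        by_device[ev["device"]].append(ev)

    rows = []
    for device, device_events in by_device.items():
        # The most urgent severity is the one earliest in SEVERITY_ORDER.
        worst = min(
            (e["severity"] for e in device_events), key=SEVERITY_ORDER.index
        )
        rows.append({"device": device, "worst": worst, "count": len(device_events)})

    # List the unhealthiest devices first.
    rows.sort(key=lambda r: SEVERITY_ORDER.index(r["worst"]))
    return rows


# Made-up sample events.
events = [
    {"device": "access-sw3", "severity": "Warning"},
    {"device": "core-sw1", "severity": "Minor"},
    {"device": "core-sw1", "severity": "Critical"},
    {"device": "core-sw1", "severity": "Warning"},
]
rows = problems(events)
```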

The live feed shows what’s current. To dig into the past — for an audit, a post-incident review, or to confirm how often something recurs — use Event History search. You can filter by:

  • Date range — any window, not just recent.
  • Device — everything that happened to one device.
  • Severity — only Critical, only Warning, and so on.
  • Text — match words in the event message.

History is kept far longer than the live feed, so you can answer questions like “how many times did this uplink flap last month?”
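
To make the four filters concrete, here is a minimal sketch of how they combine: every filter you set must match, and unset filters are ignored. The `search_history` function, the event dictionaries, and the case-insensitive substring match are assumptions for illustration, not the GridNMS search API.

```python
from datetime import datetime


def search_history(events, start=None, end=None, device=None,
                   severity=None, text=None):
    """Return events matching every filter that is set (None = no filter)."""
    results = []
    for ev in events:
        if start is not None and ev["time"] < start:
            continue
        if end is not None and ev["time"] >= end:
            continue
        if device is not None and ev["device"] != device:
            continue
        if severity is not None and ev["severity"] != severity:
            continue
        # Assumed text semantics: case-insensitive substring of the message.
        if text is not None and text.lower() not in ev["message"].lower():
            continue
        results.append(ev)
    return results


# Made-up history: how many times did the uplink flap in March?
history = [
    {"time": datetime(2024, 3, 2, 8, 0), "device": "edge-rtr1",
     "severity": "Major", "message": "Uplink flap detected"},
    {"time": datetime(2024, 3, 19, 22, 30), "device": "edge-rtr1",
     "severity": "Major", "message": "uplink flap detected"},
    {"time": datetime(2024, 3, 20, 1, 0), "device": "edge-rtr1",
     "severity": "Info", "message": "config saved"},
    {"time": datetime(2024, 4, 1, 0, 5), "device": "edge-rtr1",
     "severity": "Major", "message": "uplink flap detected"},
]
flaps = search_history(
    history,
    start=datetime(2024, 3, 1),
    end=datetime(2024, 4, 1),
    device="edge-rtr1",
    text="uplink flap",
)
```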

Events are raised from several sources, all flowing into the same feed:

Source                 Example event
Reachability checks    A device stops responding → device down.
Thresholds             An interface crosses its bandwidth threshold.
SNMP traps             A device sends an unsolicited alert (e.g. a power-supply fault).
Log detections         An incoming log matches a detection rule (e.g. repeated auth failures).

From the feed, matching events can be delivered to people through Notifications.

Administrators can shape events before they reach you using transformation rules. Each rule matches events by criteria (device, class, severity, message text) and then takes an action:

Action             What it does
Tag                Add a label to matching events for easier filtering.
Change severity    Raise or lower how urgent an event is treated.
Suppress           Hide noisy, known-benign events entirely.
Auto-close         Immediately close events you never need to act on.

Transformation rules are how you tune GridNMS to your environment — for example, downgrading a chatty informational trap to Info, or suppressing a known cosmetic warning so it never clutters the feed.
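
The match-then-act model can be sketched as follows. This is an illustration under assumptions, not the GridNMS rule format: the `Rule` fields, the action names, the in-order evaluation, and the substring match on message text are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str                         # "tag", "severity", "suppress", "auto_close"
    value: Optional[str] = None         # tag name or new severity, if needed
    device: Optional[str] = None        # match criteria; None means "any"
    device_class: Optional[str] = None
    severity: Optional[str] = None
    text: Optional[str] = None

    def matches(self, ev):
        return (
            (self.device is None or ev.get("device") == self.device)
            and (self.device_class is None
                 or ev.get("device_class") == self.device_class)
            and (self.severity is None or ev.get("severity") == self.severity)
            and (self.text is None or self.text in ev.get("message", ""))
        )


def apply_rules(ev, rules):
    """Return a transformed copy of the event, or None if it is suppressed."""
    ev = dict(ev, tags=list(ev.get("tags", [])))  # copy, do not mutate the input
    for rule in rules:  # assumed: rules are evaluated in order
        if not rule.matches(ev):
            continue
        if rule.action == "suppress":
            return None
        if rule.action == "tag":
            ev["tags"].append(rule.value)
        elif rule.action == "severity":
            ev["severity"] = rule.value
        elif rule.action == "auto_close":
            ev["status"] = "Closed"
    return ev


rules = [
    Rule(action="severity", value="Info", text="link flap"),  # downgrade a chatty trap
    Rule(action="suppress", device="lab-sw1"),                # hide a known-benign source
]
noisy = {"device": "edge1", "device_class": "switch",
         "severity": "Warning", "message": "link flap on port 12"}
benign = {"device": "lab-sw1", "device_class": "switch",
          "severity": "Minor", "message": "cosmetic LED warning"}
downgraded = apply_rules(noisy, rules)
hidden = apply_rules(benign, rules)
```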

Planned work shouldn’t page anyone. A maintenance window tells GridNMS that certain devices are expected to be offline during a scheduled period, so events from those devices are suppressed and no alerts fire.

To use one:

  1. Create a maintenance window and set its start and end time.
  2. Choose the devices (or a site/class) it covers.
  3. Save. During the window, events from those devices are held back, and they appear in the upcoming maintenance panel on your dashboard.

When the window ends, normal alerting resumes automatically.
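
The core check behind a maintenance window is simple: an event is held back only if its device is covered and its timestamp falls inside the window. Below is a minimal sketch of that logic. The `MaintenanceWindow` class and `should_alert` function are hypothetical names, and covering a whole site or class is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class MaintenanceWindow:
    start: datetime
    end: datetime
    devices: frozenset  # device names covered by this window

    def covers(self, device, when):
        """True if the device is in scope and `when` is inside the window."""
        return device in self.devices and self.start <= when < self.end


def should_alert(device, when, windows):
    """Alert unless some maintenance window covers this device right now."""
    return not any(w.covers(device, when) for w in windows)


# Made-up example: a two-hour firmware upgrade on one switch.
windows = [
    MaintenanceWindow(
        start=datetime(2024, 6, 1, 2, 0),
        end=datetime(2024, 6, 1, 4, 0),
        devices=frozenset({"core-sw1"}),
    )
]
during = should_alert("core-sw1", datetime(2024, 6, 1, 3, 0), windows)
after = should_alert("core-sw1", datetime(2024, 6, 1, 4, 30), windows)
other = should_alert("edge-rtr1", datetime(2024, 6, 1, 3, 0), windows)
```

Because the check compares each event time against the window's end, nothing needs to be switched back on afterwards: once the end time passes, events from the device are no longer covered.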