How Monitoring Works
This page explains, in plain terms, how GridNMS keeps an eye on your network — from the moment something happens on a device to the alert that lands in your inbox. You don’t need to know any of this to use GridNMS, but it helps to have the mental model.
The big picture
Your devices ──► Collector ──► GridNMS ──► You
 (signals)       (watches)    (decides)  (alerts)

- Your devices produce signals: they respond (or don’t) to pings, expose counters over SNMP, and send syslog messages and SNMP traps.
- A collector near your network gathers all of that.
- GridNMS interprets it, stores it, and decides what’s worth an event.
- You get alerted about the events you’ve chosen to care about.
Four ways GridNMS watches
| Signal | What it is | Example |
|---|---|---|
| Reachability | Is the device responding? Checked regularly. | A switch stops answering → it’s marked Down. |
| Metrics (polling) | Counters read from the device on a schedule. | Interface traffic, CPU, memory, disk usage. |
| SNMP traps | The device proactively reports an event. | A line card fails and the device sends a trap. |
| Logs (syslog) | Messages the device writes about what it’s doing. | A firewall logs a blocked intrusion attempt. |
Reachability and metric polling are things GridNMS goes and checks. Traps and syslog are things your devices send to the collector. See Monitoring & Metrics and Logs & Detections.
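The split between polled and pushed signals can be written down as a small lookup. This is only an illustrative sketch: the `Signal` enum and `collection_mode` function are hypothetical names, not part of any GridNMS API.

```python
from enum import Enum


class Signal(Enum):
    """The four signal types from the table above (names are illustrative)."""
    REACHABILITY = "reachability"
    METRICS = "metrics"
    SNMP_TRAP = "snmp_trap"
    SYSLOG = "syslog"


# Signals that GridNMS goes and checks; everything else is sent by the device.
POLLED = {Signal.REACHABILITY, Signal.METRICS}


def collection_mode(signal: Signal) -> str:
    """Return "polled" if GridNMS checks it, "pushed" if the device sends it."""
    return "polled" if signal in POLLED else "pushed"
```

For example, `collection_mode(Signal.SYSLOG)` returns `"pushed"`, because syslog messages arrive at the collector without being requested.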
From signal to event
Not every signal is worth your attention, so GridNMS turns the meaningful ones into events, each with a severity from Critical down to Info. Events are created when, for example:
- A device becomes unreachable, or comes back.
- A metric crosses a threshold you set (like an interface running hot).
- An SNMP trap arrives (a link going down, a cold start, an auth failure).
- A log detection matches something you’ve told GridNMS to watch for.
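To make the first two cases concrete, here is a minimal sketch of how a signal might become an event. It is not GridNMS code: the `Event` class, both check functions, and the severity levels between Info and Critical are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    # Only Critical and Info are named in the text; the middle levels are assumed.
    INFO = 0
    WARNING = 1
    MAJOR = 2
    CRITICAL = 3


@dataclass
class Event:
    device: str
    source: str        # e.g. "reachability" or "threshold"
    message: str
    severity: Severity


def check_threshold(device: str, metric: str, value: float, limit: float,
                    severity: Severity = Severity.WARNING) -> Optional[Event]:
    """Return an event only when the metric is above the limit."""
    if value <= limit:
        return None
    return Event(device, "threshold", f"{metric} at {value} exceeds {limit}", severity)


def check_reachability(device: str, was_up: bool, is_up: bool) -> Optional[Event]:
    """Return an event only when the state changes, in either direction."""
    if was_up == is_up:
        return None
    if is_up:
        return Event(device, "reachability", "device is reachable again", Severity.INFO)
    return Event(device, "reachability", "device is unreachable", Severity.CRITICAL)
```

The point of the sketch is that most readings produce nothing: a metric under its limit, or a device whose state has not changed, returns `None` and never reaches the event feed.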
Cleaning up the noise
Before an event reaches you, transformation rules (set by admins) can automatically re-tag it, change its severity, suppress it entirely, or close a related open event. This keeps the feed focused on what matters. And during planned work, maintenance windows suppress alerts for specific devices so you aren’t paged for expected outages. Both are covered in Events & Alerts.
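One way to picture this stage is as a chain of rules followed by a maintenance check. The sketch below is an assumption about the shape of that logic, not GridNMS's implementation: `Rule`, `MaintenanceWindow`, `process`, and the two sample rules are hypothetical, and closing a related open event is not modelled.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Event:
    device: str
    message: str
    severity: str
    tags: Set[str] = field(default_factory=set)


@dataclass
class MaintenanceWindow:
    devices: Set[str]
    start: datetime
    end: datetime

    def covers(self, device: str, when: datetime) -> bool:
        return device in self.devices and self.start <= when < self.end


# A rule returns the (possibly modified) event, or None to suppress it.
Rule = Callable[[Event], Optional[Event]]


def downgrade_flaps(event: Event) -> Optional[Event]:
    """Example rule: re-tag flapping events and lower their severity."""
    if "flap" in event.message:
        event.severity = "Info"
        event.tags.add("flapping")
    return event


def drop_lab_devices(event: Event) -> Optional[Event]:
    """Example rule: suppress events from devices with a hypothetical lab- prefix."""
    return None if event.device.startswith("lab-") else event


def process(event: Event, rules: List[Rule], windows: List[MaintenanceWindow],
            now: datetime) -> Tuple[Optional[Event], bool]:
    """Run rules in order, then decide whether the surviving event may alert."""
    current: Optional[Event] = event
    for rule in rules:
        current = rule(current)
        if current is None:
            return None, False  # suppressed entirely
    in_maintenance = any(w.covers(current.device, now) for w in windows)
    return current, not in_maintenance
```

In this sketch a maintenance window blocks only the alert and the event itself survives, which matches the description above; whether GridNMS records events raised during a window is not stated here.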
From event to alert
Every event comes from a detection (or an interface threshold). Notifications live right on that detection: turn on Notify on match and the events it raises are delivered to your chosen notification endpoints (email, Slack, a webhook, or PagerDuty), to you directly, or both. There’s no separate subscriptions list; the rule that makes the event is also what decides who’s told. See Notifications.
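The idea that the detection itself carries the notification settings can be sketched as follows. The `Detection` class, its fields, and the substring matching are all assumptions for illustration; real detections are configured in the GridNMS UI and may match differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Detection:
    name: str
    pattern: str                      # substring to look for in a log line (assumed)
    notify_on_match: bool = False     # mirrors the "Notify on match" toggle
    endpoints: List[str] = field(default_factory=list)  # e.g. "email", "slack"


def deliveries(detection: Detection, log_line: str) -> List[str]:
    """Return the endpoints to notify for this log line (empty if none)."""
    if detection.pattern not in log_line:
        return []   # no match, so no event is raised
    if not detection.notify_on_match:
        return []   # an event is raised, but nobody is notified
    return list(detection.endpoints)
```

There is no lookup against a separate list of subscribers anywhere in the sketch: who gets told is decided entirely by fields on the detection.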
Where your history lives
GridNMS keeps recent, active events front-and-centre for fast browsing, and rolls older events into long-term history you can search any time on the Event History view. There you can filter by device, time range, severity, or text, and see a severity histogram for the period.
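The filters and the histogram amount to a simple query over stored events. The sketch below shows that logic over an in-memory list; `HistoricEvent`, `search`, and `severity_histogram` are hypothetical names, and it says nothing about how GridNMS actually stores or indexes history.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class HistoricEvent:
    device: str
    severity: str
    message: str
    at: datetime


def search(events: Iterable[HistoricEvent],
           device: Optional[str] = None,
           start: Optional[datetime] = None,
           end: Optional[datetime] = None,
           severity: Optional[str] = None,
           text: Optional[str] = None) -> List[HistoricEvent]:
    """Keep events that satisfy every filter that was supplied."""
    matches = []
    for e in events:
        if device is not None and e.device != device:
            continue
        if start is not None and e.at < start:
            continue
        if end is not None and e.at >= end:
            continue
        if severity is not None and e.severity != severity:
            continue
        if text is not None and text.lower() not in e.message.lower():
            continue
        matches.append(e)
    return matches


def severity_histogram(events: Iterable[HistoricEvent]) -> Counter:
    """Count events per severity for the selected period."""
    return Counter(e.severity for e in events)
```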
What stays where (self-hosted)
In a self-hosted deployment, your devices, events, metrics, and logs all live on your own infrastructure. Your instance only reaches out to GridNMS for licensing and updates, and not even that in an air-gapped setup.