
Poller Queue

When GridNMS needs to fetch fresh data from a device on demand — an interface refresh, a metric pull, a one-off check you triggered yourself — that request doesn’t run instantly. It joins a queue, gets handed to the collector responsible for that device, runs, and reports back. The Poller Queue is your window into that queue: what’s waiting, who’s handling it, and whether it’s flowing or piling up.

Open it from Configure → Event Management → Poller Queue.

Figure: The Poller Queue lists pending device polls, the collector handling each, and the outcome.

Each row is one poll request. The columns tell you the whole story of that request:

Column          What it tells you
Status          Where the request is in its life: waiting, running, finished, or failed.
Device          Which device the poll targets.
Action          What’s being fetched, for example a metric refresh or an interface poll.
Collector       Which collector is assigned to run it. Blank means none is assigned yet.
Queued By       Who or what asked for it: a user, or system for automated requests.
Queued          How long ago the request was added.
Completed       How long ago it finished (blank while still pending or running).
Result / Error  The outcome: a short result on success, or the error message on failure.

Above the table, summary chips show a running count for each status, and filter buttons let you narrow the list. There’s also a collector dropdown to focus on a single collector’s work.

Every request moves through these states:

  1. Pending — accepted and waiting to be sent to a collector. A brief stop here is normal.
  2. Running — handed to the collector and currently being carried out.
  3. Done — completed successfully. The result appears in the last column.
  4. Error — the collector tried but couldn’t complete it; the reason is shown in the last column.
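
If it helps to think of the four states as a small state machine, the list above can be sketched as follows. This is only an illustration: the names PollStatus, ALLOWED, can_transition, and status_counts are hypothetical, and the transition table is an assumption read off the list, not something GridNMS exposes. In particular, the sketch does not model whether a request can fail without ever running.

```python
from collections import Counter
from enum import Enum


class PollStatus(Enum):
    PENDING = "Pending"  # accepted, waiting to be sent to a collector
    RUNNING = "Running"  # handed to the collector, in progress
    DONE = "Done"        # completed successfully
    ERROR = "Error"      # the collector could not complete it


# Assumed forward-only transitions, taken from the order of the list.
ALLOWED = {
    PollStatus.PENDING: {PollStatus.RUNNING},
    PollStatus.RUNNING: {PollStatus.DONE, PollStatus.ERROR},
    PollStatus.DONE: set(),
    PollStatus.ERROR: set(),
}


def can_transition(current: PollStatus, new: PollStatus) -> bool:
    """Return True if moving from `current` to `new` follows the assumed flow."""
    return new in ALLOWED[current]


def status_counts(statuses):
    """Count requests per status, similar in spirit to the summary chips."""
    counts = Counter(statuses)
    return {status.value: counts.get(status, 0) for status in PollStatus}
```

For example, status_counts applied to two pending requests and one failed request reports 2 for Pending, 1 for Error, and 0 for the other two states.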

The page refreshes itself every few seconds while anything is pending or running, so you can watch items move through. Finished and failed items stay visible for a day and then clear automatically. You can also remove an individual item with the trash icon at the end of its row.
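
The refresh and cleanup behavior described above can be pictured with a short sketch. Everything here is an assumption made for illustration: rows are modeled as plain dictionaries with hypothetical "status" and "completed" keys, and the one-day window is measured from the completion time, which the page itself does not spell out.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=1)  # "a day" per the text; the exact cutoff is assumed
ACTIVE = {"Pending", "Running"}


def needs_auto_refresh(rows):
    """True while at least one request is still pending or running."""
    return any(row["status"] in ACTIVE for row in rows)


def visible_rows(rows, now):
    """Keep active rows, plus finished or failed rows younger than RETENTION."""
    kept = []
    for row in rows:
        completed = row.get("completed")
        if row["status"] in ACTIVE or completed is None:
            kept.append(row)
        elif now - completed < RETENTION:
            kept.append(row)
    return kept
```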

A healthy queue is one that’s mostly empty because work moves through it quickly:

  • Items spend only a short time in Pending before flipping to Running.
  • Running items finish within seconds and become Done.
  • The pending and running counts stay low — they rise briefly when you trigger a batch of polls, then drain back down.
  • Errors are rare and, when they happen, have a clear cause in the Result / Error column.

In short: the queue should feel like a fast-moving line, not a waiting room.

A growing queue is a signal, not just a nuisance. When pending items pile up and aren’t moving into running, the bottleneck is almost always the collector assigned to handle them. The most common causes:

  • The collector is offline. If a collector has lost its connection, nothing assigned to it can run — its items sit in Pending indefinitely. Filter by that collector to confirm everything is stuck, then check the collector itself.
  • The collector is overloaded. A collector responsible for a very large number of devices, or running on undersized hardware, can fall behind. Items still move, just slowly, and the pending count stays high.
  • No collector is assigned. Rows with a blank Collector column have nowhere to run. This usually points to a device that isn’t covered by any collector’s assigned networks.
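
The three causes above can be told apart with a rough heuristic over the rows you see in the table. The sketch below assumes rows exported as dictionaries with hypothetical "collector" and "status" keys. The function name diagnose_backlog and both thresholds (min_pending and the "more than half" rule) are illustrative choices, not GridNMS behavior.

```python
from collections import defaultdict


def diagnose_backlog(rows, min_pending=3):
    """Map each collector with pending work to a suspected cause.

    A blank collector is reported under the key None.
    """
    by_collector = defaultdict(list)
    for row in rows:
        by_collector[row["collector"] or None].append(row["status"])

    findings = {}
    for collector, statuses in by_collector.items():
        pending = statuses.count("Pending")
        if pending == 0:
            continue
        if collector is None:
            # Blank Collector column: the request has nowhere to run.
            findings[None] = "no collector assigned"
        elif pending < min_pending:
            continue  # a few brief pending items are normal
        elif pending == len(statuses):
            # Nothing running or finished at all for this collector.
            findings[collector] = "possibly offline"
        elif pending > len(statuses) / 2:  # assumed threshold
            # Work is moving, but most of it is still waiting.
            findings[collector] = "possibly overloaded"
    return findings
```

The output is only a starting point for investigation. A collector flagged as possibly offline still needs to be checked directly.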

When you spot a backlog, work it down like this:

  1. Open the collector dropdown and select the collector whose items look stuck. If everything for one collector is pending, that collector is the suspect.
  2. Switch the status filter to Error to see whether items are failing rather than simply waiting — failures point at credentials, reachability, or device problems instead of collector load.
  3. Note the Queued ages. Items queued many minutes ago and still pending confirm the queue isn’t draining.
  4. Head to Monitoring Your Collectors to confirm whether that collector is connected and keeping up.
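
The first three steps amount to filtering the same rows in different ways, which the following sketch mirrors. It assumes rows as dictionaries with hypothetical "collector", "status", and "queued_age" keys (the age as a timedelta). The triage function and the five-minute stale_after default are assumptions chosen for illustration. Step 4 stays a manual check.

```python
from datetime import timedelta


def triage(rows, collector, stale_after=timedelta(minutes=5)):
    """Summarize one collector's queue along the lines of steps 1 to 3."""
    mine = [r for r in rows if r["collector"] == collector]         # step 1
    pending = [r for r in mine if r["status"] == "Pending"]
    errors = [r for r in mine if r["status"] == "Error"]            # step 2
    stale = [r for r in pending if r["queued_age"] >= stale_after]  # step 3
    return {
        "all_pending": bool(mine) and len(pending) == len(mine),
        "error_count": len(errors),
        "stale_pending": len(stale),
        "oldest_pending": max((r["queued_age"] for r in pending), default=None),
    }
```

A result with all_pending set to True and a large oldest_pending value matches the "queue isn't draining" picture. A high error_count with little pending work points at the devices or credentials instead.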

When an item lands in Error, the message in the last column usually tells you what to fix:

  • Timeouts or “unreachable” — the device didn’t respond. Check that it’s actually up and reachable from the collector’s network.
  • Authentication failures: the credentials GridNMS has for that device are wrong or missing. Update the credentials stored for that device.
  • No collector available — nothing is assigned to run the poll; the device may not fall within any collector’s networks.
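
The mapping from error message to next step can be sketched as a simple keyword lookup. The exact wording of GridNMS error messages is not documented here, so the keywords in HINTS are guesses, and suggest_fix is a hypothetical helper rather than a product feature.

```python
# Keyword groups are assumptions; real error text may be worded differently.
HINTS = [
    (("timeout", "timed out", "unreachable"),
     "Check the device is up and reachable from the collector's network."),
    (("auth", "credential"),
     "Update the credentials stored for the device."),
    (("no collector",),
     "Check that the device falls within a collector's assigned networks."),
]


def suggest_fix(error_message):
    """Return a suggested next step, or None if no keyword matches."""
    text = error_message.lower()
    for keywords, hint in HINTS:
        if any(keyword in text for keyword in keywords):
            return hint
    return None
```

Messages that match nothing return None, which is a cue to read the full text in the Result / Error column rather than rely on the lookup.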

A single failure is rarely cause for alarm; a wave of identical failures across many devices points at a shared cause — a network outage, an expired credential, or a collector problem.
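
One way to spot such a wave is to group failed rows by their error message and count how many distinct devices share each one. The sketch below assumes rows as dictionaries with hypothetical "status", "device", and "result" keys. The min_devices threshold of 5 is an arbitrary illustration, not a recommended value.

```python
from collections import defaultdict


def shared_failures(rows, min_devices=5):
    """Return {error message: device count} for errors seen on many devices."""
    devices_by_error = defaultdict(set)
    for row in rows:
        if row["status"] == "Error":
            devices_by_error[row["result"]].add(row["device"])
    return {
        message: len(devices)
        for message, devices in devices_by_error.items()
        if len(devices) >= min_devices
    }
```

Counting distinct devices rather than rows keeps one repeatedly failing device from looking like a widespread problem.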