sonic-exporter

Prometheus exporter for SONiC network switches.

This project collects switch telemetry from SONiC Redis databases and exposes it in Prometheus format. It also enables a curated subset of node_exporter host metrics, so you can monitor switch services and system health in one scrape target.

Why exporters matter

Exporters let you turn platform-specific telemetry into a standard metrics format:

  • Prometheus can scrape data from many systems in one consistent way.
  • You can build shared dashboards and alerts across vendors and platforms.
  • Metrics become queryable with one language (PromQL), which reduces operational friction.
  • You can correlate switch-level signals (interfaces, queues, FDB, LLDP) with host-level signals (CPU, memory, filesystem).

For SONiC environments, this means less custom glue code and faster troubleshooting from a single monitoring stack.

What this project is for

sonic-exporter is focused on production-friendly SONiC observability:

  • Reads from SONiC Redis and selected local read-only sources.
  • Keeps scrape latency stable with cached refresh loops per collector.
  • Enforces guardrails (timeouts, caps, bounded labels) to control cardinality and scrape cost.
  • Keeps experimental collectors opt-in.

Architecture

Runtime flow

flowchart LR
    subgraph SONiC host
        R[(SONiC Redis DBs)]
        F[/Read-only files/]
        C[[Allowlisted commands]]
    end

    subgraph sonic-exporter
        M[cmd/sonic-exporter/main.go]
        COL[Collectors\ninterface, hw, crm, queue, lldp, vlan, lag, fdb\nsystem*, docker*]
        CACHE[(In-memory metric cache)]
        NODE[node_exporter subset\nloadavg,cpu,diskstats,filesystem,meminfo,time,stat]
    end

    P[(Prometheus)]

    R --> COL
    F --> COL
    C --> COL
    COL --> CACHE
    CACHE --> M
    NODE --> M
    M -->|/metrics| P

* Experimental collectors are disabled by default.

Repository structure

sonic-exporter/
├── cmd/sonic-exporter/      # bootstrap, collector registration, HTTP server
├── internal/collector/      # SONiC collectors and collector tests
├── pkg/redis/               # Redis access wrapper
├── fixtures/test/           # test fixtures loaded into miniredis
├── scripts/                 # static build and package helpers
└── .github/workflows/       # CI test and release pipelines

For a deeper breakdown, see docs/architecture.md.

Collectors

| Collector | Purpose | Default |
| --- | --- | --- |
| Interface | Interface operation and traffic metrics | Enabled |
| HW | PSU and fan health metrics | Enabled |
| CRM | Critical resource monitoring | Enabled |
| Queue | Queue counters and watermarks | Enabled |
| LLDP | LLDP neighbors from Redis | Enabled |
| VLAN | VLAN and VLAN member state | Enabled |
| LAG | PortChannel and member state | Enabled |
| FDB | FDB summary from ASIC DB | Disabled (FDB_ENABLED=false) |
| System (experimental) | Switch identity, software metadata, uptime | Disabled (SYSTEM_ENABLED=false) |
| Docker (experimental) | Container runtime metrics from STATE_DB | Disabled (DOCKER_ENABLED=false) |

Collector implementations live in internal/collector/*_collector.go.

Quick start

Run locally

./sonic-exporter
curl localhost:9101/metrics

Run dev environment

docker-compose up --build -d
curl localhost:9101/metrics

Configuration

Core settings

| Variable | Description | Default |
| --- | --- | --- |
| REDIS_ADDRESS | Redis address (host:port for TCP) | localhost:6379 |
| REDIS_PASSWORD | Password for Redis | empty |
| REDIS_NETWORK | Redis network type (tcp or unix) | tcp |

LLDP collector

| Variable | Description | Default |
| --- | --- | --- |
| LLDP_ENABLED | Enable LLDP collector | true |
| LLDP_INCLUDE_MGMT | Include management interfaces like eth0 | true |
| LLDP_REFRESH_INTERVAL | Cache refresh interval | 30s |
| LLDP_TIMEOUT | Timeout for one refresh cycle | 2s |
| LLDP_MAX_NEIGHBORS | Max neighbors exported per refresh | 512 |

VLAN collector

| Variable | Description | Default |
| --- | --- | --- |
| VLAN_ENABLED | Enable VLAN collector | true |
| VLAN_REFRESH_INTERVAL | Cache refresh interval | 30s |
| VLAN_TIMEOUT | Timeout for one refresh cycle | 2s |
| VLAN_MAX_VLANS | Max VLANs exported per refresh | 1024 |
| VLAN_MAX_MEMBERS | Max VLAN members exported per refresh | 8192 |

LAG collector

| Variable | Description | Default |
| --- | --- | --- |
| LAG_ENABLED | Enable LAG collector | true |
| LAG_REFRESH_INTERVAL | Cache refresh interval | 30s |
| LAG_TIMEOUT | Timeout for one refresh cycle | 2s |
| LAG_MAX_LAGS | Max LAGs exported per refresh | 512 |
| LAG_MAX_MEMBERS | Max LAG members exported per refresh | 4096 |

FDB collector

| Variable | Description | Default |
| --- | --- | --- |
| FDB_ENABLED | Enable FDB collector | false |
| FDB_REFRESH_INTERVAL | Cache refresh interval | 60s |
| FDB_TIMEOUT | Timeout for one refresh cycle | 2s |
| FDB_MAX_ENTRIES | Max ASIC FDB entries processed per refresh | 50000 |
| FDB_MAX_PORTS | Max per-port FDB series exported | 1024 |
| FDB_MAX_VLANS | Max per-VLAN FDB series exported | 4096 |
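
The `*_MAX_*` settings bound how many series a collector can emit. One way such a cap could work is sketched below. It is an assumption for illustration only: the exporter's real selection order and its handling of dropped entries are not documented here, and `capSeries` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// capSeries keeps at most max entries from counts, choosing keys in sorted
// order so the result is deterministic. It also reports how many keys were
// dropped, which could feed a "truncated" signal.
func capSeries(counts map[string]int, max int) (map[string]int, int) {
	if max < 0 {
		max = 0
	}
	if len(counts) <= max {
		return counts, 0
	}
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	kept := make(map[string]int, max)
	for _, k := range keys[:max] {
		kept[k] = counts[k]
	}
	return kept, len(keys) - max
}

func main() {
	// Hypothetical per-port FDB entry counts.
	perPort := map[string]int{"Ethernet0": 10, "Ethernet4": 3, "Ethernet8": 7}
	kept, dropped := capSeries(perPort, 2)
	fmt.Println(len(kept), dropped) // prints "2 1"
}
```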

System collector (experimental)

| Variable | Description | Default |
| --- | --- | --- |
| SYSTEM_ENABLED | Enable system collector | false |
| SYSTEM_REFRESH_INTERVAL | Cache refresh interval | 60s |
| SYSTEM_TIMEOUT | Timeout for one refresh cycle | 4s |
| SYSTEM_COMMAND_ENABLED | Enable allowlisted read-only command fallback | true |
| SYSTEM_COMMAND_TIMEOUT | Timeout per command | 2s |
| SYSTEM_COMMAND_MAX_OUTPUT_BYTES | Max bytes read per command | 262144 |
| SYSTEM_VERSION_FILE | SONiC version metadata path | /etc/sonic/sonic_version.yml |
| SYSTEM_MACHINE_CONF_FILE | Machine config path | /host/machine.conf |
| SYSTEM_HOSTNAME_FILE | Hostname path | /etc/hostname |
| SYSTEM_UPTIME_FILE | Uptime path | /proc/uptime |

Enable:

SYSTEM_ENABLED=true ./sonic-exporter

System collector exports:

  • sonic_system_identity_info
  • sonic_system_software_info
  • sonic_system_uptime_seconds
  • sonic_system_collector_success
  • sonic_system_scrape_duration_seconds
  • sonic_system_cache_age_seconds

Data source order:

  1. Redis (DEVICE_METADATA|localhost, CHASSIS_INFO|chassis 1)
  2. Read-only files (/etc/sonic/sonic_version.yml, /host/machine.conf, /etc/hostname, /proc/uptime)
  3. Optional allowlisted command fallback (show platform summary --json, show version, show platform syseeprom)

Docker collector (experimental)

| Variable | Description | Default |
| --- | --- | --- |
| DOCKER_ENABLED | Enable docker collector | false |
| DOCKER_REFRESH_INTERVAL | Cache refresh interval | 60s |
| DOCKER_TIMEOUT | Timeout for one refresh cycle | 2s |
| DOCKER_MAX_CONTAINERS | Max container entries exported per refresh | 128 |
| DOCKER_SOURCE_STALE_THRESHOLD | Source age threshold for stale signal | 5m |

Enable:

DOCKER_ENABLED=true ./sonic-exporter

Docker collector behavior:

  • Reads STATE_DB keys DOCKER_STATS|* and DOCKER_STATS|LastUpdateTime.
  • No Docker socket access.
  • No writes.
  • Controlled label cardinality (container only).

Metrics examples

These are compact anonymized examples. Labels can vary by SONiC platform/version.

sonic_interface_operational_status{device="Ethernet0"} 1
sonic_hw_psu_operational_status{psu="PSU1"} 1
sonic_crm_stats_used{resource="ipv4_route"} 1610
sonic_queue_dropped_packets_total{device="Ethernet0",queue="3"} 73
sonic_lldp_neighbors 64
sonic_vlan_admin_status{vlan="Vlan1000"} 1
sonic_lag_oper_status{lag="PortChannel1"} 1
sonic_fdb_entries 1331
sonic_system_uptime_seconds 123456
sonic_docker_container_cpu_percent{container="swss"} 1.5
node_memory_MemAvailable_bytes 1.24e+10

Validated platforms

These tests were done with SONiC Community releases (not SONiC Enterprise releases).

| Model Number | SONiC Software Version | SONiC OS Version | Distribution | Kernel | Platform | ASIC |
| --- | --- | --- | --- | --- | --- | --- |
| DellEMC-S5232f-C8D48 | 202012 | 10 | Debian 10.13 | 4.19.0-12-2-amd64 | x86_64-dellemc_s5232f_c3538-r0 | broadcom |
| SSE-T7132SR | 202505 | 12 | Debian 12.11 | 6.1.0-29-2-amd64 | x86_64-supermicro_sse_t7132s-r0 | marvell-teralynx |
| MSN2100-CB2FC | 202411 | 12 | Debian 12.12 | 6.1.0-29-2-amd64 | x86_64-mlnx_msn2100-r0 | mellanox |

Development

go test ./...
go build ./...
./scripts/build.sh
./scripts/package.sh
docker-compose up --build -d

Notes:

  • ./scripts/build.sh produces a static Linux binary (CGO_ENABLED=0).
  • If you add keys to Redis fixtures manually, persist them with SAVE in Redis.

Run with systemd

This section shows one way to run sonic-exporter as a Linux service under systemd, with collector toggles set through environment variables.

Note: this systemd setup is not fully tested yet. Validate it in a lab or canary environment before using it in production.

1) Create a dedicated service user

sudo useradd --system --no-create-home --shell /usr/sbin/nologin sonic-exporter

If your distro uses /sbin/nologin, use that path instead.

2) Install the binary

sudo install -m 0755 ./sonic-exporter /usr/local/bin/sonic-exporter

3) Create an environment file

Use an env file so collector toggles and Redis settings are easy to manage without editing the unit file.

sudo install -d -m 0755 /etc/sonic-exporter
sudo tee /etc/sonic-exporter/sonic-exporter.env >/dev/null <<'EOF'
REDIS_ADDRESS=localhost:6379
REDIS_PASSWORD=
REDIS_NETWORK=tcp

LLDP_ENABLED=true
VLAN_ENABLED=true
LAG_ENABLED=true
FDB_ENABLED=false
SYSTEM_ENABLED=false
DOCKER_ENABLED=false
EOF

4) Create the systemd unit

Create /etc/systemd/system/sonic-exporter.service:

[Unit]
Description=SONiC Prometheus Exporter
Documentation=https://github.com/rokernel/sonic-exporter
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=sonic-exporter
Group=sonic-exporter

EnvironmentFile=/etc/sonic-exporter/sonic-exporter.env
ExecStart=/usr/local/bin/sonic-exporter
Restart=on-failure
RestartSec=5s

# Logging
StandardOutput=journal
StandardError=journal

# Hardening (safe defaults for this exporter)
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
LockPersonality=true
MemoryDenyWriteExecute=true
RestrictSUIDSGID=true
RestrictRealtime=true
SystemCallArchitectures=native

# Allow read access to host and SONiC files used by collectors
ReadOnlyPaths=/etc/sonic /host /proc

[Install]
WantedBy=multi-user.target

5) Enable and start

sudo systemctl daemon-reload
sudo systemctl enable --now sonic-exporter
sudo systemctl status sonic-exporter

6) Validate service and metrics

sudo systemctl show sonic-exporter --property=Environment,EnvironmentFiles
curl -s http://127.0.0.1:9101/metrics | head

To confirm a specific collector is enabled/disabled, look for its metric prefix:

  • LLDP: sonic_lldp_
  • VLAN: sonic_vlan_
  • LAG: sonic_lag_
  • FDB: sonic_fdb_
  • System: sonic_system_
  • Docker: sonic_docker_

7) Change collector settings safely

Edit only the env file, then restart:

sudoedit /etc/sonic-exporter/sonic-exporter.env
sudo systemctl restart sonic-exporter

8) Prefer overrides for local customization

If the unit becomes package-managed in the future, do not edit it directly. Use an override:

sudo systemctl edit sonic-exporter

Example override:

[Service]
Environment="FDB_ENABLED=true"
Environment="SYSTEM_ENABLED=true"

Then:

sudo systemctl daemon-reload
sudo systemctl restart sonic-exporter

Testing safely without breaking production

Validate unit syntax first (no restart):

sudo systemd-analyze verify /etc/systemd/system/sonic-exporter.service

Run a canary service on a different port:

  1. Copy the unit to sonic-exporter-canary.service.
  2. Set ExecStart=/usr/local/bin/sonic-exporter --web.listen-address=:19101
  3. Optionally point EnvironmentFile at a separate canary env file.
  4. Start only the canary:
    sudo systemctl daemon-reload
    sudo systemctl start sonic-exporter-canary
    sudo systemctl status sonic-exporter-canary
  5. Verify:
    curl -sf http://127.0.0.1:19101/metrics >/dev/null && echo OK
    journalctl -u sonic-exporter-canary -n 100 --no-pager
  6. Roll back cleanly:
    sudo systemctl stop sonic-exporter-canary
    sudo systemctl disable sonic-exporter-canary
    sudo rm /etc/systemd/system/sonic-exporter-canary.service
    sudo systemctl daemon-reload

Notes

  • SONiC collector toggles are controlled by environment variables, not dedicated CLI flags.
  • Keep SYSTEM_ENABLED and DOCKER_ENABLED off unless you need them.
  • If hardening blocks file access on your distro, relax only the minimum setting and document the reason.

Disclaimer

This project is primarily a learning exercise, and parts of it were developed using AI-assisted workflows (often referred to as "vibe coding") while learning Go.

Before any production deployment, the source code must be reviewed, validated, and approved by qualified human engineers.

Upstream credits and acknowledgments

This project builds on work from upstream open source projects. Thank you to the maintainers and contributors.

If this repository was forked from another sonic-exporter repository in your organization history, add that URL here as well so lineage stays explicit for users.
