Monitoring

bunqueue exposes Prometheus-compatible metrics for production monitoring. This guide covers the built-in metrics endpoint and a ready-to-use Grafana dashboard.

Quick Start with Docker Compose

bunqueue includes a pre-configured monitoring stack:

# Start bunqueue + Prometheus + Grafana
docker compose --profile monitoring up -d

Once the containers are running, open Grafana and Prometheus in a browser on the ports published in the Compose file.

Prometheus Endpoint

bunqueue exposes metrics at /prometheus on the HTTP port (default 6790):

curl http://localhost:6790/prometheus

Available Metrics

| Metric | Type | Description |
| --- | --- | --- |
| bunqueue_jobs_waiting | gauge | Jobs waiting in queue |
| bunqueue_jobs_delayed | gauge | Delayed jobs |
| bunqueue_jobs_active | gauge | Jobs being processed |
| bunqueue_jobs_completed | gauge | Completed jobs in memory |
| bunqueue_jobs_dlq | gauge | Jobs in dead letter queue |
| bunqueue_jobs_pushed_total | counter | Total jobs pushed |
| bunqueue_jobs_pulled_total | counter | Total jobs pulled |
| bunqueue_jobs_completed_total | counter | Total jobs completed |
| bunqueue_jobs_failed_total | counter | Total jobs failed |
| bunqueue_uptime_seconds | gauge | Server uptime |
| bunqueue_cron_jobs_total | gauge | Registered cron jobs |
| bunqueue_workers_total | gauge | Registered workers |
| bunqueue_workers_active | gauge | Active workers |
| bunqueue_workers_processed_total | counter | Jobs processed by workers |
| bunqueue_workers_failed_total | counter | Jobs failed by workers |
| bunqueue_webhooks_total | gauge | Total webhooks |
| bunqueue_webhooks_enabled | gauge | Enabled webhooks |

Example Output

# HELP bunqueue_jobs_waiting Number of jobs waiting in queue
# TYPE bunqueue_jobs_waiting gauge
bunqueue_jobs_waiting 42
# HELP bunqueue_jobs_active Number of jobs being processed
# TYPE bunqueue_jobs_active gauge
bunqueue_jobs_active 8
# HELP bunqueue_jobs_pushed_total Total jobs pushed
# TYPE bunqueue_jobs_pushed_total counter
bunqueue_jobs_pushed_total 150432
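
If you want to read these values from a script rather than from Prometheus, the text format is simple enough to parse by hand. The following is a minimal sketch, assuming the samples carry no labels (as in the example output above); `parse_metrics` is a hypothetical helper, not part of bunqueue.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse unlabelled samples from the Prometheus text exposition format."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line is "<name> <value>", optionally followed by a timestamp.
        name, value, *_ = line.split()
        samples[name] = float(value)
    return samples


SAMPLE = """\
# HELP bunqueue_jobs_waiting Number of jobs waiting in queue
# TYPE bunqueue_jobs_waiting gauge
bunqueue_jobs_waiting 42
# HELP bunqueue_jobs_active Number of jobs being processed
# TYPE bunqueue_jobs_active gauge
bunqueue_jobs_active 8
"""

metrics = parse_metrics(SAMPLE)
```

Metrics with labels (for example, per-queue series) would need a real parser, since label values may contain spaces.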

Prometheus Configuration

Add bunqueue to your prometheus.yml:

scrape_configs:
  - job_name: 'bunqueue'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:6790']
    metrics_path: /prometheus

With authentication:

scrape_configs:
  - job_name: 'bunqueue'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:6790']
    metrics_path: /prometheus
    bearer_token: 'your-auth-token'
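
Prometheus sends a `bearer_token` as an `Authorization: Bearer <token>` header. To reproduce the same request from a script, a sketch like the following may help. It assumes bunqueue accepts the token in that standard header form; `build_metrics_request` is a hypothetical helper name.

```python
import urllib.request
from typing import Optional


def build_metrics_request(base_url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the /prometheus endpoint, optionally authenticated."""
    request = urllib.request.Request(base_url.rstrip("/") + "/prometheus")
    if token:
        # Same header Prometheus sends when bearer_token is configured.
        request.add_header("Authorization", f"Bearer {token}")
    return request


request = build_metrics_request("http://localhost:6790", "your-auth-token")
```

Passing the result to `urllib.request.urlopen(request, timeout=5)` performs the actual scrape.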

Grafana Dashboard

The included dashboard provides:

Overview Row

  • Jobs Waiting, Delayed, Active, Completed, DLQ
  • Active Workers, Cron Jobs, Uptime

Throughput & Performance

  • Job throughput (pushed/pulled/completed/failed per second)
  • Queue depth over time (stacked area chart)
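
The per-second throughput figures are derived from the `_total` counters. The dashboard's exact queries are not reproduced here, but the idea can be sketched as follows: take two samples of a counter and divide the increase by the elapsed time, treating a drop in value as a server restart.

```python
def per_second_rate(previous: float, current: float, elapsed_seconds: float) -> float:
    """Approximate the per-second rate of a counter between two samples."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Counters only go up; a lower value means the process restarted and the
    # counter began again from zero, so the new value is the whole increase.
    increase = current - previous if current >= previous else current
    return increase / elapsed_seconds


# Two hypothetical samples of bunqueue_jobs_pushed_total taken 15 seconds apart.
pushed_rate = per_second_rate(150_000, 150_432, 15)
```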

Success & Failure Analysis

  • Success rate gauge (with thresholds)
  • Failure rate gauge (5-minute window)
  • Completed vs Failed bar chart
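
As a rough illustration of what a failure-rate panel measures, the sketch below assumes the rate is defined as failed jobs divided by all finished jobs (completed plus failed) within the window. That definition is an assumption; check the dashboard JSON for the query actually used.

```python
def failure_rate(completed_delta: float, failed_delta: float) -> float:
    """Fraction of finished jobs that failed within a window (0.0 when idle)."""
    finished = completed_delta + failed_delta
    if finished == 0:
        # No jobs finished in the window, so there is nothing to report.
        return 0.0
    return failed_delta / finished


# Hypothetical increases of the completed/failed counters over 5 minutes.
rate = failure_rate(completed_delta=950, failed_delta=50)
```

The inputs are counter increases over the window, not the lifetime totals, so an old burst of failures does not skew the current figure.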

Workers & Processing

  • Worker count over time
  • Worker throughput (processed/failed per second)
  • Worker utilization gauge

Webhooks & Cron

  • Webhook status
  • Cron job count
  • Lifetime totals

Alerts & Health

  • Visual alert indicators for:
    • DLQ > 100 jobs
    • Failure rate > 5%
    • Queue backlog > 10,000
    • No active workers
    • Server health

Alert Rules

Pre-configured Prometheus alerts in monitoring/alert_rules.yml:

| Alert | Condition | Severity |
| --- | --- | --- |
| BunqueueDLQHigh | DLQ > 100 for 5m | critical |
| BunqueueHighFailureRate | Failure > 5% for 5m | warning |
| BunqueueQueueBacklog | Waiting > 10k for 10m | warning |
| BunqueueNoWorkers | 0 workers + waiting jobs | critical |
| BunqueueServerDown | Server unreachable | critical |
| BunqueueLowThroughput | < 1 job/s for 10m | warning |
| BunqueueWorkerOverload | Utilization > 95% | warning |
| BunqueueJobsStuck | Active jobs, no completions | warning |
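
To make the gauge-based conditions concrete, here is a sketch that checks a single metrics snapshot against four of these thresholds. It makes several assumptions not confirmed by the rules file: "0 workers" is read as `bunqueue_workers_active == 0`, utilization is taken to be active workers divided by registered workers, and the `for` durations are ignored.

```python
def active_alerts(metrics: dict[str, float]) -> list[str]:
    """Return the names of gauge-based alerts whose condition holds right now."""
    waiting = metrics.get("bunqueue_jobs_waiting", 0)
    workers_active = metrics.get("bunqueue_workers_active", 0)
    workers_total = metrics.get("bunqueue_workers_total", 0)

    alerts = []
    if metrics.get("bunqueue_jobs_dlq", 0) > 100:
        alerts.append("BunqueueDLQHigh")
    if waiting > 10_000:
        alerts.append("BunqueueQueueBacklog")
    # Assumption: "0 workers" means no worker is currently active.
    if workers_active == 0 and waiting > 0:
        alerts.append("BunqueueNoWorkers")
    # Assumption: utilization = active workers / registered workers.
    if workers_total > 0 and workers_active / workers_total > 0.95:
        alerts.append("BunqueueWorkerOverload")
    return alerts


snapshot = {
    "bunqueue_jobs_dlq": 150,
    "bunqueue_jobs_waiting": 20,
    "bunqueue_workers_active": 0,
    "bunqueue_workers_total": 4,
}
firing = active_alerts(snapshot)
```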

Example Alert Rule

- alert: BunqueueDLQHigh
  expr: bunqueue_jobs_dlq > 100
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "High number of jobs in DLQ"
    description: "{{ $value }} jobs are in the dead letter queue."
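
The `for: 5m` clause means the alert fires only after the expression has stayed true for five minutes without interruption; a single dip below the threshold resets the timer. The following sketch simulates that behavior over a list of `(timestamp, value)` samples. It is a simplified model for illustration, not Prometheus's implementation.

```python
from typing import Optional


def firing_since(samples: list[tuple[int, float]], threshold: float, hold_seconds: int) -> Optional[int]:
    """Return the timestamp at which the alert would start firing, or None."""
    pending_since: Optional[int] = None
    for timestamp, value in samples:
        if value > threshold:
            if pending_since is None:
                # Condition just became true: the alert is now pending.
                pending_since = timestamp
            if timestamp - pending_since >= hold_seconds:
                return timestamp
        else:
            # Condition no longer holds: the pending timer resets.
            pending_since = None
    return None


# Hypothetical bunqueue_jobs_dlq values sampled once a minute.
values = [120, 130, 90, 110, 115, 120, 125, 130, 140]
samples = [(minute * 60, value) for minute, value in enumerate(values)]
fired_at = firing_since(samples, threshold=100, hold_seconds=300)
```

In this run the dip to 90 at the 120-second mark resets the timer, so the alert fires at 480 seconds rather than 300.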

CLI Metrics

View metrics from the command line:

# JSON format
bunqueue metrics
# Prometheus format
bunqueue metrics --format prometheus
# Server stats
bunqueue stats

Health Endpoints

bunqueue provides Kubernetes-compatible health endpoints:

# Detailed health (includes memory stats)
curl http://localhost:6790/health
# Kubernetes liveness probe
curl http://localhost:6790/healthz
# Kubernetes readiness probe
curl http://localhost:6790/ready
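
Outside Kubernetes, a deploy script or test harness can poll the readiness endpoint before sending traffic. The sketch below assumes `/ready` answers with a 2xx status once the server is ready and a non-2xx status (or no response) otherwise; `http_probe` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if the URL answers with a 2xx status within 2 seconds."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a 4xx/5xx response.
        return False


def wait_until_ready(
    url: str,
    attempts: int = 30,
    delay_seconds: float = 1.0,
    probe: Callable[[str], bool] = http_probe,
) -> bool:
    """Poll the readiness URL until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Calling `wait_until_ready("http://localhost:6790/ready")` blocks for up to roughly 30 seconds with the defaults. The `probe` parameter exists so the polling logic can be tested without a running server.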

Debug Endpoints

For troubleshooting:

# Heap statistics
curl http://localhost:6790/heapstats
# Force garbage collection
curl -X POST http://localhost:6790/gc

File Structure

monitoring/
├── prometheus.yml          # Prometheus config
├── alert_rules.yml         # Alert definitions
└── grafana/
    ├── provisioning/
    │   ├── datasources/    # Auto-configure Prometheus
    │   └── dashboards/     # Auto-load dashboards
    └── dashboards/
        └── bunqueue.json   # Complete dashboard

Custom Dashboards

Import the dashboard JSON directly:

  1. Open Grafana → Dashboards → Import
  2. Upload monitoring/grafana/dashboards/bunqueue.json
  3. Select Prometheus datasource
  4. Click Import

Best Practices

  1. Scrape interval: Use 5-15 seconds for real-time visibility
  2. Retention: Keep 15+ days for trend analysis
  3. Alerts: Start with the included rules, tune thresholds for your workload
  4. Labels: Consider adding custom labels for multi-queue environments
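
Scrape interval and retention together determine how much disk Prometheus needs. A back-of-the-envelope estimate, following the sizing formula in the Prometheus storage documentation (retention time × samples per second × bytes per sample, at roughly 1 to 2 bytes per sample), might look like this. The series count of 17 assumes one series per metric listed above; custom labels and Prometheus's own per-target series would raise it.

```python
def estimate_tsdb_bytes(
    series: int,
    scrape_interval_seconds: int,
    retention_days: int,
    bytes_per_sample: int = 2,
) -> int:
    """Rough upper-bound disk estimate for storing the scraped samples."""
    retention_seconds = retention_days * 24 * 60 * 60
    # Each series contributes one sample per scrape for the whole retention period.
    samples = series * retention_seconds // scrape_interval_seconds
    return samples * bytes_per_sample


# 17 series scraped every 5 seconds and kept for 15 days.
estimate = estimate_tsdb_bytes(series=17, scrape_interval_seconds=5, retention_days=15)
```

Under these assumptions the result is under 10 MB, so for a single bunqueue instance the scrape interval can be chosen for visibility rather than storage cost.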