Slack Alerting

All infrastructure alerts route to two Slack channels based on environment. AlertManager handles Prometheus metric alerts; Icinga2 handles infrastructure health check alerts. Both systems send to the same channels.


Channel Strategy

| Channel | Environments | Sources |
|---|---|---|
| #cwiq-shared-infra-alerts | Shared Services (14 hosts) | AlertManager + Icinga2 master |
| #cwiq-dev-infra-alerts | DEV + Demo (7 hosts) | AlertManager + Icinga2 satellite |

Warning, critical, and resolved notifications for an environment all post to the same channel, color-coded:

| Condition | Color |
|---|---|
| Critical alert firing | Red (danger) |
| Warning alert firing | Orange (warning) |
| Alert resolved | Green (good) |

Alert Routing

Prometheus AlertManager

Environment routing is based on the environment label attached by Alloy agents:

Prometheus fires alert
    |
AlertManager routing tree
    ├── environment=shared          → #cwiq-shared-infra-alerts
    └── environment=development|demo → #cwiq-dev-infra-alerts
          ├── severity=critical  → repeat every 1h
          └── severity=warning   → repeat every 4h
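The routing tree can be read as two lookups: one on the environment label and one on severity. The following is a minimal Python sketch of that logic, not the actual AlertManager configuration. The names `Route`, `route_alert`, and both dictionaries are hypothetical, and the sketch assumes the repeat intervals apply to every environment, although the diagram draws them under the dev branch.

```python
from typing import NamedTuple


class Route(NamedTuple):
    channel: str
    repeat_interval_hours: int


# Channel per environment label, taken from the routing tree.
CHANNEL_BY_ENVIRONMENT = {
    "shared": "#cwiq-shared-infra-alerts",
    "development": "#cwiq-dev-infra-alerts",
    "demo": "#cwiq-dev-infra-alerts",
}

# Repeat interval per severity.
# Assumption: the same intervals are used for all environments.
REPEAT_HOURS_BY_SEVERITY = {"critical": 1, "warning": 4}


def route_alert(labels: dict) -> Route:
    """Return the Slack channel and repeat interval for an alert's labels."""
    environment = labels.get("environment", "")
    severity = labels.get("severity", "")
    if environment not in CHANNEL_BY_ENVIRONMENT:
        raise ValueError(f"no route for environment {environment!r}")
    if severity not in REPEAT_HOURS_BY_SEVERITY:
        raise ValueError(f"no repeat interval for severity {severity!r}")
    return Route(
        CHANNEL_BY_ENVIRONMENT[environment],
        REPEAT_HOURS_BY_SEVERITY[severity],
    )
```

For example, `route_alert({"environment": "demo", "severity": "critical"})` returns the dev channel with a 1-hour repeat interval. An alert without a recognized environment label raises an error here; how the real routing tree handles that case is not stated in this document.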

Icinga2

Icinga2 routes based on the host's vars.environment custom variable:

| vars.environment | Slack Channel |
|---|---|
| shared | #cwiq-shared-infra-alerts |
| dev | #cwiq-dev-infra-alerts |

Domain Separation

The two alerting systems cover complementary concerns:

| Domain | Tool | Examples |
|---|---|---|
| Infrastructure health ("is it alive?") | Icinga2 | SSH connectivity, TCP port open, SSL certificate expiry, Docker container state |
| Metric thresholds ("is it working correctly?") | AlertManager | CPU > 80%, disk filling, HTTP 5xx rate, memory exhaustion |

Do not duplicate alerts across both systems.


AlertManager Message Format

Each Prometheus alert message includes:

| Field | Example |
|---|---|
| Alert name (links to Grafana) | HighDiskUsage |
| Host | prometheus-shared-cwiq-io |
| Mount | /data (disk alerts) or n/a |
| Severity | warning or critical |
| Environment | shared or development |
| Description | Root filesystem at 83% on prometheus-shared-cwiq-io |
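To make the field list and the color coding concrete, here is a hedged Python sketch that assembles a Slack attachment from one alert. It is an illustration only; the real message is rendered by the AlertManager template. The functions `slack_color` and `build_attachment`, the `mountpoint` label name, and the `grafana_url` parameter are assumptions that do not appear in this document.

```python
def slack_color(status: str, severity: str) -> str:
    """Map alert state to the Slack attachment color names used in this document."""
    if status == "resolved":
        return "good"  # green
    return "danger" if severity == "critical" else "warning"  # red / orange


def build_attachment(alert: dict, grafana_url: str) -> dict:
    """Build a Slack attachment carrying the fields listed in the table.

    Assumptions: the alert has "status", "labels" and "annotations" keys,
    and disk alerts carry a "mountpoint" label (hypothetical label name).
    """
    labels = alert["labels"]
    return {
        "color": slack_color(alert["status"], labels["severity"]),
        "title": labels["alertname"],
        "title_link": grafana_url,  # the alert name links to Grafana
        "text": alert["annotations"]["description"],
        "fields": [
            {"title": "Host", "value": labels["host"], "short": True},
            {"title": "Mount", "value": labels.get("mountpoint", "n/a"), "short": True},
            {"title": "Severity", "value": labels["severity"], "short": True},
            {"title": "Environment", "value": labels["environment"], "short": True},
        ],
    }
```

A non-disk alert has no mount label, so the Mount field falls back to `n/a`, matching the table.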

Webhook Management

Slack webhook URLs are stored in Vault:

vault kv put secret/slack/webhooks \
  shared="https://hooks.slack.com/services/..." \
  dev="https://hooks.slack.com/services/..."

The same URLs are set as variables in group_vars/all.yml on the Ansible server:

alertmanager_slack_webhook_shared: "https://hooks.slack.com/services/..."
alertmanager_slack_webhook_dev: "https://hooks.slack.com/services/..."

After updating webhooks, redeploy Prometheus:

cd prometheus
ansible-playbook -i inventory/shared.yml deploy-prometheus.yml
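After rotating a webhook, it can help to confirm that the new URL works on its own before testing the full alert path. The sketch below posts a plain text message to a Slack Incoming Webhook using only the Python standard library. The function names are hypothetical, and the webhook URL is passed in by the caller rather than hard-coded.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a minimal Slack message payload."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code (200 on success)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Usage (sends a real message to the channel behind the webhook):
#   status = send(build_webhook_request(webhook_url, "Webhook check, safe to ignore"))
```

This bypasses AlertManager entirely, so a successful post only shows that the webhook is valid, not that routing is correct.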

Testing Alerting

Test AlertManager → Slack

# Fire a test alert (run on prometheus-shared-cwiq-io)
curl -X POST http://localhost:9093/api/v2/alerts \
  -H 'Content-Type: application/json' \
  -d '[{
    "labels": {
      "alertname": "TestAlert",
      "severity": "warning",
      "environment": "shared"
    },
    "annotations": {
      "summary": "Test alert",
      "description": "Slack integration test — safe to ignore"
    }
  }]'

The alert appears in #cwiq-shared-infra-alerts within 30 seconds. Use "environment": "development" to test #cwiq-dev-infra-alerts.
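The same test can be scripted, which is convenient when checking both channels in one run. This is a rough Python equivalent of the curl command, assuming it runs on the same host with AlertManager listening on localhost:9093. The function names are hypothetical.

```python
import json
import urllib.request

ALERTS_URL = "http://localhost:9093/api/v2/alerts"


def build_test_alert(environment: str, severity: str = "warning") -> list:
    """Return the JSON payload for a single test alert."""
    return [{
        "labels": {
            "alertname": "TestAlert",
            "severity": severity,
            "environment": environment,
        },
        "annotations": {
            "summary": "Test alert",
            "description": "Slack integration test, safe to ignore",
        },
    }]


def post_test_alert(environment: str, severity: str = "warning",
                    url: str = ALERTS_URL) -> int:
    """POST the test alert to AlertManager and return the HTTP status code."""
    body = json.dumps(build_test_alert(environment, severity)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Usage: exercise both channels.
#   for env in ("shared", "development"):
#       post_test_alert(env)
```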

Test Icinga → Slack

Run a forced check on a known host from the IcingaWeb2 UI (https://icinga.shared.cwiq.io) and verify the notification appears in the channel.


Adding Alerting for a New Environment

  1. Create Slack channel #cwiq-{env}-infra-alerts and configure an Incoming Webhook.
  2. Store webhook URL in Vault: vault kv patch secret/slack/webhooks {env}="https://..."
  3. Add AlertManager receiver in prometheus/roles/deploy_prometheus/templates/alertmanager.yml.j2.
  4. Add Icinga notification user in icinga/conf.d/notifications.conf.j2.
  5. Create Alloy inventory file at alloy/inventory/{env}.yml.
  6. Update documentation: SLACK_ALERTING.md, prometheus/docs/ALERTING.md, icinga/README.md, alloy/docs/README.md.

Silencing Alerts

To silence a flapping or known-maintenance alert:

  1. Open https://grafana.shared.cwiq.io → Alerting → Silences
  2. Create a silence matching {alertname="...", host="..."} with a start/end time
  3. Alternatively, use the AlertManager API. For example, to list existing silences:
    # Run on prometheus-shared-cwiq-io
    curl -s http://localhost:9093/api/v2/silences | python3 -m json.tool
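The curl command in step 3 is a GET, so it lists silences rather than creating one. Creating a silence through the API means POSTing a JSON body to the same endpoint. Below is a minimal Python sketch under the assumption that it runs on prometheus-shared-cwiq-io against localhost:9093. The function names, `created_by`, and `comment` values are illustrative, and the `host` matcher follows the label used in step 2.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

SILENCES_URL = "http://localhost:9093/api/v2/silences"


def build_silence(alertname: str, host: str, duration_hours: float,
                  created_by: str, comment: str, now: datetime = None) -> dict:
    """Build a silence payload matching one alert name on one host."""
    start = (now or datetime.now(timezone.utc)).replace(microsecond=0)
    end = start + timedelta(hours=duration_hours)
    return {
        "matchers": [
            {"name": "alertname", "value": alertname, "isRegex": False, "isEqual": True},
            {"name": "host", "value": host, "isRegex": False, "isEqual": True},
        ],
        "startsAt": start.isoformat(),  # RFC 3339 timestamp with UTC offset
        "endsAt": end.isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }


def create_silence(silence: dict, url: str = SILENCES_URL) -> str:
    """POST the silence and return the silenceID reported by AlertManager."""
    request = urllib.request.Request(
        url,
        data=json.dumps(silence).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["silenceID"]
```

Passing an explicit `now` makes the payload reproducible, which is useful when scripting a maintenance window with a known start time.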