Alloy Log & Metric Collection

Grafana Alloy 1.13.2 runs as a native systemd service on every server. It collects Docker container logs, systemd journal logs, and system metrics, then forwards the logs to Loki and the metrics to Prometheus.


Overview

| Property | Value |
| --- | --- |
| Version | 1.13.2 |
| Installation | RPM via rpm.grafana.com |
| Runtime | Native systemd service (NOT Docker) |
| Service name | alloy |
| Config path | /etc/alloy/config.alloy |
| UI port | 12345 (local only) |
| Playbook | alloy/deploy-alloy.yml |
| Covered hosts | 22 servers (15 Shared, 7 DEV) |

Alloy replaced Promtail (EOL March 2026). A single Alloy instance replaces both Promtail (logs) and a separate node exporter (metrics).


What Alloy Collects

| Source | Enabled by Default | Config Variable |
| --- | --- | --- |
| Docker container logs | Yes | alloy_collect_docker_logs: true |
| Systemd journal logs | Yes | alloy_collect_journal_logs: true |
| System metrics (CPU, memory, disk, network) | Yes | alloy_collect_system_metrics: true |
| Application metrics (per-host opt-in) | No | alloy_scrape_app_metrics: true + alloy_app_metrics_targets |

Docker Log Labels

Alloy attaches these labels to every container log stream:

| Label | Source | Example |
| --- | --- | --- |
| host | inventory_hostname | orchestrator-dev-cwiq-io |
| environment | alloy_environment | development |
| job | hardcoded | docker |
| container | Docker container name | orchestrator-server |
| service | com.docker.compose.service label | server |
| compose_project | com.docker.compose.project label | orchestrator |
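
As an illustration of how these labels can be used, the sketch below builds a LogQL stream selector from the host and container labels. The function name `logql_selector` is hypothetical, and the commented `curl` call assumes plain HTTP without authentication, which may not match this deployment.

```shell
# Build a LogQL stream selector from the labels Alloy attaches to Docker logs.
# Usage: logql_selector <host> [container]
# (logql_selector is a hypothetical helper, not part of the playbook.)
logql_selector() {
  if [ -n "${2:-}" ]; then
    printf '{job="docker", host="%s", container="%s"}\n' "$1" "$2"
  else
    printf '{job="docker", host="%s"}\n' "$1"
  fi
}

# Example query through Loki's HTTP API (not run here).
# Assumptions: http:// scheme, no auth, run from inside the Shared VPC.
# curl -G -s "http://loki.shared.cwiq.io:3100/loki/api/v1/query_range" \
#   --data-urlencode "query=$(logql_selector orchestrator-dev-cwiq-io orchestrator-server)" \
#   --data-urlencode "limit=20"
```

The same selector string can be pasted directly into Grafana Explore with the Loki datasource selected.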

System Metrics Collected

| Metric Family | Description |
| --- | --- |
| node_cpu_seconds_total | CPU time per mode (idle, user, system, iowait) |
| node_memory_* | MemTotal, MemAvailable, Buffers, Cached, SwapFree |
| node_filesystem_* | size, available, free, inodes per mountpoint |
| node_disk_* | read/write bytes, I/O time per device |
| node_network_* | receive/transmit bytes, errors, drops per interface |
| node_load1, node_load5, node_load15 | System load averages |
| node_timex_offset_seconds | NTP clock offset (feeds NTPClockDrift alert) |
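
To show how these families are typically queried, here is a small sketch that prints PromQL expressions for one host. The helper names are hypothetical. The `_bytes` metric suffixes follow the usual node_exporter naming and are an assumption, as is the presence of a `host` label on every metric series.

```shell
# Print a PromQL expression for the percentage of memory in use on one host.
# Assumption: the node_memory_* family exposes *_bytes metrics, as node_exporter does.
promql_mem_used_pct() {
  printf '100 * (1 - node_memory_MemAvailable_bytes{host="%s"} / node_memory_MemTotal_bytes{host="%s"})\n' "$1" "$1"
}

# Print a PromQL expression for the NTP clock offset on one host.
promql_clock_offset() {
  printf 'node_timex_offset_seconds{host="%s"}\n' "$1"
}
```

For example, `promql_mem_used_pct orchestrator-dev-cwiq-io` prints an expression that can be pasted into Grafana Explore with the Prometheus datasource selected.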

Data Transport

Cross-VPC: always use Tailscale hostnames from DEV

Alloy agents on DEV servers must use Tailscale hostnames, not FQDNs, to reach the Shared observability stack. The FQDNs resolve to VPC private IPs, which are not routable across VPCs.

| Destination | Shared VPC (FQDN) | DEV VPC (Tailscale hostname) | Port |
| --- | --- | --- | --- |
| Loki | loki.shared.cwiq.io | loki-shared-cwiq-io | 3100 |
| Prometheus | prometheus.shared.cwiq.io | prometheus-shared-cwiq-io | 9009 |

These endpoints are configured in group_vars/all.yml (Shared) and group_vars/dev_servers.yml (DEV) on the ansible server.
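
The hostname rule can be sketched as a lookup keyed on `alloy_environment`. The helpers `obs_host` and `obs_port` are hypothetical and only mirror the table; the real values live in the group_vars files. Treating every environment other than `development` as Shared is an assumption.

```shell
# Print the hostname an agent should use for a destination.
# Usage: obs_host loki|prometheus <alloy_environment>
obs_host() {
  case "$2" in
    development) printf '%s-shared-cwiq-io\n' "$1" ;;  # Tailscale hostname (cross-VPC)
    *)           printf '%s.shared.cwiq.io\n' "$1" ;;  # FQDN (inside the Shared VPC)
  esac
}

# Print the port for a destination, as listed in the table.
obs_port() {
  case "$1" in
    loki)       echo 3100 ;;
    prometheus) echo 9009 ;;
    *)          return 1 ;;
  esac
}

# Reachability check from a DEV host (not run here; http:// is an assumption):
# curl -s -o /dev/null -w '%{http_code}\n' \
#   "http://$(obs_host loki development):$(obs_port loki)/ready"
```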


Monitored Servers

Shared Environment (15 hosts)

| Host | Docker Logs | App Metrics |
| --- | --- | --- |
| prometheus-shared-cwiq-io | Yes | prometheus:9090, alertmanager:9093 |
| loki-shared-cwiq-io | Yes | loki:3100 |
| grafana-shared-cwiq-io | Yes | grafana:3000 |
| gitlab-shared-cwiq-io | Yes | No |
| sonarqube-shared-cwiq-io | Yes | No |
| icinga-shared-cwiq-io | Yes | No |
| ansible-shared-cwiq-io | No | No |
| authentik-shared-cwiq-io-1 | Yes | No |
| authentik-shared-cwiq-io-2 | Yes | No |
| vault-shared-cwiq-io | Yes | No |
| nexus-shared-cwiq-io | Yes | No |
| semaphore-shared-cwiq-io | Yes | No |
| defectdojo-shared-cwiq-io | Yes | No |
| reportportal-shared-cwiq-io | Yes | No |
| openldap-shared-cwiq-io | Yes | No |

ansible-shared-cwiq-io has no Docker installed, so Docker log collection is disabled on it. The playbook reaches this host with ansible_connection: local.

DEV Environment (7 hosts)

| Host | Docker Logs | App Metrics Targets |
| --- | --- | --- |
| orchestrator-dev-cwiq-io | Yes | server:8000, runner-api:8003, audit-api:8007, ai-catalogue-api:8006, monitoring-api:8008, notification-api:8009, iam-api:8004 |
| orchestrator-demo-cwiq-io | Yes | server:8000 |
| icinga-dev-cwiq-io | Yes | No |
| langfuse-dev-cwiq-io | Yes | No |
| datastorea-dev-cwiq-io | Yes | No |
| datastoreb-dev-cwiq-io | Yes | No |
| identity-db-dev-cwiq-io | Yes | No |

Deployment

Deploy to a Single Host

ssh ansible@ansible-shared-cwiq-io
ansible-helper
cd alloy

ansible-playbook -i inventory/dev.yml deploy-alloy.yml \
  --limit orchestrator-dev-cwiq-io

Deploy to All Hosts in an Environment

# All DEV hosts
ansible-playbook -i inventory/dev.yml deploy-alloy.yml

# All Shared hosts
ansible-playbook -i inventory/shared.yml deploy-alloy.yml

The playbook installs the Alloy RPM, adds the alloy user to the docker and systemd-journal groups, templates /etc/alloy/config.alloy, and enables the systemd service.
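
One way to confirm the group changes after a run is to compare the output of `id -nG alloy` against the two required groups. The helper below is a hypothetical sketch, not part of the playbook.

```shell
# Return 0 if every required group appears in a space-separated group list;
# otherwise print the first missing group and return 1.
# Usage: has_groups "<space-separated groups>" <group>...
has_groups() {
  have=" $1 "
  shift
  for g in "$@"; do
    case "$have" in
      *" $g "*) ;;                                  # group present, keep checking
      *) echo "missing group: $g"; return 1 ;;
    esac
  done
  return 0
}

# On a target host (not run here):
# has_groups "$(id -nG alloy)" docker systemd-journal && systemctl is-enabled alloy
```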


Adding a New Host

  1. Add the host entry to inventory/dev.yml or inventory/shared.yml:

    <hostname>-cwiq-io:
      ansible_host: <hostname>-cwiq-io
      ansible_user: ec2-user
      alloy_environment: development
      alloy_scrape_app_metrics: false
      alloy_app_metrics_targets: []
    

  2. Update alloy/docs/README.md: add a row to the Monitored Servers table.

  3. Deploy:

    ansible-playbook -i inventory/dev.yml deploy-alloy.yml \
      --limit <hostname>-cwiq-io
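
To avoid typos in step 1, the inventory stanza can be generated rather than typed by hand. This is a minimal sketch; `alloy_host_entry` is a hypothetical helper, and it assumes the new host follows the `<hostname>-cwiq-io` naming pattern and starts with app metrics disabled.

```shell
# Print an inventory stanza for a new host, using the fields shown in step 1.
# Usage: alloy_host_entry <hostname> <alloy_environment>
alloy_host_entry() {
  cat <<EOF
    ${1}-cwiq-io:
      ansible_host: ${1}-cwiq-io
      ansible_user: ec2-user
      alloy_environment: ${2}
      alloy_scrape_app_metrics: false
      alloy_app_metrics_targets: []
EOF
}
```

The printed stanza still has to be pasted under the correct group in inventory/dev.yml or inventory/shared.yml by hand.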
    


Verification

# Check systemd service
ssh ec2-user@<hostname>-cwiq-io "sudo systemctl status alloy"

# Check for connection errors
ssh ec2-user@<hostname>-cwiq-io "sudo journalctl -u alloy -n 50 --no-pager"

# Verify logs in Grafana Explore (Loki datasource)
# {host="<hostname>-cwiq-io"}

# Verify metrics in Prometheus
# up{host="<hostname>-cwiq-io"}

Logs should appear within 30 seconds of Alloy starting. If no data arrives after 2 minutes, check the Alloy journal for HTTP 429 (rate limit) responses, authentication errors, or S3 errors.
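
The journal check can be roughed out as a filter that counts lines in each of the three failure classes. The helper name and the match patterns are assumptions, not the exact messages Alloy emits, so adjust them to what the journal on your hosts actually shows. A bare `429` can also match unrelated digits such as fractional timestamps.

```shell
# Read journal lines on stdin and count lines matching each failure class.
# The patterns are illustrative assumptions, not Alloy's exact log format.
triage_alloy_journal() {
  awk '
    /429/                     { rate++ }   # rate limiting
    /401|403|[Uu]nauthorized/ { auth++ }   # authentication failures
    /S3/                      { s3++ }     # object storage errors
    END { printf "rate_limit=%d auth=%d s3=%d\n", rate, auth, s3 }
  '
}

# On a target host (not run here):
# sudo journalctl -u alloy -n 200 --no-pager | triage_alloy_journal
```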