SSH Access Patterns

All server SSH access goes through Tailscale. There are no public IPs, no bastion hosts, and no per-instance key management — connect directly using the Tailscale hostname. The SSH user varies by server and purpose.


SSH User Reference

| Server | Tailscale Hostname | Application User | Infrastructure User | Notes |
|--------|--------------------|------------------|---------------------|-------|
| Orchestrator DEV | orchestrator-dev-cwiq-io | cwiq | ec2-user | cwiq for app/logs; ec2-user for Ansible |
| LangFuse DEV | langfuse-dev-cwiq-io | | ec2-user | No application user |
| Demo | orchestrator-demo-cwiq-io | cwiq | ec2-user | cwiq for app; Ansible deploys only |
| Identity DB DEV | identity-db-dev-cwiq-io | | ec2-user | No application user |
| Ansible Server | ansible-shared-cwiq-io | ansible | ansible | Use ansible for all operations |
| GitLab | gitlab-shared-cwiq-io | | ec2-user | App managed by Docker |
| Vault | vault-shared-cwiq-io | | ec2-user | App managed by Docker |
| Nexus | nexus-shared-cwiq-io | | ec2-user | App managed by Docker |
| SonarQube | sonarqube-shared-cwiq-io | | ec2-user | App managed by Docker |
| DefectDojo | defectdojo-shared-cwiq-io | | ec2-user | App managed by Docker |
| Prometheus | prometheus-shared-cwiq-io | observability | ec2-user | observability for Docker stack |
| Loki | loki-shared-cwiq-io | observability | ec2-user | observability for Docker stack |
| Grafana | grafana-shared-cwiq-io | observability | ec2-user | observability for Docker stack |
| Icinga Master | icinga-shared-cwiq-io | icinga | ec2-user | icinga for Docker stack |
| Icinga DEV Satellite | icinga-dev-cwiq-io | icinga | ec2-user | Satellite only (no web UI) |
| Tailscale Router (Shared) | tailscale-shared | | ec2-user | Subnet router only |
| Tailscale Router (DEV) | tailscale-dev | | ec2-user | Subnet router only |

Quick Connect

# Application servers
ssh cwiq@orchestrator-dev-cwiq-io
ssh cwiq@orchestrator-demo-cwiq-io

# Infrastructure
ssh ansible@ansible-shared-cwiq-io
ssh ec2-user@identity-db-dev-cwiq-io
ssh ec2-user@langfuse-dev-cwiq-io

# Observability stack
ssh observability@prometheus-shared-cwiq-io
ssh observability@loki-shared-cwiq-io
ssh observability@grafana-shared-cwiq-io

# Shared services (ec2-user for direct access; apps run under Docker)
ssh ec2-user@gitlab-shared-cwiq-io
ssh ec2-user@vault-shared-cwiq-io
ssh ec2-user@nexus-shared-cwiq-io
ssh ec2-user@sonarqube-shared-cwiq-io
ssh ec2-user@defectdojo-shared-cwiq-io

# Icinga
ssh ec2-user@icinga-shared-cwiq-io
ssh ec2-user@icinga-dev-cwiq-io

Tailscale required

SSH connections use Tailscale MagicDNS hostnames. Your laptop must be connected to the CWIQ.IO tailnet before SSHing to any server.
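A quick pre-flight check can save a confusing SSH timeout. The helper name `require_tailnet` below is hypothetical, but `tailscale status` is the standard CLI call and exits non-zero when the client is stopped or logged out:

```shell
# Hypothetical pre-flight helper: confirm the Tailscale client is up
# before attempting SSH to any *-cwiq-io hostname.
require_tailnet() {
  if ! command -v tailscale >/dev/null 2>&1; then
    echo "tailscale CLI not found -- install Tailscale first" >&2
    return 1
  fi
  # `tailscale status` exits non-zero when disconnected or logged out
  if ! tailscale status >/dev/null 2>&1; then
    echo "not connected to the tailnet -- run 'tailscale up'" >&2
    return 1
  fi
  echo "tailnet OK"
}

# Usage: require_tailnet && ssh cwiq@orchestrator-dev-cwiq-io
```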


Ansible Server SSH Convention

Always use ansible-helper, never raw ansible-playbook

The ansible-helper shell function is the only correct way to run Ansible on the ansible server. It activates the virtual environment and loads Vault credentials. Running ansible-playbook directly will fail silently or use stale credentials.

# Connect to ansible server
ssh ansible@ansible-shared-cwiq-io

# Activate environment and Vault credentials
ansible-helper
# This function:
#   1. cd /data/ansible/cwiq-ansible-playbooks
#   2. source .venv/bin/activate
#   3. vault login → exports ROLE_ID and SECRET_ID

# Run any playbook from /data/ansible/cwiq-ansible-playbooks/
git pull
ansible-playbook cwiq-orchestrator/playbooks/deploy-infrastructure.yml -v
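For reference, a minimal sketch of what an ansible-helper-style function does, based on the three steps above. The repo path and exported variable names come from this page; the Vault AppRole path and field names are assumptions, and the sketch is named `ansible_helper` (underscore) for POSIX-shell portability. The real function on the ansible server is authoritative:

```shell
# Illustrative sketch only -- the real ansible-helper on the ansible
# server is the source of truth.
ansible_helper() {
  repo=/data/ansible/cwiq-ansible-playbooks
  if [ ! -d "$repo" ]; then
    echo "playbook repo not found: $repo" >&2
    return 1
  fi
  # Steps 1-2: enter the repo and activate its virtual environment
  cd "$repo" || return 1
  . .venv/bin/activate
  # Step 3: authenticate to Vault and export AppRole credentials
  # (the approle path and field names below are assumptions)
  ROLE_ID=$(vault read -field=role_id auth/approle/role/ansible/role-id) || return 1
  SECRET_ID=$(vault write -f -field=secret_id auth/approle/role/ansible/secret-id) || return 1
  export ROLE_ID SECRET_ID
}
```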

Never checkout a feature branch on the ansible server

The ansible server MUST always remain on the main branch. Develop and test playbook changes locally, merge to main via MR, then git pull on the ansible server. See Ansible Conventions for the full procedure.


Icinga check_by_ssh Pattern

Icinga uses SSH to run check scripts on monitored hosts. This is required for:

  • Docker container checks — querying container running state and restart counts
  • Internal API health checks — curling microservice ports (8003–8009) that are not reachable via Tailscale ACLs

How It Works

Icinga Satellite (icinga-dev-cwiq-io)
  └── check_by_ssh  →  SSH port 22  →  target host (e.g., orchestrator-dev-cwiq-io)
        └── /usr/local/lib/nagios/plugins/check_docker_container.sh -n <container>
              └── docker inspect → running state, restart count → Nagios exit code
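The deployed `check_docker_container.sh` is the authoritative version; as a sketch of the pattern, a docker-inspect-based Nagios check looks roughly like this (flag handling and the default threshold are assumptions):

```shell
#!/bin/sh
# Illustrative sketch of a docker-inspect-based Nagios check.
# Nagios exit codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.

# Pure mapping from inspect results to a Nagios exit code (testable in isolation)
container_status() {
  running=$1 restarts=$2 threshold=$3
  if [ "$running" != "true" ]; then
    echo "CRITICAL: container not running"; return 2
  fi
  if [ "$restarts" -ge "$threshold" ]; then
    echo "WARNING: $restarts restarts (threshold $threshold)"; return 1
  fi
  echo "OK: running, $restarts restarts"; return 0
}

# Only query Docker when invoked as: check_docker_container.sh -n <container>
if [ "${1:-}" = "-n" ] && [ -n "${2:-}" ]; then
  running=$(docker inspect -f '{{.State.Running}}' "$2" 2>/dev/null) || {
    echo "UNKNOWN: cannot inspect $2"; exit 3; }
  restarts=$(docker inspect -f '{{.RestartCount}}' "$2")
  container_status "$running" "$restarts" "${3:-10}"
  exit $?
fi
```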

For API health checks on restricted ports:

Icinga Satellite
  └── check_by_ssh  →  SSH port 22  →  orchestrator-dev-cwiq-io
        └── /usr/local/lib/nagios/plugins/check_http_health.sh
              └── curl http://localhost:8009/api/health → Nagios exit code
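Likewise, the deployed `check_http_health.sh` is authoritative; a sketch of the curl-to-exit-code pattern, with the argument order and defaults here being assumptions:

```shell
#!/bin/sh
# Illustrative sketch of a curl-based Nagios health check.
# 0=OK, 1=WARNING (healthy but slow), 2=CRITICAL (bad status), 3=UNKNOWN.

# Pure mapping from (status, expected, elapsed seconds, warn threshold)
health_status() {
  status=$1 expected=$2 elapsed=$3 warn=$4
  if [ "$status" != "$expected" ]; then
    echo "CRITICAL: got HTTP $status, expected $expected"; return 2
  fi
  # POSIX [ only compares integers, so compare fractional seconds via awk
  if awk "BEGIN { exit !($elapsed >= $warn) }"; then
    echo "WARNING: healthy but slow (${elapsed}s >= ${warn}s)"; return 1
  fi
  echo "OK: HTTP $status in ${elapsed}s"; return 0
}

# Only curl when invoked as: check_http_health.sh <url> [expected] [warn] [timeout]
if [ -n "${1:-}" ]; then
  url=$1 expected=${2:-200} warn=${3:-5} timeout=${4:-10}
  result=$(curl -s -o /dev/null -w '%{http_code} %{time_total}' -m "$timeout" "$url") || {
    echo "CRITICAL: curl failed for $url"; exit 2; }
  health_status "${result%% *}" "$expected" "${result#* }" "$warn"
  exit $?
fi
```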

Icinga Container User (UID 5665)

SSH keys must be owned by UID 5665, not the host icinga user

The Icinga2 Docker container runs as UID 5665 internally. SSH keys are mounted into the container from the host filesystem. The host icinga user is typically UID 1001 — if SSH keys are owned by 1001:1001, the container cannot read them and check_by_ssh silently fails.

The setup-satellite-ssh.yml Ansible playbook handles this automatically by setting ownership to 5665:5665 after key generation.

SSH key mount path: /var/lib/icinga2/.ssh/ (inside the container, mapped to a host volume). The legacy path /var/lib/nagios/.ssh/ is not used — using this path is a common misconfiguration that causes all check_by_ssh calls to fail with permission errors.
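A small diagnostic sketch for this pitfall. The helper name `check_key_owner` is hypothetical; `stat -c %u` is the standard way to read an owner UID, and the usage comment assumes the container is named `icinga2`:

```shell
# Diagnostic sketch: compare the UID owning the mounted key directory
# against the container's internal icinga UID (5665).
check_key_owner() {
  # $1: owner UID as reported by `stat -c %u`
  if [ "$1" -eq 5665 ]; then
    echo "OK: keys owned by container UID 5665"
  else
    echo "MISMATCH: owned by UID $1, expected 5665 -- check_by_ssh will fail"
    return 1
  fi
}

# Usage on the satellite host (container name `icinga2` assumed):
#   uid=$(sudo -u icinga docker exec icinga2 stat -c %u /var/lib/icinga2/.ssh)
#   check_key_owner "$uid"
```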

SSH ACL Requirements

The Tailscale ACL must permit SSH from the Icinga node to the target host. The relevant rule allows the Icinga satellite (tagged tag:cwiq-ss-account or tag:cwiq-dev-account) to reach target hosts on port 22.
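In Tailscale policy-file terms (HuJSON, which permits comments), the rule described above looks roughly like this. This is a hypothetical excerpt built from the tags named on this page; the tailnet's real policy file is authoritative and the exact src/dst pairing may differ:

```json
{
  "acls": [
    // Allow the Icinga satellite tags to reach monitored hosts on SSH
    {
      "action": "accept",
      "src":    ["tag:cwiq-ss-account", "tag:cwiq-dev-account"],
      "dst":    ["tag:cwiq-dev-account:22"]
    }
  ]
}
```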

When adding a new server that requires Docker or API health checks:

  1. Ensure the server has tag:cwiq-dev-account (or equivalent) in its Tailscale tags
  2. Verify the ACL permits SSH from the Icinga satellite tag to the new server tag
  3. Deploy the check scripts to the new host:
    # From the ansible server
    ansible-playbook -i inventory/docker-hosts.yml deploy-check-scripts.yml --limit <hostname>
    
  4. Set up SSH keys for the satellite:
    ansible-playbook -i inventory/dev.yml setup-satellite-ssh.yml
    
  5. Restart the Icinga2 container on the satellite to pick up the updated SSH key volume:
    ssh ec2-user@icinga-dev-cwiq-io
    sudo -u icinga docker restart icinga2
    

Host Config: Docker Container Checks

// conf.d/hosts/dev/orchestrator-dev.conf (excerpt)
vars.docker_containers["Docker: orchestrator-server"] = {
  container_name     = "orchestrator-server"
  expect_healthcheck = true
  restart_threshold  = "10"
}

Keys must be prefixed with "Docker: " to avoid name collisions with vars.tcp_ports entries.

Host Config: Internal API Health Checks

// conf.d/hosts/dev/orchestrator-dev.conf (excerpt)
vars.ssh_health_checks["Runner API Health"] = {
  health_url      = "http://localhost:8003/api/health"
  expected_status = "200"
  timeout         = "10"
  warn_seconds    = "5"
  ssh_user        = "cwiq"
}

vars.ssh_health_checks["IAM API Health"] = {
  health_url      = "http://localhost:8004/api/health"
  ssh_user        = "cwiq"
}

The apply rule in conf.d/services/common-services.conf iterates both dictionaries and creates one Icinga service per entry automatically.
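The rule in conf.d/services/common-services.conf is the real one; as a sketch of the Icinga2 apply-for pattern it uses, iteration over the docker_containers dictionary looks roughly like this (the check command choice and attribute mapping are assumptions):

```
// Hypothetical sketch of an apply-for rule -- see common-services.conf for the real one
apply Service for (display_name => cfg in host.vars.docker_containers) {
  import "generic-service"
  check_command = "by_ssh"
  // Run the deployed plugin on the target host over SSH
  vars.by_ssh_command = [
    "/usr/local/lib/nagios/plugins/check_docker_container.sh",
    "-n", cfg.container_name
  ]
}
```

One Service object is created per dictionary key, which is why the "Docker: " prefix on keys doubles as the service display name.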


EKS Pods Cannot Use SSH via Tailscale

EKS runner pods have no Tailscale connectivity

GitLab CI/CD runner pods run on cwiq-dev-eks-cluster using the VPC CNI plugin. Pods receive real VPC IPs (10.1.34.x / 10.1.35.x) but have no Tailscale client. Any job that SSHes to a server (e.g., deploy-dev) must use the VPC private IP, not the Tailscale hostname.

# Correct — deploy-dev CI/CD variable (set at GitLab group level)
DEV_SERVER_IP: "10.1.35.46"  # VPC private IP of orchestrator-dev-cwiq-io

# WRONG — these do not resolve from EKS pods
# DEV_SERVER_HOST: "orchestrator-dev-cwiq-io"     (Tailscale hostname)
# DEV_SERVER_IP: "100.122.206.4"                   (Tailscale IP)

The dev server's security group must allow SSH (port 22) ingress from the EKS cluster security group for deploy-dev jobs to succeed.
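A sketch of how a deploy job consumes the variable. The job name `deploy-dev` comes from this page; the image, the `DEPLOY_SSH_KEY` variable, and the remote command are assumptions. The point is that the job targets `$DEV_SERVER_IP`, never a Tailscale hostname:

```yaml
# Hypothetical deploy-dev job sketch -- the real pipeline definition is authoritative.
deploy-dev:
  stage: deploy
  image: alpine:latest
  before_script:
    - apk add --no-cache openssh-client
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
    # DEPLOY_SSH_KEY is a hypothetical CI/CD variable holding the private key
    - echo "$DEPLOY_SSH_KEY" > ~/.ssh/id_ed25519 && chmod 600 ~/.ssh/id_ed25519
  script:
    # Must use the VPC private IP -- Tailscale names do not resolve in EKS pods
    - ssh -o StrictHostKeyChecking=accept-new cwiq@"$DEV_SERVER_IP" "./deploy.sh"
```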