EKS Cluster¶
cwiq-dev-eks-cluster is the Kubernetes cluster hosting all GitLab CI/CD runners. It replaced the legacy Fleeting/EC2 Autoscaling runners on 2026-02-26 and is now the sole active runner system.
Cluster Reference¶
| Attribute | Value |
|---|---|
| Cluster Name | cwiq-dev-eks-cluster |
| Account | dev (686123185567) |
| Kubernetes Version | 1.31 |
| Node Autoscaler | Karpenter v1.1.1 |
| Region | us-west-2 |
| kubectl Access | Ansible server only (ansible-shared-cwiq-io) |
Node Groups¶
System Node Group (cwiq-dev-eks-system)¶
Fixed node group for Karpenter controller, CoreDNS, and VPC CNI pods:
| Attribute | Value |
|---|---|
| Instance Type | t3.medium |
| Desired/Min/Max | 1 / 1 / 2 |
| EC2 Name Tag | cwiq-dev-eks-system |
Single system node behavior
With desired_size=1, one Karpenter replica will always show Pending — this is expected. Scale to 2 when the team grows and runner concurrency increases.
Karpenter-Managed Runner Nodes¶
Karpenter dynamically provisions EC2 nodes for pipeline jobs. Instance type selection is defined in the NodePool resource.
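As a rough sketch of how that selection is expressed, a Karpenter v1 NodePool can pin provisioning to the three runner instance types documented below. The resource names and the CPU limit here are assumptions for illustration, not the actual cluster config:

```yaml
# Illustrative NodePool restricting runner nodes to the documented tiers
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gitlab-runners          # name assumed
spec:
  template:
    spec:
      requirements:
        # Only provision the instance types used by the three runner tiers
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["t3.medium", "t3.large", "t3.xlarge"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default           # name assumed
  limits:
    cpu: "64"                   # illustrative cap on total provisioned CPU
```

The real definition lives in the `eks-karpenter/` Terraform module listed under Terraform Location.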
Karpenter IAM constraint
Karpenter's ec2:RunInstances IAM permission cannot use aws:RequestTag conditions. The IAM policy must use resource-level conditions instead. This is a known Karpenter limitation.
GitLab Runner Tiers¶
Three runners are registered, selected by pipeline job tags:
| Runner | GitLab ID | Tag | Typical Use | Karpenter Instance |
|---|---|---|---|---|
| k8s-small | 19 | small | Most jobs (validate, test, deploy-dev) | t3.medium |
| k8s-medium | 20 | medium | UI Kaniko builds (8 GB RAM required) | t3.large |
| k8s-large | 21 | large | Executor Nuitka + rpmbuild | t3.xlarge |
UI builds require medium runner
Kaniko image builds for the React UI require medium tag (t3.large, 8 GB RAM). Using small (t3.medium, 4 GB) will OOM and fail the build.
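A minimal sketch of how a UI build job selects the medium tier in `.gitlab-ci.yml` (the job name and Kaniko arguments are illustrative, not the actual pipeline config):

```yaml
# Illustrative Kaniko build job pinned to the medium runner tier
build-ui-image:                              # job name assumed
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  tags:
    - medium                                 # routes to k8s-medium (t3.large, 8 GB RAM)
  script:
    - /kaniko/executor --context "$CI_PROJECT_DIR" --dockerfile "$CI_PROJECT_DIR/Dockerfile" --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```

The `tags:` keyword is what routes the job; omitting it (or using `small`) sends the build to a 4 GB node, which OOMs as noted above.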
Pod Networking¶
No Tailscale in EKS pods
GitLab runner pods use the VPC CNI plugin and receive real VPC IP addresses in subnets 10.1.34.0/24 and 10.1.35.0/24. Pods have no Tailscale connectivity.
Any job that must reach a server (e.g., deploy-dev SSH to the orchestrator) must use the VPC private IP, not the Tailscale IP or Tailscale hostname.
```yaml
# Correct — deploy-dev job configuration
variables:
  # Set at GitLab group level (group 9) — DO NOT define per-project
  DEV_SERVER_IP: "10.1.35.46"  # VPC private IP of orchestrator-dev-cwiq-io
```
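A sketch of a deploy job consuming that group-level variable; the SSH user and remote command are assumptions for illustration:

```yaml
# Illustrative deploy job — always connects via the VPC private IP
deploy-dev:
  tags:
    - small
  script:
    # $DEV_SERVER_IP comes from the group-level variable, never a Tailscale address
    - ssh -o StrictHostKeyChecking=accept-new deploy@"$DEV_SERVER_IP" "echo connected"  # user/command assumed
```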
Pods can reach shared-services resources over VPC peering:
- Nexus: nexus.shared.cwiq.io → resolves to VPC private IP via private Route53 zone
- SonarQube: sonarqube.shared.cwiq.io → resolves to 10.0.10.8
- Vault: vault.shared.cwiq.io → Vault API port 8200
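A hypothetical diagnostic job to confirm those shared-services paths from inside a runner pod (job name and checks are illustrative; the Vault health endpoint is the standard `/v1/sys/health` API):

```yaml
# Hypothetical smoke-test job for shared-services reachability over VPC peering
check-shared-services:
  tags:
    - small
  script:
    - getent hosts nexus.shared.cwiq.io        # should resolve to a VPC private IP
    - getent hosts sonarqube.shared.cwiq.io    # should resolve to 10.0.10.8
    - curl -sf "https://vault.shared.cwiq.io:8200/v1/sys/health" && echo "Vault reachable"
```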
Runner Security Context¶
runAsNonRoot default
GitLab Runner on Kubernetes defaults to runAsNonRoot: true. This causes most container jobs to fail unless the runner TOML config overrides it. The override is already configured on all three CWIQ runners.
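For reference, a sketch of what such an override looks like in the GitLab Runner Helm chart's values (the actual values file is not shown here; field names follow the Kubernetes executor's `pod_security_context` settings):

```yaml
# Illustrative Helm values fragment embedding the runner TOML override
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        # Allow job containers to run as root; with runAsNonRoot: true,
        # most images (which default to UID 0) fail to start
        [runners.kubernetes.pod_security_context]
          run_as_non_root = false
```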
Legacy Runners (PAUSED)¶
The legacy Fleeting/EC2 Autoscaling runners are paused and the Runner Manager EC2 is stopped:
| Runner | ID | Status |
|---|---|---|
| dev-small | 16 | Paused (2026-02-27) |
| dev-medium | 17 | Paused (2026-02-27) |
| dev-large | 18 | Paused (2026-02-27) |
| Runner Manager EC2 | i-0af3f2d4bf8a4f1d2 | Stopped (can restart if needed) |
Remaining cleanup: terminate the Runner Manager EC2 and delete runners 16-18 from GitLab.
kubectl Access¶
kubectl is configured only on the ansible server. Do not attempt to use kubectl from the dev server or local workstation without first configuring kubeconfig.
```shell
ssh ansible@ansible-shared-cwiq-io
ansible-helper
kubectl get nodes
kubectl get pods -n gitlab-runner
```
Terraform Location¶
```
terraform-plan/organization/environments/dev/
├── eks-cluster/         ← Cluster, node group, Karpenter
├── eks-karpenter/       ← Karpenter NodePool and NodeClass
└── gitlab-runners-k8s/  ← Runner Helm chart deployment
```
Related Pages¶
- Dev Account — Dev server that runners deploy to
- VPC & Networking — Subnet CIDRs for pod IP ranges
- Security Groups — EKS cluster SG rules
- GitLab Runner Architecture — Pipeline configuration