
Backup and Restore

Nexus data is backed up daily via AWS Backup (EBS snapshots of the /data volume). The backup tag Backup=daily is applied by Terraform at provisioning time.

Backup Strategy

Configuration

Setting           Value
Service           AWS Backup
Selection method  Tag-based (Backup=daily)
Schedule          Daily
Retention         7 days
Backup window     5:00 AM UTC (8-hour window)
Target            EBS volume mounted at /data

AWS Backup creates point-in-time EBS snapshots of the tagged volume; since Terraform applies the tag at provisioning time, no manual backup configuration is required.

What Is Backed Up

The /data EBS volume contains everything needed to restore Nexus:

Path                            Contents
/data/nexus/data/               Nexus blob stores, OrientDB, config, logs
/data/ssl/                      SSL certificates (Let's Encrypt fullchain + privkey)
/data/vault/                    Vault Agent config, role-id, secret-id, templates
/data/nexus/docker-compose.yml  Stack definition
/data/.env                      Non-secret environment variables

What Is NOT Backed Up

Path                     Contents                     Recovery Method
/var/lib/containerd/     Docker images and layers     Re-pull on startup
/ (root filesystem)      OS, packages, Docker engine  Re-provision via Ansible
/vault/secrets/ (tmpfs)  Rendered secrets (RAM-only)  Auto-regenerated by Vault Agent on startup
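Because /vault/secrets lives on a tmpfs, rendered secrets never touch disk and are inherently excluded from snapshots. A minimal sketch for confirming a path is RAM-backed (is_tmpfs is an illustrative helper name; it assumes GNU coreutils stat):

```shell
# Return success if the filesystem containing the given path is tmpfs
# (RAM-only), using stat's filesystem-type output.
is_tmpfs() {
  [ "$(stat -f -c %T "$1")" = "tmpfs" ]
}

# Example: confirm rendered secrets are not persisted to disk
# is_tmpfs /vault/secrets && echo "RAM-only; excluded from snapshots"
```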

Vault AppRole credentials in backups

The role-id and secret-id files in /data/vault/ are included in EBS snapshots. After a restore, these credentials may still work if the AppRole has not been revoked. If the restore follows a security incident, rotate the Secret ID before starting services.


Restore Procedure

Step 1: Identify the Snapshot

# List recovery points from AWS Backup
aws backup list-recovery-points-by-backup-vault \
  --backup-vault-name Default \
  --profile shared-services \
  --query "RecoveryPoints[?ResourceType=='EBS'].{Arn:RecoveryPointArn, Date:CreationDate, Status:Status}" \
  --output table

# Or find snapshots by volume tag directly
aws ec2 describe-snapshots \
  --filters "Name=tag:Backup,Values=daily" \
  --profile shared-services \
  --query "Snapshots[*].{ID:SnapshotId, Date:StartTime, State:State, Size:VolumeSize}" \
  --output table \
  --region us-west-2
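When scripting the selection, the newest completed snapshot can be picked out of the describe-snapshots JSON with jq. A sketch (latest_snapshot and the inline sample data are illustrative, not part of the runbook):

```shell
# Print the SnapshotId of the newest *completed* snapshot from
# `aws ec2 describe-snapshots --output json` read on stdin.
latest_snapshot() {
  jq -r '[.Snapshots[] | select(.State == "completed")]
         | sort_by(.StartTime) | last | .SnapshotId'
}

# Inline sample; in practice pipe the aws command above into the helper.
cat <<'EOF' | latest_snapshot
{"Snapshots":[
  {"SnapshotId":"snap-aaa","State":"completed","StartTime":"2024-05-01T05:10:00Z"},
  {"SnapshotId":"snap-bbb","State":"pending","StartTime":"2024-05-03T05:10:00Z"},
  {"SnapshotId":"snap-ccc","State":"completed","StartTime":"2024-05-02T05:10:00Z"}
]}
EOF
# prints snap-ccc
```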

Step 2: Stop Nexus Services

ssh nexus-shared-cwiq-io \
  "docker compose -f /data/nexus/docker-compose.yml down"

Step 3: Restore the EBS Volume

Option A — AWS Backup Console:

  1. Open AWS Backup console in shared-services account (308188966547)
  2. Navigate to Backup vaults > Default
  3. Select the desired recovery point
  4. Click Restore, choosing the same volume type and Availability Zone as the Nexus EC2 instance
  5. Wait for restore to complete

Option B — AWS CLI:

# Create a new volume from the snapshot and capture its ID
NEW_VOLUME_ID=$(aws ec2 create-volume \
  --snapshot-id snap-0123456789abcdef0 \
  --availability-zone us-west-2a \
  --volume-type gp3 \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=nexus-shared-data},{Key=Backup,Value=daily}]' \
  --query "VolumeId" \
  --output text \
  --profile shared-services \
  --region us-west-2)

Step 4: Swap the EBS Volume

# Detach the damaged volume
aws ec2 detach-volume \
  --volume-id vol-OLD_VOLUME_ID \
  --instance-id i-INSTANCE_ID \
  --profile shared-services \
  --region us-west-2

# Wait for detachment
aws ec2 wait volume-available \
  --volume-ids vol-OLD_VOLUME_ID \
  --profile shared-services \
  --region us-west-2

# Attach the restored volume
aws ec2 attach-volume \
  --volume-id vol-NEW_VOLUME_ID \
  --instance-id i-INSTANCE_ID \
  --device /dev/xvdf \
  --profile shared-services \
  --region us-west-2

# Wait for attachment
aws ec2 wait volume-in-use \
  --volume-ids vol-NEW_VOLUME_ID \
  --profile shared-services \
  --region us-west-2

Step 5: Mount and Verify

ssh nexus-shared-cwiq-io

# Verify mount (should be automatic via /etc/fstab)
mount | grep /data

# If not mounted (on Nitro instances the device may enumerate as /dev/nvme1n1):
sudo mount /dev/xvdf /data

# Verify key paths
ls -la /data/nexus/data/
ls -la /data/ssl/
head -20 /data/nexus/docker-compose.yml
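The spot checks above can be wrapped in a small verification script. A sketch (check_restore is an illustrative name; the path list mirrors the "What Is Backed Up" table):

```shell
# Verify the expected paths exist under a data root (default /data).
# Prints one line per path and returns non-zero if anything is missing.
check_restore() {
  local root="${1:-/data}" missing=0 p
  for p in nexus/data ssl vault nexus/docker-compose.yml .env; do
    if [ -e "$root/$p" ]; then
      echo "OK      $root/$p"
    else
      echo "MISSING $root/$p"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_restore /data
```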

Step 6: Start Services

ssh nexus-shared-cwiq-io \
  "docker compose -f /data/nexus/docker-compose.yml up -d"

Step 7: Verify Recovery

# Check containers
ssh nexus-shared-cwiq-io \
  "docker compose -f /data/nexus/docker-compose.yml ps"

# Check Nexus health
curl -sk https://nexus.shared.cwiq.io/service/rest/v1/status | jq .

# Check Vault Agent rendered secrets
ssh nexus-shared-cwiq-io "docker logs vault-agent --tail 20"
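Nexus can take several minutes to come up after a restore, so the health check is worth polling rather than running once. A generic retry helper, sketched here (wait_for is an illustrative name):

```shell
# Run a command repeatedly until it succeeds, up to a maximum number of
# attempts, sleeping between tries. Returns failure if it never succeeds.
wait_for() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example: poll the status endpoint every 10s for up to 5 minutes:
#   wait_for 30 10 curl -sfk -o /dev/null \
#     https://nexus.shared.cwiq.io/service/rest/v1/status
```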

Manual Snapshots

Create a manual snapshot before upgrades or configuration changes:

# Find the data volume ID
VOLUME_ID=$(aws ec2 describe-volumes \
  --filters "Name=tag:Name,Values=nexus-shared-data" \
  --profile shared-services \
  --query "Volumes[0].VolumeId" \
  --output text \
  --region us-west-2)

# Create the snapshot and capture its ID
SNAPSHOT_ID=$(aws ec2 create-snapshot \
  --volume-id "${VOLUME_ID}" \
  --description "Manual backup - Nexus shared - $(date +%Y-%m-%d)" \
  --tag-specifications "ResourceType=snapshot,Tags=[{Key=Name,Value=nexus-manual-$(date +%Y%m%d)},{Key=Environment,Value=shared}]" \
  --query "SnapshotId" \
  --output text \
  --profile shared-services \
  --region us-west-2)

# Block until the snapshot completes before making changes
aws ec2 wait snapshot-completed \
  --snapshot-ids "${SNAPSHOT_ID}" \
  --profile shared-services \
  --region us-west-2

Application-Consistent Snapshot

For maximum consistency, stop Nexus before snapshotting:

# Stop only Nexus (keep Vault Agent running so secrets remain rendered)
ssh nexus-shared-cwiq-io "docker stop nexus"

# Create the snapshot (command above)

# Restart Nexus
ssh nexus-shared-cwiq-io "docker start nexus"

Full Disaster Recovery

Use this procedure when the EC2 instance itself is lost.

Step 1: Provision New Infrastructure

cd terraform-plan/organization/environments/shared-services/ec2-instances/nexus
terraform init
terraform plan
terraform apply

Step 2: Restore the /data Volume

Follow the restore procedure above. Attach the restored volume to the new instance.

Step 3: Run Ansible Playbook

ssh ansible@ansible-shared-cwiq-io
ansible-helper
ansible-playbook nexus/deploy.yml \
  -i inventory/shared \
  -l nexus-shared-cwiq-io

Step 4: Regenerate Vault Credentials (if needed)

# Generate a new Secret ID if the old one may be compromised
NEW_SECRET_ID=$(vault write -f -field=secret_id \
  auth/approle/role/nexus-shared/secret-id)

# Write it to the restored instance
ssh nexus-shared-cwiq-io \
  "echo '${NEW_SECRET_ID}' > /data/vault/secret_id"

# Restart vault-agent to pick up new credentials
ssh nexus-shared-cwiq-io \
  "docker restart vault-agent"

Step 5: Verify SSL Certificates

# Check certificate validity
ssh nexus-shared-cwiq-io \
  "openssl x509 -in /data/ssl/nexus.shared.cwiq.io/fullchain.pem -noout -dates"
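To turn the date pair into an at-a-glance number, the remaining lifetime can be computed in days. A sketch assuming GNU date is available on the host (days_until_expiry is an illustrative name):

```shell
# Days until the certificate in a PEM file expires (negative if already
# expired). Reads notAfter via openssl and diffs against now with GNU date.
days_until_expiry() {
  end=$(openssl x509 -in "$1" -noout -enddate | cut -d= -f2)
  echo $(( ($(date -d "$end" +%s) - $(date +%s)) / 86400 ))
}

# Usage:
#   days_until_expiry /data/ssl/nexus.shared.cwiq.io/fullchain.pem
```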

If expired, reissue via cert-server:

ssh ansible@ansible-shared-cwiq-io
ansible-helper
ansible-playbook cert-server/ssl-issue-all.yml \
  -e "target_host=nexus-shared-cwiq-io" \
  -e "domain=nexus.shared.cwiq.io"

Step 6: Start and Verify

ssh nexus-shared-cwiq-io \
  "docker compose -f /data/nexus/docker-compose.yml up -d"

curl -sk https://nexus.shared.cwiq.io/service/rest/v1/status | jq .

Recovery Time Objectives

Scenario                               Estimated RTO
Container crash (Docker auto-restart)  1–2 minutes
Instance reboot                        5–10 minutes
Volume restore from snapshot           15–30 minutes
Full instance rebuild                  45–60 minutes

Export Repository Configuration

Back up Nexus repository configuration separately as a safety net (in addition to EBS snapshots):

NEXUS_URL="https://nexus.shared.cwiq.io"
NEXUS_AUTH="admin:$(vault kv get -field=password secret/nexus/admin)"

# Export repositories
curl -su "${NEXUS_AUTH}" \
  "${NEXUS_URL}/service/rest/v1/repositories" | jq '.' > nexus-repos-$(date +%Y%m%d).json

# Export roles
curl -su "${NEXUS_AUTH}" \
  "${NEXUS_URL}/service/rest/v1/security/roles" | jq '.' > nexus-roles-$(date +%Y%m%d).json

# Export users
curl -su "${NEXUS_AUTH}" \
  "${NEXUS_URL}/service/rest/v1/security/users" | jq '.' > nexus-users-$(date +%Y%m%d).json
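The repository export can also be condensed into a restore checklist. A sketch (repo_summary is an illustrative name; the inline sample mimics the shape of the /service/rest/v1/repositories response):

```shell
# One line per repository from the export JSON on stdin: name, then
# format/type, for eyeballing against the restored instance.
repo_summary() {
  jq -r '.[] | "\(.name)  \(.format)/\(.type)"'
}

# Inline sample; in practice: repo_summary < nexus-repos-YYYYMMDD.json
cat <<'EOF' | repo_summary
[
  {"name":"docker-hosted","format":"docker","type":"hosted"},
  {"name":"npm-proxy","format":"npm","type":"proxy"}
]
EOF
# prints:
# docker-hosted  docker/hosted
# npm-proxy  npm/proxy
```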

Configuration exports capture repository definitions but NOT blob store contents. The EBS snapshot is the primary backup for blob data.