The following details how to back up and restore a Wallaroo Ops installation.
One method of backing up and restoring Wallaroo is through the Velero application. This application stores snapshots of the Wallaroo installation, including deployed pipelines, user settings, and log files, which can be retrieved and restored at a later date.
For full details and setup procedures, see the Velero Documentation. The installation steps below are intended as short guides.
The following procedures are for Wallaroo Enterprise installed via kots or helm in the cloud services listed below. These procedures are not tested for other environments.
velero client.
Velero contains both a client and a Kubernetes service that are used to manage backups and restores.
The Velero client supports macOS and Linux. Windows builds are available but not officially supported. The following steps are based on the Velero CLI installation procedure.
Velero is available on macOS through the Homebrew project. With Homebrew installed, install Velero with the following command:
brew install velero
Otherwise, download the Velero release tar.gz file for the client platform, extract it, and place the velero executable into a directory on the executable path.
The Velero service runs in the same Kubernetes environment where the Wallaroo instance is installed. Before installation, storage known as a bucket must be made available for the Velero service to place the backup files.
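Before starting the provider-specific steps, it can help to confirm that the command-line tools they rely on are installed. The following is a minimal sketch, not part of the Velero or Wallaroo tooling: the function name missing_tools is hypothetical, and the list of tools to check depends on the cloud provider in use.

```shell
# Print each named command-line tool that cannot be found on the PATH.
# missing_tools is a hypothetical helper name used only for this sketch.
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds only when the tool resolves on the current PATH
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (AWS example): prints nothing when every tool is installed.
# missing_tools velero kubectl aws
```

An empty result means every listed tool was found; any printed name should be installed before continuing.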
The following shows the basic steps for creating the storage containers used for each major cloud service. Organizations are encouraged to use these steps with the official Velero instructions, available from the links within each cloud provider section below.
The following instructions are based on the Velero Plugin for AWS instructions.
These steps assume the user has installed the AWS Command-Line Interface (CLI) and has the necessary permissions to perform the steps below.
The following items are required to create the Velero bucket via AWS S3 storage:
kube2iam as defined in the Velero plugins for AWS Set permissions for Velero.
If these steps are complete, jump to Install the Velero Service into the AWS Wallaroo Cluster.
Create the S3 bucket used for Velero-based backups and restores with the following command, replacing the variables AWS_BUCKET_NAME and AWS_REGION based on your organization’s requirements. In the command below, if the region is us-east-1, remove the --create-bucket-configuration option.
AWS_BUCKET_NAME=<YOUR_BUCKET>
AWS_REGION=<YOUR_REGION>
aws s3api create-bucket \
--bucket $AWS_BUCKET_NAME \
--region $AWS_REGION \
--create-bucket-configuration LocationConstraint=$AWS_REGION
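The us-east-1 exception described above can be handled in a script rather than by editing the command by hand. The following is a minimal sketch under that assumption; the function name create_bucket_args is hypothetical and only assembles the arguments, it does not call AWS itself.

```shell
# Print the arguments for "aws s3api create-bucket" for a bucket and region.
# us-east-1 must omit the --create-bucket-configuration option.
# create_bucket_args is a hypothetical helper name used only for this sketch.
create_bucket_args() {
  bucket="$1"
  region="$2"
  args="s3api create-bucket --bucket $bucket --region $region"
  if [ "$region" != "us-east-1" ]; then
    args="$args --create-bucket-configuration LocationConstraint=$region"
  fi
  printf '%s\n' "$args"
}

# Usage (requires the AWS CLI); the unquoted substitution relies on word
# splitting, which is safe because bucket names and regions contain no spaces.
# aws $(create_bucket_args "$AWS_BUCKET_NAME" "$AWS_REGION")
```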
There are multiple options for setting permissions for the Velero service in an AWS Kubernetes cluster, as detailed in the Velero plugins for AWS Set permissions for Velero. The following examples use the IAM user method.
Create the IAM user. In this example, the name is velero.
aws iam create-user --user-name velero
Attach the following AWS policy to the new velero user.
cat > velero-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeVolumes",
"ec2:DescribeSnapshots",
"ec2:CreateTags",
"ec2:CreateVolume",
"ec2:CreateSnapshot",
"ec2:DeleteSnapshot"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:DeleteObject",
"s3:PutObject",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts"
],
"Resource": [
"arn:aws:s3:::${AWS_BUCKET_NAME}/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::${AWS_BUCKET_NAME}"
]
}
]
}
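EOF
aws iam put-user-policy \
--user-name velero \
--policy-name velero \
--policy-document file://velero-policy.json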
Create an access key for the velero user:
aws iam create-access-key --user-name velero
This produces output similar to the following:
{
"AccessKey": {
"UserName": "velero",
"Status": "Active",
"CreateDate": "2017-07-31T22:24:41.576Z",
"SecretAccessKey": <AWS_SECRET_ACCESS_KEY>,
"AccessKeyId": <AWS_ACCESS_KEY_ID>
}
}
Store the SecretAccessKey and AccessKeyId values in a credentials file for the next step. In this example, the file is ~/.credentials-velero-aws:
[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
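Because this file holds a secret key, it is reasonable to create it with permissions that restrict it to the current user. The following is a minimal sketch of one way to do that; the function name write_velero_credentials and the variable names in the usage comment are hypothetical.

```shell
# Write an AWS-style credentials file readable only by the current user.
# Arguments: target path, access key ID, secret access key.
# write_velero_credentials is a hypothetical helper name for this sketch.
write_velero_credentials() {
  target="$1"
  key_id="$2"
  secret="$3"
  (
    # Restrict permissions for files created inside this subshell.
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "$key_id" "$secret" > "$target"
  )
  # Also tighten permissions in case the file already existed.
  chmod 600 "$target"
}

# Usage, assuming the two values were saved in these hypothetical variables:
# write_velero_credentials ~/.credentials-velero-aws "$ACCESS_KEY_ID" "$SECRET_ACCESS_KEY"
```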
The following procedure will install the Velero service into the AWS Kubernetes cluster hosting the Wallaroo instance.
Verify the connection to the AWS Kubernetes cluster hosting the Wallaroo instance.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
aws-ce-default-pool-5dd3c344-fxs3 Ready <none> 31s v1.23.14-gke.1800
aws-ce-default-pool-5dd3c344-q95a Ready <none> 25d v1.23.14-gke.1800
aws-ce-default-pool-5dd3c344-scmc Ready <none> 31s v1.23.14-gke.1800
aws-ce-default-pool-5dd3c344-wnkn Ready <none> 31s v1.23.14-gke.1800
Install Velero into the AWS Kubernetes cluster. This assumes the $AWS_BUCKET_NAME and $AWS_REGION variables from earlier, and that the AWS velero user credentials are stored in ~/.credentials-velero-aws.
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.6.0 \
--bucket $AWS_BUCKET_NAME \
--backup-location-config region=$AWS_REGION \
--secret-file ~/.credentials-velero-aws \
--use-volume-snapshots=false \
--use-node-agent --wait
--use-volume-snapshots=false \
--use-node-agent --wait
Patch the node-agent daemonset in the velero namespace so that its pods tolerate all node taints and can run on every node:
kubectl -n velero patch ds node-agent -p='{"spec": {"template": {"spec": {"tolerations":[{"operator": "Exists"}]}}}}'
Once the installation finishes, verify it by checking for the velero namespace in the Kubernetes cluster:
kubectl get namespaces
NAME STATUS AGE
default Active 222d
kube-node-lease Active 222d
kube-public Active 222d
kube-system Active 222d
velero Active 5m32s
wallaroo Active 7d23h
The following instructions are based on the Velero Plugin for Microsoft Azure instructions.
These steps assume the user has installed the Azure Command-Line Interface (CLI) and has the necessary permissions to perform the steps below.
The following items are required to create the Velero bucket via a Microsoft Azure Storage Container:
If these items are already available, skip to the Install Velero In the Wallaroo Azure Kubernetes Cluster step.
To retrieve the Azure Subscription ID:
To create the Azure Resource Group, use the following command, replacing the variables $AZURE_VELERO_RESOURCE_GROUP and $AZURE_LOCATION with your organization’s requirements.
az group create -n $AZURE_VELERO_RESOURCE_GROUP --location $AZURE_LOCATION
To create the Azure Storage Account, the Azure Storage Account ID must be 3 to 24 characters long, composed of only lowercase letters and numbers, and unique across Azure. So velerobackupaccount is appropriate, while VELERO_BACKUP is not. Update the variables $AZURE_VELERO_RESOURCE_GROUP and $AZURE_STORAGE_ACCOUNT_ID with your organization’s requirements.
AZURE_STORAGE_ACCOUNT_ID="wallaroovelerostorage"
az storage account create \
--name $AZURE_STORAGE_ACCOUNT_ID \
--resource-group $AZURE_VELERO_RESOURCE_GROUP \
--sku Standard_GRS \
--encryption-services blob \
--https-only true \
--min-tls-version TLS1_2 \
--kind BlobStorage \
--access-tier Hot
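A storage account name that breaks the Azure naming rule is rejected only when the create command runs, so checking it first can save a round trip. The following is a minimal sketch based on the Azure rule for storage account names (3 to 24 characters, lowercase letters and numbers only); the function name valid_storage_account_name is hypothetical, and the check does not cover global uniqueness.

```shell
# Succeed when the name is 3 to 24 lowercase letters or digits.
# valid_storage_account_name is a hypothetical helper name for this sketch.
valid_storage_account_name() {
  # LC_ALL=C keeps the a-z range from matching uppercase letters.
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

# Usage:
# valid_storage_account_name "$AZURE_STORAGE_ACCOUNT_ID" || echo "invalid storage account name"
```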
Use the following command to create the Azure Storage Container for use by the Velero service. Replace the BLOB_CONTAINER variable with your organization’s requirements. Note that this new container should have a unique name.
BLOB_CONTAINER=velero
az storage container create -n $BLOB_CONTAINER --public-access off --account-name $AZURE_STORAGE_ACCOUNT_ID
This step sets a method for the Velero service to authenticate with Azure to create the backup and restore jobs. Velero recommends different options in its Velero Plugin for Microsoft Azure Set permissions for Velero documentation. Organizations are encouraged to use the method that aligns with their security requirements.
The steps below will cover using a storage account access key.
Set the default resource group to the same one used for the Velero Resource Group in the step Create Azure Resource Group.
az configure --defaults group=$AZURE_VELERO_RESOURCE_GROUP
Retrieve the Azure Storage Account Access Key using the $AZURE_STORAGE_ACCOUNT_ID created in the step Create Azure Storage Account. Store this key in a secure location.
AZURE_STORAGE_ACCOUNT_ACCESS_KEY=`az storage account keys list --account-name $AZURE_STORAGE_ACCOUNT_ID --query "[?keyName == 'key1'].value" -o tsv`
Store the Azure cloud environment name as AZURE_CLOUD_NAME (AzurePublicCloud in this example) and the $AZURE_STORAGE_ACCOUNT_ACCESS_KEY in a credentials file. The following command stores them in ~/.credentials-velero-azure:
cat << EOF > ~/.credentials-velero-azure
AZURE_STORAGE_ACCOUNT_ACCESS_KEY=${AZURE_STORAGE_ACCOUNT_ACCESS_KEY}
AZURE_CLOUD_NAME=AzurePublicCloud
EOF
This step will install the Velero service into the Azure Kubernetes Cluster hosting the Wallaroo instance using the variables from the steps above.
Install the Velero service into the cluster with the following command:
velero install \
--provider azure \
--plugins velero/velero-plugin-for-microsoft-azure:v1.6.0 \
--bucket $BLOB_CONTAINER \
--secret-file ~/.credentials-velero-azure \
--backup-location-config storageAccount=$AZURE_STORAGE_ACCOUNT_ID,storageAccountKeyEnvVar=AZURE_STORAGE_ACCOUNT_ACCESS_KEY \
--use-volume-snapshots=false \
--use-node-agent --wait
Patch the node-agent daemonset in the velero namespace so that its pods tolerate all node taints and can run on every node:
kubectl -n velero patch ds node-agent -p='{"spec": {"template": {"spec": {"tolerations":[{"operator": "Exists"}]}}}}'
Once the installation finishes, verify it by checking for the velero namespace in the Kubernetes cluster:
kubectl get namespaces
NAME STATUS AGE
default Active 222d
kube-node-lease Active 222d
kube-public Active 222d
kube-system Active 222d
velero Active 5m32s
wallaroo Active 7d23h
To view the logs for the Velero service installation, use the command kubectl logs deployment/velero -n velero.
The following instructions are based on the Velero Plugin for Google Cloud Platform (GCP) instructions.
These steps assume the user has installed the gcloud Command-Line Interface (CLI) and gsutil tool and has the necessary permissions to perform the steps below.
The following items are required to create the Velero bucket via a GCP Bucket:
If these items are already complete, jump to the step Install Velero In the Wallaroo GCP Kubernetes Cluster.
Create the GCS bucket for storing the Wallaroo backups and restores with the following command. Replace the variable $BUCKET_NAME based on your organization’s requirements.
BUCKET_NAME=<YOUR_BUCKET>
gsutil mb gs://$BUCKET_NAME/
Create the Google Service Account for the Velero service using the following commands:
Retrieve your organization’s GCP Project ID and store it in the PROJECT_ID variable. Note that this retrieves the default project ID for the gcloud configuration. Replace it with the actual GCP Project ID as required.
PROJECT_ID=$(gcloud config get-value project)
Create the service account. Update the $GSA_NAME variable based on the organization’s requirements.
GSA_NAME=velero
gcloud iam service-accounts create $GSA_NAME \
--display-name "Velero service account"
Use gcloud iam service-accounts list to list the service accounts.
gcloud iam service-accounts list
DISPLAY NAME EMAIL DISABLED
Velero service account veleroexample.iam.gserviceaccount.com False
Select the email address for the new Velero service account and set the variable SERVICE_ACCOUNT_EMAIL equal to the account’s email address:
SERVICE_ACCOUNT_EMAIL=veleroexample.iam.gserviceaccount.com
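Instead of copying the address from the listing, the email can usually be derived from values already set in earlier steps. The following is a minimal sketch, assuming a standard project ID, where a user-created service account has an address of the form NAME@PROJECT_ID.iam.gserviceaccount.com; the function name service_account_email is hypothetical.

```shell
# Build a service account email from the account name and the project ID.
# Assumes the standard NAME@PROJECT_ID.iam.gserviceaccount.com format.
# service_account_email is a hypothetical helper name for this sketch.
service_account_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Usage, with the variables set in the steps above:
# SERVICE_ACCOUNT_EMAIL=$(service_account_email "$GSA_NAME" "$PROJECT_ID")
```

Comparing the result against the output of gcloud iam service-accounts list is a quick way to confirm the assumption holds for the project.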
Create a Custom Role with the following minimum permissions, and bind it to the new Velero service account. The ROLE needs to be unique and DNS compliant.
ROLE="velero.server"
TITLE="Velero Server"
ROLE_PERMISSIONS=(
compute.disks.get
compute.disks.create
compute.disks.createSnapshot
compute.snapshots.get
compute.snapshots.create
compute.snapshots.useReadOnly
compute.snapshots.delete
compute.zones.get
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
iam.serviceAccounts.signBlob
)
gcloud iam roles create $ROLE \
--project $PROJECT_ID \
--title "$TITLE" \
--permissions "$(IFS=","; echo "${ROLE_PERMISSIONS[*]}")"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member serviceAccount:$SERVICE_ACCOUNT_EMAIL \
--role projects/$PROJECT_ID/roles/$ROLE
Grant the new service account the objectAdmin role on the bucket:
gsutil iam ch serviceAccount:$SERVICE_ACCOUNT_EMAIL:objectAdmin gs://${BUCKET_NAME}
There are multiple methods of granting the Velero service GCP access as detailed in the Plugins for Google Cloud Platform (GCP) Grant access to Velero steps. The following examples will use the Service Account Key method.
Create the Google Service Account Key, and store it in a secure location. In this example, it is stored in ~/.credentials-velero-gcp:
gcloud iam service-accounts keys create ~/.credentials-velero-gcp \
--iam-account $SERVICE_ACCOUNT_EMAIL
The following steps assume that the Google Service Account Key method was used in the Grant Velero Service GCP Access. See the Plugins for Google Cloud Platform (GCP) Grant access to Velero for other methods.
To install the Velero service into the Kubernetes cluster hosting the Wallaroo service:
Verify the connection to the GCP Kubernetes cluster hosting the Wallaroo instance.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-wallaroodocs-ce-default-pool-5dd3c344-fxs3 Ready <none> 31s v1.23.14-gke.1800
gke-wallaroodocs-ce-default-pool-5dd3c344-q95a Ready <none> 25d v1.23.14-gke.1800
gke-wallaroodocs-ce-default-pool-5dd3c344-scmc Ready <none> 31s v1.23.14-gke.1800
gke-wallaroodocs-ce-default-pool-5dd3c344-wnkn Ready <none> 31s v1.23.14-gke.1800
Install Velero into the GCP Kubernetes cluster. This assumes the $BUCKET_NAME variable from earlier, and that the Google Service Account Key is stored in ~/.credentials-velero-gcp.
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.6.0 \
--bucket $BUCKET_NAME \
--secret-file ~/.credentials-velero-gcp \
--use-volume-snapshots=false \
--use-node-agent --wait
Patch the node-agent daemonset in the velero namespace so that its pods tolerate all node taints and can run on every node:
kubectl -n velero patch ds node-agent -p='{"spec": {"template": {"spec": {"tolerations":[{"operator": "Exists"}]}}}}'
Once the installation finishes, verify it by checking for the velero namespace in the Kubernetes cluster:
kubectl get namespaces
NAME STATUS AGE
default Active 222d
kube-node-lease Active 222d
kube-public Active 222d
kube-system Active 222d
velero Active 5m32s
wallaroo Active 7d23h
Once the Velero Installation Procedure and the Velero Kubernetes Install are complete, Wallaroo instance backups are performed through the following process:
Before starting the backup, force the Plateau service to complete writing logs so they can be captured by the backup. This assumes that Wallaroo was installed in the namespace wallaroo.
kubectl -n wallaroo scale --replicas=0 deploy/plateau
kubectl -n wallaroo scale --replicas=1 deploy/plateau
Set the $BACKUP_NAME variable. The name must consist of lowercase letters, numbers, - or ., and must begin and end with an alphanumeric character.
BACKUP_NAME=<YOUR_BACKUP_NAME>
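The naming rule for backups can be checked before the backup is requested. The following is a minimal sketch, assuming the rule is lowercase letters, numbers, - and ., beginning and ending with an alphanumeric character; the function names valid_backup_name and default_backup_name and the wallaroo-backup prefix are hypothetical, and the check does not enforce a maximum length.

```shell
# Succeed when the name uses only lowercase letters, digits, '-' and '.',
# and both begins and ends with a letter or digit.
# valid_backup_name is a hypothetical helper name for this sketch.
valid_backup_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$'
}

# Print a date-stamped name; the "wallaroo-backup" prefix is an assumption.
default_backup_name() {
  printf 'wallaroo-backup-%s\n' "$(date +%Y%m%d)"
}

# Usage:
# BACKUP_NAME=$(default_backup_name)
# valid_backup_name "$BACKUP_NAME" || echo "invalid backup name"
```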
Issue the following backup command. The --exclude-namespaces option excludes namespaces that are not required for the Wallaroo backup and restore. By default, these are the namespaces velero, default, kube-node-lease, kube-public, and kube-system.
This process backs up all namespaces that are not excluded, including deployed Wallaroo pipelines. Add any other namespaces that should not be part of the backup to the --exclude-namespaces option as per your organization’s requirements.
velero backup create $BACKUP_NAME --default-volumes-to-fs-backup --include-cluster-resources=true --exclude-namespaces velero,default,kube-node-lease,kube-public,kube-system
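When additional namespaces need to be excluded, the comma-separated list can be assembled from the defaults rather than retyped. The following is a minimal sketch; the function name exclude_namespaces and the extra namespace names in the usage comment are hypothetical.

```shell
# Default namespaces that the backup command above leaves out.
DEFAULT_EXCLUDES="velero,default,kube-node-lease,kube-public,kube-system"

# Print the default exclusions followed by any extra namespaces, comma-joined.
# exclude_namespaces is a hypothetical helper name for this sketch.
exclude_namespaces() {
  list="$DEFAULT_EXCLUDES"
  for ns in "$@"; do
    list="$list,$ns"
  done
  printf '%s\n' "$list"
}

# Usage (requires the velero client); the extra namespaces are examples only.
# velero backup create "$BACKUP_NAME" --default-volumes-to-fs-backup \
#   --include-cluster-resources=true \
#   --exclude-namespaces "$(exclude_namespaces gmp-system staging)"
```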
To view the status of the backup, use the command velero backup describe --details $BACKUP_NAME. Once the Completed field shows a date and time, the backup is complete.
In progress backup.
velero backup describe --details $BACKUP_NAME
Name: sample-doctest-backup-20240502
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/resource-timeout=10m0s
velero.io/source-cluster-k8s-gitversion=v1.28.7-gke.1026000
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=28
Phase: InProgress
Namespaces:
Included: *
Excluded: velero, default, kube-node-lease, kube-public, kube-system
Resources:
Included: *
Excluded: <none>
Cluster-scoped: included
Label selector: <none>
Or label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
Snapshot Move Data: false
Data Mover: velero
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 4h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2024-05-14 16:26:43 -0600 MDT
Completed: <n/a>
Expiration: 2024-06-13 16:26:43 -0600 MDT
Estimated total items to be backed up: 1073
Items backed up so far: 28
Resource List: <backup resource list not found>
Backup Volumes:
Velero-Native Snapshots: <none included>
CSI Snapshots: <none included or not detectable>
Pod Volume Backups - kopia:
Completed:
gmp-system/alertmanager-0: alertmanager-config, alertmanager-data
gmp-system/collector-cdsm4: config-out, storage
gmp-system/collector-fslhc: config-out, storage
gmp-system/collector-p6f85: config-out, storage
gmp-system/collector-q4djj: config-out, storage
gmp-system/rule-evaluator-7874c6f478-672vs: config-out
wallaroo/hub-65c45d4c7-nb9lp: pvc
wallaroo/kotsadm-b4f68468d-dzj5c: backup, tmp
wallaroo/kotsadm-minio-0: kotsadm-minio, minio-cert-dir, minio-config-dir
wallaroo/kotsadm-rqlite-0: kotsadm-rqlite, tmp
In Progress:
wallaroo/minio-cf97d78cb-pv82x: export
Completed backup.
velero backup describe --details $BACKUP_NAME
Name: sample-doctest-backup-20240502
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/resource-timeout=10m0s
velero.io/source-cluster-k8s-gitversion=v1.28.7-gke.1026000
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=28
Phase: Completed
Warnings:
Velero: <none>
Cluster: <none>
Namespaces:
wallaroo: resource: /pods name: /kotsadm-b4f68468d-dzj5c message: /volume migrations is declared in pod wallaroo/kotsadm-b4f68468d-dzj5c but not mounted by any container, skipping
Namespaces:
Included: *
Excluded: velero, default, kube-node-lease, kube-public, kube-system
Resources:
Included: *
Excluded: <none>
Cluster-scoped: included
Label selector: <none>
Or label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
Snapshot Move Data: false
Data Mover: velero
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 4h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2024-05-14 16:26:43 -0600 MDT
Completed: 2024-05-14 16:32:19 -0600 MDT
Expiration: 2024-06-13 16:26:43 -0600 MDT
Total items to be backed up: 719
Items backed up: 719
Resource List:
admissionregistration.k8s.io/v1/MutatingWebhookConfiguration:
- gmp-operator.gmp-system.monitoring.googleapis.com
- neg-annotation.config.common-webhooks.networking.gke.io
- pod-ready.config.common-webhooks.networking.gke.io
- warden-mutating.config.common-webhooks.networking.gke.io
...Other backed up resources
warden.gke.io/v1/Audit:
- autogke-default-linux-capabilities
- autogke-disallow-hostnamespaces
- autogke-disallow-privilege
- autogke-no-host-port
- autogke-no-write-mode-hostpath
- autogke-node-affinity-selector-limitation
- autogke-pod-affinity-limitation
- autopilot-admission-webhook-config-limitation
- autopilot-capacity-request-limitation
- autopilot-external-ip-limitation
- autopilot-no-ephemeral-containers
- autopilot-persistent-volume-limitation
- autopilot-volume-type-limitation
Backup Volumes:
Velero-Native Snapshots: <none included>
CSI Snapshots: <none included>
Pod Volume Backups - kopia:
Completed:
gmp-system/alertmanager-0: alertmanager-config, alertmanager-data
gmp-system/collector-cdsm4: config-out, storage
gmp-system/collector-fslhc: config-out, storage
gmp-system/collector-p6f85: config-out, storage
gmp-system/collector-q4djj: config-out, storage
gmp-system/rule-evaluator-7874c6f478-672vs: config-out
wallaroo/hub-65c45d4c7-nb9lp: pvc
wallaroo/kotsadm-b4f68468d-dzj5c: backup, tmp
wallaroo/kotsadm-minio-0: kotsadm-minio, minio-cert-dir, minio-config-dir
wallaroo/kotsadm-rqlite-0: kotsadm-rqlite, tmp
wallaroo/minio-cf97d78cb-pv82x: export
wallaroo/nats-0: nats-js, pid
wallaroo/plateau-7dfbd89655-9xz6v: plateau-storage
wallaroo/postgres-74d6948c48-mjmb5: postgres-storage
wallaroo/prometheus-deployment-666d968bfd-cxp46: alert-config-volume, metrics-storage-volume
wallaroo/wallsvc-0: socket-volume, spire-data
HooksAttempted: 1
HooksFailed: 0
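For scripted checks, the Phase line from the describe output shown above can be extracted instead of reading the full listing. The following is a minimal sketch that assumes the output keeps the "Phase:" label followed by the value, as in the samples above; the function name backup_phase is hypothetical.

```shell
# Read "velero backup describe" output on stdin and print the Phase value.
# backup_phase is a hypothetical helper name for this sketch.
backup_phase() {
  # Match the first line whose first field is exactly "Phase:".
  awk '$1 == "Phase:" { print $2; exit }'
}

# Usage (requires the velero client):
# velero backup describe "$BACKUP_NAME" | backup_phase
```

A result of Completed corresponds to the finished backup in the second sample, and InProgress to the first.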
To restore from a Wallaroo backup:
Set the backup name as the variable $BACKUP_NAME. Use the command velero backup get for a list of previous backups.
velero backup get
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
doctest-20230315a Completed 0 0 2023-03-15 10:52:27 -0600 MDT 28d default <none>
doctest-magicalbear-20230315 Completed 0 1 2023-03-15 11:52:17 -0600 MDT
BACKUP_NAME=<NAME_OF_EXISTING_BACKUP>
Use the velero restore create command to create the restore job, using the $BACKUP_NAME variable set in the step above.
velero restore create --from-backup $BACKUP_NAME
Restore request "doctest-20230315a-20230315105647" submitted successfully.
Run `velero restore describe doctest-20230315a-20230315105647` or `velero restore logs doctest-20230315a-20230315105647` for more details.
To check the restore status, use the velero restore describe command. The optional flag --details provides more information.
velero restore describe doctest-20230315a-20230315105647 --details
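The restore name in the describe command has to be copied from the output of the create command. One alternative is to pass an explicit restore name to velero restore create, which accepts an optional name before the --from-backup flag. The following is a minimal sketch that mirrors the backup-name-plus-timestamp pattern seen in the sample output; the function name restore_name and the RESTORE_NAME variable are hypothetical.

```shell
# Print a restore name made of the backup name and a 14-digit timestamp,
# matching the pattern of the generated name in the sample output.
# restore_name is a hypothetical helper name for this sketch.
restore_name() {
  printf '%s-%s\n' "$1" "$(date +%Y%m%d%H%M%S)"
}

# Usage (requires the velero client):
# RESTORE_NAME=$(restore_name "$BACKUP_NAME")
# velero restore create "$RESTORE_NAME" --from-backup "$BACKUP_NAME"
# velero restore describe "$RESTORE_NAME" --details
```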
If the Kubernetes cluster does not have a static IP address assigned to the Wallaroo loadBalancer service, the DNS information may need to be updated if the IP address has changed. Check with the DNS Integration Guide for more information.