Wallaroo Instance Backup and Restore with Velero
How to back up a Wallaroo instance and restore it using Velero
One method for backing up and restoring Wallaroo is the Velero application. Velero stores snapshots of the Wallaroo installation, including deployed pipelines, user settings, and log files, which can be retrieved and restored at a later date.
For full details and setup procedures, see the Velero Documentation. The installation steps below are intended as short guides.
The following procedures are for Wallaroo Enterprise installed via kots or helm in the cloud services listed below. These procedures have not been tested in other environments.
Prerequisites
- A Wallaroo Enterprise instance
- A client connected to the Kubernetes environment hosting the Wallaroo instance and running the velero client.
- Kubernetes cloud storage, such as:
- Azure Storage Container
- Google Cloud Storage (GCS) Bucket
- AWS S3 Bucket
Velero consists of a client and a Kubernetes service, which together manage backups and restores.
Client Install
The Velero client supports MacOS and Linux. Windows builds are available but are not officially supported. The following steps are based on the Velero CLI installation procedure.
MacOS Install
Velero is available on MacOS through the Homebrew project. With Homebrew installed, install Velero with the following command:
brew install velero
Linux Install
Velero is available for Linux as a tarball from the Velero releases page. Once downloaded, expand the tar.gz file and place the velero executable in a directory on the executable path.
Velero Kubernetes Install
The Velero service runs in the same Kubernetes environment where the Wallaroo instance is installed. Before installation, storage known as a bucket must be made available for the Velero service to place the backup files.
The following sections show the basic steps for creating the storage containers used with each major cloud service. Organizations are encouraged to use these steps together with the official Velero instructions, available from the links within each cloud provider section below.
1 - Velero AWS Cluster Installation
How to set up Velero with an AWS Kubernetes cluster
The following instructions are based on the Velero Plugin for AWS instructions.
These steps assume the user has installed the AWS Command-Line Interface (CLI) and has the necessary permissions to perform the steps below.
The following items are required to create the Velero bucket in AWS S3 storage:
- S3 Bucket Name: The name of the S3 bucket used to store Wallaroo backups.
- Amazon Web Services Region: The region where the Velero bucket is stored. This should be in the same region as the Wallaroo Kubernetes cluster.
- Authentication Method: A method for the Velero service to authenticate to AWS, either with an IAM user or kube2iam, as defined in the Set permissions for Velero section of the Velero plugins for AWS documentation.
If these steps are complete, jump to Install the Velero Service into the AWS Wallaroo Cluster.
Create AWS Bucket for Velero
Create the S3 bucket used for Velero-based backups and restores with the following command, replacing the variables AWS_BUCKET_NAME and AWS_REGION based on your organization’s requirements. In the command below, if the region is us-east-1, remove the --create-bucket-configuration option.
AWS_BUCKET_NAME=<YOUR_BUCKET>
AWS_REGION=<YOUR_REGION>
aws s3api create-bucket \
--bucket $AWS_BUCKET_NAME \
--region $AWS_REGION \
--create-bucket-configuration LocationConstraint=$AWS_REGION
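The us-east-1 exception can also be handled in a script rather than by editing the command by hand. The following is a minimal sketch: the function name create_bucket_cmd and the sample bucket name are hypothetical, and it only prints the command for review rather than running it.

```shell
# Hypothetical helper: print the create-bucket command for a bucket and region,
# leaving out --create-bucket-configuration when the region is us-east-1.
create_bucket_cmd() {
  # $1 = bucket name, $2 = AWS region
  cmd="aws s3api create-bucket --bucket $1 --region $2"
  if [ "$2" != "us-east-1" ]; then
    cmd="$cmd --create-bucket-configuration LocationConstraint=$2"
  fi
  printf '%s\n' "$cmd"
}

# Example with placeholder values: review the printed command before running it.
create_bucket_cmd "example-wallaroo-backups" "us-west-2"
```

In practice the function would be called with "$AWS_BUCKET_NAME" and "$AWS_REGION" as set above.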
Set Permissions for AWS Velero
There are multiple options for setting permissions for the Velero service in an AWS Kubernetes cluster, as detailed in the Set permissions for Velero section of the Velero plugins for AWS documentation. The following examples use the IAM user method.
Create the IAM user. In this example, the name is velero.
aws iam create-user --user-name velero
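Create the following policy file, velero-policy.json, which grants the permissions the velero AWS user requires, then attach it to the user as described in the Velero plugins for AWS Set permissions for Velero instructions. The shell substitutes AWS_BUCKET_NAME into the file as it is written, so the variable must already be set as in Create AWS Bucket for Velero.
cat > velero-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeVolumes",
"ec2:DescribeSnapshots",
"ec2:CreateTags",
"ec2:CreateVolume",
"ec2:CreateSnapshot",
"ec2:DeleteSnapshot"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:DeleteObject",
"s3:PutObject",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts"
],
"Resource": [
"arn:aws:s3:::${AWS_BUCKET_NAME}/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::${AWS_BUCKET_NAME}"
]
}
]
}
EOF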
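The heredoc above only writes the policy file; the policy still has to be attached to the velero user. Because the heredoc delimiter is unquoted, the shell substitutes the bucket variable while writing the file, so an unset variable silently produces ARNs with no bucket name. The sketch below checks for that before attaching the policy with aws iam put-user-policy, the command used in the Velero plugin for AWS instructions. The function names are hypothetical, and the policy name velero is an assumption.

```shell
# Hypothetical helper: succeed only when the policy file exists and both of its
# S3 ARNs contain the expected bucket name.
policy_names_bucket() {
  # $1 = policy file, $2 = bucket name
  [ -n "$2" ] && [ -f "$1" ] &&
    grep -qF "arn:aws:s3:::$2/*" "$1" &&
    grep -qF "arn:aws:s3:::$2\"" "$1"
}

# Hypothetical helper: attach the policy inline to the velero IAM user,
# but only after the check above passes.
attach_velero_policy() {
  # $1 = policy file, $2 = bucket name
  if policy_names_bucket "$1" "$2"; then
    aws iam put-user-policy \
      --user-name velero \
      --policy-name velero \
      --policy-document "file://$1"
  else
    echo "policy file $1 does not reference bucket '$2'; not attaching" >&2
    return 1
  fi
}
```

A typical call would be attach_velero_policy velero-policy.json "$AWS_BUCKET_NAME", run from the directory that contains the policy file.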
Create an access key for the velero user:
aws iam create-access-key --user-name velero
This creates the following sample output:
{
"AccessKey": {
"UserName": "velero",
"Status": "Active",
"CreateDate": "2017-07-31T22:24:41.576Z",
"SecretAccessKey": <AWS_SECRET_ACCESS_KEY>,
"AccessKeyId": <AWS_ACCESS_KEY_ID>
}
}
Store the SecretAccessKey and AccessKeyId values in a credentials file for the next step. In this example, the file is ~/.credentials-velero-aws:
[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
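Because this file holds a long-lived secret, it is worth creating it with owner-only permissions. A minimal sketch, assuming a hypothetical helper name and placeholder key values:

```shell
# Hypothetical helper: write an AWS credentials file readable only by its owner.
write_velero_aws_credentials() {
  # $1 = output file, $2 = access key ID, $3 = secret access key
  (
    # The subshell keeps the caller's umask unchanged.
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' "$2" "$3" > "$1"
    # Tighten the mode explicitly in case the file already existed.
    chmod 600 "$1"
  )
}
```

For example: write_velero_aws_credentials ~/.credentials-velero-aws "<AWS_ACCESS_KEY_ID>" "<AWS_SECRET_ACCESS_KEY>", substituting the values returned by the create-access-key command.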
Install the Velero Service into the AWS Wallaroo Cluster
The following procedure will install the Velero service into the AWS Kubernetes cluster hosting the Wallaroo instance.
Verify the connection to the AWS Kubernetes cluster hosting the Wallaroo instance.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
aws-ce-default-pool-5dd3c344-fxs3 Ready <none> 31s v1.23.14-gke.1800
aws-ce-default-pool-5dd3c344-q95a Ready <none> 25d v1.23.14-gke.1800
aws-ce-default-pool-5dd3c344-scmc Ready <none> 31s v1.23.14-gke.1800
aws-ce-default-pool-5dd3c344-wnkn Ready <none> 31s v1.23.14-gke.1800
Install Velero into the AWS Kubernetes cluster. This assumes the $AWS_BUCKET_NAME and $AWS_REGION variables from earlier, and that the AWS velero user credentials are stored in ~/.credentials-velero-aws.
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.6.0 \
--bucket $AWS_BUCKET_NAME \
--backup-location-config region=$AWS_REGION \
--secret-file ~/.credentials-velero-aws \
--use-volume-snapshots=false \
--use-node-agent --wait
Once complete, verify the installation by checking for the velero namespace in the Kubernetes cluster:
kubectl get namespaces
NAME STATUS AGE
default Active 222d
kube-node-lease Active 222d
kube-public Active 222d
kube-system Active 222d
velero Active 5m32s
wallaroo Active 7d23h
If the Wallaroo installation uses Kubernetes taints and tolerations, patch the node-agent DaemonSet in the velero namespace so that its pods tolerate all taints:
kubectl -n velero patch ds node-agent -p='{"spec": {"template": {"spec": {"tolerations":[{"operator": "Exists"}]}}}}'
2 - Velero Azure Cluster Installation
How to set up Velero with an Azure Kubernetes cluster
The following instructions are based on the Velero Plugin for Microsoft Azure instructions.
These steps assume the user has installed the Azure Command-Line Interface (CLI) and has the necessary permissions to perform the steps below.
The following items are required to create the Velero bucket via a Microsoft Azure Storage Container:
- Resource Group: The resource group that the storage container belongs to. It is recommended to either use the same Resource Group as the Azure Kubernetes cluster hosting the Wallaroo instance, or create a Resource Group in the same Azure location.
- Resource Group Location: The Azure location for the resource group.
- Azure Storage Account ID: Used to manage the storage container settings.
- Azure Storage Container Name: The name of the container being used.
- Azure Kubernetes Cluster Name: The name of the Azure Kubernetes Cluster hosting the Wallaroo instance.
- Azure Storage Account Access Key: The method the Velero service uses to authenticate with Azure to create the backup and restore jobs. Velero describes several options in the Set permissions for Velero section of the Velero Plugin for Microsoft Azure documentation. The steps below use a storage account access key.
If these elements are available, skip to the Install Velero In the Wallaroo Azure Kubernetes Cluster step.
Get Azure Subscription ID
To retrieve the Azure Subscription ID:
- Log in to Microsoft Azure.
- From the search bar, search for Subscription.
- From the Subscriptions Dashboard, select the Subscription ID to be used and store it for later use.
Create Azure Resource Group
To create the Azure Resource Group, use the following command, replacing the variables $AZURE_VELERO_RESOURCE_GROUP and $AZURE_LOCATION with your organization’s requirements.
az group create -n $AZURE_VELERO_RESOURCE_GROUP --location $AZURE_LOCATION
Create Azure Storage Account
To create the Azure Storage Account, the Azure Storage Account ID must be 3 to 24 characters long, consist only of lowercase letters and numbers, and be unique across Azure. So velerobackupaccount is appropriate, while VELERO_BACKUP is not. Update the variables $AZURE_VELERO_RESOURCE_GROUP and $AZURE_STORAGE_ACCOUNT_ID with your organization’s requirements.
AZURE_STORAGE_ACCOUNT_ID="wallaroovelerostorage"
az storage account create \
--name $AZURE_STORAGE_ACCOUNT_ID \
--resource-group $AZURE_VELERO_RESOURCE_GROUP \
--sku Standard_GRS \
--encryption-services blob \
--https-only true \
--min-tls-version TLS1_2 \
--kind BlobStorage \
--access-tier Hot
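Azure rejects storage account names that break its naming rules, so it can help to check the name before running the command above. A minimal sketch, assuming Azure's documented rules for storage account names (3 to 24 characters, lowercase letters and numbers only); the function name is hypothetical, and global uniqueness is not checked.

```shell
# Hypothetical helper: succeed if $1 is a syntactically valid Azure storage account name.
is_valid_storage_account_name() {
  name="$1"
  len=${#name}
  # Length must be between 3 and 24 characters.
  if [ "$len" -lt 3 ] || [ "$len" -gt 24 ]; then
    return 1
  fi
  # Remove every allowed character; anything left over is invalid.
  leftover=$(printf '%s' "$name" | LC_ALL=C tr -d 'a-z0-9')
  [ -z "$leftover" ]
}

# Example with a placeholder name.
if is_valid_storage_account_name "wallaroovelerostorage"; then
  echo "storage account name is valid"
fi
```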
Create Azure Storage Container
Use the following command to create the Azure Storage Container for use by the Velero service. Replace the BLOB_CONTAINER variable with your organization’s requirements. Note that this new container should have a unique name.
BLOB_CONTAINER=velero
az storage container create -n $BLOB_CONTAINER --public-access off --account-name $AZURE_STORAGE_ACCOUNT_ID
Create Azure Storage Account Access Key
This step sets a method for the Velero service to authenticate with Azure to create the backup and restore jobs. Velero recommends different options in its Velero Plugin for Microsoft Azure Set permissions for Velero documentation. Organizations are encouraged to use the method that aligns with their security requirements.
The steps below will cover using a storage account access key.
Set the default resource group to the same one used for the Velero Resource Group in the step Create Azure Resource Group.
az configure --defaults group=$AZURE_VELERO_RESOURCE_GROUP
Retrieve the Azure Storage Account Access Key using the $AZURE_STORAGE_ACCOUNT_ID created in the step Create Azure Storage Account. Store this key in a secure location.
AZURE_STORAGE_ACCOUNT_ACCESS_KEY=`az storage account keys list --account-name $AZURE_STORAGE_ACCOUNT_ID --query "[?keyName == 'key1'].value" -o tsv`
Store the Azure cloud environment name as AZURE_CLOUD_NAME (AzurePublicCloud for the public Azure cloud) and the $AZURE_STORAGE_ACCOUNT_ACCESS_KEY in a secret key file. The following command stores them in ~/.credentials-velero-azure:
cat << EOF > ~/.credentials-velero-azure
AZURE_STORAGE_ACCOUNT_ACCESS_KEY=${AZURE_STORAGE_ACCOUNT_ACCESS_KEY}
AZURE_CLOUD_NAME=AzurePublicCloud
EOF
Install Velero In the Wallaroo Azure Kubernetes Cluster
This step will install the Velero service into the Azure Kubernetes Cluster hosting the Wallaroo instance using the variables from the steps above.
Install the Velero service into the cluster with the following command:
velero install \
--provider azure \
--plugins velero/velero-plugin-for-microsoft-azure:v1.6.0 \
--bucket $BLOB_CONTAINER \
--secret-file ~/.credentials-velero-azure \
--backup-location-config storageAccount=$AZURE_STORAGE_ACCOUNT_ID,storageAccountKeyEnvVar=AZURE_STORAGE_ACCOUNT_ACCESS_KEY \
--use-volume-snapshots=false \
--use-node-agent --wait
Once complete, verify the installation by checking for the velero namespace in the Kubernetes cluster:
kubectl get namespaces
NAME STATUS AGE
default Active 222d
kube-node-lease Active 222d
kube-public Active 222d
kube-system Active 222d
velero Active 5m32s
wallaroo Active 7d23h
To view the logs for the Velero service installation, use the command kubectl logs deployment/velero -n velero.
If the Wallaroo installation uses Kubernetes taints and tolerations, patch the node-agent DaemonSet in the velero namespace so that its pods tolerate all taints:
kubectl -n velero patch ds node-agent -p='{"spec": {"template": {"spec": {"tolerations":[{"operator": "Exists"}]}}}}'
3 - Velero GCP Cluster Installation
How to set up Velero with a GCP Kubernetes cluster
The following instructions are based on the Velero Plugin for Google Cloud Platform (GCP) instructions.
These steps assume the user has installed the gcloud Command-Line Interface (CLI) and gsutil tool and has the necessary permissions to perform the steps below.
The following items are required to create the Velero bucket via a GCP Bucket:
- Google Cloud Platform (GCP) Project ID: The ID of the project in which the commands are run.
- Google Cloud Storage (GCS) Bucket: The object storage bucket where backups are stored.
- Google Service Account (GSA): A Velero-specific Google Service Account used to back up and restore the Wallaroo instance when required.
- Either a Google Service Account Key or Workload Identity: The Velero service uses one of these methods to authenticate to GCP for its backup and restore tasks.
If these items are already complete, jump to the step Install Velero In the Wallaroo GCP Kubernetes Cluster.
Create GCS Bucket
Create the GCS bucket for storing the Wallaroo backups and restores with the following command. Replace the variable $BUCKET_NAME based on your organization’s requirements.
BUCKET_NAME=<YOUR_BUCKET>
gsutil mb gs://$BUCKET_NAME/
Create Google Service Account for Velero
Create the Google Service Account for the Velero service using the following commands:
Retrieve your organization’s GCP Project ID and store it in the PROJECT_ID variable. Note that this will retrieve the default project ID for the gcloud configuration. Replace with the actual GCP Project ID as required.
PROJECT_ID=$(gcloud config get-value project)
Create the service account. Update the $GSA_NAME variable based on the organization’s requirements.
GSA_NAME=velero
gcloud iam service-accounts create $GSA_NAME \
--display-name "Velero service account"
Use gcloud iam service-accounts list to list the service accounts.
gcloud iam service-accounts list
DISPLAY NAME EMAIL DISABLED
Velero service account veleroexample.iam.gserviceaccount.com False
Select the email address for the new Velero service account and set the variable SERVICE_ACCOUNT_EMAIL equal to the account’s email address:
SERVICE_ACCOUNT_EMAIL=veleroexample.iam.gserviceaccount.com
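For a user-managed service account in a project without a domain prefix, the email address follows the pattern NAME@PROJECT_ID.iam.gserviceaccount.com, so it can also be derived from the variables already set. A minimal sketch with a hypothetical function name; confirm the result against the gcloud iam service-accounts list output before using it.

```shell
# Hypothetical helper: compose a service account email from its name and project ID.
gsa_email() {
  # $1 = service account name, $2 = GCP project ID
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Example with placeholder values; in practice pass "$GSA_NAME" and "$PROJECT_ID".
gsa_email "velero" "example-project"
```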
Create a Custom Role with the following minimum permissions, and bind it to the new Velero service account. The ROLE needs to be unique and DNS compliant.
ROLE="velero.server"
TITLE="Velero Server"
ROLE_PERMISSIONS=(
compute.disks.get
compute.disks.create
compute.disks.createSnapshot
compute.snapshots.get
compute.snapshots.create
compute.snapshots.useReadOnly
compute.snapshots.delete
compute.zones.get
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
iam.serviceAccounts.signBlob
)
gcloud iam roles create $ROLE \
--project $PROJECT_ID \
--title "$TITLE" \
--permissions "$(IFS=","; echo "${ROLE_PERMISSIONS[*]}")"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member serviceAccount:$SERVICE_ACCOUNT_EMAIL \
--role projects/$PROJECT_ID/roles/$ROLE
Grant the new service account object admin access to the bucket:
gsutil iam ch serviceAccount:$SERVICE_ACCOUNT_EMAIL:objectAdmin gs://${BUCKET_NAME}
Grant Velero Service GCP Access
There are multiple methods of granting the Velero service GCP access as detailed in the Plugins for Google Cloud Platform (GCP) Grant access to Velero steps. The following examples will use the Service Account Key method.
Create the Google Service Account Key, and store it in a secure location. In this example, it is stored in ~/.credentials-velero-gcp:
gcloud iam service-accounts keys create ~/.credentials-velero-gcp \
--iam-account $SERVICE_ACCOUNT_EMAIL
Install Velero In the Wallaroo GCP Kubernetes Cluster
The following steps assume that the Google Service Account Key method was used in the Grant Velero Service GCP Access. See the Plugins for Google Cloud Platform (GCP) Grant access to Velero for other methods.
To install the Velero service into the Kubernetes cluster hosting the Wallaroo service:
Verify the connection to the GCP Kubernetes cluster hosting the Wallaroo instance.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-wallaroodocs-ce-default-pool-5dd3c344-fxs3 Ready <none> 31s v1.23.14-gke.1800
gke-wallaroodocs-ce-default-pool-5dd3c344-q95a Ready <none> 25d v1.23.14-gke.1800
gke-wallaroodocs-ce-default-pool-5dd3c344-scmc Ready <none> 31s v1.23.14-gke.1800
gke-wallaroodocs-ce-default-pool-5dd3c344-wnkn Ready <none> 31s v1.23.14-gke.1800
Install Velero into the GCP Kubernetes cluster. This assumes the $BUCKET_NAME variable from earlier, and that the Google Service Account Key is stored in ~/.credentials-velero-gcp.
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.6.0 \
--bucket $BUCKET_NAME \
--secret-file ~/.credentials-velero-gcp \
--use-volume-snapshots=false \
--use-node-agent --wait
Once complete, verify the installation by checking for the velero namespace in the Kubernetes cluster:
kubectl get namespaces
NAME STATUS AGE
default Active 222d
kube-node-lease Active 222d
kube-public Active 222d
kube-system Active 222d
velero Active 5m32s
wallaroo Active 7d23h
If the Wallaroo installation uses Kubernetes taints and tolerations, patch the node-agent DaemonSet in the velero namespace so that its pods tolerate all taints:
kubectl -n velero patch ds node-agent -p='{"spec": {"template": {"spec": {"tolerations":[{"operator": "Exists"}]}}}}'
4 - Wallaroo Backup and Restore with Velero Guide
How to use Velero in an installed Kubernetes cluster to back up and restore a Wallaroo instance
Once the Velero Installation Procedure and the Velero Kubernetes Install are complete, Wallaroo instance backups are performed through the following process:
Before starting the backup, force the Plateau service to complete writing logs so they can be captured by the backup.
kubectl -n wallaroo scale --replicas=0 deploy/plateau
kubectl -n wallaroo scale --replicas=1 deploy/plateau
Set the $BACKUP_NAME. The name must consist only of lowercase letters, numbers, - or ., and must end in an alphanumeric character.
BACKUP_NAME={give it your own name}
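The naming rule can be checked before the backup is created. The sketch below is one way to do so, with a hypothetical function name; it additionally requires the name to start with an alphanumeric character, which is an assumption based on Kubernetes resource naming rather than something stated above.

```shell
# Hypothetical helper: succeed if $1 is an acceptable backup name.
is_valid_backup_name() {
  name="$1"
  [ -n "$name" ] || return 1
  # Remove every allowed character; anything left over is invalid.
  leftover=$(printf '%s' "$name" | LC_ALL=C tr -d 'a-z0-9.-')
  [ -z "$leftover" ] || return 1
  # Reject names that start or end with '-' or '.'.
  case "$name" in
    -*|.*|*-|*.) return 1 ;;
  esac
  return 0
}

# Example: a date-stamped name; the "wallaroo-" prefix is an arbitrary choice.
BACKUP_NAME="wallaroo-$(date +%Y%m%d)"
is_valid_backup_name "$BACKUP_NAME" && echo "using backup name $BACKUP_NAME"
```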
Issue the following backup command. The --exclude-namespaces option excludes namespaces that are not required for the Wallaroo backup and restore. By default, these are the namespaces velero, default, kube-node-lease, kube-public, and kube-system.
This process will back up all namespaces that are not excluded, including deployed Wallaroo pipelines. Add any other namespaces that should not be part of the backup to the --exclude-namespaces option as per your organization’s requirements.
velero backup create $BACKUP_NAME --default-volumes-to-fs-backup --include-cluster-resources=true --exclude-namespaces velero,default,kube-node-lease,kube-public,kube-system
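When additional namespaces need to be excluded, the comma-separated list can be built up from the default set rather than retyped. A minimal sketch; the function name and the monitoring namespace are hypothetical examples.

```shell
# The namespaces excluded by the backup command above.
DEFAULT_EXCLUDES="velero,default,kube-node-lease,kube-public,kube-system"

# Hypothetical helper: print the default exclusions plus any extra namespaces
# given as arguments, joined with commas.
exclude_namespaces() {
  list="$DEFAULT_EXCLUDES"
  for ns in "$@"; do
    list="$list,$ns"
  done
  printf '%s\n' "$list"
}

# Example: also exclude a namespace named "monitoring".
exclude_namespaces monitoring
```

The result can be passed to the backup command as --exclude-namespaces "$(exclude_namespaces monitoring)".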
To view the status of the backup, use the following command. Once the Completed field shows a date and time, the backup is complete.
velero backup describe $BACKUP_NAME
Name: doctest-20230315a
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.23.15
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=23
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: *
Excluded: velero, default, kube-node-lease, kube-public, kube-system
Resources:
Included: *
Excluded: <none>
Cluster-scoped: included
Label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2023-03-15 10:52:27 -0600 MDT
Completed: 2023-03-15 10:52:49 -0600 MDT
Expiration: 2023-04-14 10:52:27 -0600 MDT
Total items to be backed up: 397
Items backed up: 397
Velero-Native Snapshots: <none included>
restic Backups (specify --details for more information):
Completed: 5
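In a script, the Phase line of this output is a convenient single value to check. A minimal sketch, assuming the plain-text layout shown above (a line beginning with Phase: followed by the phase name); the function name is hypothetical.

```shell
# Hypothetical helper: read `velero backup describe` output on stdin and
# print the value of the first "Phase:" line.
backup_phase() {
  awk '!found && $1 == "Phase:" { print $2; found = 1 }'
}

# Example with captured sample text; in practice pipe in the describe output.
printf 'Name: doctest-20230315a\nPhase: Completed\nErrors: 0\n' | backup_phase
```

Against a live cluster this would be used as velero backup describe "$BACKUP_NAME" | backup_phase.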
List Previous Wallaroo Backups
To list previous Wallaroo backups and their logs, use the following commands:
Use velero backup get to list all backups and their progress:
velero backup get
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
doctest-20230315a Completed 0 0 2023-03-15 10:52:27 -0600 MDT 28d default <none>
doctest-magicalbear-20230315 Completed 0 1 2023-03-15 11:52:17 -0600 MDT
Retrieve backup logs with velero backup logs $BACKUP_NAME:
velero backup logs $BACKUP_NAME
Wallaroo Restore Procedure
To restore from a Wallaroo backup:
Set the backup name as the variable $BACKUP_NAME. Use the command velero backup get for a list of previous backups.
velero backup get
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
doctest-20230315a Completed 0 0 2023-03-15 10:52:27 -0600 MDT 28d default <none>
doctest-magicalbear-20230315 Completed 0 1 2023-03-15 11:52:17 -0600 MDT
BACKUP_NAME={name of the backup to restore}
Use the velero restore create command to create the restore job, using the $BACKUP_NAME variable set in the step above.
velero restore create --from-backup $BACKUP_NAME
Restore request "doctest-20230315a-20230315105647" submitted successfully.
Run `velero restore describe doctest-20230315a-20230315105647` or `velero restore logs doctest-20230315a-20230315105647` for more details.
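Velero generates the restore name from the backup name and a timestamp, as in the output above. To reuse that name in later commands, it can be pulled out of the confirmation message. A minimal sketch, assuming the message format shown above; the function name is hypothetical.

```shell
# Hypothetical helper: read the output of `velero restore create` on stdin
# and print the generated restore name from the confirmation line.
restore_name() {
  sed -n 's/^Restore request "\([^"]*\)" submitted successfully\.$/\1/p'
}

# Example using the sample confirmation line.
printf '%s\n' 'Restore request "doctest-20230315a-20230315105647" submitted successfully.' | restore_name
```

Alternatively, velero restore create accepts a restore name as its first argument, which avoids parsing the output altogether.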
To check the restore status, use the velero restore describe command. The optional flag --details provides more information.
velero restore describe doctest-20230315a-20230315105647 --details
If the Kubernetes cluster does not have a static IP address assigned to the Wallaroo loadBalancer service, the DNS information may need to be updated if the IP address has changed. Check with the DNS Integration Guide for more information.