Upgrade Standard Single Node Installations of Wallaroo via Kots
Upgrading Wallaroo follows these general steps.
- Pre-upgrade Checklist: Actions that should be performed before the upgrade process is initiated.
- Upgrade Procedure: Steps for upgrading Wallaroo to a specified version.
Pre-Upgrade Checklist
Before starting an upgrade of Wallaroo, the following steps should be performed to provide a smooth transition from the previous version of Wallaroo to the new one.
Action | Complete (Y/N) |
---|---|
Create Support Bundle | |
Notify Users of Downtime | |
Backup Wallaroo | |
Create Support Bundle
A support bundle is a collection of logs, configurations, and other information for Wallaroo support staff. It should be generated before the upgrade procedure starts to preserve a set of current settings and information useful for tracking any potential issues during the upgrade process.
Support bundles are generated using one of the following two methods.
At any time, the administration console can create troubleshooting bundles for Wallaroo technical support to assess product health and help with problems. Support bundles contain logs and configuration files which can be examined before downloading and transmitting to Wallaroo. The console also has a configurable redaction mechanism for cases where sensitive information such as passwords, tokens, or PII (Personally Identifiable Information) needs to be removed from logs in the bundle.

Create Support Bundles via the Wallaroo Administrator Dashboard
This process is for kots based installations of Wallaroo.
This assumes that kubectl and kots have been installed in a terminal with administrative access to the Kubernetes cluster hosting the Wallaroo installation.
- Launch the Kots Administrative Dashboard with the following command, replacing $WALLAROO_NAMESPACE with the namespace the Wallaroo instance is installed in:
  kubectl kots admin-console --namespace $WALLAROO_NAMESPACE
  For example:
  kubectl kots admin-console --namespace wallaroo
- Log into the administration console with the Administrative Dashboard password set during the installation process.
- Select the Troubleshoot tab.
- Select Analyze Wallaroo.
- Select Download bundle to save the bundle file as a compressed archive. Depending on your browser settings, the file download location can be specified.
- Send the file to Wallaroo technical support.
At any time, any existing bundle can be examined and downloaded from the Troubleshoot tab.
Create Support Bundles via the Command Line
To generate a support bundle via the command line for either kots or helm based installations of Wallaroo, the following applications are used.
- kubectl
- kubectl plugins:
  - krew: Install Krew
  - krew support-bundle: Install with kubectl krew install support-bundle.
The following command collects log files, configuration files, and other details into a .tar.gz file in the directory the command is run from, named in the format support-bundle-YYYY-MM-DDTHH-MM-SS.tar.gz. This file is submitted to the Wallaroo support team for review.
kubectl support-bundle --load-cluster-specs --interactive=false
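Because the bundle name embeds a sortable timestamp, the most recent bundle in a directory can be picked out by name alone. The following is a minimal sketch under that assumption; the function name latest_support_bundle is hypothetical and not part of any Wallaroo or kubectl tooling.

```shell
# Print the newest support-bundle-*.tar.gz in directory $1 (default: current directory).
# Relies on the YYYY-MM-DDTHH-MM-SS timestamp in the file name sorting chronologically,
# so the last glob match is the most recent bundle.
latest_support_bundle() (
  latest=
  for f in "${1:-.}"/support-bundle-*.tar.gz; do
    # Skip the unexpanded pattern when no bundle exists.
    [ -e "$f" ] && latest=$f
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
)

# Example usage (commented out so the block only defines the function):
#   bundle=$(latest_support_bundle .) && tar -tzf "$bundle" | head
```

Listing the archive contents with tar -tzf is one way to examine the bundle before sending it to Wallaroo support.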
Notify Users of Downtime
The following is a short list of users to notify before the Wallaroo Ops downtime. ALL users that interact with Wallaroo Ops should be informed; the list below helps DevOps engineers identify which stakeholders to notify.
Stakeholder | Description | What Users will Experience |
---|---|---|
Wallaroo Dashboard Users | Users that interact via the Wallaroo Dashboard. | If users are active in the dashboard: (screenshot) If a user attempts to go to the dashboard in their browser: (screenshot) |
API users | These users interact with Wallaroo via the Wallaroo MLOps API or related API services. | MLOps API and other API services to the Wallaroo Ops instance will not be available during the upgrade and any requests will return a 503 (service unavailable) with the message “Wallaroo upgrade in progress.” |
External SDK users | Users who perform actions via the Wallaroo SDK that do not use the Wallaroo JupyterHub service. | Wallaroo SDK connections will not be available during the upgrade and any attempts to use the SDK will return an error. |
Wallaroo JupyterHub users | Users who use the Wallaroo JupyterHub Service to run Jupyter Notebooks with the Wallaroo Ops instance. | If users are in JupyterHub at the time of the upgrade: (screenshot) If users attempt to go to JupyterHub during the upgrade: (screenshot) |
Deployed Pipeline Users | Users who perform inferences through deployed pipelines. | Deployed pipelines are undeployed during the upgrade process. Once the upgrade process is complete, any previously deployed pipelines are automatically redeployed. Any inference requests to deployed pipelines during the upgrade will return a 503 (service unavailable) with the message “Wallaroo upgrade in progress.” |
Edge and Multi-cloud Deployment Users | Users and services that perform inference requests and other services via models deployed to multicloud and edge locations. | Edge and multicloud deployments of Wallaroo are not interrupted. While an upgrade is in progress, no logs can be received by the Wallaroo Ops instance. Once connection is restored, edge locations upload their inference logs. |
Wallaroo Orchestration Users | Scheduled Wallaroo orchestrations. | Scheduled Wallaroo orchestration tasks will be interrupted during the upgrade process. Scheduled task runs will be missed while the upgrade is in progress and will run at their next scheduled time after the upgrade completes. |
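API clients that see the 503 responses described in the table can poll until the service responds again rather than failing outright. The following is a rough sketch only: the function names, the retry counts, and the choice to treat any status other than 503 as "available" are assumptions, and the URL to probe depends on your installation.

```shell
# Print the HTTP status code for URL $1 (curl prints 000 when it cannot connect).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Poll until the probe stops reporting 503.
# $1: command that prints a status code for a URL (for example http_status)
# $2: URL to probe (hypothetical; use an endpoint of your Wallaroo Ops instance)
# $3: maximum attempts (default 60)   $4: seconds between attempts (default 30)
wait_until_available() (
  probe=$1
  url=$2
  attempts=${3:-60}
  delay=${4:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$("$probe" "$url")
    case "$status" in
      503 | 000 | "") ;;   # upgrade in progress, or no connection yet: keep waiting
      *) exit 0 ;;         # any other status is treated as available (assumption)
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  exit 1
)

# Example usage (commented out; the URL is a placeholder):
#   wait_until_available http_status "https://wallaroo.example.com/" 60 30
```

With the defaults shown, the loop waits up to 30 minutes, which can be raised to cover the longer end of the expected upgrade window.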
Backup Wallaroo
Before starting the upgrade procedure, back up the Wallaroo Ops instance. The following procedure summary is based on the provided Wallaroo Backup and Restore Guides.
Wallaroo Backup Procedure
Before starting the backup, force the Plateau service to complete writing logs so they can be captured by the backup. This assumes that Wallaroo was installed in the namespace wallaroo.
kubectl -n wallaroo scale --replicas=0 deploy/plateau
kubectl -n wallaroo scale --replicas=1 deploy/plateau
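The two scale commands can be wrapped so that the namespace is a parameter and the backup does not start until Plateau is running again. This is a sketch, not part of the Wallaroo guides: the function name flush_plateau_logs and the five minute timeout are assumptions, and the added kubectl rollout status step is an extra check that the original procedure does not include.

```shell
# Restart the Plateau deployment so pending logs are written before the backup.
# $1: namespace of the Wallaroo installation (default: wallaroo)
flush_plateau_logs() (
  ns=${1:-wallaroo}
  kubectl -n "$ns" scale --replicas=0 deploy/plateau &&
    kubectl -n "$ns" scale --replicas=1 deploy/plateau &&
    # Extra step (assumption): wait until the restarted deployment is available again.
    kubectl -n "$ns" rollout status deploy/plateau --timeout=300s
)

# Example usage (commented out so the block only defines the function):
#   flush_plateau_logs wallaroo
```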
Set the $BACKUP_NAME variable. This must consist only of lowercase letters, numbers, hyphens (-), or periods (.), and must end in an alphanumeric character.
BACKUP_NAME={give it your own name}
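The naming rule can be checked before the backup is started, so a rejected name does not interrupt the procedure. The sketch below encodes only the rule as stated here (lowercase letters, numbers, hyphens, periods, alphanumeric ending); the function name and the example backup name are hypothetical.

```shell
# Succeed when $1 contains only lowercase letters, digits, '-' or '.',
# and ends in a lowercase letter or digit.
is_valid_backup_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9.-]*[a-z0-9]$'
}

# Example usage (commented out; the name is a placeholder):
#   BACKUP_NAME="wallaroo-pre-upgrade-$(date +%Y%m%d)"
#   is_valid_backup_name "$BACKUP_NAME" || echo "invalid backup name: $BACKUP_NAME" >&2
```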
Issue the following backup command. The --exclude-namespaces option is used to exclude namespaces that are not required for the Wallaroo backup and restore. By default, these are the namespaces velero, default, kube-node-lease, kube-public, and kube-system.
This process will back up all namespaces that are not excluded, including deployed Wallaroo pipelines. Add any other namespaces that should not be part of the backup to the --exclude-namespaces option as per your organization’s requirements.
velero backup create $BACKUP_NAME --default-volumes-to-fs-backup --include-cluster-resources=true --exclude-namespaces velero,default,kube-node-lease,kube-public,kube-system
To view the status of the backup, use the following command. Once the Completed field shows a date and time, the backup is complete.
velero backup describe --details $BACKUP_NAME
In progress backup.
velero backup describe --details $BACKUP_NAME
Name: sample-doctest-backup-20240502
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/resource-timeout=10m0s
velero.io/source-cluster-k8s-gitversion=v1.28.7-gke.1026000
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=28
Phase: InProgress
Namespaces:
Included: *
Excluded: velero, default, kube-node-lease, kube-public, kube-system
Resources:
Included: *
Excluded: <none>
Cluster-scoped: included
Label selector: <none>
Or label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
Snapshot Move Data: false
Data Mover: velero
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 4h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2024-05-14 16:26:43 -0600 MDT
Completed: <n/a>
Expiration: 2024-06-13 16:26:43 -0600 MDT
Estimated total items to be backed up: 1073
Items backed up so far: 28
Resource List: <backup resource list not found>
Backup Volumes:
Velero-Native Snapshots: <none included>
CSI Snapshots: <none included or not detectable>
Pod Volume Backups - kopia:
Completed:
gmp-system/alertmanager-0: alertmanager-config, alertmanager-data
gmp-system/collector-cdsm4: config-out, storage
gmp-system/collector-fslhc: config-out, storage
gmp-system/collector-p6f85: config-out, storage
gmp-system/collector-q4djj: config-out, storage
gmp-system/rule-evaluator-7874c6f478-672vs: config-out
wallaroo/hub-65c45d4c7-nb9lp: pvc
wallaroo/kotsadm-b4f68468d-dzj5c: backup, tmp
wallaroo/kotsadm-minio-0: kotsadm-minio, minio-cert-dir, minio-config-dir
wallaroo/kotsadm-rqlite-0: kotsadm-rqlite, tmp
In Progress:
wallaroo/minio-cf97d78cb-pv82x: export
Completed backup.
velero backup describe --details $BACKUP_NAME
Name: sample-doctest-backup-20240502
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/resource-timeout=10m0s
velero.io/source-cluster-k8s-gitversion=v1.28.7-gke.1026000
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=28
Phase: Completed
Warnings:
Velero: <none>
Cluster: <none>
Namespaces:
wallaroo: resource: /pods name: /kotsadm-b4f68468d-dzj5c message: /volume migrations is declared in pod wallaroo/kotsadm-b4f68468d-dzj5c but not mounted by any container, skipping
Namespaces:
Included: *
Excluded: velero, default, kube-node-lease, kube-public, kube-system
Resources:
Included: *
Excluded: <none>
Cluster-scoped: included
Label selector: <none>
Or label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
Snapshot Move Data: false
Data Mover: velero
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 4h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2024-05-14 16:26:43 -0600 MDT
Completed: 2024-05-14 16:32:19 -0600 MDT
Expiration: 2024-06-13 16:26:43 -0600 MDT
Total items to be backed up: 719
Items backed up: 719
Resource List:
admissionregistration.k8s.io/v1/MutatingWebhookConfiguration:
- gmp-operator.gmp-system.monitoring.googleapis.com
- neg-annotation.config.common-webhooks.networking.gke.io
- pod-ready.config.common-webhooks.networking.gke.io
- warden-mutating.config.common-webhooks.networking.gke.io
…Other backed up resources
warden.gke.io/v1/Audit:
- autogke-default-linux-capabilities
- autogke-disallow-hostnamespaces
- autogke-disallow-privilege
- autogke-no-host-port
- autogke-no-write-mode-hostpath
- autogke-node-affinity-selector-limitation
- autogke-pod-affinity-limitation
- autopilot-admission-webhook-config-limitation
- autopilot-capacity-request-limitation
- autopilot-external-ip-limitation
- autopilot-no-ephemeral-containers
- autopilot-persistent-volume-limitation
- autopilot-volume-type-limitation
Backup Volumes:
Velero-Native Snapshots: <none included>
CSI Snapshots: <none included>
Pod Volume Backups - kopia:
Completed:
gmp-system/alertmanager-0: alertmanager-config, alertmanager-data
gmp-system/collector-cdsm4: config-out, storage
gmp-system/collector-fslhc: config-out, storage
gmp-system/collector-p6f85: config-out, storage
gmp-system/collector-q4djj: config-out, storage
gmp-system/rule-evaluator-7874c6f478-672vs: config-out
wallaroo/hub-65c45d4c7-nb9lp: pvc
wallaroo/kotsadm-b4f68468d-dzj5c: backup, tmp
wallaroo/kotsadm-minio-0: kotsadm-minio, minio-cert-dir, minio-config-dir
wallaroo/kotsadm-rqlite-0: kotsadm-rqlite, tmp
wallaroo/minio-cf97d78cb-pv82x: export
wallaroo/nats-0: nats-js, pid
wallaroo/plateau-7dfbd89655-9xz6v: plateau-storage
wallaroo/postgres-74d6948c48-mjmb5: postgres-storage
wallaroo/prometheus-deployment-666d968bfd-cxp46: alert-config-volume, metrics-storage-volume
wallaroo/wallsvc-0: socket-volume, spire-data
HooksAttempted: 1
HooksFailed: 0
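Rather than re-running the describe command by hand, the Phase field shown in the sample output can be read in a loop. The following is a minimal sketch, assuming the describe output is plain text with a line starting with Phase: as in the samples above; the function names and the 30 second default interval are hypothetical.

```shell
# Read `velero backup describe` output on stdin and print the value of the Phase field.
backup_phase() {
  awk '$1 == "Phase:" { print $2; exit }'
}

# Poll a backup until it reaches a terminal phase.
# $1: backup name   $2: seconds between checks (default 30)
# Succeeds on Completed; fails on Failed, PartiallyFailed, or FailedValidation.
wait_for_backup() (
  while :; do
    phase=$(velero backup describe "$1" | backup_phase)
    case "$phase" in
      Completed) exit 0 ;;
      Failed | PartiallyFailed | FailedValidation) exit 1 ;;
    esac
    sleep "${2:-30}"
  done
)

# Example usage (commented out so the block only defines the functions):
#   wait_for_backup "$BACKUP_NAME" 30 && echo "backup complete"
```

The loop has no overall timeout, so it is best run interactively or under an external time limit.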
Upgrade Procedure
Depending on the size and number of workspaces and artifacts, a typical upgrade can take 30-60 minutes. Select one of the following options based on the Wallaroo Install Process:
- Install via kots or [Single Node](https://docs.wallaroo.ai/wallaroo-platform-operations/wallaroo-platform-operations-install/wallaroo-platform-operations-installation/wallaroo-platform-operations-install-single-node/wallaroo-enterprise-install-guide-single-node-linux/): Select Upgrade via Kots.
- Install via helm: Select Upgrade via Helm.
Upgrade via Kots
The following procedure is used to upgrade a Wallaroo Ops instance via kots. For Single Node installations, skip to Upgrade via Kots Procedure.
Update kots CLI and Cluster
The following section details how to upgrade the kots plugin and cluster in preparation for upgrading the Wallaroo installation. Note that the default timeout for kots based installations is 10 minutes. For more details, see kots CLI flags documentation.
Kubernetes and Kots Client Software Prerequisites
Before installing or upgrading Wallaroo, the administrative node managing the Kubernetes cluster will require these tools.
- kubectl
- For Kots based installs:
  - kots Version 1.124.4
- For Helm installs:
  - helm: Install Helm
    - Minimum supported version: Helm 3.11.2
- krew: Install Krew
  - krew preflight and krew support-bundle. Install with the following commands:
    kubectl krew install support-bundle
    kubectl krew install preflight
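The version requirements in the list above can be checked with a small comparison helper. This is a sketch only: the function names are hypothetical, and the assumption that helm version --short prints a value shaped like v3.11.2+g912ebc1 should be verified against your installed client.

```shell
# Strip a leading "v" and any "+build" suffix, e.g. v3.11.2+g912ebc1 -> 3.11.2.
normalize_version() {
  printf '%s\n' "$1" | sed -e 's/^v//' -e 's/+.*$//'
}

# Succeed when dotted numeric version $1 is greater than or equal to $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    len = (n > m) ? n : m
    for (i = 1; i <= len; i++) {
      xi = (i <= n) ? x[i] + 0 : 0   # missing fields count as 0
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Example usage (commented out; output format of the helm command is an assumption):
#   version_ge "$(normalize_version "$(helm version --short)")" 3.11.2 ||
#     echo "Helm is older than the minimum supported version 3.11.2" >&2
```

Fields are compared numerically, so 3.9.4 is correctly treated as older than 3.11.2.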
The following is a quick guide for installing kubectl on macOS.
To install kubectl on a macOS system using Homebrew:
Issue the brew install command:
brew install kubectl
Verify the installation:
kubectl version --client
Upgrade Kots Client Procedure
To upgrade the version of kots used:
Upgrade the kubectl kots plugin to the specific version with the following command. For more details, see Installing the KOTS CLI.
curl https://kots.io/install/1.124.4 | bash
Upgrade Kots Cluster Application Procedure
Upgrade the kots version of an installed cluster to match your CLI kots version via the following command, replacing $NAMESPACE with the namespace the Wallaroo instance is installed in. For more details, see Update an Application: Using the KOTS CLI.
kubectl kots admin-console upgrade -n $NAMESPACE
For example, if the Wallaroo installation is in the default namespace wallaroo, the command is:
kubectl kots admin-console upgrade -n wallaroo
Upgrade via Kots Procedure
To upgrade a kots based installation of Wallaroo:
For Single Node installations, access the Kots Administrative Dashboard in a browser at the installation’s external IP address and port 30000. For example:
http://{YOUR IP ADDRESS}:30000
Access the Kots Administrative Dashboard via the domain name and port as provided in the previous step.
From the Kots Administrative Dashboard:
- If there is a new version of Wallaroo to install based on your Wallaroo license type, it will be displayed under the Version (B) display as New Version Available. Select Check for updates to check for updated versions.
- Select the version to upgrade to.
- To perform a preflight check, select the preflight icon and verify the cluster meets the requirements.
- If ready to upgrade, select Deploy (C).
- Verify the upgrade process by selecting Yes, Deploy.
During the upgrade process, the status indicator (A) changes from Ready to Unavailable. Selecting Details will show which services are available or are still being upgraded.
When the upgrade process is complete, the status indicator will change to Ready. At this point, users can resume their normal operations.