1 - Wallaroo Helm Standard Cloud Prerequisites
General Time to Completion: 30 minutes.
Before installing Wallaroo, verify that the following hardware and software requirements are met.
Environment Requirements
Environment Hardware Requirements
The following are the minimum system requirements for running Wallaroo in a Kubernetes cloud cluster.
- Minimum number of nodes: 4
- Minimum Number of CPU Cores: 8
- Minimum RAM per node: 16 GB
- Minimum Storage: A total of 625 GB of storage will be allocated for the entire cluster, based on 5 users with up to four pipelines of five steps each. 50 GB is allocated per node, including 50 GB specifically for the JupyterHub service. Enterprise users who deploy additional pipelines will require an additional 50 GB of storage per lab node deployed.
Wallaroo recommends at least 16 cores total to enable all services. With fewer than 16 cores, services must be disabled to allow basic functionality, as detailed in this table.
| Cluster Size | | < 8 core | 8 core/48GB | 16 core/48GB | 32 core/48GB | Description |
|---|---|---|---|---|---|---|
| Inference | | ✔ | ✔ | ✔ | ✔ | The Wallaroo inference engine that performs inference requests from deployed pipelines. |
| Dashboard | | ✘ | ✔ | ✔ | ✔ | The graphical user interface for configuring workspaces, deploying pipelines, tracking metrics, and other uses. |
| Jupyter HUB/Lab | | | | | | The JupyterHub service for running Python scripts, Jupyter Notebooks, and other related tasks within the Wallaroo instance. |
| | Single Lab | ✘ | ✔ | ✔ | ✔ | |
| | Multiple Labs | ✘ | ✘ | ✔ | ✔ | |
| Prometheus | | ✘ | ✔ | ✔ | ✔ | Used for collecting and reporting on metrics. Typical metrics are values such as CPU utilization and memory usage. |
| | Alerting | ✘ | ✘ | ✔ | ✔ | |
| | Model Validation | ✘ | ✘ | ✔ | ✔ | |
| | Dashboard Graphs | ✘ | ✔ | ✔ | ✔ | |
| Plateau | | ✘ | ✘ | ✔ | ✔ | A Wallaroo developed service for storing inference logs at high speed. This is not a long term service; organizations are encouraged to store logs in long term solutions if required. |
| | Model Insights | ✘ | ✘ | ✔ | ✔ | |
| Python API | | | | | | |
| | Model Conversion | ✘ | ✔ | ✔ | ✔ | Converts models into a native runtime for use with the Wallaroo inference engine. |
For instructions on installing Wallaroo on a system with fewer than 16 cores, see the Install Wallaroo with Minimum Services guide.
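The table above can be read as a lookup from total core count to the services that fit. The following is a minimal sketch of that lookup; the function name services_for_cores is hypothetical and the thresholds are taken only from the table, not from any Wallaroo tooling.

```shell
# Hypothetical helper: print the services the table marks as available
# for a given total core count. Thresholds mirror the table only.
services_for_cores() {
  local cores="$1"
  local services="Inference"
  if [ "$cores" -ge 8 ]; then
    services="$services, Dashboard, Single Lab, Prometheus, Dashboard Graphs, Model Conversion"
  fi
  if [ "$cores" -ge 16 ]; then
    services="$services, Multiple Labs, Alerting, Model Validation, Plateau, Model Insights"
  fi
  printf '%s\n' "$services"
}
```

For example, services_for_cores 8 lists the services that can stay enabled on an 8 core cluster; anything not listed would need to be disabled in the local values file.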
Enterprise Network Requirements
The following are the minimum network requirements for running Wallaroo:
- For Wallaroo Enterprise users: 200 IP addresses must be allocated per cloud environment.
- For Wallaroo Community users: 98 IP addresses must be allocated per cloud environment.
- DNS services integration is required for Wallaroo Enterprise edition. See the DNS Integration Guide for instructions on configuring Wallaroo Enterprise with your DNS services.
DNS services integration is required to provide access to the various supporting services that are part of the Wallaroo instance. These include:
- Simplified user authentication and management.
- Centralized services for accessing the Wallaroo Dashboard, Wallaroo SDK and Authentication.
- Collaboration features allowing teams to work together.
- Managed security, auditing and traceability.
Environment Software Requirements
The following software or runtimes are required for Wallaroo version 2023.2. Most are automatically available through the supported cloud providers.
| Software or Runtime | Description | Minimum Supported Version | Preferred Version(s) |
|---|---|---|---|
| Kubernetes | Cluster deployment management | 1.23 | 1.23 and above |
| containerd | Container Management | 1.7.0 | 1.7.0 |
| kubectl | Kubernetes administrative console application | 1.26 | 1.26 |
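Comparing an installed version against the minimums in the table can be scripted. The sketch below is one possible approach, assuming GNU sort with the -V (version sort) option is available; the function names version_ge and check_min_version are hypothetical.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical wrapper: report whether a component meets a minimum version.
check_min_version() {
  local name="$1" found="$2" minimum="$3"
  if version_ge "$found" "$minimum"; then
    echo "$name $found: OK (minimum $minimum)"
  else
    echo "$name $found: below minimum $minimum"
    return 1
  fi
}
```

For example, check_min_version kubectl 1.26 1.26 succeeds, while check_min_version Kubernetes 1.22 1.23 reports a failure. How the installed version is obtained depends on the environment and is left out of this sketch.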
Node Selectors
Wallaroo runs its services on different nodes, which can be assigned to separate node pools to keep their resources isolated from other nodes. The following node selectors can be configured:
- ML Engine node selector
- ML Engine Load Balance node selector
- Database Node Selector
- Grafana node selector
- Prometheus node selector
- Each Lab * Node Selector
Kubernetes Installation Instructions
This sample Helm installation procedure has the following steps:
Install Kubernetes
This example requires a cloud Kubernetes installation.
Set up the Kubernetes cloud cluster as defined in the Wallaroo Enterprise Environment Setup Guides.
Install Helm
Follow the instructions from the Installing Helm guide for your environment.
Install Krew
The following instructions were taken from the Install Krew guide.
To install the kubectl plugin krew:
- Verify that git is installed on the local system.
- Run the following to install krew:
(
set -x; cd "$(mktemp -d)" &&
OS="$(uname | tr '[:upper:]' '[:lower:]')" &&
ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')" &&
KREW="krew-${OS}_${ARCH}" &&
curl -fsSLO "https://github.com/kubernetes-sigs/krew/releases/latest/download/${KREW}.tar.gz" &&
tar zxvf "${KREW}.tar.gz" &&
./"${KREW}" install krew
)
- Once complete, add the following to the .bashrc file:
export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"
Install the preflight and support-bundle Krew tools via the following commands:
kubectl krew install preflight
kubectl krew install support-bundle
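Before moving on, it can help to confirm that both plugins are present. The following sketch reads the output of kubectl krew list and reports any missing plugin names; the function name missing_krew_plugins is hypothetical, and the sketch assumes the plugin name is the first column of each output line.

```shell
# Reads `kubectl krew list` output on stdin and prints any of the named
# plugins that are absent. Only the first column of each line is inspected.
missing_krew_plugins() {
  local installed wanted missing=""
  installed="$(awk '{print $1}')"
  for wanted in "$@"; do
    if ! printf '%s\n' "$installed" | grep -qxF -- "$wanted"; then
      missing="$missing $wanted"
    fi
  done
  # Trim the leading space; print nothing when all plugins are present.
  [ -z "$missing" ] || printf '%s\n' "${missing# }"
}
```

A typical invocation would be kubectl krew list | missing_krew_plugins preflight support-bundle, which prints nothing when both tools are installed.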
Install Wallaroo via Helm
Wallaroo Provided Data
Members of the Wallaroo support staff will provide the following information:
- Wallaroo Container Registration Login: Commands to log in to the Wallaroo container registry.
- Preflight and Support Bundle configuration files: The files preflight.yaml and support-bundle.yaml are used in the commands below to complete the preflight process and generate the support bundle package as needed for troubleshooting.
- Preflight verification command: The commands to verify that the Kubernetes environment meets the requirements for the Wallaroo install.
- Install Wallaroo Command: Instructions on installations into the Kubernetes environment using Helm through the Wallaroo container registry.
The following steps are used with these commands and configuration files to install Wallaroo Enterprise via Helm.
Registration Login
The first step in installing Wallaroo via Helm is to connect to the Kubernetes environment that will host the Wallaroo Enterprise instance and log in to the Wallaroo container registry with the command provided by the Wallaroo support staff. The command takes the following format, replacing $YOURUSERNAME and $YOURPASSWORD with the username and password provided.
helm registry login registry.replicated.com --username $YOURUSERNAME --password $YOURPASSWORD
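Passing the password as a command-line argument can leave it in shell history. As an alternative sketch, helm registry login also accepts the password on standard input through its --password-stdin flag. The function name wallaroo_registry_login and the HELM_BIN override are hypothetical; HELM_BIN exists only so the function can be exercised without a real Helm binary.

```shell
# Sketch: log in without placing the password on the command line.
# HELM_BIN is a hypothetical override that defaults to `helm`.
wallaroo_registry_login() {
  local username="$1" password="$2"
  printf '%s' "$password" |
    "${HELM_BIN:-helm}" registry login registry.replicated.com \
      --username "$username" --password-stdin
}
```

The function would be called as wallaroo_registry_login "$YOURUSERNAME" "$YOURPASSWORD", with the credentials supplied by the Wallaroo support staff.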
Preflight Verification
IMPORTANT NOTE
The preflight test is not programmatically enforced during installation via Helm and should be performed manually before installation. If the Kubernetes environment does not meet the requirements, the Wallaroo installation may fail or perform erratically. Verify that all preflight tests run successfully before proceeding to install Wallaroo.
Preflight verification is performed with the following command, using the preflight.yaml configuration file provided by the Wallaroo support representative as listed above.
kubectl preflight --interactive=false preflight.yaml
If successful, the tests will show PASS for each preflight requirement as in the following example:
name: cluster-resources status: running completed: 0 total: 2
name: cluster-resources status: completed completed: 1 total: 2
name: cluster-info status: running completed: 1 total: 2
name: cluster-info status: completed completed: 2 total: 2
--- PASS Required Kubernetes Version
--- Your cluster meets the recommended and required versions of Kubernetes.
--- PASS Container Runtime
--- Containerd container runtime was found.
--- PASS Check Kubernetes environment.
--- KURL is a supported distribution
--- PASS Cluster Resources
--- Cluster resources are satisfactory
--- PASS Every node in the cluster must have at least 12Gi of memory
--- All nodes have at least 12 GB of memory capacity
--- PASS Every node in the cluster must have at least 8 cpus allocatable.
--- All nodes have at least 8 CPU capacity
--- PASS wallaroo
PASS
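Because the preflight test is not enforced by Helm, a script can gate the install on its output. The sketch below reads output shaped like the example above and succeeds only when the final line is PASS. The function name preflight_passed is hypothetical, and the "--- FAIL" marker it looks for is an assumption that mirrors the "--- PASS" lines shown here.

```shell
# Reads preflight output on stdin. Succeeds only when the final non-empty
# line is the overall "PASS" and no check line starts with "--- FAIL"
# (that marker is an assumption, mirroring "--- PASS").
preflight_passed() {
  awk '
    /^[ \t]*--- FAIL/ { failed = 1 }
    NF { last = $0 }
    END {
      gsub(/^[ \t]+|[ \t]+$/, "", last)
      exit (failed || last != "PASS") ? 1 : 0
    }
  '
}
```

It could be chained as kubectl preflight --interactive=false preflight.yaml | preflight_passed && echo "ready to install".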
1.1 - Wallaroo Helm Standard Cloud Install Procedures
The following instructions detail how to install Wallaroo Enterprise via Helm for Kubernetes cloud environments such as Microsoft Azure, Amazon Web Services, and Google Cloud Platform.
IMPORTANT NOTE
These instructions are for Wallaroo Enterprise only.
Install Wallaroo
With the preflight checks and prerequisites met, Wallaroo can be installed via Helm through the following process:
- Create the namespace. By default, the namespace wallaroo is used:
kubectl create namespace wallaroo
- Set the new namespace as the current namespace:
kubectl config set-context --current --namespace wallaroo
- Set the TLS certificate secret in the Kubernetes environment:
  - Create the certificate and private key. It is recommended to name them after the domain name of your Wallaroo instance, for example: wallaroo.example.com. For production environments, organizations are advised to use certificates from their certificate authority. Note that the Wallaroo SDK will not connect from an external connection without valid certificates. For more information on using DNS settings and certificates, see the Wallaroo DNS Integration Guide.
  - Create the Kubernetes secret from the certificate and key created in the previous step, replacing $TLSCONFIG with the name of the Kubernetes secret and each $TLSSECRETS with the certificate file and private key file respectively. Note the secret name for use in the step Configure local values file.
kubectl create secret tls $TLSCONFIG --cert=$TLSSECRETS --key=$TLSSECRETS
For example, if $TLSCONFIG is my-tls-secrets with certificate example.com.crt and key example.com.key, then the command would be:
kubectl create secret tls my-tls-secrets --cert=example.com.crt --key=example.com.key
- Configure local values file: The default Helm install of Wallaroo contains various default settings. The local values file overrides these values based on the organization's needs. The following represents the minimum mandatory values for a Wallaroo installation using certificates and the default LoadBalancer for a cloud Kubernetes cluster. The configuration details below are saved as local-values.yaml for these examples.
For information on taints and tolerations settings, see the Taints and Tolerations Guide.
Note the following required settings:
  - domainPrefix and domainSuffix: Used to set the DNS settings for the Wallaroo instance. For more information, see the Wallaroo DNS Integration Guide.
  - deploymentStage and custTlsSecretName: These are set for use with the Kubernetes secret created in the previous step. External connections through the Wallaroo SDK require valid certificates.
  - generate_secrets: Secrets for administrative and other users can be generated by the Helm install process, or set manually. This setting scrambles the passwords during installation.
  - apilb: Sets the apilb service options including the following:
    - serviceType: LoadBalancer: Uses the default LoadBalancer setting for the Kubernetes cloud service the Wallaroo instance is installed into. Replace with the specific service connection settings as required.
    - external_inference_endpoints_enabled: true: This setting is required for performing external SDK inferences to a Wallaroo instance. For more information, see the Wallaroo Model Endpoints Guide.
domainPrefix: doc-test
domainSuffix: example.com # i.e., if the main URL is https://ds.big.corp, then https://ds.keycloak.big.corp, etc.
# Provide a TLS secret for DNS domain.
# Required for connecting the external SDK to a Wallaroo instance
deploymentStage: cust # Must be provided if `custTlsSecretName` is added.
custTlsSecretName: my-tls-secrets
generate_secrets: true
apilb:
  # Generic load balancer for the cluster. Replace with the specific load balancer for the cloud service.
  serviceType: LoadBalancer
  # Required to perform remote inferences either through the SDK or the API
  external_inference_endpoints_enabled: true
dashboard:
  # Sets the Wallaroo instance name
  clientName: "YOUR COMPANY NAME HERE"
  # Adds authentication to the Wallaroo Dashboard
  auth:
    enabled: true
- The resources used by the Wallaroo services can be modified. For full details, see the Wallaroo Helm Reference Guides. The following example shows limiting the apilb service:
domainPrefix: doc-test
domainSuffix: example.com # i.e., if the main URL is https://ds.big.corp, then https://ds.keycloak.big.corp, etc.
# Provide a TLS secret for DNS domain.
# Required for connecting the external SDK to a Wallaroo instance
deploymentStage: cust # Must be provided if `custTlsSecretName` is added.
custTlsSecretName: my-tls-secrets
generate_secrets: true
apilb:
  # Generic load balancer for the cluster. Replace with the specific load balancer for the cloud service.
  serviceType: LoadBalancer
  # Required to perform remote inferences either through the SDK or the API
  external_inference_endpoints_enabled: true
  resources:
    limits:
      cpu: 0.5
    requests:
      cpu: 0.1
dashboard:
  # Sets the Wallaroo instance name
  clientName: "YOUR COMPANY NAME HERE"
  # Adds authentication to the Wallaroo Dashboard
  auth:
    enabled: true
- Install Wallaroo: The Wallaroo support representative will provide the installation command for the Helm install that will use the Wallaroo container registry. This assumes that the preflight checks were successful. This command uses the following format:
helm install $RELEASE $REGISTRYURL --version $VERSION --values $LOCALVALUES
Where:
  - $RELEASE: The name of the Helm release. By default, wallaroo.
  - $REGISTRYURL: The URL for the Wallaroo container registry service.
  - $VERSION: The version of Wallaroo to install. For this example, 2022.4.0-main-2297.
  - $LOCALVALUES: The .yaml file containing the local values overrides. For this example, local-values.yaml.
For example, for the release wallaroo the command would be:
helm install wallaroo oci://registry.replicated.com/wallaroo/EE/wallaroo --version 2022.4.0-main-2297 --values local-values.yaml
- Verify the Installation: Once the installation is complete, verify the installation with the helm test $RELEASE command. With the settings above, this would be:
helm test wallaroo
A successful installation will resemble the following:
NAME: wallaroo
LAST DEPLOYED: Wed Dec 21 09:15:23 2022
NAMESPACE: wallaroo
STATUS: deployed
REVISION: 1
TEST SUITE: wallaroo-fluent-bit-test-connection
Last Started: Wed Dec 21 11:58:34 2022
Last Completed: Wed Dec 21 11:58:37 2022
Phase: Succeeded
TEST SUITE: wallaroo-test-connections-hook
Last Started: Wed Dec 21 11:58:37 2022
Last Completed: Wed Dec 21 11:58:41 2022
Phase: Succeeded
TEST SUITE: wallaroo-test-objects-hook
Last Started: Wed Dec 21 11:58:41 2022
Last Completed: Wed Dec 21 11:58:53 2022
Phase: Succeeded
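The test output above can also be checked by a script rather than by eye. The following sketch reads helm test output in the format shown and succeeds only when every test suite reports Phase: Succeeded; the function name helm_tests_succeeded is hypothetical.

```shell
# Reads `helm test` output on stdin and succeeds only when at least one
# test suite is reported and every "Phase:" line reads "Succeeded".
helm_tests_succeeded() {
  awk '
    /^[ \t]*TEST SUITE:/ { suites++ }
    /^[ \t]*Phase:/ { phases++; if ($2 == "Succeeded") ok++ }
    END { exit (suites > 0 && phases == suites && ok == phases) ? 0 : 1 }
  '
}
```

One way to use it is helm test wallaroo | helm_tests_succeeded || echo "one or more Helm tests did not succeed".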
At this point, the installation is complete and the Wallaroo instance can be accessed through the fully qualified domain names set in the installation process above. Verify that the DNS settings are accurate before attempting to connect to the Wallaroo instance. For more information, see the Wallaroo DNS Integration Guide.
To add the initial users if they were not set up through Helm values, see the Wallaroo Enterprise User Management guide.
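The install steps in this section can be gathered into one function, as in the sketch below. It uses only the commands and example values shown in this guide (namespace wallaroo, secret my-tls-secrets, local-values.yaml) and assumes the registry login and preflight checks were already completed. The function name install_wallaroo and the RUN prefix are hypothetical; setting RUN=echo prints the commands instead of executing them.

```shell
# Sketch of the install steps as one function. Values mirror this
# guide's examples. RUN is a hypothetical prefix: set RUN=echo to print
# the commands instead of executing them.
install_wallaroo() {
  local release="${1:-wallaroo}" namespace="${2:-wallaroo}"
  local chart="oci://registry.replicated.com/wallaroo/EE/wallaroo"
  local version="2022.4.0-main-2297"

  ${RUN:-} kubectl create namespace "$namespace" &&
  ${RUN:-} kubectl config set-context --current --namespace "$namespace" &&
  ${RUN:-} kubectl create secret tls my-tls-secrets \
    --cert=example.com.crt --key=example.com.key &&
  ${RUN:-} helm install "$release" "$chart" \
    --version "$version" --values local-values.yaml &&
  ${RUN:-} helm test "$release"
}
```

Chaining with && stops at the first failing step, so a failed secret creation, for instance, prevents the Helm install from running. Running RUN=echo install_wallaroo first is a cheap way to review the commands before executing them.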
Troubleshoot Wallaroo
If issues are detected in the Wallaroo instance, generate a support bundle using the support-bundle.yaml file provided by the Wallaroo support representative.
This collects log files, configuration files, and other details into a .tar.gz file in the directory the command is run from, named in the format support-bundle-YYYY-MM-DDTHH-MM-SS.tar.gz. Submit this file to the Wallaroo support team for review.
The support bundle is generated through the following command:
kubectl support-bundle support-bundle.yaml --interactive=false
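When several bundles accumulate in one directory, the timestamp in the file name identifies the newest one. The sketch below picks it out, relying on the fact that names in the support-bundle-YYYY-MM-DDTHH-MM-SS.tar.gz format sort alphabetically in chronological order; the function name latest_support_bundle is hypothetical.

```shell
# Prints the most recent support bundle in a directory (default: current).
# Shell globs expand in sorted order, so the last match is the newest.
latest_support_bundle() {
  local dir="${1:-.}" file latest=""
  for file in "$dir"/support-bundle-*.tar.gz; do
    [ -e "$file" ] && latest="$file"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

The function returns a non-zero status when no bundle is found, which makes it usable in a conditional before submitting the file.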
Uninstall
To uninstall Wallaroo via Helm, run helm uninstall $RELEASE, replacing $RELEASE with the name of the release used to install Wallaroo. By default, this is wallaroo:
helm uninstall wallaroo
It is also recommended to remove the wallaroo namespace after the helm uninstall is complete.
IMPORTANT NOTE
Do not remove the wallaroo namespace until after the helm uninstall is complete. Removing the namespace first can leave resources hanging and can cause issues when trying to reinstall Wallaroo via Helm.
kubectl delete namespace wallaroo
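The ordering requirement in the note above can be enforced in a script so that the namespace is deleted only after helm uninstall succeeds. This is a minimal sketch; the function name uninstall_wallaroo and the RUN prefix are hypothetical, and setting RUN=echo prints the commands instead of executing them.

```shell
# Sketch: delete the namespace only after `helm uninstall` succeeds.
# RUN is a hypothetical prefix; set RUN=echo to print the commands.
uninstall_wallaroo() {
  local release="${1:-wallaroo}" namespace="${2:-wallaroo}"
  if ${RUN:-} helm uninstall "$release"; then
    ${RUN:-} kubectl delete namespace "$namespace"
  else
    echo "helm uninstall failed; leaving namespace $namespace in place" >&2
    return 1
  fi
}
```

If the uninstall fails, the namespace is left untouched so the release can be inspected and the uninstall retried.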
2 - Wallaroo Helm Reference Guides
The following guides include reference details related to installing Wallaroo via Helm.
2.1 - Wallaroo Helm Reference Table
Wallaroo
A Helm chart for the control plane for Wallaroo
Configuration
The following table lists the configurable parameters of the Wallaroo chart and their default values.
| Parameter | Description | Default |
|---|---|---|
| kubernetes_distribution | One of: aks, eks, gke, or kurl. May be safe to leave defaulted. | "" |
| imageRegistry | imageRegistry where images are pulled from | "ghcr.io/wallaroolabs" |
| imageTag | imageTag that images default to - can be overridden for each component | "main" |
| replImagePrefix | imageRegistry where images are pulled from, as overridden by Kots | "ghcr.io/wallaroolabs" |
| assays.enabled | Controls the display of Assay data in the Dashboard | true |
| custTlsSecretName | Name of existing Kubernetes TLS type secret | "" |
| deploymentStage | Deployment stage, must be set to “cust” when deployed | "dev" |
| custTlsCert | Customer provided certificate chain when deploymentStage is “cust”. | "" |
| custTlsKey | Customer provided private key when deploymentStage is “cust”. | "" |
| nodeSelector | Global node selector | {} |
| tolerations | Global tolerations | [{"key": "wallaroo", "operator": "Exists", "effect": "NoSchedule"}] |
| domainPrefix | DNS prefix of Wallaroo endpoints, can be empty for none | "xxx" |
| domainSuffix | DNS suffix of Wallaroo endpoints, MUST be provided | "yyy" |
| externalIpOverride | Used in cases where we can’t accurately determine our external, inbound IP address. Normally “”. | "" |
| imagePullPolicy | Global policy saying when K8s pulls images: Always, Never, or IfNotPresent. | "Always" |
| wallarooSecretName | Secret name for pulling Wallaroo images | "regcred" |
| apilb.nodeSelector | standard node selector for API-LB | {} |
| apilb.annotations | Annotations for api-lb service | {} |
| apilb.serviceType | Service type of api-lb service | "ClusterIP" |
| apilb.external_inference_endpoints_enabled | Enable external URL inference endpoints: pipeline inference endpoints that are accessible outside of the Wallaroo cluster. | true |
| jupyter.enabled | If true, a JupyterHub was deployed which components can point to. | false |
| keycloak.user | administrative username | "admin" |
| keycloak.password | default admin password: overridden if generate_secrets is true | "admin" |
| keycloak.provider.clientId | upstream client id | "" |
| keycloak.provider.clientSecret | upstream client secret | "" |
| keycloak.provider.name | human name for provider | "" |
| keycloak.provider.id | Type of provider, one of: “github”, “google”, or “OIDC” | "" |
| keycloak.provider.authorizationUrl | URL to contact the upstream client for auth requests | null |
| keycloak.provider.clientAuthMethod | client auth method - Must be client_secret_post for OIDC provider type, leave blank otherwise. | null |
| keycloak.provider.displayName | human name for provider, displayed to end user in login dialogs | null |
| keycloak.provider.tokenUrl | Used only for OIDC, see token endpoint under Azure endpoints. | null |
| dbcleaner.schedule | when the cleaner runs, default is every eight hours | "* */8 * * *" |
| dbcleaner.maxAgeDays | delete older than this many days | "30" |
| plateau.enabled | Enable Plateau deployment | true |
| plateau.diskSize | Disk space to allocate. Smaller than 100Gi is not recommended. | "100Gi" |
| telemetry.enabled | Used only for our CE product. Leave disabled for EE/Helm installs. | false |
| dashboard.enabled | Enable dashboard service | true |
| dashboard.clientName | Customer display name which appears at the top of the dashboard window. | "Fitzroy Macropods, LLC" |
| minio.imagePullSecrets | Must override for helm + private registry; eg -name: "some-secret" | [] |
| minio.image.repository | Must override for helm + private registry | "quay.io/minio/minio" |
| minio.mcImage.repository | Must override for helm + private registry | "quay.io/minio/mc" |
| minio.persistence.size | Minio model storage disk size. Smaller than 10Gi is not recommended. | "10Gi" |
| fluent-bit.imagePullSecrets | Must override for helm + private registry; eg -name: "some-secret" | [] |
| fluent-bit.image.repository | Must override for helm + private registry | "cr.fluentbit.io/fluent/fluent-bit" |
| helmTests.enabled | When enabled, create “helm test” resources. | true |
| helmTests.nodeSelector | When helm test is run, this selector places the test pods. | {} |
| pythonAPIServer.enabled | This service is used for model conversion. | false |
| explainabilityServer.enabled | Enable the model explainability service | false |
Documentation generated by Frigate.
2.2 - Wallaroo Helm Reference Details
post_delete_hook
This hook runs when you do helm uninstall unless:
- you give --no-hooks to helm
- you set the enable flag to False at INSTALL time.
imageRegistry
Registry and Tag portion of Wallaroo images. Third party images are not included. Tag is
computed at runtime and overridden. In online Helm installs, these should not be touched; in
airgap Helm installs imageRegistry must be overridden to a local registry.
generate_secrets
If true, generate random secrets for several services at install time.
If false, use the generic defaults listed here, which can also be overridden by caller.
assays
This is a (currently) Dashboard-specific feature flag to control the display of Assays.
custTlsSecretName
To provide TLS certificates, (1) set deploymentStage to “cust”, then (2) provide EITHER the
name of an existing Kubernetes TLS secret in custTlsSecretName OR provide base64 encoded secrets
in custTlsCert and custTlsKey.
domainPrefix
DNS specification for our named external service endpoints.
To form URLs, we concatenate the optional domainPrefix, the service name in question, and then
the domainSuffix. Their values are based on license, type, and customer config inputs. They
MUST be overridden per install via helm values, or by Replicated.
Community – prefix/suffix in license
| domainPrefix | domainSuffix | dashboard_fqdn | thing_fqdn (thing = jup, kc, etc) |
|---|---|---|---|
| "" | wallaroo.community | (never) | (never) |
| cust123 | wallaroo.community | cust123.wallaroo. | cust123.thing.wallaroo.community |
Enterprise et al – prefix/suffix from config
| domainPrefix | domainSuffix | dashboard_fqdn | thing_fqdn (thing = jup, kc, etc) |
|---|---|---|---|
| "" | wl.bigco | wl.bigco | thing.wl.bigco |
| cust123 | wl.bigco | cust123.wl.bigco | cust123.thing.wl.bigco |
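The concatenation rule described above (optional prefix, then service name, then suffix) can be expressed as a small function. This sketch reproduces the Enterprise rows of the table; the function name wallaroo_fqdn is hypothetical and is not part of the chart.

```shell
# Builds a hostname from the optional prefix, an optional service name,
# and the suffix, skipping empty parts. Mirrors the Enterprise table.
wallaroo_fqdn() {
  local prefix="$1" service="$2" suffix="$3" part fqdn=""
  for part in "$prefix" "$service" "$suffix"; do
    [ -n "$part" ] && fqdn="${fqdn:+$fqdn.}$part"
  done
  printf '%s\n' "$fqdn"
}
```

Passing an empty service name yields the dashboard_fqdn column, so wallaroo_fqdn cust123 "" wl.bigco gives cust123.wl.bigco, while wallaroo_fqdn cust123 thing wl.bigco gives cust123.thing.wl.bigco.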
wallarooSecretName
In online Helm installs, an image pull secret is created and this is its name. The secret allows
the Kubernetes node to pull images from proxy.replicated.com. In airgap Helm installs, a local
Secret of type docker-registry must be created and this value set to its name.
privateModelRegistry
If the customer has specified a private model container registry, the enable flag will reflect
this and the secret will be populated. registry, username, and password are mandatory. email
is optional. registry is of the form “hostname:port”.
apilb
Main ingress LB for Wallaroo services.
The Kubernetes Ingress object is not used; instead we deploy a single Envoy load balancer with a
single IP in all cases, which serves: TLS termination, authentication (JWT) checking, and both
host based and path based application routing. Customers should be aware of two values in particular.
apilb.serviceType defaults to ClusterIP. If apilb.serviceType is set to LoadBalancer, cloud
services will allocate a hosted LB service, in which case the apilb.annotations should be
provided, in order to pass configuration such as “internal” or “external” to the cloud service.
Example:
apilb:
  serviceType: LoadBalancer
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
keycloak
Wallaroo can connect to a variety of identity providers, broker OpenID Connect authentication
requests, and then limit access to endpoints. This section configures a https://www.keycloak.org
installation. If a provider is specified here, Keycloak will configure itself to use that on
install. If no providers are specified here, the administrator must log in to the Keycloak
service as the administrative user and either add users by hand or create an auth provider. In
general, a client must be created upstream and a URL, client ID, and secret (token) for that
client are entered here.
dbcleaner
Manages retention for the fluentbit table. This contains log message outputs from orchestration tasks.
plateau
Plateau is a low-profile fixed-footprint log processor / event store for fast storage of
inference results. The amount of disk space provisioned is adjustable. Smaller than “100Gi” is
not recommended for performance reasons.
pythonAPIServer
Model conversion is an optional service that allows converting non-onnx models (keras, sklearn,
and xgboost) to onnx and adding them to your pipeline, without extensive manual conversion or
processing steps. This allows more rapid iteration over models or experiments.
wsProxy
This controls the wsProxy, and should only be enabled if nats (ArbEx) is also enabled.
wsProxy is required for the Dashboard to subscribe to events and show notifications.
orchestration
Pipeline orchestration is a general task execution service that allows users to upload arbitrary
code and have it executed on their behalf by the system. nats and arbex must be enabled.
models
The model server supports model autoconversion and requires nats and arbitrary execution to be
enabled.