Wallaroo Enterprise Comprehensive Install Guide: Single Node Linux

How to set up Wallaroo Enterprise on Single Node Linux

Single Node Linux

Organizations can run Wallaroo within a single node Linux environment that meet the prerequisites.

The following guide is based on installing Wallaroo Enterprise into virtual machines based on Ubuntu 22.04.

For other environments and configurations, consult your Wallaroo support representative.

  • Prerequisites

Before starting the bare Linux installation, the following conditions must be met:

  • Have a Wallaroo Enterprise license file. For more information, you can request a demonstration.

  • A Linux bare-metal system or virtual machine with at least 32 cores and 64 GB RAM with Ubuntu 20.04 installed.

  • 650 GB allocated for the root partition, plus 50 GB allocated per node and another 50 GB for the JupyterHub service. Enterprise users who deploy additional pipelines will require an additional 50 GB of storage per lab node deployed.

  • Ensure memory swapping is disabled by removing it from /etc/fstab if needed.

  • DNS services for integrating your Wallaroo Enterprise instance. See the DNS Integration Guide for the instructions on configuring Wallaroo Enterprise with your DNS services.

  • IMPORTANT NOTE

    • Wallaroo requires out-bound network connections to download the required container images and other tasks. For situations that require limiting out-bound access, refer to the air-gap installation instructions or contact your Wallaroo support representative. Also note that if Wallaroo is being installed into a cloud environment such as Google Cloud Platform, Microsoft Azure, Amazon Web Services, etc, then additional considerations such as networking, DNS, certificates, and other considerations must be accounted for. For IP address restricted environments, see the Air Gap Installation Guide.
    • The steps below are based on minimum requirements for install Wallaroo in a single node environment.
    • For situations that require limiting external IP access or other questions, refer to your Wallaroo support representative.
  • Template Single Node Scripts

The following template scripts are provided as examples on how to create single node virtual machines that meet the requirements listed above in AWS, GCP, and Microsoft Azure environments.

Download template script here: aws-single-node-vm.bash

# Variables

# The name of the virtual machine
NAME=$USER-demo-vm                     # eg bob-demo-vm

# The image used : ubuntu/images/2023.4.1/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230208
IMAGE_ID=ami-0557a15b87f6559cf

# Instance type meeting the Wallaroo requirements.
INSTANCE_TYPE=c6i.8xlarge # c6a.8xlarge is also acceptable

# key name - generate keys using Amazon EC2 Key Pairs
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html
# Wallaroo people: https://us-east-1.console.aws.amazon.com/ec2/home?region=us-east-1#KeyPairs:v=3 - 
MYKEY=DocNode


# We will whitelist the our source IP for maximum security -- just use 0.0.0.0/0 if you don't care.
MY_IP=$(curl -s https://checkip.amazonaws.com)/32

# Create security group in the Default VPC
aws ec2 create-security-group --group-name $NAME --description "$USER demo" --no-cli-pager

# Open port 22 and 443
aws ec2 authorize-security-group-ingress --group-name $NAME --protocol tcp --port 22 --cidr $MY_IP --no-cli-pager
aws ec2 authorize-security-group-ingress --group-name $NAME --protocol tcp --port 443 --cidr $MY_IP --no-cli-pager

# increase Boot device size to 650 GB
# Change the location from `/tmp/device.json` as required.
# cat <<EOF > /tmp/device.json 
# [{
#   "DeviceName": "/dev/sda1",
#   "Ebs": { 
#     "VolumeSize": 650,
#     "VolumeType": "gp2"
#   }
# }]
# EOF

# Launch instance with a 650 GB Boot device.
aws ec2 run-instances --image-id $IMAGE_ID --count 1 --instance-type $INSTANCE_TYPE \
    --no-cli-pager \
    --key-name $MYKEY \
    --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":650,"VolumeType":"gp2"}}]'  \
    --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=$NAME}]" \
    --security-groups $NAME

# Sample output:
# {
#     "Instances": [
#         {
#             ...
#             "InstanceId": "i-0123456789abcdef",     # Keep this instance-id for later
#             ...
#         }
#     ]
# }

#INSTANCEID=YOURINSTANCE
      
# After several minutes, a public IP will be known. This command will retrieve it.
# aws ec2 describe-instances  --output text --instance-id $INSTANCEID \
#    --query 'Reservations[*].Instances[*].{ip:PublicIpAddress}'

# Sample Output
# 12.23.34.56

# KEYFILE=KEYFILELOCATION       #usually ~/.ssh/key.pem - verify this is the same as the key above.
# SSH to the VM - replace $INSTANCEIP
#ssh -i $KEYFILE ubuntu@$INSTANCEIP

# Stop the VM - replace the $INSTANCEID
#aws ec2 stop-instances --instance-id $INSTANCEID

# Restart the VM
#aws ec2 start-instances --instance-id $INSTANCEID

# Clean up - destroy VM
#aws ec2 terminate-instances --instance-id $INSTANCEID
  • Azure VM Template Script

  • Dependencies

Download template script here: azure-single-node-vm.bash

#!/bin/bash

# Variables list.  Update as per your organization's settings
NAME=$USER-demo-vm                          # eg bob-demo-vm
RESOURCEGROUP=YOURRESOURCEGROUP
LOCATION=eastus
IMAGE=Canonical:0001-com-ubuntu-server-jammy:22_04-lts:22.04.202301140

# Pick a location
az account list-locations  -o table |egrep 'US|----|Name'

# Create resource group
az group create -l $LOCATION --name $USER-demo-$(date +%y%m%d)

# Create VM. This will create ~/.ssh/id_rsa and id_rsa.pub - store these for later use.
az vm create --resource-group $RESOURCEGROUP --name $NAME --image $IMAGE  --generate-ssh-keys \
   --size Standard_D32s_v4 --os-disk-size-gb 500 --public-ip-sku Standard

# Sample output
# {
#   "location": "eastus",
#   "privateIpAddress": "10.0.0.4",
#   "publicIpAddress": "20.127.249.196",    <-- Write this down as MYPUBIP
#   "resourceGroup": "mnp-demo-230213",
#   ...
# }

# SSH port is open by default. This adds an application port.
az vm open-port --resource-group $RESOURCEGROUP --name $NAME --port 443

# SSH to the VM - assumes that ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub from above are availble.
# ssh $MYPUBIP

# Use this Stop the VM ("deallocate" frees resources and billing; "stop" does not)
# az vm deallocate --resource-group $RESOURCEGROUP --name $NAME

# Restart the VM
# az vm start --resource-group $RESOURCEGROUP --name $NAME
  • GCP VM Template Script

Dependencies:

Download template script here: gcp-single-node-vm.bash

# Settings

NAME=$USER-demo-$(date +%y%m%d)      # eg bob-demo-230210
ZONE=us-west1-a                      # For a complete list, use `gcloud compute zones list | egrep ^us-`
PROJECT=wallaroo-dev-253816          # Insert the GCP Project ID here.  This is the one for Wallaroo.

# Create VM

IMAGE=projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20231030

# Port 22 and 443 open by default
gcloud compute instances create $NAME \
    --project=$PROJECT \
    --zone=$ZONE \
    --machine-type=e2-standard-32 \
    --network-interface=network-tier=STANDARD,subnet=default \
    --maintenance-policy=MIGRATE \
    --provisioning-model=STANDARD \
    --no-service-account \
    --no-scopes \
    --tags=https-server \
    --create-disk=boot=yes,image=${IMAGE},size=500,type=pd-standard \
    --no-shielded-secure-boot \
    --no-shielded-vtpm \
    --no-shielded-integrity-monitoring \
    --reservation-affinity=any


# Get the external IP address
gcloud compute instances describe $NAME --zone $ZONE --format='get(networkInterfaces[0].accessConfigs[0].natIP)'

# SSH to the VM
#gcloud compute ssh $NAME --zone $ZONE

# SCP file to the instance - replace $FILE with the file path.  Useful for copying up the license file up to the instance.

#gcloud compute scp --zone $ZONE $FILE $NAME:~/

# SSH port forward to the VM
#gcloud compute ssh $NAME --zone $ZONE -- -NL 8800:localhost:8800

# Suspend the VM
#gcloud compute instances stop $NAME --zone $ZONE

# Restart the VM
#gcloud compute instances start $NAME --zone $ZONE
  • Kubernetes Installation Steps

The following script and steps will install the Kubernetes version and requirements into the Linux node that supports a Wallaroo single node installation.

The process includes these major steps:

  • Install Kubernetes

  • Install Kots Version

  • Install Kubernetes

curl is installed in the default scripts provided above. Verify that it is installed if using some other platform.

  1. Verify that the Ubuntu distribution is up to date, and reboot if necessary after updating.

    sudo apt update
    sudo apt upgrade
    
  2. Start the Kubernetes installation with the following script, substituting the URL path as appropriate for your license.

    For Wallaroo versions 2022.4 and below:

    curl https://kurl.sh/9398a3a | sudo bash
    

    For Wallaroo versions 2023.1 and later, the install is based on the license channel. For example, if your license uses the EE channel, then the path is /wallaroo-ee; that is, /wallaroo- plus the lower-case channel name. Note that the Kubernetes install channel must match the License version. Check with your Wallaroo support representative with any questions about your version.

    curl https://kurl.sh/wallaroo-ee | sudo bash
    
    1. If prompted with This application is incompatible with memory swapping enabled. Disable swap to continue? (Y/n), reply Y.
  3. Set up the Kubernetes configuration with the following commands:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    chmod u+w $HOME/.kube/config
    echo 'export KUBECONFIG=$HOME/.kube/config' >> ~/.bashrc
    
  4. Log out, and log back in as the same user. Verify the installation was successful with the following:

    kubectl get nodes
    

    It should return results similar to the following:

    NAME     STATUS   ROLES                  AGE     VERSION
    wallux   Ready    control-plane,master   6m26s   v1.23.6
    
  • Install Kots

Install kots with the following process.

  1. Run the following script and provide your password for the sudo based commands when prompted.

    curl https://kots.io/install/1.103.3 | REPL_USE_SUDO=y bash
    
  2. Verify kots was installed with the following command:

    kubectl kots version
    

    It should return results similar to the following:

    Replicated KOTS 1.103.3
    

For instructions on updating the kots version for the Wallaroo Ops installation, see Updating KOTS.

  • Connection Options

Once Kubernetes has been set up on the Linux node, users can opt to copy the Kubernetes configuration to a local system, updating the IP address and other information as required. See the Configure Access to Multiple Clusters.

The easiest method is to create a SSH tunnel to the Linux node. Usually this will be in the format:

ssh $IP -L8800:localhost:8800

For example, in an AWS instance that may be as follows, replaying $KEYFILE with the link to the keyfile and $IP with the IP address of the Linux node.

ssh -i $KEYFILE ubuntu@$IP -L8800:localhost:8800

In a GCP instance, gcloud can be used as follows, replacing $NAME with the name of the GCP instance, $ZONE with the zone it was installed into.

gcloud compute ssh $NAME --zone $ZONE -- -NL 8800:localhost:8800

Port forwarding port 8800 is used for kots based installation to access the Wallaroo Administrative Dashboard.

  • Network Configurations

Note that the standard procedure of installing Wallaroo, the Model Endpoints Guide details how to enable external public communications with the Wallaroo instance.

When Ingress Mode for Wallaroo interactive services are set to None, the user will have to use port forwarding services to access the Wallaroo instance.

When Ingress Mode for Wallaroo interactive services are set to Internal or External, the IP address is set via NodePort, and requires the following ports be open to access from remote locations:

  • 80
  • 443
  • 8081
  • 8083

Check the network settings for the single node linux hosting the Wallaroo instance for instructions on how to enable external or port forwarding access as required.

Install Wallaroo

Organizations that use cloud services such as Google Cloud Platform (GCP), Amazon Web Services (AWS), or Microsoft Azure can install Wallaroo Enterprise through the following process. These instructions also work with Single Node Linux based installations.

Before installation, the following prerequisites must be met:

  • Have a Wallaroo Enterprise license file. For more information, you can request a demonstration.
  • Set up a cloud Kubernetes environment that meets the requirements. Clusters must meet the following minimum specifications:
    • Minimum number of nodes: 4
    • Minimum Number of CPU Cores: 8
    • Minimum RAM: 16 GB
    • A total of 625 GB of storage will be allocated for the entire cluster based on 5 users with up to four pipelines with five steps per pipeline, with 50 GB allocated per node, including 50 GB specifically for the Jupyter Hub service. Enterprise users who deploy additional pipelines will require an additional 50 GB of storage per lab node deployed.
    • Runtime: containerd is required.
  • DNS services for integrating your Wallaroo Enterprise instance. See the DNS Integration Guide for the instructions on configuring Wallaroo Enterprise with your DNS services.

Wallaroo Enterprise can be installed either interactively or automatically through the kubectl and kots applications.

Automated Install

To automatically install Wallaroo into the namespace wallaroo, specify the administrative password and the license file during the installation as in the following format with the following variables:

  • NAMESPACE: The namespace for the Wallaroo Enterprise install, typically wallaroo.
  • LICENSEFILE: The location of the Wallaroo Enterprise license file.
  • SHAREDPASSWORD: The password of for the Wallaroo Administrative Dashboard.
kubectl kots install wallaroo/ee -n $NAMESPACE --license-file $LICENSEFILE --shared-password $SHAREDPASSWORD

For example, the following settings translate to the following install command:

  • NAMESPACE: wallaroo.
  • LICENSEFILE: myWallaroolicense.yaml
  • SHAREDPASSWORD: snugglebunnies

kubectl kots install wallaroo/ee -n wallaroo --license-file myWallaroolicense.yaml --shared-password wallaroo

Interactive Install

The Interactive Install process allows users to adjust the configuration settings before Wallaroo is deployed. It requires users be able to access the Wallaroo Administrative Dashboard through a browser, typically on port 8080.

  • IMPORTANT NOTE: Users who install Wallaroo through another node such as in the single node installation can port use SSH tunneling to access the Wallaroo Administrative Dashboard. For example:

    ssh IP -L8800:localhost:8800
    
  1. Install the Wallaroo Enterprise Edition using kots install wallaroo/ee, specifying the namespace to install Wallaroo into. For example, if wallaroo is the namespace, then the command is:

    kubectl kots install wallaroo/ee --namespace wallaroo
    
  2. Wallaroo Enterprise Edition will be downloaded and installed into your Kubernetes environment in the namespace specified. When prompted, set the default password for the Wallaroo environment. When complete, Wallaroo Enterprise Edition will display the URL for the Admin Console, and how to end the Admin Console from running.

    • Deploying Admin Console
    • Creating namespace ✓
    • Waiting for datastore to be ready ✓
        Enter a new password to be used for the Admin Console: •••••••••••••
      • Waiting for Admin Console to be ready ✓
    
    • Press Ctrl+C to exit
    • Go to http://localhost:8800 to access the Admin Console
    

To relaunch the Wallaroo Administrative Dashboard and make changes or updates, use the following command:

kubectl-kots admin-console --namespace wallaroo

Configure Wallaroo

Once installed, Wallaroo will continue to run until terminated.

Change Wallaroo Administrative Dashboard Password

To change the password to the Wallaroo Administrative Dashboard:

  1. From the command line, use the command:

    kubectl kots reset-password -n {namespace}
    

    For example, for default installations where the Kubernetes namespace is wallaroo, the command would be:

    kubectl kots reset-password -n wallaroo
    

    From here, enter the new password.

  2. From the Wallaroo Administrative Dashboard:

    1. Login and authenticate with the current password.

    2. From the upper right hand corner, select to access the menu and select Change password.

      Select Change Password
    3. Enter the current password, then update and verify with the new password.

      Change Password

Setup DNS Services

Wallaroo Enterprise requires integration into your organizations DNS services.

The DNS Integration Guide details adding the Wallaroo instance to an organizations DNS services. The following is an abbreviated guide that assumes that certificates were already generated.

  1. From the Wallaroo Dashboard, select Config and set the following:

    1. Networking Configuration
      1. Ingress Mode for Wallaroo Endpoints:
        1. None: Port forwarding or other methods are used for access.
        2. Internal: For environments where only nodes within the same Kubernetes environment and no external connections are required.
        3. External: Connections from outside the Kubernetes environment is allowed.
          1. Enable external URL inference endpoints: Creates pipeline inference endpoints. For more information, see Model Endpoints Guide.
    2. DNS
      1. DNS Suffix (Mandatory): The domain name for your Wallaroo instance.
    3. TLS Certificates
      1. Use custom TLS Certs: Checked
      2. TLS Certificate: Enter your TLS Certificate (.crt file).
      3. TLS Private Key: Enter your TLS private key (.key file).
    4. Other settings as desired.
    Wallaroo DNS Records
  2. Once complete, scroll to the bottom of the Config page and select Save config.

  3. A pop-up window will display The config for Wallaroo Enterprise has been updated.. Select Go to updated version to continue.

  4. From the Version History page, select Deploy. Once the new deployment is finished, you will be able to access your Wallaroo services via their DNS addresses.

To verify the configuration is complete, access the Wallaroo Dashboard through the suffix domain. For example if the suffix domain is wallaroo.example.com then access https://wallaroo.example.com in a browser and verify the connection and certificates.

Setup Users

User management is handled through the Wallaroo instance Keycloak service. See the Wallaroo User Management for full guides on setting up users, identity providers, and other user configuration options. This step must be completed before using Wallaroo.

The following is an abbreviated guide on setting up new Wallaroo users.

Accessing The Wallaroo Keycloak Dashboard

Enterprise customers may access their Wallaroo Keycloak dashboard by navigating to https://keycloak.<suffix>, depending on their choice domain suffix supplied during installation.

Obtaining Administrator Credentials

The standard Wallaroo installation creates the user admin by default and assigns them a randomly generated password. The admin user credentials are obtained which may be obtained directly from Kubernetes with the following commands, assuming the Wallaroo instance namespace is wallaroo.

  • Retrieve Keycloak Admin Username

    kubectl -n wallaroo \
    get secret keycloak-admin-secret \
    -o go-template='{{.data.KEYCLOAK_ADMIN_USER | base64decode }}'
    
  • Retrieve Keycloak Admin Password

    kubectl -n wallaroo \
    get secret keycloak-admin-secret \
    -o go-template='{{.data.KEYCLOAK_ADMIN_PASSWORD | base64decode }}'
    

Accessing the User Management Panel

In the Keycloak Administration Console, click Manage -> Users in the left-hand side menu. Click the View all users button to see existing users. This will be under the host name keycloak.$WALLAROO_SUFFIX. For example, if the $WALLAROO_SUFFIX is wallaroo.example.com, the Keycloak Administration Console would be keycloak.wallaroo.example.com.

Adding Users

To add a user through the Keycloak interface:

  1. Click the Add user button in the top-right corner.

  2. Enter the following:

    Wallaroo Enterprise New User
    1. A unique username and email address.
    2. Ensure that the Email Verified checkbox is checked - Wallaroo does not perform email verification.
    3. Under Required User Actions, set Update Password so the user will update their password the next time they log in.
  3. Click Save.

  4. Once saved, select Credentials tab, then the Set Password section, enter the new user’s desired initial password in the Password and Password Confirmation fields.

    Wallaroo Enterprise New User
  5. Click Set Password. Confirm the action when prompted. This will force the user to set their own password when they log in to Wallaroo.

  6. To log into the Wallaroo dashboard, log out as the Admin user and login to the Wallaroo Dashboard as a preconfigured user or via SSO.