Wallaroo Enterprise Single Node Linux Setup Instructions

How to prepare a single node Linux environment for Wallaroo Enterprise installations.

Organizations can run Wallaroo within a single node Linux environment that meets the prerequisites below.

The following guide is based on installing Wallaroo Enterprise into virtual machines running Ubuntu 22.04 hosted in Google Cloud Platform (GCP), Amazon Web Services (AWS), and Microsoft Azure. For other environments and configurations, consult your Wallaroo support representative.

Installation Flow

A typical installation of Wallaroo Enterprise follows this flow:

  • Create Environment: Create the environment to install Wallaroo that meets the system prerequisites.
  • Install Wallaroo: Install Wallaroo into the target environment.
  • Configure DNS: Configure DNS services and the Wallaroo instance for your organization’s use.

Prerequisites

Before starting the bare Linux installation, the following conditions must be met:

  • Have a Wallaroo Enterprise license file. For more information, you can request a demonstration.
  • A Linux bare-metal system or virtual machine with at least 32 cores and 64 GB RAM, running Ubuntu 22.04 (the version used throughout this guide).
  • 650 GB allocated for the root partition, plus 50 GB allocated per node and another 50 GB for the JupyterHub service. Enterprise users who deploy additional pipelines will require an additional 50 GB of storage per lab node deployed.
  • Ensure memory swapping is disabled by removing it from /etc/fstab if needed (see the example after this list).
  • DNS services for integrating your Wallaroo Enterprise instance. See the DNS Integration Guide for the instructions on configuring Wallaroo Enterprise with your DNS services.
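
For example, on Ubuntu the swap can be turned off for the current session and commented out of /etc/fstab so it stays disabled after a reboot. This is a minimal sketch; the sed pattern assumes a standard fstab swap entry, so review /etc/fstab afterwards.

sudo swapoff -a
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab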

Note that if Wallaroo is being installed into a cloud environment such as Google Cloud Platform, Microsoft Azure, or Amazon Web Services, then additional requirements such as networking, DNS, and certificates must be accounted for.

Template Single Node Scripts

The following template scripts are provided as examples of how to create single node virtual machines that meet the requirements listed above in AWS, GCP, and Microsoft Azure environments.

AWS VM Template Script

Download template script here: aws-single-node-vm.bash

# Variables

# The name of the virtual machine
NAME=$USER-demo-vm                     # eg bob-demo-vm

# The image used: ubuntu/images/current/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230208
IMAGE_ID=ami-0557a15b87f6559cf

# Instance type meeting the Wallaroo requirements.
INSTANCE_TYPE=c6i.8xlarge # c6a.8xlarge is also acceptable

# key name - generate keys using Amazon EC2 Key Pairs
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html
# Wallaroo people: https://us-east-1.console.aws.amazon.com/ec2/home?region=us-east-1#KeyPairs:v=3 - 
MYKEY=DocNode


# Whitelist our source IP for maximum security -- use 0.0.0.0/0 if you don't care.
MY_IP=$(curl -s https://checkip.amazonaws.com)/32

# Create security group in the Default VPC
aws ec2 create-security-group --group-name $NAME --description "$USER demo" --no-cli-pager

# Open port 22 and 443
aws ec2 authorize-security-group-ingress --group-name $NAME --protocol tcp --port 22 --cidr $MY_IP --no-cli-pager
aws ec2 authorize-security-group-ingress --group-name $NAME --protocol tcp --port 443 --cidr $MY_IP --no-cli-pager

# increase Boot device size to 650 GB
# Change the location from `/tmp/device.json` as required.
# cat <<EOF > /tmp/device.json 
# [{
#   "DeviceName": "/dev/sda1",
#   "Ebs": { 
#     "VolumeSize": 650,
#     "VolumeType": "gp2"
#   }
# }]
# EOF

# Launch instance with a 650 GB Boot device.
aws ec2 run-instances --image-id $IMAGE_ID --count 1 --instance-type $INSTANCE_TYPE \
    --no-cli-pager \
    --key-name $MYKEY \
    --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":650,"VolumeType":"gp2"}}]'  \
    --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=$NAME}]" \
    --security-groups $NAME

# Sample output:
# {
#     "Instances": [
#         {
#             ...
#             "InstanceId": "i-0123456789abcdef",     # Keep this instance-id for later
#             ...
#         }
#     ]
# }

#INSTANCEID=YOURINSTANCE
      
# After several minutes, a public IP will be known. This command will retrieve it.
# aws ec2 describe-instances  --output text --instance-id $INSTANCEID \
#    --query 'Reservations[*].Instances[*].{ip:PublicIpAddress}'

# Sample Output
# 12.23.34.56

# KEYFILE=KEYFILELOCATION       #usually ~/.ssh/key.pem - verify this is the same as the key above.
# SSH to the VM - replace $INSTANCEIP
#ssh -i $KEYFILE ubuntu@$INSTANCEIP

# Stop the VM - replace the $INSTANCEID
#aws ec2 stop-instances --instance-id $INSTANCEID

# Restart the VM
#aws ec2 start-instances --instance-id $INSTANCEID

# Clean up - destroy VM
#aws ec2 terminate-instances --instance-id $INSTANCEID

Azure VM Template Script

Download template script here: azure-single-node-vm.bash

#!/bin/bash

# Variables list.  Update as per your organization's settings
NAME=$USER-demo-vm                          # eg bob-demo-vm
RESOURCEGROUP=YOURRESOURCEGROUP
LOCATION=eastus
IMAGE=Canonical:0001-com-ubuntu-server-jammy:22_04-lts:22.04.202301140

# Pick a location
az account list-locations  -o table |egrep 'US|----|Name'

# Create resource group
az group create -l $LOCATION --name $RESOURCEGROUP

# Create VM. This will create ~/.ssh/id_rsa and id_rsa.pub - store these for later use.
az vm create --resource-group $RESOURCEGROUP --name $NAME --image $IMAGE  --generate-ssh-keys \
   --size Standard_D32s_v4 --os-disk-size-gb 500 --public-ip-sku Standard

# Sample output
# {
#   "location": "eastus",
#   "privateIpAddress": "10.0.0.4",
#   "publicIpAddress": "20.127.249.196",    <-- Write this down as MYPUBIP
#   "resourceGroup": "mnp-demo-230213",
#   ...
# }

# SSH port is open by default. This adds an application port.
az vm open-port --resource-group $RESOURCEGROUP --name $NAME --port 443

# SSH to the VM - assumes that ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub from above are available.
# ssh $MYPUBIP

# Use this to stop the VM ("deallocate" frees resources and stops billing; "stop" does not)
# az vm deallocate --resource-group $RESOURCEGROUP --name $NAME

# Restart the VM
# az vm start --resource-group $RESOURCEGROUP --name $NAME

GCP VM Template Script

Dependencies: the gcloud command line tool, installed and authenticated to your GCP project.

Download template script here: gcp-single-node-vm.bash

# Settings

NAME=$USER-demo-$(date +%y%m%d)      # eg bob-demo-230210
ZONE=us-west1-a                      # For a complete list, use `gcloud compute zones list | egrep ^us-`
PROJECT=wallaroo-dev-253816          # Insert the GCP Project ID here.  This is the one for Wallaroo.

# Create VM

IMAGE=projects/ubuntu-os-cloud/global/images/current/ubuntu-2204-jammy-v20230114

# Port 22 and 443 open by default
gcloud compute instances create $NAME \
    --project=$PROJECT \
    --zone=$ZONE \
    --machine-type=e2-standard-32 \
    --network-interface=network-tier=STANDARD,subnet=default \
    --maintenance-policy=MIGRATE \
    --provisioning-model=STANDARD \
    --no-service-account \
    --no-scopes \
    --tags=https-server \
    --create-disk=boot=yes,image=${IMAGE},size=500,type=pd-standard \
    --no-shielded-secure-boot \
    --no-shielded-vtpm \
    --no-shielded-integrity-monitoring \
    --reservation-affinity=any


# Get the external IP address
gcloud compute instances describe $NAME --zone $ZONE --format='get(networkInterfaces[0].accessConfigs[0].natIP)'

# SSH to the VM
#gcloud compute ssh $NAME --zone $ZONE

# SCP a file to the instance - replace $FILE with the file path.  Useful for copying the license file up to the instance.

#gcloud compute scp --zone $ZONE $FILE $NAME:~/

# SSH port forward to the VM
#gcloud compute ssh $NAME --zone $ZONE -- -NL 8800:localhost:8800

# Stop the VM
#gcloud compute instances stop $NAME --zone $ZONE

# Restart the VM
#gcloud compute instances start $NAME --zone $ZONE

Kubernetes Installation Steps

The following steps install the Kubernetes version and requirements that support a Wallaroo single node installation onto the Linux node.

The process includes these major steps:

  • Install Kubernetes
  • Install Kots

Install Kubernetes

curl is installed in the default scripts provided above. Verify that it is installed if using some other platform.
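
As a quick check on Ubuntu-based hosts, the following one-liner installs curl only if it is missing; this is a convenience snippet and not part of the official template scripts.

command -v curl >/dev/null || sudo apt-get install -y curl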

  1. Verify that the Ubuntu distribution is up to date, and reboot if necessary after updating.

    sudo apt update
    sudo apt upgrade
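    # Reboot only if the upgrade flags it as required (flag file assumed for Ubuntu)
    [ -f /var/run/reboot-required ] && sudo reboot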
    
  2. Start the Kubernetes installation with the following script, substituting the URL path as appropriate for your license.

    For Wallaroo versions 2022.4 and below:

    curl https://kurl.sh/9398a3a | sudo bash
    

    For Wallaroo versions 2023.1 and later, the install is based on the license channel. For example, if your license uses the EE channel, then the path is /wallaroo-ee; that is, /wallaroo- plus the lower-case channel name. Note that the Kubernetes install channel must match your license. Check with your Wallaroo support representative if you have any questions about your version.

    curl https://kurl.sh/wallaroo-ee | sudo bash
    
    • If prompted with "This application is incompatible with memory swapping enabled. Disable swap to continue? (Y/n)", reply Y.
  3. Set up the Kubernetes configuration with the following commands:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    chmod u+w $HOME/.kube/config
    echo 'export KUBECONFIG=$HOME/.kube/config' >> ~/.bashrc
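    # Alternatively, apply the variable to the current shell session without logging out:
    # export KUBECONFIG=$HOME/.kube/config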
    
  4. Log out, and log back in as the same user. Verify the installation was successful with the following:

    kubectl get nodes
    

    It should return results similar to the following:

    NAME     STATUS   ROLES                  AGE     VERSION
    wallux   Ready    control-plane,master   6m26s   v1.23.6
    

Install Kots

Install kots with the following process.

  1. Run the following script and provide your password for the sudo-based commands when prompted.

    curl https://kots.io/install/1.91.3 | sudo bash
    
  2. Verify kots was installed with the following command:

    kubectl kots version
    

    It should return results similar to the following:

    Replicated KOTS 1.91.3
    

Connection Options

Once Kubernetes has been set up on the Linux node, users can opt to copy the Kubernetes configuration to a local system, updating the IP address and other information as required. See the Kubernetes guide Configure Access to Multiple Clusters for details.
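
As a rough sketch of that approach, assuming the node's configuration was set up as in the steps above, that $IP is the public address of the Linux node, and that the default ubuntu user and API server port 6443 apply, the configuration could be copied and repointed as follows. The cluster certificate may not include the public address, so TLS settings may also need adjusting.

scp ubuntu@$IP:.kube/config ~/.kube/wallaroo-config
# Update the server address in the copied configuration to the node's public IP
sed -i "s#https://.*:6443#https://$IP:6443#" ~/.kube/wallaroo-config
export KUBECONFIG=~/.kube/wallaroo-config
kubectl get nodes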

The easiest method is to create an SSH tunnel to the Linux node. Usually this will be in the following format:

ssh $IP -L8800:localhost:8800

For example, for an AWS instance this may look like the following, replacing $KEYFILE with the path to the key file and $IP with the IP address of the Linux node.

ssh -i $KEYFILE ubuntu@$IP -L8800:localhost:8800

In a GCP instance, gcloud can be used as follows, replacing $NAME with the name of the GCP instance and $ZONE with the zone it was created in.

gcloud compute ssh $NAME --zone $ZONE -- -NL 8800:localhost:8800

Forwarding port 8800 is used during kots-based installations to access the Wallaroo Administrative Dashboard.

Install Wallaroo

With the environment prepared, Wallaroo can now be installed.

Step status:

  • Setup Environment: COMPLETED
  • Install Wallaroo Enterprise: Install Wallaroo into a prepared environment (NEXT STEP)
  • Integrate Wallaroo with DNS Services: Update Wallaroo post-install