Organizations can run Wallaroo within a single node Linux environment that meets the prerequisites.
The following guide is based on installing Wallaroo Enterprise into virtual machines based on Ubuntu 22.04 hosted in Google Cloud Platform (GCP), Amazon Web Services (AWS) and Microsoft Azure. For other environments and configurations, consult your Wallaroo support representative.
Installation Flow
A typical installation of Wallaroo Enterprise follows this flow:
- Create Environment: Create the environment to install Wallaroo that meets the system prerequisites.
- Install Wallaroo: Install Wallaroo into the target environment.
- Configure DNS: Configure DNS services and the Wallaroo instance for your organization’s use.
Prerequisites
Before starting the bare Linux installation, the following conditions must be met:
- Have a Wallaroo Enterprise license file. For more information, you can request a demonstration.
- A Linux bare-metal system or virtual machine with at least 32 cores and 64 GB RAM with Ubuntu 20.04 installed.
- See Install Wallaroo with Minimum Services for installing Wallaroo with reduced services.
- 650 GB allocated for the root partition, plus 50 GB allocated per node and another 50 GB for the JupyterHub service. Enterprise users who deploy additional pipelines will require an additional 50 GB of storage per lab node deployed.
- Ensure memory swapping is disabled by removing it from /etc/fstab if needed.
- DNS services for integrating your Wallaroo Enterprise instance. See the DNS Integration Guide for the instructions on configuring Wallaroo Enterprise with your DNS services.
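One way to meet the swap prerequisite is to turn swap off on the running system and comment out any swap entries in /etc/fstab so that it stays off after a reboot. The following is a minimal sketch and not part of the Wallaroo scripts: the function name disable_swap_entries is hypothetical, and the pattern assumes that swap entries contain the word swap as a whitespace-delimited field.

```shell
#!/usr/bin/env bash
# Sketch: comment out active swap entries in an fstab-style file.
# disable_swap_entries is a hypothetical helper name, not a Wallaroo tool.
disable_swap_entries() {
  local fstab=${1:-/etc/fstab}
  # Prefix every uncommented line that has "swap" as a whitespace-delimited
  # field with '#'. The original file is kept as "$fstab.bak".
  sed -E -i.bak '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}

# On the node, as root:
#   swapoff -a                          # turn swap off immediately
#   disable_swap_entries /etc/fstab     # keep it off after the next reboot
```

Review the resulting file before rebooting; the backup copy makes it easy to restore the original.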
Note that if Wallaroo is being installed into a cloud environment such as Google Cloud Platform, Microsoft Azure, or Amazon Web Services, then additional considerations such as networking, DNS, and certificates must be accounted for.
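The hardware prerequisites can be checked on a candidate node before installing. The following is a rough preflight sketch under stated assumptions: the function names are illustrative, the thresholds are copied from the prerequisites list, and the commands assume a Linux host with GNU coreutils. Reported memory and filesystem sizes are usually a little lower than the nominal hardware sizes, so treat a near miss as a prompt to check manually.

```shell
#!/usr/bin/env bash
# Preflight sketch for the prerequisites. Function names are illustrative;
# thresholds (32 cores, 64 GB RAM, 650 GB root) come from the list above.

# Print PASS or FAIL for one numeric requirement; return non-zero on FAIL.
check_minimum() {
  local label=$1 actual=$2 required=$3
  if [ "$actual" -ge "$required" ]; then
    echo "PASS $label: $actual (minimum $required)"
  else
    echo "FAIL $label: $actual (minimum $required)"
    return 1
  fi
}

# Gather values from the local Linux host and check each one.
preflight() {
  local failed=0 cores mem_gb root_gb swap_lines
  cores=$(nproc)
  # MemTotal is in kB; dividing by 1000 twice gives an approximate GB figure.
  mem_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1000 / 1000}' /proc/meminfo)
  # Size of the filesystem mounted at /, in GB (GNU df).
  root_gb=$(df -BG --output=size / | tail -n 1 | tr -dc '0-9')
  # /proc/swaps has one header line plus one line per active swap device.
  swap_lines=$(($(wc -l < /proc/swaps) - 1))

  check_minimum "CPU cores" "$cores" 32 || failed=1
  check_minimum "RAM (GB)" "$mem_gb" 64 || failed=1
  check_minimum "root partition (GB)" "$root_gb" 650 || failed=1
  if [ "$swap_lines" -gt 0 ]; then
    echo "FAIL swap: $swap_lines active swap device(s)"
    failed=1
  else
    echo "PASS swap: disabled"
  fi
  return "$failed"
}

# Run the checks only when invoked with the argument "run".
if [ "${1:-}" = "run" ]; then
  preflight
fi
```

Saved as a file such as preflight.sh (a hypothetical name), it can be run on the node with `bash preflight.sh run`.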
Template Single Node Scripts
The following template scripts are provided as examples on how to create single node virtual machines that meet the requirements listed above in AWS, GCP, and Microsoft Azure environments.
AWS VM Template Script
- Dependencies
- AWS CLI
- IAM permissions to create resources. See IAM policies for Amazon EC2.
Download template script here: aws-single-node-vm.bash
# Variables
# The name of the virtual machine
NAME=$USER-demo-vm # eg bob-demo-vm
# The image used : ubuntu/images/current/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230208
IMAGE_ID=ami-0557a15b87f6559cf
# Instance type meeting the Wallaroo requirements.
INSTANCE_TYPE=c6i.8xlarge # c6a.8xlarge is also acceptable
# key name - generate keys using Amazon EC2 Key Pairs
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html
# Wallaroo people: https://us-east-1.console.aws.amazon.com/ec2/home?region=us-east-1#KeyPairs:v=3 -
MYKEY=DocNode
# We will whitelist our source IP for maximum security -- just use 0.0.0.0/0 if you don't care.
MY_IP=$(curl -s https://checkip.amazonaws.com)/32
# Create security group in the Default VPC
aws ec2 create-security-group --group-name $NAME --description "$USER demo" --no-cli-pager
# Open port 22 and 443
aws ec2 authorize-security-group-ingress --group-name $NAME --protocol tcp --port 22 --cidr $MY_IP --no-cli-pager
aws ec2 authorize-security-group-ingress --group-name $NAME --protocol tcp --port 443 --cidr $MY_IP --no-cli-pager
# increase Boot device size to 650 GB
# Change the location from `/tmp/device.json` as required.
# cat <<EOF > /tmp/device.json
# [{
# "DeviceName": "/dev/sda1",
# "Ebs": {
# "VolumeSize": 650,
# "VolumeType": "gp2"
# }
# }]
# EOF
# Launch instance with a 650 GB Boot device.
aws ec2 run-instances --image-id $IMAGE_ID --count 1 --instance-type $INSTANCE_TYPE \
--no-cli-pager \
--key-name $MYKEY \
--block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":650,"VolumeType":"gp2"}}]' \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=$NAME}]" \
--security-groups $NAME
# Sample output:
# {
# "Instances": [
# {
# ...
# "InstanceId": "i-0123456789abcdef", # Keep this instance-id for later
# ...
# }
# ]
# }
#INSTANCEID=YOURINSTANCE
# After several minutes, a public IP will be known. This command will retrieve it.
# aws ec2 describe-instances --output text --instance-id $INSTANCEID \
# --query 'Reservations[*].Instances[*].{ip:PublicIpAddress}'
# Sample Output
# 12.23.34.56
# KEYFILE=KEYFILELOCATION #usually ~/.ssh/key.pem - verify this is the same as the key above.
# SSH to the VM - replace $INSTANCEIP
#ssh -i $KEYFILE ubuntu@$INSTANCEIP
# Stop the VM - replace the $INSTANCEID
#aws ec2 stop-instances --instance-id $INSTANCEID
# Restart the VM
#aws ec2 start-instances --instance-id $INSTANCEID
# Clean up - destroy VM
#aws ec2 terminate-instances --instance-id $INSTANCEID
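Rather than checking by hand until the public IP is known, the wait and lookup steps from the script can be wrapped in a small function. This is a sketch and not part of aws-single-node-vm.bash: public_ip_for is a hypothetical helper name, and it assumes the AWS CLI is configured for the region the instance was launched in.

```shell
#!/usr/bin/env bash
# Sketch: wait for an EC2 instance to run, then print its public IP.
# public_ip_for is a hypothetical helper, not part of the template script.
public_ip_for() {
  local instance_id=$1
  # Blocks until the instance reaches the "running" state or the waiter times out.
  aws ec2 wait instance-running --instance-ids "$instance_id" || return 1
  aws ec2 describe-instances --instance-ids "$instance_id" \
    --query 'Reservations[0].Instances[0].PublicIpAddress' --output text
}

# Example, using the instance ID from the run-instances output:
#   INSTANCEIP=$(public_ip_for i-0123456789abcdef)
#   ssh -i "$KEYFILE" "ubuntu@$INSTANCEIP"
```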
Azure VM Template Script
- Dependencies
- Azure CLI
Download template script here: azure-single-node-vm.bash
#!/bin/bash
# Variables list. Update as per your organization's settings
NAME=$USER-demo-vm # eg bob-demo-vm
RESOURCEGROUP=YOURRESOURCEGROUP
LOCATION=eastus
IMAGE=Canonical:0001-com-ubuntu-server-jammy:22_04-lts:22.04.202301140
# Pick a location
az account list-locations -o table | grep -E 'US|----|Name'
# Create resource group
az group create -l $LOCATION --name $RESOURCEGROUP
# Create VM. This will create ~/.ssh/id_rsa and id_rsa.pub - store these for later use.
az vm create --resource-group $RESOURCEGROUP --name $NAME --image $IMAGE --generate-ssh-keys \
--size Standard_D32s_v4 --os-disk-size-gb 500 --public-ip-sku Standard
# Sample output
# {
# "location": "eastus",
# "privateIpAddress": "10.0.0.4",
# "publicIpAddress": "20.127.249.196", <-- Write this down as MYPUBIP
# "resourceGroup": "mnp-demo-230213",
# ...
# }
# SSH port is open by default. This adds an application port.
az vm open-port --resource-group $RESOURCEGROUP --name $NAME --port 443
# SSH to the VM - assumes that ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub from above are available.
# ssh $MYPUBIP
# Stop the VM ("deallocate" frees resources and billing; "stop" does not)
# az vm deallocate --resource-group $RESOURCEGROUP --name $NAME
# Restart the VM
# az vm start --resource-group $RESOURCEGROUP --name $NAME
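If the public IP address from the az vm create output was not written down, it can be looked up again later. The following is a minimal sketch, not part of azure-single-node-vm.bash; azure_public_ip is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the public IP of an Azure VM so it can be stored as MYPUBIP.
# azure_public_ip is a hypothetical helper, not part of the template script.
azure_public_ip() {
  local resource_group=$1 vm_name=$2
  # --show-details adds network fields such as publicIps to the output.
  az vm show --show-details --resource-group "$resource_group" --name "$vm_name" \
    --query publicIps --output tsv
}

# Example:
#   MYPUBIP=$(azure_public_ip "$RESOURCEGROUP" "$NAME")
#   ssh "$MYPUBIP"
```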
GCP VM Template Script
Dependencies:
- Gcloud CLI
- GCP Project ID
Download template script here: gcp-single-node-vm.bash
# Settings
NAME=$USER-demo-$(date +%y%m%d) # eg bob-demo-230210
ZONE=us-west1-a # For a complete list, use `gcloud compute zones list | egrep ^us-`
PROJECT=wallaroo-dev-253816 # Insert the GCP Project ID here. This is the one for Wallaroo.
# Create VM
IMAGE=projects/ubuntu-os-cloud/global/images/current/ubuntu-2204-jammy-v20230114
# Port 22 and 443 open by default
gcloud compute instances create $NAME \
--project=$PROJECT \
--zone=$ZONE \
--machine-type=e2-standard-32 \
--network-interface=network-tier=STANDARD,subnet=default \
--maintenance-policy=MIGRATE \
--provisioning-model=STANDARD \
--no-service-account \
--no-scopes \
--tags=https-server \
--create-disk=boot=yes,image=${IMAGE},size=500,type=pd-standard \
--no-shielded-secure-boot \
--no-shielded-vtpm \
--no-shielded-integrity-monitoring \
--reservation-affinity=any
# Get the external IP address
gcloud compute instances describe $NAME --zone $ZONE --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
# SSH to the VM
#gcloud compute ssh $NAME --zone $ZONE
# SCP file to the instance - replace $FILE with the file path. Useful for copying the license file up to the instance.
#gcloud compute scp --zone $ZONE $FILE $NAME:~/
# SSH port forward to the VM
#gcloud compute ssh $NAME --zone $ZONE -- -NL 8800:localhost:8800
# Stop the VM
#gcloud compute instances stop $NAME --zone $ZONE
# Restart the VM
#gcloud compute instances start $NAME --zone $ZONE
Kubernetes Installation Steps
The following script and steps install Kubernetes and the requirements for a Wallaroo single node installation onto the Linux node.
The process includes these major steps:
- Install Kubernetes
- Install Kots
Install Kubernetes
curl is installed in the default scripts provided above. Verify that it is installed if using some other platform.
- Verify that the Ubuntu distribution is up to date, and reboot if necessary after updating.
sudo apt update
sudo apt upgrade
- Start the Kubernetes installation with the following script, substituting the URL path as appropriate for your license.
For Wallaroo versions 2022.4 and below:
curl https://kurl.sh/9398a3a | sudo bash
For Wallaroo versions 2023.1 and later, the install is based on the license channel. For example, if your license uses the EE channel, then the path is /wallaroo-ee; that is, /wallaroo- plus the lower-case channel name. Note that the Kubernetes install channel must match the license version. Check with your Wallaroo support representative with any questions about your version.
curl https://kurl.sh/wallaroo-ee | sudo bash
- If prompted with "This application is incompatible with memory swapping enabled. Disable swap to continue? (Y/n)", reply Y.
- Set up the Kubernetes configuration with the following commands:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
chmod u+w $HOME/.kube/config
echo 'export KUBECONFIG=$HOME/.kube/config' >> ~/.bashrc
- Log out, and log back in as the same user. Verify the installation was successful with the following:
kubectl get nodes
It should return results similar to the following:
NAME     STATUS   ROLES                  AGE     VERSION
wallux   Ready    control-plane,master   6m26s   v1.23.6
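For Wallaroo versions 2023.1 and later, the rule for deriving the install URL from the license channel can be sketched as a small function. This is an illustration only: kurl_install_url is a hypothetical helper name, and the channel value must still be confirmed against your license.

```shell
#!/usr/bin/env bash
# Sketch: build the kURL install URL for Wallaroo 2023.1 and later from a
# license channel name. kurl_install_url is a hypothetical helper name.
kurl_install_url() {
  local channel=$1
  # The path is /wallaroo- plus the lower-case channel name.
  printf 'https://kurl.sh/wallaroo-%s\n' \
    "$(printf '%s' "$channel" | tr '[:upper:]' '[:lower:]')"
}

# Example: review the URL, then pipe the installer to bash as in the install step.
#   url=$(kurl_install_url EE)    # https://kurl.sh/wallaroo-ee
#   curl "$url" | sudo bash
```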
Install Kots
Install kots with the following process.
- Run the following script and provide your password for the sudo based commands when prompted.
curl https://kots.io/install/1.91.3 | sudo bash
- Verify kots was installed with the following command:
kubectl kots version
It should return results similar to the following:
Replicated KOTS 1.91.3
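The version check can also be scripted so that a mismatch is caught before continuing. The following is a minimal sketch: kots_version_is is a hypothetical helper name, and it assumes the command prints a line ending in the version number, as in the sample output above.

```shell
#!/usr/bin/env bash
# Sketch: confirm the installed kots plugin matches the expected version.
# kots_version_is is a hypothetical helper; 1.91.3 is the version installed above.
kots_version_is() {
  local expected=$1 installed
  # Take the last field of the line containing "KOTS", e.g. "Replicated KOTS 1.91.3".
  installed=$(kubectl kots version | awk '/KOTS/ {print $NF; exit}')
  [ "$installed" = "$expected" ]
}

# Example:
#   kots_version_is 1.91.3 && echo "kots version OK"
```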
Connection Options
Once Kubernetes has been set up on the Linux node, users can opt to copy the Kubernetes configuration to a local system, updating the IP address and other information as required. See Configure Access to Multiple Clusters.
The easiest method is to create an SSH tunnel to the Linux node. Usually this will be in the format:
ssh $IP -L8800:localhost:8800
For example, in an AWS instance that may be as follows, replacing $KEYFILE with the path to the keyfile and $IP with the IP address of the Linux node.
ssh -i $KEYFILE ubuntu@$IP -L8800:localhost:8800
In a GCP instance, gcloud can be used as follows, replacing $NAME with the name of the GCP instance and $ZONE with the zone it was installed into.
gcloud compute ssh $NAME --zone $ZONE -- -NL 8800:localhost:8800
Forwarding port 8800 is used for kots based installations to access the Wallaroo Administrative Dashboard.
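The plain SSH variants above can be wrapped in one function that adds the key file only when one is given. This is a sketch under stated assumptions: open_kots_tunnel is a hypothetical helper name, and the default user ubuntu is taken from the AWS example and may differ on other platforms.

```shell
#!/usr/bin/env bash
# Sketch: forward local port 8800 to port 8800 on the Linux node.
# open_kots_tunnel is a hypothetical helper name.
open_kots_tunnel() {
  local ip=$1 keyfile=${2:-} user=${3:-ubuntu}
  # -N: run no remote command; -L: forward local port 8800 to the node.
  local args=(-N -L 8800:localhost:8800)
  if [ -n "$keyfile" ]; then
    args+=(-i "$keyfile")
  fi
  ssh "${args[@]}" "$user@$ip"
}

# Example (runs until interrupted with Ctrl-C):
#   open_kots_tunnel "$IP" "$KEYFILE"
```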
Install Wallaroo
With the environment prepared, Wallaroo can now be installed.
Step | Status |
---|---|
Setup Environment | COMPLETED! |
Install Wallaroo Enterprise | NEXT STEP! Install Wallaroo into a prepared environment |
Integrate Wallaroo with DNS Services | Update Wallaroo post-install |