Wallaroo Community Environment Setup Guides

How to set up your Wallaroo Community Environments

How to prepare your different environments for the Wallaroo Community installation.

Setting up Wallaroo typically requires the following steps.

Step                             Description                                                  Average Setup Time
Current Step: Setup Environment  Create an environment that meets the Wallaroo prerequisites  30 minutes
Install Wallaroo                 Install Wallaroo into a prepared environment                 15 minutes

Select the environment to prepare for the Wallaroo installation from the list below. Organizations that have already prepared an environment that meets the Wallaroo Prerequisites Guide can skip ahead to the Wallaroo Community Install Guides.

1 - Wallaroo Community AWS EC2 Setup Instructions

How to set up your Wallaroo Community Edition AWS Environment with EC2

Setup AWS EC2 Environment for Wallaroo

The following instructions help users set up their Amazon Web Services (AWS) environment for running Wallaroo using AWS virtual servers with EC2. This allows organizations to stand up a single virtual machine from a pre-made Amazon Machine Image (AMI) and quickly create an environment that can be used to install Wallaroo.

AWS Prerequisites

To install Wallaroo in your AWS environment based on these instructions, the following prerequisites must be met:

  • Register an AWS account: https://aws.amazon.com/ and assign the proper permissions according to your organization’s needs. This must be a paid AWS account - Wallaroo will not operate on the free tier level of virtual machines.

Steps

Create the EC2 VM

To create your Wallaroo instance using a pre-made AMI:

  1. Log into AWS cloud console.

  2. Set the region to N. Virginia. Other regions will be added over time.

  3. Select Services -> EC2.

  4. Select Instances, then from the upper right-hand section select Launch Instances -> Launch Instances.

  5. Set the Name and any additional tags.

  6. In Application and OS Images, enter Wallaroo Install and press Enter.

  7. From the search results, select Community AMIs and select Wallaroo Installer 3a.

  8. Set the Instance Type to c3.8xlarge or larger. c3.8xlarge is the minimum machine type and provides 32 cores with 60 GB memory.

  9. For Key pair (login) select one of the following:

    1. Select an existing Key pair name
    2. Select Create new key pair and set the following:
      1. Name: The name of the new key pair.
      2. Key pair type: Select either RSA or ED25519.
      3. Private key file format: Select either .pem or .ppk. These instructions are based on the .pem file.
      4. Select Create key pair when complete.
  10. Set the following for Network settings:

    1. Firewall: Select Create security group or select from an existing one that best fits your organization.
    2. Allow SSH traffic from: Set to Enabled and Anywhere 0.0.0.0/0.
    3. Allow HTTPS traffic from the internet: Set to Enabled.
  11. Set the following for Configure Storage:

    1. Set Root volume to at least 400 GiB, type standard.
  12. Review the Summary and verify the following:

    1. Number of instances: 1
    2. Virtual server type: Matches the minimum requirement listed above.
    3. Verify the other settings are accurate.
  13. Select Launch Instance.

It is recommended to give the instance time to complete its setup process. This typically takes 20 minutes.

Verify the Setup

To verify the environment is set up for Wallaroo:

  1. From the EC2 Dashboard, select the virtual machine created for your Wallaroo instance.

  2. Note the Public IPv4 DNS address.

  3. From a terminal, run ssh to connect to your virtual machine. The installation requires access to port 8800 and the private key selected or created in the instructions above.

    The ssh command uses the following format, replacing $keyfile and $VM_DNS with your private key file and the DNS address of your Amazon VM:

    ssh -i "$keyfile" ubuntu@$VM_DNS -L8800:localhost:8800
    

    For example, a $keyfile of Doc Sample Key.pem and $VM_DNS of ec2-54-160-227-100.compute-1.amazonaws.com would be as follows:

    ssh -i "Doc Sample Key.pem" ubuntu@ec2-54-160-227-100.compute-1.amazonaws.com -L8800:localhost:8800
    
  4. If the Kubernetes setup is still installing, wait until it is complete, and when prompted select EXIT to finish the process. This process may take 20 to 30 minutes.
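The ssh step above can be sketched as a small script. This is only an illustration: KEYFILE, VM_DNS, CONNECT, and the ssh_command helper are names introduced here, and the default values are the sample values from this guide. It prints the command for review and only connects when CONNECT=1 is set. The chmod 400 call is included because ssh refuses private keys that other users can read.

```shell
#!/bin/sh
# Sketch: open the SSH tunnel to the Wallaroo VM with port 8800 forwarded.
# KEYFILE and VM_DNS are placeholders; replace them with your own values.
KEYFILE="${KEYFILE:-Doc Sample Key.pem}"
VM_DNS="${VM_DNS:-ec2-54-160-227-100.compute-1.amazonaws.com}"

# Print the ssh command for a key file ($1) and a DNS address ($2).
# The key file is shown in quotes so that a name with spaces stays one argument.
ssh_command() {
  printf 'ssh -i "%s" ubuntu@%s -L8800:localhost:8800\n' "$1" "$2"
}

ssh_command "$KEYFILE" "$VM_DNS"

# Set CONNECT=1 to actually open the connection.
if [ "${CONNECT:-0}" = "1" ]; then
  chmod 400 "$KEYFILE"   # ssh rejects private keys readable by other users
  ssh -i "$KEYFILE" "ubuntu@$VM_DNS" -L8800:localhost:8800
fi
```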


Install Wallaroo

With your environment ready, it’s time to install Wallaroo.

Step               Status
Setup Environment  COMPLETE
Install Wallaroo   NEXT STEP! Install Wallaroo into a prepared environment

Cost Saving Tips

The following tips can be used to save costs on your AWS EC2 instance.

Stop Instances When Not In Use

One cost saving measure is to stop instances when not in use. If you intend to stop an instance, register it with a static IP address so that when it is turned back on your services will continue to function without interruption.
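As a rough sketch, the stop and static IP steps might look like the following with the AWS CLI. The instance ID and allocation ID are hypothetical placeholders, and the run helper and DRY_RUN variable are introduced here so the commands are only printed unless DRY_RUN=0 is set. It assumes the AWS CLI is installed and configured for your account.

```shell
#!/bin/sh
# Sketch: give an EC2 instance a static (Elastic) IP address, then stop and start it.
# INSTANCE_ID and ALLOCATION_ID are hypothetical placeholders.
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"
ALLOCATION_ID="${ALLOCATION_ID:-eipalloc-0123456789abcdef0}"

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# One time: allocate an Elastic IP (the output includes its AllocationId),
# then associate that allocation with the instance.
run aws ec2 allocate-address --domain vpc
run aws ec2 associate-address --instance-id "$INSTANCE_ID" --allocation-id "$ALLOCATION_ID"

# Stop the instance when it is not in use.
run aws ec2 stop-instances --instance-ids "$INSTANCE_ID"

# Start it again when needed; the Elastic IP stays attached.
run aws ec2 start-instances --instance-ids "$INSTANCE_ID"
```

In practice the allocation ID comes from the output of the allocate-address call, so the two one-time commands are run separately from the stop and start commands.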


Troubleshooting

I keep seeing errors such as connect failed. Is this a problem?

Sometimes you may see an error such as channel 3: open failed: connect failed: Connection refused. This is the ssh port forwarding attempting to connect to port 8800 during the installation, and can be ignored.

When Launching JupyterHub, I get a Server 500 error

If you shut down and restart a Wallaroo instance in a new environment or change the IP address, some settings may not be updated. Run the following command to restart the deployment process and update the settings to match the current environment:

kubectl rollout restart deployment hub

2 - Wallaroo Community AWS EKS Setup Instructions

How to set up your Wallaroo Community AWS Environment with EKS

The following instructions help users set up their Amazon Web Services (AWS) environment for running Wallaroo using AWS Elastic Kubernetes Service (EKS).

These represent a recommended setup, but can be modified to fit your specific needs.

If the prerequisites are already met, skip ahead to Install Wallaroo.


AWS Prerequisites

To install Wallaroo in your AWS environment based on these instructions, the following prerequisites must be met:

  • Register an AWS account: https://aws.amazon.com/ and assign the proper permissions according to your organization’s needs.
  • The Kubernetes cluster must include the following minimum settings:
    • Nodes must be OS type Linux and use the containerd driver.
    • Role-based access control (RBAC) must be enabled.
    • Minimum of 4 nodes, each node with a minimum of 8 CPU cores and 16 GB RAM. 50 GB will be allocated per node for a total of 625 GB for the entire cluster.
    • Recommended AWS machine type: c5.4xlarge. For more information, see AWS Instance Types.
  • Install eksctl version 0.101.0 or above.
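The eksctl version requirement can be checked with a short script. This is a sketch under two assumptions: that the local sort supports -V (version sort, as in GNU coreutils), and that eksctl version prints a plain version string such as 0.101.0. The version_ge helper is a name introduced here for illustration.

```shell
#!/bin/sh
# Sketch: check that the installed eksctl meets the 0.101.0 minimum.
# Assumes a sort that supports -V (version sort), for example GNU coreutils.
REQUIRED="0.101.0"

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v eksctl >/dev/null 2>&1; then
  # Keep only the leading digits and dots, dropping any suffix after them.
  installed="$(eksctl version | sed 's/[^0-9.].*$//')"
  if version_ge "$installed" "$REQUIRED"; then
    echo "eksctl $installed meets the $REQUIRED minimum"
  else
    echo "eksctl $installed is older than $REQUIRED"
  fi
else
  echo "eksctl is not installed"
fi
```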

AWS Cluster Recommendations

The following recommendations will assist in reducing the cost of a cloud based Kubernetes Wallaroo cluster.

  • Turn off the cluster when not in use. An AWS EKS (Elastic Kubernetes Service) cluster can be turned off when not in use, then turned back on again when needed. If organizations adopt this process, be aware of the following issues:

    • IP Address Reassignment: The load balancer public IP address may be reassigned when the cluster is restarted by the cloud service unless a static IP address is assigned. For more information in Amazon Web Services see the Associate Elastic IP addresses with resources in your VPC user guide.
  • Assign to a Single Availability Zone: Clusters that span multiple availability zones may have issues accessing persistent volumes that were provisioned in another availability zone from the node when the node is restarted. The simple solution is to assign the entire cluster into a single availability zone. For more information in Amazon Web Services see the Regions and Zones guide.

    The scripts and configuration files that create the AWS environment for a Wallaroo instance are based on a single availability zone. Modify the script as required for your organization.

Community Cluster Setup Instructions

The following is based on the requirements for Wallaroo Community. Note that Wallaroo Community does not use adaptive nodepools. Adapt the settings as required for your organization’s needs, as long as they meet the prerequisites listed above.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: wallarooAWS
  region: us-east-2
  version: "1.22"

nodeGroups:
  - name: mainpool
    instanceType: m5.2xlarge
    desiredCapacity: 4
    containerRuntime: containerd
    amiFamily: AmazonLinux2
    availabilityZones:
      - us-east-2a

Install AWS Command Line Tools

The following steps require the eksctl command line tool listed in the AWS Prerequisites, along with kubectl to verify the cluster.

Create the Cluster

Create the cluster with the following command, which reads the configuration above from a file named aws.yaml, creates the environment, and sets the correct Kubernetes version.

eksctl create cluster -f aws.yaml

During the process the Kubernetes credentials will be copied into the local environment. To verify the setup is complete, use the kubectl get nodes command to display the available nodes as in the following example:

kubectl get nodes
NAME                                           STATUS   ROLES    AGE     VERSION
ip-192-168-21-253.us-east-2.compute.internal   Ready    <none>   13m     v1.21.5-eks-9017834
ip-192-168-30-36.us-east-2.compute.internal    Ready    <none>   13m     v1.21.5-eks-9017834
ip-192-168-55-123.us-east-2.compute.internal   Ready    <none>   12m     v1.21.5-eks-9017834
ip-192-168-79-70.us-east-2.compute.internal    Ready    <none>   13m     v1.21.5-eks-9017834

Wallaroo can now be installed.
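The node check above can also be scripted. The following sketch counts the nodes whose STATUS column is exactly Ready and compares the count with the 4 node minimum from the prerequisites. The count_ready helper and MIN_NODES variable are names introduced here; the script does nothing if kubectl is not installed.

```shell
#!/bin/sh
# Sketch: confirm that at least 4 nodes report Ready.
MIN_NODES=4

# Reads `kubectl get nodes --no-headers` output on stdin and prints how many
# nodes have a STATUS (second column) of exactly "Ready".
count_ready() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

if command -v kubectl >/dev/null 2>&1; then
  ready="$(kubectl get nodes --no-headers 2>/dev/null | count_ready)"
  if [ "$ready" -ge "$MIN_NODES" ]; then
    echo "$ready nodes Ready"
  else
    echo "only $ready of $MIN_NODES nodes Ready"
  fi
fi
```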

Install Wallaroo

With your environment ready, it’s time to install Wallaroo.

Step               Status
Setup Environment  COMPLETE
Install Wallaroo   NEXT STEP! Install Wallaroo into a prepared environment

3 - Wallaroo Community Azure Setup Instructions

How to set up your Wallaroo Community Azure Environment

The following instructions help users set up their Microsoft Azure Kubernetes environment for running Wallaroo Community. These represent a recommended setup, but can be modified to fit your specific needs.

If you’re prepared to set up the environment now, skip to Setup Environment Steps.

There are two methods detailed here for setting up your Kubernetes cloud environment in Azure:

  • Quick Setup Script: Download a bash script to automatically set up the Azure environment through the Microsoft Azure command line interface az.
  • Manual Setup Guide: A list of the az commands used to create the environment through manual commands.


Azure Prerequisites

To install Wallaroo in your Microsoft Azure environment, the following prerequisites must be met:

  • Register a Microsoft Azure account: https://azure.microsoft.com/.
  • Install the Microsoft Azure CLI and complete the Azure CLI Get Started Guide to connect your az application to your Microsoft Azure account.
  • The Kubernetes cluster must include the following minimum settings:
    • Nodes must be OS type Linux and use the containerd driver.
    • Role-based access control (RBAC) must be enabled.
    • Minimum of 4 nodes, each node with a minimum of 8 CPU cores and 16 GB RAM. 50 GB will be allocated per node for a total of 625 GB for the entire cluster.
    • Minimum machine type is set to Standard_D8s_v4.
    • containerd is the default container driver.
  • Install kubectl and kots on the local system where you will be performing the environment setup.

Azure Cluster Recommendations

The following recommendations will assist in reducing the cost of a cloud based Kubernetes Wallaroo cluster.

  • Turn off the cluster when not in use. An Azure Kubernetes Service (AKS) cluster can be turned off when not in use, then turned back on again when needed to save on costs. For more information on starting and stopping an AKS cluster, see the Stop and Start an Azure Kubernetes Service (AKS) cluster guide.

    If organizations adopt this process, be aware of the following issues:

  • Assign to a Single Availability Zone: Clusters that span multiple availability zones may have issues accessing persistent volumes that were provisioned in another availability zone from the node when the cluster is restarted. The simple solution is to assign the entire cluster into a single availability zone. For more information in Microsoft Azure see the Create an Azure Kubernetes Service (AKS) cluster that uses availability zones guide.

    The scripts and configuration files that create the Azure environment for a Wallaroo instance are based on a single availability zone. Modify the script as required for your organization.
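Turning the cluster off and on can be sketched with the az aks stop and az aks start commands. The helper below only prints the commands so they can be reviewed before running them. The aks_power_cmd name is introduced here for illustration, and the default values are the ones from Standard Setup Variables.

```shell
#!/bin/sh
# Sketch: print the commands that stop and restart the AKS cluster.
WALLAROO_RESOURCE_GROUP="${WALLAROO_RESOURCE_GROUP:-wallaroocegroup}"
WALLAROO_CLUSTER="${WALLAROO_CLUSTER:-wallarooceaks}"

# Prints the az command for the given action ("stop" or "start").
aks_power_cmd() {
  echo "az aks $1 --resource-group $WALLAROO_RESOURCE_GROUP --name $WALLAROO_CLUSTER"
}

aks_power_cmd stop    # run the printed command when the cluster is idle
aks_power_cmd start   # run the printed command when the cluster is needed again
```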

Setup Environment Steps

Standard Setup Variables

The following variables are used in the Quick Setup Script and the Manual Setup Guide. Modify them as best fits your organization.

Variable Name                Default Value    Description
WALLAROO_RESOURCE_GROUP      wallaroocegroup  The Azure Resource Group used for the Kubernetes environment.
WALLAROO_GROUP_LOCATION      eastus           The region that the Kubernetes environment will be installed to.
WALLAROO_CONTAINER_REGISTRY  wallarooceacr    The Azure Container Registry used for the Kubernetes environment.
WALLAROO_CLUSTER             wallarooceaks    The name of the Kubernetes cluster that Wallaroo is installed to.
WALLAROO_SKU_TYPE            Basic            The Azure Kubernetes Service SKU type.
WALLAROO_NODEPOOL            wallaroocepool   The main nodepool for the Kubernetes cluster.
WALLAROO_VM_SIZE             Standard_D8s_v4  The VM type used for the standard Wallaroo cluster nodes.
WALLAROO_CLUSTER_SIZE        4                The number of nodes in the cluster.

Quick Setup Script

The following sample script creates an Azure Kubernetes environment ready for use with Wallaroo Community. This script requires the prerequisites listed above.

Modify the installation file to fit your organization. The only parts that require modification are the variables at the beginning of the script, which are listed in Standard Setup Variables.

The following steps are geared towards a standard Linux or macOS system that supports the prerequisites listed above. Modify these steps based on your local environment.

  1. Download the script above.

  2. In a terminal window, make the script executable with the command chmod +x wallaroo_community_azure_install.bash.

  3. Modify the script variables listed above based on your requirements.

  4. Run the script with either bash wallaroo_community_azure_install.bash or ./wallaroo_community_azure_install.bash from the same directory as the script.


Manual Setup Guide

The following steps are guidelines to assist new users in setting up their Azure environment for Wallaroo. Feel free to replace these commands with ones that match your needs.

See the Azure Command-Line Interface for full details on commands and settings.

The following are used for the example commands below. Replace them with your specific environment settings:

  • Azure Resource Group: wallarooCEGroup
  • Azure Resource Group Location: eastus
  • Azure Container Registry: wallarooCEAcr
  • Azure Kubernetes Cluster: wallarooCEAKS
  • Azure Container SKU type: Basic
  • Azure Nodepool Name: wallarooCEPool

Setting up an Azure AKS environment is based on the Azure Kubernetes Service tutorial, streamlined to show the minimum steps in setting up your own Wallaroo environment in Azure.

Manual Guide

This follows these major steps:

  • Create an Azure Resource Group
  • Create an Azure Container Registry
  • Create the Azure Kubernetes Environment

Set Variables

The following are the variables used in the environment setup process. Modify them as best fits your organization’s needs.

WALLAROO_RESOURCE_GROUP=wallaroocegroupdocs
WALLAROO_GROUP_LOCATION=eastus
WALLAROO_CONTAINER_REGISTRY=wallarooceacrdocs
WALLAROO_CLUSTER=wallarooceaksdocs
WALLAROO_SKU_TYPE=Basic
WALLAROO_NODEPOOL=wallaroocepool
WALLAROO_VM_SIZE=Standard_D8s_v4
WALLAROO_CLUSTER_SIZE=4

Create an Azure Resource Group

To create an Azure Resource Group for Wallaroo in Microsoft Azure, use the following template:

az group create --name $WALLAROO_RESOURCE_GROUP --location $WALLAROO_GROUP_LOCATION

(Optional): Set the default Resource Group to the one recently created. This allows other Azure commands to automatically select this group for commands such as az aks list, etc.

az configure --defaults group=$WALLAROO_RESOURCE_GROUP

Create an Azure Container Registry

An Azure Container Registry (ACR) manages the container images for services including Kubernetes. The template for setting up an Azure ACR that supports Wallaroo is the following:

az acr create -n $WALLAROO_CONTAINER_REGISTRY -g $WALLAROO_RESOURCE_GROUP --sku $WALLAROO_SKU_TYPE

Create an Azure Kubernetes Service

Now we can create the Kubernetes service in Azure that will host Wallaroo and meets the prerequisites. Modify the settings to meet your organization’s needs. This creates a 4 node cluster with a total of 32 cores.

az aks create \
--resource-group $WALLAROO_RESOURCE_GROUP \
--name $WALLAROO_CLUSTER \
--node-count $WALLAROO_CLUSTER_SIZE \
--generate-ssh-keys \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--node-vm-size $WALLAROO_VM_SIZE \
--nodepool-name $WALLAROO_NODEPOOL \
--nodepool-name mainpool \
--attach-acr $WALLAROO_CONTAINER_REGISTRY \
--kubernetes-version=1.22.15 \
--zones 1

Download Wallaroo Kubernetes Configuration

Once the Kubernetes environment is complete, associate it with the local Kubernetes configuration by importing the credentials through the following template command:

az aks get-credentials --resource-group $WALLAROO_RESOURCE_GROUP --name $WALLAROO_CLUSTER

Verify the cluster is available through the kubectl get nodes command.

kubectl get nodes

NAME                               STATUS   ROLES   AGE   VERSION
aks-mainpool-37402055-vmss000000   Ready    agent   81m   v1.22.6
aks-mainpool-37402055-vmss000001   Ready    agent   81m   v1.22.6
aks-mainpool-37402055-vmss000002   Ready    agent   81m   v1.22.6
aks-mainpool-37402055-vmss000003   Ready    agent   81m   v1.22.6

Install Wallaroo

With your environment ready, it’s time to install Wallaroo.

Step               Status
Setup Environment  COMPLETE
Install Wallaroo   NEXT STEP! Install Wallaroo into a prepared environment

4 - Wallaroo Community GCP Setup Instructions

How to set up your Wallaroo Community GCP Environment

The following instructions help users set up their Google Cloud Platform (GCP) Kubernetes environment for running Wallaroo. These represent a recommended setup, but can be modified to fit your specific needs. In particular, these instructions will provision a GKE cluster with 32 CPUs in total. Please ensure that your project’s resource limits support that.

  • Quick Setup Guide: Download a bash script to automatically set up the GCP environment through the Google Cloud Platform command line interface gcloud.
  • Manual Setup Guide: A list of the gcloud commands used to create the environment through manual commands.


GCP Prerequisites

Organizations that wish to run Wallaroo in their Google Cloud Platform environment must complete the following prerequisites:

GCP Cluster Recommendations

The following recommendations will assist in reducing the cost of a cloud based Kubernetes Wallaroo cluster.

  • Turn off the cluster when not in use. A GCP Google Kubernetes Engine (GKE) cluster can be turned off when not in use, then turned back on again when needed. If organizations adopt this process, be aware of the following issues:

    • IP Address Reassignment: The load balancer public IP address may be reassigned when the cluster is restarted by the cloud service unless a static IP address is assigned. For more information in Google Cloud Platform see the Configuring domain names with static IP addresses user guide.
  • Assign to a Single Availability Zone: Clusters that span multiple availability zones may have issues accessing persistent volumes that were provisioned in another availability zone from the node when the cluster is restarted. The simple solution is to assign the entire cluster into a single availability zone. For more information in Google Cloud Platform see the Regions and zones guide.

    The scripts and configuration files that create the GCP environment for a Wallaroo instance are based on a single availability zone. Modify the script as required for your organization.
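This guide does not name a specific command for turning a GKE cluster off. One common approach, shown here only as an assumption, is to resize the cluster to zero nodes with gcloud container clusters resize and resize it back when needed. The gke_resize_cmd helper is a name introduced here; it prints the command for review rather than running it. If the cluster has more than one node pool, a --node-pool argument would also be needed.

```shell
#!/bin/sh
# Sketch: print the commands that scale the GKE cluster down and back up.
WALLAROO_CLUSTER="${WALLAROO_CLUSTER:-wallaroo-ce}"
WALLAROO_GCP_REGION="${WALLAROO_GCP_REGION:-us-central1}"

# Prints the gcloud command that resizes the cluster to $1 nodes per zone.
gke_resize_cmd() {
  echo "gcloud container clusters resize $WALLAROO_CLUSTER --num-nodes $1 --region $WALLAROO_GCP_REGION"
}

gke_resize_cmd 0   # scale down when the cluster is idle
gke_resize_cmd 4   # scale back up to the size used in this guide
```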

Standard Setup Variables

The following variables are used in the Quick Setup Script and the Manual Setup Guide. Modify them as best fits your organization.

Variable Name                 Default Value      Description
WALLAROO_GCP_PROJECT          wallaroo-ce        The name of the Google Project used for the Wallaroo instance.
WALLAROO_CLUSTER              wallaroo-ce        The name of the Kubernetes cluster for the Wallaroo instance.
WALLAROO_GCP_REGION           us-central1        The region the Kubernetes environment is installed to. Update this to your GCP Compute Engine region.
WALLAROO_NODE_LOCATION        us-central1-f      The location the Kubernetes nodes are installed to. Update this to your GCP Compute Engine zone.
WALLAROO_GCP_NETWORK_NAME     wallaroo-network   The Google network used with the Kubernetes environment.
WALLAROO_GCP_SUBNETWORK_NAME  wallaroo-subnet-1  The Google network subnet used with the Kubernetes environment.
WALLAROO_GCP_MACHINE_TYPE     e2-standard-8      Recommended VM size per GCP node.
WALLAROO_CLUSTER_SIZE         4                  Number of nodes installed into the cluster. 4 nodes will create a 32 core cluster.

Quick Setup Script

A sample script is available here, and creates a Google Kubernetes Engine cluster ready for use with Wallaroo Community. This script requires the prerequisites listed above, and uses the variables as listed in Standard Setup Variables.

The following steps are geared towards a standard Linux or macOS system that supports the prerequisites listed above. Modify these steps based on your local environment.

  1. Download the script above.

  2. In a terminal window, make the script executable with the command chmod +x wallaroo_community_gcp_install.bash.

  3. Modify the script variables listed above based on your requirements.

  4. Run the script with either bash wallaroo_community_gcp_install.bash or ./wallaroo_community_gcp_install.bash from the same directory as the script.


Manual Setup Guide

The following steps are guidelines to assist new users in setting up their GCP environment for Wallaroo. Feel free to replace these commands with ones that match your needs.

See the Google Cloud SDK for full details on commands and settings.

The commands below are set to meet the prerequisites listed above, and use the variables as listed in Standard Setup Variables. Modify them as best fits your organization’s needs.

Set Variables

The following are the variables used in the environment setup process. Modify them as best fits your organization’s needs.

WALLAROO_GCP_PROJECT=wallaroo-ce
WALLAROO_CLUSTER=wallaroo-ce
WALLAROO_GCP_REGION=us-central1
WALLAROO_NODE_LOCATION=us-central1-f
WALLAROO_GCP_NETWORK_NAME=wallaroo-network
WALLAROO_GCP_SUBNETWORK_NAME=wallaroo-subnet-1
WALLAROO_GCP_MACHINE_TYPE=e2-standard-8
WALLAROO_CLUSTER_SIZE=4

Create a GCP Network

First create a GCP network that is used to connect to the cluster with the gcloud compute networks create command. For more information, see the gcloud compute networks create page.

gcloud compute networks \
create $WALLAROO_GCP_NETWORK_NAME \
--bgp-routing-mode regional \
--subnet-mode custom

Verify its creation by listing the GCP networks:

gcloud compute networks list

Create the GCP Wallaroo Cluster

Once the network is created, the gcloud container clusters create command is used to create a cluster. For more information see the gcloud container clusters create page.

Note that three nodes are created by default, so the --num-nodes setting is used to raise the count to the four nodes required by the Wallaroo prerequisites. For Google GKE, containerd is enabled by default and so does not need to be specified during the setup procedure. For more information, see https://cloud.google.com/kubernetes-engine/docs/concepts/using-containerd.

gcloud container clusters \
create $WALLAROO_CLUSTER \
--region $WALLAROO_GCP_REGION \
--node-locations $WALLAROO_NODE_LOCATION \
--machine-type $WALLAROO_GCP_MACHINE_TYPE \
--num-nodes $WALLAROO_CLUSTER_SIZE \
--network $WALLAROO_GCP_NETWORK_NAME \
--create-subnetwork name=$WALLAROO_GCP_SUBNETWORK_NAME \
--enable-ip-alias \
--cluster-version=1.22
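Because the cluster is created with --region, the --num-nodes value applies to each zone listed in --node-locations. The following arithmetic sketch shows how the settings above arrive at 4 nodes and 32 cores. The variable names are introduced here for illustration, and the 8 cores per node figure is the vCPU count of the e2-standard-8 machine type.

```shell
#!/bin/sh
# Sketch: total nodes and cores for the cluster created above.
NODES_PER_ZONE=4    # WALLAROO_CLUSTER_SIZE, passed as --num-nodes
ZONES=1             # one entry in --node-locations (us-central1-f)
CORES_PER_NODE=8    # e2-standard-8

total_nodes=$((NODES_PER_ZONE * ZONES))
total_cores=$((total_nodes * CORES_PER_NODE))
echo "nodes=$total_nodes cores=$total_cores"   # prints: nodes=4 cores=32
```

Listing more zones in --node-locations multiplies both totals, so setting ZONES=3 in this sketch gives 12 nodes and 96 cores.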

The command can take several minutes to complete based on the size and complexity of the clusters. Verify the process is complete with the clusters list command:

gcloud container clusters list

Retrieving Kubernetes Credentials

Once the GCP cluster is complete, the Kubernetes credentials can be installed into the local administrative system with the gcloud container clusters get-credentials command (https://cloud.google.com/sdk/gcloud/reference/container/clusters/get-credentials):

gcloud container clusters \
get-credentials $WALLAROO_CLUSTER \
--region $WALLAROO_GCP_REGION

To verify the Kubernetes credentials for your cluster have been installed locally, use the kubectl get nodes command. This will display the nodes in the cluster as demonstrated below:

kubectl get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
gke-wallaroo-ce-default-pool-863f02db-7xd4   Ready    <none>   39m   v1.21.6-gke.1503
gke-wallaroo-ce-default-pool-863f02db-8j2d   Ready    <none>   39m   v1.21.6-gke.1503
gke-wallaroo-ce-default-pool-863f02db-hn06   Ready    <none>   39m   v1.21.6-gke.1503
gke-wallaroo-ce-default-pool-3946eaca-4l3s   Ready    <none>   39m   v1.21.6-gke.1503

Install Wallaroo

With your environment ready, it’s time to install Wallaroo.

Step               Status
Setup Environment  COMPLETE
Install Wallaroo   NEXT STEP! Install Wallaroo into a prepared environment

Troubleshooting

What does the error ‘Insufficient project quota to satisfy request: resource “CPUS_ALL_REGIONS”’ mean?

Make sure that the Compute Engine zone and region are properly set based on your organization’s requirements. The instructions above default to us-central1, so change that region to install your Wallaroo instance in the correct location.

In the case of the script, this would mean changing the region and location from:

WALLAROO_GCP_REGION=us-central1
WALLAROO_NODE_LOCATION=us-central1-f

to:

WALLAROO_GCP_REGION={Your Region}
WALLAROO_NODE_LOCATION={Your Location}
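It can also help to compare the project quota named in the error with the 32 CPUs this guide provisions. The sketch below is an assumption-laden illustration: quota_ok and NEEDED_CPUS are names introduced here, the numbers in the final example are hypothetical, and the grep context flags assume the default gcloud output lists limit, metric, and usage on adjacent lines.

```shell
#!/bin/sh
# Sketch: check whether the project CPU quota can fit the cluster.
WALLAROO_GCP_PROJECT="${WALLAROO_GCP_PROJECT:-wallaroo-ce}"
NEEDED_CPUS=32   # 4 nodes x 8 cores, per Standard Setup Variables

# Succeeds when (limit - usage) >= needed. Values may be decimals such as 32.0.
quota_ok() {
  awk -v l="$1" -v u="$2" -v n="$3" 'BEGIN { exit !((l - u) >= n) }'
}

# Show the CPUS_ALL_REGIONS entry (limit, metric, usage) when gcloud is available.
if command -v gcloud >/dev/null 2>&1; then
  gcloud compute project-info describe --project "$WALLAROO_GCP_PROJECT" \
    | grep -B1 -A1 'CPUS_ALL_REGIONS'
fi

# Hypothetical numbers: a limit of 24 with 0 in use cannot fit 32 CPUs.
if quota_ok 24 0 "$NEEDED_CPUS"; then
  echo "quota is sufficient"
else
  echo "quota is too low"
fi
```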