How to set up your Wallaroo Community Edition AWS Environment with EC2
Set Up the AWS EC2 Environment for Wallaroo
The following instructions assist users in setting up their Amazon Web Services (AWS) environment for running Wallaroo on AWS virtual servers with EC2. This allows organizations to stand up a single virtual machine from a pre-made Amazon Machine Image (AMI) and quickly create an environment where Wallaroo can be installed.
AWS Prerequisites
To install Wallaroo in your AWS environment based on these instructions, the following prerequisites must be met:
Register an AWS account: https://aws.amazon.com/ and assign the proper permissions according to your organization’s needs. This must be a paid AWS account - Wallaroo will not operate on the free tier level of virtual machines.
Steps
Create the EC2 VM
To create your Wallaroo instance using a pre-made AMI:
Log into AWS cloud console.
Set the region to N. Virginia. Other regions will be added over time.
Select Services -> EC2.
Select Instances, then from the upper right-hand section select Launch Instances -> Launch Instances.
Set the Name and any additional tags.
In Application and OS Images, enter Wallaroo Install and press Enter.
From the search results, select Community AMIs and select Wallaroo Installer 3a.
Set the Instance Type to c6i.8xlarge or c6a.8xlarge as the minimum machine type. This provides 32 vCPUs with 64 GiB of memory.
For Key pair (login) select one of the following:
Select an existing Key pair name
Select Create new key pair and set the following:
Name: The name of the new key pair.
Key pair type: Select either RSA or ED25519.
Private key file format: Select either .pem or .ppk. These instructions are based on the .pem file.
Select Create key pair when complete.
Set the following for Network settings:
Firewall: Select Create security group or select from an existing one that best fits your organization.
Allow SSH traffic from: Set to Enabled and Anywhere 0.0.0.0/0.
Allow HTTPS traffic from the internet: Set to Enabled.
Set the following for Configure Storage:
Set Root volume to at least 400 GiB, type standard.
Review the Summary and verify the following:
Number of instances: 1
Virtual server type: Matches the minimum requirement listed above.
Verify the other settings are accurate.
Select Launch Instance.
It is recommended to give the instance time to complete its setup process. This typically takes 20 minutes.
Verify the Setup
To verify the environment is set up for Wallaroo:
From the EC2 Dashboard, select the virtual machine created for your Wallaroo instance.
Note the Public IPv4 DNS address.
From a terminal, run ssh to connect to your virtual machine. The installation requires access to port 8800 and the private key selected or created in the instructions above.
The ssh command for connecting to your virtual machine uses the following format, replacing $keyfile and $VM_DNS with your private key file and the DNS address of your Amazon VM:
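As a minimal sketch of that format, the helper below wraps the ssh call and forwards local port 8800 to the same port on the VM. The function name wallaroo_ssh, the ubuntu login user, and the -L 8800:localhost:8800 forwarding are assumptions based on the port requirement above, not the official command, so adjust them to match your AMI.

```shell
# Hypothetical helper: connect to the EC2 VM and forward port 8800.
# Usage: wallaroo_ssh /path/to/key.pem <Public IPv4 DNS address>
wallaroo_ssh() {
  keyfile="$1"   # private key selected or created for the instance
  vm_dns="$2"    # Public IPv4 DNS address noted from the EC2 Dashboard
  # Assumption: the AMI's login user is "ubuntu"; change it if yours differs.
  ssh -i "$keyfile" -L 8800:localhost:8800 "ubuntu@${vm_dns}"
}
```

ssh refuses private keys that other users can read, so running chmod 400 on the .pem file before connecting is usually required.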
If the Kubernetes setup is still installing, wait until it completes, then select EXIT when prompted to finish the process. This may take 20 to 30 minutes.
Install Wallaroo
With your environment ready, it’s time to install Wallaroo.
The following tips can be used to save costs on your AWS EC2 instance.
Stop Instances When Not In Use
One cost saving measure is to stop instances when not in use. If you intend to stop an instance, register it with a static IP address so that when it is turned back on your services will continue to function without interruption.
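The stop-and-restart workflow can be sketched with the AWS CLI as follows. The function names are hypothetical and the instance ID is supplied by you; the underlying aws ec2 subcommands are standard, but treat this as an outline under those assumptions and not a tested procedure for every account setup.

```shell
# Hypothetical helpers around standard AWS CLI calls.

# Allocate an Elastic IP and attach it to the instance so the public
# address stays the same across stop/start cycles.
assign_static_ip() {
  instance_id="$1"
  allocation_id=$(aws ec2 allocate-address --domain vpc \
    --query AllocationId --output text) || return 1
  aws ec2 associate-address \
    --instance-id "$instance_id" \
    --allocation-id "$allocation_id"
}

# Stop the instance when it is not in use.
stop_instance() {
  aws ec2 stop-instances --instance-ids "$1"
}

# Start it again when needed.
start_instance() {
  aws ec2 start-instances --instance-ids "$1"
}
```

For example, assign_static_ip followed by stop_instance, each given your instance ID, prepares an instance to be paused without losing its address.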
I keep seeing errors such as connect failed. Is this a problem?
Sometimes you may see an error such as channel 3: open failed: connect failed: Connection refused. This is the ssh port forwarding attempting to connect to port 8800 during the installation, and can be ignored.
When Launching JupyterHub, I get a Server 500 error
If you shut down and restart a Wallaroo instance in a new environment or change the IP address, some settings may not be updated. Run the following command to restart the deployment process and update the settings to match the current environment:
kubectl rollout restart deployment hub
Wallaroo Community AWS EKS Setup Instructions
How to set up your Wallaroo Community AWS Environment with EKS
The following instructions assist users in setting up their Amazon Web Services (AWS) environment for running Wallaroo using AWS Elastic Kubernetes Service (EKS).
These represent a recommended setup, but can be modified to fit your specific needs.
If the prerequisites are already met, skip ahead to Install Wallaroo.
AWS Prerequisites
To install Wallaroo in your AWS environment based on these instructions, the following prerequisites must be met:
Register an AWS account: https://aws.amazon.com/ and assign the proper permissions according to your organization’s needs.
The Kubernetes cluster must include the following minimum settings:
Nodes must run Linux and use the containerd driver.
Role-based access control (RBAC) must be enabled.
Minimum of 4 nodes, each node with a minimum of 8 CPU cores and 16 GB RAM. 50 GB will be allocated per node for a total of 625 GB for the entire cluster.
Recommended AWS machine type: c5.4xlarge. For more information, see the AWS Instance Types page.
Running Wallaroo has two sets of software requirements:
Environment Requirements: The following software must be installed on the environment that will run Wallaroo. Most are automatically available through the supported cloud providers.
Kubernetes Admin Requirements: The following software must be installed on the system where the Kubernetes environment is managed, that is, where kubectl and kots will be installed:
The following recommendations will assist in reducing the cost of a cloud based Kubernetes Wallaroo cluster.
Turn off the cluster when not in use. An AWS EKS (Elastic Kubernetes Service) cluster can be turned off when not in use, then turned back on again when needed. If organizations adopt this process, be aware of the following issues:
IP Address Reassignment: The load balancer public IP address may be reassigned when the cluster is restarted by the cloud service unless a static IP address is assigned. For more information in Amazon Web Services see the Associate Elastic IP addresses with resources in your VPC user guide.
Assign to a Single Availability Zone: Clusters that span multiple availability zones may have issues accessing persistent volumes that were provisioned in another availability zone from the node when the node is restarted. The simple solution is to assign the entire cluster into a single availability zone. For more information in Amazon Web Services see the Regions and Zones guide.
The scripts and configuration files that create the AWS environment for a Wallaroo instance are based on a single availability zone. Modify the script as required for your organization.
Community Cluster Setup Instructions
The following is based on the requirements for Wallaroo Community. Note that Wallaroo Community does not use adaptive nodepools. Adapt the settings as required for your organization’s needs, as long as they meet the prerequisites listed above.
IMPORTANT NOTE
This sample YAML file can be downloaded from here:
eksctl: A command line tool for managing Amazon EKS clusters from a configured YAML file. See the EKSCTL Install Guide for more details.
kubectl: This interfaces with the Kubernetes server created in the Wallaroo environment.
kots Version: Used to manage software installed in a Kubernetes environment.
Create the Cluster
Create the cluster with the following command, which creates the environment and sets the correct Kubernetes version.
eksctl create cluster -f aws.yaml
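The sample aws.yaml is not reproduced in this guide. As a rough sketch only, a configuration meeting the prerequisites above might look like the file written below. The file name aws-sketch.yaml, the cluster name, the node group name, and the availability zones are all assumptions; the region and Kubernetes version simply mirror the kubectl get nodes example in this section.

```shell
# Write an illustrative eksctl ClusterConfig to a separate file so it
# does not overwrite a downloaded aws.yaml. All values are assumptions.
cat > aws-sketch.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: wallaroo-ce        # hypothetical cluster name
  region: us-east-2        # region seen in the sample node names
  version: "1.23"          # matches the v1.23.x nodes in the sample output

# EKS needs subnets in at least two zones for the control plane.
availabilityZones:
  - us-east-2a
  - us-east-2b

nodeGroups:
  - name: mainpool         # hypothetical node group name
    instanceType: c5.4xlarge
    desiredCapacity: 4     # minimum of 4 nodes
    volumeSize: 50         # GB of storage per node
    containerRuntime: containerd
    availabilityZones:     # keep the worker nodes in a single zone
      - us-east-2a
EOF
```

To try it, pass -f aws-sketch.yaml to eksctl create cluster in place of aws.yaml, after adjusting the region and zones for your account.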
During the process the Kubernetes credentials will be copied into the local environment. To verify the setup is complete, use the kubectl get nodes command to display the available nodes as in the following example:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-192-168-21-253.us-east-2.compute.internal Ready <none> 13m v1.23.8-eks-9017834
ip-192-168-30-36.us-east-2.compute.internal Ready <none> 13m v1.23.8-eks-9017834
ip-192-168-55-123.us-east-2.compute.internal Ready <none> 12m v1.23.8-eks-9017834
ip-192-168-79-70.us-east-2.compute.internal Ready <none> 13m v1.23.8-eks-9017834
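If you prefer to check the node count in a script instead of reading the table, one possible approach is to count the rows whose STATUS column is Ready. The helper name count_ready_nodes is hypothetical, and it assumes the default column layout shown above.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print
# how many nodes have a STATUS of exactly "Ready" (the header row is skipped).
count_ready_nodes() {
  awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}
```

Piping kubectl get nodes into count_ready_nodes should print at least 4 for a cluster that meets the minimum node requirement.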
Wallaroo can now be installed.
Install Wallaroo
With your environment ready, it’s time to install Wallaroo.
How to set up your Wallaroo Community Azure Environment
The following instructions assist users in setting up their Microsoft Azure Kubernetes environment for running Wallaroo Community. These represent a recommended setup, but can be modified to fit your specific needs.
The Kubernetes cluster must include the following minimum settings:
Nodes must run Linux and use the containerd driver.
Role-based access control (RBAC) must be enabled.
Minimum of 4 nodes, each node with a minimum of 8 CPU cores and 16 GB RAM. 50 GB will be allocated per node for a total of 625 GB for the entire cluster.
Minimum machine type is set to Standard_D8s_v4.
The following recommendations will assist in reducing the cost of a cloud based Kubernetes Wallaroo cluster.
Turn off the cluster when not in use. An Azure Kubernetes Service (AKS) cluster can be turned off when not in use, then turned back on again when needed to save on costs. For more information on starting and stopping an AKS cluster, see the Stop and Start an Azure Kubernetes Service (AKS) cluster guide.
If organizations adopt this process, be aware of the following issues:
Assign to a Single Availability Zone: Clusters that span multiple availability zones may have issues accessing persistent volumes that were provisioned in another availability zone from the node when the cluster is restarted. The simple solution is to assign the entire cluster into a single availability zone. For more information in Microsoft Azure see the Create an Azure Kubernetes Service (AKS) cluster that uses availability zones guide.
The scripts and configuration files that create the Azure environment for a Wallaroo instance are based on a single availability zone. Modify the script as required for your organization.
The Azure Resource Group used for the Kubernetes environment.
WALLAROO_GROUP_LOCATION (default: eastus): The region that the Kubernetes environment will be installed to.
WALLAROO_CONTAINER_REGISTRY (default: wallarooceacr): The Azure Container Registry used for the Kubernetes environment.
WALLAROO_CLUSTER (default: wallarooceaks): The name of the Kubernetes cluster that Wallaroo is installed to.
WALLAROO_SKU_TYPE (default: Basic): The Azure Kubernetes Service SKU type.
WALLAROO_NODEPOOL (default: wallaroocepool): The main nodepool for the Kubernetes cluster.
WALLAROO_VM_SIZE (default: Standard_D8s_v4): The VM type used for the standard Wallaroo cluster nodes.
WALLAROO_CLUSTER_SIZE (default: 4): The number of nodes in the cluster.
Quick Setup Script
The following sample script creates an Azure Kubernetes environment ready for use with Wallaroo Community. This script requires the prerequisites listed above.
Modify the installation file to fit your organization. The only parts that require modification are the variables listed in the beginning as follows:
The following steps are geared towards a standard Linux or macOS system that supports the prerequisites listed above. Modify these steps based on your local environment.
Download the script above.
In a terminal window, make the script executable with the command chmod +x wallaroo_community_azure_install.bash.
Modify the script variables listed above based on your requirements.
Run the script with either bash wallaroo_community_azure_install.bash or ./wallaroo_community_azure_install.bash from the same directory as the script.
Manual Setup Guide
The following steps are guidelines to assist new users in setting up their Azure environment for Wallaroo. Feel free to replace these commands with ones that match your needs.
The following are used for the example commands below. Replace them with your specific environment settings:
Azure Resource Group: wallarooCEGroup
Azure Resource Group Location: eastus
Azure Container Registry: wallarooCEAcr
Azure Kubernetes Cluster: wallarooCEAKS
Azure Container SKU type: Basic
Azure Nodepool Name: wallarooCEPool
Setting up an Azure AKS environment is based on the Azure Kubernetes Service tutorial, streamlined to show the minimum steps in setting up your own Wallaroo environment in Azure.
Manual Guide
This follows these major steps:
Create an Azure Resource Group
Create an Azure Container Registry
Create the Azure Kubernetes Environment
Set Variables
The following are the variables used in the environment setup process. Modify them as best fits your organization’s needs.
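A sketch of how these variables could be set in a shell session is shown here. The values come from the defaults and example settings listed in this guide, with two caveats: the resource group name uses the wallarooCEGroup example, and the node pool is set to mainpool (the name visible in the sample kubectl get nodes output) because AKS limits Linux node pool names to 12 lowercase alphanumeric characters.

```shell
# Illustrative values only; change them to fit your organization.
WALLAROO_RESOURCE_GROUP=wallarooCEGroup      # Azure Resource Group
WALLAROO_GROUP_LOCATION=eastus               # region for the environment
WALLAROO_CONTAINER_REGISTRY=wallarooceacr    # must be unique across Azure
WALLAROO_CLUSTER=wallarooceaks               # Kubernetes cluster name
WALLAROO_SKU_TYPE=Basic                      # SKU type
WALLAROO_NODEPOOL=mainpool                   # assumption: shortened to fit the 12-character limit
WALLAROO_VM_SIZE=Standard_D8s_v4             # VM type per node
WALLAROO_CLUSTER_SIZE=4                      # number of nodes
```

Because container registry names are shared across all Azure customers, the default registry name will likely need to be changed to something unique.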
To create an Azure Resource Group for Wallaroo in Microsoft Azure, use the following template:
az group create --name $WALLAROO_RESOURCE_GROUP --location $WALLAROO_GROUP_LOCATION
(Optional): Set the default Resource Group to the one recently created. This allows other Azure commands to automatically select this group for commands such as az aks list, etc.
az configure --defaults group=$WALLAROO_RESOURCE_GROUP
Create an Azure Container Registry
An Azure Container Registry (ACR) manages the container images for services including Kubernetes. The template for setting up an ACR that supports Wallaroo is the following:
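One hedged way to express such a template is shown below, wrapped in a function so that nothing runs until it is called. The function name create_wallaroo_registry is hypothetical; the flags are standard az acr create options, and the fallback values mirror the defaults used elsewhere in this guide.

```shell
# Fall back to the guide's example values when the variables are unset.
: "${WALLAROO_RESOURCE_GROUP:=wallarooCEGroup}"
: "${WALLAROO_GROUP_LOCATION:=eastus}"
: "${WALLAROO_CONTAINER_REGISTRY:=wallarooceacr}"
: "${WALLAROO_SKU_TYPE:=Basic}"

# Hypothetical wrapper: create the container registry in the resource group.
create_wallaroo_registry() {
  az acr create \
    --resource-group "$WALLAROO_RESOURCE_GROUP" \
    --name "$WALLAROO_CONTAINER_REGISTRY" \
    --sku "$WALLAROO_SKU_TYPE" \
    --location "$WALLAROO_GROUP_LOCATION"
}
```

Call create_wallaroo_registry after signing in with az login and creating the resource group.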
Next, create the Kubernetes service in Azure that will host Wallaroo and meets the prerequisites. Modify the settings to meet your organization’s needs. This creates a 4 node cluster with a total of 32 cores.
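The cluster creation step might be sketched as follows. The function name create_wallaroo_cluster is hypothetical, and the flag selection is an assumption about what a minimal setup needs; the flags themselves are standard az aks create options. The node pool default is mainpool, matching the sample node names in this guide and the 12-character limit AKS places on Linux node pool names.

```shell
# Fall back to the guide's example values when the variables are unset.
: "${WALLAROO_RESOURCE_GROUP:=wallarooCEGroup}"
: "${WALLAROO_CONTAINER_REGISTRY:=wallarooceacr}"
: "${WALLAROO_CLUSTER:=wallarooceaks}"
: "${WALLAROO_NODEPOOL:=mainpool}"
: "${WALLAROO_VM_SIZE:=Standard_D8s_v4}"
: "${WALLAROO_CLUSTER_SIZE:=4}"

# Hypothetical wrapper: 4 nodes x 8 cores each gives the 32 core cluster.
create_wallaroo_cluster() {
  az aks create \
    --resource-group "$WALLAROO_RESOURCE_GROUP" \
    --name "$WALLAROO_CLUSTER" \
    --node-count "$WALLAROO_CLUSTER_SIZE" \
    --node-vm-size "$WALLAROO_VM_SIZE" \
    --nodepool-name "$WALLAROO_NODEPOOL" \
    --attach-acr "$WALLAROO_CONTAINER_REGISTRY" \
    --generate-ssh-keys
}
```

Attaching the registry with --attach-acr requires sufficient permissions on the subscription, so an account with limited rights may need to drop that flag and grant access separately.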
Once the Kubernetes environment is complete, associate it with the local Kubernetes configuration by importing the credentials through the following template command:
az aks get-credentials --resource-group $WALLAROO_RESOURCE_GROUP --name $WALLAROO_CLUSTER
Verify the cluster is available through the kubectl get nodes command.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
aks-mainpool-37402055-vmss000000 Ready agent 81m v1.23.8
aks-mainpool-37402055-vmss000001 Ready agent 81m v1.23.8
aks-mainpool-37402055-vmss000002 Ready agent 81m v1.23.8
aks-mainpool-37402055-vmss000003 Ready agent 81m v1.23.8
Install Wallaroo
With your environment ready, it’s time to install Wallaroo.
How to set up your Wallaroo Community GCP Environment
The following instructions assist users in setting up their Google Cloud Platform (GCP) Kubernetes environment for running Wallaroo. These represent a recommended setup, but can be modified to fit your specific needs. In particular, these instructions will provision a GKE cluster with 32 CPUs in total. Please ensure that your project’s resource limits support that.
Quick Setup Guide: Download a bash script to automatically set up the GCP environment through the Google Cloud Platform command line interface gcloud.
Manual Setup Guide: A list of the gcloud commands used to create the environment through manual commands.
GCP Prerequisites
Organizations that wish to run Wallaroo in their Google Cloud Platform environment must complete the following prerequisites:
The following recommendations will assist in reducing the cost of a cloud based Kubernetes Wallaroo cluster.
Turn off the cluster when not in use. A GCP Google Kubernetes Engine (GKE) cluster can be turned off when not in use, then turned back on again when needed. If organizations adopt this process, be aware of the following issues:
IP Address Reassignment: The load balancer public IP address may be reassigned when the cluster is restarted by the cloud service unless a static IP address is assigned. For more information in Google Cloud Platform see the Configuring domain names with static IP addresses user guide.
Assign to a Single Availability Zone: Clusters that span multiple availability zones may have issues accessing persistent volumes that were provisioned in another availability zone from the node when the cluster is restarted. The simple solution is to assign the entire cluster into a single availability zone. For more information in Google Cloud Platform see the Regions and zones guide.
The scripts and configuration files that create the GCP environment for a Wallaroo instance are based on a single availability zone. Modify the script as required for your organization.
The name of the Google Project used for the Wallaroo instance.
WALLAROO_CLUSTER (default: wallaroo-ce): The name of the Kubernetes cluster for the Wallaroo instance.
WALLAROO_GCP_REGION (default: us-central1): The region the Kubernetes environment is installed to. Update this to your GCP Compute Engine region.
WALLAROO_NODE_LOCATION (default: us-central1-f): The location the Kubernetes nodes are installed to. Update this to your GCP Compute Engine zone.
WALLAROO_GCP_NETWORK_NAME (default: wallaroo-network): The Google network used with the Kubernetes environment.
WALLAROO_GCP_SUBNETWORK_NAME (default: wallaroo-subnet-1): The Google network subnet used with the Kubernetes environment.
WALLAROO_GCP_MACHINE_TYPE (default: e2-standard-8): Recommended VM size per GCP node.
WALLAROO_CLUSTER_SIZE (default: 4): Number of nodes installed into the cluster. 4 nodes will create a 32 core cluster.
Quick Setup Script
A sample script is available here, and creates a Google Kubernetes Engine cluster ready for use with Wallaroo Community. This script requires the prerequisites listed above, and uses the variables as listed in Standard Setup Variables.
The following steps are geared towards a standard Linux or macOS system that supports the prerequisites listed above. Modify these steps based on your local environment.
Download the script above.
In a terminal window, make the script executable with the command chmod +x wallaroo_community_gcp_install.bash.
Modify the script variables listed above based on your requirements.
Run the script with either bash wallaroo_community_gcp_install.bash or ./wallaroo_community_gcp_install.bash from the same directory as the script.
Manual Setup Guide
The following steps are guidelines to assist new users in setting up their GCP environment for Wallaroo. Feel free to replace these commands with ones that match your needs.
See the Google Cloud SDK for full details on commands and settings.
The commands below are set to meet the prerequisites listed above, and use the variables as listed in Standard Setup Variables. Modify them as best fits your organization’s needs.
Set Variables
The following are the variables used in the environment setup process. Modify them as best fits your organization’s needs.
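As an illustration, the variables with known defaults could be set in a shell session like this. The Google project variable is left out because its name and default are not shown in this guide; gcloud falls back to the project in your active configuration.

```shell
# Illustrative values taken from the defaults listed in this guide.
WALLAROO_CLUSTER=wallaroo-ce                    # Kubernetes cluster name
WALLAROO_GCP_REGION=us-central1                 # Compute Engine region
WALLAROO_NODE_LOCATION=us-central1-f            # Compute Engine zone for the nodes
WALLAROO_GCP_NETWORK_NAME=wallaroo-network      # network for the cluster
WALLAROO_GCP_SUBNETWORK_NAME=wallaroo-subnet-1  # subnet for the cluster
WALLAROO_GCP_MACHINE_TYPE=e2-standard-8         # VM size per node
WALLAROO_CLUSTER_SIZE=4                         # 4 nodes x 8 cores = 32 cores
```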
First create a GCP network that is used to connect to the cluster with the gcloud compute networks create command. For more information, see the gcloud compute networks create page.
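A possible shape for the network step is sketched below. The function name, the choice of a custom-mode network with an explicit subnet, and the 10.0.0.0/24 address range are assumptions made for illustration; the gcloud subcommands and flags are standard.

```shell
# Fall back to the guide's default values when the variables are unset.
: "${WALLAROO_GCP_REGION:=us-central1}"
: "${WALLAROO_GCP_NETWORK_NAME:=wallaroo-network}"
: "${WALLAROO_GCP_SUBNETWORK_NAME:=wallaroo-subnet-1}"
# Assumption: hypothetical variable and range, not part of the original guide.
: "${WALLAROO_GCP_SUBNET_RANGE:=10.0.0.0/24}"

# Hypothetical wrapper: create the network, then a subnet inside it.
create_wallaroo_network() {
  gcloud compute networks create "$WALLAROO_GCP_NETWORK_NAME" \
    --subnet-mode=custom &&
  gcloud compute networks subnets create "$WALLAROO_GCP_SUBNETWORK_NAME" \
    --network="$WALLAROO_GCP_NETWORK_NAME" \
    --region="$WALLAROO_GCP_REGION" \
    --range="$WALLAROO_GCP_SUBNET_RANGE"
}
```

Pick a subnet range that does not overlap with other networks your organization peers with.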
Once the network is created, the gcloud container clusters create command is used to create a cluster. For more information see the gcloud container clusters create page.
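The cluster creation call could look roughly like the following. The function name is hypothetical and the flag set is an assumption about a minimal setup; each flag is a standard gcloud container clusters create option. Because the node locations are restricted to a single zone, the node count of 4 yields 4 nodes in total.

```shell
# Fall back to the guide's default values when the variables are unset.
: "${WALLAROO_CLUSTER:=wallaroo-ce}"
: "${WALLAROO_GCP_REGION:=us-central1}"
: "${WALLAROO_NODE_LOCATION:=us-central1-f}"
: "${WALLAROO_GCP_NETWORK_NAME:=wallaroo-network}"
: "${WALLAROO_GCP_SUBNETWORK_NAME:=wallaroo-subnet-1}"
: "${WALLAROO_GCP_MACHINE_TYPE:=e2-standard-8}"
: "${WALLAROO_CLUSTER_SIZE:=4}"

# Hypothetical wrapper: create the GKE cluster on the network made earlier.
create_wallaroo_gke_cluster() {
  gcloud container clusters create "$WALLAROO_CLUSTER" \
    --region "$WALLAROO_GCP_REGION" \
    --node-locations "$WALLAROO_NODE_LOCATION" \
    --machine-type "$WALLAROO_GCP_MACHINE_TYPE" \
    --num-nodes "$WALLAROO_CLUSTER_SIZE" \
    --network "$WALLAROO_GCP_NETWORK_NAME" \
    --subnetwork "$WALLAROO_GCP_SUBNETWORK_NAME"
}
```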
The command can take several minutes to complete based on the size and complexity of the clusters. Verify the process is complete with the clusters list command:
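A small sketch of that check is shown here. The function name wallaroo_cluster_status is hypothetical; --filter and --format are general gcloud list options, used here to print only the status of the named cluster.

```shell
# Fall back to the guide's default cluster name when the variable is unset.
: "${WALLAROO_CLUSTER:=wallaroo-ce}"

# Hypothetical wrapper: print the status field of the Wallaroo cluster.
wallaroo_cluster_status() {
  gcloud container clusters list \
    --filter="name=$WALLAROO_CLUSTER" \
    --format="value(status)"
}
```

A status of RUNNING indicates the cluster has finished provisioning.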
To verify the Kubernetes credentials for your cluster have been installed locally, use the kubectl get nodes command. This will display the nodes in the cluster as demonstrated below:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-wallaroo-ce-default-pool-863f02db-7xd4 Ready <none> 39m v1.21.6-gke.1503
gke-wallaroo-ce-default-pool-863f02db-8j2d Ready <none> 39m v1.21.6-gke.1503
gke-wallaroo-ce-default-pool-863f02db-hn06 Ready <none> 39m v1.21.6-gke.1503
gke-wallaroo-ce-default-pool-3946eaca-4l3s Ready <none> 39m v1.21.6-gke.1503
Install Wallaroo
With your environment ready, it’s time to install Wallaroo.
What does the error ‘Insufficient project quota to satisfy request: resource “CPUS_ALL_REGIONS”’ mean?
Make sure that the Compute Engine Zone and Region are properly set based on your organization’s requirements. The instructions above default to us-central1, so change that zone to install your Wallaroo instance in the correct location.
In the case of the script, this would mean changing the region and location from: