Create CUDA GPU Nodepools for OpenShift Air-Gap Clusters

How to create CUDA GPU nodepools for OpenShift clusters.

Wallaroo provides support for ML models that use GPUs. The following templates demonstrate how to create a nodepool in different cloud providers, then assign that nodepool to an existing cluster. These steps can be used in conjunction with Wallaroo Enterprise Install Guides.

| Nodepool | Taints | Labels | Description |
|---|---|---|---|
| default | N/A | wallaroo.ai/node-purpose: general | For general Wallaroo services. No taints are applied to this nodepool, allowing any process not assigned a deployment label to run in this space. |
| persistent | wallaroo.ai/persistent=true:NoSchedule | wallaroo.ai/node-purpose: persistent | For Wallaroo services with persistentVolume settings, including JupyterHub, Minio, etc. |
| pipelines-x86 | wallaroo.ai/pipelines=true:NoSchedule | wallaroo.ai/node-purpose: pipelines | For deploying pipelines on default x86 architectures. These taints and labels must be applied to any nodepool used for model deployments. |
| {custom} | wallaroo.ai/pipelines=true:NoSchedule | wallaroo.ai/node-purpose: pipelines, plus a custom label, for example: wallaroo/gpu:true | Custom-named nodepools used to access non-default architectures (GPU, ARM, etc.). |

Specific nodepool names may differ based on your cloud provider's naming requirements; check with your cloud services provider and adjust as needed.
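
Once nodepools exist, standard Kubernetes tooling can confirm that the expected labels and taints from the table above are in place. The following is a minimal verification sketch using kubectl:

    # List nodes with their wallaroo.ai/node-purpose label values.
    kubectl get nodes -L wallaroo.ai/node-purpose

    # Print each node's name and taints, one node per line.
    kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'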

Generic OpenShift Add Nodes Procedure

New worker nodes are added to an existing OpenShift cluster using the standard OpenShift procedure. For full details, see the OpenShift guide Adding worker nodes to an on-premise cluster.
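
At a high level, that procedure boots the new host with the worker ignition configuration, after which the node's certificate signing requests (CSRs) must be approved before it joins the cluster. The following is a sketch of the approval step using standard oc commands; consult the OpenShift guide linked above for the authoritative procedure.

    # List pending certificate signing requests from the new worker.
    oc get csr

    # Approve pending CSRs. Approval typically happens in two rounds:
    # the client CSR first, then the serving CSR that appears afterward.
    oc get csr -o name | xargs oc adm certificate approve

    # Confirm the new node registers and reaches Ready status.
    oc get nodes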

Add CUDA Nodepool to Cloud Environment Examples

The following examples detail adding new nodepools with CUDA-compatible GPUs.

Add Nodepool to IBM Cloud OpenShift Cluster Procedure

The following procedure adds a new nodepool named pipelines-l4 to an existing cluster using the ibmcloud command line interface (CLI) with the gx3.16x80.l4 VPC flavor, which has the following specifications:

  • 16 cores
  • 80GB memory
  • 32Gbps network speed
  • 1 L4 GPU

For additional flavors, see VPC flavors. Modify as needed for the organization's requirements.
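
To compare the worker flavors available in a zone before committing to one, the ibmcloud CLI provides a flavors listing. A minimal sketch, assuming the us-south-1 zone used in the example below:

    # List available worker flavors for VPC Gen 2 clusters in a zone;
    # GPU flavors (such as the gx3 family) appear alongside CPU-only flavors.
    ibmcloud oc flavors --zone us-south-1 --provider vpc-gen2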

Install Software Requirements

The following software or runtimes are required for Wallaroo 2025.1. Most are automatically available through the supported cloud providers.

| Software or Runtime | Description | Minimum Supported Version | Preferred Version(s) |
|---|---|---|---|
| OpenShift | Container Platform | 4.17 | 4.18 |
| Kubernetes | Cluster deployment management | 1.29, with Container Management set to containerd | 1.31 |
| kubectl | Kubernetes administrative console application | 1.31 | 1.31 |
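
Once the cluster is available, the installed versions can be checked against the table above with the standard clients:

    # Report the client, server, and OpenShift versions.
    oc version

    # Report the kubectl client version.
    kubectl version --client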

Create the OpenShift IBM Cloud CUDA Nodepool

The following steps create the nodepool with CUDA hardware using the ibmcloud CLI.

  1. Retrieve the following information:

    • VPC_ID: The ID of the IBM Cloud® Virtual Private Cloud. Retrieved with ibmcloud oc vpcs. For example:
      • VPC_ID="r006-cac4bfbe-d04d-481a-a099-ba243ea64afd"
    • SUBNET_ID_1: The subnet ID used. This is retrieved with the following command:
      • ibmcloud oc subnets --provider vpc-gen2 --vpc-id <your-vpc-id> --zone <your-zone-name>
  2. Set the environment variables. Modify as needed to match the target cluster. The variable WORKER_POOL_NAME sets the name of the new nodepool; this example uses pipelines-l4.

    # Set the environment variables
    CLUSTER_NAME="samplecluster"
    ZONE_1="us-south-1"
    SUBNET_ID_1="0717-6f46918e-2107-48ae-b023-eb053601697b"
    L4_GPU_FLAVOR="gx3.16x80.l4"
    # The name of the new worker pool (nodepool).
    WORKER_POOL_NAME="pipelines-l4"
    # The accelerator label value applied to the nodepool.
    ACCELERATION_LABEL="l4"
    

    The following command retrieves zone details, including the subnet ID.

    ibmcloud oc subnets --provider vpc-gen2 --vpc-id <your-vpc-id> --zone <your-zone-name>
    
  3. Create the nodepool and add it to the target cluster with the standard and custom labels.

    # Create the nodepool and add it to the cluster.
    ibmcloud oc worker-pool create vpc-gen2 --cluster "$CLUSTER_NAME" --name "$WORKER_POOL_NAME" --flavor "$L4_GPU_FLAVOR" --size-per-zone 1 \
        --label wallaroo.ai/node-purpose=pipelines \
        --label wallaroo.ai/accelerator=$ACCELERATION_LABEL
    
  4. Add the nodepool to the zone and subnet.

    # add to the zones and subnet
    ibmcloud oc zone add vpc-gen2 --zone "$ZONE_1" --subnet-id "$SUBNET_ID_1" --cluster "$CLUSTER_NAME" --worker-pool "$WORKER_POOL_NAME"
    
  5. Wait for the nodepool's worker nodes to be ready.

    # wait for the nodepool to be ready
    echo "Waiting for nodes in '$WORKER_POOL_NAME' pool to be ready before tainting..."
    kubectl wait --for=condition=Ready node -l ibm-cloud.kubernetes.io/worker-pool-name=$WORKER_POOL_NAME --timeout=15m
    
  6. Add the standard taints. If non-standard taints are used, modify as needed. A verification sketch follows this procedure.

    # apply the taints
    echo "Tainting '$WORKER_POOL_NAME' pool nodes..."
    kubectl taint nodes -l ibm-cloud.kubernetes.io/worker-pool-name=$WORKER_POOL_NAME wallaroo.ai/pipelines=true:NoSchedule --overwrite
    kubectl taint nodes -l ibm-cloud.kubernetes.io/worker-pool-name=$WORKER_POOL_NAME nvidia.com/gpu=$ACCELERATION_LABEL:NoSchedule --overwrite
    

    IMPORTANT NOTE: Verify that the label is communicated to developers for model deployment. Labels are required for deploying models in Wallaroo with GPUs enabled. For more details, see Deployment Configuration with the Wallaroo SDK: GPU Support.
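
As referenced in step 6, the following sketch verifies the finished nodepool: that the worker pool exists, that its nodes carry the expected labels, and that the taints were applied. It reuses the environment variables set in step 2.

    # Confirm the worker pool exists in the cluster.
    ibmcloud oc worker-pools --cluster "$CLUSTER_NAME"

    # Confirm the nodes carry the pipelines and accelerator labels.
    kubectl get nodes -l wallaroo.ai/node-purpose=pipelines -L wallaroo.ai/accelerator

    # Confirm the taints were applied to the new nodepool's nodes.
    kubectl describe nodes -l ibm-cloud.kubernetes.io/worker-pool-name=$WORKER_POOL_NAME | grep -A2 Taints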

Deployment Tutorials

The following tutorials demonstrate deploying a pipeline with the specified architecture.