Inference on IBM Power10 Architecture

How to deploy AI/ML models with IBM Power10 Processors

AI/ML models can be deployed to centralized Wallaroo Ops instances and edge devices on a variety of infrastructures and processors. The CPU architecture is set during the model upload and packaging stage.

Models specified with the Power10 architecture during upload and automated model packaging can be deployed to Wallaroo Ops instances or multicloud edge deployments.

Power10 Support

For details on using Power10 with Wallaroo and setting up a demonstration, see the tutorials listed at the end of this guide.

Model Packaging and Deployments Prerequisites for Power10

To upload and package a model for Wallaroo Ops or multicloud edge deployments, the following prerequisites must be met.

  • Wallaroo Ops
    • At least one Power10 node deployed in the cluster.
  • Edge Devices
    • Enable Edge Registry Services in the Wallaroo instance to publish the pipeline to an OCI (Open Container Initiative) registry for edge deployments.
    • Power10 processor support for the edge device.

AI Workloads for Power10 via the Wallaroo SDK

The Wallaroo SDK provides Power10 support for models uploaded for Wallaroo Ops or multicloud edge deployments.

Upload Models for Power10 via the Wallaroo SDK

Models are uploaded to Wallaroo via the wallaroo.client.upload_model method. The processor architecture is set with the optional arch parameter, which accepts a wallaroo.engine_config.Architecture value.

wallaroo.client.upload_model has the following parameters. For more details on model uploads, see Automated Model Packaging.

  • name: string (Required). The name of the model. Model names are unique per workspace. Models uploaded with the same name are assigned as a new version of the model.
  • path: string (Required). The path to the model file being uploaded.
  • framework: string (Required). The framework of the model from wallaroo.framework.
  • input_schema: pyarrow.lib.Schema (Optional for native Wallaroo runtimes; Required for non-native Wallaroo runtimes). The input schema in Apache Arrow schema format (see the schema sketch after this list).
  • output_schema: pyarrow.lib.Schema (Optional for native Wallaroo runtimes; Required for non-native Wallaroo runtimes). The output schema in Apache Arrow schema format.
  • convert_wait: bool (Optional).
    • True: Waits in the script for the model conversion to complete.
    • False: Proceeds with the script without waiting for the model conversion to complete.
  • arch: wallaroo.engine_config.Architecture (Optional). The architecture the model is deployed to. If the model is intended for deployment to an architecture other than X86, it must be specified during this step. Set to Power10 for Power10-based architectures.
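
For non-native Wallaroo runtimes, the input and output schemas are defined with the pyarrow library. The following is a minimal sketch of defining input_schema and output_schema for a hypothetical model that takes a single tensor of ten floats and returns a single float; the field names, types, and shapes are assumptions and must match the actual model.

import pyarrow as pa

# illustrative schemas for a non-native runtime upload; field names and
# tensor shapes are placeholders that depend on the model being uploaded
input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32(), list_size=10))
])

output_schema = pa.schema([
    pa.field('outputs', pa.float32())
])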

Upload Model for Power10 Architecture Example

The following demonstrates uploading a model for deployment on the Power10 Architecture.

import wallaroo

# set the Wallaroo client
wl = wallaroo.Client()

# upload the model and save the reference to a variable
power10_model = wl.upload_model(
    name="sample_model",
    path=model_file_path,
    framework=framework, # the wallaroo.framework.Framework
    input_schema = input_schema, # input schema in PyArrow Schema format
    output_schema = output_schema, # output schema in PyArrow Schema format
    arch = wallaroo.engine_config.Architecture.Power10
)

Deploy Models for Power10 via the Wallaroo SDK

Models are added to a pipeline as pipeline steps. The pipeline is then deployed through the wallaroo.pipeline.Pipeline.deploy(deployment_config: Optional[wallaroo.deployment_config.DeploymentConfig] = None) method.

For full details, see Pipeline Deployment Configuration.

When deploying a model in a Wallaroo Ops instance, the deployment configuration inherits the model's architecture setting. No additional changes are needed to set the architecture when deploying the model. Other settings, such as the number of CPUs, can be changed without modifying the architecture setting.

To change the architecture settings for model deployment, models should be re-uploaded as either a new model or a new model version for maximum compatibility with the hardware infrastructure. For more information on uploading models or new model versions, see Upload Models for Power10 via the Wallaroo SDK.
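
The deployment configuration itself is typically built with wallaroo.DeploymentConfigBuilder. The following is a minimal sketch of creating the deployment_config used in the next example; the replica_count, cpus, and memory settings shown here are illustrative values and do not affect the architecture setting inherited from the model.

# build a sample deployment configuration; the resource values are illustrative
deployment_config = wallaroo.DeploymentConfigBuilder() \
    .replica_count(1) \
    .cpus(1) \
    .memory("1Gi") \
    .build()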

The following demonstrates deploying a generic AI/ML model with the architecture set to Power10. For this example, the model is deployed with a pre-determined deployment configuration saved to deployment_config.

# create the pipeline
pipeline = wl.build_pipeline("sample_pipeline")

# set the pipeline model step as the model set to the Power10 architecture
pipeline.add_model_step(power10_model)

# deploy the pipeline with the deployment configuration
pipeline.deploy(deployment_config)
  
name: sample_pipeline
created: 2024-03-05 16:18:38.768602+00:00
last_updated: 2024-04-03 21:46:21.865211+00:00
deployed: True
arch: Power10
accel: none
tags:
versions: d033152c-494c-44a6-8981-627c6b6ad72e
steps: sample_model
published: False

Publish Pipeline for Power10 via the Wallaroo SDK

Publishing the pipeline uses the method wallaroo.pipeline.Pipeline.publish(deployment_config: Optional[wallaroo.deployment_config.DeploymentConfig]).

This requires that the Wallaroo Ops instance have Edge Registry Services enabled.

A deployment configuration must be included with the pipeline publish, even if no changes are made to the CPUs, memory, or other settings. For more detail on deployment configurations, see Pipeline Deployment Configuration.

The deployment configuration for the pipeline publish inherits the model's architecture. Options such as the number of CPUs and the amount of memory can be adjusted without impacting the model's architecture setting.
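
For example, a publish-time deployment configuration that adjusts CPUs and memory while leaving the inherited architecture untouched might be built as in the following sketch; the resource values are illustrative, and the example that follows later uses the default configuration instead.

# adjust resources for the published pipeline; the architecture is still
# inherited from the model, so only CPU and memory are changed here
adjusted_config = wallaroo.DeploymentConfigBuilder() \
    .cpus(2) \
    .memory("1Gi") \
    .build()

publish = pipeline.publish(deployment_config=adjusted_config)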

Pipelines do not need to be deployed in the Wallaroo Ops instance before publishing the pipeline.

For more information, see Wallaroo SDK Essentials Guide: Pipeline Edge Publication.

The following demonstrates publishing the pipeline containing the generic model uploaded earlier, using a default deployment configuration.

# default deployment configuration
publish = pipeline.publish(deployment_config=wallaroo.DeploymentConfigBuilder().build())
display(publish)

Waiting for pipeline publish... It may take up to 600 sec.
Pipeline is publishing................ Published.
ID: 15
Pipeline Name: sample_pipeline
Pipeline Version: d033152c-494c-44a6-8981-627c6b6ad72e
Status: Published
Engine URL: registry.example.com/uat/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini-ppc64le:v2024.4.0-main
Pipeline URL: registry.example.com/uat/pipelines/sample_pipeline:d033152c-494c-44a6-8981-627c6b6ad72e
Helm Chart URL: oci://registry.example.com/uat/charts/sample_pipeline
Helm Chart Reference: registry.example.com/uat/charts@sha256:7e2a314d9024cc2529be3e902eb24ac241f1e0819fc07e47bf26dd2e6e64f183
Helm Chart Version: 0.0.1-d033152c-494c-44a6-8981-627c6b6ad72e
Engine Config: {'engine': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}, 'accel': 'none', 'arch': 'power10', 'gpu': False}}, 'engineAux': {'autoscale': {'type': 'none'}, 'images': {}}}
User Images: []
Created By: john.hummel@wallaroo.ai
Created At: 2024-04-03 21:50:14.306316+00:00
Updated At: 2024-04-03 21:50:14.306316+00:00
Replaces:
Docker Run Command
docker run \
    -p $EDGE_PORT:8080 \
    -e OCI_USERNAME=$OCI_USERNAME \
    -e OCI_PASSWORD=$OCI_PASSWORD \
    -e PIPELINE_URL=registry.example.com/uat/pipelines/sample_pipeline:d033152c-494c-44a6-8981-627c6b6ad72e \
    -e CONFIG_CPUS=1 registry.example.com/uat/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini-ppc64le:v2024.4.0-main

Note: Please set the EDGE_PORT, OCI_USERNAME, and OCI_PASSWORD environment variables.
Helm Install Command
helm install --atomic $HELM_INSTALL_NAME \
    oci://registry.example.com/uat/charts/sample_pipeline \
    --namespace $HELM_INSTALL_NAMESPACE \
    --version 0.0.1-d033152c-494c-44a6-8981-627c6b6ad72e \
    --set ociRegistry.username=$OCI_USERNAME \
    --set ociRegistry.password=$OCI_PASSWORD

Note: Please set the HELM_INSTALL_NAME, HELM_INSTALL_NAMESPACE, OCI_USERNAME, and OCI_PASSWORD environment variables.

Tutorials

The following tutorials demonstrate deploying models on the Power10 architecture.