Inference on IBM Power10 Architecture
AI/ML models can be deployed in centralized Wallaroo Ops instances and on edge devices across a variety of infrastructures and processors. The CPU infrastructure is set during the model upload and packaging stage.
Models specified with the Power10 architecture during the upload and automated model packaging can be deployed on Wallaroo Ops instances or multicloud deployments.
Power10 Support
For details on using Power10 with Wallaroo and setting up a demonstration:
- Contact your Wallaroo Support Representative OR
- Schedule Your Wallaroo.AI Demo Today
Model Packaging and Deployments Prerequisites for Power10
To upload and package a model for Wallaroo Ops or multicloud edge deployments, the following prerequisites must be met.
- Wallaroo Ops
  - At least one Power10 node deployed in the cluster.
- Edge Devices
  - Enable Edge Registry Services in the Wallaroo instance to publish the pipeline to an OCI (Open Container Initiative) registry for edge deployments.
  - Power10 processor support for the edge device.
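As a rough pre-check on an edge device, the machine type reported by the host can be compared against `ppc64le`, the architecture tag that appears in the Power10 engine image name in the publish output later in this guide. The following is a minimal sketch using only the Python standard library. The helper name `is_ppc64le` is hypothetical, and the check cannot distinguish Power10 from earlier ppc64le processors, so treat it as a first filter rather than a full verification.

```python
import platform


def is_ppc64le(machine=None):
    """Return True when the machine type is ppc64le (little-endian Power).

    `machine` defaults to the value reported by the local host. Passing it
    explicitly makes the helper easy to try on any system.
    """
    if machine is None:
        machine = platform.machine()
    return machine.strip().lower() == "ppc64le"


if __name__ == "__main__":
    # Reports whether the local host identifies itself as ppc64le.
    print(is_ppc64le())
```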
AI Workloads for Power10 via the Wallaroo SDK
The Wallaroo SDK provides Power10 support for models uploaded for Wallaroo Ops or multicloud edge deployments.
Upload Models for Power10 via the Wallaroo SDK
Models are uploaded to Wallaroo via the wallaroo.client.upload_model method. The infrastructure is set with the optional arch parameter, which accepts the wallaroo.engine_config.Architecture object.
wallaroo.client.upload_model has the following parameters. For more details on model uploads, see Automated Model Packaging.
Parameter | Type | Description |
---|---|---|
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
path | string (Required) | The path to the model file being uploaded. |
framework | string (Required) | The framework of the model from wallaroo.framework |
input_schema | pyarrow.lib.Schema | The input schema in Apache Arrow schema format. |
output_schema | pyarrow.lib.Schema | The output schema in Apache Arrow schema format. |
convert_wait | bool (Optional) | |
arch | wallaroo.engine_config.Architecture (Optional) | The architecture the model is deployed to. If a model is intended for deployment to an architecture other than X86, the architecture must be specified during this step. Set to Power10 for Power10-based architectures. |
Upload Model for Power10 Architecture Example
The following demonstrates uploading a model for deployment on the Power10 Architecture.
import wallaroo
# set the Wallaroo client
wl = wallaroo.Client()
# upload the model and save the reference to a variable
power10_model = wl.upload_model(
    name="sample_model",
    path=model_file_path,
    framework=framework,  # the wallaroo.framework.Framework
    input_schema=input_schema,  # input schema in PyArrow Schema format
    output_schema=output_schema,  # output schema in PyArrow Schema format
    arch=wallaroo.engine_config.Architecture.Power10
)
Deploy Models for Power10 via the Wallaroo SDK
Models are added to a pipeline as pipeline steps. Models are then deployed through the wallaroo.pipeline.Pipeline.deploy(deployment_config: Optional[wallaroo.deployment_config.DeploymentConfig] = None) method.
For full details, see Pipeline Deployment Configuration.
When deploying a model in a Wallaroo Ops instance, the deployment configuration inherits the model architecture setting. No additional changes are needed to set the architecture when deploying the model. Other settings, such as the number of CPUs, can be changed without modifying the architecture setting.
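The following is a minimal sketch of adjusting resources while leaving the architecture alone. It assumes the `cpus` and `memory` methods of `wallaroo.DeploymentConfigBuilder`. The helper name `build_resource_config` and the resource values are illustrative only. The builder is passed in as an argument, and the helper never sets an architecture, so the value inherited from the model is left in place.

```python
def build_resource_config(builder, cpus=1, memory="1Gi"):
    """Set CPU and memory on a deployment config builder and build it.

    No architecture is set here: the deployment inherits it from the
    uploaded model.
    """
    return builder.cpus(cpus).memory(memory).build()


# Assumed usage with the Wallaroo SDK installed:
#   import wallaroo
#   deployment_config = build_resource_config(wallaroo.DeploymentConfigBuilder())
```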
To change the architecture settings for model deployment, models should be re-uploaded as either a new model or a new model version for maximum compatibility with the hardware infrastructure. For more information on uploading models or new model versions, see Upload Models for Power10 via the Wallaroo SDK.
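Re-uploading for a different architecture can be sketched as a thin wrapper around `upload_model`. This is an illustration under stated assumptions, not the SDK's own workflow: the helper name `reupload_for_arch` is hypothetical, and `client` is expected to be a Wallaroo client exposing `upload_model` with the parameters listed in the table earlier in this guide.

```python
def reupload_for_arch(client, name, path, framework, arch, **schemas):
    """Upload `path` under `name` with the requested architecture.

    Reusing an existing model name in the same workspace creates a new
    model version, so this covers both the "new model" and the
    "new model version" cases.
    """
    return client.upload_model(
        name=name,
        path=path,
        framework=framework,
        arch=arch,
        **schemas,  # optional input_schema / output_schema
    )


# Assumed usage with the Wallaroo SDK installed:
#   power10_model = reupload_for_arch(
#       wl, "sample_model", model_file_path, framework,
#       wallaroo.engine_config.Architecture.Power10,
#       input_schema=input_schema, output_schema=output_schema)
```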
The following demonstrates deploying a generic AI/ML model with the architecture set to Power10. For this example, the model is deployed with a pre-determined deployment configuration saved to deployment_config.
# create the pipeline
pipeline = wl.build_pipeline("sample_pipeline")
# set the pipeline model step as the model set to the Power10 architecture
pipeline.add_model_step(power10_model)
# deploy the pipeline with the deployment configuration
pipeline.deploy(deployment_config=deployment_config)
name | sample_pipeline |
created | 2024-03-05 16:18:38.768602+00:00 |
last_updated | 2024-04-03 21:46:21.865211+00:00 |
deployed | True |
arch | Power10 |
accel | none |
tags | |
versions | d033152c-494c-44a6-8981-627c6b6ad72e |
steps | sample_model |
published | False |
Publish Pipeline for Power10 via the Wallaroo SDK
Publishing the pipeline uses the method wallaroo.pipeline.publish(deployment_config: Optional[wallaroo.deployment_config.DeploymentConfig]).
This requires that the Wallaroo Ops instance have Edge Registry Services enabled.
A deployment configuration must be included with the pipeline publish, even if no changes are made to settings such as CPUs or memory. For more detail on deployment configurations, see Pipeline Deployment Configuration.
The deployment configuration for the pipeline publish inherits the model’s architecture. Options such as the number of CPUs and the amount of memory can be adjusted without impacting the model’s architecture settings.
Pipelines do not need to be deployed in the Wallaroo Ops instance before publishing the pipeline.
For more information, see Wallaroo SDK Essentials Guide: Pipeline Edge Publication.
The following demonstrates publishing the pipeline that contains the generic model uploaded earlier.
# default deployment configuration
publish = pipeline.publish(deployment_config=wallaroo.DeploymentConfigBuilder().build())
display(publish)
Waiting for pipeline publish... It may take up to 600 sec.
Pipeline is publishing................ Published.
ID | 15 | |
Pipeline Name | sample_pipeline | |
Pipeline Version | d033152c-494c-44a6-8981-627c6b6ad72e | |
Status | Published | |
Engine URL | registry.example.com/uat/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini-ppc64le:v2024.4.0-main | |
Pipeline URL | registry.example.com/uat/pipelines/sample_pipeline:d033152c-494c-44a6-8981-627c6b6ad72e | |
Helm Chart URL | oci://registry.example.com/uat/charts/sample_pipeline | |
Helm Chart Reference | registry.example.com/uat/charts@sha256:7e2a314d9024cc2529be3e902eb24ac241f1e0819fc07e47bf26dd2e6e64f183 | |
Helm Chart Version | 0.0.1-d033152c-494c-44a6-8981-627c6b6ad72e | |
Engine Config | {'engine': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}, 'accel': 'none', 'arch': 'power10', 'gpu': False}}, 'engineAux': {'autoscale': {'type': 'none'}, 'images': {}}} | |
User Images | [] | |
Created By | john.hummel@wallaroo.ai | |
Created At | 2024-04-03 21:50:14.306316+00:00 | |
Updated At | 2024-04-03 21:50:14.306316+00:00 | |
Replaces | ||
Docker Run Command | Note: Please set the EDGE_PORT, OCI_USERNAME, and OCI_PASSWORD environment variables. | |
Helm Install Command | Note: Please set the HELM_INSTALL_NAME, HELM_INSTALL_NAMESPACE, OCI_USERNAME, and OCI_PASSWORD environment variables. | |
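Before running the generated commands on an edge device, it can help to confirm two things from the publish output: that the Engine Config carries the expected architecture, and that the environment variables named in the notes are set. The sketch below uses only the Python standard library. The dictionary is copied from the sample Engine Config row above, while the helper names `published_arch` and `missing_env` are hypothetical and not part of the Wallaroo SDK.

```python
import os

# Engine Config copied from the sample publish output.
engine_config = {
    "engine": {
        "resources": {
            "limits": {"cpu": 1.0, "memory": "512Mi"},
            "requests": {"cpu": 1.0, "memory": "512Mi"},
            "accel": "none",
            "arch": "power10",
            "gpu": False,
        }
    },
    "engineAux": {"autoscale": {"type": "none"}, "images": {}},
}

# Variables named in the Docker Run Command note.
DOCKER_ENV = ["EDGE_PORT", "OCI_USERNAME", "OCI_PASSWORD"]


def published_arch(config):
    """Return the architecture recorded in a publish Engine Config."""
    return config["engine"]["resources"]["arch"]


def missing_env(names, environ=None):
    """Return the names that are unset or empty in the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


if __name__ == "__main__":
    print("arch:", published_arch(engine_config))
    print("missing variables:", missing_env(DOCKER_ENV))
```

The same `missing_env` call can be reused for the Helm install by passing the four variable names listed in the Helm Install Command note.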
Tutorials
The following tutorials demonstrate deploying models on the Power10 architecture.