AI Workloads on ARM: House Price Prediction Model on ARM Tutorial

How to publish ONNX house price prediction models for multicloud and edge deployments on ARM processors.

This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.

Run Anywhere for ARM Architecture Tutorial: House Price Predictor Model

Wallaroo Run Anywhere provides model deployment on any device, in any cloud, and on any architecture. Models uploaded to Wallaroo are set to their targeted architecture.

Organizations can deploy uploaded models to clusters that have nodes with the provisioned architecture. The following architectures are supported:

X86: The standard X86 architecture.
ARM: For more details on cloud providers and their ARM offerings, see Create ARM Nodepools for Kubernetes Clusters.

Model Architecture Inheritance

The model’s deployment configuration inherits its architecture. Models automatically deploy in the target architecture provided nodepools with the architecture are available. For information on setting up nodepools with specific architectures, see Infrastructure Configuration Guides.

That deployment configuration is carried over to the model’s publication in an Open Container Initiative (OCI) registry, which allows edge model deployments on X86 and ARM architectures. More details on deploying models on edge devices are available in the Wallaroo Run Anywhere Guides.

The deployment configuration can be overridden for model deployments either in the Wallaroo Ops instance or on edge devices.
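The inheritance and override rules above can be pictured with a small toy model. This is a sketch in plain Python, not Wallaroo SDK code: the `resolve_architecture` function and its string values are hypothetical, and only illustrate that a deployment uses the model's architecture unless the deployment configuration sets its own.

```python
from typing import Optional

# Hypothetical stand-ins for the two supported architectures.
SUPPORTED_ARCHITECTURES = ("x86", "arm")


def resolve_architecture(model_arch: str, override: Optional[str] = None) -> str:
    """Return the architecture a deployment would use in this toy model.

    The deployment inherits the model's architecture unless the
    deployment configuration supplies an explicit override.
    """
    chosen = override if override is not None else model_arch
    if chosen not in SUPPORTED_ARCHITECTURES:
        raise ValueError(f"unsupported architecture: {chosen!r}")
    return chosen


# A model uploaded for ARM is deployed on ARM by default.
print(resolve_architecture("arm"))
# The same model can be overridden to x86 in the deployment configuration.
print(resolve_architecture("arm", override="x86"))
```

In the SDK itself, the override is expressed through the deployment configuration, as shown later in the publish steps of this tutorial.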

This tutorial demonstrates deploying an ML model trained to predict house prices to ARM edge locations through the following steps.

Upload a model with the architecture set to ARM.
Create a pipeline with the uploaded model as a model step.
Publish the pipeline model to an Open Container Initiative (OCI) Registry for both X86 and ARM deployments.

In this notebook, we use an ONNX model pre-trained to predict house prices for our examples.

Goal

Demonstrate publishing a pipeline with model steps to various architectures.

Resources

This tutorial provides the following:

Models:
- models/rf_model.onnx: The champion model trained to predict house prices.
- Various inputs:
  - smallinputs.df.json: A set of house inputs that tends to generate low house price values.
  - biginputs.df.json: A set of house inputs that tends to generate high house price values.
  - normal-inputs.df.json: A set of house inputs with a range of house values.

Prerequisites

A deployed Wallaroo instance with Edge Registry Services and Edge Observability enabled.
The following Python libraries installed:
- wallaroo: The Wallaroo SDK. Included with the Wallaroo JupyterHub service by default.
- pandas: Pandas, mainly used for Pandas DataFrames.
- json: Used to format input data for inference requests.
An X64 Docker deployment to deploy the model on an edge location.

Steps

Upload the model with the targeted architecture set to ARM.
Create the pipeline and add the model as a model step.
Deploy the model in the targeted architecture and perform sample inferences.
Publish the pipeline to an OCI registry.
Deploy the model from the pipeline publish to the edge deployment with ARM architecture.
Perform sample inferences on the ARM edge model deployment.

Import Libraries

The first step is to import our libraries and set the variables used throughout this tutorial.

import wallaroo
from wallaroo.object import EntityNotFoundError
from wallaroo.framework import Framework
from wallaroo.engine_config import Architecture

# used to display DataFrame information without truncating
from IPython.display import display
import pandas as pd
pd.set_option('display.max_colwidth', None)

import datetime
import time

workspace_name = 'run-anywhere-house-price-architecture-demonstration-tutorial'
arm_pipeline_name = 'architecture-demonstration-arm'
model_name_arm = 'house-price-estimator-arm'
model_file_name = './models/rf_model.onnx'

# ignoring warnings for demonstration
import warnings
warnings.filterwarnings('ignore')

Connect to the Wallaroo Instance

The next step is to connect to Wallaroo through the Wallaroo client. The Python library is included in the Wallaroo install and available through the JupyterHub interface provided with your Wallaroo environment.

This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.

If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). For more information on Wallaroo Client settings, see the Client Connection guide.

# Login through local Wallaroo instance

wl = wallaroo.Client()

Create Workspace

We will create a workspace to manage our pipeline and models. The following variables will set the name of our sample workspace then set it as the current workspace.

Workspace, pipeline, and model names should be unique to each user. If multiple people run this tutorial in the same Wallaroo instance, append a unique suffix (for example, a randomly generated one) to the names set above so that the runs do not affect each other.

workspace = wl.get_workspace(name=workspace_name, create_if_not_exist=True)

wl.set_current_workspace(workspace)

{'name': 'run-anywhere-house-price-architecture-demonstration-tutorial', 'id': 7, 'archived': False, 'created_by': 'e790f98c-b2a7-403c-9616-c94f31a9f234', 'created_at': '2024-04-01T15:16:26.679968+00:00', 'models': [{'name': 'house-price-estimator-arm', 'versions': 1, 'owner_id': '""', 'last_update_time': datetime.datetime(2024, 4, 1, 15, 16, 27, 203702, tzinfo=tzutc()), 'created_at': datetime.datetime(2024, 4, 1, 15, 16, 27, 203702, tzinfo=tzutc())}], 'pipelines': [{'name': 'architecture-demonstration-arm', 'create_time': datetime.datetime(2024, 4, 1, 15, 16, 27, 308341, tzinfo=tzutc()), 'definition': '[]'}]}
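The advice on unique names can be sketched as follows. This is a minimal illustration and not part of the tutorial's required steps; `random_suffix` and the `unique_*` variable names are hypothetical, and the base names are the ones set at the start of this tutorial.

```python
import random
import string


def random_suffix(length: int = 4) -> str:
    """Return a short random string of lowercase ASCII letters."""
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


suffix = random_suffix()

# Hypothetical per-user names built from the tutorial's base names.
unique_workspace_name = f"run-anywhere-house-price-architecture-demonstration-tutorial-{suffix}"
unique_pipeline_name = f"architecture-demonstration-arm-{suffix}"
unique_model_name = f"house-price-estimator-arm-{suffix}"

print(unique_workspace_name)
```

Because the suffix changes on every run, a user who wants to return to the same workspace later would need to record the generated names or choose a fixed suffix instead.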

Upload Models and Set ARM Target Architecture

For our example, we will upload the champion model that has been trained to derive house prices from a variety of inputs. The model file is rf_model.onnx, and is uploaded with the name house-price-estimator-arm.

Models are uploaded to Wallaroo via the wallaroo.client.upload_model method which takes the following arguments:

Parameter	Type	Description
name	String (Required)	The name of the model.
path	String (Required)	The file path to the model.
framework	wallaroo.framework.Framework (Required)	The model’s framework. See Wallaroo SDK Essentials Guide: Model Uploads and Registrations for supported model frameworks.
input_schema	pyarrow.lib.Schema (Optional)	The model’s input schema. Only required for non-native Wallaroo frameworks. See Wallaroo SDK Essentials Guide: Model Uploads and Registrations for more details.
output_schema	pyarrow.lib.Schema (Optional)	The model’s output schema. Only required for non-native Wallaroo frameworks. See Wallaroo SDK Essentials Guide: Model Uploads and Registrations for more details.
convert_wait	bool (Optional)	Whether to wait in the SDK session to complete the auto-packaging process for non-native Wallaroo frameworks.
arch	wallaroo.engine_config.Architecture (Optional)	The targeted architecture for the model. Options are `X86` (default) and `ARM`.

We upload the model and set the architecture to ARM.

housing_model_control_arm = (wl.upload_model(model_name_arm, 
                                        model_file_name, 
                                        framework=Framework.ONNX,
                                        arch=wallaroo.engine_config.Architecture.ARM)
                                        .configure(tensor_fields=["tensor"])
                        )

display(housing_model_control_arm)

Name	house-price-estimator-arm
Version	23d65519-03dc-4c79-8163-03f41b05343b
File Name	rf_model.onnx
SHA	e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6
Status	ready
Image Path	None
Architecture	arm
Acceleration	none
Updated At	2024-01-Apr 15:51:03

Build Pipeline

We build the pipeline with the wallaroo.client.build_pipeline(pipeline_name) command, and set the model as a model step in the pipeline.

pipeline_arm = wl.build_pipeline('architecture-demonstration-arm')
pipeline_arm.add_model_step(housing_model_control_arm)

name	architecture-demonstration-arm
created	2024-04-01 15:16:27.308341+00:00
last_updated	2024-04-01 15:52:57.907820+00:00
deployed	False
arch	arm
accel	none
tags
versions	a54a8a2d-cde2-4aef-a071-03f58d9b1b65, 15834c62-ffb2-4278-b915-3bb2fbf9aff6, f03a0237-eec9-4fab-9a79-1526cc8088c2, ba4504e4-ff8f-46c6-9dc7-bf5b58da573e
steps	house-price-estimator-arm
published	False

Deploy Pipeline

We can now deploy the pipeline. The pipeline deployment inherits the model’s architecture, so for our deployment configuration we can specify just the cpus and memory we want to use.

#minimum deployment config
deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()

pipeline_arm.deploy(deployment_config = deploy_config)

Sample Inference

Once the pipeline is deployed, we will perform a set of sample inferences, then undeploy the pipeline to return the resources to the cluster.

# pipeline_arm is the pipeline object created in the Build Pipeline step
result = pipeline_arm.infer_from_file('./data/normal-inputs.df.json')
display(result.head(20))

pipeline_arm.undeploy()
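If an inference request fails, the pipeline would stay deployed and keep holding cluster resources. One way to guard against that is a `try`/`finally` wrapper. The helper below is a hypothetical sketch, not part of the Wallaroo SDK; it assumes only that the object passed in offers the `infer_from_file` and `undeploy` methods used above and that it is already deployed.

```python
def infer_then_undeploy(pipeline, input_path):
    """Run one inference from a file, then undeploy the pipeline.

    `pipeline` is expected to be an already deployed pipeline object
    that offers `infer_from_file` and `undeploy`.
    """
    try:
        return pipeline.infer_from_file(input_path)
    finally:
        # Runs whether or not the inference raised an exception,
        # so the resources are always returned to the cluster.
        pipeline.undeploy()
```

With the objects from this tutorial, the call would look like `result = infer_then_undeploy(pipeline_arm, './data/normal-inputs.df.json')`.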

Pipeline Publish for ARM Architecture via the Wallaroo SDK

We now publish our pipeline as two different versions.

ARM: The model’s architecture was set to ARM, so when we publish the pipeline to the OCI registry, it automatically inherits that architecture.
X86: We override the model’s architecture to push a published version that can be deployed on X86-based devices.

Publish Pipeline for ARM

Publishing the pipeline uses the wallaroo.pipeline.Pipeline.publish() command. This requires that the Wallaroo Ops instance have Edge Registry Services enabled.

When publishing, we specify the pipeline deployment configuration through the wallaroo.DeploymentConfigBuilder and set the architecture to wallaroo.engine_config.Architecture.ARM.

The following publishes the pipeline to the OCI registry and displays the container details. For more information, see Wallaroo SDK Essentials Guide: Pipeline Edge Publication.

deploy_config_arm = (wallaroo.DeploymentConfigBuilder()
                     .replica_count(1)
                     .cpus(1)
                     .memory("1Gi")
                     .arch(wallaroo.engine_config.Architecture.ARM)
                     .build()
                    )
assay_pub_arm = pipeline_arm.publish(deployment_config=deploy_config_arm)
assay_pub_arm

Waiting for pipeline publish... It may take up to 600 sec.
Pipeline is publishing....... Published.

ID 3

Pipeline Name architecture-demonstration-arm

Pipeline Version 6fa63751-c97b-4f6f-90f5-78566cd22170

Status Published

Engine URL ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini-aarch64:v2024.1.0-main-4833

Pipeline URL ghcr.io/wallaroolabs/doc-samples/pipelines/architecture-demonstration-arm:6fa63751-c97b-4f6f-90f5-78566cd22170

Helm Chart URL oci://ghcr.io/wallaroolabs/doc-samples/charts/architecture-demonstration-arm

Helm Chart Reference ghcr.io/wallaroolabs/doc-samples/charts@sha256:ca931da2bfaef460c8b48ab0ffb2d3a9647a19a1f31014cac185899128589183

Helm Chart Version 0.0.1-6fa63751-c97b-4f6f-90f5-78566cd22170

Engine Config {'engine': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}, 'accel': 'none', 'arch': 'arm', 'gpu': False}}, 'engineAux': {'autoscale': {'type': 'none'}, 'images': {}}, 'enginelb': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}, 'accel': 'none', 'arch': 'x86', 'gpu': False}}}

User Images []

Created By john.hummel@wallaroo.ai

Created At 2024-04-01 15:55:02.156926+00:00

Updated At 2024-04-01 15:55:02.156926+00:00

Replaces

Docker Run Command

docker run \
    -e OCI_USERNAME=$OCI_USERNAME \
    -e OCI_PASSWORD=$OCI_PASSWORD \
    -e PIPELINE_URL=ghcr.io/wallaroolabs/doc-samples/pipelines/architecture-demonstration-arm:6fa63751-c97b-4f6f-90f5-78566cd22170 \
    -e CONFIG_CPUS=1 ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini-aarch64:v2024.1.0-main-4833

Note: Please set the OCI_USERNAME, and OCI_PASSWORD environment variables.
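Before running the Docker command on the edge device, it can help to confirm that the registry credentials are set and that the host is actually an ARM machine, since the engine image above is the `aarch64` build. The following is a minimal sketch using only the Python standard library; the helper names `missing_env_vars` and `is_arm64_machine` are hypothetical, and the check assumes 64-bit ARM hosts report `aarch64` or `arm64` as their machine type.

```python
import os
import platform

# Environment variables named in the note above.
REQUIRED_VARS = ("OCI_USERNAME", "OCI_PASSWORD")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


def is_arm64_machine(machine=None):
    """Return True if the machine type looks like a 64-bit ARM host."""
    name = platform.machine() if machine is None else machine
    return name.lower() in ("aarch64", "arm64")


missing = missing_env_vars()
if missing:
    print("Set these variables before running docker:", ", ".join(missing))
else:
    print("Registry credentials are set.")

if not is_arm64_machine():
    print("This host does not report a 64-bit ARM machine type.")
```

The same credential check applies to the Helm command by passing a longer tuple of required variable names.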

Helm Install Command

helm install --atomic $HELM_INSTALL_NAME \
    oci://ghcr.io/wallaroolabs/doc-samples/charts/architecture-demonstration-arm \
    --namespace $HELM_INSTALL_NAMESPACE \
    --version 0.0.1-6fa63751-c97b-4f6f-90f5-78566cd22170 \
    --set ociRegistry.username=$OCI_USERNAME \
    --set ociRegistry.password=$OCI_PASSWORD

Note: Please set the HELM_INSTALL_NAME, HELM_INSTALL_NAMESPACE, OCI_USERNAME, and OCI_PASSWORD environment variables.
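The publish details above can be checked programmatically to confirm which architecture each component targets. The sketch below copies the Engine URL and Engine Config values from the ARM publish output into plain Python variables; the helper functions `split_image_reference` and `component_architectures` are hypothetical and written only for this illustration. Reading `aarch64` in the image name as the ARM build is an observation from this output, not a documented naming rule.

```python
# Values copied from the ARM publish output above.
engine_url = (
    "ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/"
    "ghcr.io/wallaroolabs/fitzroy-mini-aarch64:v2024.1.0-main-4833"
)
engine_config = {
    "engine": {"resources": {
        "limits": {"cpu": 1.0, "memory": "512Mi"},
        "requests": {"cpu": 1.0, "memory": "512Mi"},
        "accel": "none", "arch": "arm", "gpu": False}},
    "engineAux": {"autoscale": {"type": "none"}, "images": {}},
    "enginelb": {"resources": {
        "limits": {"cpu": 1.0, "memory": "512Mi"},
        "requests": {"cpu": 1.0, "memory": "512Mi"},
        "accel": "none", "arch": "x86", "gpu": False}},
}


def split_image_reference(url):
    """Split a 'registry/path/name:tag' reference into (name, tag)."""
    path, _, tag = url.rpartition(":")
    name = path.rsplit("/", 1)[-1]
    return name, tag


def component_architectures(config):
    """Map each component that declares resources to its 'arch' value."""
    return {
        component: section["resources"]["arch"]
        for component, section in config.items()
        if "arch" in section.get("resources", {})
    }


print(split_image_reference(engine_url))
# ('fitzroy-mini-aarch64', 'v2024.1.0-main-4833')
print(component_architectures(engine_config))
# {'engine': 'arm', 'enginelb': 'x86'}
```

In this output only the `engine` component carries the `arm` setting; the `enginelb` entry reports `x86`.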

Publish Pipeline for X86

We will publish the pipeline again, this time overriding the deployment architecture and setting it to X86. Note the changes to the Engine URL, the Docker Run Command, and the Helm Install Command that reflect the new architecture.

We change the deployment architecture with the wallaroo.DeploymentConfigBuilder object, changing the arch setting to X86.

deploy_config_x86 = (wallaroo.DeploymentConfigBuilder()
                     .replica_count(1)
                     .cpus(1)
                     .memory("1Gi")
                     .arch(wallaroo.engine_config.Architecture.X86)
                     .build()
                    )
assay_pub_x86 = pipeline_arm.publish(deployment_config = deploy_config_x86)
assay_pub_x86

Waiting for pipeline publish... It may take up to 600 sec.
Pipeline is publishing....... Published.

ID 4

Pipeline Name architecture-demonstration-arm

Pipeline Version 46e69342-ce3b-4341-b4fd-699e648f45cd

Status Published

Engine URL ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini:v2024.1.0-main-4833

Pipeline URL ghcr.io/wallaroolabs/doc-samples/pipelines/architecture-demonstration-arm:46e69342-ce3b-4341-b4fd-699e648f45cd

Helm Chart URL oci://ghcr.io/wallaroolabs/doc-samples/charts/architecture-demonstration-arm

Helm Chart Reference ghcr.io/wallaroolabs/doc-samples/charts@sha256:35aa8afe037111f9795ec4ad779ccefa1c86d74f0935a1ab3747b7983d44e730

Helm Chart Version 0.0.1-46e69342-ce3b-4341-b4fd-699e648f45cd

Engine Config {'engine': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}, 'accel': 'none', 'arch': 'x86', 'gpu': False}}, 'engineAux': {'autoscale': {'type': 'none'}, 'images': {}}, 'enginelb': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}, 'accel': 'none', 'arch': 'x86', 'gpu': False}}}