This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.
Wallaroo Run Anywhere provides model deployment on any device, any cloud, and any architecture. Models uploaded to Wallaroo are set to their targeted architecture.
Organizations can deploy uploaded models to clusters that have nodes with the provisioned architecture. The following architectures are supported:
X86
: The standard X86 architecture.
ARM
: For more details on cloud providers and their ARM offerings, see Create ARM Nodepools for Kubernetes Clusters.
The model’s deployment configuration inherits its architecture. Models automatically deploy in the target architecture provided nodepools with the architecture are available. For information on setting up nodepools with specific architectures, see Infrastructure Configuration Guides.
That deployment configuration is carried over to the models’ publication in an Open Container Initiative (OCI) registry, which allows edge model deployments on X64 and ARM architectures. More details on deploying models on edge devices are available in the Wallaroo Run Anywhere Guides.
The deployment configuration can be overridden either for model deployment in the Wallaroo Ops instance or for deployment on edge devices.
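The inheritance and override rule described above can be summarized in a few lines. This is only an illustrative sketch in plain Python, not Wallaroo SDK code: `resolve_architecture`, `SUPPORTED_ARCHITECTURES`, and the lowercase `x86` label are hypothetical names chosen for the example.

```python
from typing import Optional

# Hypothetical labels for the two supported architectures.
SUPPORTED_ARCHITECTURES = ("x86", "arm")


def resolve_architecture(model_arch: str, override: Optional[str] = None) -> str:
    """Return the architecture a deployment would target.

    An explicit override in the deployment configuration wins;
    otherwise the architecture set on the model at upload is inherited.
    """
    chosen = override if override is not None else model_arch
    if chosen not in SUPPORTED_ARCHITECTURES:
        raise ValueError(f"unsupported architecture: {chosen!r}")
    return chosen


print(resolve_architecture("arm"))         # inherited from the model
print(resolve_architecture("arm", "x86"))  # overridden at deployment time
```

The first call models the default behavior (the deployment follows the model), and the second models an override in the deployment configuration.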
This tutorial demonstrates deploying an ML model trained to predict house prices to ARM edge locations.
In this notebook, we use an ONNX model pre-trained to predict house prices for our examples.
Demonstrate publishing a pipeline with model steps to various architectures.
This tutorial provides the following:
models/rf_model.onnx
: The champion model trained to predict house prices.
smallinputs.df.json
: A set of house inputs that tends to generate low house price values.
biginputs.df.json
: A set of house inputs that tends to generate high house price values.
normal-inputs.df.json
: A set of house inputs with a range of house values.
The first step will be to import our libraries and set the variables used throughout this tutorial.
import wallaroo
from wallaroo.object import EntityNotFoundError
from wallaroo.framework import Framework
from wallaroo.engine_config import Architecture
# used to display DataFrame information without truncating
from IPython.display import display
import pandas as pd
pd.set_option('display.max_colwidth', None)
import datetime
import time
# ignoring warnings for demonstration
import warnings
warnings.filterwarnings('ignore')
workspace_name = 'run-anywhere-house-price-architecture-demonstration-tutorial'
arm_pipeline_name = 'architecture-demonstration-arm'
model_name_arm = 'house-price-estimator-arm'
model_file_name = './models/rf_model.onnx'
The next step is to connect to Wallaroo through the Wallaroo client. The Python library is included in the Wallaroo install and available through the JupyterHub interface provided with your Wallaroo environment.
This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.
If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). For more information on Wallaroo Client settings, see the Client Connection guide.
# Login through local Wallaroo instance
wl = wallaroo.Client()
We will create a workspace to manage our pipeline and models. The following code retrieves the workspace named in workspace_name, creating it if it does not exist, then sets it as the current workspace.
Workspace, pipeline, and model names should be unique to each user. The names set earlier in this tutorial are fixed, so if multiple people run this tutorial in the same Wallaroo instance, change the names (for example, by appending a random suffix) to avoid affecting each other.
workspace = wl.get_workspace(name=workspace_name, create_if_not_exist=True)
wl.set_current_workspace(workspace)
{'name': 'run-anywhere-house-price-architecture-demonstration-tutorial', 'id': 57, 'archived': False, 'created_by': 'eed2002f-769f-4cbd-a189-8ca1e9bf496c', 'created_at': '2024-04-22T15:12:51.014759+00:00', 'models': [], 'pipelines': []}
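When several people share one Wallaroo instance, a short random suffix is one way to keep workspace, pipeline, and model names apart. The following is a minimal sketch using only the Python standard library; `random_suffix` is a hypothetical helper, and any name-length limits enforced by your instance are not checked here. To apply it, these assignments would replace the fixed names in the setup code, before the workspace is created.

```python
import random
import string


def random_suffix(length: int = 4) -> str:
    """Return `length` random lowercase ASCII letters."""
    return ''.join(random.choice(string.ascii_lowercase) for _ in range(length))


# One suffix shared by all three names so they are easy to match up.
suffix = random_suffix()

workspace_name = f'run-anywhere-house-price-architecture-demonstration-tutorial-{suffix}'
arm_pipeline_name = f'architecture-demonstration-arm-{suffix}'
model_name_arm = f'house-price-estimator-arm-{suffix}'

print(workspace_name)
print(arm_pipeline_name)
print(model_name_arm)
```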
For our example, we will upload the champion model that has been trained to derive house prices from a variety of inputs. The model file is rf_model.onnx, and is uploaded with the name house-price-estimator-arm.
Models are uploaded to Wallaroo via the wallaroo.client.upload_model method, which takes the following arguments:
Parameter | Type | Description |
---|---|---|
name | String (Required) | The name of the model. |
path | String (Required) | The file path to the model. |
framework | wallaroo.framework.Framework (Required) | The model’s framework. See Wallaroo SDK Essentials Guide: Model Uploads and Registrations for supported model frameworks. |
input_schema | pyarrow.lib.Schema (Optional) | The model’s input schema. Only required for non-native Wallaroo frameworks. See Wallaroo SDK Essentials Guide: Model Uploads and Registrations for more details. |
output_schema | pyarrow.lib.Schema (Optional) | The model’s output schema. Only required for non-native Wallaroo frameworks. See Wallaroo SDK Essentials Guide: Model Uploads and Registrations for more details. |
convert_wait | bool (Optional) | Whether to wait in the SDK session to complete the auto-packaging process for non-native Wallaroo frameworks. |
arch | wallaroo.engine_config.Architecture (Optional) | The targeted architecture for the model. Options are X86 and ARM. |
We upload the model and set the architecture to ARM.
housing_model_control_arm = (wl.upload_model(model_name_arm,
                                             model_file_name,
                                             framework=Framework.ONNX,
                                             arch=wallaroo.engine_config.Architecture.ARM)
                             .configure(tensor_fields=["tensor"])
                             )
display(housing_model_control_arm)
Name | house-price-estimator-arm |
Version | 68d4f7de-2df5-4d2d-bacd-4e9ad6153c4a |
File Name | rf_model.onnx |
SHA | e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6 |
Status | ready |
Image Path | None |
Architecture | arm |
Acceleration | none |
Updated At | 2024-22-Apr 15:12:51 |
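The SHA value in the upload output can be compared with a digest of the local file to confirm that the intended file was uploaded. The sketch below assumes the SHA field is the SHA-256 digest of the model file; the 64-character hex value suggests this, but the tutorial does not state it. `file_sha256` is a hypothetical helper built on the standard library.

```python
import hashlib
from pathlib import Path


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open('rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# Path taken from this tutorial; skipped if the file is not present.
model_path = Path('./models/rf_model.onnx')
if model_path.exists():
    print(file_sha256(str(model_path)))
```

If the assumption holds, the printed digest would match the SHA row shown in the upload output.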
We build the pipeline with the wallaroo.client.build_pipeline(pipeline_name) command, and set the model as a model step in the pipeline.
pipeline_arm = wl.build_pipeline(arm_pipeline_name)
pipeline_arm.add_model_step(housing_model_control_arm)
name | architecture-demonstration-arm |
---|---|
created | 2024-04-22 15:12:51.793714+00:00 |
last_updated | 2024-04-22 15:12:51.793714+00:00 |
deployed | (none) |
arch | None |
accel | None |
tags | |
versions | 067b3dd6-4e2b-4709-9243-657c8a4ecbda |
steps | |
published | False |
We can now deploy the pipeline. Our deployment configuration specifies just the replica count, CPUs, and memory we want to use. The architecture is not specified, since that’s inherited from the model’s arch setting.
#minimum deployment config
deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline_arm.deploy(deployment_config = deploy_config)
Waiting for deployment - this will take up to 45s ....................... ok
name | architecture-demonstration-arm |
---|---|
created | 2024-04-22 15:12:51.793714+00:00 |
last_updated | 2024-04-22 15:12:52.055247+00:00 |
deployed | True |
arch | arm |
accel | none |
tags | |
versions | e1dced87-0d17-4e03-b692-560b1fd50b16, 067b3dd6-4e2b-4709-9243-657c8a4ecbda |
steps | house-price-estimator-arm |
published | False |
display(pipeline_arm)
name | architecture-demonstration-arm |
---|---|
created | 2024-04-22 15:12:51.793714+00:00 |
last_updated | 2024-04-22 15:12:52.055247+00:00 |
deployed | True |
arch | arm |
accel | none |
tags | |
versions | e1dced87-0d17-4e03-b692-560b1fd50b16, 067b3dd6-4e2b-4709-9243-657c8a4ecbda |
steps | house-price-estimator-arm |
published | False |
pipeline_arm.undeploy()
Waiting for undeployment - this will take up to 45s .................................... ok
name | architecture-demonstration-arm |
---|---|
created | 2024-04-22 15:12:51.793714+00:00 |
last_updated | 2024-04-22 15:12:52.055247+00:00 |
deployed | False |
arch | arm |
accel | none |
tags | |
versions | e1dced87-0d17-4e03-b692-560b1fd50b16, 067b3dd6-4e2b-4709-9243-657c8a4ecbda |
steps | house-price-estimator-arm |
published | False |
Our pipeline can be published as two different versions.
ARM
: The model’s architecture was set to ARM, so when we publish the pipeline to the OCI registry, it will automatically inherit that architecture.
X64
: We override the model’s architecture to push a published version that can be deployed on X64 based devices.
Publishing the pipeline uses the wallaroo.pipeline.Pipeline.publish() command. This requires that the Wallaroo Ops instance have Edge Registry Services enabled.
When publishing, we specify the pipeline deployment configuration through the wallaroo.DeploymentConfigBuilder. The architecture does not have to be specified, since it is inherited from the model’s arch setting; the example below sets it explicitly to ARM, which matches the inherited value.
The following publishes the pipeline to the OCI registry and displays the container details. For more information, see Wallaroo SDK Essentials Guide: Pipeline Edge Publication.
deploy_config_arm = (wallaroo.DeploymentConfigBuilder()
                     .replica_count(1)
                     .cpus(1)
                     .memory("1Gi")
                     .arch(wallaroo.engine_config.Architecture.ARM)
                     .build()
                     )
assay_pub_arm = pipeline_arm.publish(deployment_config=deploy_config_arm)
assay_pub_arm
Waiting for pipeline publish... It may take up to 600 sec.
Pipeline is publishing........ Published.
ID | 8 | |
Pipeline Name | architecture-demonstration-arm | |
Pipeline Version | e86cf51b-0875-4b58-922b-cba94285602c | |
Status | Published | |
Engine URL | sample.registry.example.com/uat/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini-aarch64:v2024.1.0-main-4963 | |
Pipeline URL | sample.registry.example.com/uat/pipelines/architecture-demonstration-arm:e86cf51b-0875-4b58-922b-cba94285602c | |
Helm Chart URL | oci://sample.registry.example.com/uat/charts/architecture-demonstration-arm | |
Helm Chart Reference | sample.registry.example.com/uat/charts@sha256:539556815d737888ce00cb0232b6c9a947688f5c29bc5603e2730e1115c5bd41 | |
Helm Chart Version | 0.0.1-e86cf51b-0875-4b58-922b-cba94285602c | |
Engine Config | {'engine': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}, 'accel': 'none', 'arch': 'arm', 'gpu': False}}, 'engineAux': {'autoscale': {'type': 'none'}, 'images': {}}} | |
User Images | [] | |
Created By | john.hummel@wallaroo.ai | |
Created At | 2024-04-22 15:17:52.184999+00:00 | |
Updated At | 2024-04-22 15:17:52.184999+00:00 | |
Replaces | ||
Docker Run Command | Note: Please set the EDGE_PORT, OCI_USERNAME, and OCI_PASSWORD environment variables. | |
Helm Install Command | Note: Please set the HELM_INSTALL_NAME, HELM_INSTALL_NAMESPACE, OCI_USERNAME, and OCI_PASSWORD environment variables. | |
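The X64 version described earlier is published the same way, with the architecture overridden in the deployment configuration. The following is a minimal sketch, assuming the SDK session from this tutorial: `publish_for_x86` is a hypothetical helper name, and `Architecture.X86` is assumed to be the enum value corresponding to what this tutorial calls X64. It mirrors the ARM publish and changes only the value passed to `.arch()`.

```python
def publish_for_x86(pipeline):
    """Publish `pipeline` with the architecture overridden to X86.

    Hypothetical helper: same resources as the ARM publish in this
    tutorial, with only the architecture changed.
    """
    # Imported inside the function so the helper can be defined
    # before the Wallaroo SDK is loaded in the session.
    import wallaroo
    from wallaroo.engine_config import Architecture

    deploy_config_x86 = (wallaroo.DeploymentConfigBuilder()
                         .replica_count(1)
                         .cpus(1)
                         .memory("1Gi")
                         .arch(Architecture.X86)  # override the model's ARM setting
                         .build()
                         )
    return pipeline.publish(deployment_config=deploy_config_x86)


# Usage in the notebook session from this tutorial:
# pub_x86 = publish_for_x86(pipeline_arm)
# display(pub_x86)
```

As with the ARM publish, this requires Edge Registry Services to be enabled on the Wallaroo Ops instance.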
For details on performing inference requests through an edge deployed model, see Edge Deployment Endpoints.
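Before running the Docker Run or Helm Install command from the publish output, it can help to confirm the published architecture and check that the environment variables named in the notes are set. The sketch below is plain Python under stated assumptions: the engine configuration is copied from the publish output in this tutorial, while `published_arch` and `missing_env_vars` are hypothetical helpers, not part of the Wallaroo SDK.

```python
import os

# Engine Config as shown in the publish output of this tutorial.
engine_config = {
    'engine': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'},
                             'requests': {'cpu': 1.0, 'memory': '512Mi'},
                             'accel': 'none', 'arch': 'arm', 'gpu': False}},
    'engineAux': {'autoscale': {'type': 'none'}, 'images': {}},
}

# Variable names taken from the notes in the publish output.
DOCKER_ENV_VARS = ('EDGE_PORT', 'OCI_USERNAME', 'OCI_PASSWORD')
HELM_ENV_VARS = ('HELM_INSTALL_NAME', 'HELM_INSTALL_NAMESPACE',
                 'OCI_USERNAME', 'OCI_PASSWORD')


def published_arch(config: dict) -> str:
    """Return the architecture recorded in a publish's engine config."""
    return config['engine']['resources']['arch']


def missing_env_vars(names, environ=os.environ):
    """Return the names that are unset or empty in `environ`."""
    return [name for name in names if not environ.get(name)]


print(published_arch(engine_config))  # prints: arm
print(missing_env_vars(DOCKER_ENV_VARS))
print(missing_env_vars(HELM_ENV_VARS))
```

An empty list from `missing_env_vars` means every variable required by that command is set in the current environment.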