This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.
The following example demonstrates some of the data and input requirements when working with ONNX models in Wallaroo. This example will:

* Upload an ONNX model with multiple inputs and outputs.
* Deploy the model as a pipeline step and perform a sample inference.
* Undeploy the pipeline and return the resources to the cluster.
For more information on using ONNX models with Wallaroo, see Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX.
The first step is to import the libraries used for our demonstration - primarily the Wallaroo SDK, which is used to connect to the Wallaroo Ops instance, upload models, etc.
import wallaroo
from wallaroo.deployment_config import DeploymentConfigBuilder
from wallaroo.framework import Framework
import pyarrow as pa
import numpy as np
import pandas as pd
The next step is to connect to Wallaroo through the Wallaroo client. The Python library is included in the Wallaroo install and available through the Jupyter Hub interface provided with your Wallaroo environment.
This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.
If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). For more information on Wallaroo Client settings, see the Client Connection guide.
# Login through local Wallaroo instance
wl = wallaroo.Client()
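When connecting from outside the cluster, the client takes the instance's URL directly. A minimal sketch, assuming the api_endpoint and auth_type parameters and a placeholder URL; see the Client Connection guide for your instance's actual settings.
# Hypothetical external connection (placeholder URL - replace with your
# Wallaroo instance's API endpoint; api_endpoint and auth_type are
# assumptions, not confirmed by this tutorial):
# wl = wallaroo.Client(api_endpoint="https://wallaroo.example.com",
#                      auth_type="sso")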
We will create a workspace to manage our pipeline and models. The following variables set the name of our sample workspace, then set it as the current workspace. If this tutorial has been run before, the helper function get_workspace will either create a new workspace or connect to an existing one.
Workspace names must be unique; verify that no other workspaces have the same name when running this tutorial. We then set the current workspace to our new workspace; all model uploads and other requests will use this workspace.
def get_workspace(workspace_name, wallaroo_client):
    # Return the workspace matching workspace_name, creating it if it
    # does not already exist.
    workspace = None
    for ws in wallaroo_client.list_workspaces():
        if ws.name() == workspace_name:
            workspace = ws
    if workspace is None:
        workspace = wallaroo_client.create_workspace(workspace_name)
    return workspace
workspace_name = 'onnx-tutorial'
workspace = get_workspace(workspace_name, wl)
wl.set_current_workspace(workspace)
{'name': 'onnx-tutorial', 'id': 9, 'archived': False, 'created_by': '12ea09d1-0f49-405e-bed1-27eb6d10fde4', 'created_at': '2023-11-22T16:24:47.786643+00:00', 'models': [], 'pipelines': []}
The ONNX model ./models/multi_io.onnx will be uploaded with the wallaroo.client.upload_model method. This requires:

* The name to assign the model in Wallaroo.
* The path to the ONNX model file.
* The framework, set from the wallaroo.framework.Framework options.

If we wanted to override the names of the input fields, we could use the wallaroo.client.upload_model.configure(tensor_fields=[field_names]) option, sketched after the upload below. This ONNX model takes the inputs input_1 and input_2.
model = wl.upload_model('onnx-multi-io-model',
                        "./models/multi_io.onnx",
                        framework=Framework.ONNX)
model
Name | onnx-multi-io-model |
---|---|
Version | 7adb9245-53c2-43b4-95df-2c907bb88161 |
File Name | multi_io.onnx |
SHA | bb3e51dfdaa6440359c2396033a84a4248656d0f81ba1f662751520b3f93de27 |
Status | ready |
Image Path | None |
Architecture | None |
Updated At | 2023-22-Nov 16:24:51 |
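If we needed the tensor_fields override described above, it could be applied at upload time. A minimal sketch; the replacement field names tensor_1 and tensor_2 are hypothetical, and this tutorial keeps the model's original input names.
# Hypothetical: override the model's input field names at upload.
# "tensor_1" and "tensor_2" are illustrative names only - the rest of
# this tutorial assumes the original input_1 and input_2 fields.
# model = wl.upload_model('onnx-multi-io-model',
#                         "./models/multi_io.onnx",
#                         framework=Framework.ONNX
#                        ).configure(tensor_fields=["tensor_1", "tensor_2"])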
A new pipeline ‘multi-io-example’ is created with the wallaroo.client.build_pipeline method, which creates a new Wallaroo pipeline within our current workspace. We then add our onnx-multi-io-model as a pipeline step.
pipeline_name = 'multi-io-example'
pipeline = wl.build_pipeline(pipeline_name)
# in case this pipeline was run before
pipeline.clear()
pipeline.add_model_step(model)
name | multi-io-example |
---|---|
created | 2023-11-22 16:24:53.843958+00:00 |
last_updated | 2023-11-22 16:24:54.523098+00:00 |
deployed | True |
arch | None |
tags | |
versions | 73c1b57d-3227-471a-8e9b-4a8af62188dd, c8fb97d9-50cd-475d-8f36-1d2290e4c585 |
steps | onnx-multi-io-model |
published | False |
With the model set, deploy the pipeline with a deployment configuration, which sets the resources allocated to the pipeline from the Wallaroo Ops cluster and makes it available for inference requests.
deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25).memory('1Gi') \
    .build()
pipeline.deploy(deployment_config=deployment_config)
pipeline.status()
{'status': 'Running',
'details': [],
'engines': [{'ip': '10.244.3.143',
'name': 'engine-857444867-nldj5',
'status': 'Running',
'reason': None,
'details': [],
'pipeline_statuses': {'pipelines': [{'id': 'multi-io-example',
'status': 'Running'}]},
'model_statuses': {'models': [{'name': 'onnx-multi-io-model',
'version': '7adb9245-53c2-43b4-95df-2c907bb88161',
'sha': 'bb3e51dfdaa6440359c2396033a84a4248656d0f81ba1f662751520b3f93de27',
'status': 'Running'}]}}],
'engine_lbs': [{'ip': '10.244.4.155',
'name': 'engine-lb-584f54c899-h647p',
'status': 'Running',
'reason': None,
'details': []}],
'sidekicks': []}
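The builder exposes additional sizing options. A minimal sketch, assuming the replica_count builder option for setting the number of pipeline engine replicas:
# Sketch: the same configuration with an explicit engine replica count.
# replica_count is an assumed DeploymentConfigBuilder option; size it
# to your cluster's available resources.
scaled_config = DeploymentConfigBuilder() \
    .replica_count(1) \
    .cpus(0.25).memory('1Gi') \
    .build()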
For our inference request, we will create a dummy DataFrame with the following fields:

* input_1: a List of 10 randomly generated float values.
* input_2: a List of 5 randomly generated float values.

10 rows will be created.
Inference requests for Wallaroo ONNX models must meet the following criteria:

* The number of rows must match across all input fields.
* Each input is a flattened array (List) of values.
* All values within an input are of the same data type.

Note that each input meets these requirements: input_1 and input_2 each contain 10 rows, each row is a flat List, and every value is a float.
For more details, see Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX.
# Fix the seed so the generated values are reproducible.
np.random.seed(1)
# 10 rows of 10 values for input_1, 10 rows of 5 values for input_2.
mock_inference_data = [np.random.rand(10, 10), np.random.rand(10, 5)]
mock_dataframe = pd.DataFrame(
    {
        "input_1": mock_inference_data[0].tolist(),
        "input_2": mock_inference_data[1].tolist(),
    }
)
display(mock_dataframe)
input_1 | input_2 | |
---|---|---|
0 | [0.417022004702574, 0.7203244934421581, 0.0001... | [0.32664490177209615, 0.5270581022576093, 0.88... |
1 | [0.4191945144032948, 0.6852195003967595, 0.204... | [0.6233601157918027, 0.015821242846556283, 0.9... |
2 | [0.8007445686755367, 0.9682615757193975, 0.313... | [0.17234050834532855, 0.13713574962887776, 0.9... |
3 | [0.0983468338330501, 0.42110762500505217, 0.95... | [0.7554630526024664, 0.7538761884612464, 0.923... |
4 | [0.9888610889064947, 0.7481656543798394, 0.280... | [0.01988013383979559, 0.026210986877719278, 0.... |
5 | [0.019366957870297075, 0.678835532939891, 0.21... | [0.5388310643416528, 0.5528219786857659, 0.842... |
6 | [0.10233442882782584, 0.4140559878195683, 0.69... | [0.5857592714582879, 0.9695957483196745, 0.561... |
7 | [0.9034019152878835, 0.13747470414623753, 0.13... | [0.23297427384102043, 0.8071051956187791, 0.38... |
8 | [0.8833060912058098, 0.6236722070556089, 0.750... | [0.5562402339904189, 0.13645522566068502, 0.05... |
9 | [0.11474597295337519, 0.9494892587070712, 0.44... | [0.1074941291060929, 0.2257093386078547, 0.712... |
We now perform an inference with our sample DataFrame via the wallaroo.pipeline.infer method. The returned DataFrame displays the input variables as in.{variable_name} and the output variables as out.{variable_name}. Each inference output row corresponds with an input row.
results = pipeline.infer(mock_dataframe)
results
time | in.input_1 | in.input_2 | out.output_1 | out.output_2 | check_failures | |
---|---|---|---|---|---|---|
0 | 2023-11-22 16:27:10.632 | [0.4170220047, 0.7203244934, 0.0001143748, 0.3... | [0.3266449018, 0.5270581023, 0.8859420993, 0.3... | [-0.16188532, -0.2735075, -0.10427341] | [-0.18745898, -0.035904408] | 0 |
1 | 2023-11-22 16:27:10.632 | [0.4191945144, 0.6852195004, 0.2044522497, 0.8... | [0.6233601158, 0.0158212428, 0.9294372337, 0.6... | [-0.16437894, -0.24449202, -0.10489924] | [-0.17241219, -0.09285815] | 0 |
2 | 2023-11-22 16:27:10.632 | [0.8007445687, 0.9682615757, 0.3134241782, 0.6... | [0.1723405083, 0.1371357496, 0.932595463, 0.69... | [-0.1431846, -0.33338487, -0.1858185] | [-0.25035447, -0.095617786] | 0 |
3 | 2023-11-22 16:27:10.632 | [0.0983468338, 0.421107625, 0.9578895302, 0.53... | [0.7554630526, 0.7538761885, 0.9230245355, 0.7... | [-0.21010575, -0.38097042, -0.26413786] | [-0.081432916, -0.12933002] | 0 |
4 | 2023-11-22 16:27:10.632 | [0.9888610889, 0.7481656544, 0.2804439921, 0.7... | [0.0198801338, 0.0262109869, 0.028306488, 0.24... | [-0.29807547, -0.362104, -0.04459526] | [-0.23403212, 0.019275911] | 0 |
5 | 2023-11-22 16:27:10.632 | [0.0193669579, 0.6788355329, 0.211628116, 0.26... | [0.5388310643, 0.5528219787, 0.8420308924, 0.1... | [-0.14283556, -0.29290834, -0.1613777] | [-0.20929304, -0.10064016] | 0 |
6 | 2023-11-22 16:27:10.632 | [0.1023344288, 0.4140559878, 0.6944001577, 0.4... | [0.5857592715, 0.9695957483, 0.5610302193, 0.0... | [-0.2372348, -0.29803842, -0.17791237] | [-0.20062584, -0.026013546] | 0 |
7 | 2023-11-22 16:27:10.632 | [0.9034019153, 0.1374747041, 0.1392763473, 0.8... | [0.2329742738, 0.8071051956, 0.3878606441, 0.8... | [-0.27525327, -0.46431914, -0.2719731] | [-0.17208403, -0.1618222] | 0 |
8 | 2023-11-22 16:27:10.632 | [0.8833060912, 0.6236722071, 0.750942434, 0.34... | [0.556240234, 0.1364552257, 0.0599176895, 0.12... | [-0.3599869, -0.37006766, 0.05214046] | [-0.26465484, 0.08243461] | 0 |
9 | 2023-11-22 16:27:10.632 | [0.114745973, 0.9494892587, 0.4499121335, 0.57... | [0.1074941291, 0.2257093386, 0.7129889804, 0.5... | [-0.20812269, -0.3822521, -0.14788152] | [-0.19157144, -0.12436578] | 0 |
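The results are a standard pandas DataFrame, so the output fields can be worked with directly. For example, using the column names shown in the results above:
# Pull the first output field; each entry is the List of values
# produced for the corresponding input row.
output_1 = results["out.output_1"]
display(output_1[0])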
With the tutorial complete, we undeploy the pipeline and return the resources to the cluster.
pipeline.undeploy()