Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX

How to upload and use ONNX ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. Model names must consist of ASCII alphanumeric characters or dashes (-) only; periods (.) and underscores (_) are not allowed.
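The rule above can be checked locally before an upload attempt. The helper below is not part of the Wallaroo SDK; it is a minimal sketch assuming lowercase DNS-label conventions (alphanumerics and dashes, starting and ending with an alphanumeric).

```python
import re

# Hypothetical helper (not part of the Wallaroo SDK): checks that a model
# name uses only ASCII alphanumerics and dashes, per DNS-label conventions.
MODEL_NAME_RE = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")

def is_valid_model_name(name: str) -> bool:
    return bool(MODEL_NAME_RE.match(name))

print(is_valid_model_name("embedder-o"))    # True: dashes are allowed
print(is_valid_model_name("embedder.onnx")) # False: '.' is not allowed
print(is_valid_model_name("my_model"))      # False: '_' is not allowed
```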

Wallaroo natively supports Open Neural Network Exchange (ONNX) models in the Wallaroo engine.

Parameter Description
Web Site https://onnx.ai/
Supported Libraries See table below.
Framework Framework.ONNX aka onnx
Runtime Native aka onnx

The following ONNX model versions are supported:

Wallaroo Version ONNX Version ONNX IR Version ONNX OPset Version ONNX ML Opset Version
2023.2.1 (July 2023) 1.12.1 8 17 3
2023.2 (May 2023) 1.12.1 8 17 3
2023.1 (March 2023) 1.12.1 8 17 3
2022.4 (December 2022) 1.12.1 8 17 3
After April 2022 until release 2022.4 (December 2022) 1.10.* 7 15 2
Before April 2022 1.6.* 7 13 2

For the most recent release of Wallaroo 2023.2.1, note the following when converting models to ONNX:

  • If converting another ML model (PyTorch, XGBoost, etc.) to ONNX using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Upload ONNX Model to Wallaroo

Open Neural Network eXchange (ONNX) is the default model runtime supported by Wallaroo. ONNX models are uploaded to the current workspace through the Wallaroo Client upload_model(name, path, framework, input_schema, output_schema).configure(options) method. When uploading a model that matches the default Wallaroo runtime, configure(options) can be left empty or the framework onnx specified.

Uploading ONNX Models

ONNX models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload ONNX Model Parameters

The following parameters are used for ONNX models. Note that while some fields are considered optional for the upload_model method, they are required for proper uploading of an ONNX model to Wallaroo.

For ONNX models, the input_schema and output_schema parameters are optional.

Parameter Type Description
name string (Required) The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path string (Required) The path to the model file being uploaded.
framework string (Required) Set as Framework.ONNX.
input_schema pyarrow.lib.Schema (Optional) The input schema in Apache Arrow schema format.
output_schema pyarrow.lib.Schema (Optional) The output schema in Apache Arrow schema format.
convert_wait bool (Optional) (Default: True) Not required for native runtimes.
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload ONNX Model Return

The following is returned with a successful model upload and conversion.

Field Type Description
name string The name of the model.
version string The model version as a unique UUID.
file_name string The file name of the model as stored in Wallaroo.
SHA string The hash value of the model file.
Status string The status of the model.
image_path string The image used to deploy the model in the Wallaroo engine.
last_update_time DateTime When the model was last updated.

For example:

from wallaroo.framework import Framework

model_name = "embedder-o"
model_path = "./embedder.onnx"

embedder = wl.upload_model(model_name, model_path, framework=Framework.ONNX).configure("onnx")

ONNX Conversion Tips

When converting from one ML model type to an ONNX ML model, the input and output fields should be specified so users can anticipate the exact field names used in their code. This prevents conversion naming formats from creating unintended names, and sets consistent field names that can be relied upon in future code updates.

The following example shows naming the input and output names when converting from a PyTorch model to an ONNX model. Note that the input fields are set to data, and the output fields are set to output_names = ["bounding-box", "classification","confidence"].

input_names = ["data"]
output_names = ["bounding-box", "classification","confidence"]
torch.onnx.export(model,
                    tensor,
                    pytorchModelPath+'.onnx',
                    input_names=input_names,
                    output_names=output_names,
                    opset_version=17,
                    )

See the documentation for the specific ML model being converted to ONNX for complete details.

Pipeline Deployment Configurations

Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.

ONNX models always run in the native runtime space.

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = (DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .build())