Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX

How to upload and use ONNX ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must lower case ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo ONNX Requirements

Wallaroo natively supports Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

ParameterDescription
Web Sitehttps://onnx.ai/
Supported LibrariesSee table below.
FrameworkFramework.ONNX aka onnx
RuntimeNative aka onnx

The following ONNX versions models are supported:

Wallaroo VersionONNX VersionONNX IR VersionONNX OPset VersionONNX ML Opset Version
2023.4.0 (October 2023)1.12.18173
2023.2.1 (July 2023)1.12.18173

For the most recent release of Wallaroo 2023.4.0, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the Wallaroo Native Runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inference performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]

In this example each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
])

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
 ts
0[2.35, 5.75, 19.2][Bob, Nancy, Wani]
1[5.55, 7.2, 15.7][Jason, Rita, Phoebe]

Upload ONNX Model to Wallaroo

Open Neural Network eXchange(ONNX) is the default model runtime supported by Wallaroo. ONNX models are uploaded to the current workspace through the Wallaroo Client upload_model(name, path, framework, input_schema, output_schema).configure(options). When uploading a default ML Model that matches the default Wallaroo runtime, the configure(options) can be left empty or the framework onnx specified.

Uploading ONNX Models

ONNX models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload ONNX Model Parameters

The following parameters are required for ONNX models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a ONNX model to Wallaroo.

For ONNX models, the input_schema and output_schema are not required so are not listed here.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Required)Set as the Framework.ONNX.
input_schemapyarrow.lib.Schema (Optional)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Optional)The output schema in Apache Arrow schema format.
convert_waitbool (Optional) (Default: True)Not required for native runtimes.
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.
archwallaroo.engine_config.ArchitectureThe architecture the model is deployed to. If a model is intended for deployment to an ARM architecture, it must be specified during this step. Values include: X86 (Default): x86 based architectures. ARM: ARM based architectures.

Model Config Options

Model version configurations are updated with the wallaroo.model_version.config and include the following parameters. Most are optional unless specified.

ParameterTypeDescription
runtimeString (Optional)The model runtime from wallaroo.framework. }
tensor_fields(List[string]) (Optional)A list of alternate input fields. For example, if the model accepts the input fields ['variable1', 'variable2'], tensor_fields allows those inputs to be overridden to ['square_feet', 'house_age'], or other values as required.
input_schemapyarrow.lib.SchemaThe input schema for the model in pyarrow.lib.Schema format.
output_schemapyarrow.lib.SchemaThe output schema for the model in pyarrow.lib.Schema format.
batch_config(List[string]) (Optional)Batch config is either None for multiple-input inferences, or single to accept an inference request with only one row of data.

ONNX Model Inputs

By default, inferencing in Wallaroo uses the same input fields as the ONNX model. This is overwritten with the wallaroo.model.configure(tensor_fields=List[String]) method to change the model input fields to match the tensor_field List.

  • IMPORTANT NOTE: The tensor_field length must match the ONNX model’s input field’s list.

The following displays the input fields for ONNX models. Replace onnx_file_model_name with the path to the ONNX model file.

import onnx

onnx_file_model_name = './path/to/onnx/file/file.onnx'

model = onnx.load(onnx_file_model_name)
output =[node.name for node in model.graph.output]

input_all = [node.name for node in model.graph.input]
input_initializer =  [node.name for node in model.graph.initializer]
net_feed_input = list(set(input_all)  - set(input_initializer))

print('Inputs: ', net_feed_input)
print('Outputs: ', output)

Inputs:  ['dense_input']
Outputs:  ['dense_1']

The following Wallaroo upload will use the ONNX model’s default input.

from wallaroo.framework import Framework
model = (wl.upload_model(model_name, 
                         model_file_name
                        )
        )

pipeline.add_model_step(model)

deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)

smoke_test = pd.DataFrame.from_records([
    {
        "dense_input":[
            1.0678324729,
            0.2177810266,
            -1.7115145262,
            0.682285721,
            1.0138553067,
            -0.4335000013,
            0.7395859437,
            -0.2882839595,
            -0.447262688,
            0.5146124988,
            0.3791316964,
            0.5190619748,
            -0.4904593222,
            1.1656456469,
            -0.9776307444,
            -0.6322198963,
            -0.6891477694,
            0.1783317857,
            0.1397992467,
            -0.3554220649,
            0.4394217877,
            1.4588397512,
            -0.3886829615,
            0.4353492889,
            1.7420053483,
            -0.4434654615,
            -0.1515747891,
            -0.2668451725,
            -1.4549617756
        ]
    }
])
result = pipeline.infer(smoke_test)
display(result)
 timein.dense_inputout.dense_1check_failures
02023-10-17 16:13:56.169[1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756][0.0014974177]0

The following uses the tensor_field parameter on the model upload to change the input to tensor.

from wallaroo.framework import Framework
model = (wl.upload_model(model_name, 
                         model_file_name, 
                         framework=Framework.ONNX)
                         .configure(tensor_fields=["tensor"]
                        )
        )

pipeline.add_model_step(model)

deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)

smoke_test = pd.DataFrame.from_records([
    {
        "tensor":[
            1.0678324729,
            0.2177810266,
            -1.7115145262,
            0.682285721,
            1.0138553067,
            -0.4335000013,
            0.7395859437,
            -0.2882839595,
            -0.447262688,
            0.5146124988,
            0.3791316964,
            0.5190619748,
            -0.4904593222,
            1.1656456469,
            -0.9776307444,
            -0.6322198963,
            -0.6891477694,
            0.1783317857,
            0.1397992467,
            -0.3554220649,
            0.4394217877,
            1.4588397512,
            -0.3886829615,
            0.4353492889,
            1.7420053483,
            -0.4434654615,
            -0.1515747891,
            -0.2668451725,
            -1.4549617756
        ]
    }
])
result = pipeline.infer(smoke_test)
display(result)
 timein.tensorout.dense_1check_failures
02023-10-17 16:13:56.169[1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756][0.0014974177]0

Upload ONNX Model Return

The following is returned with a successful model upload and conversion.

FieldTypeDescription
namestringThe name of the model.
versionstringThe model version as a unique UUID.
file_namestringThe file name of the model as stored in Wallaroo.
SHAstringThe hash value of the model file.
StatusstringThe status of the model. Values include:
image_pathstringThe image used to deploy the model in the Wallaroo engine.
last_update_timeDateTimeWhen the model was last updated.

For example:

model_name = "embedder-o"
model_path = "./embedder.onnx"

embedder = wl.upload_model(model_name, model_path, Framework=Framework.ONNX).configure("onnx")

ONNX Conversion Tips

When converting from one ML model type to an ONNX ML model, the input and output fields should be specified so users anticipate the exact field names used in their code. This prevents conversion naming formats from creating unintended names, and sets consistent field names that can be relied upon in future code updates.

The following example shows naming the input and output names when converting from a PyTorch model to an ONNX model. Note that the input fields are set to data, and the output fields are set to output_names = ["bounding-box", "classification","confidence"].

input_names = ["data"]
output_names = ["bounding-box", "classification","confidence"]
torch.onnx.export(model,
                    tensor,
                    pytorchModelPath+'.onnx',
                    input_names=input_names,
                    output_names=output_names,
                    opset_version=17,
                    )

See the documentation for the specific ML model being converting from to ONNX for complete details.

Pipeline Deployment Configurations

Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.

This model will always run in the native runtime space.

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()