Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX
How to upload and use ONNX ML Models with Wallaroo
ML models are either uploaded as files or registered from container registry services to a Wallaroo Ops workspace through the Wallaroo SDK or the Wallaroo MLOps API.
Once a ML model is added to a workspace, it is prepared for deployment based on the model’s runtime.
ML models uploaded to Wallaroo run in one of two runtimes: the Wallaroo Native Runtime or the Wallaroo Containerized Runtime.
The following is a short guide on using the Wallaroo SDK to upload, register, and alter model configurations. For complete details, see either the Wallaroo SDK or the Wallaroo MLOps API guides.
The following ML model frameworks are supported by Wallaroo. Frameworks fall under either the Wallaroo Native Runtime or the Wallaroo Containerized Runtime in the Wallaroo engine. For full details on each framework, including which runtime a specific model framework runs in, see the Wallaroo Model Frameworks guide.
Submitted data must match the data types the model's schemas define; for example, if a model expects pyarrow.float32(), submitting a pyarrow.float64() may cause an error.

The Wallaroo Model Runtime is displayed after a model is uploaded with the wallaroo.model.config().runtime() method. The following table displays the type of runtime associated with each possible display.
Runtime Display | Model Runtime Space |
---|---|
tensorflow | Native |
onnx | Native |
python | Native |
mlflow | Containerized |
flight | Containerized |
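For scripts that branch on the reported runtime, the table above can be expressed as a small lookup. This is a sketch based only on the values listed in this guide; the `RUNTIME_SPACE` dict and `is_native` helper are illustrative, not part of the Wallaroo SDK:

```python
# Runtime display values and their runtime space, per the table above
RUNTIME_SPACE = {
    "tensorflow": "Native",
    "onnx": "Native",
    "python": "Native",
    "mlflow": "Containerized",
    "flight": "Containerized",
}

def is_native(runtime_display: str) -> bool:
    """Return True when the display value maps to a Wallaroo Native Runtime."""
    return RUNTIME_SPACE.get(runtime_display) == "Native"

print(is_native("onnx"))    # True
print(is_native("flight"))  # False
```

The display value itself comes from `model.config().runtime()` after upload.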
Model | Wallaroo Framework | Runtime | Supported Versions |
---|---|---|---|
ONNX | Framework.ONNX aka onnx | Native aka onnx | 1.12.1 |
Tensorflow | Framework.TENSORFLOW aka tensorflow | Native aka tensorflow | SavedModel format as .zip file |
Python | Framework.PYTHON aka python | Native aka python | python==3.8 |
Model | Wallaroo Frameworks | Runtime(s) |
---|---|---|
Arbitrary Python (BYOP) | Framework.CUSTOM aka custom | Containerized aka flight |
Hugging Face | Multiple hugging-face-* frameworks (see the framework list below) | Containerized aka flight |
PyTorch | Framework.PYTORCH aka pytorch | Native onnx and Containerized flight |
SKLearn | Framework.SKLEARN aka sklearn | Native onnx and Containerized flight |
Tensorflow keras | Framework.KERAS aka keras | Native onnx and Containerized flight |
XGBoost | Framework.XGBOOST aka xgboost | Native onnx and Containerized flight |
MLFlow | N/A^* | Containerized aka mlflow |
Wallaroo frameworks are listed from the wallaroo.framework.Framework
class. The following demonstrates listing all available supported frameworks.
from wallaroo.framework import Framework
[e.value for e in Framework]
['onnx',
'tensorflow',
'python',
'keras',
'sklearn',
'pytorch',
'xgboost',
'hugging-face-feature-extraction',
'hugging-face-image-classification',
'hugging-face-image-segmentation',
'hugging-face-image-to-text',
'hugging-face-object-detection',
'hugging-face-question-answering',
'hugging-face-stable-diffusion-text-2-img',
'hugging-face-summarization',
'hugging-face-text-classification',
'hugging-face-translation',
'hugging-face-zero-shot-classification',
'hugging-face-zero-shot-image-classification',
'hugging-face-zero-shot-object-detection',
'hugging-face-sentiment-analysis',
'hugging-face-text-generation']
ML Models are uploaded to Wallaroo Ops through the wallaroo.client.upload_model
method.
wallaroo.client.upload_model
has the following parameters.
Parameter | Type | Description |
---|---|---|
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
path | string (Required) | The path to the model file being uploaded. |
framework | string (Required) | The framework of the model from wallaroo.framework |
input_schema | pyarrow.lib.Schema | The input schema in Apache Arrow schema format. |
output_schema | pyarrow.lib.Schema | The output schema in Apache Arrow schema format. |
convert_wait | bool (Optional) | Whether to wait in the script for the model conversion to complete before returning. Not required for native runtimes. |
arch | wallaroo.engine_config.Architecture (Optional) | The architecture the model is deployed to. If a model is intended for deployment to an ARM architecture, it must be specified during this step. Values include X86 (the default) and ARM. |
wallaroo.client.upload_model
returns the model version. The model version refers to the version of the model object in Wallaroo. In Wallaroo, a model version update happens when we upload a new model file (artifact) against the same model object name.
Field | Type | Description |
---|---|---|
id | Integer | The numerical identifier of the model version. |
name | string | The name of the model. |
version | string | The model version as a unique UUID. |
file_name | string | The file name of the model as stored in Wallaroo. |
image_path | string | The image used to deploy the model in the Wallaroo engine. |
last_update_time | DateTime | When the model was last updated. |
The following examples demonstrate uploading different model types.
The following demonstrates uploading an ONNX model to a Wallaroo Ops instance. See the Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX for full details on uploading ONNX models and model configurations.
The first demonstration shows uploading an ONNX model with the default input fields. The second demonstration shows uploading an ONNX model and overriding the default input field names.
# determine the model's input and output fields
import onnx
onnx_file_model_name = './path/to/onnx/file/file.onnx'
model = onnx.load(onnx_file_model_name)
output = [node.name for node in model.graph.output]
input_all = [node.name for node in model.graph.input]
input_initializer = [node.name for node in model.graph.initializer]
net_feed_input = list(set(input_all) - set(input_initializer))
print('Inputs: ', net_feed_input)
print('Outputs: ', output)
Inputs: ['dense_input']
Outputs: ['dense_1']
The following Wallaroo upload will use the ONNX model’s default input.
import wallaroo
import pandas as pd

from wallaroo.framework import Framework

model = wl.upload_model(model_name,
                        model_file_name,
                        framework=Framework.ONNX)

pipeline.add_model_step(model)

deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)
smoke_test = pd.DataFrame.from_records([
{
"dense_input":[
1.0678324729,
0.2177810266,
-1.7115145262,
0.682285721,
1.0138553067,
-0.4335000013,
0.7395859437,
-0.2882839595,
-0.447262688,
0.5146124988,
0.3791316964,
0.5190619748,
-0.4904593222,
1.1656456469,
-0.9776307444,
-0.6322198963,
-0.6891477694,
0.1783317857,
0.1397992467,
-0.3554220649,
0.4394217877,
1.4588397512,
-0.3886829615,
0.4353492889,
1.7420053483,
-0.4434654615,
-0.1515747891,
-0.2668451725,
-1.4549617756
]
}
])
result = pipeline.infer(smoke_test)
display(result)
time | in.dense_input | out.dense_1 | anomaly.count | |
---|---|---|---|---|
0 | 2023-10-17 16:13:56.169 | [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756] | [0.0014974177] | 0 |
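Inference results come back as a DataFrame with in.*, out.*, and anomaly.* columns. The following sketch pulls the scalar prediction out of the list-valued output column; it uses a mock result frame in place of a live pipeline.infer call:

```python
import pandas as pd

# Mock of the result frame shown above (a live call would use pipeline.infer)
result = pd.DataFrame({
    "time": ["2023-10-17 16:13:56.169"],
    "out.dense_1": [[0.0014974177]],
    "anomaly.count": [0],
})

# Each out.* cell holds a list of output values; take the first element
prediction = result["out.dense_1"].iloc[0][0]
print(prediction)  # 0.0014974177
```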
The following uses the tensor_fields parameter on the model upload to change the input field name to tensor.
from wallaroo.framework import Framework

model = (wl.upload_model(model_name,
                         model_file_name,
                         framework=Framework.ONNX)
         .configure(tensor_fields=["tensor"])
        )

pipeline.add_model_step(model)

deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)
pipeline.add_model_step(model)
deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)
smoke_test = pd.DataFrame.from_records([
{
"tensor":[
1.0678324729,
0.2177810266,
-1.7115145262,
0.682285721,
1.0138553067,
-0.4335000013,
0.7395859437,
-0.2882839595,
-0.447262688,
0.5146124988,
0.3791316964,
0.5190619748,
-0.4904593222,
1.1656456469,
-0.9776307444,
-0.6322198963,
-0.6891477694,
0.1783317857,
0.1397992467,
-0.3554220649,
0.4394217877,
1.4588397512,
-0.3886829615,
0.4353492889,
1.7420053483,
-0.4434654615,
-0.1515747891,
-0.2668451725,
-1.4549617756
]
}
])
result = pipeline.infer(smoke_test)
display(result)
time | in.tensor | out.dense_1 | anomaly.count | |
---|---|---|---|---|
0 | 2023-10-17 16:13:56.169 | [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756] | [0.0014974177] | 0 |
The following demonstrates uploading a PyTorch model to a Wallaroo Ops instance. In this example, the ML model is converted to the Wallaroo Native Runtime.
import pyarrow as pa

from wallaroo.framework import Framework

input_schema = pa.schema(
[
pa.field('input', pa.list_(pa.float32(), list_size=10))
]
)
output_schema = pa.schema(
[
pa.field('output', pa.list_(pa.float32(), list_size=1))
]
)
model = wl.upload_model('pt-single-io-model',
"./models/model-auto-conversion_pytorch_single_io_model.pt",
framework=Framework.PYTORCH,
input_schema=input_schema,
output_schema=output_schema
)
display(model)
Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a native runtime..
Ready
model.config().runtime()
'onnx'
The following example demonstrates uploading a BYOP model. After it is uploaded, it is converted to a Wallaroo Containerized Runtime.
import pyarrow as pa

from wallaroo.framework import Framework

input_schema = pa.schema([
pa.field('images', pa.list_(
pa.list_(
pa.list_(
pa.int64(),
list_size=3
),
list_size=32
),
list_size=32
)),
])
output_schema = pa.schema([
pa.field('predictions', pa.int64()),
])
model = wl.upload_model('vgg16-clustering',
'./models/model-auto-conversion-BYOP-vgg16-clustering.zip',
framework=Framework.CUSTOM,
input_schema=input_schema,
output_schema=output_schema,
convert_wait=True)
Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a container runtime..
Model is attempting loading to a container runtime..........................successful
Ready
model.config().runtime()
'flight'
Parameter | Description |
---|---|
Web Site | https://mlflow.org |
Supported Libraries | mlflow==1.30.0 |
Runtime | Containerized aka mlflow |
For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.
This guide details how to add ML Models from a model registry service into Wallaroo.
Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.
Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.
As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.
Containerized MLFlow models are not uploaded, but registered from a container registry service. This is performed through the wallaroo.client.register_model_image(options) and wallaroo.model_version.configure(options) methods.
The following parameters must be set for wallaroo.client.register_model_image(options)
and wallaroo.model_version.configure(options)
for a Containerized MLFlow model to be registered in Wallaroo.
Parameter | Type | Description |
---|---|---|
model_name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
image | string (Required) | The URL to the containerized MLFlow model in the MLFlow Registry. |
Model version configurations are updated with the wallaroo.model_version.ModelVersion.configure() method and include the following parameters. Most are optional unless specified.
Parameter | Type | Description |
---|---|---|
runtime | String (Optional) | The model runtime from wallaroo.framework, plus mlflow for MLFlow containerized model registrations. |
tensor_fields | (List[string]) (Optional) | A list of alternate input fields. For example, if the model accepts the input fields ['variable1', 'variable2'] , tensor_fields allows those inputs to be overridden to ['square_feet', 'house_age'] , or other values as required. |
input_schema | pyarrow.lib.Schema | The input schema for the model in pyarrow.lib.Schema format. |
output_schema | pyarrow.lib.Schema | The output schema for the model in pyarrow.lib.Schema format. |
batch_config | (List[string]) (Optional) | Batch config is either None for multiple-input inferences, or single to accept an inference request with only one row of data. |
For model version configuration for MLFlow models, the following must be defined:
- runtime: Set as mlflow.
- input_schema: The input schema in the Apache Arrow pyarrow.lib.Schema format.
- output_schema: The output schema in the Apache Arrow pyarrow.lib.Schema format.

wallaroo.client.register_model_image(options) returns the model version. The model version refers to the version of the model object in Wallaroo. In Wallaroo, a model version update happens when we upload a new model file (artifact) against the same model object name.
Field | Type | Description |
---|---|---|
id | Integer | The numerical identifier of the model version. |
name | string | The name of the model. |
version | string | The model version as a unique UUID. |
file_name | string | The file name of the model as stored in Wallaroo. |
image_path | string | The image used to deploy the model in the Wallaroo engine. |
last_update_time | DateTime | When the model was last updated. |
The following example demonstrates registering a Statsmodels model stored in an MLFlow container with a Wallaroo instance.
sm_input_schema = pa.schema([
pa.field('temp', pa.float32()),
pa.field('holiday', pa.uint8()),
pa.field('workingday', pa.uint8()),
pa.field('windspeed', pa.float32())
])
sm_output_schema = pa.schema([
pa.field('predicted_mean', pa.float32())
])
sm_model = wl.register_model_image(
name="mlflow-statmodels",
image="ghcr.io/wallaroolabs/wallaroo_tutorials/mlflow-statsmodels-example:2023.1"
).configure("mlflow",
input_schema=sm_input_schema,
output_schema=sm_output_schema
)
sm_model
Name | mlflowstatmodels |
---|---|
Version | eb1bcec8-63fe-4a82-98ea-fc4945786973 |
File Name | none |
SHA | 3afd13d9c5070679e284050cd099e84aa2e5cb7c08a788b21d6cb2397615d018 |
Status | ready |
Image Path | ghcr.io/wallaroolabs/wallaroo_tutorials/mlflow-statsmodels-example:2023.1 |
Architecture | None |
Updated At | 2024-30-Jan 16:11:55 |
When using containerized MLFlow models with Wallaroo, the inputs and outputs must be named. For example, the following output:
[-12.045839810372835]
Would need to be wrapped with the data values named:
[{"prediction": -12.045839810372835}]
A short sample of code for wrapping the data may be:
def wrap_predictions(prediction):
    output_df = pd.DataFrame(prediction, columns=["prediction"])
    return output_df
The model version's configuration defines how the model is used in the Wallaroo Inference Engine. Settings include the model runtime, input and output schemas, tensor field overrides, and batch configuration.
The model version configuration is retrieved with the method wallaroo.model_version.ModelVersion.config()
.
This method takes no parameters.
The method wallaroo.model_version.ModelVersion.config()
returns wallaroo.model_config.ModelConfig
. The following fields are part of the model config object.
Method | Return Type | Description |
---|---|---|
id() | Integer | The id of the model version the configuration is assigned to. |
to_yaml() | String | A YAML output of the model configuration options that are not None. |
tensor_fields() | List[String] | A list of tensor field names that override the default model fields. Only applies to onnx models. |
model_version() | wallaroo.model_version.ModelVersion | The model version the model configuration is assigned to. |
runtime() | String | The model runtime as defined by wallaroo.framework.Framework. |
The following example retrieves a model version and displays the fields from the methods listed above.
import wallaroo
print(sample_model.versions()[-1])
model_config = sample_model.versions()[-1].config()
print(model_config.id())
2
print(model_config.tensor_fields())
None
print(model_config.model_version())
{'name': 'pt-unet', 'version': '9bbb0039-074b-4a49-8c3a-e86a543dcbb7', 'file_name': 'unet.pt', 'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.4.1-4351', 'arch': None, 'last_update_time': datetime.datetime(2024, 1, 18, 20, 2, 44, 866594, tzinfo=tzutc())}
print(model_config.runtime())
flight
print(model_config.to_yaml())
model_id: pt-unet
model_version: 9bbb0039-074b-4a49-8c3a-e86a543dcbb7
runtime: flight
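Because the to_yaml() output above is flat key: value text, it can be turned into a dictionary with a few lines of standard-library code. This is a sketch only; reach for a real YAML parser for anything nested:

```python
# The to_yaml() output shown above, reproduced as a string
yaml_text = """model_id: pt-unet
model_version: 9bbb0039-074b-4a49-8c3a-e86a543dcbb7
runtime: flight"""

# Split each "key: value" line once on the first ": "
config = dict(line.split(": ", 1) for line in yaml_text.splitlines())
print(config["runtime"])   # flight
print(config["model_id"])  # pt-unet
```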
The model version's configuration defines how the model is used in the Wallaroo Inference Engine. Settings include the model runtime, input and output schemas, tensor field overrides, and batch configuration.
Model version configurations are set with the wallaroo.model_version.ModelVersion.configure() method, which accepts the following parameters. Most are optional unless specified.
Parameter | Type | Description |
---|---|---|
runtime | String (Optional) | The model runtime from wallaroo.framework, plus mlflow for MLFlow containerized model registrations. |
tensor_fields | (List[string]) (Optional) | A list of alternate input fields. For example, if the model accepts the input fields ['variable1', 'variable2'] , tensor_fields allows those inputs to be overridden to ['square_feet', 'house_age'] , or other values as required. |
input_schema | pyarrow.lib.Schema | The input schema for the model in pyarrow.lib.Schema format. |
output_schema | pyarrow.lib.Schema | The output schema for the model in pyarrow.lib.Schema format. |
batch_config | (List[string]) (Optional) | Batch config is either None for multiple-input inferences, or single to accept an inference request with only one row of data. |
wallaroo.model_version.ModelVersion.configure()
returns the wallaroo.model_version.ModelVersion
the model configuration is assigned to.
The following example updates the input and output schemas for a model version.
import wallaroo
print(sample_model)
{'name': 'pt-unet', 'versions': 2, 'owner_id': '""', 'last_update_time': datetime.datetime(2024, 1, 18, 19, 58, 12, 603652, tzinfo=tzutc()), 'created_at': datetime.datetime(2024, 1, 18, 19, 57, 0, 984506, tzinfo=tzutc())}
sample_model_version = sample_model.versions()[-1]
sm_input_schema = pa.schema([
pa.field('temp', pa.float32()),
pa.field('holiday', pa.uint8()),
pa.field('workingday', pa.uint8()),
pa.field('windspeed', pa.float32())
])
sm_output_schema = pa.schema([
pa.field('predicted_mean', pa.float32())
])
sample_model_version.configure(input_schema=sm_input_schema, output_schema=sm_output_schema)
Name | pt-unet |
---|---|
Version | 9bbb0039-074b-4a49-8c3a-e86a543dcbb7 |
File Name | unet.pt |
SHA | dfcd4b092e05564c36d28f1dfa7293f4233a384d81fe345c568b6bb68cafb0c8 |
Status | ready |
Image Path | proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.4.1-4351 |
Architecture | None |
Updated At | 2024-18-Jan 20:02:44 |