Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX
How to upload and use ONNX ML Models with Wallaroo
ML models are either uploaded as files or registered from container registry services to a Wallaroo Ops workspace through the Wallaroo SDK or the Wallaroo MLOps API.
Once a ML model is added to a workspace, it is prepared for deployment based on the model’s runtime.
ML models uploaded to Wallaroo run in one of two runtimes: the Wallaroo Native Runtime or the Wallaroo Containerized Runtime.
The following is a short guide on using the Wallaroo SDK to upload, register, and alter model configurations. For complete details, see either the Wallaroo SDK or the Wallaroo MLOps API guides.
The following ML model frameworks are supported by Wallaroo. Frameworks fall under either the Wallaroo Native Runtime or the Wallaroo Containerized Runtime in the Wallaroo engine. For full details on each framework, including which runtime a specific model framework runs in, see the Wallaroo Model Frameworks guide.
Submitted data must match the data types the model's schemas define; for example, if a model expects pyarrow.float32(), submitting a pyarrow.float64() may cause an error.

The Wallaroo Model Runtime is displayed after a model is uploaded with the wallaroo.model.config().runtime() method. The following table displays the type of runtime associated with each possible display.
Runtime Display | Model Runtime Space |
---|---|
tensorflow | Native |
onnx | Native |
python | Native |
mlflow | Containerized |
flight | Containerized |
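For scripts that branch on the reported runtime, the table above can be expressed as a small lookup. This is a sketch based only on the values listed in this guide; the `RUNTIME_SPACE` dict and `is_native` helper are illustrative, not part of the Wallaroo SDK:

```python
# Runtime display values and their runtime space, per the table above
RUNTIME_SPACE = {
    "tensorflow": "Native",
    "onnx": "Native",
    "python": "Native",
    "mlflow": "Containerized",
    "flight": "Containerized",
}

def is_native(runtime_display: str) -> bool:
    """Return True when the display value maps to a Wallaroo Native Runtime."""
    return RUNTIME_SPACE.get(runtime_display) == "Native"

print(is_native("onnx"))    # True
print(is_native("flight"))  # False
```

The display value itself comes from `model.config().runtime()` after upload.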
Model | Wallaroo Framework | Runtime | Supported Versions |
---|---|---|---|
ONNX | Framework.ONNX aka onnx | Native aka onnx | 1.12.1 |
Tensorflow | Framework.TENSORFLOW aka tensorflow | Native aka tensorflow | SavedModel format as .zip file |
Python | Framework.PYTHON aka python | Native aka python | python==3.8 |
Model | Wallaroo Frameworks | Runtime(s) |
---|---|---|
Arbitrary Python (BYOP) | Framework.CUSTOM aka custom | Containerized aka flight |
Hugging Face | Multiple hugging-face-* frameworks (see the framework list below) | Containerized aka flight |
PyTorch | Framework.PYTORCH aka pytorch | Native onnx and Containerized flight |
SKLearn | Framework.SKLEARN aka sklearn | Native onnx and Containerized flight |
Tensorflow keras | Framework.KERAS aka keras | Native onnx and Containerized flight |
XGBoost | Framework.XGBOOST aka xgboost | Native onnx and Containerized flight |
MLFlow | N/A^* | Containerized aka mlflow |
Wallaroo frameworks are listed from the wallaroo.framework.Framework
class. The following demonstrates listing all available supported frameworks.
from wallaroo.framework import Framework
[e.value for e in Framework]
['onnx',
'tensorflow',
'python',
'keras',
'sklearn',
'pytorch',
'xgboost',
'hugging-face-feature-extraction',
'hugging-face-image-classification',
'hugging-face-image-segmentation',
'hugging-face-image-to-text',
'hugging-face-object-detection',
'hugging-face-question-answering',
'hugging-face-stable-diffusion-text-2-img',
'hugging-face-summarization',
'hugging-face-text-classification',
'hugging-face-translation',
'hugging-face-zero-shot-classification',
'hugging-face-zero-shot-image-classification',
'hugging-face-zero-shot-object-detection',
'hugging-face-sentiment-analysis',
'hugging-face-text-generation']
ML Models are uploaded to Wallaroo Ops through the wallaroo.client.upload_model
method.
wallaroo.client.upload_model
has the following parameters.
Parameter | Type | Description |
---|---|---|
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
path | string (Required) | The path to the model file being uploaded. |
framework | string (Required) | The framework of the model from wallaroo.framework |
input_schema | pyarrow.lib.Schema | The input schema in Apache Arrow schema format. |
output_schema | pyarrow.lib.Schema | The output schema in Apache Arrow schema format. |
convert_wait | bool (Optional) | Whether to wait in the script for the model conversion to complete before returning. Not required for native runtimes. |
arch | wallaroo.engine_config.Architecture (Optional) | The architecture the model is deployed to. If a model is intended for deployment to an ARM architecture, it must be specified during this step. Values include X86 (the default) and ARM. |
wallaroo.client.upload_model
returns the model version. The model version refers to the version of the model object in Wallaroo. In Wallaroo, a model version update happens when we upload a new model file (artifact) against the same model object name.
Field | Type | Description |
---|---|---|
id | Integer | The numerical identifier of the model version. |
name | string | The name of the model. |
version | string | The model version as a unique UUID. |
file_name | string | The file name of the model as stored in Wallaroo. |
image_path | string | The image used to deploy the model in the Wallaroo engine. |
last_update_time | DateTime | When the model was last updated. |
The following examples demonstrate uploading different model types.
The following demonstrates uploading an ONNX model to a Wallaroo Ops instance. See the Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX for full details on uploading ONNX models and model configurations.
The first demonstration shows uploading an ONNX model with the default input fields. The second demonstration shows uploading an ONNX model and overriding the default input field names.
# determine the model's input and output fields
import onnx
onnx_file_model_name = './path/to/onnx/file/file.onnx'
model = onnx.load(onnx_file_model_name)
output = [node.name for node in model.graph.output]
input_all = [node.name for node in model.graph.input]
input_initializer = [node.name for node in model.graph.initializer]
net_feed_input = list(set(input_all) - set(input_initializer))
print('Inputs: ', net_feed_input)
print('Outputs: ', output)
Inputs: ['dense_input']
Outputs: ['dense_1']
The following Wallaroo upload will use the ONNX model’s default input.
import wallaroo
import pandas as pd

from wallaroo.framework import Framework

model = wl.upload_model(model_name,
                        model_file_name,
                        framework=Framework.ONNX)

pipeline.add_model_step(model)

deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)
smoke_test = pd.DataFrame.from_records([
{
"dense_input":[
1.0678324729,
0.2177810266,
-1.7115145262,
0.682285721,
1.0138553067,
-0.4335000013,
0.7395859437,
-0.2882839595,
-0.447262688,
0.5146124988,
0.3791316964,
0.5190619748,
-0.4904593222,
1.1656456469,
-0.9776307444,
-0.6322198963,
-0.6891477694,
0.1783317857,
0.1397992467,
-0.3554220649,
0.4394217877,
1.4588397512,
-0.3886829615,
0.4353492889,
1.7420053483,
-0.4434654615,
-0.1515747891,
-0.2668451725,
-1.4549617756
]
}
])
result = pipeline.infer(smoke_test)
display(result)
time | in.dense_input | out.dense_1 | anomaly.count | |
---|---|---|---|---|
0 | 2023-10-17 16:13:56.169 | [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756] | [0.0014974177] | 0 |
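Inference results come back as a DataFrame with in.*, out.*, and anomaly.* columns. The following sketch pulls the scalar prediction out of the list-valued output column; it uses a mock result frame in place of a live pipeline.infer call:

```python
import pandas as pd

# Mock of the result frame shown above (a live call would use pipeline.infer)
result = pd.DataFrame({
    "time": ["2023-10-17 16:13:56.169"],
    "out.dense_1": [[0.0014974177]],
    "anomaly.count": [0],
})

# Each out.* cell holds a list of output values; take the first element
prediction = result["out.dense_1"].iloc[0][0]
print(prediction)  # 0.0014974177
```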
The following uses the tensor_fields parameter on the model upload to change the input field name to tensor.
from wallaroo.framework import Framework

model = (wl.upload_model(model_name,
                         model_file_name,
                         framework=Framework.ONNX)
         .configure(tensor_fields=["tensor"])
        )

pipeline.add_model_step(model)

deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)
pipeline.add_model_step(model)
deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)
smoke_test = pd.DataFrame.from_records([
{
"tensor":[
1.0678324729,
0.2177810266,
-1.7115145262,
0.682285721,
1.0138553067,
-0.4335000013,
0.7395859437,
-0.2882839595,
-0.447262688,
0.5146124988,
0.3791316964,
0.5190619748,
-0.4904593222,
1.1656456469,
-0.9776307444,
-0.6322198963,
-0.6891477694,
0.1783317857,
0.1397992467,
-0.3554220649,
0.4394217877,
1.4588397512,
-0.3886829615,
0.4353492889,
1.7420053483,
-0.4434654615,
-0.1515747891,
-0.2668451725,
-1.4549617756
]
}
])
result = pipeline.infer(smoke_test)
display(result)
time | in.tensor | out.dense_1 | anomaly.count | |
---|---|---|---|---|
0 | 2023-10-17 16:13:56.169 | [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756] | [0.0014974177] | 0 |
The following demonstrates uploading a PyTorch model to a Wallaroo Ops instance. In this example, the ML model is converted to the Wallaroo Native Runtime.
import pyarrow as pa

from wallaroo.framework import Framework

input_schema = pa.schema(
[
pa.field('input', pa.list_(pa.float32(), list_size=10))
]
)
output_schema = pa.schema(
[
pa.field('output', pa.list_(pa.float32(), list_size=1))
]
)
model = wl.upload_model('pt-single-io-model',
"./models/model-auto-conversion_pytorch_single_io_model.pt",
framework=Framework.PYTORCH,
input_schema=input_schema,
output_schema=output_schema
)
display(model)
Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a native runtime..
Ready
model.config().runtime()
'onnx'
The following example demonstrates uploading a BYOP model. After it is uploaded, it is converted to a Wallaroo Containerized Runtime.
import pyarrow as pa

from wallaroo.framework import Framework

input_schema = pa.schema([
pa.field('images', pa.list_(
pa.list_(
pa.list_(
pa.int64(),
list_size=3
),
list_size=32
),
list_size=32
)),
])
output_schema = pa.schema([
pa.field('predictions', pa.int64()),
])
model = wl.upload_model('vgg16-clustering',
'./models/model-auto-conversion-BYOP-vgg16-clustering.zip',
framework=Framework.CUSTOM,
input_schema=input_schema,
output_schema=output_schema,
convert_wait=True)
Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a container runtime..
Model is attempting loading to a container runtime..........................successful
Ready
model.config().runtime()
'flight'
Parameter | Description |
---|---|
Web Site | https://mlflow.org |
Supported Libraries | mlflow==1.30.0 |
Runtime | Containerized aka mlflow |
For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.
This guide details how to add ML Models from a model registry service into Wallaroo.
Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.
Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.
As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.
Containerized MLFlow models are not uploaded, but registered from a container registry service. This is performed through the wallaroo.client.register_model_image(options) and wallaroo.model_version.configure(options) methods.
The following parameters must be set for wallaroo.client.register_model_image(options)
and wallaroo.model_version.configure(options)
for a Containerized MLFlow model to be registered in Wallaroo.
Parameter | Type | Description |
---|---|---|
model_name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
image | string (Required) | The URL to the containerized MLFlow model in the MLFlow Registry. |
Model version configurations are updated with the wallaroo.model_version.ModelVersion.configure() method and include the following parameters. Most are optional unless specified.
Parameter | Type | Description |
---|---|---|
runtime | String (Optional) | The model runtime from wallaroo.framework, plus mlflow for MLFlow containerized model registrations. |
tensor_fields | (List[string]) (Optional) | A list of alternate input fields. For example, if the model accepts the input fields ['variable1', 'variable2'] , tensor_fields allows those inputs to be overridden to ['square_feet', 'house_age'] , or other values as required. |
input_schema | pyarrow.lib.Schema | The input schema for the model in pyarrow.lib.Schema format. |
output_schema | pyarrow.lib.Schema | The output schema for the model in pyarrow.lib.Schema format. |
batch_config | (List[string]) (Optional) | Batch config is either None for multiple-input inferences, or single to accept an inference request with only one row of data. |
For model version configuration for MLFlow models, the following must be defined:
- runtime: Set as mlflow.
- input_schema: The input schema in the Apache Arrow pyarrow.lib.Schema format.
- output_schema: The output schema in the Apache Arrow pyarrow.lib.Schema format.

wallaroo.client.register_model_image(options) returns the model version. The model version refers to the version of the model object in Wallaroo. In Wallaroo, a model version update happens when we upload a new model file (artifact) against the same model object name.
Field | Type | Description |
---|---|---|
id | Integer | The numerical identifier of the model version. |
name | string | The name of the model. |
version | string | The model version as a unique UUID. |
file_name | string | The file name of the model as stored in Wallaroo. |
image_path | string | The image used to deploy the model in the Wallaroo engine. |
last_update_time | DateTime | When the model was last updated. |
The following example demonstrates registering a Statsmodels model stored in an MLFlow container with a Wallaroo instance.
sm_input_schema = pa.schema([
pa.field('temp', pa.float32()),
pa.field('holiday', pa.uint8()),
pa.field('workingday', pa.uint8()),
pa.field('windspeed', pa.float32())
])
sm_output_schema = pa.schema([
pa.field('predicted_mean', pa.float32())
])
sm_model = wl.register_model_image(
name="mlflow-statmodels",
image="ghcr.io/wallaroolabs/wallaroo_tutorials/mlflow-statsmodels-example:2023.1"
).configure("mlflow",
input_schema=sm_input_schema,
output_schema=sm_output_schema
)
sm_model
Name | mlflowstatmodels |
---|---|
Version | eb1bcec8-63fe-4a82-98ea-fc4945786973 |
File Name | none |
SHA | 3afd13d9c5070679e284050cd099e84aa2e5cb7c08a788b21d6cb2397615d018 |
Status | ready |
Image Path | ghcr.io/wallaroolabs/wallaroo_tutorials/mlflow-statsmodels-example:2023.1 |
Architecture | None |
Updated At | 2024-30-Jan 16:11:55 |
When using containerized MLFlow models with Wallaroo, the inputs and outputs must be named. For example, the following output:
[-12.045839810372835]
Would need to be wrapped with the data values named:
[{"prediction": -12.045839810372835}]
A short sample of code for wrapping the data may be:
def wrap_predictions(prediction):
    output_df = pd.DataFrame(prediction, columns=["prediction"])
    return output_df
The model version's configuration defines how the model is used in the Wallaroo Inference Engine. Settings include the model runtime, input and output schemas, tensor field overrides, and batch configuration.
The model version configuration is retrieved with the method wallaroo.model_version.ModelVersion.config()
.
This method takes no parameters.
The method wallaroo.model_version.ModelVersion.config()
returns wallaroo.model_config.ModelConfig
. The following fields are part of the model config object.
Method | Return Type | Description |
---|---|---|
id() | Integer | The id of the model version the configuration is assigned to. |
to_yaml() | String | A YAML output of the model configuration options that are not None. |
tensor_fields() | List[String] | A list of tensor field names that override the default model fields. Only applies to onnx models. |
model_version() | wallaroo.model_version.ModelVersion | The model version the model configuration is assigned to. |
runtime() | String | The model runtime as defined by wallaroo.framework.Framework. |
The following example retrieves a model version and displays the fields from the methods listed above.
import wallaroo
print(sample_model.versions()[-1])
model_config = sample_model.versions()[-1].config()
print(model_config.id())
2
print(model_config.tensor_fields())
None
print(model_config.model_version())
{'name': 'pt-unet', 'version': '9bbb0039-074b-4a49-8c3a-e86a543dcbb7', 'file_name': 'unet.pt', 'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.4.1-4351', 'arch': None, 'last_update_time': datetime.datetime(2024, 1, 18, 20, 2, 44, 866594, tzinfo=tzutc())}
print(model_config.runtime())
flight
print(model_config.to_yaml())
model_id: pt-unet
model_version: 9bbb0039-074b-4a49-8c3a-e86a543dcbb7
runtime: flight
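Because the to_yaml() output above is flat key: value text, it can be turned into a dictionary with a few lines of standard-library code. This is a sketch only; reach for a real YAML parser for anything nested:

```python
# The to_yaml() output shown above, reproduced as a string
yaml_text = """model_id: pt-unet
model_version: 9bbb0039-074b-4a49-8c3a-e86a543dcbb7
runtime: flight"""

# Split each "key: value" line once on the first ": "
config = dict(line.split(": ", 1) for line in yaml_text.splitlines())
print(config["runtime"])   # flight
print(config["model_id"])  # pt-unet
```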
The model version's configuration defines how the model is used in the Wallaroo Inference Engine. Settings include the model runtime, input and output schemas, tensor field overrides, and batch configuration.
Model version configurations are set with the wallaroo.model_version.ModelVersion.configure() method, which accepts the following parameters. Most are optional unless specified.
Parameter | Type | Description |
---|---|---|
runtime | String (Optional) | The model runtime from wallaroo.framework, plus mlflow for MLFlow containerized model registrations. |
tensor_fields | (List[string]) (Optional) | A list of alternate input fields. For example, if the model accepts the input fields ['variable1', 'variable2'] , tensor_fields allows those inputs to be overridden to ['square_feet', 'house_age'] , or other values as required. |
input_schema | pyarrow.lib.Schema | The input schema for the model in pyarrow.lib.Schema format. |
output_schema | pyarrow.lib.Schema | The output schema for the model in pyarrow.lib.Schema format. |
batch_config | (List[string]) (Optional) | Batch config is either None for multiple-input inferences, or single to accept an inference request with only one row of data. |
wallaroo.model_version.ModelVersion.configure()
returns the wallaroo.model_version.ModelVersion
the model configuration is assigned to.
The following example updates the input and output schemas for a model version.
import wallaroo
print(sample_model)
{'name': 'pt-unet', 'versions': 2, 'owner_id': '""', 'last_update_time': datetime.datetime(2024, 1, 18, 19, 58, 12, 603652, tzinfo=tzutc()), 'created_at': datetime.datetime(2024, 1, 18, 19, 57, 0, 984506, tzinfo=tzutc())}
sample_model_version = sample_model.versions()[-1]
sm_input_schema = pa.schema([
pa.field('temp', pa.float32()),
pa.field('holiday', pa.uint8()),
pa.field('workingday', pa.uint8()),
pa.field('windspeed', pa.float32())
])
sm_output_schema = pa.schema([
pa.field('predicted_mean', pa.float32())
])
sample_model_version.configure(input_schema=sm_input_schema, output_schema=sm_output_schema)
Name | pt-unet |
---|---|
Version | 9bbb0039-074b-4a49-8c3a-e86a543dcbb7 |
File Name | unet.pt |
SHA | dfcd4b092e05564c36d28f1dfa7293f4233a384d81fe345c568b6bb68cafb0c8 |
Status | ready |
Image Path | proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.4.1-4351 |
Architecture | None |
Updated At | 2024-18-Jan 20:02:44 |