Model Upload

How to upload models to a Wallaroo Ops instance.

ML models are either uploaded as files or registered from container registry services to a Wallaroo Ops workspace through the Wallaroo SDK or the Wallaroo MLOps API.

Once an ML model is added to a workspace, it is prepared for deployment based on the model’s runtime.

ML models uploaded to Wallaroo come in two runtimes:

Wallaroo Native Runtimes: The following model frameworks are always deployed in the Wallaroo Native Runtime. When these model frameworks are uploaded to Wallaroo, the model name, file path, and model framework are required.
- ONNX
- Tensorflow
Wallaroo Containerized Runtimes: The following model frameworks may be deployed in either the Wallaroo Native Runtime, or the Wallaroo Containerized Runtime. When these models are uploaded to Wallaroo, the model name, file path, model framework, input and output schemas are required.
During upload, Wallaroo attempts to convert Non-Native Runtimes to a Wallaroo Native Runtime. If a model cannot be converted, it is packaged into a Wallaroo Containerized Runtime.
- Custom Models (BYOP)
- Python Models
- PyTorch
- SKLearn
- Hugging Face
- Tensorflow Keras
- XGBoost

The following is a short guide on using the Wallaroo SDK to upload and register models and to alter model configurations. For complete details, see either the Wallaroo SDK or the Wallaroo MLOps API guides.

Upload Model

Models are uploaded via the wallaroo.client.Client.upload_model method.

Upload Model Parameters

wallaroo.client.Client.upload_model has the following parameters.

Parameter	Type	Description
`name`	`string` (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	`string` (Required)	The path to the model file being uploaded.
`framework`	`string` (Required)	The framework of the model from `wallaroo.framework`
`input_schema`	`pyarrow.lib.Schema` Native Wallaroo Runtimes: (Optional) Non-Native Wallaroo Runtimes: (Required)	The input schema in Apache Arrow schema format.
`output_schema`	`pyarrow.lib.Schema` Native Wallaroo Runtimes: (Optional) Non-Native Wallaroo Runtimes: (Required)	The output schema in Apache Arrow schema format.
`convert_wait`	`bool` (Optional)	True: Waits in the script for the model conversion completion. False: Proceeds with the script without waiting for the model conversion process to display complete.

Upload Model Returns

Models successfully uploaded to Wallaroo via wallaroo.client.Client.upload_model return wallaroo.model_version.ModelVersion with the following fields.

Field	Description
Name	The name assigned to the model at upload.
Version	The version of the model in UUID format.
File Name	The file name at model upload.
Status	The status of the model. Models are ready for deployment with status `ready`.
Image Path	The containerized path of the model image; optional.

Model Input and Output Schemas

For Containerized Wallaroo Runtimes, the input and output schemas are required. These are set via the wallaroo.client.Client.upload_model(input_schema: pyarrow.lib.Schema) and wallaroo.client.Client.upload_model(output_schema: pyarrow.lib.Schema) parameters.

IMPORTANT NOTE

Some model frameworks have different input and output requirements. For example, PyTorch requires specific column names, while SKLearn outputs are either predictions or probabilities. Check the model framework for each model in Wallaroo Supported Models.

Data types for inputs and outputs of inference requests through Wallaroo are based on the Apache Arrow Data Types, with the following exceptions:

null: Not allowed. All fields must have submitted values that match their data type. For example, if the schema expects a float value, then some value of type float such as 0.0 must be submitted and cannot be None or Null. If a schema expects a string value, then some value of type string must be submitted, etc. The exception is BYOP models, which can accept optional inputs.
time32 and time64: Datetime data types must be converted to string.
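
The two exceptions above can be checked before data is submitted. The following is a minimal sketch using only the Python standard library; the helper name prepare_records and the sample field names are hypothetical and are not part of the Wallaroo SDK. It rejects null values and converts time values to strings.

```python
import datetime
from typing import Any, Dict, List


def prepare_records(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Reject null values and convert time values to strings (hypothetical helper)."""
    prepared = []
    for row_number, record in enumerate(records):
        clean = {}
        for field, value in record.items():
            if value is None:
                # null values are not allowed; the caller must supply a typed value
                raise ValueError(f"row {row_number}: field '{field}' is null")
            if isinstance(value, datetime.time):
                # time values are submitted as strings instead
                value = value.isoformat()
            clean[field] = value
        prepared.append(clean)
    return prepared


rows = [{"amount": 0.0, "opened_at": datetime.time(9, 30)}]
print(prepare_records(rows))  # [{'amount': 0.0, 'opened_at': '09:30:00'}]
```

Raising an error on null values, rather than silently substituting a default, leaves the choice of replacement value (such as 0.0) to the caller.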

The following example demonstrates uploading a Hugging Face model to Wallaroo as a Wallaroo Containerized Runtime with the following data schemas.

Inputs:

Field	Type	PyArrow Schema
`inputs`	String	`pa.string()`
`candidate_labels`	List(String) of length 2.	`pa.list_(pa.string(), list_size=2)`
`hypothesis_template`	String	`pa.string()`
`multi_label`	Bool	`pa.bool_()`

Outputs:

Field	Type	PyArrow Schema
`sequence`	String	`pa.string()`
`scores`	List(Float) of length 2.	`pa.list_(pa.float64(), list_size=2)`
`labels`	List(String) of length 2.	`pa.list_(pa.string(), list_size=2)`

The input and output parameters are set in pyarrow.schema format, then applied during the model upload.

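import pyarrow as pa
import wallaroo
from wallaroo.framework import Framework

# Connect to the Wallaroo instance; `wl` is the client used for the upload below.
wl = wallaroo.Client()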

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)),
    pa.field('hypothesis_template', pa.string()),
    pa.field('multi_label', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])

model = wl.upload_model("hf-zero-shot-classification",
                       "./models/zero-shot-classification-pipeline.zip",
                        framework=Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION,
                        input_schema=input_schema,
                        output_schema=output_schema
                        )

Data Constraints

Data submitted to Wallaroo for inference requests have the following data constraints.

Equal rows constraint: The number of input rows and output rows must match.
Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows

Each input row for an inference is related directly to the inference row output.

In the following example, each input row corresponds directly to one output row.

Equal Rows Input Example

	tensor
0	[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1	[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2	[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3	[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4	[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

Equal Rows Output Example

	time	in.tensor	out.dense_1
0	2023-11-17 20:34:17.005	[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]	[0.99300325]
1	2023-11-17 20:34:17.005	[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]	[0.99300325]
2	2023-11-17 20:34:17.005	[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]	[0.99300325]
3	2023-11-17 20:34:17.005	[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]	[0.99300325]
4	2023-11-17 20:34:17.005	[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311]	[0.0010916889]

Data Type Consistency

Each element must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df

	t	s
0	[2.35, 5.75, 19.2]	[Bob, Nancy, Wani]
1	[5.55, 7.2, 15.7]	[Jason, Rita, Phoebe]
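
The consistency rule above can be approximated with a small pre-submission check. The following is a sketch only, using the standard library; is_type_consistent is a hypothetical helper and not a Wallaroo function. It compares Python types strictly, so a list that mixes int and float values is also reported as inconsistent, which may be stricter than what Wallaroo enforces.

```python
from typing import Any, Sequence


def is_type_consistent(tensor: Sequence[Any]) -> bool:
    """Return True when every leaf value in a (possibly nested) list has the same Python type."""
    leaf_types = set()

    def collect(value: Any) -> None:
        if isinstance(value, (list, tuple)):
            for item in value:
                collect(item)
        else:
            leaf_types.add(type(value))

    collect(tensor)
    return len(leaf_types) <= 1


print(is_type_consistent([[2.35, 5.75], [3.72, 8.55], [5.55, 97.2]]))  # True
print(is_type_consistent([[2.35, "Bob"], [3.72, "Nancy"]]))  # False
```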

Input Data Optimization

Some models, like computer vision models, can expect very large arrays as their inputs. If a model uses Wallaroo Containerized Runtimes and expects array inputs, the following is recommended for best performance. Note that these are not data submission requirements, but optimizations to improve inference performance in order of impact:

Submit inference data in Apache Arrow format instead of pandas DataFrames. The gain is largest for very large inputs (for example, 4K images and video converted to arrays).
Flatten multi-dimensional arrays.
If the ML model inputs allow, use the Apache PyArrow fixed shape tensor pyarrow.fixed_shape_tensor with the multi-dimensional arrays.

The following shows different input schema options and the performance gains for each option. These samples were taken from image processing models dealing with large arrays of shape (2052, 2456). The following is a sample based on converting images and performing inferences with models in the Wallaroo BYOP Framework with example conversions.

	Use Pandas Input	Convert to Apache Arrow	Apache Arrow with Flattened Arrays	Apache Arrow with Fixed Shape Tensor Arrays
Code Snippet	`input_data = pd.DataFrame({"image": [image.tolist()]})`	`input_data = pa.Table.from_pandas(df, schema=input_schema)`	`input_data = pa.Table.from_pydict({` `'image': pa.FixedSizeListArray.from_arrays(image.ravel(), len(image.ravel())),` `'dim0': [image.shape[0]],` `'dim1': [image.shape[1]],` `})`	`input_data = pa.Table.from_pydict({` `'image': pa.FixedShapeTensorArray.from_numpy_ndarray(image),` `})`
Performance Benchmark	100 seconds	10 seconds	65 ms	18 ms

For further optimizations, see Model Performance.

Wallaroo Supported Models

The following frameworks are supported. Frameworks fall under either Wallaroo Native Runtimes or Wallaroo Containerized Runtimes in the Wallaroo engine. For more details, see the section for the specific framework to determine which runtime it runs in.

IMPORTANT NOTE

Verify that the input types match the specified inputs, especially for Containerized Wallaroo Runtimes. For example, if the input is listed as a pyarrow.float32(), submitting a pyarrow.float64() may cause an error.

Open Neural Network Exchange (ONNX) is the default model runtime supported by Wallaroo.

Wallaroo ONNX Requirements

Wallaroo natively supports Open Neural Network Exchange (ONNX) models in the Wallaroo engine.

Parameter	Description
Web Site	https://onnx.ai/
Supported Libraries	See table below.
Framework	`Framework.ONNX` aka `onnx`
Runtime	Native aka `onnx`
Supported Versions	onnx==1.14.1

The following ONNX model versions are supported:

ONNX Version	ONNX IR Version	ONNX OPset Version	ONNX ML Opset Version
onnx==1.14.1	8	17	3

If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the Wallaroo Native Runtime space.

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. Model names must contain only lower case ASCII alpha-numeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.
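
A candidate name can be checked against the character rule above before upload. The following is a minimal sketch; is_valid_model_name is a hypothetical helper, and it checks only the characters named above, not any further DNS constraints such as length limits.

```python
import re

# Lower case ASCII letters, digits, and dashes only, per the rule above.
_MODEL_NAME = re.compile(r"[a-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True when the name uses only the allowed characters."""
    return _MODEL_NAME.fullmatch(name) is not None


print(is_valid_model_name("hf-zero-shot-classification"))  # True
print(is_valid_model_name("My_Model.v2"))  # False
```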

Uploading ONNX Models

ONNX models are uploaded to the current workspace through the Wallaroo Client upload_model(name, path, framework, input_schema, output_schema).configure(options).

Upload ONNX Model Parameters

The following parameters are required for ONNX models. Note that while some fields are considered optional for the upload_model method, they are required for proper uploading of an ONNX model to Wallaroo.

For ONNX models, the input_schema and output_schema are not required.

Parameter	Type	Description
`name`	String (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	String (Required)	The path to the model file being uploaded.
`framework`	String (Required)	Set as the `Framework.ONNX`.
`input_schema`	pyarrow.lib.Schema (Optional)	The input schema in Apache Arrow schema format.
`output_schema`	pyarrow.lib.Schema (Optional)	The output schema in Apache Arrow schema format.
`convert_wait`	Boolean (Optional) (Default: True)	Not required for native runtimes. True: Waits in the script for the model conversion completion. False: Proceeds with the script without waiting for the model conversion process to display complete.

ONNX Model Config Parameters

Model version configurations are updated with wallaroo.model_version.config. The following are additional optional configurations for ONNX models.

Parameter	Type	Description
`tensor_fields`	(List[string]) (Optional)	A list of alternate input fields. For example, if the model accepts the input fields `['variable1', 'variable2']`, `tensor_fields` allows those inputs to be overridden to `['square_feet', 'house_age']`, or other values as required.
`batch_config`	(List[string]) (Optional)	Batch config is either `None` for multiple-input inferences, or `single` to accept an inference request with only one row of data.

ONNX Model Inputs

By default, inferencing in Wallaroo uses the same input fields as the ONNX model. This is overridden with the wallaroo.model.configure(tensor_fields=List[String]) method, which changes the model input fields to match the tensor_fields list.

IMPORTANT NOTE: The length of the tensor_fields list must match the length of the ONNX model’s input field list.

The following displays the input fields for ONNX models. Replace onnx_file_model_name with the path to the ONNX model file.

import onnx

onnx_file_model_name = './path/to/onnx/file/file.onnx'

model = onnx.load(onnx_file_model_name)
output = [node.name for node in model.graph.output]

# graph.input can also list initializers (weights); exclude them to get the true inputs
input_all = [node.name for node in model.graph.input]
input_initializer = [node.name for node in model.graph.initializer]
net_feed_input = list(set(input_all) - set(input_initializer))

print('Inputs: ', net_feed_input)
print('Outputs: ', output)

Inputs:  ['dense_input']
Outputs:  ['dense_1']
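
Given the input names printed above, the length requirement on the replacement field names can be verified before calling configure. This is a sketch under the assumption that the replacement names apply to the model inputs positionally; map_tensor_fields is a hypothetical helper, not part of the Wallaroo SDK.

```python
from typing import Dict, List


def map_tensor_fields(model_inputs: List[str], tensor_fields: List[str]) -> Dict[str, str]:
    """Pair each ONNX input name with its replacement name, assuming positional order."""
    if len(model_inputs) != len(tensor_fields):
        raise ValueError(
            f"expected {len(model_inputs)} tensor field(s), got {len(tensor_fields)}"
        )
    return dict(zip(model_inputs, tensor_fields))


print(map_tensor_fields(["dense_input"], ["tensor"]))  # {'dense_input': 'tensor'}
```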

The following Wallaroo upload will use the ONNX model’s default input.

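from wallaroo.framework import Framework

# framework is a required parameter; no tensor_fields override is applied here
model = wl.upload_model(model_name,
                        model_file_name,
                        framework=Framework.ONNX)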

pipeline.add_model_step(model)

deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)

smoke_test = pd.DataFrame.from_records([
    {
        "dense_input":[
            1.0678324729,
            0.2177810266,
            -1.7115145262,
            0.682285721,
            1.0138553067,
            -0.4335000013,
            0.7395859437,
            -0.2882839595,
            -0.447262688,
            0.5146124988,
            0.3791316964,
            0.5190619748,
            -0.4904593222,
            1.1656456469,
            -0.9776307444,
            -0.6322198963,
            -0.6891477694,
            0.1783317857,
            0.1397992467,
            -0.3554220649,
            0.4394217877,
            1.4588397512,
            -0.3886829615,
            0.4353492889,
            1.7420053483,
            -0.4434654615,
            -0.1515747891,
            -0.2668451725,
            -1.4549617756
        ]
    }
])
result = pipeline.infer(smoke_test)
display(result)

	time	in.dense_input	out.dense_1	anomaly.count
0	2023-10-17 16:13:56.169	[1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756]	[0.0014974177]	0

The following uses the tensor_fields parameter on the model upload to change the input field name to tensor.

from wallaroo.framework import Framework

model = (wl.upload_model(model_name,
                         model_file_name,
                         framework=Framework.ONNX)
           .configure(tensor_fields=["tensor"])
        )

pipeline.add_model_step(model)

deploy_config = wallaroo.DeploymentConfigBuilder().replica_count(1).cpus(0.5).memory("1Gi").build()
pipeline.deploy(deployment_config=deploy_config)

smoke_test = pd.DataFrame.from_records([
    {
        "tensor":[
            1.0678324729,
            0.2177810266,
            -1.7115145262,
            0.682285721,
            1.0138553067,
            -0.4335000013,
            0.7395859437,
            -0.2882839595,
            -0.447262688,
            0.5146124988,
            0.3791316964,
            0.5190619748,
            -0.4904593222,
            1.1656456469,
            -0.9776307444,
            -0.6322198963,
            -0.6891477694,
            0.1783317857,
            0.1397992467,
            -0.3554220649,
            0.4394217877,
            1.4588397512,
            -0.3886829615,
            0.4353492889,
            1.7420053483,
            -0.4434654615,
            -0.1515747891,
            -0.2668451725,
            -1.4549617756
        ]
    }
])
result = pipeline.infer(smoke_test)
display(result)

	time	in.tensor	out.dense_1	anomaly.count
0	2023-10-17 16:13:56.169	[1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756]	[0.0014974177]	0

Upload ONNX Model Return

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`SHA`	String	The hash value of the model file.
`Status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

For example:

model_name = "embedder-o"
model_path = "./embedder.onnx"

embedder = wl.upload_model(model_name, model_path, framework=Framework.ONNX).configure("onnx")

ONNX Conversion Tips

When converting from one ML model type to an ONNX ML model, the input and output fields should be specified so users anticipate the exact field names used in their code. This prevents conversion naming formats from creating unintended names, and sets consistent field names that can be relied upon in future code updates.

The following example shows naming the input and output names when converting from a PyTorch model to an ONNX model. Note that the input fields are set to data, and the output fields are set to output_names = ["bounding-box", "classification","confidence"].

import torch

# `model`, `tensor` (a sample input) and `pytorchModelPath` are defined elsewhere
input_names = ["data"]
output_names = ["bounding-box", "classification","confidence"]
torch.onnx.export(model,
                    tensor,
                    pytorchModelPath+'.onnx',
                    input_names=input_names,
                    output_names=output_names,
                    opset_version=17,
                    )

See the documentation for the specific ML model type being converted to ONNX for complete details.

ONNX Tutorials

The following tutorials demonstrate deploying and performing sample inferences on ONNX models in Wallaroo.

Wallaroo supports TensorFlow models by containerizing the model and running as an image.

Parameter	Description
Web Site	https://www.tensorflow.org/
Supported Libraries	`tensorflow==2.13.1`
Framework	`Framework.TENSORFLOW` aka `tensorflow`
Supported File Types	SavedModel format as .zip file

IMPORTANT NOTE

These requirements are not for Tensorflow Keras models, only for non-Keras Tensorflow models in the SavedModel format. For Tensorflow Keras deployment in Wallaroo, see the Tensorflow Keras requirements.

TensorFlow File Format

TensorFlow models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/
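
Where the zip command is not available, the same archive can be produced from Python. The following is a sketch using the standard library's shutil.make_archive; zip_saved_model is a hypothetical helper name, and the demo builds a placeholder directory rather than a real model. As with the command above, the top-level directory is kept inside the archive.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_saved_model(model_dir: str) -> str:
    """Compress a SavedModel directory into <model_dir>.zip, keeping the top-level folder."""
    source = Path(model_dir).resolve()
    if not (source / "saved_model.pb").is_file():
        raise FileNotFoundError(f"{source} does not contain saved_model.pb")
    return shutil.make_archive(
        base_name=str(source),         # writes <source>.zip next to the directory
        format="zip",
        root_dir=str(source.parent),
        base_dir=source.name,          # entries are stored under the directory name
    )


# Demo with empty placeholder files standing in for a real SavedModel.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "alohacnnlstm"
    (demo_dir / "variables").mkdir(parents=True)
    (demo_dir / "saved_model.pb").write_bytes(b"")
    (demo_dir / "variables" / "variables.index").write_bytes(b"")
    archive_path = zip_saved_model(str(demo_dir))
    with zipfile.ZipFile(archive_path) as zf:
        print(zf.namelist())
```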

ML models in the TensorFlow SavedModel format run as Wallaroo Native Runtimes by default.

See the SavedModel guide for full details.

Tensorflow Library in Wallaroo JupyterHub Service

For Jupyter Notebooks running in the Wallaroo JupyterHub Service that import the tensorflow library, for example:

import tensorflow

Install the tensorflow-cpu library by executing the following command in the terminal shell:

pip install tensorflow-cpu==2.13.1 --user

Then proceed with running the notebook. This only applies to running notebooks in Wallaroo’s JupyterHub service, and does not affect model upload, packaging or other Wallaroo functionality. This does not affect using the Wallaroo SDK outside of Wallaroo’s JupyterHub service.

Uploading TensorFlow Models

TensorFlow models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload TensorFlow Model Parameters

The following parameters are required for TensorFlow models. Tensorflow models are native runtimes in Wallaroo, so the input_schema and output_schema parameters are optional.

Parameter	Type	Description
`name`	String (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	String (Required)	The path to the model file being uploaded.
`framework`	String (Required)	Set as the `Framework.TENSORFLOW`.
`input_schema`	pyarrow.lib.Schema (Optional)	The input schema in Apache Arrow schema format.
`output_schema`	pyarrow.lib.Schema (Optional)	The output schema in Apache Arrow schema format.
`convert_wait`	Boolean (Optional) (Default: True)	Not required for native runtimes. True: Waits in the script for the model conversion completion. False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload TensorFlow Model Return

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`SHA`	String	The hash value of the model file.
`Status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

The following example uploads a TensorFlow ML model to a Wallaroo instance.

from wallaroo.framework import Framework
model = wl.upload_model(model_name, 
                        model_file_name,
                        framework=Framework.TENSORFLOW
                        )

Tensorflow Tutorials

The following tutorials demonstrate deploying and performing sample inferences on Tensorflow models in Wallaroo.

Tensorflow Aloha

Custom Models, or BYOP (Bring Your Own Predict), allow organizations to use Python scripts and supporting libraries as their own model. Similar to a Python step, Custom Models are an even more robust and flexible tool for working with ML Models in Wallaroo pipelines.

Parameter	Description
Web Site	https://www.python.org/
Supported Libraries	`python==3.10`
Framework	`Framework.CUSTOM` aka `custom`

Custom Models, also known as Bring Your Own Predict (BYOP), allow for custom model inference methods with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with their supporting artifacts such as other Python modules, scripts, model files, etc.

Contrast this with Wallaroo Python models, aka “Python steps”, which are standalone Python scripts that use the Python libraries. These are commonly used for data formatting such as the pre and post-processing steps, and are also appropriate for simple models (such as ARIMA Statsmodels). A Wallaroo Python model can be composed of one or more Python scripts that match the Wallaroo requirements.

Custom Model File Requirements

Custom Model (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

Artifact	Type	Description
Python scripts aka `.py` files with classes that extend `mac.inference.Inference` and `mac.inference.creation.InferenceBuilder`	Python Script	Extend the classes `mac.inference.Inference` and `mac.inference.creation.InferenceBuilder`. These are included with the Wallaroo SDK (https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-install-guides/). Further details are in Custom Model Script Requirements. Note that there are no specified naming requirements for the classes that extend `mac.inference.Inference` and `mac.inference.creation.InferenceBuilder` - any qualified class name is sufficient as long as these two classes are extended as defined below.
`requirements.txt`	Python requirements file	This sets the Python libraries used for the Custom Model. These libraries should be targeted for Python 3.10 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This ensures that the script and methods will function exactly the same as during the model creation process.
Other artifacts	Files	Other models, files, and other artifacts used in support of this model.

For example, if the Custom Model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample Custom Model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.
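
Before uploading, the archive can be checked for the file-level components listed in the table above. The following is a minimal sketch using the standard library; missing_byop_artifacts is a hypothetical helper. It only confirms that a requirements.txt and at least one .py file are present, and does not verify that the scripts extend the required mac.inference classes.

```python
import io
import zipfile
from typing import List


def missing_byop_artifacts(archive) -> List[str]:
    """List required artifact kinds that are absent from a Custom Model ZIP file."""
    with zipfile.ZipFile(archive) as zf:
        # Compare base file names so files inside a storage directory are found too.
        names = [name.rsplit("/", 1)[-1] for name in zf.namelist()]
    missing = []
    if "requirements.txt" not in names:
        missing.append("requirements.txt")
    if not any(name.endswith(".py") for name in names):
        missing.append("Python script (.py)")
    return missing


# Demo: an in-memory archive with empty placeholder files mirroring the layout above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in ("feature_extractor.h5", "kmeans.pkl", "custom_inference.py", "requirements.txt"):
        zf.writestr(f"vgg_clustering/{name}", "")
buffer.seek(0)
print(missing_byop_artifacts(buffer))  # []
```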

Wallaroo Custom Model uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Custom Model Script Requirements

The entry point of the Custom Model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

The mac.inference.Inference interface serves model inferences based on submitted input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference object.

mac.inference.Inference

mac.inference.Inference Objects

Object	Type	Description
`model` (Required)	`[Any]`	One or more objects that match the `expected_model_types`. This can be an ML Model (for inference use), a string (for data conversion), etc. See Custom Model Examples for examples.

mac.inference.Inference Methods

Method	Returns	Description
`expected_model_types` (Required)	`Set`	Returns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the `InferenceBuilder` method matches what this `Inference` class expects.
`_predict (input_data: mac.types.InferenceData)` (Required)	`mac.types.InferenceData`	The entry point for the Wallaroo inference, with the following input and output parameters that are defined when the model is uploaded. `mac.types.InferenceData`: The input `InferenceData` is a Dictionary of numpy arrays derived from the `input_schema` specified when the model is uploaded, defined in PyArrow.Schema format. `mac.types.InferenceData`: The output is a Dictionary of numpy arrays as defined by the `output_schema`, in PyArrow.Schema format. The `InferenceDataValidationError` exception is raised when the input data does not match `mac.types.InferenceData`.
`raise_error_if_model_is_not_assigned`	N/A	Error when a model is not set to `Inference`.
`raise_error_if_model_is_wrong_type`	N/A	Error when the model does not match the `expected_model_types`.

IMPORTANT NOTE

Verify that the inputs and outputs match the InferenceData input and output types: a Dictionary of numpy arrays defined by the input_schema and output_schema parameters when uploading the model to the Wallaroo instance. The following code is an example of a Dictionary of numpy arrays.

preds = self.model.predict(data)
preds = preds.numpy()
rows, _ = preds.shape
preds = preds.reshape((rows,))

return {"prediction": preds} # a Dictionary of numpy arrays.
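
To make the shape handling in the fragment above concrete, the following is a small standalone sketch. The `raw_predictions` array is a made-up stand-in for a model's output, and the final check is only a suggested sanity test, not something the Wallaroo SDK requires.

```python
import numpy as np

# Stand-in for raw model output with shape (rows, 1); the values are made up.
raw_predictions = np.array([[0.1], [0.7], [0.3]])

rows, _ = raw_predictions.shape
flat_predictions = raw_predictions.reshape((rows,))  # one value per input row

output = {"prediction": flat_predictions}  # a Dictionary of numpy arrays

# Sanity check before returning from _predict: every value is a numpy array.
assert all(isinstance(value, np.ndarray) for value in output.values())
print(output["prediction"].shape)
```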

For example, the expected_model_types can be defined for the KMeans model.

from typing import Any, Set

from mac.inference import Inference
from sklearn.cluster import KMeans

class SampleClass(Inference):
    @property
    def expected_model_types(self) -> Set[Any]:
        return {KMeans}

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods

Method	Returns	Description
`create(config: mac.config.inference.CustomInferenceConfig)` (Required)	The custom `Inference` instance.	Creates an Inference subclass, then assigns a model and attributes. The `CustomInferenceConfig` is used to retrieve the `config.model_path`, which is a `pathlib.Path` object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to `config.model_path`. This is set when the Custom Model .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact `vgg_clustering\feature_extractor.h5` would be set with `config.model_path / "feature_extractor.h5"`. The model loaded must match an existing module. For our example, this is `from sklearn.cluster import KMeans`, and this must match the `Inference` `expected_model_types`.
`inference`	custom `Inference` instance.	Returns the instantiated custom Inference object created from the `create` method.
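
The rule that every artifact is loaded relative to `config.model_path` can be sketched without the Wallaroo SDK installed. In the following, `StandInConfig` and `load_pickled_artifact` are hypothetical names used only for illustration, and the pickled dictionary is a placeholder rather than a real KMeans model.

```python
import pickle
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class StandInConfig:
    """Hypothetical stand-in for CustomInferenceConfig; only model_path is modeled."""
    model_path: Path


def load_pickled_artifact(config: StandInConfig, file_name: str):
    """Load a pickled artifact stored relative to config.model_path."""
    artifact_path = config.model_path / file_name  # pathlib joins paths with `/`
    with open(artifact_path, "rb") as fp:
        return pickle.load(fp)


# Simulate the unpacked .zip folder with one pickled artifact.
model_dir = Path(tempfile.mkdtemp())
with open(model_dir / "kmeans.pkl", "wb") as fp:
    pickle.dump({"n_clusters": 3}, fp)  # placeholder object, not a real KMeans model

loaded = load_pickled_artifact(StandInConfig(model_path=model_dir), "kmeans.pkl")
print(loaded)
```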

Python Libraries

Python libraries required by the included Python script are specified in the requirements.txt file included in the .zip file. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo.

The requirements.txt file specifies:

Python Libraries available through PyPi.org and the specific version. For example:
```
requests == 2.32.2
```
Python Wheels as BYOP artifacts: Python Wheels as BYOP artifacts are included in the .zip file and are referred to in the BYOP’s requirements.txt file based on the relative path within the .zip file.
For example, if the BYOP .zip file includes the Python Wheel libraries/custom_wheel.whl
```
├── libraries
│   └── custom_wheel.whl
├── main.py
└── requirements.txt
```
Then the requirements.txt file included with the BYOP’s .zip file refers to this Python Wheel as:
```
libraries/custom_wheel.whl
```
External Python Wheels: Python Wheels that are available from external sources (aka - not included as BYOP artifacts):
- Must be referred to by the full URL.
- Must be available from the Wallaroo instance.
For example, to include the Python Wheel hosted at https://example.wallaroo.ai/libraries/custom_wheel.whl, the requirements.txt file included with the BYOP’s .zip file refers to this Python Wheel as:
```
https://example.wallaroo.ai/libraries/custom_wheel.whl
```
Extra Index URL: For Python libraries that require the --extra-index-url flag:
- Set the --extra-index-url flag with the full URL to the extra index. This must be available from the Wallaroo instance.
- In the next line, specify the Python library and version.
- Repeat the steps above for each Python library with an extra index URL.
For example, to include the extra index URL https://download.pytorch.org/whl/cu117 for the torchvision Python library, the requirements.txt file included with the BYOP’s .zip file specifies the index and the library as:
```
--extra-index-url https://download.pytorch.org/whl/cu117
torchvision==0.15.0
```
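
Before packaging, it may help to check that each line of requirements.txt falls into one of the four kinds described above. The following is a rough sketch. The function name, the category labels, and the simplified version pattern are assumptions for illustration and do not cover the full pip requirements syntax.

```python
import re

# Matches pinned PyPI requirements such as `requests == 2.32.2`.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*\s*==\s*[A-Za-z0-9.+!*-]+$")


def classify_requirement(line: str) -> str:
    """Return which of the four entry kinds a requirements.txt line looks like."""
    line = line.strip()
    if line.startswith("--extra-index-url "):
        return "extra-index-url"
    if line.endswith(".whl"):
        # Full URLs are external wheels; anything else is a path inside the .zip.
        is_url = line.startswith(("http://", "https://"))
        return "external-wheel" if is_url else "bundled-wheel"
    if PINNED.match(line):
        return "pinned-pypi"
    return "unrecognized"


sample_lines = [
    "requests == 2.32.2",
    "libraries/custom_wheel.whl",
    "https://example.wallaroo.ai/libraries/custom_wheel.whl",
    "--extra-index-url https://download.pytorch.org/whl/cu117",
    "torchvision==0.15.0",
    "numpy",  # unpinned, so flagged for review
]
kinds = [classify_requirement(line) for line in sample_lines]
print(kinds)
```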

Custom Model Runtime

Custom Models always run in the containerized model runtime.

Custom Model Inputs

Custom Model inputs are defined during model upload in Apache Arrow Schema format with the following conditions:

By default, data inputs are optional unless they are specified with nullable=False.
The Custom Model code must be aware of the optional and required fields and how to manage those inputs.
Specific Data Types conditions:
- Scalar: Scalar values can be Null.
- Lists: Lists must either be empty [] or an array of Null values, for example [None], but cannot be passed as Null outside of an array.
By default, columns with only the None or Null value are assigned by Python as NullArray, which is an array with all values of Null. In these situations, the schema must be specified.

Custom Model Inputs Example

The following code sample demonstrates managing optional inputs.

The Custom Model code has three inputs:

input_1: A required List of floats.
input_2: An optional List of floats.
multiply_factor: An optional scalar integer.

The following demonstrates setting the input and output schemas when uploading the sample code to Wallaroo.

import wallaroo
import pyarrow as pa
input_schema = pa.schema([
    pa.field('input_1', pa.list_(pa.float32()), nullable=False), # fields are optional by default unless `nullable` is set to `False`
    pa.field('input_2', pa.list_(pa.float32())),
    pa.field('multiply_factor', pa.int32()),
])

output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float32())),
])

The following demonstrates different valid inputs based on the input schemas. These fields are submitted either as a pandas DataFrame or an Apache Arrow table when submitted for inference requests.

Note that each time the data is translated to an Apache Arrow table, the input schema is specified so the accurate data types are assigned to each column, even when the column values are Null or None.

The following input has all fields and values translated into an Apache Arrow table, then submitted as an inference request to a pipeline with our sample BYOP model.

input_1 = [[1., 2.], [3., 4.]]
input_2 = [[5., 6.], [7., 8.]]
multiply_factor = [2, 3]
arrow_table = pa.table({"input_1": input_1, "input_2": input_2, "multiply_factor": multiply_factor}, schema=input_schema)
display(arrow_table)

pipeline.infer(arrow_table)

pyarrow.Table
time: timestamp[ms]
in.input_1: list<item: float> not null
  child 0, item: float
in.input_2: list<item: float> not null
  child 0, item: float
in.multiply_factor: int32 not null
out.output: list<item: double> not null
  child 0, item: double
anomaly.count: uint32 not null
----
time: [[2024-04-30 09:12:01.445,2024-04-30 09:12:01.445]]
in.input_1: [[[1,2],[3,4]]]
in.input_2: [[[5,6],[7,8]]]
in.multiply_factor: [[2,3]]
out.output: [[[12,16],[30,36]]]
anomaly.count: [[0,0]]

In the following example input_2 has two empty lists, stored into a pandas DataFrame and submitted for the inference request.

import pandas as pd

dataframe = pd.DataFrame({'input_1': [[1., 2.], [3., 4.]], 'input_2': [[], []], 'multiply_factor': [2, 3]})
display(dataframe)

	input_1	input_2	multiply_factor
0	[1.0, 2.0]	[]	2
1	[3.0, 4.0]	[]	3

For the following example, input_2 is an empty list, with multiply_factor set to None. This is stored in an Apache Arrow table for the inference request.

input_1 = [[1., 2.], [3., 4.]]
input_2 = [[], []]
multiply_factor = [None, None]
arrow_table = pa.table({"input_1": input_1, "input_2": input_2, "multiply_factor": multiply_factor}, schema=input_schema)
display(arrow_table)

pyarrow.Table
input_1: list<item: float> not null
  child 0, item: float
input_2: list<item: float>
  child 0, item: float
multiply_factor: int32
----
input_1: [[[1,2],[3,4]]]
input_2: [[[],[]]]
multiply_factor: [[null,null]]

pipeline.infer(arrow_table)

pyarrow.Table
time: timestamp[ms]
in.input_1: list<item: float> not null
  child 0, item: float
in.input_2: list<item: float> not null
  child 0, item: float
in.multiply_factor: int32 not null
out.output: list<item: double> not null
  child 0, item: double
anomaly.count: uint32 not null
----
time: [[2024-04-30 09:07:42.467,2024-04-30 09:07:42.467]]
in.input_1: [[[1,2],[3,4]]]
in.input_2: [[[],[]]]
in.multiply_factor: [[null,null]]
out.output: [[[1,2],[3,4]]]
anomaly.count: [[0,0]]
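
The Custom Model code behind these results is not reproduced here. The following is a rough sketch of per-row logic that is consistent with the sample outputs above: input_2 is added element-wise when it is non-empty, and the result is multiplied when multiply_factor is set. The function name and the exact arithmetic are assumptions inferred from those outputs. A real `_predict` method would apply equivalent logic to the numpy arrays in `InferenceData`.

```python
from typing import List, Optional


def combine_row(input_1: List[float],
                input_2: Optional[List[float]] = None,
                multiply_factor: Optional[int] = None) -> List[float]:
    """Hypothetical per-row logic inferred from the sample outputs."""
    result = list(input_1)  # required field, always present
    if input_2:  # skip when the optional list is empty or missing
        result = [a + b for a, b in zip(result, input_2)]
    if multiply_factor is not None:  # skip when the optional scalar is null
        result = [value * multiply_factor for value in result]
    return result


print(combine_row([1.0, 2.0], [5.0, 6.0], 2))  # all fields supplied
print(combine_row([3.0, 4.0], [], None))       # optional fields left empty
```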

Upload Custom Model

Custom Models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload Custom Model Parameters

The following parameters are required for Custom Models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a Custom Model to Wallaroo.

Parameter	Type	Description
`name`	String (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	String (Required)	The path to the model file being uploaded.
`framework`	String (Required)	Set as `Framework.CUSTOM`.
`input_schema`	pyarrow.lib.Schema (Required)	The input schema in Apache Arrow schema format.
`output_schema`	pyarrow.lib.Schema (Required)	The output schema in Apache Arrow schema format.
`convert_wait`	Boolean (Optional) (Default: True)	True: Waits in the script for the model conversion completion. False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload Custom Model Return

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`SHA`	String	The hash value of the model file.
`Status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

Custom Model Examples

The following are examples of use cases for BYOP models.

Upload Custom Model Example

The following example uploads a Custom Model, a VGG16 clustering ML model, to a Wallaroo instance.

Custom Model Script Example

The following is an example script that fulfills the requirements for a Wallaroo Custom Model, and would be saved as custom_inference.py.

"""This module features an example implementation of a custom Inference and its
corresponding InferenceBuilder."""

import pathlib
import pickle
from typing import Any, Set

import tensorflow as tf
from mac.config.inference import CustomInferenceConfig
from mac.inference import Inference
from mac.inference.creation import InferenceBuilder
from mac.types import InferenceData
from sklearn.cluster import KMeans


class ImageClustering(Inference):
    """Inference class for image clustering, that uses
    a pre-trained VGG16 model on cifar10 as a feature extractor
    and performs clustering on a trained KMeans model.

    Attributes:
        - feature_extractor: The embedding model we will use
        as a feature extractor (i.e. a trained VGG16).
        - expected_model_types: A set of model instance types that are expected by this inference.
        - model: The model on which the inference is calculated.
    """

    def __init__(self, feature_extractor: tf.keras.Model):
        self.feature_extractor = feature_extractor
        super().__init__()

    @property
    def expected_model_types(self) -> Set[Any]:
        return {KMeans}

    @Inference.model.setter  # type: ignore
    def model(self, model) -> None:
        """Sets the model on which the inference is calculated.

        :param model: A model instance on which the inference is calculated.

        :raises TypeError: If the model is not an instance of expected_model_types
            (i.e. KMeans).
        """
        self._raise_error_if_model_is_wrong_type(model) # this will make sure an error will be raised if the model is of wrong type
        self._model = model

    def _predict(self, input_data: InferenceData) -> InferenceData:
        """Calculates the inference on the given input data.
        This is the core function that each subclass needs to implement
        in order to calculate the inference.

        :param input_data: The input data on which the inference is calculated.
        It is of type InferenceData, meaning it comes as a dictionary of numpy
        arrays.

        :raises InferenceDataValidationError: If the input data is not valid.
        Ideally, every subclass should raise this error if the input data is not valid.

        :return: The output of the model, that is a dictionary of numpy arrays.
        """

        # input_data maps to the input_schema we have defined
        # with PyArrow, coming as a dictionary of numpy arrays
        inputs = input_data["images"]

        # Forward inputs to the models
        embeddings = self.feature_extractor(inputs)
        predictions = self.model.predict(embeddings.numpy())

        # Return predictions as dictionary of numpy arrays
        return {"predictions": predictions}


class ImageClusteringBuilder(InferenceBuilder):
    """InferenceBuilder subclass for ImageClustering, that loads
    a pre-trained VGG16 model on cifar10 as a feature extractor
    and a trained KMeans model, and creates an ImageClustering object."""

    @property
    def inference(self) -> ImageClustering:
        return ImageClustering

    def create(self, config: CustomInferenceConfig) -> ImageClustering:
        """Creates an Inference subclass and assigns a model and additionally
        needed attributes to it.

        :param config: Custom inference configuration. In particular, we're
        interested in `config.model_path` that is a pathlib.Path object
        pointing to the folder where the model artifacts are saved.
        Every artifact we need to load from this folder has to be
        relative to `config.model_path`.

        :return: A custom Inference instance.
        """
        feature_extractor = self._load_feature_extractor(
            config.model_path / "feature_extractor.h5"
        )
        inference = self.inference(feature_extractor)
        model = self._load_model(config.model_path / "kmeans.pkl")
        inference.model = model

        return inference

    def _load_feature_extractor(
        self, file_path: pathlib.Path
    ) -> tf.keras.Model:
        return tf.keras.models.load_model(file_path)

    def _load_model(self, file_path: pathlib.Path) -> KMeans:
        with open(file_path.as_posix(), "rb") as fp:
            model = pickle.load(fp)
        return model

The following is the requirements.txt file that would be included in the Custom Model ZIP file. It is highly recommended to use the same requirements.txt file for setting the libraries and versions used to create the model in the Custom Model ZIP file.

tensorflow==2.8.0
scikit-learn==1.2.2

Upload Custom Model Example

The following example demonstrates uploading the Custom Model as vgg_clustering.zip with the following input and output schemas defined.

input_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=32
        ),
        list_size=32
    )),
])

output_schema = pa.schema([
    pa.field('predictions', pa.int64()),
])

model = wl.upload_model(
    'vgg16-clustering',
    'vgg_clustering.zip',
    framework=Framework.CUSTOM,
    input_schema=input_schema,
    output_schema=output_schema,
    convert_wait=True
)

Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a container runtime..
Model is attempting loading to a container runtime.......................successful

Ready

BYOP Tutorials

The following tutorials demonstrate deploying and performing sample inferences on BYOP models in Wallaroo.

Custom Inference Computer Vision Upload, Autopackaging, and Deploy Tutorial

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. Model names must contain only lower case ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.
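
A name can be checked against this rule before calling upload_model. The following is a minimal sketch covering only the character rule stated above. The helper name is hypothetical, and any further DNS constraints, such as length limits or leading and trailing dashes, are not checked here.

```python
import re

# Lower case ASCII letters, digits, and dashes only, per the rule above.
MODEL_NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True when the name uses only the allowed characters."""
    return MODEL_NAME_PATTERN.fullmatch(name) is not None


for candidate in ("vgg16-clustering", "vgg16_clustering", "VGG16.clustering"):
    print(candidate, is_valid_model_name(candidate))
```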

Wallaroo supports Hugging Face models by containerizing the model and running as an image.

Parameter	Description
Web Site	https://huggingface.co/models
Supported Libraries	`transformers==4.34.1` `diffusers==0.14.0` `accelerate==0.23.0` `torchvision==0.15.2` `torch==2.0.1`
Frameworks	The following Hugging Face pipelines are supported by Wallaroo. `Framework.HUGGING_FACE_FEATURE_EXTRACTION` aka `hugging-face-feature-extraction` `Framework.HUGGING_FACE_IMAGE_CLASSIFICATION` aka `hugging-face-image-classification` `Framework.HUGGING_FACE_IMAGE_SEGMENTATION` aka `hugging-face-image-segmentation` `Framework.HUGGING_FACE_IMAGE_TO_TEXT` aka `hugging-face-image-to-text` `Framework.HUGGING_FACE_OBJECT_DETECTION` aka `hugging-face-object-detection` `Framework.HUGGING_FACE_QUESTION_ANSWERING` aka `hugging-face-question-answering` `Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG` aka `hugging-face-stable-diffusion-text-2-img` `Framework.HUGGING_FACE_SUMMARIZATION` aka `hugging-face-summarization` `Framework.HUGGING_FACE_TEXT_CLASSIFICATION` aka `hugging-face-text-classification` `Framework.HUGGING_FACE_TRANSLATION` aka `hugging-face-translation` `Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION` aka `hugging-face-zero-shot-classification` `Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION` aka `hugging-face-zero-shot-image-classification` `Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION` aka `hugging-face-zero-shot-object-detection` `Framework.HUGGING_FACE_SENTIMENT_ANALYSIS` aka `hugging-face-sentiment-analysis` `Framework.HUGGING_FACE_TEXT_GENERATION` aka `hugging-face-text-generation` `Framework.HUGGING_FACE_AUTOMATIC_SPEECH_RECOGNITION` aka `hugging-face-automatic-speech-recognition`

During the model upload process, the Wallaroo instance will attempt to convert the model to a Native Wallaroo Runtime. If unsuccessful, a Wallaroo Containerized Runtime for the model is generated instead. See the model deployment section for details on how to configure pipeline resources based on the model’s runtime.

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

Framework.HUGGING_FACE_IMAGE_TO_TEXT
Framework.HUGGING_FACE_TEXT_CLASSIFICATION
Framework.HUGGING_FACE_SUMMARIZATION
Framework.HUGGING_FACE_TRANSLATION

Additional inputs added to these Hugging Face pipelines are passed as key/value pair arguments to the model’s generate method. If an optional argument is not supplied, the model defaults to the value coded in the original Hugging Face model’s source code.
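
The forwarding idea can be pictured as splitting one input row into the required pipeline inputs and a set of extra keyword arguments. The following sketch is purely illustrative and is not how Wallaroo implements it. The function name, the sample field names, and the choice to drop extra fields set to None are all assumptions.

```python
from typing import Any, Dict, Set, Tuple


def split_inputs(row: Dict[str, Any],
                 required: Set[str]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate required pipeline inputs from extra fields.

    Extra fields set to None are dropped so the model's own defaults apply.
    """
    pipeline_inputs = {key: row[key] for key in required}
    extra_kwargs = {key: value for key, value in row.items()
                    if key not in required and value is not None}
    return pipeline_inputs, extra_kwargs


row = {"inputs": "Wallaroo deploys ML models.", "max_new_tokens": 20, "num_beams": None}
pipeline_inputs, generate_kwargs = split_inputs(row, required={"inputs"})
print(pipeline_inputs)
print(generate_kwargs)
```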

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Feature Extraction

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_FEATURE_EXTRACTION`	Feature Extraction Pipeline Feature Extraction Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])

Image Classification

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_IMAGE_CLASSIFICATION`	Image Classification Documentation Image Classification Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])

Image Segmentation

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_IMAGE_SEGMENTATION`	Image Segmentation Documentation Image Segmentation Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])

Image to Text

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_IMAGE_TO_TEXT`	Image to Text Documentation Image to Text Source Code

Any parameter that is not part of the required inputs list is forwarded as a key/value pair to the underlying model’s generate method. If the additional input is not supported by the model, an error is returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])

Object Detection

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_OBJECT_DETECTION`	Object Detection Documentation Object Detection Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])

Question Answering

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_QUESTION_ANSWERING`	Question Answering Documentation Question Answering Source Code

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])

Diffusion Text 2 Image

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG`	Stable Diffusion Text to Image Documentation Stable Diffusion Text to Image Source Code

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])

Summarization

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_SUMMARIZATION`	Summarization Documentation Text2Text Generation Source Code.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])

Text Classification

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_TEXT_CLASSIFICATION`	Text Classification Documentation Text Classification Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('function_to_apply', pa.string()),
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
])

Translation

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_TRANSLATION`	Translation Documentation Translation Generation Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])

Zero Shot Classification

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION`	Zero Shot Classification Documentation Zero Shot Classification Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])

Zero Shot Image Classification

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION`	Zero Shot Image Classification Documentation Zero Shot Image Classification Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])

Zero Shot Object Detection

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION`	Zero Shot Object Detection Documentation Zero Shot Object Detection Source Code

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])

Sentiment Analysis

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_SENTIMENT_ANALYSIS`	Hugging Face Sentiment Analysis

Text Generation

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_TEXT_GENERATION`	Text Generation Documentation Text Generation Source Code

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])

Automatic Speech Recognition

Wallaroo Framework	Reference
`Framework.HUGGING_FACE_AUTOMATIC_SPEECH_RECOGNITION`	Automatic Speech Recognition Documentation Automatic Speech Recognition Source Code

Sample input and output schema.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32())), # required: the audio stored in numpy arrays of shape (num_samples,) and data type `float32`
    pa.field('return_timestamps', pa.string()) # optional: return start & end times for each predicted chunk
]) 

output_schema = pa.schema([
    pa.field('text', pa.string()), # required: the output text corresponding to the audio input
    pa.field('chunks', pa.list_(pa.struct([('text', pa.string()), ('timestamp', pa.list_(pa.float32()))]))), # required (if `return_timestamps` is set), start & end times for each predicted chunk
])

Uploading Hugging Face Models

Hugging Face models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload Hugging Face Model Parameters

The following parameters are required for Hugging Face models. Note that while some fields are optional for the upload_model method, they are required to properly upload a Hugging Face model to Wallaroo.

Parameter	Type	Description
`name`	String (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	String (Required)	The path to the model file being uploaded.
`framework`	String (Required)	Set as the framework - see the list above for all supported Hugging Face frameworks.
`input_schema`	pyarrow.lib.Schema (Required)	The input schema in Apache Arrow schema format.
`output_schema`	pyarrow.lib.Schema (Required)	The output schema in Apache Arrow schema format.
`convert_wait`	Boolean (Optional) (Default: True)	True: The script waits for the model conversion to complete. False: The script proceeds without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload Hugging Face Model Return

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`SHA`	String	The hash value of the model file.
`Status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

Upload Hugging Face Model Example

The following example is of uploading a Hugging Face Zero Shot Classification ML Model to Wallaroo.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)),
    pa.field('hypothesis_template', pa.string()),
    pa.field('multi_label', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])

model = wl.upload_model("hf-zero-shot-classification",
                        "./models/model-auto-conversion_hugging-face_dummy-pipelines_zero-shot-classification-pipeline.zip",
                        framework=Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION,
                        input_schema=input_schema,
                        output_schema=output_schema,
                        convert_wait=True)

Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a container runtime..
Model is attempting loading to a container runtime................................................successful

Ready

Hugging Face Tutorials

The following tutorials demonstrate deploying and performing sample inferences on Hugging Face models in Wallaroo.

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names must contain only lower case ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.
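The character rule above can be checked before an upload is attempted. The following is a minimal sketch using only the Python standard library; `is_valid_model_name` is a hypothetical helper and not part of the Wallaroo SDK. It checks only the character set described above and does not cover any other DNS constraints such as length.

```python
import re

# Hypothetical helper, not part of the Wallaroo SDK: checks only the character
# rule stated above (lower case ASCII alphanumerics and dashes).
_MODEL_NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True when every character is a-z, 0-9, or '-'."""
    return _MODEL_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["hf-zero-shot-classification", "preprocess_step", "My.Model"]:
    print(candidate, is_valid_model_name(candidate))
```

Names that fail this check can be renamed before calling upload_model, for example by replacing underscores with dashes.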

Python scripts are uploaded to Wallaroo and treated like ML models in pipeline steps. These are referred to as Python steps.

Python steps can include:

- Preprocessing steps that prepare received data to be handed to an ML model deployed as another pipeline step.
- Postprocessing steps that take the data output by an ML model in a pipeline step and prepare it to be received by some other data store or entity.
- A model contained within a Python script.

In all of these cases, the requirements for uploading a Python step as an ML model in Wallaroo are the same.

Parameter	Description
Web Site	https://www.python.org/
Supported Libraries	`python==3.10`
Framework	`Framework.PYTHON` aka `python`

Python models uploaded to Wallaroo are executed in the Wallaroo Containerized Runtime.

Note that Python models, also known as “Python steps”, are standalone Python scripts that use Python libraries. These are commonly used for data formatting such as pre- and post-processing steps, and are also appropriate for simple models (such as ARIMA Statsmodels). A Wallaroo Python model can be composed of one or more Python scripts that match the Wallaroo requirements.

This contrasts with Custom Models, also known as Bring Your Own Predict (BYOP), which allow for custom model inference methods with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc.) along with their supporting artifacts such as other Python modules, scripts, and model files.

Python Models Requirements

Python scripts packaged as Python models in Wallaroo have the following requirements.

At least one .py Python script file with the following:
- Must be compatible with Python version 3.10.
- Imports the mac.types.InferenceData included with the Wallaroo SDK. For example:
```
from mac.types import InferenceData
```
- Includes the following method as the entry point for Wallaroo model inferencing:
```
def process_data(input_data: InferenceData) -> InferenceData:
    # additional code block here
```
  - Only one implementation of process_data(input_data: InferenceData) -> InferenceData is allowed. There can be as many Python scripts included in the .zip file as needed, but only one can have this method as the entry point.
    - The process_data function must return a dictionary where the keys are strings and the values are NumPy arrays. Single values (scalars) must be returned as single-element arrays. For example:
      def process_data(input_data: InferenceData) -> InferenceData:
          # Return a dictionary with the field `output`, which holds 10 raised
          # to the power of the input field `variable`, rounded to the nearest integer.
          return {'output': np.rint(np.power(10, input_data["variable"]))}
  - InferenceData represents a dictionary of numpy arrays where the first dimension is always the batch size. The type annotations set in the input_schema and output_schema for the model when uploaded must be present and correct.
  - process_data accepts and returns InferenceData. Any other implementations will return an error.
- (Optional): A requirements.txt file that includes any additional Python libraries required by the Python script with the following requirements:
  - The Python libraries must match the targeted infrastructure. For details on uploading models to specific infrastructures such as ARM, see Automated Model Packaging.
  - The Python libraries must be compatible with Python version 3.10.11.
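The rule that scalar results must be returned as single-element NumPy arrays can be sketched as follows. This example assumes NumPy is installed and uses a plain dictionary type alias as a stand-in for mac.types.InferenceData, so that it runs outside a Wallaroo environment. The field names `variable` and `output` are illustrative only.

```python
from typing import Dict

import numpy as np

# Stand-in for mac.types.InferenceData (an assumption for this sketch), so the
# example runs without the Wallaroo SDK installed.
InferenceData = Dict[str, np.ndarray]


def process_data(input_data: InferenceData) -> InferenceData:
    # np.sum returns a scalar; convert it to a plain Python float first.
    total = float(np.sum(input_data["variable"]))
    # A scalar result is wrapped in a single-element array, not returned bare.
    return {"output": np.array([total])}


result = process_data({"variable": np.array([1.0, 2.0, 3.0])})
print(result["output"], result["output"].shape)
```

In a real Python step, the type alias would be replaced by the import from mac.types shown in the requirements above.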

The Python script, optional requirements.txt file, and artifacts are packaged in a .zip file with the Python script and optional requirements.txt file in the root folder. For example, the sample files stored in the folder preprocess_step:

/preprocess_step
    sample-script.py
    requirements.txt
    /artifacts
        datalist.csv

The files are packaged into a .zip file. For example, the following packages the contents of the folder preprocess_step into preprocess_step.zip.

zip -r preprocess_step.zip preprocess_step/*
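Where the zip command line tool is not available, the same archive can be produced from Python. The following is a minimal sketch using only the standard library; `package_folder` is a hypothetical helper, and the files written to the temporary folder are placeholders. It roughly mirrors the zip command above by keeping the folder name as a prefix inside the archive.

```python
import tempfile
import zipfile
from pathlib import Path


def package_folder(folder: Path, zip_path: Path) -> list:
    """Zip `folder` recursively, keeping the folder name as a prefix in the
    archive, as `zip -r preprocess_step.zip preprocess_step/*` does."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(folder.parent).as_posix())
        return archive.namelist()


# Build a placeholder copy of the sample folder, then package it.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "preprocess_step"
    (folder / "artifacts").mkdir(parents=True)
    (folder / "sample-script.py").write_text("# placeholder script\n")
    (folder / "requirements.txt").write_text("")
    (folder / "artifacts" / "datalist.csv").write_text("a,b\n")
    names = package_folder(folder, Path(tmp) / "preprocess_step.zip")

print(names)
```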

In the example below, the Python model is used as a preprocessing step for another ML model. It accepts as input the InferenceData submitted as part of an inference request. It then formats the data and outputs a dictionary of NumPy arrays with the field tensor. This data can then be passed to the next model in a pipeline step.

import datetime
import logging

import numpy as np
import pandas as pd

import wallaroo

from mac.types import InferenceData

logger = logging.getLogger(__name__)

_vars = [
    "bedrooms",
    "bathrooms",
    "sqft_living",
    "sqft_lot",
    "floors",
    "waterfront",
    "view",
    "condition",
    "grade",
    "sqft_above",
    "sqft_basement",
    "lat",
    "long",
    "sqft_living15",
    "sqft_lot15",
    "house_age",
    "renovated",
    "yrs_since_reno",
]


def process_data(input_data: InferenceData) -> InferenceData:
    input_df = pd.DataFrame(input_data)
    thisyear = datetime.datetime.now().year
    input_df["house_age"] = thisyear - input_df["yr_built"]
    input_df["renovated"] = np.where((input_df["yr_renovated"] > 0), 1, 0)
    input_df["yrs_since_reno"] = np.where(
        input_df["renovated"],
        input_df["yr_renovated"] - input_df["yr_built"],
        0,
    )
    input_df = input_df.loc[:, _vars]

    return {"tensor": input_df.to_numpy(dtype=np.float32)}

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. For example, a postprocessing Python step that is the final model step in a pipeline with the output field output is included in the out dataset as the field out.output in the Wallaroo inference result.

	time	in.tensor	out.output	anomaly.count
0	2023-06-20 20:23:28.395	[0.6878518042, 0.1760734021, -0.869514083, 0.3..	[12.886651039123535]	0

Python Libraries

The requirements.txt file specifies:

Python Libraries available through PyPi.org and the specific version. For example:
```
requests == 2.32.2
```
Python Wheels as Python model artifacts: Python Wheels as Python model artifacts are included in the .zip file and are referred to in the Python model’s requirements.txt file based on the relative path within the .zip file.
For example, if the Python model’s .zip file includes the Python Wheel libraries/custom_wheel.whl
```
├── libraries
│   └── custom_wheel.whl
├── main.py
└── requirements.txt
```
Then the requirements.txt file included with the Python model’s .zip file refers to this Python Wheel as:
```
libraries/custom_wheel.whl
```
External Python Wheels: Python Wheels that are available from external sources (that is, not included as Python model artifacts):
- Must be referred to by the full URL.
- Must be available from the Wallaroo instance.
For example, to include the Python Wheel hosted at https://example.wallaroo.ai/libraries/custom_wheel.whl, the requirements.txt file included with the Python model’s .zip file refers to this Python Wheel as:
```
https://example.wallaroo.ai/libraries/custom_wheel.whl
```
Extra Index URL: For Python libraries that require the --extra-index-url flag:
- Set the --extra-index-url flag with the full URL to the extra index. This must be available from the Wallaroo instance.
- In the next line, specify the Python library and version.
- Repeat the steps above for each Python library with an extra index URL.
For example, to include the extra index URL https://download.pytorch.org/whl/cu117 for the torchvision Python library, the requirements.txt file included with the Python model’s .zip file specifies the library as:
```
--extra-index-url https://download.pytorch.org/whl/cu117
torchvision==0.15.0
```

Upload Python Models via the Wallaroo SDK

Python step models are uploaded to Wallaroo through the wallaroo.client.upload_model() method.

Upload Python Model Parameters

Parameter	Type	Description
`name`	`string` (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	`string` (Required)	The path to the model file being uploaded. This must be a `.zip` file as defined in Python Models Requirements.
`framework`	`string` (Required)	Set as `Framework.PYTHON`.
`input_schema`	`pyarrow.lib.Schema` (Required)	The input schema in Apache Arrow schema format.
`output_schema`	`pyarrow.lib.Schema` (Required)	The output schema in Apache Arrow schema format.
`convert_wait`	`bool` (Optional) (Default: True)	True: The script waits for the model conversion to complete. False: The script proceeds without waiting for the model conversion to complete.

Upload Python Model Returns

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`SHA`	String	The hash value of the model file.
`Status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

Upload Python Models Example

The following example is of uploading a Python step ML Model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('id', pa.int64()),
    pa.field('date', pa.string()),
    pa.field('list_price', pa.float64()),
    pa.field('bedrooms', pa.int64()),
    pa.field('bathrooms', pa.float64()),
    pa.field('sqft_living', pa.int64()),
    pa.field('sqft_lot', pa.int64()),
    pa.field('floors', pa.float64()),
    pa.field('waterfront', pa.int64()),
    pa.field('view', pa.int64()),
    pa.field('condition', pa.int64()),
    pa.field('grade', pa.int64()),
    pa.field('sqft_above', pa.int64()),
    pa.field('sqft_basement', pa.int64()),
    pa.field('yr_built', pa.int64()),
    pa.field('yr_renovated', pa.int64()),
    pa.field('zipcode', pa.int64()),
    pa.field('lat', pa.float64()),
    pa.field('long', pa.float64()),
    pa.field('sqft_living15', pa.int64()),
    pa.field('sqft_lot15', pa.int64()),
    pa.field('sale_price', pa.float64())
])

output_schema = pa.schema([
    pa.field('tensor', pa.list_(pa.float32(), list_size=18))
])

preprocess_model = wl.upload_model("preprocess-step", "./models/preprocess_step.zip", \
                                   framework=wallaroo.framework.Framework.PYTHON, \
                                   input_schema=input_schema, output_schema=output_schema)
display(preprocess_model)


Name	preprocess-step
Version	d0cb7d27-5c83-45c6-a231-e16c2c5818b9
File Name	preprocess_step.zip
SHA	c09bbca6748ff23d83f48f57446c3ad6b5758c403936157ab731b3c269c0afb9
Status	ready
Image Path	None
Architecture	x86
Acceleration	none
Updated At	2024-03-Apr 18:11:34

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names must contain only lower case ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.

Parameter	Description
Web Site	https://pytorch.org/
Supported Libraries	`torch==2.0.1` `torchvision==0.15.2`
Framework	`Framework.PYTORCH` aka `pytorch`
Supported File Types	`pt` or `pth` in TorchScript format

IMPORTANT NOTE

The PyTorch model must be in TorchScript format. Scripting (i.e. torch.jit.script()) is always recommended over tracing (i.e. torch.jit.trace()).

From the PyTorch documentation: “Scripting preserves dynamic control flow and is valid for inputs of different sizes.”

For more details, see TorchScript-based ONNX Exporter: Tracing vs Scripting.

During the model upload process, Wallaroo optimizes models by converting them to the Wallaroo Native Runtime, if possible, or by running the model directly in the Wallaroo Containerized Runtime. See Model Deploy for details on how to configure pipeline resources based on the model’s runtime.

IMPORTANT CONFIGURATION NOTE: For PyTorch input schemas, the floats must be pyarrow.float32() for the PyTorch model to be converted to the Native Wallaroo Runtime during the upload process.

PyTorch Input and Output Schemas

PyTorch input and output schemas have additional requirements depending on whether the PyTorch model is single input/output or multiple input/output. This refers to the number of columns:

Single Input/Output: Has one input and one output column.
Multiple Input/Output: Has more than one input or more than one output column.

The column names for the model can be anything. For example:

Model Input Fields:
- length
- width
- intensity
- etc

When creating the input and output schemas for uploading a PyTorch model in Wallaroo, the field names must match the following requirements. For example, for multi-column PyTorch models, the input would be:

Data Schema Input Fields:
- input_1
- input_2
- input_3
- input_...

For a single input/output PyTorch model, the field names must be input and output. For example, if the input field is a list of floats of size 10, and the output field is a list of floats of size 1, the input and output schemas are:

input_schema = pa.schema([
    pa.field('input', pa.list_(pa.float32(), list_size=10))
])

output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float32(), list_size=1))
])

For multi input/output PyTorch models, the data schemas for each input and output field must be named input_1, input_2... and output_1, output_2, etc. These must be in the same order that the PyTorch model is trained to accept them.

For example, a multi input/output PyTorch model that takes the following inputs and outputs:

Inputs
- input_1: List of Floats of length 10.
- input_2: List of Floats of length 5.
Outputs
- output_1: List of Floats of length 3.
- output_2: List of Floats of length 2.

The following input and output schemas would be used.

input_schema = pa.schema([
    pa.field('input_1', pa.list_(pa.float32(), list_size=10)),
    pa.field('input_2', pa.list_(pa.float32(), list_size=5))
])
output_schema = pa.schema([
    pa.field('output_1', pa.list_(pa.float32(), list_size=3)),
    pa.field('output_2', pa.list_(pa.float32(), list_size=2))
])
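The naming rules above can be expressed as a small helper that produces the field names for a given number of input and output columns. This is a sketch only: `pytorch_field_names` is a hypothetical function, not part of the Wallaroo SDK, and it reads the rule as applying numbered names to both sides whenever the model has more than one input or more than one output, which is an interpretation of the text above.

```python
from typing import List, Tuple


def pytorch_field_names(n_inputs: int, n_outputs: int) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: schema field names for a PyTorch model.

    Single input/output models use the bare names "input" and "output".
    Any other combination is treated as multiple input/output and uses
    numbered names, in the order the model accepts and returns them.
    """
    if n_inputs < 1 or n_outputs < 1:
        raise ValueError("a model needs at least one input and one output")
    if n_inputs == 1 and n_outputs == 1:
        return ["input"], ["output"]
    inputs = [f"input_{i}" for i in range(1, n_inputs + 1)]
    outputs = [f"output_{i}" for i in range(1, n_outputs + 1)]
    return inputs, outputs


print(pytorch_field_names(1, 1))
print(pytorch_field_names(2, 2))
```

The returned names would then be used as the first argument of each pa.field entry when building the input and output schemas.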

Uploading PyTorch Models

PyTorch models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload PyTorch Model Parameters

The following parameters are required for PyTorch models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a PyTorch model to Wallaroo.

Parameter	Type	Description
`name`	String (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	String (Required)	The path to the model file being uploaded.
`framework`	String (Required)	Set as `Framework.PYTORCH`.
`input_schema`	pyarrow.lib.Schema (Required)	The input schema in Apache Arrow schema format. Note that float values must be `pyarrow.float32()` for the PyTorch model to be converted to a Wallaroo Native Runtime during model upload.
`output_schema`	pyarrow.lib.Schema (Required)	The output schema in Apache Arrow schema format. Note that float values must be `pyarrow.float32()` for the PyTorch model to be converted to a Wallaroo Native Runtime during model upload.
`convert_wait`	Boolean (Optional) (Default: True)	True: The script waits for the model conversion to complete. False: The script proceeds without waiting for the model conversion to complete.

Once the upload process starts, the model is optimized. This process may take up to 10 minutes depending on the size and complexity of the model.

Upload PyTorch Model Return

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`sha`	String	The hash value of the model file.
`status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

Upload PyTorch Model Example

The following example is of uploading a PyTorch ML Model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('input', pa.list_(pa.float32(), list_size=10))
])

output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float32(), list_size=1))
])

model = wl.upload_model('pt-single-io-model', 
                        "./models/model-auto-conversion_pytorch_single_io_model.pt", 
                        framework=Framework.PYTORCH, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
display(model)

Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a native runtime..
Ready

PyTorch Model Tutorials

The following tutorials demonstrate deploying and performing sample inferences on PyTorch models in Wallaroo.

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names must contain only lower case ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.

Wallaroo supports SKLearn models by containerizing the model and running it as an image.

Scikit-learn is also known as SKLearn.

Parameter	Description
Web Site	https://scikit-learn.org/stable/index.html
Supported Libraries	`scikit-learn==1.3.0`
Framework	`Framework.SKLEARN` aka `sklearn`

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
0	5.1	3.5	1.4	0.2
1	4.9	3.0	1.4	0.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitted for inference, the DataFrame is converted to rows with the column data expressed as a single array. The data is submitted as a single array rather than as JSON labeled columns because this ensures that the values arrive in the exact order the model is trained to accept.

Original DataFrame:

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
0	5.1	3.5	1.4	0.2
1	4.9	3.0	1.4	0.2

Converted DataFrame:

	inputs
0	[5.1, 3.5, 1.4, 0.2]
1	[4.9, 3.0, 1.4, 0.2]
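The conversion shown above can be sketched with pandas, assuming pandas is installed. The `feature_order` list is an illustrative name that holds the column order the model was trained on, taken here from the table above.

```python
import pandas as pd

original = pd.DataFrame(
    {
        "sepal length (cm)": [5.1, 4.9],
        "sepal width (cm)": [3.5, 3.0],
        "petal length (cm)": [1.4, 1.4],
        "petal width (cm)": [0.2, 0.2],
    }
)

# Column order the model was trained on (taken from the table above).
feature_order = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]

# Collapse each row into one list so the values travel in a fixed order
# under the single field name `inputs`.
converted = pd.DataFrame({"inputs": original[feature_order].to_numpy().tolist()})
print(converted)
```

Selecting the columns through an explicit list, rather than relying on the DataFrame's own column order, keeps the array order stable even if the source data is rearranged.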

SKLearn Schema Outputs

Outputs for SKLearn that are meant to be predictions or probabilities when output by the model are labeled in the output schema for the model when uploaded to Wallaroo. For example, a model that outputs either 1 or 0 as its output would have the output schema as follows:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)

	time	in.inputs	out.predictions	anomaly.count
0	2023-07-05 15:11:29.776	[5.1, 3.5, 1.4, 0.2]	0	0
1	2023-07-05 15:11:29.776	[4.9, 3.0, 1.4, 0.2]	0	0

Uploading SKLearn Models

SKLearn models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload SKLearn Model Parameters

The following parameters are required for SKLearn models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a SKLearn model to Wallaroo.

Parameter	Type	Description
`name`	String (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	String (Required)	The path to the model file being uploaded.
`framework`	String (Required)	Set as the `Framework.SKLEARN`.
`input_schema`	pyarrow.lib.Schema (Required)	The input schema in Apache Arrow schema format.
`output_schema`	pyarrow.lib.Schema (Required)	The output schema in Apache Arrow schema format.
`convert_wait`	Boolean (Optional) (Default: True)	True: The script waits for the model conversion to complete. False: The script proceeds without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload SKLearn Model Return

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`SHA`	String	The hash value of the model file.
`Status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

Upload SKLearn Model Example

The following example is of uploading a pickled SKLearn ML Model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

model = wl.upload_model('sklearn-clustering-kmeans', 
                        "models/model-auto-conversion_sklearn_kmeans.pkl", 
                        framework=Framework.SKLEARN, 
                        input_schema=input_schema, 
                        output_schema=output_schema,
                       )

Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a native runtime..
Model is attempting loading to a native runtime..incompatible

Model is pending loading to a container runtime..
Model is attempting loading to a container runtime..............successful

SKLearn Model Tutorials

The following tutorials demonstrate deploying and performing sample inferences on SKLearn models in Wallaroo.

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names must contain only lower case ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.

Wallaroo supports TensorFlow/Keras models by containerizing the model and running as an image.

Parameter	Description
Web Site	https://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries	`tensorflow==2.13.1` `keras==2.13.1`
Framework	`Framework.KERAS` aka `keras`
Supported File Types	SavedModel format as .zip file and HDF5 format

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are uploaded as a .zip file of the SavedModel directory. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.
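Before compressing the directory, it can help to confirm that it contains the entries shown in the listing above. The following is a minimal sketch using only the Python standard library; `missing_saved_model_entries` and `EXPECTED_ENTRIES` are hypothetical names, the expected entries are taken from the listing above, and other SavedModel layouts may differ. The files created here are empty placeholders.

```python
import tempfile
from pathlib import Path

# Entries taken from the directory listing above; other SavedModel layouts may differ.
EXPECTED_ENTRIES = ("saved_model.pb", "variables/variables.index")


def missing_saved_model_entries(model_dir: Path) -> list:
    """Return the expected entries that are absent from `model_dir`."""
    return [entry for entry in EXPECTED_ENTRIES if not (model_dir / entry).exists()]


# Demonstrate with a placeholder directory that starts out incomplete.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "alohacnnlstm"
    (model_dir / "variables").mkdir(parents=True)
    (model_dir / "variables" / "variables.index").write_bytes(b"")
    before = missing_saved_model_entries(model_dir)
    (model_dir / "saved_model.pb").write_bytes(b"")
    after = missing_saved_model_entries(model_dir)

print(before, after)
```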

TensorFlow Keras H5 Format

Wallaroo supports the H5 format for TensorFlow Keras models.

Tensorflow Library in Wallaroo JupyterHub Service

For Jupyter Notebooks running in the Wallaroo JupyterHub Service that import the tensorflow library, for example:

import tensorflow

Install the tensorflow-cpu library by executing the following command in the terminal shell:

pip install tensorflow-cpu==2.13.1 --user

Uploading TensorFlow Models

TensorFlow Keras models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload TensorFlow Model Parameters

The following parameters are required for TensorFlow Keras models. Note that while some fields are optional for the upload_model method, they are required to properly upload a TensorFlow Keras model to Wallaroo.

Parameter	Type	Description
`name`	String (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	String (Required)	The path to the model file being uploaded.
`framework`	String (Required)	Set as the `Framework.KERAS`.
`input_schema`	pyarrow.lib.Schema (Required)	The input schema in Apache Arrow schema format.
`output_schema`	pyarrow.lib.Schema (Required)	The output schema in Apache Arrow schema format.
`convert_wait`	Boolean (Optional) (Default: True)	True: The script waits for the model conversion to complete. False: The script proceeds without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload TensorFlow Model Return

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`SHA`	String	The hash value of the model file.
`Status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

The following example uploads a TensorFlow Keras model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('input', pa.list_(pa.float64(), list_size=10))
])
output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float64(), list_size=32))
])

model = wl.upload_model('keras-sequential-single-io', 
                        'models/model-auto-conversion_keras_single_io_keras_sequential_model.h5', 
                        framework=Framework.KERAS, 
                        input_schema=input_schema, 
                        output_schema=output_schema)

Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a native runtime.
Model is pending loading to a container runtime..
Model is attempting loading to a container runtime.....................successful

Ready

TensorFlow Keras Model Tutorials

The following tutorials demonstrate deploying and performing sample inferences on TensorFlow Keras models in Wallaroo.

TensorFlow Keras Sequential Single IO

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names must contain only lower case ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.

Wallaroo supports XGBoost models by containerizing the model and running as an image.

Parameter	Description
Web Site	https://xgboost.ai/
Supported Libraries	`scikit-learn==1.3.0` `xgboost==1.7.4`
Framework	`Framework.XGBOOST` aka `xgboost`
Supported File Types	`pickle` (XGB files are not supported.)

Since the Wallaroo 2024.1 release, XGBoost support is enhanced to performantly support a wider set of XGBoost models. XGBoost models are not required to be trained with ONNX nomenclature in order to successfully convert to a performant runtime.

XGBoost Types Support

The following XGBoost model types are supported by Wallaroo. XGBoost models not supported by Wallaroo are supported via the Custom Model, also known as Bring Your Own Predict (BYOP).

XGBoost Model Type	Wallaroo Packaging Supported
XGBClassifier	√
XGBRegressor	√
Booster Classifier	√
Booster Regressor	√
Booster Random Forest Regressor	√
Booster Random Forest Classifier	√
XGBRFClassifier	√
XGBRFRegressor	√
XGBRanker*	X

* XGBRanker XGBoost models are currently supported by converting them to BYOP models.

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
0	5.1	3.5	1.4	0.2
1	4.9	3.0	1.4	0.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32(), list_size=4))
])

Original DataFrame:

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
0	5.1	3.5	1.4	0.2
1	4.9	3.0	1.4	0.2

Converted DataFrame:

	inputs
0	[5.1, 3.5, 1.4, 0.2]
1	[4.9, 3.0, 1.4, 0.2]
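The conversion shown above can be sketched with pandas. This is one possible approach rather than a required Wallaroo step; the variable names are illustrative, and the cast to float64 is an assumption made so that every value shares one data type.

```python
import pandas as pd

# The original DataFrame: four float columns, one row per observation.
original = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9],
    "sepal width (cm)": [3.5, 3.0],
    "petal length (cm)": [1.4, 1.4],
    "petal width (cm)": [0.2, 0.2],
})

# Keep the columns in the order the model was trained to accept.
feature_columns = list(original.columns)

# Cast every column to one data type, then pack each row into a single list
# stored under the 'inputs' column.
converted = pd.DataFrame({
    "inputs": original[feature_columns].astype("float64").to_numpy().tolist()
})

print(converted)
```

Each row of `converted` now holds one list of 4 values, matching the single `inputs` field in the input schema.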

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs.

Outputs for XGBoost that are meant to be predictions or probabilities must be labeled as part of the output schema. For example, a model that outputs either 1 or 0 would have the following output schema:

output_schema = pa.schema([
    pa.field('predictions', pa.float32()),
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)

	time	in.inputs	out.predictions	anomaly.count
0	2023-07-05 15:11:29.776	[5.1, 3.5, 1.4, 0.2]	0	0
1	2023-07-05 15:11:29.776	[4.9, 3.0, 1.4, 0.2]	0	0

Uploading XGBoost Models

XGBoost models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload XGBoost Model Parameters

The following parameters are available for XGBoost models.

Parameter	Type	Description
`name`	String (Required)	The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
`path`	String (Required)	The path to the model file being uploaded.
`framework`	String (Required)	Set as the `Framework.XGBOOST`.
`input_schema`	pyarrow.lib.Schema (Required)	The input schema in Apache Arrow schema format.
`output_schema`	pyarrow.lib.Schema (Required)	The output schema in Apache Arrow schema format.
`convert_wait`	Boolean (Optional) (Default: True)	True: Waits in the script for the model conversion completion. False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload XGBoost Model Return

upload_model returns a wallaroo.model_version.ModelVersion object with the following fields.

Field	Type	Description
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`SHA`	String	The hash value of the model file.
`Status`	String	The status of the model.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

Upload XGBoost Model Example

The following example uploads an XGBoost ML Model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

output_schema = pa.schema([
    pa.field('output', pa.float64())
])

model = wl.upload_model('xgboost-classification', 
                        './models/model-auto-conversion_xgboost_xgb_classification_iris.pkl', 
                        framework=Framework.XGBOOST, 
                        input_schema=input_schema, 
                        output_schema=output_schema)

Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a native runtime..
Model is attempting loading to a native runtime..incompatible

Model is pending loading to a container runtime.
Model is attempting loading to a container runtime............successful

Ready

XGBoost Model Tutorials

The following tutorials demonstrate deploying and performing sample inferences on XGBoost models in Wallaroo.

Parameter	Description
Web Site	https://mlflow.org
Supported Libraries	mlflow==1.30.0

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names must contain only lowercase ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.

Containerized MLFlow Model Operations

Register a Containerized MLFlow Model

Containerized MLFlow models are not uploaded, but registered from a container registry service. This is performed through the wallaroo.client.register_model_image(options) and wallaroo.model_version.configure(options) methods.

IMPORTANT NOTICE

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide (https://docs.wallaroo.ai/wallaroo-operations-guide/wallaroo-configuration/wallaroo-private-model-registry/) for details on how to configure a Wallaroo instance with a private model registry.

IMPORTANT NOTICE

Models registered through the Wallaroo SDK are associated with the current workspace in the SDK session, assigned as the user’s Default Workspace by default. See Wallaroo SDK Essentials Guide: Workspace Management (https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-workspace/) for full details on creating and working with workspaces.

Register a Containerized MLFlow Model Parameters

The following parameters must be set for wallaroo.client.register_model_image(options) and wallaroo.model_version.configure(options) for a Containerized MLFlow model to be registered in Wallaroo.

Register Model Image Parameters

Parameter	Type	Description
`name`	`string` (Required)	The name of the model. Model names are unique per workspace. Models that are registered with the same name are assigned as a new version of the model.
`image`	`string` (Required)	The URL to the containerized MLFlow model in the MLFlow Registry.

Model Version Configuration Parameters

Model version configurations are updated with the wallaroo.model_version.configure(options) method and include the following parameters.

Parameter	Type	Description
`tensor_fields`	(List[string]) (Optional)	A list of alternate input fields. For example, if the model accepts the input fields `['variable1', 'variable2']`, `tensor_fields` allows those inputs to be overridden to `['square_feet', 'house_age']`, or other values as required. These only apply to `ONNX` models.
`batch_config`	(List[string]) (Optional)	Batch config is either `None` for multiple-input inferences, or `single` to accept an inference request with only one row of data.

For model version configuration for MLFlow models, the following must be defined:

runtime: Set as mlflow.
input_schema: The input schema from the Apache Arrow pyarrow.lib.Schema format.
output_schema: The output schema from the Apache Arrow pyarrow.lib.Schema format.

Register a Containerized MLFlow Model Returns

wallaroo.client.register_model_image(options) returns the model version. The model version refers to the version of the model object in Wallaroo. In Wallaroo, a model version update happens when we upload a new model file (artifact) against the same model object name.

Note that models are uploaded to the current workspace assigned in the SDK session. By default, this is the user’s Default Workspace.

Field	Type	Description
`id`	Integer	The numerical identifier of the model version.
`name`	String	The name of the model.
`version`	String	The model version as a unique UUID.
`file_name`	String	The file name of the model as stored in Wallaroo.
`image_path`	String	The image used to deploy the model in the Wallaroo engine.
`last_update_time`	DateTime	When the model was last updated.

Register a Containerized MLFlow Model Example

The following example demonstrates registering a Statsmodels model stored in an MLFlow container with a Wallaroo instance.

sm_input_schema = pa.schema([
  pa.field('temp', pa.float32()),
  pa.field('holiday', pa.uint8()),
  pa.field('workingday', pa.uint8()),
  pa.field('windspeed', pa.float32())
])

sm_output_schema = pa.schema([
    pa.field('predicted_mean', pa.float32())
])

sm_model = wl.register_model_image(
    name="mlflow-statmodels",
    image="ghcr.io/wallaroolabs/wallaroo_tutorials/mlflow-statsmodels-example:2023.1"
    ).configure("mlflow", 
            input_schema=sm_input_schema, 
            output_schema=sm_output_schema
    )

sm_model

Name	mlflow-statmodels
Version	eb1bcec8-63fe-4a82-98ea-fc4945786973
File Name	none
SHA	3afd13d9c5070679e284050cd099e84aa2e5cb7c08a788b21d6cb2397615d018
Status	ready
Image Path	ghcr.io/wallaroolabs/wallaroo_tutorials/mlflow-statsmodels-example:2023.1
Architecture	None
Updated At	2024-30-Jan 16:11:55

MLFlow Data Formats

When using containerized MLFlow models with Wallaroo, the inputs and outputs must be named. For example, the following output:

[-12.045839810372835]

Would need to be wrapped with the data values named:

[{"prediction": -12.045839810372835}]

A short sample code for wrapping data may be:

output_df = pd.DataFrame(prediction, columns=["prediction"])
return output_df
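The wrapping step above can be sketched as a small standalone function. This is a sketch under stated assumptions: `wrap_predictions` is a hypothetical helper name, and the `prediction` column name is taken from the example above.

```python
import pandas as pd


def wrap_predictions(prediction, column="prediction"):
    """Wrap a flat list of predictions in a DataFrame with a named column."""
    return pd.DataFrame(prediction, columns=[column])


# The unnamed output from the example above.
raw_output = [-12.045839810372835]

# Convert to named records, one dictionary per row.
named_records = wrap_predictions(raw_output).to_dict(orient="records")
print(named_records)
```

The records form pairs each value with its field name, which is the shape the named output requirement describes.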

Wallaroo users can register their trained machine learning models from a model registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

For instructions on creating and adding a registry to Wallaroo, see Connect a Model Registry.

This guide details how to add ML Models from a model registry service into a Wallaroo instance.

Artifact Requirements

Models are uploaded to the Wallaroo instance as the specific artifact - the “file” or other data that represents the model itself. The artifact must comply with the Wallaroo model framework and version requirements or it will not be deployed. Models that fall outside of the supported model types can be registered to a Wallaroo workspace as MLFlow 1.30.0 containerized models.

Wallaroo Registry Model Operations

List Registries in a Workspace: List the available registries in the current workspace.
List Models: List Models in a Registry
Upload Model: Upload a version of a ML Model from the Registry to a Wallaroo workspace.
List Model Versions: List the versions of a particular model.
Remove Registry from Workspace: Remove a specific Registry configuration from a specific workspace.

List Registries in a Workspace

Registries associated with a workspace are listed with the Wallaroo.Client.list_model_registries() method. This lists all registries associated with the current workspace.

List Registries in a Workspace Parameters

None

List Registries in a Workspace Returns

A List of Registries with the following fields.

Field	Type	Description
`Name`	String	The name of the MLFlow Registry service.
`URL`	string	The URL for connecting to the service.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

List Registries in a Workspace Example

wl.list_model_registries()

name	registry url	created at	updated at
gib	https://sampleregistry.wallaroo.ai	2023-27-Jun 03:22:46	2023-27-Jun 03:22:46
ExampleNotebook	https://sampleregistry.wallaroo.ai	2023-27-Jun 13:57:26	2023-27-Jun 13:57:26

List Models in a Registry

Models available to the Wallaroo instance through the MLFlow Registry are listed with the Wallaroo.Registry.list_models() method.

List Models in a Registry Parameters

None

List Models in a Registry Returns

A List of models with the following fields.

Field	Type	Description
`Name`	String	The name of the model.
`Registry User`	string	The user account that is tied to the registry service for this model.
`Versions`	int	The number of versions for the model, starting at 0.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

List Models in a Registry Example

registry.list_models()

Name	Registry User	Versions	Created At	Updated At
testmodel	sample.user@wallaroo.ai	0	2023-16-Jun 14:38:42	2023-16-Jun 14:38:42
testmodel2	sample.user@wallaroo.ai	0	2023-16-Jun 14:41:04	2023-16-Jun 14:41:04
wine_quality	sample.user@wallaroo.ai	2	2023-16-Jun 15:05:53	2023-16-Jun 15:09:57

Retrieve Specific Model Details from the Registry

Model details are retrieved by calling Wallaroo.Registry.list_models() and selecting an element from the returned list, which is then saved as a Registered Model object.

The following will return the most recent model added to the MLFlow Registry service.

mlflow_model = registry.list_models()[-1]
mlflow_model

Field	Type	Description
`Name`	String	The name of the model.
`Registry User`	string	The user account that is tied to the registry service for this model.
`Versions`	int	The number of versions for the model, starting at 0.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

List Model Versions of Registered Model

MLFlow registries can contain multiple versions of a ML Model. These are listed with the Registered Model versions attribute. The versions are listed in reverse order of insertion, with the most recent model version in position 0.

List Model Versions of Registered Model Parameters

None

List Model Versions of Registered Model Returns

A List of the Registered Model Versions with the following fields.

Field	Type	Description
`Name`	String	The name of the model.
`Version`	int	The version number. The higher numbers are the most recent.
`Description`	String	The registered model’s description from the MLFlow Registry service.

List Model Versions of Registered Model Example

The following will return the most recent model added to the MLFlow Registry service and list its versions.

mlflow_model = registry.list_models()[-1]
mlflow_model.versions

Name	Version	Description
wine_quality	2	None
wine_quality	1	None

List Model Version Artifacts

Artifacts belonging to a MLFlow registry model are listed with the Model Version list_artifacts() method. This returns all artifacts for the model.

List Model Version Artifacts Parameters

None

List Model Version Artifacts Returns

A List of artifacts with the following fields.

Field	Type	Description
`file_name`	String	The name assigned to the artifact.
`file_size`	String	The size of the artifact in bytes.
`full_path`	String	The path of the artifact. This will be used to upload the artifact to Wallaroo.

List Model Version Artifacts Example

The following will list the artifacts in a single registry model.

single_registry_model.versions[0].list_artifacts()

File Name	File Size	Full Path
MLmodel	546B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/MLmodel
conda.yaml	182B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/conda.yaml
model.pkl	1429B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl
python_env.yaml	122B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/python_env.yaml
requirements.txt	73B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/requirements.txt

Upload a Model from a Registry

Models are uploaded from a MLFlow Registry to the Wallaroo workspace with the Wallaroo.Registry.upload_model method.

Upload a Model from a Registry Parameters

Parameter	Type	Description
`name`	string (Required)	The name to assign the model once uploaded. Model names are unique within a workspace. Models assigned the same name as an existing model will be uploaded as a new model version.
`path`	string (Required)	The full path to the model artifact in the registry.
`framework`	string (Required)	The Wallaroo model `Framework`. See Model Uploads and Registrations Supported Frameworks
`input_schema`	`pyarrow.lib.Schema` (Required for non-native runtimes)	The input schema in Apache Arrow schema format.
`output_schema`	`pyarrow.lib.Schema` (Required for non-native runtimes)	The output schema in Apache Arrow schema format.

Upload a Model from a Registry Returns

The registry model details as follows.

Field	Type	Description
`Name`	String	The name of the model.
`Version`	string	The version registered in the Wallaroo instance in UUID format.
`File Name`	string	The file name associated with the ML Model in the Wallaroo instance.
`SHA`	string	The model’s hash value.
`Status`	string	The status of the model from the following list. `pending_conversion`: The model is uploaded to Wallaroo and is ready to convert. `converting`: The model is being converted into a Wallaroo supported runtime. `ready` : The model is ready and available for use. `error`: The model conversion has failed. Check error messages and verify the model is the correct version and framework.
`Image Path`	string	The image used for the containerization of the model.
`Updated At`	DateTime	When the model was last updated.

Upload a Model from a Registry Example

The following uploads the model.pkl artifact from the registry with the SKLEARN framework into the current Wallaroo workspace.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32(), list_size=4))
])

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

model = registry.upload_model(
  name="sklearnonnx", 
  path="https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl", 
  framework=Framework.SKLEARN,
  input_schema=input_schema,
  output_schema=output_schema)


Name	sklearnonnx
Version	63bd932d-320d-4084-b972-0cfe1a943f5a
File Name	model.pkl
SHA	970da8c178e85dfcbb69fab7bad0fb58cd0c2378d27b0b12cc03a288655aa28d
Status	pending_conversion
ImagePath	None
Updated At	2023-05-Jul 19:14:49

Retrieve Model Status

The model status is retrieved with the Model status() method.

Retrieve Model Status Parameters

None

Retrieve Model Status Returns

Field	Type	Description
status	String	The current status of the uploaded model. `pending_conversion`: The model is uploaded to Wallaroo and is ready to convert. `converting`: The model is being converted into a Wallaroo supported runtime. `ready` : The model is ready and available for use. `error`: The model conversion has failed. Check error messages and verify the model is the correct version and framework.

Retrieve Model Status Example

The following demonstrates checking the status in a while loop until the model shows either ready or error.

import time
while model.status() != "ready" and model.status() != "error":
    print(model.status())
    time.sleep(3)
print(model.status())

converting
converting
ready
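The polling loop above runs until the status changes, however long that takes. The following is a sketch of a variant with a timeout; `wait_for_model` is a hypothetical helper and not part of the Wallaroo SDK. It assumes only that the object passed in has a status() method returning the status strings listed above, and the 600 second default is an assumption.

```python
import time


def wait_for_model(model, timeout_s=600.0, poll_s=3.0):
    """Poll model.status() until it returns 'ready' or 'error'.

    Raises TimeoutError if neither status is reached within timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Read the status once per iteration so the check and the error
        # message refer to the same value.
        status = model.status()
        if status in ("ready", "error"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"model still '{status}' after {timeout_s} seconds")
        time.sleep(poll_s)
```

A call such as wait_for_model(model) returns the final status, so the caller can branch on ready or error.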

Model Registry Tutorials

The following tutorials demonstrate deploying and performing sample inferences on model artifacts stored in model registries in Wallaroo.

Model Registry Service with Wallaroo SDK

Python Libraries

The following Python libraries are used by default with the Wallaroo SDK and the JupyterHub service installed with Wallaroo.

The following ML Model versions and Python libraries are supported by Wallaroo. When using the Wallaroo autoconversion library or working with a local version of the Wallaroo SDK, use the following versions for maximum compatibility.

Library	Supported Version
Python	python==3.10
Wallaroo	wallaroo==2025.1.0
onnx	onnx==1.14.1
tensorflow	tensorflow==2.13.1
keras	keras==2.13.1
pytorch	torch==2.0.1
sk-learn aka scikit-learn	scikit-learn==1.3.0
statsmodels	statsmodels==0.13.2
XGBoost	xgboost==1.7.4
MLFlow	mlflow==1.30.0

Supported Data Types

The following data types are supported for transporting data to and from Wallaroo in the following runtimes:

ONNX
TensorFlow
MLFlow

Data Type Conditions

The following conditions apply to data types used in inference requests.

None or Null data types are not submitted. All fields must have submitted values that match their data type. For example, if the schema expects a float value, then some value of type float must be submitted and cannot be None or Null. If a schema expects a string value, then some value of type string must be submitted, etc. The exception is BYOP models, which can accept optional inputs.
datetime data types must be converted to string.
ONNX models support multiple inputs only of the same data type.
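The first two conditions above can be checked on a pandas DataFrame before it is submitted. The following is a minimal sketch; `prepare_for_inference` is a hypothetical helper and not part of the Wallaroo SDK, and the ISO-style timestamp format is an assumption, since the guide only states that datetime values must be converted to string.

```python
import pandas as pd


def prepare_for_inference(df: pd.DataFrame) -> pd.DataFrame:
    """Reject null values and convert datetime columns to strings."""
    # None, NaN, and NaT all count as null here.
    if df.isnull().any().any():
        missing = df.columns[df.isnull().any()].tolist()
        raise ValueError(f"null values found in columns: {missing}")

    prepared = df.copy()
    for column in prepared.columns:
        if pd.api.types.is_datetime64_any_dtype(prepared[column]):
            # Assumed format; use whatever string format the model expects.
            prepared[column] = prepared[column].dt.strftime("%Y-%m-%dT%H:%M:%S")
    return prepared


frame = pd.DataFrame({
    "reading": [5.1, 4.9],
    "recorded_at": pd.to_datetime(["2023-07-05 15:11:29", "2023-07-05 15:11:30"]),
})
prepared = prepare_for_inference(frame)
print(prepared)
```

Raising an error on null values, rather than filling them in, leaves the choice of replacement value to the caller.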

Runtime	Float16	Float32	Float64
ONNX		X	X
TensorFlow	X	X	X
MLFlow	X	X	X

* (Brain Float 16, represented internally as a f32)

Runtime	Int8	Int16	Int32	Int64
ONNX	X	X	X	X
TensorFlow	X	X	X	X
MLFlow	X	X	X	X

Runtime	Uint8	Uint16	Uint32	Uint64
ONNX	X	X	X	X
TensorFlow	X	X	X	X
MLFlow	X	X	X	X

Runtime	Boolean	Utf8 (String)	FixedSizeList*
ONNX			X
TensorFlow	X	X	X
MLFlow	X	X	X

* Fixed sized lists of any of the previously supported data types.
