Wallaroo SDK Upload Tutorials: Python Step Shape

How to upload different Python Step Shape models to Wallaroo.

The following tutorials cover how to upload sample Python Step Shape models into a Wallaroo instance.

Python scripts are uploaded to Wallaroo and and treated like an ML Models in Pipeline steps. These will be referred to as Python steps.

Python steps can include:

Preprocessing steps to prepare the data received to be handed to ML Model deployed as another Pipeline step.
Postprocessing steps to take data output by a ML Model as part of a Pipeline step, and prepare the data to be received by some other data store or entity.
A model contained within a Python script.

In all of these, the requirements for uploading a Python script as a ML Model in Wallaroo are the same.

Parameter	Description
Web Site	https://www.python.org/
Supported Libraries	`python==3.8`
Framework	`Framework.PYTHON` aka `python`
Runtime	Native aka `python`

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from a ML Model who’s output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results the output field as the out.output field in the Wallaroo inference result.

	time	in.tensor	out.output	check_failures
0	2023-06-20 20:23:28.395	[0.6878518042, 0.1760734021, -0.869514083, 0.3..	[12.886651039123535]	0

Python Model Shape Upload to Wallaroo with the Wallaroo SDK

How to upload a Python Step Shape model to Wallaroo via the SDK