Wallaroo SDK Guides

Reference Guide for the most essential Wallaroo SDK Commands

Supported Model Versions and Libraries

The following ML Model versions and Python libraries are supported by Wallaroo. When using the Wallaroo autoconversion library or working with a local version of the Wallaroo SDK, use the following versions for maximum compatibility.

Library                        Supported Version
Python                         3.8.6 and above
onnx                           1.12.0
tensorflow                     2.9.1
keras                          2.9.0
pytorch                        1.13.1
sk-learn aka scikit-learn      1.1.2
statsmodels                    0.13.2
XGBoost                        1.6.2
MLFlow                         1.30.0

Supported Data Types

The following data types are supported for transporting data to and from Wallaroo in these runtimes:

  • ONNX
  • TensorFlow
  • MLFlow

Data Type Conditions

The following conditions apply to data types used in inference requests.

  • None or Null values cannot be submitted. All fields must have submitted values that match their data type. For example, if the schema expects a float value, then some value of type float must be submitted and cannot be None or Null. If a schema expects a string value, then some value of type string must be submitted, etc.
  • datetime data types must be converted to string.
  • ONNX models support multiple inputs only when all inputs share the same data type.
Runtime       BFloat16*   Float16   Float32   Float64
ONNX          -           -         X         X
TensorFlow    X           -         X         X
MLFlow        X           -         X         X

  • * (Brain Float 16, represented internally as an f32)

Runtime       Int8   Int16   Int32   Int64
ONNX          X      X       X       X
TensorFlow    X      X       X       X
MLFlow        X      X       X       X

Runtime       Uint8   Uint16   Uint32   Uint64
ONNX          X       X        X        X
TensorFlow    X       X        X        X
MLFlow        X       X        X        X
Runtime       Boolean   Utf8 (String)   Complex 64   Complex 128   FixedSizeList*
ONNX          X         -               -            -             -
TensorFlow    X         X               -            -             X
MLFlow        X         X               -            -             X

  • * Fixed sized lists of any of the previously supported data types.
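
The data type conditions above can be checked before an inference request is sent. The following is a minimal sketch that uses only the Python standard library: it rejects None values and converts datetime values to strings. The helper name prepare_records and the sample field names are hypothetical and are not part of the Wallaroo SDK.

```python
from datetime import date, datetime
from typing import Any, Dict, List


def prepare_records(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the records with no None values and no datetime objects."""
    prepared = []
    for index, record in enumerate(records):
        row = {}
        for field, value in record.items():
            # None/Null values cannot be submitted: every field needs a real value.
            if value is None:
                raise ValueError(f"record {index}: field '{field}' is None")
            # datetime (and date) values are converted to ISO 8601 strings.
            if isinstance(value, (datetime, date)):
                value = value.isoformat()
            row[field] = value
        prepared.append(row)
    return prepared


# Hypothetical sample record, for illustration only.
sample = [{"event_time": datetime(2023, 5, 17, 20, 46), "score": 0.5}]
print(prepare_records(sample))
```

Raising an error on None, rather than silently substituting a default, keeps the decision about a replacement value with whoever owns the data.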

Wallaroo JupyterHub Python Libraries

When using the Wallaroo SDK, it is recommended that the Python modules used are the same as those used in the Wallaroo JupyterHub environments to ensure maximum compatibility. When installing modules in the Wallaroo JupyterHub environments, do not override the following modules or versions, as that may impact the performance of the JupyterHub environments.

appdirs == 1.4.4
gql == 3.4.0
ipython == 7.24.1
matplotlib == 3.5.0
numpy == 1.22.3
orjson == 3.8.0
pandas == 1.3.4
pyarrow == 9.0.0
PyJWT == 2.4.0
python_dateutil == 2.8.2
PyYAML == 6.0
requests == 2.25.1
scipy == 1.8.0
seaborn == 0.11.2
tenacity == 8.0.1
# Required by gql?
requests_toolbelt >= 0.9.1,<1
# Required by the autogenerated ML Ops client
httpx >= 0.15.4,<0.24.0
attrs >= 21.3.0
# These are documented as part of the autogenerated ML Ops requirements
# python = ^3.7
# python-dateutil = ^2.8.0
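
To see whether a local environment matches the pinned versions above, the installed versions can be compared against the list. The following is a minimal sketch that uses the standard library importlib.metadata module (available from Python 3.8). The helper name check_pins is hypothetical, and only a few of the pins are copied in for brevity.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# A subset of the pinned versions listed above.
PINNED = {
    "numpy": "1.22.3",
    "pandas": "1.3.4",
    "pyarrow": "9.0.0",
    "requests": "2.25.1",
    "PyYAML": "6.0",
}


def check_pins(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Map each package to (expected version, installed version or None, match?)."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


for name, (expected, installed, ok) in check_pins(PINNED).items():
    print(f"{name}: expected {expected}, installed {installed}, match={ok}")
```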

1 - Wallaroo SDK Install Guides

How to install the Wallaroo SDK

The following guides demonstrate how to install the Wallaroo SDK in different environments. The Wallaroo SDK is installed by default into a Wallaroo instance for use with the JupyterHub service.

The Wallaroo SDK requires Python 3.8.6 and above and is available through the Wallaroo SDK Page.

1.1 - Wallaroo SDK AWS Sagemaker Install Guide

How to install the Wallaroo SDK in AWS Sagemaker

This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.

Installing the Wallaroo SDK in AWS Sagemaker

Organizations that develop machine learning models in AWS Sagemaker can deploy them to a Wallaroo instance through the Wallaroo SDK. This guide assists users with installing the Wallaroo SDK and making a standard connection to a Wallaroo instance.

Organizations can use Wallaroo SSO for Amazon Web Services to provide AWS users access to the Wallaroo instance.

These instructions are based on the Connect to Wallaroo guides.

This tutorial provides the following:

  • aloha-cnn-lstm.zip: A pre-trained open source model that uses an Aloha CNN LSTM model for classifying Domain names as being either legitimate or being used for nefarious purposes such as malware distribution.
  • Test Data Files:
    • data_1k.arrow: 1,000 records
    • data_25k.arrow: 25,000 records

For this example, a virtual python environment will be used. This will set the necessary libraries and specific Python version required.

Prerequisites

The following is required for this tutorial:

  • A Wallaroo instance version 2023.1 or later.
  • An AWS Sagemaker domain with a Notebook Instance.
  • Python 3.8.6 or later.
  • The following Python libraries installed:
    • os: Part of the Python standard library; no separate installation is needed.
    • wallaroo: The Wallaroo SDK. Included with the Wallaroo JupyterHub service by default.
    • pandas: Pandas, mainly used for Pandas DataFrame
    • pyarrow: PyArrow for Apache Arrow support
    • polars: Polars for DataFrame with native Apache Arrow support

General Steps

For our example, we will perform the following:

  • Install Wallaroo SDK
    • Set up a Python virtual environment through conda with the libraries that enable the virtual environment for use in a JupyterHub environment.
    • Install the Wallaroo SDK.
  • Wallaroo SDK from remote JupyterHub Demonstration (Optional): The following steps are an optional exercise to demonstrate using the Wallaroo SDK from a remote connection. The entire tutorial can be found on the Wallaroo Tutorials repository.
    • Connect to a remote Wallaroo instance.
    • Create a workspace for our work.
    • Upload the Aloha model.
    • Create a pipeline that can ingest our submitted data, submit it to the model, and export the results.
    • Run a sample inference through our pipeline by loading a file.
    • Retrieve the external deployment URL. This sample Wallaroo instance has been configured to create external inference URLs for pipelines. For more information, see the External Inference URL Guide.
    • Run a sample inference through our pipeline’s external URL and store the results in a file. This assumes that the External Inference URLs have been enabled for the target Wallaroo instance.
    • Undeploy the pipeline and return resources back to the Wallaroo instance’s Kubernetes environment.

Install Wallaroo SDK

Set Up Virtual Python Environment

To set up the Python virtual environment for use of the Wallaroo SDK:

  1. From AWS Sagemaker, select the Notebook instances.

  2. From the list of notebook instances, select Open JupyterLab for the notebook instance to be used.

  3. From the Launcher, select Terminal.

  4. From a terminal shell, create the Python virtual environment with conda. Replace wallaroosdk with the name of the virtual environment as required by your organization. Note that Python 3.8.6 and above is specified as a requirement for Python libraries used with the Wallaroo SDK. The following will install the latest version of Python 3.8.

    conda create -n wallaroosdk python=3.8
    
  5. (Optional) If the shells have not been initialized with conda, use the following to initialize it. The following examples will use the bash shell.

    1. Initialize the bash shell with conda with the command:

      conda init bash
      
    2. Launch the bash shell that has been initialized for conda:

      bash
      
  6. Activate the new environment.

    conda activate wallaroosdk
    
  7. Install the ipykernel library. This allows the JupyterHub notebooks to access the Python virtual environment as a kernel, and is required for the second part of this tutorial.

    conda install ipykernel
    
    1. Install the new virtual environment as a python kernel.

      ipython kernel install --user --name=wallaroosdk
      
  8. Install the Wallaroo SDK. This process may take several minutes while the other required Python libraries are added to the virtual environment.

    • IMPORTANT NOTE: The version of the Wallaroo SDK should match the Wallaroo instance. For instance, this tutorial connects to a Wallaroo Enterprise version 2023.1 instance, so the SDK version should be wallaroo==2023.1.0.
    pip install wallaroo==2023.1.0
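
After the installation steps above, the environment can be checked from Python. The following is a minimal sketch that uses only the standard library. The helper names are hypothetical, and comparing only the first two version components (for example 2023.1) is an assumption about what "matching" means, not a documented Wallaroo rule.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 8, 6)  # minimum Python version stated in this guide


def python_ok(version_info=sys.version_info) -> bool:
    """True if the interpreter is at least Python 3.8.6."""
    return tuple(version_info[:3]) >= MIN_PYTHON


def sdk_matches_instance(sdk_version: str, instance_version: str) -> bool:
    """Assumed rule: the year and release components must agree."""
    return sdk_version.split(".")[:2] == instance_version.split(".")[:2]


try:
    installed_sdk = metadata.version("wallaroo")
except metadata.PackageNotFoundError:
    installed_sdk = None  # the SDK is not installed in this environment

print("Python version OK:", python_ok())
print("Installed Wallaroo SDK:", installed_sdk)
if installed_sdk is not None:
    print("Matches a 2023.1 instance:", sdk_matches_instance(installed_sdk, "2023.1"))
```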
    

For organizations that will use the Wallaroo SDK with Jupyter or similar services: once the conda virtual environment has been installed, it can either be selected as the kernel for a new Jupyter Notebook, or set as the kernel of an existing notebook.

To use a new Notebook:

  1. From the main menu, select File->New->Notebook.
  2. From the Kernel selection dropdown, select the new virtual environment - in this case, wallaroosdk.

To update an existing Notebook to use the new virtual environment as a kernel:

  1. From the main menu, select Kernel->Change Kernel.
  2. Select the new kernel.

Sample Wallaroo Connection

With the Wallaroo Python SDK installed, remote commands and inferences can be performed through the following steps.

Open a Connection to Wallaroo

The first step is to connect to Wallaroo through the Wallaroo client.

This is accomplished using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type) command that connects to the Wallaroo instance services.

The Client method takes the following parameters:

  • api_endpoint (String): The URL to the Wallaroo instance API service.
  • auth_endpoint (String): The URL to the Wallaroo instance Keycloak service.
  • auth_type (String): The authorization type. In this case, sso.

The URLs are based on the Wallaroo Prefix and Wallaroo Suffix for the Wallaroo instance. For more information, see the DNS Integration Guide. In the example below, replace “YOUR PREFIX” and “YOUR SUFFIX” with the Wallaroo Prefix and Suffix, respectively.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Depending on the configuration of the Wallaroo instance, the user will either be presented with a login request to the Wallaroo instance or be authenticated through a broker such as Google, Github, etc. To use the broker, select it from the list under the username/password login forms. For more information on Wallaroo authentication configurations, see the Wallaroo Authentication Configuration Guides.

Once authenticated, the user confirms adding the device from which the connection is being established. Once both steps are complete, the connection is granted.

The connection is stored in the variable wl for use in all other Wallaroo calls.

import wallaroo
from wallaroo.object import EntityNotFoundError

# to display dataframe tables
from IPython.display import display
# used to display dataframe information without truncating
import pandas as pd
pd.set_option('display.max_colwidth', None)
import pyarrow as pa
wallaroo.__version__
'2023.2.0rc3'

Connect to Wallaroo

For this example, a connection through the Wallaroo SDK is used. For more information, see the Wallaroo SDK Essentials Guide: Client Connection.

For wallarooPrefix = "YOUR PREFIX." and wallarooSuffix = "YOUR SUFFIX", enter the prefix and suffix for your Wallaroo instance DNS name. If the prefix is blank, then it can be wallarooPrefix = "". Note that the prefix includes the . for proper formatting. For example, if the prefix is empty and the suffix is wallaroo.example.com, then the settings would be:

wallarooPrefix = ""
wallarooSuffix = "wallaroo.example.com"

If the prefix is sales. and the suffix is wallaroo.example.com, then the settings would be:

wallarooPrefix = "sales."
wallarooSuffix = "wallaroo.example.com"

# SSO login through keycloak
# The prefix must include its trailing "." (or be "" if there is no prefix)
wallarooPrefix = "YOUR PREFIX."
wallarooSuffix = "YOUR SUFFIX"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")
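
Because a missing trailing . in the prefix produces a malformed URL, it can help to build both endpoints in one place. The following is a minimal sketch that follows the URL pattern used in the connection example above. The helper name wallaroo_endpoints is hypothetical and is not part of the Wallaroo SDK.

```python
from typing import Dict


def wallaroo_endpoints(prefix: str, suffix: str) -> Dict[str, str]:
    """Build the API and Keycloak URLs from the instance prefix and suffix."""
    # Accept the prefix with or without its trailing "."; an empty prefix stays empty.
    if prefix and not prefix.endswith("."):
        prefix += "."
    return {
        "api_endpoint": f"https://{prefix}api.{suffix}",
        "auth_endpoint": f"https://{prefix}keycloak.{suffix}",
    }


print(wallaroo_endpoints("", "wallaroo.example.com"))
print(wallaroo_endpoints("sales.", "wallaroo.example.com"))
```

The returned dictionary uses the same keyword names as the Client call shown above, so it could be passed as wallaroo.Client(**wallaroo_endpoints(prefix, suffix), auth_type="sso").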

Wallaroo Remote SDK Examples

The following examples can be used by an organization to test the Wallaroo SDK from a location that is remote from their Wallaroo instance. These examples show how to create workspaces, deploy pipelines, and perform inferences through the SDK and API.

Create the Workspace

We will create a workspace to work in and call it sdkquickworkspace, then set it as the current workspace environment. We’ll also create our pipeline in advance as sdkquickpipeline.

  • IMPORTANT NOTE: For this example, the Aloha model is stored in the file alohacnnlstm.zip. When using tensor based models, the zip file must match the name of the tensor directory. For example, if the tensor directory is alohacnnlstm, then the .zip file must be named alohacnnlstm.zip.

workspace_name = 'sdkquickworkspace'
pipeline_name = 'sdkquickpipeline'
model_name = 'sdkquickmodel'
model_file_name = './alohacnnlstm.zip'

def get_workspace(name):
    # Return the workspace with this name, creating it if it does not exist
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    # Return the first pipeline with this name, building it if none is found
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)
pipeline

name            sdkquickpipeline
created         2023-05-17 20:43:38.111213+00:00
last_updated    2023-05-17 20:43:46.036128+00:00
deployed        False
tags
versions        2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps           sdkquickmodel

We can verify that the workspace was created and is the current default workspace with the get_current_workspace() command.

wl.get_current_workspace()
{'name': 'sdkquickworkspace', 'id': 6, 'archived': False, 'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da', 'created_at': '2023-05-17T20:43:36.727099+00:00', 'models': [{'name': 'sdkquickmodel', 'versions': 1, 'owner_id': '""', 'last_update_time': datetime.datetime(2023, 5, 17, 20, 43, 42, 734108, tzinfo=tzutc()), 'created_at': datetime.datetime(2023, 5, 17, 20, 43, 42, 734108, tzinfo=tzutc())}], 'pipelines': [{'name': 'sdkquickpipeline', 'create_time': datetime.datetime(2023, 5, 17, 20, 43, 38, 111213, tzinfo=tzutc()), 'definition': '[]'}]}

Upload the Models

Now we will upload our model. Note that for this example we are uploading the model from a .zip file. The Aloha model is a protobuf file that has been defined for evaluating web pages, and we will configure it to use data in the tensorflow format.

from wallaroo.framework import Framework
model = wl.upload_model(model_name, model_file_name, framework=Framework.TENSORFLOW).configure("tensorflow")

Deploy a Model

Now that we have a model that we want to use we will create a deployment for it.

We will tell the deployment we are using a tensorflow model and give the deployment name and the configuration we want for the deployment.

To do this, we’ll add the Aloha model as a step in our pipeline sdkquickpipeline, which ingests the data, passes it to the model, and gives us a final output. We’ll then deploy the pipeline so it’s ready to receive data. The deployment process usually takes about 45 seconds.

pipeline.add_model_step(model)

name            sdkquickpipeline
created         2023-05-17 20:43:38.111213+00:00
last_updated    2023-05-17 20:43:46.036128+00:00
deployed        False
tags
versions        2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps           sdkquickmodel

pipeline.deploy()

name            sdkquickpipeline
created         2023-05-17 20:43:38.111213+00:00
last_updated    2023-05-17 20:45:46.768412+00:00
deployed        True
tags
versions        bf7c2146-ed14-430b-bf96-1e8b1047eb2e, 2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps           sdkquickmodel

We can verify that the pipeline is running and list what models are associated with it.

pipeline.status()
{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.3.135',
   'name': 'engine-5948b767f7-2fthq',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'sdkquickpipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'sdkquickmodel',
      'version': '523e22a0-958b-48e5-9ae1-6f7477502e62',
      'sha': 'd71d9ffc61aaac58c2b1ed70a2db13d1416fb9d3f5b891e5e4e2e97180fe22f8',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.4.165',
   'name': 'engine-lb-584f54c899-dxl4j',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}
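
Since deployment takes some time, a script may want to wait until the status shown above reports Running before sending inferences. The following is a minimal polling sketch. It assumes only that the pipeline object has a status() method returning a dictionary with a 'status' key, as in the output above. The helper name wait_until_running and its timeout defaults are hypothetical.

```python
import time


def wait_until_running(pipeline, timeout_s: float = 120.0, interval_s: float = 5.0) -> bool:
    """Poll pipeline.status() until it reports 'Running' or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if pipeline.status().get("status") == "Running":
            return True
        if time.monotonic() >= deadline:
            return False  # gave up; the caller decides how to handle this
        time.sleep(interval_s)


# Usage with the pipeline deployed above (not executed here):
# if not wait_until_running(pipeline):
#     raise RuntimeError("pipeline did not reach the Running state in time")
```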

Inferences

Infer 1 row

Now that the pipeline is deployed and our Aloha model is in place, we’ll perform a smoke test to verify the pipeline is up and running properly. We’ll use the infer command to submit a single encoded URL to the inference engine and print the results back out.

The result should tell us that the tokenized URL is legitimate (0) or fraud (1). This sample data should return close to 1.

## Demonstrate via straight infer

# A 50-element encoded URL: 40 zeros of padding followed by 10 tokens
smoke_test = pd.DataFrame.from_records(
    [
        {
            "text_input": [0] * 40 + [28, 16, 32, 23, 29, 32, 30, 19, 26, 17]
        }
    ]
)

result = pipeline.infer(smoke_test)
display(result.loc[:, ["time","out.main"]])

   time                 out.main
0  2023-05-17 20:46:00  [0.997564]
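
The smoke test input is a 50-element list padded on the left with zeros, and the out.main score is read as legitimate (0) or fraud (1). The following is a minimal sketch of both ideas. The helper names pad_tokens and label_for are hypothetical, the length of 50 is taken from the sample input above, and the 0.5 threshold is an assumption rather than a value from the model's documentation.

```python
from typing import List, Sequence


def pad_tokens(tokens: Sequence[int], length: int = 50) -> List[int]:
    """Left-pad a token list with zeros to the given length."""
    if len(tokens) > length:
        raise ValueError(f"expected at most {length} tokens, got {len(tokens)}")
    return [0] * (length - len(tokens)) + list(tokens)


def label_for(score: float, threshold: float = 0.5) -> str:
    """Map a score to a label; the 0.5 threshold is an assumption."""
    return "fraud" if score >= threshold else "legitimate"


text_input = pad_tokens([28, 16, 32, 23, 29, 32, 30, 19, 26, 17])
print(len(text_input))
print(label_for(0.997564))
```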

Infer 1,000 Rows

We can also infer an entire batch as one request either with the Pipeline infer method with multiple rows, or loaded from a file using the Pipeline infer_from_file method. For this example, we will run a batch on 1,000 records using the file data_1k.arrow. This is an Apache Arrow table, which gives the added benefit of speed and lower file size as a binary file rather than a text JSON file.

We’ll infer the 1,000 records, then convert it to a DataFrame and display the first 5 to save space in our Jupyter Notebook.

result = pipeline.infer_from_file('./data/data_1k.arrow')

outputs = result.to_pandas()
display(outputs.head(5).loc[:, ["time","out.main"]])

   time                     out.main
0  2023-05-17 20:46:01.374  [0.997564]
1  2023-05-17 20:46:01.374  [0.9885122]
2  2023-05-17 20:46:01.374  [0.9993358]
3  2023-05-17 20:46:01.374  [0.99999857]
4  2023-05-17 20:46:01.374  [0.9984837]

Batch Inference

Now that our smoke test is successful, let’s really give it some data. We have two inference files we can use:

  • data_1k.arrow: Contains 1,000 inferences
  • data_25k.arrow: Contains 25,000 inferences

These inference inputs are Apache Arrow tables, which Wallaroo can ingest natively. These are binary files, and are faster to transmit because of their smaller size compared to JSON.

We’ll pipe the data_25k.arrow file through the pipeline deployment URL, and place the results in a file named curl_response.df. Note that larger batches of 1,000 inferences or more can be difficult to view in JupyterHub because of their size, so we’ll only display the first 5 results of the inference.

When retrieving the pipeline inference URL through an external SDK connection, the External Inference URL will be returned. This URL will function provided that the Enable external URL inference endpoints option is enabled. For more information, see the Wallaroo Model Endpoints Guide.

pipeline.deploy()

name            sdkquickpipeline
created         2023-05-17 20:43:38.111213+00:00
last_updated    2023-05-17 20:46:02.557913+00:00
deployed        True
tags
versions        961c909d-f5ae-472a-b8ae-1e6a00fbc36e, bf7c2146-ed14-430b-bf96-1e8b1047eb2e, 2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps           sdkquickmodel

inference_url = pipeline._deployment._url()
inference_url

The API connection details can be retrieved through the Wallaroo client mlops() command. This will display the connection URL, bearer token, and other information. The bearer token is available for one hour before it expires.

For this example, the API connection details will be retrieved, then used to submit an inference request through the external inference URL retrieved earlier.

connection = wl.mlops().__dict__
token = connection['token']
token
'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJDYkFqN19QY0xCWTFkWmJiUDZ6Q3BsbkNBYTd6US0tRHlyNy0yLXlQb25nIn0.eyJleHAiOjE2ODQzNTYzOTYsImlhdCI6MTY4NDM1NjMzNiwiYXV0aF90aW1lIjoxNjg0MzU1OTU5LCJqdGkiOiI4ZDBlMGUyNi05MTM3LTQ4MzAtYmFmMC1hNzRjZDQ3Yzk2ZWUiLCJpc3MiOiJodHRwczovL2RvYy10ZXN0LmtleWNsb2FrLndhbGxhcm9vY29tbXVuaXR5Lm5pbmphL2F1dGgvcmVhbG1zL21hc3RlciIsImF1ZCI6WyJtYXN0ZXItcmVhbG0iLCJhY2NvdW50Il0sInN1YiI6IjAyOGM4YjQ4LWMzOWItNDU3OC05MTEwLTBiNWJkZDM4MjRkYSIsInR5cCI6IkJlYXJlciIsImF6cCI6InNkay1jbGllbnQiLCJzZXNzaW9uX3N0YXRlIjoiMGJlODJjN2ItNzg1My00ZjVkLWJiNWEtOTlkYjUwYjhiNDVmIiwiYWNyIjoiMCIsInJlYWxtX2FjY2VzcyI6eyJyb2xlcyI6WyJkZWZhdWx0LXJvbGVzLW1hc3RlciIsIm9mZmxpbmVfYWNjZXNzIiwidW1hX2F1dGhvcml6YXRpb24iXX0sInJlc291cmNlX2FjY2VzcyI6eyJtYXN0ZXItcmVhbG0iOnsicm9sZXMiOlsibWFuYWdlLXVzZXJzIiwidmlldy11c2VycyIsInF1ZXJ5LWdyb3VwcyIsInF1ZXJ5LXVzZXJzIl19LCJhY2NvdW50Ijp7InJvbGVzIjpbIm1hbmFnZS1hY2NvdW50IiwibWFuYWdlLWFjY291bnQtbGlua3MiLCJ2aWV3LXByb2ZpbGUiXX19LCJzY29wZSI6InByb2ZpbGUgZW1haWwiLCJzaWQiOiIwYmU4MmM3Yi03ODUzLTRmNWQtYmI1YS05OWRiNTBiOGI0NWYiLCJlbWFpbF92ZXJpZmllZCI6ZmFsc2UsImh0dHBzOi8vaGFzdXJhLmlvL2p3dC9jbGFpbXMiOnsieC1oYXN1cmEtdXNlci1pZCI6IjAyOGM4YjQ4LWMzOWItNDU3OC05MTEwLTBiNWJkZDM4MjRkYSIsIngtaGFzdXJhLWRlZmF1bHQtcm9sZSI6InVzZXIiLCJ4LWhhc3VyYS1hbGxvd2VkLXJvbGVzIjpbInVzZXIiXSwieC1oYXN1cmEtdXNlci1ncm91cHMiOiJ7fSJ9LCJuYW1lIjoiSm9obiBIYW5zYXJpY2siLCJwcmVmZXJyZWRfdXNlcm5hbWUiOiJqb2huLmh1bW1lbEB3YWxsYXJvby5haSIsImdpdmVuX25hbWUiOiJKb2huIiwiZmFtaWx5X25hbWUiOiJIYW5zYXJpY2siLCJlbWFpbCI6ImpvaG4uaHVtbWVsQHdhbGxhcm9vLmFpIn0.JvN26UNUomn66qLH7LcwMicUJVbDxl-bfukv7dShUbBg9kFwe4RkKhXRxAnoU6J5GgGp9c4jg8kt29frBrPeBNY9dqOA_2XY9CxfuJAMGv-kDg4VdY6P-UQw32Yj-oEq9Lo5BYeXEOMR2DgkRrXjbXYT5BNBNXChQQRuYyWvn9q2Gyob_ppYS6FOuqCKydZkrJ_VciMhp8V6GMaDVAY1b7i_lVKIWzv6IjEqDOvGFAqi7JQlEhhhqtv1nK-mtMrWNp8kVAo9Tks3Zv-YwOl8sRntEhlx1Pu_O9ml6FelaEc7HxuwTEVQmgJsEjsgNCSVDsBgWSTPSUJYY4-gFmtwsA'
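
Because the bearer token expires, it can be useful to check how long it has left before sending a long request. The following is a minimal sketch that assumes the token is a standard JWT whose payload carries an exp claim. It only decodes the payload and does not verify the signature, so it must not be used for authentication decisions. The helper names are hypothetical.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Seconds until the token's exp claim; negative if already expired."""
    now = time.time() if now is None else now
    return jwt_claims(token)["exp"] - now


# Usage with the token retrieved above (not executed here):
# if seconds_remaining(token) < 30:
#     token = wl.mlops().__dict__['token']
```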
dataFile="./data/data_25k.arrow"
contentType="application/vnd.apache.arrow.file"
!curl -X POST {inference_url} -H "Authorization: Bearer {token}" -H "Content-Type:{contentType}" --data-binary @{dataFile} > curl_response.df
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 25.6M  100 20.8M  100 4874k  1123k   256k  0:00:19  0:00:19 --:--:-- 2278k
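
The same request can be prepared from Python instead of curl. The following is a minimal sketch that uses the standard library urllib.request module, with the same headers and content type as the curl command above. The helper name build_inference_request is hypothetical. The network call itself is shown only as a comment because it needs a live Wallaroo instance.

```python
import urllib.request

ARROW_CONTENT_TYPE = "application/vnd.apache.arrow.file"


def build_inference_request(url: str, token: str, payload: bytes) -> urllib.request.Request:
    """Build a POST request that mirrors the curl command above."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": ARROW_CONTENT_TYPE,
        },
        method="POST",
    )


# Usage with the values defined above (not executed here):
# with open("./data/data_25k.arrow", "rb") as f:
#     request = build_inference_request(inference_url, token, f.read())
# with urllib.request.urlopen(request) as response, open("curl_response.df", "wb") as out:
#     out.write(response.read())
```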
cc_data_from_file = pd.read_json('./curl_response.df', orient="records")
display(cc_data_from_file.head(5).loc[:, ["time","out"]])

   time           out
0  1684356368857  {'banjori': [0.0015195821], 'corebot': [0.9829147500000001], 'cryptolocker': [0.012099549000000001], 'dircrypt': [4.7591115e-05], 'gozi': [2.0289428e-05], 'kraken': [0.00031977256999999996], 'locky': [0.011029262000000001], 'main': [0.997564], 'matsnu': [0.010341609], 'pykspa': [0.008038961], 'qakbot': [0.016155055], 'ramdo': [0.00623623], 'ramnit': [0.0009985747000000001], 'simda': [1.7933434e-26], 'suppobox': [1.388995e-27]}
1  1684356368857  {'banjori': [7.447196e-18], 'corebot': [6.7359245e-08], 'cryptolocker': [0.1708199], 'dircrypt': [1.3220122000000002e-09], 'gozi': [1.2758705999999999e-24], 'kraken': [0.22559543], 'locky': [0.34209849999999997], 'main': [0.99999994], 'matsnu': [0.3080186], 'pykspa': [0.1828217], 'qakbot': [3.8022549999999994e-11], 'ramdo': [0.2062254], 'ramnit': [0.15215826], 'simda': [1.1701982e-30], 'suppobox': [3.1514454e-38]}
2  1684356368857  {'banjori': [2.8598648999999997e-21], 'corebot': [9.302004000000001e-08], 'cryptolocker': [0.04445298], 'dircrypt': [6.1637580000000004e-09], 'gozi': [8.3496755e-23], 'kraken': [0.48234479999999996], 'locky': [0.26332903], 'main': [1.0], 'matsnu': [0.29800338], 'pykspa': [0.22361776], 'qakbot': [1.5238920999999999e-06], 'ramdo': [0.32820392], 'ramnit': [0.029332489000000003], 'simda': [1.1995622e-31], 'suppobox': [0.0]}
3  1684356368857  {'banjori': [2.1387213e-15], 'corebot': [3.8817485e-10], 'cryptolocker': [0.045599736], 'dircrypt': [1.9090386e-07], 'gozi': [1.3140123e-25], 'kraken': [0.59542626], 'locky': [0.17374137], 'main': [0.9999996999999999], 'matsnu': [0.23151578], 'pykspa': [0.17591679999999998], 'qakbot': [1.0876152e-09], 'ramdo': [0.21832279999999998], 'ramnit': [0.0128692705], 'simda': [6.1588803e-28], 'suppobox': [1.4386237e-35]}
4  1684356368857  {'banjori': [9.453342500000001e-15], 'corebot': [7.091151e-10], 'cryptolocker': [0.049815163], 'dircrypt': [5.2914135e-09], 'gozi': [7.4132087e-19], 'kraken': [1.5504574999999998e-13], 'locky': [1.079181e-15], 'main': [0.9999988999999999], 'matsnu': [1.5003075e-15], 'pykspa': [0.33075705], 'qakbot': [2.6258850000000004e-07], 'ramdo': [0.5036279], 'ramnit': [0.020393765], 'simda': [0.0], 'suppobox': [2.3292326e-38]}
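
In the response above, time is an integer and each out value is a one-element list keyed by output name. The following is a minimal sketch that pulls out the main score and converts the time, assuming the time field is milliseconds since the Unix epoch (which is consistent with the timestamps shown earlier in this tutorial). The helper name summarize is hypothetical, and the sample record is trimmed to a single output field.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def summarize(records: List[dict]) -> List[Tuple[datetime, float]]:
    """Return (UTC timestamp, main score) pairs from inference records."""
    rows = []
    for record in records:
        # Assumption: "time" is milliseconds since the Unix epoch.
        when = EPOCH + timedelta(milliseconds=record["time"])
        rows.append((when, record["out"]["main"][0]))
    return rows


# One record in the same shape as the response above, trimmed to the "main" output.
sample = json.loads('[{"time": 1684356368857, "out": {"main": [0.997564]}}]')
for when, score in summarize(sample):
    print(when.isoformat(), score)
```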

Undeploy Pipeline

When finished with our tests, we will undeploy the pipeline so we have the Kubernetes resources back for other tasks. Note that if the deployment variable is unchanged, pipeline.deploy() will restart the inference engine in the same configuration as before.

pipeline.undeploy()

name            sdkquickpipeline
created         2023-05-17 20:43:38.111213+00:00
last_updated    2023-05-17 20:46:02.557913+00:00
deployed        False
tags
versions        961c909d-f5ae-472a-b8ae-1e6a00fbc36e, bf7c2146-ed14-430b-bf96-1e8b1047eb2e, 2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps           sdkquickmodel
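
To make sure resources are returned even when an inference fails partway through, the deploy and undeploy calls can be paired in a context manager. The following is a minimal sketch that assumes only that the pipeline object has the deploy() and undeploy() methods used in this tutorial. The helper name deployed is hypothetical and is not part of the Wallaroo SDK.

```python
from contextlib import contextmanager


@contextmanager
def deployed(pipeline):
    """Deploy the pipeline for the duration of a with-block, then always undeploy."""
    pipeline.deploy()
    try:
        yield pipeline
    finally:
        # Runs on normal exit and on exceptions alike.
        pipeline.undeploy()


# Usage with the pipeline from this tutorial (not executed here):
# with deployed(pipeline) as p:
#     result = p.infer(smoke_test)
```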

1.2 - Wallaroo SDK AzureML Install Guide

How to install the Wallaroo SDK in AzureML

This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.

Installing the Wallaroo SDK into Azure ML Workspace

Organizations that use Azure ML for model training and development can deploy models to Wallaroo through the Wallaroo SDK. The following guide is created to assist users with installing the Wallaroo SDK, setting up authentication through Azure ML, and making a standard connection to a Wallaroo instance through Azure ML Workspace.

These instructions are based on the Wallaroo SSO for Microsoft Azure and the Connect to Wallaroo guides.

This tutorial provides the following:

  • aloha-cnn-lstm.zip: A pre-trained open source model that uses an Aloha CNN LSTM model for classifying Domain names as being either legitimate or being used for nefarious purposes such as malware distribution.
  • Test Data Files:
    • data_1k.arrow: 1,000 records
    • data_25k.arrow: 25,000 records

To use the Wallaroo SDK within Azure ML Workspace, a virtual environment will be used. This will set the necessary libraries and specific Python version required.

Prerequisites

The following is required for this tutorial:

  • A Wallaroo instance version 2023.1 or later.
  • Python 3.8.6 or later installed locally
  • Conda: Used for managing python virtual environments. This is automatically included in Azure ML Workspace.
  • An Azure ML workspace with a compute instance configured.
  • The following Python libraries installed:
    • os: Part of the Python standard library; no separate installation is needed.
    • wallaroo: The Wallaroo SDK. Included with the Wallaroo JupyterHub service by default.
    • pandas: Pandas, mainly used for Pandas DataFrame
    • pyarrow: PyArrow for Apache Arrow support
    • polars: Polars for DataFrame with native Apache Arrow support

General Steps

For our example, we will perform the following:

  • Wallaroo SDK Install
    • Set up a Python virtual environment through conda with the libraries that enable the virtual environment for use in a JupyterHub environment.
    • Install the Wallaroo SDK.
  • Wallaroo SDK from remote JupyterHub Demonstration (Optional): The following steps are an optional exercise to demonstrate using the Wallaroo SDK from a remote connection. The entire tutorial can be found on the Wallaroo Tutorials repository.
    • Connect to a remote Wallaroo instance.
    • Create a workspace for our work.
    • Upload the Aloha model.
    • Create a pipeline that can ingest our submitted data, submit it to the model, and export the results.
    • Run a sample inference through our pipeline by loading a file.
    • Retrieve the external deployment URL. This sample Wallaroo instance has been configured to create external inference URLs for pipelines. For more information, see the External Inference URL Guide.
    • Run a sample inference through our pipeline’s external URL and store the results in a file. This assumes that the External Inference URLs have been enabled for the target Wallaroo instance.
    • Undeploy the pipeline and return resources back to the Wallaroo instance’s Kubernetes environment.

Install Wallaroo SDK

Set Up Virtual Python Environment

To set up the virtual environment in Azure ML for using the Wallaroo SDK with Azure ML Workspace:

  1. Select Notebooks.

  2. Create a new folder where the Jupyter Notebooks for Wallaroo will be installed.

  3. From this repository, upload sdk-install-guides/azure-ml-sdk-install.zip, or upload the entire folder sdk-install-guides/azure-ml-sdk-install. This tutorial will assume the .zip file was uploaded.

  4. Select Open Terminal. Navigate to the target directory.

  5. Run unzip azure-ml-sdk-install.zip to unzip the directory, then cd into it with cd azure-ml-sdk-install.

  6. Create the Python virtual environment with conda. Replace wallaroosdk with the name of the virtual environment as required by your organization. Note that Python 3.8.6 and above is specified as a requirement for Python libraries used with the Wallaroo SDK. The following will install the latest version of Python 3.8, which as of this time is 3.8.15.

    conda create -n wallaroosdk python=3.8
    
  7. Activate the new environment.

    conda activate wallaroosdk
    
  8. Install the ipykernel library. This allows the JupyterHub notebooks to access the Python virtual environment as a kernel.

    conda install ipykernel
    
  9. Install the new virtual environment as a python kernel.

    ipython kernel install --user --name=wallaroosdk
    
  10. Install the Wallaroo SDK. This process may take several minutes while the other required Python libraries are added to the virtual environment.

    • IMPORTANT NOTE: The version of the Wallaroo SDK should match the version of the Wallaroo instance. For example, a Wallaroo Enterprise version 2023.1 instance would use wallaroo==2023.1.0. The command below installs 2023.2.1; adjust the version to match your instance.
    pip install wallaroo==2023.2.1
    

Once the conda virtual environment has been installed, it can be selected as the kernel for either a new or an existing Jupyter Notebook. If an existing notebook is already open, close and reopen it to select the new Wallaroo SDK environment.

To use a new Notebook:

  1. From the left navigation panel, select +->Notebook.
  2. From the Kernel selection dropdown on the upper right side, select the new virtual environment - in this case, wallaroosdk.

To update an existing Notebook to use the new virtual environment as a kernel:

  1. From the main menu, select Kernel->Change Kernel.
  2. Select the new kernel.
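
After switching kernels, it can be worth confirming that the notebook is running the interpreter from the new environment. The following is a minimal sketch using only the standard library. The helper name meets_minimum is illustrative and not part of the Wallaroo SDK; the (3, 8, 6) floor is taken from the Python requirement stated above.

```python
import sys

# Minimum Python version stated in this guide for the Wallaroo SDK.
MINIMUM_PYTHON = (3, 8, 6)

def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor, micro) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:3]) >= tuple(minimum)

# Path of the interpreter backing the current kernel, and whether it qualifies.
print(sys.executable)
print(meets_minimum())
```

If the printed path does not point into the wallaroosdk environment, the notebook is most likely still attached to a different kernel.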

Sample Wallaroo Connection

With the Wallaroo Python SDK installed, remote commands and inferences can be performed through the following steps.

Open a Connection to Wallaroo

The first step is to connect to Wallaroo through the Wallaroo client.

This is accomplished using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type) command that connects to the Wallaroo instance services.

The Client method takes the following parameters:

  • api_endpoint (String): The URL to the Wallaroo instance API service.
  • auth_endpoint (String): The URL to the Wallaroo instance Keycloak service.
  • auth_type (String): The authorization type. In this case, SSO.

The URLs are based on the Wallaroo Prefix and Wallaroo Suffix for the Wallaroo instance. For more information, see the DNS Integration Guide. In the example below, replace “YOUR PREFIX” and “YOUR SUFFIX” with the Wallaroo Prefix and Suffix, respectively.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Depending on the configuration of the Wallaroo instance, the user will either be presented with a login request to the Wallaroo instance or be authenticated through a broker such as Google, Github, etc. To use the broker, select it from the list under the username/password login forms. For more information on Wallaroo authentication configurations, see the Wallaroo Authentication Configuration Guides.

Once authenticated, the user is asked to confirm adding the device from which the connection is being established. Once both steps are complete, the connection is granted.

The connection is stored in the variable wl for use in all other Wallaroo calls.

import wallaroo
from wallaroo.object import EntityNotFoundError

# to display dataframe tables
from IPython.display import display
# used to display dataframe information without truncating
import pandas as pd
pd.set_option('display.max_colwidth', None)
import pyarrow as pa

Connect to Wallaroo

For this example, a connection through the Wallaroo SDK is used. For more information, see the Wallaroo SDK Essentials Guide: Client Connection.

For wallarooPrefix = "YOUR PREFIX." and wallarooSuffix = "YOUR SUFFIX", enter the prefix and suffix for your Wallaroo instance DNS name. If the instance has no prefix, set wallarooPrefix = "". Note that a non-empty prefix includes the trailing . for proper formatting. For example, if the prefix is empty and the suffix is wallaroo.example.com, then the settings would be:

wallarooPrefix = ""
wallarooSuffix = "wallaroo.example.com"

If the prefix is sales. and the suffix is example.com, then the settings would be:

wallarooPrefix = "sales."
wallarooSuffix = "example.com"

# SSO login through keycloak

wallarooPrefix = "YOUR PREFIX."
wallarooSuffix = "YOUR SUFFIX"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")
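
Because a missing trailing dot on the prefix silently produces a wrong hostname, it may help to build the two endpoint URLs in one place. The following is a sketch only: build_endpoints is a hypothetical helper, not part of the Wallaroo SDK, and it assumes the api. and keycloak. hostnames follow the pattern used in the wallaroo.Client call above.

```python
def build_endpoints(prefix, suffix):
    """Return (api_endpoint, auth_endpoint) for a Wallaroo DNS prefix and suffix."""
    # Normalize the prefix so that a non-empty value ends with a dot.
    prefix = prefix.strip()
    if prefix and not prefix.endswith("."):
        prefix += "."
    api_endpoint = f"https://{prefix}api.{suffix}"
    auth_endpoint = f"https://{prefix}keycloak.{suffix}"
    return api_endpoint, auth_endpoint

api_endpoint, auth_endpoint = build_endpoints("sales", "example.com")
print(api_endpoint)
print(auth_endpoint)
```

The two returned values can then be passed as the api_endpoint and auth_endpoint arguments of wallaroo.Client.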

Create the Workspace

We will create a workspace to work in and call it the azuremlsdkworkspace, then set it as current workspace environment. We’ll also create our pipeline in advance as azuremlsdkpipeline.

  • IMPORTANT NOTE: For this example, the Aloha model is stored in the file alohacnnlstm.zip. When using tensor based models, the zip file must match the name of the tensor directory. For example, if the tensor directory is alohacnnlstm, then the .zip file must be named alohacnnlstm.zip.

workspace_name = 'azuremlsdkworkspace'
pipeline_name = 'azuremlsdkpipeline'
model_name = 'azuremlsdkmodel'
model_file_name = './alohacnnlstm.zip'

def get_workspace(name):
    """Return the workspace with the given name, creating it if it does not exist."""
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    """Return the first pipeline with the given name, building it if none is found."""
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)
pipeline

name          azuremlsdkpipeline
created       2023-05-17 21:01:46.655024+00:00
last_updated  2023-05-17 21:01:46.655024+00:00
deployed      (none)
tags
versions      e011272d-c22c-4b2d-ab9f-b17c60099434
steps

We can verify that the workspace was created and is the current default workspace with the get_current_workspace() command.

wl.get_current_workspace()
{'name': 'azuremlsdkworkspace', 'id': 8, 'archived': False, 'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da', 'created_at': '2023-05-17T21:01:45.647036+00:00', 'models': [], 'pipelines': [{'name': 'azuremlsdkpipeline', 'create_time': datetime.datetime(2023, 5, 17, 21, 1, 46, 655024, tzinfo=tzutc()), 'definition': '[]'}]}

Upload the Models

Now we will upload our model. Note that for this example we are applying the model from a .ZIP file. The Aloha model is a protobuf file that has been defined for evaluating web pages, and we will configure it to use data in the tensorflow format.

from wallaroo.framework import Framework
model = wl.upload_model(model_name, model_file_name, framework=Framework.TENSORFLOW).configure("tensorflow")

Deploy a Model

Now that we have a model that we want to use we will create a deployment for it.

We will tell the deployment we are using a tensorflow model and give the deployment name and the configuration we want for the deployment.

To do this, we’ll create our pipeline that can ingest the data, pass the data to our Aloha model, and give us a final output. We’ll call our pipeline azuremlsdkpipeline, then deploy it so it’s ready to receive data. The deployment process usually takes about 45 seconds.

pipeline.add_model_step(model)

name          azuremlsdkpipeline
created       2023-05-17 21:01:46.655024+00:00
last_updated  2023-05-17 21:01:46.655024+00:00
deployed      (none)
tags
versions      e011272d-c22c-4b2d-ab9f-b17c60099434
steps

pipeline.deploy()

name          azuremlsdkpipeline
created       2023-05-17 21:01:46.655024+00:00
last_updated  2023-05-17 21:01:51.594226+00:00
deployed      True
tags
versions      28a7a5aa-5359-4320-842b-bad84258f7e4, e011272d-c22c-4b2d-ab9f-b17c60099434
steps         azuremlsdkmodel

We can verify that the pipeline is running and list what models are associated with it.

pipeline.status()
{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.3.137',
   'name': 'engine-cb79b5d57-b68lv',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'azuremlsdkpipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'azuremlsdkmodel',
      'version': 'fb4331da-9f14-4013-8ffb-4d994ab88eac',
      'sha': 'd71d9ffc61aaac58c2b1ed70a2db13d1416fb9d3f5b891e5e4e2e97180fe22f8',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.4.167',
   'name': 'engine-lb-584f54c899-vzlwz',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}
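
Rather than reading the status output by eye, the same check can be expressed in code. The sketch below assumes pipeline.status() returns a dictionary shaped like the output above; is_fully_running is a hypothetical helper, not an SDK function, and the sample dictionary is an abbreviated copy of that output.

```python
def is_fully_running(status):
    """Return True if the pipeline, its engines, and its load balancers all report 'Running'."""
    if status.get("status") != "Running":
        return False
    components = status.get("engines", []) + status.get("engine_lbs", [])
    return all(component.get("status") == "Running" for component in components)

# Abbreviated copy of the status output shown above.
sample_status = {
    "status": "Running",
    "engines": [{"name": "engine-cb79b5d57-b68lv", "status": "Running"}],
    "engine_lbs": [{"name": "engine-lb-584f54c899-vzlwz", "status": "Running"}],
    "sidekicks": [],
}
print(is_fully_running(sample_status))
```

In a notebook this could be called as is_fully_running(pipeline.status()) before submitting inference requests.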

Inferences

Infer 1 row

Now that the pipeline is deployed and our Aloha model is in place, we’ll perform a smoke test to verify the pipeline is up and running properly. We’ll use the infer command to submit a DataFrame containing a single encoded URL to the inference engine and print the results back out.

The result should tell us whether the tokenized URL is legitimate (0) or fraud (1). This sample data should return close to 1.

## Demonstrate via straight infer

smoke_test = pd.DataFrame.from_records(
    [
    {
        "text_input":[
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            28,
            16,
            32,
            23,
            29,
            32,
            30,
            19,
            26,
            17
        ]
    }
]
)

result = pipeline.infer(smoke_test)
display(result.loc[:, ["time","out.main"]])

                      time    out.main
0  2023-05-17 21:02:11.230  [0.997564]
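
The text_input list in the smoke test is 40 zeros followed by 10 token values, for a total of 50 items. The sketch below shows one way to build the same list from only the non-zero tokens. The left_pad helper is hypothetical, and the fixed length of 50 with left padding is inferred from this single sample; it is an assumption about the model's expected input rather than a documented requirement.

```python
def left_pad(tokens, length=50, pad_value=0):
    """Left-pad `tokens` with `pad_value` so the result has exactly `length` items."""
    if len(tokens) > length:
        raise ValueError(f"expected at most {length} tokens, got {len(tokens)}")
    return [pad_value] * (length - len(tokens)) + list(tokens)

# The ten non-zero token values from the smoke test above.
tokens = [28, 16, 32, 23, 29, 32, 30, 19, 26, 17]
text_input = left_pad(tokens)
print(len(text_input), text_input[-10:])
```

The resulting list can be used as the text_input value in the record passed to pd.DataFrame.from_records.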

Infer 1,000 Rows

We can also infer an entire batch as one request either with the Pipeline infer method with multiple rows, or loaded from a file using the Pipeline infer_from_file method. For this example, we will run a batch on 1,000 records using the file data_1k.arrow. This is an Apache Arrow table, which gives the added benefit of speed and lower file size as a binary file rather than a text JSON file.

We’ll infer the 1,000 records, then convert it to a DataFrame and display the first 5 to save space in our Jupyter Notebook.

result = pipeline.infer_from_file('./data/data_1k.arrow')

outputs = result.to_pandas()
display(outputs.head(5).loc[:, ["time","out.main"]])

                      time      out.main
0  2023-05-17 21:02:12.236    [0.997564]
1  2023-05-17 21:02:12.236   [0.9885122]
2  2023-05-17 21:02:12.236   [0.9993358]
3  2023-05-17 21:02:12.236  [0.99999857]
4  2023-05-17 21:02:12.236   [0.9984837]
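
For a batch this size, a summary is often more useful than the individual rows. The following sketch counts how many out.main scores fall at or above a cutoff. The count_flagged helper and the 0.5 threshold are illustrative assumptions; the guide only states that 0 means legitimate and 1 means fraud.

```python
def count_flagged(scores, threshold=0.5):
    """Count how many scores are at or above `threshold`."""
    return sum(1 for score in scores if score >= threshold)

# out.main values from the five rows displayed above (each cell holds a one-item list).
out_main = [[0.997564], [0.9885122], [0.9993358], [0.99999857], [0.9984837]]
scores = [cell[0] for cell in out_main]
print(count_flagged(scores), "of", len(scores), "rows flagged")
```

With the full result, the same list could be built from the out.main column of the outputs DataFrame.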

Batch Inference

Now that our smoke test is successful, let’s really give it some data. We have two inference files we can use:

  • data_1k.arrow: Contains 1,000 inferences
  • data_25k.arrow: Contains 25,000 inferences

These inference inputs are Apache Arrow tables, which Wallaroo can ingest natively. These are binary files, and are faster to transmit because of their smaller size compared to JSON.

We’ll pipe the data_25k.arrow file through the pipeline deployment URL, and place the results in a file named curl_response.df. Note that batches of 1,000 inferences or more can be difficult to view in JupyterHub because of their size, so we’ll only display the first 5 results of the inference.

When retrieving the pipeline inference URL through an external SDK connection, the External Inference URL will be returned. This URL will function provided that the Enable external URL inference endpoints option is enabled. For more information, see the Wallaroo Model Endpoints Guide.

inference_url = pipeline._deployment._url()
inference_url
connection =wl.mlops().__dict__
token = connection['token']
token
'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJDYkFqN19QY0xCWTFkWmJiUDZ6Q3BsbkNBYTd6US0tRHlyNy0yLXlQb25nIn0.eyJleHAiOjE2ODQzNTczNjQsImlhdCI6MTY4NDM1NzMwNCwiYXV0aF90aW1lIjoxNjg0MzU1OTU5LCJqdGkiOiI0N2NjYzIyOS0yMjEzLTRjYWQtYWI4ZS03ZjNjNTE4MWZhMmUiLCJpc3MiOiJodHRwczovL2RvYy10ZXN0LmtleWNsb2FrLndhbGxhcm9vY29tbXVuaXR5Lm5pbmphL2F1dGgvcmVhbG1zL21hc3RlciIsImF1ZCI6WyJtYXN0ZXItcmVhbG0iLCJhY2NvdW50Il0sInN1YiI6IjAyOGM4YjQ4LWMzOWItNDU3OC05MTEwLTBiNWJkZDM4MjRkYSIsInR5cCI6IkJlYXJlciIsImF6cCI6InNkay1jbGllbnQiLCJzZXNzaW9uX3N0YXRlIjoiMGJlODJjN2ItNzg1My00ZjVkLWJiNWEtOTlkYjUwYjhiNDVmIiwiYWNyIjoiMCIsInJlYWxtX2FjY2VzcyI6eyJyb2xlcyI6WyJkZWZhdWx0LXJvbGVzLW1hc3RlciIsIm9mZmxpbmVfYWNjZXNzIiwidW1hX2F1dGhvcml6YXRpb24iXX0sInJlc291cmNlX2FjY2VzcyI6eyJtYXN0ZXItcmVhbG0iOnsicm9sZXMiOlsibWFuYWdlLXVzZXJzIiwidmlldy11c2VycyIsInF1ZXJ5LWdyb3VwcyIsInF1ZXJ5LXVzZXJzIl19LCJhY2NvdW50Ijp7InJvbGVzIjpbIm1hbmFnZS1hY2NvdW50IiwibWFuYWdlLWFjY291bnQtbGlua3MiLCJ2aWV3LXByb2ZpbGUiXX19LCJzY29wZSI6InByb2ZpbGUgZW1haWwiLCJzaWQiOiIwYmU4MmM3Yi03ODUzLTRmNWQtYmI1YS05OWRiNTBiOGI0NWYiLCJlbWFpbF92ZXJpZmllZCI6ZmFsc2UsImh0dHBzOi8vaGFzdXJhLmlvL2p3dC9jbGFpbXMiOnsieC1oYXN1cmEtdXNlci1pZCI6IjAyOGM4YjQ4LWMzOWItNDU3OC05MTEwLTBiNWJkZDM4MjRkYSIsIngtaGFzdXJhLWRlZmF1bHQtcm9sZSI6InVzZXIiLCJ4LWhhc3VyYS1hbGxvd2VkLXJvbGVzIjpbInVzZXIiXSwieC1oYXN1cmEtdXNlci1ncm91cHMiOiJ7fSJ9LCJuYW1lIjoiSm9obiBIYW5zYXJpY2siLCJwcmVmZXJyZWRfdXNlcm5hbWUiOiJqb2huLmh1bW1lbEB3YWxsYXJvby5haSIsImdpdmVuX25hbWUiOiJKb2huIiwiZmFtaWx5X25hbWUiOiJIYW5zYXJpY2siLCJlbWFpbCI6ImpvaG4uaHVtbWVsQHdhbGxhcm9vLmFpIn0.gT-aE74iiDASRmIkFY3kghHUt3P-fsfWTe3hN-xn3AZMvPzDN1M0p6RCAXx9PbPti7MVizC_Bx0pQSHg6ekJo4mgKpGIUkyP53moSVcZDyT01mmMUsPQDkQtjE5hpYZIdW4_OtDM6WsBHwELyYr_Oezk67JM2mzMBOAG48hztyKZT3xZdZrA8ei-BASjDvnz736C354ivqB_UI1kFOpuF4zKQPkJuedJO6LIbdccpxzA2Vscueeu5wlo2R3h2s2JXDXUZSXibkEihaSJSYOEx5BeGt2JsIx5hb5ZvOo34m-1_Ykg7A7w3-ngySb_yVogQah4v133guviF4sxI8mzPg'

The API connection details can be retrieved through the Wallaroo client mlops() command. This will display the connection URL, bearer token, and other information. The bearer token is available for one hour before it expires.
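
Because the token expires, a long-running notebook may want to check how much time is left before reusing it. The sketch below assumes the bearer token is a standard JWT (three dot-separated base64url segments, as the sample above appears to be) and reads its exp claim without verifying the signature. The helper names are hypothetical, and the demo token is built locally so the example runs without a Wallaroo instance.

```python
import base64
import json
import time

def jwt_expiry(token):
    """Return the `exp` claim (seconds since the epoch) from a JWT, without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with padding stripped; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"]

def seconds_remaining(token, now=None):
    """Seconds until the token expires (negative if it has already expired)."""
    now = time.time() if now is None else now
    return jwt_expiry(token) - now

def _make_demo_token(exp):
    """Build an unsigned, illustrative token for this example only."""
    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([encode({"alg": "none"}), encode({"exp": exp}), "signature"])

demo_token = _make_demo_token(exp=1_700_003_600)
print(seconds_remaining(demo_token, now=1_700_000_000))
```

If the remaining time is close to zero, retrieving the connection details again should provide a fresh token.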

For this example, the API connection details will be retrieved, then used to submit an inference request through the external inference URL retrieved earlier.

dataFile="./data/data_25k.arrow"
contentType="application/vnd.apache.arrow.file"
!curl -X POST {inference_url} -H "Authorization: Bearer {token}" -H "Content-Type:{contentType}" --data-binary @{dataFile} > curl_response.df
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 24.8M  100 20.0M  100 4874k  1926k   457k  0:00:10  0:00:10 --:--:-- 4735k
cc_data_from_file =  pd.read_json('./curl_response.df', orient="records")
display(cc_data_from_file.head(5).loc[:, ["time","out"]])

            time  out
0  1684357334487  {'banjori': [0.0015195821], 'corebot': [0.9829147500000001], 'cryptolocker': [0.012099549000000001], 'dircrypt': [4.7591115e-05], 'gozi': [2.0289428e-05], 'kraken': [0.00031977256999999996], 'locky': [0.011029262000000001], 'main': [0.997564], 'matsnu': [0.010341609], 'pykspa': [0.008038961], 'qakbot': [0.016155055], 'ramdo': [0.00623623], 'ramnit': [0.0009985747000000001], 'simda': [1.7933434e-26], 'suppobox': [1.388995e-27]}
1  1684357334487  {'banjori': [7.447196e-18], 'corebot': [6.7359245e-08], 'cryptolocker': [0.1708199], 'dircrypt': [1.3220122000000002e-09], 'gozi': [1.2758705999999999e-24], 'kraken': [0.22559543], 'locky': [0.34209849999999997], 'main': [0.99999994], 'matsnu': [0.3080186], 'pykspa': [0.1828217], 'qakbot': [3.8022549999999994e-11], 'ramdo': [0.2062254], 'ramnit': [0.15215826], 'simda': [1.1701982e-30], 'suppobox': [3.1514454e-38]}
2  1684357334487  {'banjori': [2.8598648999999997e-21], 'corebot': [9.302004000000001e-08], 'cryptolocker': [0.04445298], 'dircrypt': [6.1637580000000004e-09], 'gozi': [8.3496755e-23], 'kraken': [0.48234479999999996], 'locky': [0.26332903], 'main': [1.0], 'matsnu': [0.29800338], 'pykspa': [0.22361776], 'qakbot': [1.5238920999999999e-06], 'ramdo': [0.32820392], 'ramnit': [0.029332489000000003], 'simda': [1.1995622e-31], 'suppobox': [0.0]}
3  1684357334487  {'banjori': [2.1387213e-15], 'corebot': [3.8817485e-10], 'cryptolocker': [0.045599736], 'dircrypt': [1.9090386e-07], 'gozi': [1.3140123e-25], 'kraken': [0.59542626], 'locky': [0.17374137], 'main': [0.9999996999999999], 'matsnu': [0.23151578], 'pykspa': [0.17591679999999998], 'qakbot': [1.0876152e-09], 'ramdo': [0.21832279999999998], 'ramnit': [0.0128692705], 'simda': [6.1588803e-28], 'suppobox': [1.4386237e-35]}
4  1684357334487  {'banjori': [9.453342500000001e-15], 'corebot': [7.091151e-10], 'cryptolocker': [0.049815163], 'dircrypt': [5.2914135e-09], 'gozi': [7.4132087e-19], 'kraken': [1.5504574999999998e-13], 'locky': [1.079181e-15], 'main': [0.9999988999999999], 'matsnu': [1.5003075e-15], 'pykspa': [0.33075705], 'qakbot': [2.6258850000000004e-07], 'ramdo': [0.5036279], 'ramnit': [0.020393765], 'simda': [0.0], 'suppobox': [2.3292326e-38]}
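
The curl request above can also be expressed with Python's standard library, which avoids shelling out from the notebook. This is a sketch under stated assumptions: build_inference_request is a hypothetical helper, the URL, token, and payload below are placeholders, and the request is only constructed here, not sent.

```python
import urllib.request

def build_inference_request(inference_url, token, data,
                            content_type="application/vnd.apache.arrow.file"):
    """Build (but do not send) a POST request equivalent to the curl call above."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": content_type,
    }
    return urllib.request.Request(inference_url, data=data, headers=headers, method="POST")

# Placeholder values for illustration only; in the notebook these would be
# inference_url, token, and the bytes read from ./data/data_25k.arrow.
request = build_inference_request("https://example.com/infer", "TOKEN", b"arrow-bytes")
print(request.get_method(), request.full_url)

# To send it:
#   with urllib.request.urlopen(request) as response:
#       body = response.read()
```

Keeping construction separate from sending makes it easy to inspect the headers before any data leaves the notebook.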

Undeploy Pipeline

When finished with our tests, we will undeploy the pipeline so we have the Kubernetes resources back for other tasks. Note that if the deployment variable is unchanged, pipeline.deploy() will restart the inference engine in the same configuration as before.

pipeline.undeploy()

name          azuremlsdkpipeline
created       2023-05-17 21:01:46.655024+00:00
last_updated  2023-05-17 21:01:51.594226+00:00
deployed      False
tags
versions      28a7a5aa-5359-4320-842b-bad84258f7e4, e011272d-c22c-4b2d-ab9f-b17c60099434
steps         azuremlsdkmodel

1.3 - Wallaroo SDK Azure Databricks Install Guide

How to install the Wallaroo SDK in Azure Databricks

This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.

Installing the Wallaroo SDK into Workspace

Organizations that use Azure Databricks for model training and development can deploy models to Wallaroo through the Wallaroo SDK. The following guide is created to assist users with installing the Wallaroo SDK, setting up authentication through Azure Databricks, and making a standard connection to a Wallaroo instance through Azure Databricks Workspace.

These instructions are based on the Wallaroo SSO for Microsoft Azure and the Connect to Wallaroo guides.

This tutorial provides the following:

  • ccfraud.onnx: A pretrained model from the Machine Learning Group’s demonstration on Credit Card Fraud detection.
  • Sample inference test data:
    • cc_data_1k.arrow: Sample input file with 1,000 records.
    • cc_data_10k.arrow: Sample input file with 10,000 records.

To use the Wallaroo SDK within Azure Databricks Workspace, a virtual environment will be used. This will set the necessary libraries and specific Python version required.

Prerequisites

The following is required for this tutorial:

  • A Wallaroo instance version 2023.1 or later with External Inference URLs enabled.
  • An Azure Databricks workspace with a cluster.
  • The following Python libraries installed:
    • os
    • wallaroo: The Wallaroo SDK. Included with the Wallaroo JupyterHub service by default.
    • pandas: Pandas, mainly used for Pandas DataFrame
    • pyarrow: PyArrow for Apache Arrow support
    • polars: Polars for DataFrame with native Apache Arrow support

General Steps

For our example, we will perform the following:

  • Wallaroo SDK Install
    • Install the Wallaroo SDK into the Azure Databricks cluster.
    • Install the Wallaroo Python SDK.
    • Connect to a remote Wallaroo instance. This instance is configured to use the standard Keycloak service.
  • Wallaroo SDK from Azure Databricks Workspace (Optional)
    • The following steps are used to demonstrate using the Wallaroo SDK in an Azure Databricks Workspace environment. The entire tutorial can be found on the Wallaroo Tutorials repository.
      • Create a workspace for our work.
      • Upload the CCFraud model.
      • Create a pipeline that can ingest our submitted data, submit it to the model, and export the results.
      • Run a sample inference through our pipeline by loading a file.
      • Undeploy the pipeline and return resources back to the Wallaroo instance’s Kubernetes environment.

Install Wallaroo SDK

Add Wallaroo SDK to Cluster

To install the Wallaroo SDK in an Azure Databricks environment:

  1. From the Azure Databricks dashboard, select Compute, then the cluster to use.
  2. Select Libraries.
  3. Select Install new.
  4. Select PyPI. In the Package field, enter the current version of the Wallaroo SDK. It is recommended to specify the version, which as of this writing is wallaroo==2023.2.0.
  5. Select Install.

Once the Status shows Installed, it will be available in Azure Databricks notebooks and other tools that use the cluster.
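
Once the library shows as installed, a notebook attached to the cluster can confirm which version was picked up. The sketch below uses importlib.metadata from the standard library (available in Python 3.8 and later). Both helper names are hypothetical, and comparing only the first two version components is an assumption about what it means for the SDK version to match the Wallaroo instance, as this guide recommends elsewhere.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def same_release(sdk_version, instance_version):
    """Compare the first two dot-separated components, e.g. '2023.2.0' vs '2023.2'."""
    return sdk_version.split(".")[:2] == instance_version.split(".")[:2]

version = installed_version("wallaroo")
print(version)
if version is not None:
    # "2023.2" is a placeholder for the version of your Wallaroo instance.
    print(same_release(version, "2023.2"))
```

A result of None suggests the notebook is attached to a cluster where the library has not finished installing.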

Add Tutorial Files

The following instructions can be used to upload this tutorial and its files into Databricks. Depending on how your Azure Databricks is configured and your organization’s standards, there are multiple ways of uploading files to your Azure Databricks environment. The following example is used for the tutorial and makes it easy to reference data files from within this Notebook. Adjust based on your requirements.

  • IMPORTANT NOTE: Importing a repo from a Git repository may not convert the included Jupyter Notebooks into the Databricks format. This method
  1. From the Azure Databricks dashboard, select Repos.

  2. Select where to place the repo, then select Add Repo.

  3. Set the following:

    1. Create repo by cloning a Git repository: Uncheck
    2. Repository name: Set any name based on the Databricks standard (no spaces, etc).
    3. Select Create Repo.
  4. Select the new repo, then from the repo menu dropdown, select Import.

  5. Select the files to upload. For this example, the following files are uploaded:

    1. ccfraud.onnx: A pretrained model from the Machine Learning Group’s demonstration on Credit Card Fraud detection.
    2. Sample inference test data:
      1. ccfraud_high_fraud.json: Test input file that returns a high likelihood of credit card fraud.
      2. ccfraud_smoke_test.json: Test input file that returns a low likelihood of credit card fraud.
      3. cc_data_1k.json: Sample input file with 1,000 records.
      4. cc_data_10k.json: Sample input file with 10,000 records.
    3. install-wallaroo-sdk-databricks-azure-guide.ipynb: This notebook.
  6. Select Import.

The Jupyter Notebook can be opened from this new Azure Databricks repository, and relative files it references will be accessible with the exceptions listed below.

Zip files added via the method above are automatically decompressed, so they cannot be used as model files. This affects, for example, tensor based models such as the Wallaroo Aloha Demo. Zip files can instead be uploaded through DBFS.

To upload model files to Azure Databricks using DBFS:

  1. From the Azure Databricks dashboard, select Data.

  2. Select Add->Add data.

  3. Select DBFS.

  4. Select Upload File and enter the following:

    1. DBFS Target Directory (Optional): Set the directory where the files will be uploaded.
  5. Select the files to upload. Note that each file will be given a location and can be accessed with /dbfs/PATH. For example, the file alohacnnlstm.zip uploaded to the directory aloha would be referenced with /dbfs/FileStore/tables/aloha/alohacnnlstm.zip.

Sample Wallaroo Connection

With the Wallaroo Python SDK installed, remote commands and inferences can be performed through the following steps.

Open a Connection to Wallaroo

The first step is to connect to Wallaroo through the Wallaroo client.

This is accomplished using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type) command that connects to the Wallaroo instance services.

The Client method takes the following parameters:

  • api_endpoint (String): The URL to the Wallaroo instance API service.
  • auth_endpoint (String): The URL to the Wallaroo instance Keycloak service.
  • auth_type (String): The authorization type. In this case, SSO.

The URLs are based on the Wallaroo Prefix and Wallaroo Suffix for the Wallaroo instance. For more information, see the DNS Integration Guide. In the example below, replace “YOUR PREFIX” and “YOUR SUFFIX” with the Wallaroo Prefix and Suffix, respectively.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions.

Depending on the configuration of the Wallaroo instance, the user will either be presented with a login request to the Wallaroo instance or be authenticated through a broker such as Google, Github, etc. To use the broker, select it from the list under the username/password login forms. For more information on Wallaroo authentication configurations, see the Wallaroo Authentication Configuration Guides.

Once authenticated, the user is asked to confirm adding the device from which the connection is being established. Once both steps are complete, the connection is granted.

The connection is stored in the variable wl for use in all other Wallaroo calls.

Replace YOUR PREFIX and YOUR SUFFIX with the DNS prefix and suffix for the Wallaroo instance. For more information, see the DNS Integration Guide.

import wallaroo
from wallaroo.object import EntityNotFoundError

# to display dataframe tables
from IPython.display import display
# used to display dataframe information without truncating
import pandas as pd
pd.set_option('display.max_colwidth', None)

# For Apache Arrow functions
import pyarrow

Connect to Wallaroo

For this example, a connection through the Wallaroo SDK is used. For more information, see the Wallaroo SDK Essentials Guide: Client Connection.

For wallarooPrefix = "YOUR PREFIX." and wallarooSuffix = "YOUR SUFFIX", enter the prefix and suffix for your Wallaroo instance DNS name. If the instance has no prefix, set wallarooPrefix = "". Note that a non-empty prefix includes the trailing . for proper formatting. For example, if the prefix is empty and the suffix is wallaroo.example.com, then the settings would be:

wallarooPrefix = ""
wallarooSuffix = "wallaroo.example.com"

If the prefix is sales. and the suffix is example.com, then the settings would be:

wallarooPrefix = "sales."
wallarooSuffix = "example.com"

# SSO login through keycloak

wallarooPrefix = "YOUR PREFIX."
wallarooSuffix = "YOUR SUFFIX"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")

Create the Workspace

We will create a workspace to work in and call it the databricksazuresdkworkspace, then set it as current workspace environment. We’ll also create our pipeline in advance as databricksazuresdkpipeline.

  • IMPORTANT NOTE: For this example, the CCFraud model is stored in the file ccfraud.onnx and is referenced from a relative link. For platforms such as Databricks, the files may need to be in a universal file format. For those, the example file location below may be:

model_file_name = '/dbfs/FileStore/tables/aloha/alohacnnlstm.zip'

Adjust file names and locations based on your requirements.

workspace_name = 'databricksazuresdkworkspace'
pipeline_name = 'databricksazuresdkpipeline'
model_name = 'ccfraudmodel'
model_file_name = './ccfraud.onnx'

def get_workspace(name):
    """Return the workspace with the given name, creating it if it does not exist."""
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    """Return the first pipeline with the given name, building it if none is found."""
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)
pipeline

name          databricksazuresdkpipeline
created       2023-05-17 21:02:55.076340+00:00
last_updated  2023-05-17 21:02:55.076340+00:00
deployed      (none)
tags
versions      8c4a15b4-2ef0-4da1-8e2d-38088fde8c56
steps

We can verify that the workspace was created and is the current default workspace with the get_current_workspace() command.

wl.get_current_workspace()
{'name': 'databricksazuresdkworkspace', 'id': 9, 'archived': False, 'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da', 'created_at': '2023-05-17T21:02:53.926098+00:00', 'models': [], 'pipelines': [{'name': 'databricksazuresdkpipeline', 'create_time': datetime.datetime(2023, 5, 17, 21, 2, 55, 76340, tzinfo=tzutc()), 'definition': '[]'}]}

Upload the Models

Now we will upload our model.

IMPORTANT NOTE: If using DBFS, use the file path format, such as /dbfs/FileStore/shared_uploads/YOURWORKSPACE/file, rather than the dbfs: format.
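
If a path was copied from the Databricks UI in the dbfs: form, it can be converted to the file path form before being passed to upload_model. The following is a minimal sketch; to_local_dbfs_path is a hypothetical helper, and it assumes the usual Databricks convention that dbfs:/PATH is also reachable as /dbfs/PATH.

```python
def to_local_dbfs_path(path):
    """Convert a 'dbfs:/...' URI to the '/dbfs/...' file path form; leave other paths unchanged."""
    scheme = "dbfs:"
    if path.startswith(scheme):
        return "/dbfs/" + path[len(scheme):].lstrip("/")
    return path

print(to_local_dbfs_path("dbfs:/FileStore/tables/aloha/alohacnnlstm.zip"))
```

Relative paths such as ./ccfraud.onnx and paths already in /dbfs/ form pass through unchanged.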

from wallaroo.framework import Framework
model = wl.upload_model(model_name, model_file_name, framework=Framework.ONNX).configure()
model
{'name': 'ccfraudmodel', 'version': '86b3bdf4-2b70-4111-9c41-f2e6d948c065', 'file_name': 'ccfraud.onnx', 'image_path': None, 'last_update_time': datetime.datetime(2023, 5, 17, 21, 2, 57, 826949, tzinfo=tzutc())}

Deploy a Model

Now that we have a model that we want to use we will create a deployment for it.

To do this, we’ll create our pipeline that can ingest the data, pass the data to our CCFraud model, and give us a final output. We’ll call our pipeline databricksazuresdkpipeline, then deploy it so it’s ready to receive data. The deployment process usually takes about 45 seconds.

pipeline.add_model_step(model)

name          databricksazuresdkpipeline
created       2023-05-17 21:02:55.076340+00:00
last_updated  2023-05-17 21:02:55.076340+00:00
deployed      (none)
tags
versions      8c4a15b4-2ef0-4da1-8e2d-38088fde8c56
steps

pipeline.deploy()

name          databricksazuresdkpipeline
created       2023-05-17 21:02:55.076340+00:00
last_updated  2023-05-17 21:02:59.288152+00:00
deployed      True
tags
versions      f125dc67-f690-4011-986a-8f6a9a23c48a, 8c4a15b4-2ef0-4da1-8e2d-38088fde8c56
steps         ccfraudmodel

We can verify that the pipeline is running and list what models are associated with it.

pipeline.status()
{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.3.138',
   'name': 'engine-6879975845-pp6r5',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'databricksazuresdkpipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'ccfraudmodel',
      'version': '86b3bdf4-2b70-4111-9c41-f2e6d948c065',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.4.168',
   'name': 'engine-lb-584f54c899-c5b94',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}

Inferences

Infer 1 row

Now that the pipeline is deployed and our CCFraud model is in place, we’ll perform a smoke test to verify the pipeline is up and running properly. We’ll use the infer command to submit a single transaction as a DataFrame and determine if it is flagged for fraud. If it returns correctly, a small value should be returned indicating a low likelihood that the transaction was fraudulent.

smoke_test = pd.DataFrame.from_records([
    {
        "tensor":[
            1.0678324729,
            0.2177810266,
            -1.7115145262,
            0.682285721,
            1.0138553067,
            -0.4335000013,
            0.7395859437,
            -0.2882839595,
            -0.447262688,
            0.5146124988,
            0.3791316964,
            0.5190619748,
            -0.4904593222,
            1.1656456469,
            -0.9776307444,
            -0.6322198963,
            -0.6891477694,
            0.1783317857,
            0.1397992467,
            -0.3554220649,
            0.4394217877,
            1.4588397512,
            -0.3886829615,
            0.4353492889,
            1.7420053483,
            -0.4434654615,
            -0.1515747891,
            -0.2668451725,
            -1.4549617756
        ]
    }
])
result = pipeline.infer(smoke_test)
display(result)
  | time | in.tensor | out.dense_1 | check_failures
0 | 2023-05-17 21:03:12.428 | [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756] | [0.0014974177] | 0
result.loc[0,["out.dense_1"]]
out.dense_1    [0.0014974177]
Name: 0, dtype: object
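To turn the raw model output into a yes/no decision, each one-element output list can be compared against a cutoff. The sketch below assumes a cutoff of 0.5, which is an illustrative value and not one given by this tutorial; the helper name flag_scores is likewise hypothetical.

```python
def flag_scores(outputs, threshold=0.5):
    """Map each one-element model output, such as [0.0014974177], to True or False.

    The 0.5 default threshold is an assumption made for illustration only.
    """
    return [row[0] >= threshold for row in outputs]


# Score returned by the smoke test above: well below the cutoff.
print(flag_scores([[0.0014974177]]))

# Hypothetical high and low scores, to show both outcomes.
print(flag_scores([[0.97], [0.2]]))
```

With a pandas result, the column values could be passed in as flag_scores(result["out.dense_1"].tolist()).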

Batch Inference

Now that our smoke test is successful, let’s really give it some data. We’ll use the cc_data_1k.arrow file that contains 1,000 inferences to be performed, then convert the result to a DataFrame and display the first 5 rows.

result = pipeline.infer_from_file("./data/cc_data_1k.arrow")
display(result)

outputs = result.to_pandas()
display(outputs.head(5))
pyarrow.Table
time: timestamp[ms]
in.tensor: list<item: float> not null
  child 0, item: float
out.dense_1: list<inner: float not null> not null
  child 0, inner: float not null
check_failures: int8
----
time: [[2023-05-17 21:03:13.090,2023-05-17 21:03:13.090,2023-05-17 21:03:13.090,2023-05-17 21:03:13.090,2023-05-17 21:03:13.090,...,2023-05-17 21:03:13.090,2023-05-17 21:03:13.090,2023-05-17 21:03:13.090,2023-05-17 21:03:13.090,2023-05-17 21:03:13.090]]
in.tensor: [[[-1.0603298,2.3544967,-3.5638788,5.138735,-1.2308457,...,0.038412016,1.0993439,1.2603409,-0.14662448,-1.4463212],[-1.0603298,2.3544967,-3.5638788,5.138735,-1.2308457,...,0.038412016,1.0993439,1.2603409,-0.14662448,-1.4463212],...,[0.49511018,-0.24993694,0.4553345,0.92427504,-0.36435103,...,1.1117147,-0.566654,0.12122019,0.06676402,0.6583282],[0.61188054,0.1726081,0.43105456,0.50321484,-0.27466634,...,0.30260187,0.081211455,-0.15578508,0.017189292,-0.7236631]]]
out.dense_1: [[[0.99300325],[0.99300325],...,[0.0008533001],[0.0012498498]]]
check_failures: [[0,0,0,0,0,...,0,0,0,0,0]]
  | time | in.tensor | out.dense_1 | check_failures
0 | 2023-05-17 21:03:13.090 | [-1.0603298, 2.3544967, -3.5638788, 5.138735, -1.2308457, -0.76878244, -3.5881228, 1.8880838, -3.2789674, -3.9563255, 4.099344, -5.653918, -0.8775733, -9.131571, -0.6093538, -3.7480276, -5.0309124, -0.8748149, 1.9870535, 0.7005486, 0.9204423, -0.10414918, 0.32295644, -0.74181414, 0.038412016, 1.0993439, 1.2603409, -0.14662448, -1.4463212] | [0.99300325] | 0
1 | 2023-05-17 21:03:13.090 | [-1.0603298, 2.3544967, -3.5638788, 5.138735, -1.2308457, -0.76878244, -3.5881228, 1.8880838, -3.2789674, -3.9563255, 4.099344, -5.653918, -0.8775733, -9.131571, -0.6093538, -3.7480276, -5.0309124, -0.8748149, 1.9870535, 0.7005486, 0.9204423, -0.10414918, 0.32295644, -0.74181414, 0.038412016, 1.0993439, 1.2603409, -0.14662448, -1.4463212] | [0.99300325] | 0
2 | 2023-05-17 21:03:13.090 | [-1.0603298, 2.3544967, -3.5638788, 5.138735, -1.2308457, -0.76878244, -3.5881228, 1.8880838, -3.2789674, -3.9563255, 4.099344, -5.653918, -0.8775733, -9.131571, -0.6093538, -3.7480276, -5.0309124, -0.8748149, 1.9870535, 0.7005486, 0.9204423, -0.10414918, 0.32295644, -0.74181414, 0.038412016, 1.0993439, 1.2603409, -0.14662448, -1.4463212] | [0.99300325] | 0
3 | 2023-05-17 21:03:13.090 | [-1.0603298, 2.3544967, -3.5638788, 5.138735, -1.2308457, -0.76878244, -3.5881228, 1.8880838, -3.2789674, -3.9563255, 4.099344, -5.653918, -0.8775733, -9.131571, -0.6093538, -3.7480276, -5.0309124, -0.8748149, 1.9870535, 0.7005486, 0.9204423, -0.10414918, 0.32295644, -0.74181414, 0.038412016, 1.0993439, 1.2603409, -0.14662448, -1.4463212] | [0.99300325] | 0
4 | 2023-05-17 21:03:13.090 | [0.5817662, 0.09788155, 0.15468194, 0.4754102, -0.19788623, -0.45043448, 0.016654044, -0.025607055, 0.09205616, -0.27839172, 0.059329946, -0.019658541, -0.42250833, -0.12175389, 1.5473095, 0.23916228, 0.3553975, -0.76851654, -0.7000849, -0.11900433, -0.3450517, -1.1065114, 0.25234112, 0.020944182, 0.21992674, 0.25406894, -0.04502251, 0.10867739, 0.25471792] | [0.0010916889] | 0

Undeploy Pipeline

When finished with our tests, we will undeploy the pipeline so we have the Kubernetes resources back for other tasks. Note that if the deployment variable is unchanged, pipeline.deploy() will restart the inference engine in the same configuration as before.

pipeline.undeploy()
name | databricksazuresdkpipeline
created | 2023-05-17 21:02:55.076340+00:00
last_updated | 2023-05-17 21:02:59.288152+00:00
deployed | False
tags |
versions | f125dc67-f690-4011-986a-8f6a9a23c48a, 8c4a15b4-2ef0-4da1-8e2d-38088fde8c56
steps | ccfraudmodel

1.4 - Wallaroo SDK Google Vertex Install Guide

How to install the Wallaroo SDK in Google Vertex

This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.

Installing the Wallaroo SDK into Google Vertex Workbench

Organizations that use Google Vertex for model training and development can deploy models to Wallaroo through the Wallaroo SDK. The following guide is created to assist users with installing the Wallaroo SDK, setting up authentication through Google Cloud Platform (GCP), and making a standard connection to a Wallaroo instance through Google Workbench.

These instructions are based on the Wallaroo SSO for Google Cloud Platform and the Connect to Wallaroo guides.

This tutorial provides the following:

  • alohacnnlstm.zip: A pre-trained open source Aloha CNN LSTM model that classifies domain names as either legitimate or used for nefarious purposes such as malware distribution.
  • Test Data Files:
    • data_1k.arrow: 1,000 records
    • data_25k.arrow: 25,000 records

To use the Wallaroo SDK within Google Workbench, a virtual environment will be used. This will set the necessary libraries and specific Python version required.

Prerequisites

The following is required for this tutorial:

  • A Wallaroo instance version 2023.1 or later.
  • Python 3.8.6 or later installed locally.
  • Conda: Used for managing python virtual environments.
  • The following Python libraries installed:
    • os
    • wallaroo: The Wallaroo SDK. Included with the Wallaroo JupyterHub service by default.
    • pandas: Pandas, mainly used for Pandas DataFrame
    • pyarrow: PyArrow for Apache Arrow support
    • polars: Polars for DataFrame with native Apache Arrow support

General Steps

For our example, we will perform the following:

  • Wallaroo SDK Install
    • Set up a Python virtual environment through conda with the libraries that enable the virtual environment for use in a JupyterHub environment.
    • Install the Wallaroo SDK.
  • Wallaroo SDK from remote JupyterHub Demonstration (Optional): The following steps are an optional exercise to demonstrate using the Wallaroo SDK from a remote connection. The entire tutorial can be found on the Wallaroo Tutorials repository.
    • Connect to a remote Wallaroo instance.
    • Create a workspace for our work.
    • Upload the Aloha model.
    • Create a pipeline that can ingest our submitted data, submit it to the model, and export the results.
    • Run a sample inference through our pipeline by loading a file.
    • Retrieve the external deployment URL. This sample Wallaroo instance has been configured to create external inference URLs for pipelines. For more information, see the External Inference URL Guide.
    • Run a sample inference through our pipeline’s external URL and store the results in a file. This assumes that the External Inference URLs have been enabled for the target Wallaroo instance.
    • Undeploy the pipeline and return resources back to the Wallaroo instance’s Kubernetes environment.

Install Wallaroo SDK

Set Up Virtual Python Environment

To set up the virtual environment for using the Wallaroo SDK in Google Workbench:

  1. Start a separate terminal by selecting File->New->Terminal.

  2. Create the Python virtual environment with conda. Replace wallaroosdk with the name of the virtual environment as required by your organization. Note that Python 3.8.6 and above is specified as a requirement for Python libraries used with the Wallaroo SDK. The following will install the latest version of Python 3.8, which as of this time is 3.8.15.

    conda create -n wallaroosdk python=3.8
    
  3. Activate the new environment.

    conda activate wallaroosdk
    
  4. Install the ipykernel library. This allows the JupyterHub notebooks to access the Python virtual environment as a kernel.

    conda install ipykernel
    
  5. Install the new virtual environment as a python kernel.

    ipython kernel install --user --name=wallaroosdk
    
  6. Install the Wallaroo SDK. This process may take several minutes while the other required Python libraries are added to the virtual environment.

    • IMPORTANT NOTE: The version of the Wallaroo SDK should match the version of the Wallaroo instance. For example, a Wallaroo Enterprise version 2023.1 instance calls for wallaroo==2023.1.0. The command below installs version 2023.2.1; replace the version number with the one that matches your instance.
    pip install wallaroo==2023.2.1
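As a quick sanity check after installation, the installed SDK version can be compared with the instance version. The sketch below is an illustration only: it assumes that "matching" means the same year and release number (for example 2023.2), which is our reading of the note above rather than a documented rule, and the helper names are hypothetical.

```python
import re


def release_of(version: str) -> tuple:
    """Extract (year, release) from a string such as '2023.2.1' or '2023.2.0rc3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_release(sdk_version: str, instance_version: str) -> bool:
    """Assumed rule: the SDK and the instance match when year and release agree."""
    return release_of(sdk_version) == release_of(instance_version)


print(same_release("2023.2.1", "2023.2"))
print(same_release("2023.2.1", "2023.1"))
```

In a notebook, the installed SDK version is available as wallaroo.__version__ and can be passed as the first argument.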
    

Once the conda virtual environment has been installed, it can either be selected as the kernel for a new Jupyter Notebook, or an existing Notebook’s kernel can be changed to it.

To use a new Notebook:

  1. From the main menu, select File->New->Notebook.
  2. From the Kernel selection dropdown, select the new virtual environment - in this case, wallaroosdk.

To update an existing Notebook to use the new virtual environment as a kernel:

  1. From the main menu, select Kernel->Change Kernel.
  2. Select the new kernel.

Sample Wallaroo Connection

With the Wallaroo Python SDK installed, remote commands and inferences can be performed through the following steps.

Open a Connection to Wallaroo

The first step is to connect to Wallaroo through the Wallaroo client.

This is accomplished using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type) command, which connects to the Wallaroo instance services.

The Client method takes the following parameters:

  • api_endpoint (String): The URL to the Wallaroo instance API service.
  • auth_endpoint (String): The URL to the Wallaroo instance Keycloak service.
  • auth_type (String): The authorization type. In this case, sso.

The URLs are based on the Wallaroo Prefix and Wallaroo Suffix for the Wallaroo instance. For more information, see the DNS Integration Guide. In the example below, replace “YOUR PREFIX” and “YOUR SUFFIX” with the Wallaroo Prefix and Suffix, respectively.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Depending on the configuration of the Wallaroo instance, the user will either be presented with a login request to the Wallaroo instance or be authenticated through a broker such as Google, Github, etc. To use the broker, select it from the list under the username/password login forms. For more information on Wallaroo authentication configurations, see the Wallaroo Authentication Configuration Guides.

Once authenticated, the user confirms adding the device from which the connection is being established. Once both steps are complete, the connection is granted.

The connection is stored in the variable wl for use in all other Wallaroo calls.

import wallaroo
from wallaroo.object import EntityNotFoundError

# to display dataframe tables
from IPython.display import display
# used to display dataframe information without truncating
import pandas as pd
pd.set_option('display.max_colwidth', None)
import pyarrow as pa

Connect to Wallaroo

For this example, a connection through the Wallaroo SDK is used. For more information, see the Wallaroo SDK Essentials Guide: Client Connection.

Set wallarooPrefix and wallarooSuffix to the prefix and suffix of your Wallaroo instance DNS name. If the instance has no prefix, use wallarooPrefix = "". Note that a non-empty prefix includes the trailing . for proper formatting. For example, if the prefix is empty and the suffix is wallaroo.example.com, then the settings would be:

wallarooPrefix = ""
wallarooSuffix = "wallaroo.example.com"

If the prefix is sales. and the suffix is wallaroo.example.com, then the settings would be:

wallarooPrefix = "sales."
wallarooSuffix = "wallaroo.example.com"
# SSO login through keycloak

wallarooPrefix = "YOUR PREFIX."
wallarooSuffix = "YOUR SUFFIX"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")
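Because a missing or extra period in the prefix or suffix produces an invalid URL, it can help to build the two endpoints in one place. The following is a minimal sketch of the URL pattern used above; the helper name wallaroo_endpoints is hypothetical, and the normalization of periods is our own addition rather than SDK behavior.

```python
def wallaroo_endpoints(prefix: str, suffix: str) -> dict:
    """Build the API and Keycloak URLs from a DNS prefix and suffix.

    An empty prefix is allowed. A non-empty prefix gets a trailing period
    if it lacks one, and stray periods around the suffix are removed.
    """
    prefix = prefix.strip()
    suffix = suffix.strip().strip(".")
    if prefix and not prefix.endswith("."):
        prefix += "."
    return {
        "api_endpoint": f"https://{prefix}api.{suffix}",
        "auth_endpoint": f"https://{prefix}keycloak.{suffix}",
    }


print(wallaroo_endpoints("", "wallaroo.example.com"))
print(wallaroo_endpoints("sales", "example.com"))
```

The resulting dictionary uses the same parameter names as the wallaroo.Client call above, so it could be passed as wallaroo.Client(auth_type="sso", **wallaroo_endpoints(wallarooPrefix, wallarooSuffix)).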

Create the Workspace

We will create a workspace to work in and call it gcpsdkworkspace, then set it as the current workspace. We’ll also create our pipeline in advance as gcpsdkpipeline.

  • IMPORTANT NOTE: For this example, the Aloha model is stored in the file alohacnnlstm.zip. When using tensor based models, the zip file must match the name of the tensor directory. For example, if the tensor directory is alohacnnlstm, then the .zip file must be named alohacnnlstm.zip.
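The naming rule in the note above can be checked before uploading. The sketch below uses only the Python standard library and assumes that the model directory sits at the top level of the archive; the helper name zip_matches_model_dir is hypothetical.

```python
import zipfile
from pathlib import Path


def zip_matches_model_dir(zip_path: str) -> bool:
    """True when every archive entry sits under one top-level directory
    whose name equals the zip file name without the .zip extension."""
    expected = Path(zip_path).stem
    with zipfile.ZipFile(zip_path) as archive:
        # Zip entry names always use forward slashes as separators.
        top_levels = {name.split("/", 1)[0] for name in archive.namelist()}
    return top_levels == {expected}


# Only runs the check when the tutorial's model file is present.
if Path("./alohacnnlstm.zip").exists():
    print(zip_matches_model_dir("./alohacnnlstm.zip"))
```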
workspace_name = 'gcpsdkworkspace'
pipeline_name = 'gcpsdkpipeline'
model_name = 'gcpsdkmodel'
model_file_name = './alohacnnlstm.zip'
def get_workspace(name):
    # Return the workspace with the given name, creating it if it does not exist.
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline
workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)
pipeline
name | gcpsdkpipeline
created | 2023-05-17 21:03:44.485720+00:00
last_updated | 2023-05-17 21:03:44.485720+00:00
deployed | (none)
tags |
versions | 7c043d3c-c894-4ae9-9ec1-c35518130b90
steps |

We can verify that the workspace was created and is the current default workspace with the get_current_workspace() command.

wl.get_current_workspace()
{'name': 'gcpsdkworkspace', 'id': 10, 'archived': False, 'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da', 'created_at': '2023-05-17T21:03:43.54944+00:00', 'models': [], 'pipelines': [{'name': 'gcpsdkpipeline', 'create_time': datetime.datetime(2023, 5, 17, 21, 3, 44, 485720, tzinfo=tzutc()), 'definition': '[]'}]}

Upload the Models

Now we will upload our model. Note that for this example we are uploading the model from a .zip file. The Aloha model is a protobuf file that has been defined for evaluating web pages, and we will configure it to use data in the TensorFlow format.

from wallaroo.framework import Framework
model = wl.upload_model(model_name, model_file_name, framework=Framework.TENSORFLOW).configure("tensorflow")

Deploy a Model

Now that we have a model that we want to use we will create a deployment for it.

We will tell the deployment we are using a TensorFlow model and give the deployment name and the configuration we want for the deployment.

To do this, we’ll add the model as a step to our pipeline so that it can ingest the data, pass the data to our Aloha model, and give us a final output. We already created the pipeline as gcpsdkpipeline, so we then deploy it so it’s ready to receive data. The deployment process usually takes about 45 seconds.

pipeline.add_model_step(model)
name | gcpsdkpipeline
created | 2023-05-17 21:03:44.485720+00:00
last_updated | 2023-05-17 21:03:44.485720+00:00
deployed | (none)
tags |
versions | 7c043d3c-c894-4ae9-9ec1-c35518130b90
steps |
pipeline.deploy()
name | gcpsdkpipeline
created | 2023-05-17 21:03:44.485720+00:00
last_updated | 2023-05-17 21:03:49.137632+00:00
deployed | True
tags |
versions | 6398cafc-50c4-49e3-9499-6025b7808245, 7c043d3c-c894-4ae9-9ec1-c35518130b90
steps | gcpsdkmodel

We can verify that the pipeline is running and list what models are associated with it.

pipeline.status()
{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.2.157',
   'name': 'engine-7694d96677-f4jfk',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'gcpsdkpipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'gcpsdkmodel',
      'version': 'aff60f1f-b036-47d7-920c-1819be75d734',
      'sha': 'd71d9ffc61aaac58c2b1ed70a2db13d1416fb9d3f5b891e5e4e2e97180fe22f8',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.4.169',
   'name': 'engine-lb-584f54c899-fc8lh',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}

Inferences

Infer 1 row

Now that the pipeline is deployed and our Aloha model is in place, we’ll perform a smoke test to verify the pipeline is up and running properly. We’ll use the infer command to submit a single encoded URL to the inference engine as a one-row DataFrame and print the results back out.

The result should tell us whether the tokenized URL is legitimate (0) or fraud (1). This sample data should return close to 1.

## Demonstrate via straight infer

smoke_test = pd.DataFrame.from_records(
    [
    {
        "text_input":[
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            28,
            16,
            32,
            23,
            29,
            32,
            30,
            19,
            26,
            17
        ]
    }
]
)

result = pipeline.infer(smoke_test)
display(result.loc[:, ["time","out.main"]])
  | time | out.main
0 | 2023-05-17 21:04:10.047 | [0.997564]

Infer 1,000 Rows

We can also infer an entire batch as one request either with the Pipeline infer method with multiple rows, or loaded from a file using the Pipeline infer_from_file method. For this example, we will run a batch on 1,000 records using the file data_1k.arrow. This is an Apache Arrow table, which gives the added benefit of speed and lower file size as a binary file rather than a text JSON file.

We’ll infer the 1,000 records, then convert it to a DataFrame and display the first 5 to save space in our Jupyter Notebook.

result = pipeline.infer_from_file('./data/data_1k.arrow')

outputs = result.to_pandas()
display(outputs.head(5).loc[:, ["time","out.main"]])
  | time | out.main
0 | 2023-05-17 21:04:10.886 | [0.997564]
1 | 2023-05-17 21:04:10.886 | [0.9885122]
2 | 2023-05-17 21:04:10.886 | [0.9993358]
3 | 2023-05-17 21:04:10.886 | [0.99999857]
4 | 2023-05-17 21:04:10.886 | [0.9984837]

Batch Inference

Now that our smoke test is successful, let’s really give it some data. We have two inference files we can use:

  • data_1k.arrow: Contains 1,000 inferences
  • data_25k.arrow: Contains 25,000 inferences

These inference inputs are Apache Arrow tables, which Wallaroo can ingest natively. These are binary files, and are faster to transmit because of their smaller size compared to JSON.

We’ll pipe the data_25k.arrow file through the pipeline deployment URL and place the results in a file named curl_response.df. Note that batches of 1,000 inferences or more can be difficult to view in JupyterHub because of their size, so we’ll only display the first 5 results of the inference.

When retrieving the pipeline inference URL through an external SDK connection, the External Inference URL will be returned. This URL will function provided that the Enable external URL inference endpoints option is enabled. For more information, see the Wallaroo Model Endpoints Guide.

inference_url = pipeline._deployment._url()
inference_url

The API connection details can be retrieved through the Wallaroo client mlops() command. This will display the connection URL, bearer token, and other information. The bearer token is available for one hour before it expires.

For this example, the API connection details will be retrieved, then used to submit an inference request through the external inference URL retrieved earlier.

connection = wl.mlops().__dict__
token = connection['token']
token
'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJDYkFqN19QY0xCWTFkWmJiUDZ6Q3BsbkNBYTd6US0tRHlyNy0yLXlQb25nIn0.eyJleHAiOjE2ODQzNTc0NjYsImlhdCI6MTY4NDM1NzQwNiwiYXV0aF90aW1lIjoxNjg0MzU1OTU5LCJqdGkiOiIwYTA1Y2EyYS0xNzVhLTQ0MTctYTJhYi03MmFlYmE5YTA5NDMiLCJpc3MiOiJodHRwczovL2RvYy10ZXN0LmtleWNsb2FrLndhbGxhcm9vY29tbXVuaXR5Lm5pbmphL2F1dGgvcmVhbG1zL21hc3RlciIsImF1ZCI6WyJtYXN0ZXItcmVhbG0iLCJhY2NvdW50Il0sInN1YiI6IjAyOGM4YjQ4LWMzOWItNDU3OC05MTEwLTBiNWJkZDM4MjRkYSIsInR5cCI6IkJlYXJlciIsImF6cCI6InNkay1jbGllbnQiLCJzZXNzaW9uX3N0YXRlIjoiMGJlODJjN2ItNzg1My00ZjVkLWJiNWEtOTlkYjUwYjhiNDVmIiwiYWNyIjoiMCIsInJlYWxtX2FjY2VzcyI6eyJyb2xlcyI6WyJkZWZhdWx0LXJvbGVzLW1hc3RlciIsIm9mZmxpbmVfYWNjZXNzIiwidW1hX2F1dGhvcml6YXRpb24iXX0sInJlc291cmNlX2FjY2VzcyI6eyJtYXN0ZXItcmVhbG0iOnsicm9sZXMiOlsibWFuYWdlLXVzZXJzIiwidmlldy11c2VycyIsInF1ZXJ5LWdyb3VwcyIsInF1ZXJ5LXVzZXJzIl19LCJhY2NvdW50Ijp7InJvbGVzIjpbIm1hbmFnZS1hY2NvdW50IiwibWFuYWdlLWFjY291bnQtbGlua3MiLCJ2aWV3LXByb2ZpbGUiXX19LCJzY29wZSI6InByb2ZpbGUgZW1haWwiLCJzaWQiOiIwYmU4MmM3Yi03ODUzLTRmNWQtYmI1YS05OWRiNTBiOGI0NWYiLCJlbWFpbF92ZXJpZmllZCI6ZmFsc2UsImh0dHBzOi8vaGFzdXJhLmlvL2p3dC9jbGFpbXMiOnsieC1oYXN1cmEtdXNlci1pZCI6IjAyOGM4YjQ4LWMzOWItNDU3OC05MTEwLTBiNWJkZDM4MjRkYSIsIngtaGFzdXJhLWRlZmF1bHQtcm9sZSI6InVzZXIiLCJ4LWhhc3VyYS1hbGxvd2VkLXJvbGVzIjpbInVzZXIiXSwieC1oYXN1cmEtdXNlci1ncm91cHMiOiJ7fSJ9LCJuYW1lIjoiSm9obiBIYW5zYXJpY2siLCJwcmVmZXJyZWRfdXNlcm5hbWUiOiJqb2huLmh1bW1lbEB3YWxsYXJvby5haSIsImdpdmVuX25hbWUiOiJKb2huIiwiZmFtaWx5X25hbWUiOiJIYW5zYXJpY2siLCJlbWFpbCI6ImpvaG4uaHVtbWVsQHdhbGxhcm9vLmFpIn0.oGodsdxjFu-XP8pzmET-MAUfrH-wNS-7y6WbT-OsQMyt_0xK5ilEHWD6YCtfREKbjmXd9U-a9LFSV3hIySO2truJeXQi6uQS3UTbvcoMdCdOMcnx9mYvxpGyiLw444AGHlbfKvqw4KLxaDi9pDWsyZFZkB8Ha1MLyvbaJvzWFWTsR2d12BptL1wdXFBiXtPfbywlKuUlpa4vDleGIAoZ3oywRdJ_wPsg5X2rCgr79BhwNufTXeKIAOjL_cZfuOZkASf8MzueT7aYGO3CMeWcFDkRcek3Svi58-7CTyTfYn3-0aQ0a73NjoNCb_Jta-cfFoTmCgD5G6h6SgftOXNh-Q'
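Since the bearer token expires, it can be useful to see how much time remains before submitting a long request. The sketch below assumes the token is a standard JSON Web Token with an exp claim, which is typical for Keycloak-issued tokens but is not stated by this tutorial. It reads the payload without verifying the signature, so it is suitable for local inspection only; the helper name is hypothetical.

```python
import base64
import json
import time
from typing import Optional


def token_seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Return the seconds left until the JWT 'exp' claim (negative if expired).

    The signature is NOT verified; this only decodes the payload segment.
    """
    payload_b64 = token.split(".")[1]
    # JWT segments drop base64 padding, so restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    return claims["exp"] - current
```

In a notebook this could be called as token_seconds_remaining(token), requesting a fresh token through wl.mlops() when the result is close to zero.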
dataFile="./data/data_25k.arrow"
contentType="application/vnd.apache.arrow.file"
!curl -X POST {inference_url} -H "Authorization: Bearer {token}" -H "Content-Type:{contentType}" --data-binary @{dataFile} > curl_response.df
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 25.5M  100 20.8M  100 4874k  2151k   492k  0:00:09  0:00:09 --:--:-- 4494k
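The same request can be made from Python instead of curl. The following sketch uses only the standard library urllib module and mirrors the headers of the curl command above; the helper name build_inference_request is hypothetical, and the network call is left commented out because it needs a live deployment.

```python
import urllib.request


def build_inference_request(url: str, token: str, arrow_bytes: bytes) -> urllib.request.Request:
    """Prepare a POST request carrying an Apache Arrow file to the inference URL."""
    return urllib.request.Request(
        url,
        data=arrow_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/vnd.apache.arrow.file",
        },
    )


# Usage against a live deployment, reusing inference_url and token from above:
# with open("./data/data_25k.arrow", "rb") as f:
#     request = build_inference_request(inference_url, token, f.read())
# with urllib.request.urlopen(request) as response, open("curl_response.df", "wb") as out:
#     out.write(response.read())
```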
cc_data_from_file =  pd.read_json('./curl_response.df', orient="records")
display(cc_data_from_file.head(5).loc[:, ["time","out"]])
  | time | out
0 | 1684357453186 | {'banjori': [0.0015195871], 'corebot': [0.9829148], 'cryptolocker': [0.012099565000000001], 'dircrypt': [4.7591344e-05], 'gozi': [2.0289392e-05], 'kraken': [0.0003197726], 'locky': [0.011029272000000001], 'main': [0.997564], 'matsnu': [0.010341625], 'pykspa': [0.008038965], 'qakbot': [0.016155062], 'ramdo': [0.006236233000000001], 'ramnit': [0.0009985756], 'simda': [1.793378e-26], 'suppobox': [1.3889898e-27]}
1 | 1684357453186 | {'banjori': [7.447225e-18], 'corebot': [6.7359245e-08], 'cryptolocker': [0.17081991], 'dircrypt': [1.3220147000000001e-09], 'gozi': [1.2758853e-24], 'kraken': [0.22559536], 'locky': [0.34209844], 'main': [0.99999994], 'matsnu': [0.30801848], 'pykspa': [0.18282163], 'qakbot': [3.8022553999999996e-11], 'ramdo': [0.20622534], 'ramnit': [0.15215826], 'simda': [1.17020745e-30], 'suppobox': [3.1514464999999997e-38]}
2 | 1684357453186 | {'banjori': [2.8599304999999997e-21], 'corebot': [9.302004999999999e-08], 'cryptolocker': [0.04445295], 'dircrypt': [6.1637580000000004e-09], 'gozi': [8.34974e-23], 'kraken': [0.48234479999999996], 'locky': [0.2633289], 'main': [1.0], 'matsnu': [0.29800323], 'pykspa': [0.22361766], 'qakbot': [1.5238920999999999e-06], 'ramdo': [0.3282038], 'ramnit': [0.029332466], 'simda': [1.1995533000000001e-31], 'suppobox': [0.0]}
3 | 1684357453186 | {'banjori': [2.1386805e-15], 'corebot': [3.8817485e-10], 'cryptolocker': [0.045599725], 'dircrypt': [1.9090386e-07], 'gozi': [1.3139924000000002e-25], 'kraken': [0.59542614], 'locky': [0.17374131], 'main': [0.9999996999999999], 'matsnu': [0.2315157], 'pykspa': [0.17591687], 'qakbot': [1.087611e-09], 'ramdo': [0.21832284000000002], 'ramnit': [0.012869288000000001], 'simda': [6.158882e-28], 'suppobox': [1.438591e-35]}
4 | 1684357453186 | {'banjori': [9.453381e-15], 'corebot': [7.091152e-10], 'cryptolocker': [0.049815107000000004], 'dircrypt': [5.2914135e-09], 'gozi': [7.4132087e-19], 'kraken': [1.5504637e-13], 'locky': [1.079181e-15], 'main': [0.9999988999999999], 'matsnu': [1.5003076000000002e-15], 'pykspa': [0.33075709999999997], 'qakbot': [2.6258948e-07], 'ramdo': [0.50362796], 'ramnit': [0.020393757000000002], 'simda': [0.0], 'suppobox': [0.0]}
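In the JSON response, the time column appears to hold epoch milliseconds and the out column holds a dictionary of one-element lists. The sketch below shows one way to pull a readable UTC timestamp and the main score out of a single record; the millisecond interpretation is inferred from the values above, and the helper name summarize_record is hypothetical.

```python
from datetime import datetime, timezone


def summarize_record(record: dict) -> tuple:
    """Return (UTC timestamp, 'main' score) for one record of the JSON response."""
    when = datetime.fromtimestamp(record["time"] / 1000, tz=timezone.utc)
    return when, record["out"]["main"][0]


# First record from the output above, trimmed to the fields used here.
sample = {"time": 1684357453186, "out": {"main": [0.997564]}}
when, score = summarize_record(sample)
print(when.isoformat(), score)
```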

Undeploy Pipeline

When finished with our tests, we will undeploy the pipeline so we have the Kubernetes resources back for other tasks. Note that if the deployment variable is unchanged, pipeline.deploy() will restart the inference engine in the same configuration as before.

pipeline.undeploy()
name | gcpsdkpipeline
created | 2023-05-17 21:03:44.485720+00:00
last_updated | 2023-05-17 21:03:49.137632+00:00
deployed | False
tags |
versions | 6398cafc-50c4-49e3-9499-6025b7808245, 7c043d3c-c894-4ae9-9ec1-c35518130b90
steps | gcpsdkmodel

1.5 - Wallaroo SDK Standard Install Guide

How to install the Wallaroo SDK in a typical environment

This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.

Installing the Wallaroo SDK

Organizations that develop machine learning models can deploy models to Wallaroo from their local systems to a Wallaroo instance through the Wallaroo SDK. The following guide is created to assist users with installing the Wallaroo SDK and making a standard connection to a Wallaroo instance.

These instructions are based on the Connect to Wallaroo guides.

This tutorial provides the following:

  • alohacnnlstm.zip: A pre-trained open source Aloha CNN LSTM model that classifies domain names as either legitimate or used for nefarious purposes such as malware distribution.
  • Test Data Files:
    • data_1k.arrow: 1,000 records
    • data_25k.arrow: 25,000 records

For this example, a virtual python environment will be used. This will set the necessary libraries and specific Python version required.

Prerequisites

The following is required for this tutorial:

  • A Wallaroo instance version 2023.1 or later.
  • Python 3.8.6 or later installed locally.
  • Conda: Used for managing python virtual environments.
  • The following Python libraries installed:
    • os
    • wallaroo: The Wallaroo SDK. Included with the Wallaroo JupyterHub service by default.
    • pandas: Pandas, mainly used for Pandas DataFrame
    • pyarrow: PyArrow for Apache Arrow support
    • polars: Polars for DataFrame with native Apache Arrow support

General Steps

For our example, we will perform the following:

  • Wallaroo SDK Install
    • Set up a Python virtual environment through conda with the libraries that enable the virtual environment for use in a JupyterHub environment.
    • Install the Wallaroo SDK.
  • Wallaroo SDK from remote JupyterHub Demonstration (Optional): The following steps are an optional exercise to demonstrate using the Wallaroo SDK from a remote connection. The entire tutorial can be found on the Wallaroo Tutorials repository.
    • Connect to a remote Wallaroo instance.
    • Create a workspace for our work.
    • Upload the Aloha model.
    • Create a pipeline that can ingest our submitted data, submit it to the model, and export the results.
    • Run a sample inference through our pipeline by loading a file.
    • Retrieve the external deployment URL. This sample Wallaroo instance has been configured to create external inference URLs for pipelines. For more information, see the External Inference URL Guide.
    • Run a sample inference through our pipeline’s external URL and store the results in a file. This assumes that the External Inference URLs have been enabled for the target Wallaroo instance.
    • Undeploy the pipeline and return resources back to the Wallaroo instance’s Kubernetes environment.

Install Wallaroo SDK

Set Up Virtual Python Environment

To set up the Python virtual environment for use of the Wallaroo SDK:

  1. From a terminal shell, create the Python virtual environment with conda. Replace wallaroosdk with the name of the virtual environment as required by your organization. Note that Python 3.8.6 and above is specified as a requirement for Python libraries used with the Wallaroo SDK. The following will install the latest version of Python 3.8.

    conda create -n wallaroosdk python=3.8
    
  2. Activate the new environment.

    conda activate wallaroosdk
    
  3. (Optional) For organizations that want to use the Wallaroo SDK from within Jupyter and similar environments:

    1. Install the ipykernel library. This allows the JupyterHub notebooks to access the Python virtual environment as a kernel, and is required for the second part of this tutorial.

      conda install ipykernel
      
    2. Install the new virtual environment as a python kernel.

      ipython kernel install --user --name=wallaroosdk
      
  4. Install the Wallaroo SDK. This process may take several minutes while the other required Python libraries are added to the virtual environment.

    • IMPORTANT NOTE: The version of the Wallaroo SDK should match the version of the Wallaroo instance. For example, a Wallaroo Enterprise version 2023.1 instance calls for wallaroo==2023.1.0. The command below installs version 2023.2.1; replace the version number with the one that matches your instance.
    pip install wallaroo==2023.2.1
    

For organizations that will be using the Wallaroo SDK with Jupyter or similar services: once the conda virtual environment has been installed, it can either be selected as the kernel for a new Jupyter Notebook, or an existing Notebook’s kernel can be changed to it.

To use a new Notebook:

  1. From the main menu, select File->New->Notebook.
  2. From the Kernel selection dropdown, select the new virtual environment - in this case, wallaroosdk.

To update an existing Notebook to use the new virtual environment as a kernel:

  1. From the main menu, select Kernel->Change Kernel.
  2. Select the new kernel.

Sample Wallaroo Connection

With the Wallaroo Python SDK installed, remote commands and inferences can be performed through the following steps.

Open a Connection to Wallaroo

The first step is to connect to Wallaroo through the Wallaroo client.

This is accomplished using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type) command, which connects to the Wallaroo instance services.

The Client method takes the following parameters:

  • api_endpoint (String): The URL to the Wallaroo instance API service.
  • auth_endpoint (String): The URL to the Wallaroo instance Keycloak service.
  • auth_type (String): The authorization type. In this case, sso.

The URLs are based on the Wallaroo Prefix and Wallaroo Suffix for the Wallaroo instance. For more information, see the DNS Integration Guide. In the example below, replace “YOUR PREFIX” and “YOUR SUFFIX” with the Wallaroo Prefix and Suffix, respectively.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Depending on the configuration of the Wallaroo instance, the user will either be presented with a login request to the Wallaroo instance or be authenticated through a broker such as Google, Github, etc. To use the broker, select it from the list under the username/password login forms. For more information on Wallaroo authentication configurations, see the Wallaroo Authentication Configuration Guides.

Once authenticated, the user confirms adding the device from which the connection is being established. Once both steps are complete, the connection is granted.

The connection is stored in the variable wl for use in all other Wallaroo calls.

import wallaroo
from wallaroo.object import EntityNotFoundError

# to display dataframe tables
from IPython.display import display
# used to display dataframe information without truncating
import pandas as pd
pd.set_option('display.max_colwidth', None)
import pyarrow as pa
wallaroo.__version__
'2023.2.0rc3'

Connect to Wallaroo

For this example, a connection through the Wallaroo SDK is used. For more information, see the Wallaroo SDK Essentials Guide: Client Connection.

For wallarooPrefix = "YOUR PREFIX." and wallarooSuffix = "YOUR SUFFIX", enter the prefix and suffix for your Wallaroo instance DNS name. If the instance has no prefix, set wallarooPrefix = "". Note that a non-empty prefix includes the trailing . for proper formatting. For example, if the prefix is empty and the suffix is wallaroo.example.com, then the settings would be:

wallarooPrefix = ""
wallarooSuffix = "wallaroo.example.com"

If the prefix is sales. and the suffix is wallaroo.example.com, then the settings would be:

wallarooPrefix = "sales."
wallarooSuffix = "wallaroo.example.com"
# SSO login through keycloak

wallarooPrefix = "YOUR PREFIX."
wallarooSuffix = "YOUR SUFFIX"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")
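The way the two endpoint URLs are assembled from the prefix and suffix can be sketched on its own, without opening a connection. This is a minimal illustration; build_endpoints is a hypothetical helper name that is not part of the Wallaroo SDK, and it assumes the api. and keycloak. host naming used in the example above. It also adds the trailing . to a non-empty prefix when it was left out.

```python
def build_endpoints(prefix: str, suffix: str) -> tuple:
    """Return (api_endpoint, auth_endpoint) for a Wallaroo instance.

    Hypothetical helper: mirrors the f-strings used in the connection
    example and tolerates a prefix written without its trailing dot.
    """
    # A non-empty prefix must end with "." so the host name is well formed
    if prefix and not prefix.endswith("."):
        prefix = prefix + "."
    api_endpoint = f"https://{prefix}api.{suffix}"
    auth_endpoint = f"https://{prefix}keycloak.{suffix}"
    return api_endpoint, auth_endpoint


api_endpoint, auth_endpoint = build_endpoints("sales", "wallaroo.example.com")
print(api_endpoint)
print(auth_endpoint)
```

The returned values can then be passed as the api_endpoint and auth_endpoint arguments of wallaroo.Client.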

Wallaroo Remote SDK Examples

The following examples can be used by an organization to test the Wallaroo SDK from a location remote from their Wallaroo instance. These examples show how to create workspaces, deploy pipelines, and perform inferences through the SDK and API.

Create the Workspace

We will create a workspace to work in and call it sdkquickworkspace, then set it as the current workspace environment. We’ll also create our pipeline in advance as sdkquickpipeline.

  • IMPORTANT NOTE: For this example, the Aloha model is stored in the file alohacnnlstm.zip. When using tensor based models, the zip file must match the name of the tensor directory. For example, if the tensor directory is alohacnnlstm, then the .zip file must be named alohacnnlstm.zip.
workspace_name = 'sdkquickworkspace'
pipeline_name = 'sdkquickpipeline'
model_name = 'sdkquickmodel'
model_file_name = './alohacnnlstm.zip'
def get_workspace(name):
    # Return the existing workspace with this name, or create it if none exists
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline
workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)
pipeline
name          sdkquickpipeline
created       2023-05-17 20:43:38.111213+00:00
last_updated  2023-05-17 20:43:38.111213+00:00
deployed      (none)
tags
versions      d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps

We can verify that the workspace was created and set as the current default workspace with the get_current_workspace() command.

wl.get_current_workspace()
{'name': 'sdkquickworkspace', 'id': 6, 'archived': False, 'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da', 'created_at': '2023-05-17T20:43:36.727099+00:00', 'models': [], 'pipelines': [{'name': 'sdkquickpipeline', 'create_time': datetime.datetime(2023, 5, 17, 20, 43, 38, 111213, tzinfo=tzutc()), 'definition': '[]'}]}

Upload the Models

Now we will upload our model. Note that for this example we are applying the model from a .ZIP file. The Aloha model is a protobuf file that has been defined for evaluating web pages, and we will configure it to use data in the tensorflow format.

from wallaroo.framework import Framework
model = wl.upload_model(model_name, model_file_name, framework=Framework.TENSORFLOW).configure("tensorflow")

Deploy a Model

Now that we have a model that we want to use we will create a deployment for it.

We will tell the deployment we are using a tensorflow model and give the deployment name and the configuration we want for the deployment.

To do this, we’ll add the Aloha model as a step in our pipeline sdkquickpipeline so that it can ingest the data, pass the data to our Aloha model, and give us a final output. We then deploy the pipeline so it’s ready to receive data. The deployment process usually takes about 45 seconds.

pipeline.add_model_step(model)
name          sdkquickpipeline
created       2023-05-17 20:43:38.111213+00:00
last_updated  2023-05-17 20:43:38.111213+00:00
deployed      (none)
tags
versions      d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps
pipeline
name          sdkquickpipeline
created       2023-05-17 20:43:38.111213+00:00
last_updated  2023-05-17 20:43:38.111213+00:00
deployed      (none)
tags
versions      d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps
pipeline.deploy()
name          sdkquickpipeline
created       2023-05-17 20:43:38.111213+00:00
last_updated  2023-05-17 20:43:46.036128+00:00
deployed      True
tags
versions      2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps         sdkquickmodel

We can verify that the pipeline is running and list what models are associated with it.

pipeline.status()
{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.3.134',
   'name': 'engine-6c9b85fd76-rjdxp',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'sdkquickpipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'sdkquickmodel',
      'version': 'f404ffe0-33ca-44cf-8a19-731e088c3262',
      'sha': 'd71d9ffc61aaac58c2b1ed70a2db13d1416fb9d3f5b891e5e4e2e97180fe22f8',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.4.164',
   'name': 'engine-lb-584f54c899-wh22c',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}
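The status dictionary can also be checked programmatically, for example before sending inference requests from a script. The following is a minimal sketch that assumes the dictionary has the shape shown in the output above; all_running is a hypothetical helper name, not a Wallaroo SDK method, and the sample dictionary is a trimmed copy of that output.

```python
def all_running(status: dict) -> bool:
    """Return True only if the pipeline and every listed component report 'Running'.

    Assumes the structure of the pipeline.status() output shown above.
    """
    if status.get("status") != "Running":
        return False
    for engine in status.get("engines", []):
        if engine.get("status") != "Running":
            return False
        # Nested pipeline and model statuses reported by each engine
        pipelines = (engine.get("pipeline_statuses") or {}).get("pipelines", [])
        models = (engine.get("model_statuses") or {}).get("models", [])
        for item in pipelines + models:
            if item.get("status") != "Running":
                return False
    for lb in status.get("engine_lbs", []):
        if lb.get("status") != "Running":
            return False
    return True


# Trimmed copy of the status output shown above
sample_status = {
    "status": "Running",
    "engines": [{
        "status": "Running",
        "pipeline_statuses": {"pipelines": [{"id": "sdkquickpipeline", "status": "Running"}]},
        "model_statuses": {"models": [{"name": "sdkquickmodel", "status": "Running"}]},
    }],
    "engine_lbs": [{"status": "Running"}],
    "sidekicks": [],
}
print(all_running(sample_status))
```

In a notebook this could be called as all_running(pipeline.status()).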

Inferences

Infer 1 row

Now that the pipeline is deployed and our Aloha model is in place, we’ll perform a smoke test to verify the pipeline is up and running properly. We’ll use the infer command to submit a single encoded URL to the inference engine as a pandas DataFrame and print the results back out.

The result should tell us whether the tokenized URL is legitimate (0) or fraud (1). This sample data should return close to 1.

## Demonstrate via straight infer

smoke_test = pd.DataFrame.from_records(
    [
    {
        "text_input":[
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            28,
            16,
            32,
            23,
            29,
            32,
            30,
            19,
            26,
            17
        ]
    }
]
)

result = pipeline.infer(smoke_test)
display(result.loc[:, ["time","out.main"]])
   time                     out.main
0  2023-05-17 20:44:04.622  [0.997564]
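The smoke test input above is a list of 50 integers in which the ten token values are preceded by forty zeros. A small helper can build such a list instead of typing it out. This is a sketch under an assumption: the fixed length of 50 and the left-padding with 0 are inferred from the sample above, not from a stated model specification, and pad_tokens is a hypothetical helper name.

```python
def pad_tokens(tokens, length=50, pad_value=0):
    """Left-pad a token list with pad_value up to a fixed length.

    The default length of 50 is an assumption taken from the sample input.
    """
    tokens = list(tokens)
    if len(tokens) > length:
        raise ValueError(f"expected at most {length} tokens, got {len(tokens)}")
    return [pad_value] * (length - len(tokens)) + tokens


text_input = pad_tokens([28, 16, 32, 23, 29, 32, 30, 19, 26, 17])
print(len(text_input))
```

The resulting list can be used as the text_input value in the record passed to pd.DataFrame.from_records.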

Infer 1,000 Rows

We can also infer an entire batch as one request either with the Pipeline infer method with multiple rows, or loaded from a file using the Pipeline infer_from_file method. For this example, we will run a batch on 1,000 records using the file data_1k.arrow. This is an Apache Arrow table, which gives the added benefit of speed and lower file size as a binary file rather than a text JSON file.

We’ll infer the 1,000 records, then convert it to a DataFrame and display the first 5 to save space in our Jupyter Notebook.

result = pipeline.infer_from_file('./data/data_1k.arrow')

outputs = result.to_pandas()
display(outputs.head(5).loc[:, ["time","out.main"]])
   time                     out.main
0  2023-05-17 20:44:05.905  [0.997564]
1  2023-05-17 20:44:05.905  [0.9885122]
2  2023-05-17 20:44:05.905  [0.9993358]
3  2023-05-17 20:44:05.905  [0.99999857]
4  2023-05-17 20:44:05.905  [0.9984837]

Batch Inference

Now that our smoke test is successful, let’s really give it some data. We have two inference files we can use:

  • data_1k.arrow: Contains 1,000 inferences
  • data_25k.arrow: Contains 25,000 inferences

We’ll submit the data_25k.arrow file through the pipeline deployment URL, and place the results in a file named curl_response.df. We’ll also display the time this takes. Note that larger batches of 50,000 inferences or more can be difficult to view in JupyterHub because of their size.

When retrieving the pipeline inference URL through an external SDK connection, the External Inference URL will be returned. This URL will function provided that the Enable external URL inference endpoints setting is enabled. For more information, see the Wallaroo Model Endpoints Guide.

inference_url = pipeline._deployment._url()
inference_url

The API connection details can be retrieved through the Wallaroo client mlops() command. This will display the connection URL, bearer token, and other information. The bearer token is available for one hour before it expires.

For this example, the API connection details will be retrieved, then used to submit an inference request through the external inference URL retrieved earlier.

connection = wl.mlops().__dict__
token = connection['token']
token
'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJDYkFqN19QY0xCWTFkWmJiUDZ6Q3BsbkNBYTd6US0tRHlyNy0yLXlQb25nIn0.eyJleHAiOjE2ODQzNTYyNzUsImlhdCI6MTY4NDM1NjIxNSwiYXV0aF90aW1lIjoxNjg0MzU1OTU5LCJqdGkiOiI1ZjRiNDQ1Zi1hNmI1LTQ2NzgtODZhMS05MWRjMjMxMjJiODIiLCJpc3MiOiJodHRwczovL2RvYy10ZXN0LmtleWNsb2FrLndhbGxhcm9vY29tbXVuaXR5Lm5pbmphL2F1dGgvcmVhbG1zL21hc3RlciIsImF1ZCI6WyJtYXN0ZXItcmVhbG0iLCJhY2NvdW50Il0sInN1YiI6IjAyOGM4YjQ4LWMzOWItNDU3OC05MTEwLTBiNWJkZDM4MjRkYSIsInR5cCI6IkJlYXJlciIsImF6cCI6InNkay1jbGllbnQiLCJzZXNzaW9uX3N0YXRlIjoiMGJlODJjN2ItNzg1My00ZjVkLWJiNWEtOTlkYjUwYjhiNDVmIiwiYWNyIjoiMCIsInJlYWxtX2FjY2VzcyI6eyJyb2xlcyI6WyJkZWZhdWx0LXJvbGVzLW1hc3RlciIsIm9mZmxpbmVfYWNjZXNzIiwidW1hX2F1dGhvcml6YXRpb24iXX0sInJlc291cmNlX2FjY2VzcyI6eyJtYXN0ZXItcmVhbG0iOnsicm9sZXMiOlsibWFuYWdlLXVzZXJzIiwidmlldy11c2VycyIsInF1ZXJ5LWdyb3VwcyIsInF1ZXJ5LXVzZXJzIl19LCJhY2NvdW50Ijp7InJvbGVzIjpbIm1hbmFnZS1hY2NvdW50IiwibWFuYWdlLWFjY291bnQtbGlua3MiLCJ2aWV3LXByb2ZpbGUiXX19LCJzY29wZSI6InByb2ZpbGUgZW1haWwiLCJzaWQiOiIwYmU4MmM3Yi03ODUzLTRmNWQtYmI1YS05OWRiNTBiOGI0NWYiLCJlbWFpbF92ZXJpZmllZCI6ZmFsc2UsImh0dHBzOi8vaGFzdXJhLmlvL2p3dC9jbGFpbXMiOnsieC1oYXN1cmEtdXNlci1pZCI6IjAyOGM4YjQ4LWMzOWItNDU3OC05MTEwLTBiNWJkZDM4MjRkYSIsIngtaGFzdXJhLWRlZmF1bHQtcm9sZSI6InVzZXIiLCJ4LWhhc3VyYS1hbGxvd2VkLXJvbGVzIjpbInVzZXIiXSwieC1oYXN1cmEtdXNlci1ncm91cHMiOiJ7fSJ9LCJuYW1lIjoiSm9obiBIYW5zYXJpY2siLCJwcmVmZXJyZWRfdXNlcm5hbWUiOiJqb2huLmh1bW1lbEB3YWxsYXJvby5haSIsImdpdmVuX25hbWUiOiJKb2huIiwiZmFtaWx5X25hbWUiOiJIYW5zYXJpY2siLCJlbWFpbCI6ImpvaG4uaHVtbWVsQHdhbGxhcm9vLmFpIn0.MS3R4uCav5m3UUQwxrKIXWLgw86DOoWnGJ9TSLAzbn-AVUliDNzBCt2LwWRgOmiUAlIrhFd2l8MCfHg4Ye8M7ucM099OH1VTmaXKOxBVc3Vlbr3tZQD9QcdDkonxe8DttrnOzMVxYKwci7YAiwEEz9eLmLvgxP6FSSl1rMMdxnTdAU9VFEDMUq00p3pu7hFGzsR_4XhGXOyJep-vhNvzP0eh1w3OVh11VjxaFs8a-2jXqYwBByg4oh6jArJ6Nhjcobcv0bqW2h5Q9stL_hT2ZNPd0F5bycc7UQRtDQ0Mt1JbNvm2G6pz3n3vo1nQJQ5y08fnQph6bvNT6LEpIOAEZA'
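Because the bearer token expires, it can be useful to check its expiry time before starting a long-running request. The following is a minimal sketch, assuming the token is a standard JSON Web Token (JWT) that carries an exp claim; jwt_expiry and is_expired are hypothetical helper names, not part of the Wallaroo SDK. The code only decodes the payload segment and does not verify the signature.

```python
import base64
import json
import time


def jwt_expiry(token: str) -> int:
    """Return the 'exp' claim (seconds since the epoch) from a JWT payload."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url encoded without padding; restore it
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(payload["exp"])


def is_expired(token: str, now=None) -> bool:
    """Return True if the token's 'exp' time is at or before 'now'."""
    now = time.time() if now is None else now
    return now >= jwt_expiry(token)
```

If is_expired(token) returns True, retrieve a fresh token through wl.mlops() as shown above before submitting the request.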
dataFile="./data/data_25k.arrow"
contentType="application/vnd.apache.arrow.file"
!curl -X POST {inference_url} -H "Authorization: Bearer {token}" -H "Content-Type:{contentType}" --data-binary @{dataFile} > curl_response.df
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 25.6M  100 20.8M  100 4874k   771k   176k  0:00:27  0:00:27 --:--:-- 1161k0:00:35  0:00:19  0:00:16 1079k
cc_data_from_file =  pd.read_json('./curl_response.df', orient="records")
display(cc_data_from_file.head(5).loc[:, ["time","out"]])
timeout
01684356249123{'banjori': [0.0015195821], 'corebot': [0.9829147500000001], 'cryptolocker': [0.012099549000000001], 'dircrypt': [4.7591115e-05], 'gozi': [2.0289428e-05], 'kraken': [0.00031977256999999996], 'locky': [0.011029262000000001], 'main': [0.997564], 'matsnu': [0.010341609], 'pykspa': [0.008038961], 'qakbot': [0.016155055], 'ramdo': [0.00623623], 'ramnit': [0.0009985747000000001], 'simda': [1.7933434e-26], 'suppobox': [1.388995e-27]}
11684356249123{'banjori': [7.447196e-18], 'corebot': [6.7359245e-08], 'cryptolocker': [0.1708199], 'dircrypt': [1.3220122000000002e-09], 'gozi': [1.2758705999999999e-24], 'kraken': [0.22559543], 'locky': [0.34209849999999997], 'main': [0.99999994], 'matsnu': [0.3080186], 'pykspa': [0.1828217], 'qakbot': [3.8022549999999994e-11], 'ramdo': [0.2062254], 'ramnit': [0.15215826], 'simda': [1.1701982e-30], 'suppobox': [3.1514454e-38]}
21684356249123{'banjori': [2.8598648999999997e-21], 'corebot': [9.302004000000001e-08], 'cryptolocker': [0.04445298], 'dircrypt': [6.1637580000000004e-09], 'gozi': [8.3496755e-23], 'kraken': [0.48234479999999996], 'locky': [0.26332903], 'main': [1.0], 'matsnu': [0.29800338], 'pykspa': [0.22361776], 'qakbot': [1.5238920999999999e-06], 'ramdo': [0.32820392], 'ramnit': [0.029332489000000003], 'simda': [1.1995622e-31], 'suppobox': [0.0]}
31684356249123{'banjori': [2.1387213e-15], 'corebot': [3.8817485e-10], 'cryptolocker': [0.045599736], 'dircrypt': [1.9090386e-07], 'gozi': [1.3140123e-25], 'kraken': [0.59542626], 'locky': [0.17374137], 'main': [0.9999996999999999], 'matsnu': [0.23151578], 'pykspa': [0.17591679999999998], 'qakbot': [1.0876152e-09], 'ramdo': [0.21832279999999998], 'ramnit': [0.0128692705], 'simda': [6.1588803e-28], 'suppobox': [1.4386237e-35]}
41684356249123{'banjori': [9.453342500000001e-15], 'corebot': [7.091151e-10], 'cryptolocker': [0.049815163], 'dircrypt': [5.2914135e-09], 'gozi': [7.4132087e-19], 'kraken': [1.5504574999999998e-13], 'locky': [1.079181e-15], 'main': [0.9999988999999999], 'matsnu': [1.5003075e-15], 'pykspa': [0.33075705], 'qakbot': [2.6258850000000004e-07], 'ramdo': [0.5036279], 'ramnit': [0.020393765], 'simda': [0.0], 'suppobox': [2.3292326e-38]}
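The same request can be assembled in Python instead of curl. The sketch below uses only the standard library and builds the request object without sending it; build_inference_request is a hypothetical helper name, and the URL and token in the example call are placeholders rather than real values.

```python
import urllib.request


def build_inference_request(inference_url, token, payload,
                            content_type="application/vnd.apache.arrow.file"):
    """Build a POST request equivalent to the curl command above.

    payload is the raw bytes of the data file (for example an Arrow file).
    """
    return urllib.request.Request(
        inference_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
        method="POST",
    )


# Placeholder values for illustration only
request = build_inference_request(
    "https://example.invalid/pipelines/sdkquickpipeline", "TOKEN", b"\x00\x01"
)
print(request.get_method(), request.full_url)
```

To submit it, read the data file in binary mode for the payload and pass the request to urllib.request.urlopen; the response body can then be written to a file in the same way curl redirects its output.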

Undeploy Pipeline

When finished with our tests, we will undeploy the pipeline so the Kubernetes resources are available for other tasks. Note that if the deployment variable is unchanged, pipeline.deploy() will restart the inference engine in the same configuration as before.

pipeline.undeploy()
name          sdkquickpipeline
created       2023-05-17 20:43:38.111213+00:00
last_updated  2023-05-17 20:43:46.036128+00:00
deployed      False
tags
versions      2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b
steps         sdkquickmodel

2 - Wallaroo SDK Essentials Guide

Reference Guide for the most essential Wallaroo SDK Commands

The following guides detail how to use the Wallaroo SDK. These include detailed instructions on classes, methods, and code examples.

Wallaroo JupyterHub Python Libraries

When using the Wallaroo SDK, it is recommended that the Python modules used are the same as those used in the Wallaroo JupyterHub environments to ensure maximum compatibility. When installing modules in the Wallaroo JupyterHub environments, do not override the following modules or versions, as that may impact the performance of the JupyterHub environments.

appdirs == 1.4.4
gql == 3.4.0
ipython == 7.24.1
matplotlib == 3.5.0
numpy == 1.22.3
orjson == 3.8.0
pandas == 1.3.4
pyarrow == 9.0.0
PyJWT == 2.4.0
python_dateutil == 2.8.2
PyYAML == 6.0
requests == 2.25.1
scipy == 1.8.0
seaborn == 0.11.2
tenacity == 8.0.1
# Required by gql?
requests_toolbelt>=0.9.1,<1
# Required by the autogenerated ML Ops client
httpx >= 0.15.4,<0.24.0
attrs >= 21.3.0
# These are documented as part of the autogenerated ML Ops requirements
# python = ^3.7
# python-dateutil = ^2.8.0
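One way to see whether a local environment matches these pins is to compare them with the installed package versions. The following is a minimal sketch using the standard library importlib.metadata module; find_mismatches and the PINNED subset are hypothetical names chosen for illustration, and only a few of the modules listed above are included.

```python
from importlib import metadata

# A subset of the pinned versions listed above
PINNED = {
    "numpy": "1.22.3",
    "pandas": "1.3.4",
    "pyarrow": "9.0.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed)} for every package that differs.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


for name, (expected, installed) in find_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every listed package is installed at exactly the pinned version.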

Model and Framework Support

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details, see the entry for each framework to find which runtime that model framework runs in.

Runtime Display  Model Runtime Space  Pipeline Configuration
tensorflow       Native               Native Runtime Configuration Methods
onnx             Native               Native Runtime Configuration Methods
python           Native               Native Runtime Configuration Methods
mlflow           Containerized        Containerized Runtime Deployment

Please note the following.

Wallaroo natively supports deploying Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

Parameter            Description
Web Site             https://onnx.ai/
Supported Libraries  See table below.
Framework            Framework.ONNX aka onnx
Runtime              Native aka onnx

The following ONNX model versions are supported:

Wallaroo Version                                       ONNX Version  ONNX IR Version  ONNX OPset Version  ONNX ML Opset Version
2023.2.1 (July 2023)                                   1.12.1        8                17                  3
2023.2 (May 2023)                                      1.12.1        8                17                  3
2023.1 (March 2023)                                    1.12.1        8                17                  3
2022.4 (December 2022)                                 1.12.1        8                17                  3
After April 2022 until release 2022.4 (December 2022)  1.10.*        7                15                  2
Before April 2022                                      1.6.*         7                13                  2

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inferences performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in.* fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a tensor of shape (2, 2, 3): it has 2 elements, each element has 2 rows, and each row has the shape (3,).

t = [
    [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]
]

In the examples above, every element has the same shape as its siblings. Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
]

INVALID SHAPE
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.
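The shape rule described in this section can be checked on nested Python lists before data is submitted. The following is a minimal sketch; tensor_shape is a hypothetical helper name, not a Wallaroo SDK function. It returns the shape of a regular nested list and raises ValueError when sibling elements have different shapes.

```python
def tensor_shape(t):
    """Return the shape of a nested list, or raise ValueError if it is ragged."""
    if not isinstance(t, (list, tuple)):
        return ()  # a scalar has an empty shape
    if len(t) == 0:
        return (0,)
    shapes = [tensor_shape(element) for element in t]
    # Every sibling element must have exactly the same shape
    if any(shape != shapes[0] for shape in shapes):
        raise ValueError(f"ragged tensor: element shapes {shapes}")
    return (len(t),) + shapes[0]


print(tensor_shape([[2.35, 5.75], [3.72, 8.55], [5.55, 97.2]]))

try:
    tensor_shape([[2.35, 5.75], [3.72, 8.55, 10.5], [5.55, 97.2]])
except ValueError as error:
    print(error)
```

The first call reports a regular three by two tensor; the second reports the ragged input from the example above.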

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]
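Mixed types such as the float and string combination above can be detected before an inference request is sent. The following is a minimal sketch on plain nested lists; leaf_types and has_consistent_type are hypothetical helper names. Note that this check compares Python types strictly, so a list mixing int and float values is also reported as inconsistent.

```python
def leaf_types(t):
    """Return the set of Python types found at the leaves of a nested list."""
    if isinstance(t, (list, tuple)):
        found = set()
        for element in t:
            found |= leaf_types(element)
        return found
    return {type(t)}


def has_consistent_type(t):
    """Return True when every leaf value has the same Python type."""
    return len(leaf_types(t)) <= 1


print(has_consistent_type([[2.35, 5.75], [3.72, 8.55], [5.55, 97.2]]))
print(has_consistent_type([[2.35, "Bob"], [3.72, "Nancy"], [5.55, "Wani"]]))
```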

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
   t                   s
0  [2.35, 5.75, 19.2]  [Bob, Nancy, Wani]
1  [5.55, 7.2, 15.7]   [Jason, Rita, Phoebe]

Parameter             Description
Web Site              https://www.tensorflow.org/
Supported Libraries   tensorflow==2.9.1
Framework             Framework.TENSORFLOW aka tensorflow
Runtime               Native aka tensorflow
Supported File Types  SavedModel format as .zip file

TensorFlow File Format

TensorFlow models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/
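Where the zip command is not available, the same archive can be produced from Python. The following is a minimal sketch using the standard library shutil.make_archive function; zip_saved_model is a hypothetical helper name. It names the archive after the model directory and keeps the directory as the top-level entry inside the archive, which corresponds to what the zip -r command above produces.

```python
import shutil
from pathlib import Path


def zip_saved_model(model_dir):
    """Create <model_dir>.zip next to model_dir and return the archive path.

    The archive contains the directory itself as its top-level entry.
    """
    model_dir = Path(model_dir).resolve()
    if not model_dir.is_dir():
        raise NotADirectoryError(str(model_dir))
    archive_base = model_dir.parent / model_dir.name
    archive = shutil.make_archive(
        str(archive_base),          # archive path without the .zip extension
        "zip",
        root_dir=model_dir.parent,  # paths in the archive are relative to this
        base_dir=model_dir.name,    # only this directory is archived
    )
    return Path(archive)
```

For the Aloha sample, zip_saved_model("alohacnnlstm") writes alohacnnlstm.zip alongside the alohacnnlstm directory.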

ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

Parameter            Description
Web Site             https://www.python.org/
Supported Libraries  python==3.8
Framework            Framework.PYTHON aka python
Runtime              Native aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP), which allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates which Python modules outside of the typical Wallaroo platform need to be imported.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will receive the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then its return value will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from an ML model whose output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns that element as the value of the field output.

import pandas as pd

def wallaroo_json(data: pd.DataFrame):
    print(data)
    # Return the first value of the first row of dense_2 as the output field
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results in the output field being returned as the out.output field in the Wallaroo inference result.

   time                     in.tensor                                           out.output            check_failures
0  2023-06-20 20:23:28.395  [0.6878518042, 0.1760734021, -0.869514083, 0.3..  [12.886651039123535]  0

Parameter            Description
Web Site             https://huggingface.co/models
Supported Libraries
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
Frameworks           The following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
Runtime              Containerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING_FACE_IMAGE_TO_TEXT
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION
  • Framework.HUGGING_FACE_SUMMARIZATION
  • Framework.HUGGING_FACE_TRANSLATION

Additional inputs added to these Hugging Face pipelines will be passed as key/value pair arguments to the model’s generate method. If an argument is not supplied, then the model will default to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo Framework  Reference
Framework.HUGGING_FACE_FEATURE_EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo Framework  Reference
Framework.HUGGING_FACE_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo Framework  Reference
Framework.HUGGING_FACE_IMAGE_SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-IMAGE-TO-TEXT

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-QUESTION-ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-STABLE-DIFFUSION-TEXT-2-IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-TEXT-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-ZERO-SHOT-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped but may result in slightly worse performance
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-ZERO-SHOT-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-ZERO-SHOT-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo Framework | Reference
Framework.HUGGING-FACE-SENTIMENT-ANALYSIS | Hugging Face Sentiment Analysis
Wallaroo Framework | Reference
Framework.HUGGING-FACE-TEXT-GENERATION

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
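Several of the Hugging Face image schemas above declare `inputs` as nested fixed-size lists (for example 100 x 100 x 3). A mismatch in any dimension would not fit the declared schema, so it can help to check the shape before submitting. The following is a minimal sketch using plain Python lists; the helper name `nested_shape_matches` is hypothetical and not part of the Wallaroo SDK.

```python
from typing import Sequence, Tuple


def nested_shape_matches(data: Sequence, shape: Tuple[int, ...]) -> bool:
    """Return True if `data` is a nested list whose dimensions equal `shape`."""
    if not shape:
        # Leaf level reached: expect a scalar value, not another list.
        return not isinstance(data, (list, tuple))
    if not isinstance(data, (list, tuple)) or len(data) != shape[0]:
        return False
    return all(nested_shape_matches(item, shape[1:]) for item in data)


# A black 100 x 100 RGB image expressed as nested lists of integers.
image = [[[0, 0, 0] for _ in range(100)] for _ in range(100)]
image_ok = nested_shape_matches(image, (100, 100, 3))

# The same image with one row removed no longer matches the schema.
cropped_ok = nested_shape_matches(image[:99], (100, 100, 3))
print(image_ok, cropped_ok)
```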
Parameter | Description
Web Site | https://pytorch.org/
Supported Libraries | torch==1.13.1, torchvision==0.14.1
Framework | Framework.PYTORCH aka pytorch
Supported File Types | pt or pth in TorchScript format
Runtime | Containerized aka mlflow

Sci-kit Learn aka SKLearn.

Parameter | Description
Web Site | https://scikit-learn.org/stable/index.html
Supported Libraries | scikit-learn==1.2.2
Framework | Framework.SKLEARN aka sklearn
Runtime | Containerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than as JSON labeled columns: this ensures that the data is submitted in the exact order the model is trained to accept.

Original DataFrame:

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

Converted DataFrame:

  | inputs
0 | [5.1, 3.5, 1.4, 0.2]
1 | [4.9, 3.0, 1.4, 0.2]
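The conversion shown above can be sketched in a few lines. To stay dependency-free, this example uses plain dictionaries in place of a DataFrame; the function name `to_inputs_rows` is hypothetical, and the same packing could be done with DataFrame operations in practice.

```python
from typing import Dict, List, Sequence


def to_inputs_rows(
    records: Sequence[Dict[str, float]], column_order: Sequence[str]
) -> List[Dict[str, List[float]]]:
    """Pack each record's columns into one 'inputs' array, in column_order."""
    return [
        {"inputs": [float(record[name]) for name in column_order]}
        for record in records
    ]


# The order the model was trained to accept.
column_order = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]
records = [
    {"sepal length (cm)": 5.1, "sepal width (cm)": 3.5,
     "petal length (cm)": 1.4, "petal width (cm)": 0.2},
    {"sepal length (cm)": 4.9, "sepal width (cm)": 3.0,
     "petal length (cm)": 1.4, "petal width (cm)": 0.2},
]
converted = to_inputs_rows(records, column_order)
print(converted)
```

Passing the column order explicitly, rather than relying on dictionary or column ordering, keeps the array aligned with the training order even if the source data is rearranged.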

SKLearn Schema Outputs

SKLearn outputs that represent predictions or probabilities are labeled in the output schema supplied when the model is uploaded to Wallaroo. For example, a model that outputs either 1 or 0 would have the following output schema:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
  | time | in.inputs | out.predictions | check_failures
0 | 2023-07-05 15:11:29.776 | [5.1, 3.5, 1.4, 0.2] | 0 | 0
1 | 2023-07-05 15:11:29.776 | [4.9, 3.0, 1.4, 0.2] | 0 | 0
Parameter | Description
Web Site | https://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries | tensorflow==2.8.0, keras==1.1.0
Framework | Framework.KERAS aka keras
Supported File Types | SavedModel format as .zip file and HDF5 format
Runtime | Containerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 format for TensorFlow Keras models.

Parameter | Description
Web Site | https://xgboost.ai/
Supported Libraries | xgboost==1.7.4
Framework | Framework.XGBOOST aka xgboost
Supported File Types | pickle (XGB files are not supported.)
Runtime | Containerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than as JSON labeled columns: this ensures that the data is submitted in the exact order the model is trained to accept.

Original DataFrame:

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

Converted DataFrame:

  | inputs
0 | [5.1, 3.5, 1.4, 0.2]
1 | [4.9, 3.0, 1.4, 0.2]
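Because every value in the array must share one data type and the array length must match the schema's list_size, a small guard before submission can catch malformed rows early. This is a sketch under those assumptions; `coerce_row` is a hypothetical helper, not an SDK function.

```python
from typing import List, Sequence


def coerce_row(values: Sequence[object], expected_size: int = 4) -> List[float]:
    """Cast every value to float and check the row length against list_size."""
    if len(values) != expected_size:
        raise ValueError(f"expected {expected_size} values, got {len(values)}")
    # float() raises TypeError for None, matching the rule that
    # None/Null values cannot be submitted.
    return [float(v) for v in values]


# An integer mixed in with floats is normalized to float.
row = coerce_row([5, 3.5, 1.4, 0.2])
print(row)
```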

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
  | time | in.inputs | out.output | check_failures
0 | 2023-07-05 15:11:29.776 | [5.1, 3.5, 1.4, 0.2] | 0 | 0
1 | 2023-07-05 15:11:29.776 | [4.9, 3.0, 1.4, 0.2] | 0 | 0
Parameter | Description
Web Site | https://www.python.org/
Supported Libraries | python==3.8
Framework | Framework.CUSTOM aka custom
Runtime | Containerized aka mlflow

Arbitrary Python models, also known as Bring Your Own Predict (BYOP), allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, TensorFlow, etc.) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python libraries need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

Artifact | Type | Description
Python scripts (.py files) with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder | Python Script | Extend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there are no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder: any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txt | Python requirements file | This sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This ensures that the script and methods will function exactly the same as during the model creation process.
Other artifacts | Files | Other models, files, and other artifacts used in support of this model.

For example, if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.
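Where the zip command line tool is not available, the same archive could be produced from Python with the standard library. The sketch below assumes the directory layout shown above; `package_model` is a hypothetical helper name, and the demonstration writes placeholder files into a temporary directory rather than real model artifacts.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def package_model(model_dir: Path) -> Path:
    """Zip model_dir (including the directory itself) next to it."""
    archive = shutil.make_archive(
        base_name=str(model_dir),        # produces <model_dir>.zip
        format="zip",
        root_dir=str(model_dir.parent),  # archive paths are relative to this
        base_dir=model_dir.name,         # keeps the vgg_clustering/ prefix
    )
    return Path(archive)


# Demonstration with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "vgg_clustering"
    model_dir.mkdir()
    for name in ("feature_extractor.h5", "kmeans.pkl",
                 "custom_inference.py", "requirements.txt"):
        (model_dir / name).write_text("placeholder")
    archive_path = package_model(model_dir)
    with zipfile.ZipFile(archive_path) as zf:
        archived_names = sorted(zf.namelist())

print(archive_path.name)
print(archived_names)
```

Setting base_dir keeps the vgg_clustering/ prefix on every archived path, which mirrors what zip -r vgg_clustering.zip vgg_clustering/ produces.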

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
Object | Type | Description
model | Optional[Any] | An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional: no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs.
mac.inference.Inference Methods
Method | Returns | Description
expected_model_types (Required) | Set | Returns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required) | mac.types.InferenceData | The entry point for the Wallaroo inference, with the following input and output parameters that are defined when the model is uploaded. Input (mac.types.InferenceData): a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format. Output (mac.types.InferenceData): a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format. The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assigned | N/A | Error when expected_model_types is not set.
raise_error_if_model_is_wrong_type | N/A | Error when the model does not match the expected_model_types.

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
Method | Returns | Description
create(config mac.config.inference.CustomInferenceConfig) (Required) | The custom Inference instance. | Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering/feature_extractor.h5 would be set with config.model_path / "feature_extractor.h5". The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inference | The custom Inference instance. | Returns the instantiated custom Inference object created from the create method.
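To show how the two classes fit together, here is a minimal sketch. So that it runs without the Wallaroo SDK, it defines small stand-in base classes that follow the class diagrams above; in a real BYOP script these would be replaced by imports of mac.inference.Inference and mac.inference.creation.InferenceBuilder, and the actual signatures should be checked against the installed SDK. ThresholdModel, threshold.txt, and the SimpleNamespace config are hypothetical placeholders, and plain lists stand in for the numpy arrays used by mac.types.InferenceData.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, List, Optional, Set

InferenceData = Dict[str, List[float]]  # the SDK uses a dict of numpy arrays


class Inference(ABC):  # stand-in for mac.inference.Inference
    def __init__(self) -> None:
        self.model: Optional[Any] = None

    @abstractmethod
    def expected_model_types(self) -> Set[type]: ...

    @abstractmethod
    def _predict(self, input_data: InferenceData) -> InferenceData: ...


class InferenceBuilder(ABC):  # stand-in for mac.inference.creation.InferenceBuilder
    @abstractmethod
    def create(self, config: Any) -> Inference: ...


class ThresholdModel:
    """Toy model standing in for a real artifact such as a KMeans object."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, values: List[float]) -> List[int]:
        return [int(v > self.threshold) for v in values]


class ThresholdInference(Inference):
    def expected_model_types(self) -> Set[type]:
        return {ThresholdModel}

    def _predict(self, input_data: InferenceData) -> InferenceData:
        if self.model is None:
            raise ValueError("model is not assigned")
        return {"predictions": self.model.predict(input_data["inputs"])}


class ThresholdInferenceBuilder(InferenceBuilder):
    def create(self, config: Any) -> ThresholdInference:
        inference = ThresholdInference()
        # Artifacts are resolved relative to config.model_path (a pathlib.Path).
        threshold_file = config.model_path / "threshold.txt"
        model = ThresholdModel(float(threshold_file.read_text()))
        if type(model) not in inference.expected_model_types():
            raise TypeError("model does not match expected_model_types")
        inference.model = model
        return inference


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "threshold.txt").write_text("2.0")
    # SimpleNamespace stands in for CustomInferenceConfig.
    config = SimpleNamespace(model_path=Path(tmp))
    inference = ThresholdInferenceBuilder().create(config)

result = inference._predict({"inputs": [1.4, 3.5]})
print(result)
```

The builder is the only place that touches the file system, which keeps the Inference subclass focused on turning input arrays into output arrays.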

Arbitrary Python Runtime

Arbitrary Python models always run in the containerized model runtime.

Parameter | Description
Web Site | https://mlflow.org
Supported Libraries | mlflow==1.30.0
Runtime | Containerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the wallaroo.framework.Framework class. The following demonstrates listing all available supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

2.1 - Wallaroo SDK Essentials Guide: Client Connection

How to connect to a Wallaroo instance through the Wallaroo SDK

Users connect to a Wallaroo instance with the Wallaroo Client class. This connection can be made from within the Wallaroo instance, or externally via the Wallaroo SDK.

The following methods are supported in connecting to the Wallaroo instance:

  • Connect from Within the Wallaroo Instance: Connect within the JupyterHub service or other method within the Kubernetes cluster hosting the Wallaroo instance. This requires confirming the connections with the Wallaroo instance through a browser link.
  • Connect from Outside the Wallaroo Instance: Connect via the Wallaroo SDK via an external connection to the Kubernetes cluster hosting the Wallaroo instance. This requires confirming the connections with the Wallaroo instance through a browser link.
  • Automated Connection: Connect to the Wallaroo instance by providing the username and password directly into the request. This bypasses confirming the connections with the Wallaroo instance through a browser link.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Depending on the configuration of the Wallaroo instance, the user will either be presented with a login request to the Wallaroo instance or be authenticated through a broker such as Google, Github, etc. To use the broker, select it from the list under the username/password login forms. For more information on Wallaroo authentication configurations, see the Wallaroo Authentication Configuration Guides.

Wallaroo Login

Once authenticated, the user confirms registration of the device the connection is being established from. Once both steps are complete, the connection is granted.

Device Registration

Connect from Within the Wallaroo Instance

Users who connect from within their Wallaroo instance’s Kubernetes environment, such as through the Wallaroo provided JupyterHub service, will be authenticated with the Wallaroo Client() method.

The first step in using Wallaroo is creating a connection. To connect to your Wallaroo environment:

  1. Import the wallaroo library:

    import wallaroo
    
  2. Open a connection to the Wallaroo environment with the wallaroo.Client() command and save it to a variable.

    In this example, the Wallaroo connection is saved to the variable wl.

    wl = wallaroo.Client()
    
  3. A verification URL will be displayed. Enter it into your browser and grant access to the SDK client.

    Wallaroo Confirm Connection
  4. Once this is complete, you will be able to continue with your Wallaroo commands.

    Wallaroo Connection Example

Connect from Outside the Wallaroo Instance

Users who have installed the Wallaroo SDK in an external location, such as their own JupyterHub service, Google Workbench, or other services, can connect via Single Sign-On (SSO). This is accomplished using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type="sso") command that connects to the Wallaroo instance services. For more information on the DNS names of Wallaroo services, see the DNS Integration Guide.

Before performing this step, verify that SSO is enabled for the specific service. For more information, see the Wallaroo Authentication Configuration Guide.

The Client method takes the following parameters:

  • api_endpoint (String): The URL to the Wallaroo instance API service.
  • auth_endpoint (String): The URL to the Wallaroo instance Keycloak service.
  • auth_type (String): The authorization type. In this case, sso.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. This connection is stored into a variable that can be referenced later.

In this example, a connection will be made to the Wallaroo instance shadowy-unicorn-5555.wallaroo.ai through SSO authentication.

import wallaroo
from wallaroo.object import EntityNotFoundError

# SSO login through keycloak

wl = wallaroo.Client(api_endpoint="https://shadowy-unicorn-5555.api.wallaroo.ai", 
                    auth_endpoint="https://shadowy-unicorn-5555.keycloak.wallaroo.ai", 
                    auth_type="sso")
Please log into the following URL in a web browser:

    https://shadowy-unicorn-5555.keycloak.wallaroo.example.com/auth/realms/master/device?user_code=LGZP-FIQX

Login successful!

Automated Connection

Users can connect either internally or externally without confirming the connection via a browser link using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type="user_password") command for external connections, and wallaroo.Client(auth_type="user_password") with internal connections.

The auth_type="user_password" parameter requires one of two options. The first is the environment variable WALLAROO_SDK_CREDENTIALS, set to the path of a JSON file with the following settings:

{
    "username": "{Connecting User's Username}", 
    "password": "{Connecting User's Password}", 
    "email": "{Connecting User's Email Address}"
}

The other option is to provide the environment variables WALLAROO_USER and WALLAROO_PASSWORD:

WALLAROO_USER={Connecting User's Username}
WALLAROO_PASSWORD={Connecting User's Password}

In typical installations, the username and email settings will both be the user’s email address.

For example, if the username is steve, the password is hello and the email is steve@ex.co, then the credentials can be supplied in the following ways. The first stores them in a file in the format above and sets WALLAROO_SDK_CREDENTIALS to that file:

import os
import wallaroo

# Import via file
os.environ["WALLAROO_SDK_CREDENTIALS"] = 'creds.json'
wl = wallaroo.Client(auth_type="user_password")

The other method:

# Set directly
os.environ["WALLAROO_USER"] = 'username@company.com'
os.environ["WALLAROO_PASSWORD"] = 'password'
wl = wallaroo.Client(auth_type="user_password")

For automated connections, using the environment options tied into a specific file with minimum access is recommended.
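One way to follow this recommendation is to create the credentials file with owner-only permissions before pointing WALLAROO_SDK_CREDENTIALS at it. The sketch below uses only the Python standard library; `write_credentials_file` is a hypothetical helper, the sample values are the ones from the example above, and the permission bits are only meaningful on POSIX systems.

```python
import json
import os
import tempfile
from pathlib import Path


def write_credentials_file(
    path: Path, username: str, password: str, email: str
) -> None:
    """Write the credentials JSON and restrict it to owner read/write."""
    payload = {"username": username, "password": password, "email": email}
    # Create the file with mode 0o600 so a newly created file is not
    # readable by group or other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(payload, handle)


creds_path = Path(tempfile.mkdtemp()) / "creds.json"
write_credentials_file(creds_path, "steve", "hello", "steve@ex.co")
os.environ["WALLAROO_SDK_CREDENTIALS"] = str(creds_path)

# With the Wallaroo SDK installed, the client could then be created with:
# wl = wallaroo.Client(auth_type="user_password")
```

Note that the mode passed to os.open only applies when the file is created; an existing file keeps its current permissions.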

The following example shows connecting to a remote Wallaroo instance via the auth_type="user_password" parameter with the credentials stored in the creds.json file using the format above:

import os
import wallaroo

# DNS prefix and suffix of the Wallaroo instance
wallarooPrefix = "wallaroo"
wallarooSuffix = "example.com"

# Credentials file in the JSON format shown above
os.environ["WALLAROO_SDK_CREDENTIALS"] = 'creds.json'

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}.api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}.keycloak.{wallarooSuffix}", 
                    auth_type="user_password")

2.2 - Wallaroo SDK Essentials Guide: Data Connections Management

How to create and manage Wallaroo Data Connections through the Wallaroo SDK

Wallaroo Data Connections are provided to establish connections to external data stores for requesting or submitting information. They provide a source of truth for data source connection information to enable repeatability and access control within your ecosystem. The actual implementation of data connections is managed through other means, such as Wallaroo Pipeline Orchestrators and Tasks, where the libraries and other tools used for the data connection can be stored.

Wallaroo Data Connections have the following properties:

  • Available across the Wallaroo Instance: Data Connections are created at the Wallaroo instance level.
  • Tied to a Wallaroo Workspace: Data Connections, like pipelines and models, are tied to a workspace. This allows organizations to limit data connection access by restricting users to specific workspaces.
  • Support different types: Data Connections support various types of connections such as ODBC, Kafka, etc.

Create Data Connection

Data Connections are created through the Wallaroo Client create_connection(name, type, details) method.

Parameter | Type | Description
name | string (Required) | The name of the connection. Names must be unique. Attempting to create a connection with the same name as an existing connection will cause an error.
type | string (Required) | The user defined type of connection.
details | Dict (Required) | User defined configuration details for the data connection. These can be {'username':'dataperson', 'password':'datapassword', 'port': 3339}, or {'token':'abcde123==', 'host':'example.com', 'port': 1234}, or other user defined combinations.

The SDK allows the data connections to be fully defined and stored for later use. This allows whatever type of data connection the organization uses to be defined, then applied to a workspace for other Wallaroo users to integrate into their code.

When data connections are displayed via the Wallaroo Client list_connections method, the details field is masked so that sensitive information is not shown by default.

wl.create_connection("houseprice_arrow_table", 
                  "HTTPFILE", 
                  {'host':'https://github.com/WallarooLabs/Wallaroo_Tutorials/tree/main/wallaroo-testing-tutorials/houseprice-saga/data/xtest-1k.arrow?raw=true'}
                  )
Field | Value
Name | houseprice_arrow_table
Connection Type | HTTPFILE
Details | *****
Created At | 2023-05-04T17:52:32.249322+00:00

Get Connection By Name

The Wallaroo client get_connection(name) method retrieves the connection with the Connection name matching the name parameter.

Parameter | Type | Description
name | string (Required) | The name of the connection.

In the following example, the connection name external_inference_connection will be retrieved and stored into the variable inference_source_connection.

inference_source_connection = wl.get_connection(name="external_inference_connection")
display(inference_source_connection)
Field | Value
Name | external_inference_connection
Connection Type | HTTPFILE
Details | *****
Created At | 2023-05-08T20:10:07.914083+00:00

List Data Connections

The Wallaroo Client list_connections() method lists all connections for the Wallaroo instance. In this listing, the details field is masked so that sensitive information is not shown by default.

wl.list_connections()
name | connection type | details | created at
houseprice_arrow_table | HTTPFILE | ***** | 2023-05-04T17:52:32.249322+00:00

Add Data Connection to Workspace

The Workspace method add_connection(connection_name) adds a Data Connection to a workspace. It takes the following parameters.

Parameter | Type | Description
name | string (Required) | The name of the Data Connection.

Connection Details

The Connection method details() retrieves the connection details as a dict.

display(connection.details())

{'host': 'https://github.com/WallarooLabs/Wallaroo_Tutorials/tree/main/wallaroo-testing-tutorials/houseprice-saga/data/xtest-1k.arrow?raw=true'}

Remove Connection from Workspace

The Workspace method remove_connection(connection_name) removes the connection from the workspace, but does not delete the connection from the Wallaroo instance. This method takes the following parameters.

Parameter | Type | Description
name | String (Required) | The name of the connection to be removed from the workspace.

Delete Connection

The Connection method delete_connection() removes the connection from the Wallaroo instance.

Before deleting a connection, it must be removed from all workspaces that it is attached to.
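The ordering requirement above can be sketched as a small helper: detach the connection from each workspace first, then delete it. The helper name and its arguments are hypothetical. It assumes the caller already knows which workspaces the connection is attached to, and it relies only on the remove_connection and delete_connection methods described in this section.

```python
def delete_connection_everywhere(connection, connection_name, workspaces):
    """Remove a connection from each given workspace, then delete it.

    `workspaces` is assumed to be an iterable of Workspace objects that the
    connection is currently attached to; discovering them is left to the caller.
    """
    for workspace in workspaces:
        # Detach from the workspace; the connection itself still exists.
        workspace.remove_connection(connection_name)
    # Only after every workspace is detached is the connection deleted.
    connection.delete_connection()
```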

2.3 - Wallaroo SDK Essentials Guide: Workspace Management

How to create and use Wallaroo Workspaces through the Wallaroo SDK

Workspace Management

Workspaces are used to segment groups of models into separate environments. This allows different users to either manage or have access to each workspace, controlling the models and pipelines assigned to the workspace.

Workspace Naming Requirements

Workspace names map onto Kubernetes objects, and must be DNS compliant. Workspace names must contain only ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.
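A name can be checked against this rule before it is sent to Wallaroo. The following is a minimal sketch. The function name is hypothetical, and it checks only the character rule stated above, not any further DNS constraints such as length limits.

```python
import re

# ASCII letters, digits, and dashes only, per the naming rule above.
_WORKSPACE_NAME = re.compile(r"[A-Za-z0-9-]+")


def is_valid_workspace_name(name: str) -> bool:
    """Return True if `name` contains only ASCII alphanumerics and dashes."""
    return _WORKSPACE_NAME.fullmatch(name) is not None
```

For example, is_valid_workspace_name("imdb-workspace") passes, while names containing a period or an underscore are rejected.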

Create a Workspace

Workspaces can be created either through the Wallaroo Dashboard or through the Wallaroo SDK.

  • IMPORTANT NOTICE

    Workspace names are not forced to be unique. You can have 50 workspaces all named my-amazing-workspace, which can cause confusion in determining which workspace to use.

    It is recommended that organizations agree on a naming convention and select the workspace to use rather than creating a new one each time.

To create a workspace, use the create_workspace("{WORKSPACE NAME}") command through an established Wallaroo connection and store the workspace settings into a new variable. Once the new workspace is created, the user who created the workspace is assigned as its owner. The following template is an example:

{New Workspace Variable} = {Wallaroo Connection}.create_workspace("{New Workspace Name}")

For example, if the connection is stored in the variable wl and the new workspace will be named imdb-workspace, then the command to store it in the new_workspace variable would be:

new_workspace = wl.create_workspace("imdb-workspace")
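Because workspace names are not forced to be unique, one way to follow the recommendation above is to look for an existing workspace before creating a new one. The sketch below is an illustration, not part of the SDK. The helper name is hypothetical, and it assumes each workspace object exposes its name through a name() method.

```python
def get_or_create_workspace(client, name):
    """Return the first workspace called `name`, creating it only if none exists.

    Assumes `client` provides list_workspaces() and create_workspace(), and
    that each workspace object has a name() method (an assumption here).
    """
    for workspace in client.list_workspaces():
        if workspace.name() == name:
            return workspace
    return client.create_workspace(name)
```

If several workspaces already share the name, this returns the first one listed, so an agreed naming convention is still needed to avoid ambiguity.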

List Workspaces

The command list_workspaces() displays the workspaces that are part of the current Wallaroo connection. The following details are returned as an array:

Parameter | Type | Description
Name | String | The name of the workspace. Note that workspace names are not unique.
Created At | DateTime | The date and time the workspace was created.
Users | Array[Users] | A list of all users assigned to this workspace.
Models | Integer | The number of models uploaded to the workspace.
Pipelines | Integer | The number of pipelines in the workspace.

For example, for the Wallaroo connection wl the following workspaces are returned:

wl.list_workspaces()
Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

Get Current Workspace

The command get_current_workspace displays the current workspace used for the Wallaroo connection. The following information is returned by default:

Parameter | Type | Description
name | String | The name of the current workspace.
id | Integer | The ID of the current workspace.
archived | Bool | Whether the workspace is archived or not.
created_by | String | The identifier code for the user that created the workspace.
created_at | DateTime | The timestamp for when the workspace was created.
models | Array[Models] | The models that are uploaded to this workspace.
pipelines | Array[Pipelines] | The pipelines created for the workspace.

For example, the following will display the current workspace for the wl connection that contains a single pipeline and multiple models:

wl.get_current_workspace()
{'name': 'imdb-workspace', 'id': 6, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-30T17:09:23.960406+00:00', 'models': [
        {'name': 'embedder-o', 'version': '6dbe5524-7bc3-4ff3-8ca8-d454b2cbd0e4', 'file_name': 'embedder.onnx', 'last_update_time': datetime.datetime(2022, 3, 30, 17, 34, 18, 321105, tzinfo=tzutc())},
        {'name': 'smodel-o', 'version': '6eb7f824-3d77-417f-9169-6a301d20d842', 'file_name': 'sentiment_model.onnx', 'last_update_time': datetime.datetime(2022, 3, 30, 17, 34, 18, 783485, tzinfo=tzutc())}
    ], 'pipelines': [
        {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 17, 34, 19, 318819, tzinfo=tzutc()), 'definition': '[]'}
    ]
}

Set the Current Workspace

The current workspace for the Wallaroo connection is set through the set_current_workspace method, which returns the workspace details as a JSON object:

{Wallaroo Connection}.set_current_workspace({Workspace Object})

Set Current Workspace from a New Workspace

The following example creates the workspace imdb-workspace through the Wallaroo connection stored in the variable wl, then sets it as the current workspace:

new_workspace = wl.create_workspace("imdb-workspace")
wl.set_current_workspace(new_workspace)
{'name': 'imdb-workspace', 'id': 7, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-30T17:43:09.405038+00:00', 'models': [], 'pipelines': []}

Set the Current Workspace from an Existing Workspace

To set the current workspace to an existing workspace, the easiest method is to call list_workspaces() and pass the desired element of the returned array to set_current_workspace. For example, from the following list_workspaces() output, the 3rd workspace element, demandcurve-workspace, can be assigned as the current workspace:

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

wl.set_current_workspace(wl.list_workspaces()[2])

{'name': 'demandcurve-workspace', 'id': 3, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-29T20:21:32.732178+00:00', 'models': [{'name': 'demandcurve', 'version': '4f5193fc-9c18-4851-8489-42e61d095588', 'file_name': 'demand_curve_v1.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 21, 32, 822812, tzinfo=tzutc())}, {'name': 'preprocess', 'version': '159b9e99-edb6-4c5e-8336-63bc6000623e', 'file_name': 'preprocess.py', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 21, 32, 984117, tzinfo=tzutc())}, {'name': 'postprocess', 'version': '77ee154c-d64c-49dd-985a-96f4c2931b6e', 'file_name': 'postprocess.py', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 21, 33, 119037, tzinfo=tzutc())}], 'pipelines': [{'name': 'demand-curve-pipeline', 'create_time': datetime.datetime(2022, 3, 29, 20, 21, 33, 264321, tzinfo=tzutc()), 'definition': '[]'}]}

Add a User to a Workspace

Users are added to the workspace via their email address through the wallaroo.workspace.Workspace.add_user({email address}) command. The email address must be assigned to a current user in the Wallaroo platform before they can be assigned to the workspace.

For example, the following workspace imdb-workspace has the user steve@ex.co. We will add the user john@ex.co to this workspace:

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

current_workspace = wl.list_workspaces()[3]

current_workspace.add_user("john@ex.co")

{'name': 'imdb-workspace', 'id': 4, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-29T20:23:08.742676+00:00', 'models': [{'name': 'embedder-o', 'version': '23a33c3d-68e6-4bdb-a8bc-32ea846908ee', 'file_name': 'embedder.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 23, 8, 833716, tzinfo=tzutc())}, {'name': 'smodel-o', 'version': '2c298aa9-be9d-482d-8188-e3564bdbab43', 'file_name': 'sentiment_model.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 23, 9, 49881, tzinfo=tzutc())}], 'pipelines': [{'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 29, 20, 23, 28, 518946, tzinfo=tzutc()), 'definition': '[]'}]}

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co', 'john@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

Remove a User from a Workspace

Removing a user from a workspace is performed through the wallaroo.workspace.Workspace.remove_user({email address}) command, where the {email address} matches a user in the workspace.

In the following example, the user john@ex.co is removed from the workspace imdb-workspace.

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co', 'john@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

current_workspace = wl.list_workspaces()[3]

current_workspace.remove_user("john@ex.co")

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

Add a Workspace Owner

To change the owner of a workspace, or to promote an existing user of a workspace to owner, use the wallaroo.workspace.Workspace.add_owner({email address}) command. The email address must belong to a current user in the Wallaroo platform before that user can be assigned as an owner of the workspace.

The following example shows assigning the user john@ex.co as an owner to the workspace imdb-workspace:

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

current_workspace = wl.list_workspaces()[3]

current_workspace.add_owner("john@ex.co")

{'name': 'imdb-workspace', 'id': 4, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-29T20:23:08.742676+00:00', 'models': [{'name': 'embedder-o', 'version': '23a33c3d-68e6-4bdb-a8bc-32ea846908ee', 'file_name': 'embedder.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 23, 8, 833716, tzinfo=tzutc())}, {'name': 'smodel-o', 'version': '2c298aa9-be9d-482d-8188-e3564bdbab43', 'file_name': 'sentiment_model.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 23, 9, 49881, tzinfo=tzutc())}], 'pipelines': [{'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 29, 20, 23, 28, 518946, tzinfo=tzutc()), 'definition': '[]'}]}

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co', 'john@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

2.4 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations

How to create and manage Wallaroo Model Uploads through the Wallaroo SDK

Models are uploaded or registered to a Wallaroo workspace depending on the model framework and version.

Wallaroo Engine Runtimes

Pipeline deployment configurations provide two runtimes to run models in the Wallaroo engine:

  • Native Runtimes: Models that are deployed “as is” with the Wallaroo engine. These are:

    • ONNX
    • Python step
    • Tensorflow 2.9.1 in SavedModel format
  • Containerized Runtimes: Containerized models such as MLFlow or Arbitrary Python. These are run in the Wallaroo engine in their containerized form.

  • Non-Native Runtimes: Models that when uploaded are either converted to a native Wallaroo runtime, or are containerized so they can be run in the Wallaroo engine. When uploaded, Wallaroo will attempt to convert it to a native runtime. If it can not be converted, then it will be packed in a Wallaroo containerized model based on its framework type.

Pipeline Deployment Configurations

Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.

This model will always run in the native runtime space.

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = (
    DeploymentConfigBuilder()
    .cpus(0.25)
    .memory('1Gi')
    .build()
)

Models and Runtimes

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details on which runtime a specific model framework runs in, see that framework's section.

Runtime Display | Model Runtime Space | Pipeline Configuration
tensorflow | Native | Native Runtime Configuration Methods
onnx | Native | Native Runtime Configuration Methods
python | Native | Native Runtime Configuration Methods
mlflow | Containerized | Containerized Runtime Deployment

Please note the following.

Wallaroo natively supports Open Neural Network Exchange (ONNX) models in the Wallaroo engine.

Parameter | Description
Web Site | https://onnx.ai/
Supported Libraries | See table below.
Framework | Framework.ONNX aka onnx
Runtime | Native aka onnx

The following ONNX model versions are supported:

Wallaroo Version | ONNX Version | ONNX IR Version | ONNX OPset Version | ONNX ML Opset Version
2023.2.1 (July 2023) | 1.12.1 | 8 | 17 | 3
2023.2 (May 2023) | 1.12.1 | 8 | 17 | 3
2023.1 (March 2023) | 1.12.1 | 8 | 17 | 3
2022.4 (December 2022) | 1.12.1 | 8 | 17 | 3
After April 2022 until release 2022.4 (December 2022) | 1.10.* | 7 | 15 | 2
Before April 2022 | 1.6.* | 7 | 13 | 2

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inferences performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in. fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a tensor of shape (2, 2, 3): it holds 2 elements, each element has 2 rows, and every row has the shape (3,).

t = [
    [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]
]

Tensors with elements of different shapes, known as ragged tensors, are not supported. In the following example, the second element has a shape of (3,) while the others have a shape of (2,):

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
]

INVALID SHAPE
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.
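Ragged inputs can be caught before an inference request is submitted. The following is a minimal sketch in plain Python, not part of the Wallaroo SDK. The hypothetical helper tensor_shape returns the shape of a nested list and raises an error when sibling elements differ in shape.

```python
def tensor_shape(value):
    """Return the shape of a nested list as a tuple.

    Raises ValueError if sibling elements have different shapes (ragged).
    """
    if not isinstance(value, (list, tuple)):
        return ()  # a scalar has an empty shape
    child_shapes = [tensor_shape(item) for item in value]
    if any(shape != child_shapes[0] for shape in child_shapes):
        raise ValueError("ragged tensor: elements have different shapes")
    return (len(value),) + (child_shapes[0] if child_shapes else ())
```

Applied to the examples above, the first tensor has the shape (3, 2), the second has the shape (2, 2, 3), and the ragged array raises ValueError.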

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]
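The consistency rule can be checked the same way. This sketch, again plain Python with hypothetical helper names, collects the Python types found at the leaves of a nested list and reports whether there is only one.

```python
def leaf_types(value):
    """Collect the set of Python types found at the leaves of a nested list."""
    if isinstance(value, (list, tuple)):
        found = set()
        for item in value:
            found |= leaf_types(item)
        return found
    return {type(value)}


def has_consistent_type(tensor):
    """Return True if every leaf value in `tensor` has the same Python type."""
    return len(leaf_types(tensor)) <= 1
```

Note that this check is strict about Python types: a list mixing int and float values, such as [1, 2.5], is also reported as inconsistent, so values may need to be cast to a single type first.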

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
  | t | s
0 | [2.35, 5.75, 19.2] | [Bob, Nancy, Wani]
1 | [5.55, 7.2, 15.7] | [Jason, Rita, Phoebe]

Parameter | Description
Web Site | https://www.tensorflow.org/
Supported Libraries | tensorflow==2.9.1
Framework | Framework.TENSORFLOW aka tensorflow
Runtime | Native aka tensorflow
Supported File Types | SavedModel format as .zip file

TensorFlow File Format

TensorFlow models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/
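Where the zip command is not available, the same archive can be produced from Python with the standard library. The sketch below uses shutil.make_archive; the helper name zip_saved_model is hypothetical. The archive is written next to the model directory, and its entries keep the directory name as a prefix, matching the layout produced by the command above.

```python
import shutil
from pathlib import Path


def zip_saved_model(model_dir):
    """Zip a SavedModel directory into <model_dir>.zip and return the archive path."""
    model_path = Path(model_dir).resolve()
    return shutil.make_archive(
        base_name=str(model_path),        # archive path without the .zip suffix
        format="zip",
        root_dir=str(model_path.parent),  # entries are stored relative to this directory
        base_dir=model_path.name,         # only this directory is archived
    )
```

Calling zip_saved_model("alohacnnlstm") from the directory that contains alohacnnlstm creates alohacnnlstm.zip alongside it.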

ML models that meet the TensorFlow and SavedModel format requirements run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

Parameter | Description
Web Site | https://www.python.org/
Supported Libraries | python==3.8
Framework | Framework.PYTHON aka python
Runtime | Native aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models, also known as “Python steps”, are standalone Python scripts that use the Python libraries natively supported by the Wallaroo platform. These are used either for simple model deployment (such as ARIMA Statsmodels) or for data formatting, such as postprocessing steps. A Wallaroo Python model is composed of one Python script that matches the Wallaroo requirements.

This contrasts with Arbitrary Python models, also known as Bring Your Own Predict (BYOP), which allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, TensorFlow, etc.) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates which Python modules outside of the typical Wallaroo platform need to be imported.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a postprocessing step for another ML model. The Python model expects to receive data from an ML model whose output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns that element as the value of the column output.

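import pandas as pd

def wallaroo_json(data: pd.DataFrame):
    # Log the incoming data from the previous pipeline step.
    print(data)
    # Return the first value of the dense_2 column under the key "output".
    return [{"output": [data["dense_2"].to_list()[0][0]]}]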

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, the output field is returned as the out.output field in the Wallaroo inference result.

  | time | in.tensor | out.output | check_failures
0 | 2023-06-20 20:23:28.395 | [0.6878518042, 0.1760734021, -0.869514083, 0.3.. | [12.886651039123535] | 0

Parameter | Description
Web Site | https://huggingface.co/models
Supported Libraries |
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
Frameworks | The following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
Runtime | Containerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding inputs not specified below will raise errors, except for the following pipelines:

  • Framework.HUGGING_FACE_IMAGE_TO_TEXT
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION
  • Framework.HUGGING_FACE_SUMMARIZATION
  • Framework.HUGGING_FACE_TRANSLATION

Additional inputs added to these Hugging Face pipelines are passed as key/value pair arguments to the model’s generate method. If an optional argument is not supplied, the model defaults to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.
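The rule above can be illustrated with a small plain-Python sketch. It is not how Wallaroo implements the check; the function name split_inputs and the FORWARDS_EXTRA_INPUTS set are assumptions made for illustration, with the set copied from the list of pipelines in this section.

```python
# Pipelines that forward unrecognized inputs to the model's generate method,
# as listed in this section (written with their string aliases).
FORWARDS_EXTRA_INPUTS = {
    "hugging-face-image-to-text",
    "hugging-face-text-classification",
    "hugging-face-summarization",
    "hugging-face-translation",
}


def split_inputs(framework, declared_fields, submitted_fields):
    """Separate submitted field names into (declared, forwarded).

    Raises ValueError when extra fields are sent to a pipeline that does
    not forward them.
    """
    declared = [f for f in submitted_fields if f in declared_fields]
    extra = [f for f in submitted_fields if f not in declared_fields]
    if extra and framework not in FORWARDS_EXTRA_INPUTS:
        raise ValueError(f"{framework} does not accept extra inputs: {extra}")
    return declared, extra


# 'extra_field' is the placeholder name used in the schemas below.
declared, forwarded = split_inputs(
    "hugging-face-summarization",
    {"inputs", "return_text", "return_tensors", "clean_up_tokenization_spaces"},
    ["inputs", "extra_field"],
)
print(declared, forwarded)
```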

Wallaroo Framework | Reference
Framework.HUGGING_FACE_FEATURE_EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_IMAGE_SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_IMAGE_TO_TEXT

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_OBJECT_DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_QUESTION_ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.int64()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_TEXT_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo Framework | Reference
Framework.HUGGING_FACE_SENTIMENT_ANALYSIS | Hugging Face Sentiment Analysis
Wallaroo Framework | Reference
Framework.HUGGING_FACE_TEXT_GENERATION

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
Parameter | Description
Web Site | https://pytorch.org/
Supported Libraries |
  • torch==1.13.1
  • torchvision==0.14.1
Framework | Framework.PYTORCH aka pytorch
Supported File Types | pt or pth in TorchScript format
Runtime | Containerized aka mlflow

Sci-kit Learn aka SKLearn.

Parameter | Description
Web Site | https://scikit-learn.org/stable/index.html
Supported Libraries |
  • scikit-learn==1.2.2
Framework | Framework.SKLEARN aka sklearn
Runtime | Containerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

 | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitted for inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the order the model expects, which is why it is submitted as a single array rather than as JSON labeled columns: this ensures that the data arrives in the exact order the model was trained to accept.

Original DataFrame:

 | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

Converted DataFrame:

 | inputs
0 | [5.1, 3.5, 1.4, 0.2]
1 | [4.9, 3.0, 1.4, 0.2]
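The conversion above can be sketched in plain Python, without pandas, to show why the fixed ordering matters. The FEATURE_ORDER list and the to_inputs_rows helper are assumptions for illustration only; the order is taken from the sample DataFrame and would need to match the order the model was actually trained on.

```python
# Column order the model was trained on (assumed for this sketch).
FEATURE_ORDER = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]


def to_inputs_rows(records, feature_order=FEATURE_ORDER):
    """Collapse labeled columns into one ordered float array per row."""
    return [
        {"inputs": [float(record[name]) for name in feature_order]}
        for record in records
    ]


records = [
    {"sepal length (cm)": 5.1, "sepal width (cm)": 3.5,
     "petal length (cm)": 1.4, "petal width (cm)": 0.2},
    # Keys in a different order still produce the trained order.
    {"petal width (cm)": 0.2, "sepal length (cm)": 4.9,
     "petal length (cm)": 1.4, "sepal width (cm)": 3.0},
]
converted = to_inputs_rows(records)
print(converted)
```

Because each row is rebuilt from FEATURE_ORDER rather than from the order of the incoming keys, a record with shuffled labels still yields the array layout the model expects.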

SKLearn Schema Outputs

SKLearn outputs that represent predictions or probabilities are labeled in the output schema specified when the model is uploaded to Wallaroo. For example, a model that outputs either 1 or 0 would have the following output schema:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
 | time | in.inputs | out.predictions | check_failures
0 | 2023-07-05 15:11:29.776 | [5.1, 3.5, 1.4, 0.2] | 0 | 0
1 | 2023-07-05 15:11:29.776 | [4.9, 3.0, 1.4, 0.2] | 0 | 0
Parameter | Description
Web Site | https://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries |
  • tensorflow==2.8.0
  • keras==1.1.0
Framework | Framework.KERAS aka keras
Supported File Types | SavedModel format as .zip file and HDF5 format
Runtime | Containerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 format for TensorFlow Keras models.

Parameter | Description
Web Site | https://xgboost.ai/
Supported Libraries | xgboost==1.7.4
Framework | Framework.XGBOOST aka xgboost
Supported File Types | pickle (XGB files are not supported.)
Runtime | Containerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

 | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitted for inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the order the model expects, which is why it is submitted as a single array rather than as JSON labeled columns: this ensures that the data arrives in the exact order the model was trained to accept.

Original DataFrame:

 | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

Converted DataFrame:

 | inputs
0 | [5.1, 3.5, 1.4, 0.2]
1 | [4.9, 3.0, 1.4, 0.2]
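Since every value in the array must share one data type, it can help to check rows before submitting them. The following is a minimal plain-Python sketch, assuming the 4-float schema shown above; check_inputs_row is a hypothetical helper and not part of the Wallaroo SDK.

```python
def check_inputs_row(row, list_size=4):
    """Raise ValueError unless `row` holds exactly `list_size` Python floats."""
    if len(row) != list_size:
        raise ValueError(f"expected {list_size} values, got {len(row)}")
    for position, value in enumerate(row):
        # Integers, booleans, and strings are rejected: only float passes.
        if not isinstance(value, float):
            raise ValueError(
                f"value at position {position} is {type(value).__name__}, not float"
            )
    return row


rows = [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]]
checked = [check_inputs_row(row) for row in rows]
print(checked)
```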

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
 | time | in.inputs | out.output | check_failures
0 | 2023-07-05 15:11:29.776 | [5.1, 3.5, 1.4, 0.2] | 0 | 0
1 | 2023-07-05 15:11:29.776 | [4.9, 3.0, 1.4, 0.2] | 0 | 0
Parameter | Description
Web Site | https://www.python.org/
Supported Libraries | python==3.8
Framework | Framework.CUSTOM aka custom
Runtime | Containerized aka mlflow

Arbitrary Python models, also known as Bring Your Own Predict (BYOP), allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, TensorFlow, etc.) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates which Python libraries need to be installed beyond those included in the Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

Artifact | Type | Description
Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder | Python Script | Extend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there are no naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder: any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txt | Python requirements file | This sets the Python libraries used for the arbitrary Python model. These libraries should be targeted for Python 3.8 compliance. The requirements and library versions should be exactly the same when creating the model and when deploying it in Wallaroo. This ensures that the script and methods function exactly as they did during the model creation process.
Other artifacts | Files | Other models, files, and other artifacts used in support of this model.

For example, if the arbitrary Python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.
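The same archive can also be built from Python with the standard library, which may be convenient inside a notebook. This is a sketch only: it creates empty placeholder files in a temporary directory in place of the real artifacts, and the file names are simply those from the example structure above.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workspace:
    workspace = Path(workspace)
    model_dir = workspace / "vgg_clustering"
    model_dir.mkdir()
    # Empty placeholder files stand in for the real artifacts.
    for name in ["feature_extractor.h5", "kmeans.pkl",
                 "custom_inference.py", "requirements.txt"]:
        (model_dir / name).touch()

    # Similar in effect to: zip -r vgg_clustering.zip vgg_clustering/
    archive_path = shutil.make_archive(
        str(workspace / "vgg_clustering"),  # output path without the .zip suffix
        "zip",
        root_dir=str(workspace),
        base_dir="vgg_clustering",
    )
    with zipfile.ZipFile(archive_path) as archive:
        archive_names = sorted(archive.namelist())

print(archive_names)
```

Setting base_dir keeps the vgg_clustering/ prefix on every entry, matching the layout produced by the zip command.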

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
Object | Type | Description
model | Optional[Any] | An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs.
mac.inference.Inference Methods
Method | Returns | Description
expected_model_types (Required) | Set | Returns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required) | mac.types.InferenceData | The entry point for the Wallaroo inference with the following input and output parameters that are defined when the model is uploaded.
  • mac.types.InferenceData: The input InferenceData is a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format.
  • mac.types.InferenceData: The output is a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format.
The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assigned | N/A | Error when expected_model_types is not set.
raise_error_if_model_is_wrong_type | N/A | Error when the model does not match the expected_model_types.

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
Method | Returns | Description
create(config mac.config.inference.CustomInferenceConfig) (Required) | The custom Inference instance. | Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary Python .zip file is uploaded and the environment for running it in Wallaroo is set. For example, the artifact vgg_clustering/feature_extractor.h5 would be loaded with config.model_path / "feature_extractor.h5". The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inference | Custom Inference instance. | Returns the instantiated custom Inference object created from the create method.
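The relationship between the two classes can be sketched as follows. Everything here is an assumption for illustration: the Inference and InferenceBuilder classes are local stand-ins declared so that the sketch runs without the Wallaroo SDK (a real BYOP script imports them from mac.inference and mac.inference.creation), ThresholdModel and threshold.json are hypothetical, and plain lists are used where Wallaroo passes numpy arrays.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from types import SimpleNamespace


class Inference(ABC):  # local stand-in for mac.inference.Inference
    model = None

    @abstractmethod
    def expected_model_types(self): ...

    @abstractmethod
    def _predict(self, input_data): ...


class InferenceBuilder(ABC):  # local stand-in for mac.inference.creation.InferenceBuilder
    @abstractmethod
    def create(self, config): ...


class ThresholdModel:  # hypothetical model type used only in this sketch
    def __init__(self, threshold):
        self.threshold = threshold


class ThresholdInference(Inference):
    def expected_model_types(self):
        return {ThresholdModel}

    def _predict(self, input_data):
        # input_data maps field names to values, mirroring the input schema.
        flags = [int(v > self.model.threshold) for v in input_data["inputs"]]
        return {"predictions": flags}


class ThresholdInferenceBuilder(InferenceBuilder):
    def create(self, config):
        # Every artifact is resolved relative to config.model_path.
        raw = json.loads((config.model_path / "threshold.json").read_text())
        model = ThresholdModel(raw["threshold"])
        inference = ThresholdInference()
        if type(model) not in inference.expected_model_types():
            raise TypeError("model does not match expected_model_types")
        inference.model = model
        return inference


with tempfile.TemporaryDirectory() as folder:
    model_path = Path(folder)
    (model_path / "threshold.json").write_text(json.dumps({"threshold": 0.5}))
    config = SimpleNamespace(model_path=model_path)  # stand-in for CustomInferenceConfig
    inference = ThresholdInferenceBuilder().create(config)

outputs = inference._predict({"inputs": [0.2, 0.9]})
print(outputs)
```

The builder owns artifact loading and the type check, so the Inference subclass only has to declare which model types it accepts and how to turn inputs into outputs.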

Arbitrary Python Runtime

Arbitrary Python models always run in the containerized model runtime.

Parameter | Description
Web Site | https://mlflow.org
Supported Libraries | mlflow==1.30.0
Runtime | Containerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the wallaroo.framework.Framework class. The following demonstrates listing all supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

2.4.1 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX

How to upload and use ONNX ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. Model names must contain only ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.
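A name can be checked against this rule before upload. The sketch below tests only the character rule stated above; Kubernetes DNS label rules are typically stricter (for example lowercase only, and no leading or trailing dash), so passing this check is not a guarantee that a name will be accepted. The helper name is_valid_model_name is hypothetical.

```python
import re

# ASCII letters, digits, and dashes only, per the rule above.
_MODEL_NAME = re.compile(r"[A-Za-z0-9-]+")


def is_valid_model_name(name):
    """Return True when `name` uses only ASCII alphanumerics and dashes."""
    return _MODEL_NAME.fullmatch(name) is not None


for candidate in ["ccfraud-model", "ccfraud_model", "model.v1"]:
    print(candidate, is_valid_model_name(candidate))
```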

Wallaroo natively supports Open Neural Network Exchange (ONNX) models in the Wallaroo engine.

Parameter | Description
Web Site | https://onnx.ai/
Supported Libraries | See table below.
Framework | Framework.ONNX aka onnx
Runtime | Native aka onnx

The following ONNX model versions are supported:

Wallaroo Version | ONNX Version | ONNX IR Version | ONNX OPset Version | ONNX ML Opset Version
2023.2.1 (July 2023) | 1.12.1 | 8 | 17 | 3
2023.2 (May 2023) | 1.12.1 | 8 | 17 | 3
2023.1 (March 2023) | 1.12.1 | 8 | 17 | 3
2022.4 (December 2022) | 1.12.1 | 8 | 17 | 3
After April 2022 until release 2022.4 (December 2022) | 1.10.* | 7 | 15 | 2
Before April 2022 | 1.6.* | 7 | 13 | 2

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inferences performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in.* fields returned with an inference result. In the following example, each input row is related directly to an inference output row.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0
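The output above has one row for each input row. A minimal sketch of checking that property in plain Python follows; the helper name `check_equal_rows` is hypothetical and is not part of the Wallaroo SDK, and it only compares row counts.

```python
from typing import Sequence


def check_equal_rows(inputs: Sequence, outputs: Sequence) -> None:
    """Raise ValueError if the output row count differs from the input row count.

    Hypothetical helper for illustration; not part of the Wallaroo SDK.
    """
    if len(inputs) != len(outputs):
        raise ValueError(
            f"expected {len(inputs)} output rows, got {len(outputs)}"
        )


# Five input rows and five output rows, mirroring the example above
# (placeholder values, not real inference data).
input_rows = [[0.1, 0.2]] * 5
output_rows = [[0.99]] * 5
check_equal_rows(input_rows, output_rows)  # passes silently
```

Because `len()` of a pandas DataFrame is its row count, the same check should also work when both arguments are DataFrames.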

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
    [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]
]

In the first example, each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
]

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.
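One way to catch a ragged input before submitting it is to compute the shape of the nested list and reject it when sibling elements disagree. The following is a minimal sketch in plain Python; `tensor_shape` is a hypothetical helper, not a Wallaroo SDK function.

```python
from typing import Optional, Tuple


def tensor_shape(value) -> Optional[Tuple[int, ...]]:
    """Return the shape of a nested list, or None if it is ragged."""
    if not isinstance(value, list):
        return ()  # a scalar leaf has an empty shape
    child_shapes = [tensor_shape(item) for item in value]
    if any(shape is None for shape in child_shapes):
        return None  # a nested element is itself ragged
    if len(set(child_shapes)) > 1:
        return None  # sibling elements disagree on shape
    inner = child_shapes[0] if child_shapes else ()
    return (len(value),) + inner


regular = [[2.35, 5.75], [3.72, 8.55], [5.55, 97.2]]
ragged = [[2.35, 5.75], [3.72, 8.55, 10.5], [5.55, 97.2]]

print(tensor_shape(regular))  # (3, 2)
print(tensor_shape(ragged))   # None
```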

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df

|   | t | s |
|---|---|---|
| 0 | [2.35, 5.75, 19.2] | [Bob, Nancy, Wani] |
| 1 | [5.55, 7.2, 15.7] | [Jason, Rita, Phoebe] |
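The consistency rule can be checked per field by collecting the Python types of every scalar in a nested list. This is a minimal sketch with hypothetical helper names (`leaf_types`, `is_type_consistent`); it compares Python types only, so a list mixing `int` and `float` values is also reported as inconsistent, which may be stricter than what the runtime requires.

```python
from typing import Set


def leaf_types(value) -> Set[type]:
    """Collect the Python types of every scalar in a nested list."""
    if isinstance(value, list):
        found: Set[type] = set()
        for item in value:
            found |= leaf_types(item)
        return found
    return {type(value)}


def is_type_consistent(value) -> bool:
    """True when every scalar in the nested list has the same type."""
    return len(leaf_types(value)) <= 1


valid = [[2.35, 5.75], [3.72, 8.55], [5.55, 97.2]]
invalid = [[2.35, "Bob"], [3.72, "Nancy"], [5.55, "Wani"]]

print(is_type_consistent(valid))    # True
print(is_type_consistent(invalid))  # False

# Separate fields may use different types, as long as each field is consistent.
columns = {
    "t": [[2.35, 5.75, 19.2], [5.55, 7.2, 15.7]],
    "s": [["Bob", "Nancy", "Wani"], ["Jason", "Rita", "Phoebe"]],
}
print(all(is_type_consistent(col) for col in columns.values()))  # True
```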

Upload ONNX Model to Wallaroo

Open Neural Network eXchange (ONNX) is the default model runtime supported by Wallaroo. ONNX models are uploaded to the current workspace through the Wallaroo Client upload_model(name, path, framework, input_schema, output_schema).configure(options) method. When uploading a model that matches the default Wallaroo runtime, configure(options) can be left empty or the framework onnx can be specified.

Uploading ONNX Models

ONNX models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload ONNX Model Parameters

The following parameters are required for ONNX models. Note that while some fields are considered optional for the upload_model method, they are required for proper uploading of an ONNX model to Wallaroo.

For ONNX models, the input_schema and output_schema are not required so are not listed here.

| Parameter | Type | Description |
|---|---|---|
| name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
| path | string (Required) | The path to the model file being uploaded. |
| framework | string (Required) | Set as the Framework.ONNX. |
| input_schema | pyarrow.lib.Schema (Optional) | The input schema in Apache Arrow schema format. |
| output_schema | pyarrow.lib.Schema (Optional) | The output schema in Apache Arrow schema format. |
| convert_wait | bool (Optional) (Default: True) | Not required for native runtimes. True: Waits in the script for the model conversion completion. False: Proceeds with the script without waiting for the model conversion process to display complete. |

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload ONNX Model Return

The following is returned with a successful model upload and conversion.

| Field | Type | Description |
|---|---|---|
| name | string | The name of the model. |
| version | string | The model version as a unique UUID. |
| file_name | string | The file name of the model as stored in Wallaroo. |
| SHA | string | The hash value of the model file. |
| Status | string | The status of the model. Values include: |
| image_path | string | The image used to deploy the model in the Wallaroo engine. |
| last_update_time | DateTime | When the model was last updated. |

For example:

model_name = "embedder-o"
model_path = "./embedder.onnx"

# The keyword is lowercase `framework`, matching the parameter table above.
embedder = wl.upload_model(model_name, model_path, framework=Framework.ONNX).configure("onnx")

ONNX Conversion Tips

When converting from one ML model type to an ONNX ML model, the input and output fields should be specified so users anticipate the exact field names used in their code. This prevents conversion naming formats from creating unintended names, and sets consistent field names that can be relied upon in future code updates.

The following example shows naming the input and output names when converting from a PyTorch model to an ONNX model. Note that the input fields are set to data, and the output fields are set to output_names = ["bounding-box", "classification","confidence"].

input_names = ["data"]
output_names = ["bounding-box", "classification","confidence"]
torch.onnx.export(model,
                    tensor,
                    pytorchModelPath+'.onnx',
                    input_names=input_names,
                    output_names=output_names,
                    opset_version=17,
                    )

See the documentation for the specific ML model being converted to ONNX for complete details.

Pipeline Deployment Configurations

Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.

This model will always run in the native runtime space.

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

# Parentheses allow the method chain to span multiple lines.
deployment_config = (
    DeploymentConfigBuilder()
    .cpus(0.25)
    .memory('1Gi')
    .build()
)

2.4.2 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Arbitrary Python

How to upload and use Arbitrary Python models with Wallaroo

Arbitrary Python or BYOP (Bring Your Own Predict) allows organizations to use Python scripts and supporting libraries as their own model. Similar to a Python step, Arbitrary Python is an even more robust and flexible tool for working with ML Models in Wallaroo pipelines.

| Parameter | Description |
|---|---|
| Web Site | https://www.python.org/ |
| Supported Libraries | python==3.8 |
| Framework | Framework.CUSTOM aka custom |
| Runtime | Containerized aka mlflow |

Arbitrary Python models, also known as Bring Your Own Predict (BYOP), allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python libraries need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

| Artifact | Type | Description |
|---|---|---|
| Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder | Python Script | Extend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there are no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below. |
| requirements.txt | Python requirements file | This sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This ensures that the script and methods will function exactly the same as during the model creation process. |
| Other artifacts | Files | Other models, files, and other artifacts used in support of this model. |

For example, if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.
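Where the `zip` command line tool is not available, the same archive can be approximated with Python's standard `zipfile` module. The following is a minimal sketch; the function name `zip_model_directory` is hypothetical, and unlike `zip -r` it stores file entries only, without separate directory entries.

```python
import zipfile
from pathlib import Path
from typing import List


def zip_model_directory(source_dir: str, zip_path: str) -> List[str]:
    """Zip source_dir recursively and return the archive member names.

    The directory name is kept as the top-level folder inside the archive,
    similar to `zip -r vgg_clustering.zip vgg_clustering/`.
    """
    source = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted(source.rglob("*")):
            if file_path.is_file():
                # Entry names are relative to the parent directory, so they
                # begin with the directory name (for example "vgg_clustering/").
                arcname = file_path.relative_to(source.parent).as_posix()
                archive.write(file_path, arcname)
        return archive.namelist()


# Example call (assumes a local vgg_clustering directory exists):
# zip_model_directory("vgg_clustering", "vgg_clustering.zip")
```

Write the ZIP file outside of the directory being archived so that the archive does not try to include itself.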

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects

| Object | Type | Description |
|---|---|---|
| model | Optional[Any] | An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs. |

mac.inference.Inference Methods

| Method | Returns | Description |
|---|---|---|
| expected_model_types (Required) | Set | Returns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects. |
| _predict (input_data: mac.types.InferenceData) (Required) | mac.types.InferenceData | The entry point for the Wallaroo inference, with the following input and output parameters that are defined when the model is uploaded. Input (mac.types.InferenceData): a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format. Output (mac.types.InferenceData): a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format. The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData. |
| raise_error_if_model_is_not_assigned | N/A | Error when expected_model_types is not set. |
| raise_error_if_model_is_wrong_type | N/A | Error when the model does not match the expected_model_types. |

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods

| Method | Returns | Description |
|---|---|---|
| create(config mac.config.inference.CustomInferenceConfig) (Required) | The custom Inference instance. | Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering\feature_extractor.h5 would be set with config.model_path / "feature_extractor.h5". The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types. |
| inference | custom Inference instance. | Returns the instantiated custom Inference object created from the create method. |
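
The relationship between the builder and the expected model types can be illustrated without the mac module. The sketch below uses stand-in classes (`SketchInference` and `SketchInferenceBuilder` are hypothetical names, not the mac classes) and a plain dict as the "model", purely to show how assigning a model of the wrong type is rejected.

```python
from typing import Any, Set


class SketchInference:
    """Stand-in for an Inference subclass; NOT mac.inference.Inference."""

    def __init__(self) -> None:
        self._model = None

    @property
    def expected_model_types(self) -> Set[type]:
        return {dict}  # assumption: a plain dict plays the role of the model

    @property
    def model(self) -> Any:
        return self._model

    @model.setter
    def model(self, model: Any) -> None:
        # Reject models whose type is not listed in expected_model_types.
        if not isinstance(model, tuple(self.expected_model_types)):
            raise TypeError(f"unexpected model type: {type(model).__name__}")
        self._model = model

    def predict(self, input_data: dict) -> dict:
        # Look up each input value in the dict "model".
        return {"predictions": [self._model[x] for x in input_data["inputs"]]}


class SketchInferenceBuilder:
    """Stand-in builder: takes one model and assigns it to the inference."""

    def create(self, model: Any) -> SketchInference:
        inference = SketchInference()
        inference.model = model  # triggers the type check in the setter
        return inference


inference = SketchInferenceBuilder().create({"a": 0, "b": 1})
print(inference.predict({"inputs": ["a", "b", "a"]}))  # {'predictions': [0, 1, 0]}
```

Passing a list instead of a dict to `create` raises `TypeError` in this sketch, which mirrors the role of the wrong-type check described above.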

Arbitrary Python Runtime

Arbitrary Python models always run in the containerized model runtime.

Upload Arbitrary Python Model

Arbitrary Python models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload Arbitrary Python Model Parameters

The following parameters are required for Arbitrary Python models. Note that while some fields are considered optional for the upload_model method, they are required for proper uploading of an Arbitrary Python model to Wallaroo.

| Parameter | Type | Description |
|---|---|---|
| name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
| path | string (Required) | The path to the model file being uploaded. |
| framework | string (Upload Method Optional, Arbitrary Python model Required) | Set as Framework.CUSTOM. |
| input_schema | pyarrow.lib.Schema (Upload Method Optional, Arbitrary Python model Required) | The input schema in Apache Arrow schema format. |
| output_schema | pyarrow.lib.Schema (Upload Method Optional, Arbitrary Python model Required) | The output schema in Apache Arrow schema format. |
| convert_wait | bool (Upload Method Optional, Arbitrary Python model Optional) (Default: True) | True: Waits in the script for the model conversion completion. False: Proceeds with the script without waiting for the model conversion process to display complete. |

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload Arbitrary Python Model Return

The following is returned with a successful model upload and conversion.

| Field | Type | Description |
|---|---|---|
| name | string | The name of the model. |
| version | string | The model version as a unique UUID. |
| file_name | string | The file name of the model as stored in Wallaroo. |
| image_path | string | The image used to deploy the model in the Wallaroo engine. |
| last_update_time | DateTime | When the model was last updated. |

Upload Arbitrary Python Model Example

The following example uploads an Arbitrary Python VGG16 Clustering ML Model to a Wallaroo instance.

Arbitrary Python Script Example

The following is an example script that fulfills the requirements for a Wallaroo Arbitrary Python Model, and would be saved as custom_inference.py.

"""This module features an example implementation of a custom Inference and its
corresponding InferenceBuilder."""

import pathlib
import pickle
from typing import Any, Set

import tensorflow as tf
from mac.config.inference import CustomInferenceConfig
from mac.inference import Inference
from mac.inference.creation import InferenceBuilder
from mac.types import InferenceData
from sklearn.cluster import KMeans


class ImageClustering(Inference):
    """Inference class for image clustering, that uses
    a pre-trained VGG16 model on cifar10 as a feature extractor
    and performs clustering on a trained KMeans model.

    Attributes:
        - feature_extractor: The embedding model we will use
        as a feature extractor (i.e. a trained VGG16).
        - expected_model_types: A set of model instance types that are expected by this inference.
        - model: The model on which the inference is calculated.
    """

    def __init__(self, feature_extractor: tf.keras.Model):
        self.feature_extractor = feature_extractor
        super().__init__()

    @property
    def expected_model_types(self) -> Set[Any]:
        return {KMeans}

    @Inference.model.setter  # type: ignore
    def model(self, model) -> None:
        """Sets the model on which the inference is calculated.

        :param model: A model instance on which the inference is calculated.

        :raises TypeError: If the model is not an instance of expected_model_types
            (i.e. KMeans).
        """
        self._raise_error_if_model_is_wrong_type(model) # this will make sure an error will be raised if the model is of wrong type
        self._model = model

    def _predict(self, input_data: InferenceData) -> InferenceData:
        """Calculates the inference on the given input data.
        This is the core function that each subclass needs to implement
        in order to calculate the inference.

        :param input_data: The input data on which the inference is calculated.
        It is of type InferenceData, meaning it comes as a dictionary of numpy
        arrays.

        :raises InferenceDataValidationError: If the input data is not valid.
        Ideally, every subclass should raise this error if the input data is not valid.

        :return: The output of the model, that is a dictionary of numpy arrays.
        """

        # input_data maps to the input_schema we have defined
        # with PyArrow, coming as a dictionary of numpy arrays
        inputs = input_data["images"]

        # Forward inputs to the models
        embeddings = self.feature_extractor(inputs)
        predictions = self.model.predict(embeddings.numpy())

        # Return predictions as dictionary of numpy arrays
        return {"predictions": predictions}


class ImageClusteringBuilder(InferenceBuilder):
    """InferenceBuilder subclass for ImageClustering, that loads
    a pre-trained VGG16 model on cifar10 as a feature extractor
    and a trained KMeans model, and creates an ImageClustering object."""

    @property
    def inference(self) -> ImageClustering:
        return ImageClustering

    def create(self, config: CustomInferenceConfig) -> ImageClustering:
        """Creates an Inference subclass and assigns a model and additionally
        needed attributes to it.

        :param config: Custom inference configuration. In particular, we're
        interested in `config.model_path` that is a pathlib.Path object
        pointing to the folder where the model artifacts are saved.
        Every artifact we need to load from this folder has to be
        relative to `config.model_path`.

        :return: A custom Inference instance.
        """
        feature_extractor = self._load_feature_extractor(
            config.model_path / "feature_extractor.h5"
        )
        inference = self.inference(feature_extractor)
        model = self._load_model(config.model_path / "kmeans.pkl")
        inference.model = model

        return inference

    def _load_feature_extractor(
        self, file_path: pathlib.Path
    ) -> tf.keras.Model:
        return tf.keras.models.load_model(file_path)

    def _load_model(self, file_path: pathlib.Path) -> KMeans:
        with open(file_path.as_posix(), "rb") as fp:
            model = pickle.load(fp)
        return model

The following is the requirements.txt file that would be included in the arbitrary python ZIP file. It is highly recommended to use the same requirements.txt file for setting the libraries and versions used to create the model in the arbitrary python ZIP file.

tensorflow==2.8.0
scikit-learn==1.2.2

Upload Arbitrary Python Example

The following example demonstrates uploading the arbitrary python model as vgg_clustering.zip with the following input and output schemas defined.

input_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=32
        ),
        list_size=32
    )),
])

output_schema = pa.schema([
    pa.field('predictions', pa.int64()),
])

model = wl.upload_model(
                        'vgg16-clustering', 
                        'vgg16_clustering.zip', 
                        framework=Framework.CUSTOM, 
                        input_schema=input_schema, 
                        output_schema=output_schema, 
                        convert_wait=True
                    )
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting.................Ready.

{
    'name': 'vgg16-clustering', 
    'version': '819114a4-c1a4-43b4-b66b-a486e05a867f', 
    'file_name': 'model-auto-conversion_BYOP_vgg16_clustering.zip', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3443', 
    'last_update_time': datetime.datetime(2023, 6, 28, 16, 54, 38, 299848, tzinfo=tzutc())
}

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This model will always run in the containerized runtime space.

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

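# Parentheses allow the method chain to span multiple lines.
# Gunicorn command line options use a leading "--" (for example --timeout).
deployment_config = (
    DeploymentConfigBuilder()
    .sidekick_cpus(sm_model, 0.25)
    .sidekick_memory(sm_model, '1Gi')
    .sidekick_env(sm_model,
        {"GUNICORN_CMD_ARGS": "--timeout=188 --workers=1"}
    )
    .build()
)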

2.4.3 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Containerized MLFlow

How to upload and use Containerized MLFlow with Wallaroo

| Parameter | Description |
|---|---|
| Web Site | https://mlflow.org |
| Supported Libraries | mlflow==1.30.0 |
| Runtime | Containerized aka mlflow |

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names must contain only ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.
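
A model name can be checked against this rule before registration. The sketch below is a hypothetical helper (`is_valid_model_name` is not part of the Wallaroo SDK) that encodes only the character rule stated above; stricter DNS label rules, such as length limits, are not checked here, and whether Wallaroo enforces them is not covered in this guide.

```python
import re

# ASCII letters, digits, and dashes only, per the naming rule stated above.
_MODEL_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """True when the name uses only ASCII alphanumerics and dashes."""
    return _MODEL_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_model_name("vgg16-clustering"))  # True
print(is_valid_model_name("vgg16_clustering"))  # False
print(is_valid_model_name("model.v1"))          # False
```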

Register a Containerized MLFlow Model

Containerized MLFlow models are not uploaded, but registered from a container registry service. This is performed through the Wallaroo Client .register_model_image(name, image).configure(options) method. For the options, the following must be defined:

  • runtime: Set as mlflow.
  • input_schema: The input schema in the Apache Arrow pyarrow.lib.Schema format.
  • output_schema: The output schema in the Apache Arrow pyarrow.lib.Schema format.

For example:

sm_input_schema = pa.schema([
  pa.field('temp', pa.float32()),
  pa.field('holiday', pa.uint8()),
  pa.field('workingday', pa.uint8()),
  pa.field('windspeed', pa.float32())
])

sm_output_schema = pa.schema([
    pa.field('predicted_mean', pa.float32())
])

statsmodelUrl = "ghcr.io/wallaroolabs/wallaroo_tutorials/mlflow-statsmodels-example:2023.1"

sm_model = wl.register_model_image(
    name=f"{prefix}statmodels",
    image=f"{statsmodelUrl}"
    ).configure("mlflow", 
            input_schema=sm_input_schema, 
            output_schema=sm_output_schema
    )

MLFlow Data Formats

When using containerized MLFlow models with Wallaroo, the inputs and outputs must be named. For example, the following output:

[-12.045839810372835]

Would need to be wrapped with the data values named:

[{"prediction": -12.045839810372835}]

A short sample code for wrapping data may be:

output_df = pd.DataFrame(prediction, columns=["prediction"])
return output_df
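
The same wrapping can be shown without pandas, which may help when checking the expected structure by hand. This is a minimal sketch; `name_outputs` is a hypothetical helper and the field name `prediction` is taken from the example above.

```python
from typing import Dict, List, Sequence


def name_outputs(values: Sequence[float], field: str = "prediction") -> List[Dict[str, float]]:
    """Wrap each bare value in a dict keyed by the output field name."""
    return [{field: value} for value in values]


# A single bare prediction becomes a one-element list holding a dict
# keyed by "prediction".
named = name_outputs([-12.045839810372835])
print(named)
```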

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This model will always run in the containerized runtime space.

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

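# Parentheses allow the method chain to span multiple lines.
# Gunicorn command line options use a leading "--" (for example --timeout).
deployment_config = (
    DeploymentConfigBuilder()
    .sidekick_cpus(sm_model, 0.25)
    .sidekick_memory(sm_model, '1Gi')
    .sidekick_env(sm_model,
        {"GUNICORN_CMD_ARGS": "--timeout=188 --workers=1"}
    )
    .build()
)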

2.4.4 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Model Registry Services

How to upload and use Registry ML Models with Wallaroo

Wallaroo users can register their trained machine learning models from a model registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

This guide details how to add ML Models from a model registry service into a Wallaroo instance.

Artifact Requirements

Models are uploaded to the Wallaroo instance as the specific artifact - the “file” or other data that represents the model itself. This must comply with the Wallaroo model framework and version requirements or it will not be deployed. Note that models that fall outside of the supported model types can be registered to a Wallaroo workspace as MLFlow 1.30.0 containerized models.

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details, see the guide for the specific framework, which states the runtime that framework runs in.

| Runtime Display | Model Runtime Space | Pipeline Configuration |
|---|---|---|
| tensorflow | Native | Native Runtime Configuration Methods |
| onnx | Native | Native Runtime Configuration Methods |
| python | Native | Native Runtime Configuration Methods |
| mlflow | Containerized | Containerized Runtime Deployment |

Please note the following.

Wallaroo natively supports Open Neural Network Exchange (ONNX) models in the Wallaroo engine.

| Parameter | Description |
|---|---|
| Web Site | https://onnx.ai/ |
| Supported Libraries | See table below. |
| Framework | Framework.ONNX aka onnx |
| Runtime | Native aka onnx |

The following ONNX model versions are supported:

| Wallaroo Version | ONNX Version | ONNX IR Version | ONNX OPset Version | ONNX ML Opset Version |
|---|---|---|---|---|
| 2023.2.1 (July 2023) | 1.12.1 | 8 | 17 | 3 |
| 2023.2 (May 2023) | 1.12.1 | 8 | 17 | 3 |
| 2023.1 (March 2023) | 1.12.1 | 8 | 17 | 3 |
| 2022.4 (December 2022) | 1.12.1 | 8 | 17 | 3 |
| After April 2022 until release 2022.4 (December 2022) | 1.10.* | 7 | 15 | 2 |
| Before April 2022 | 1.6.* | 7 | 13 | 2 |

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inferences performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference (for example, in.tensor). In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

| | tensor |
|---|---|
| 0 | [-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439] |
| 1 | [-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439] |
| 2 | [-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439] |
| 3 | [-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439] |
| 4 | [0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311] |

OUTPUT

| | time | in.tensor | out.dense_1 | check_failures |
|---|---|---|---|---|
| 0 | 2023-11-17 20:34:17.005 | [-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439] | [0.99300325] | 0 |
| 1 | 2023-11-17 20:34:17.005 | [-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439] | [0.99300325] | 0 |
| 2 | 2023-11-17 20:34:17.005 | [-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439] | [0.99300325] | 0 |
| 3 | 2023-11-17 20:34:17.005 | [-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439] | [0.99300325] | 0 |
| 4 | 2023-11-17 20:34:17.005 | [0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311] | [0.0010916889] | 0 |

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a tensor of shape (2, 2, 3): it holds 2 elements, each element has 2 rows, and each row has a shape of (3,).

t = [
    [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]
]

Every element of a tensor must have the same shape. Tensors with elements of different shapes, known as ragged tensors, are not supported. For example, the second element below has a shape of (3,) while the others have a shape of (2,):

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
]

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.
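
The shape rule above can be checked before an inference request is sent. The following is a minimal sketch in plain Python, not part of the Wallaroo SDK; the helper name tensor_shape is hypothetical. It walks a nested list, returns its shape, and raises an error as soon as two sibling elements disagree.

```python
from typing import Any, Tuple


def tensor_shape(value: Any) -> Tuple[int, ...]:
    """Return the shape of a nested list, or raise ValueError if it is ragged."""
    if not isinstance(value, list):
        return ()  # a scalar leaf has an empty shape
    if not value:
        return (0,)
    child_shapes = [tensor_shape(item) for item in value]
    first = child_shapes[0]
    for index, shape in enumerate(child_shapes):
        if shape != first:
            raise ValueError(
                f"ragged tensor: element {index} has shape {shape}, expected {first}"
            )
    return (len(value),) + first


valid = [[2.35, 5.75], [3.72, 8.55], [5.55, 97.2]]
ragged = [[2.35, 5.75], [3.72, 8.55, 10.5], [5.55, 97.2]]

print(tensor_shape(valid))  # (3, 2)

try:
    tensor_shape(ragged)
except ValueError as err:
    print(f"rejected: {err}")
```

A check like this only confirms that the data is rectangular; whether the resulting shape matches what a particular model expects still depends on that model.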

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df

| | t | s |
|---|---|---|
| 0 | [2.35, 5.75, 19.2] | [Bob, Nancy, Wani] |
| 1 | [5.55, 7.2, 15.7] | [Jason, Rita, Phoebe] |
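
As a rough illustration of the consistency rule, the sketch below collects the Python types found inside one column of nested lists and reports whether there is exactly one. The helper names leaf_types and has_consistent_type are hypothetical and not part of any Wallaroo library. Note that plain Python treats int and float as different types, so a column mixing 1 and 1.0 would also be flagged.

```python
from typing import Any, Set


def leaf_types(value: Any) -> Set[type]:
    """Collect the Python types of every scalar inside a nested list."""
    if isinstance(value, list):
        found: Set[type] = set()
        for item in value:
            found |= leaf_types(item)
        return found
    return {type(value)}


def has_consistent_type(column: list) -> bool:
    """True when every scalar in the column shares a single type."""
    return len(leaf_types(column)) <= 1


floats_only = [[2.35, 5.75], [3.72, 8.55], [5.55, 97.2]]
mixed = [[2.35, "Bob"], [3.72, "Nancy"], [5.55, "Wani"]]

print(has_consistent_type(floats_only))  # True
print(has_consistent_type(mixed))        # False
```

Applied per column, the t and s columns of the DataFrame above each pass, because floats and strings are never mixed within a single column.
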

| Parameter | Description |
|---|---|
| Web Site | https://www.tensorflow.org/ |
| Supported Libraries | tensorflow==2.9.1 |
| Framework | Framework.TENSORFLOW aka tensorflow |
| Runtime | Native aka tensorflow |
| Supported File Types | SavedModel format as .zip file |

TensorFlow File Format

TensorFlow models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

ML models that meet the TensorFlow SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.
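
Where the zip command line tool is not available, the same archive can be produced from Python. The sketch below uses only the standard library; the function name zip_saved_model is hypothetical. It keeps the model directory as the top-level entry in the archive, which mirrors what zip -r alohacnnlstm.zip alohacnnlstm/ produces.

```python
import shutil
from pathlib import Path


def zip_saved_model(model_dir: str) -> Path:
    """Create <model_dir>.zip beside the directory, with the directory as top-level entry."""
    source = Path(model_dir).resolve()
    archive = shutil.make_archive(
        base_name=str(source),        # archive path without the .zip suffix
        format="zip",
        root_dir=str(source.parent),  # entries are stored relative to this folder
        base_dir=source.name,         # only this directory is added
    )
    return Path(archive)


# Example call (assumes a SavedModel directory named alohacnnlstm exists):
# zip_saved_model("alohacnnlstm")  # writes alohacnnlstm.zip
```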

| Parameter | Description |
|---|---|
| Web Site | https://www.python.org/ |
| Supported Libraries | python==3.8 |
| Framework | Framework.PYTHON aka python |
| Runtime | Native aka python |

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP), which allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, TensorFlow, etc.) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python libraries need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it receives the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then its return value is the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from an ML model whose output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

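import pandas as pd

def wallaroo_json(data: pd.DataFrame):
    print(data)  # log the DataFrame received from the previous step
    # Return the first value of the first dense_2 row as the "output" field.
    return [{"output": [data["dense_2"].to_list()[0][0]]}]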

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, the output field appears as the out.output field in the Wallaroo inference result.

| | time | in.tensor | out.output | check_failures |
|---|---|---|---|---|
| 0 | 2023-06-20 20:23:28.395 | [0.6878518042, 0.1760734021, -0.869514083, 0.3.. | [12.886651039123535] | 0 |

| Parameter | Description |
|---|---|
| Web Site | https://huggingface.co/models |
| Supported Libraries | transformers==4.27.0, diffusers==0.14.0, accelerate==0.18.0, torchvision==0.14.1, torch==1.13.1 |
| Runtime | Containerized aka tensorflow / mlflow |

Frameworks: The following Hugging Face pipelines are supported by Wallaroo.

  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING_FACE_IMAGE_TO_TEXT
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION
  • Framework.HUGGING_FACE_SUMMARIZATION
  • Framework.HUGGING_FACE_TRANSLATION

Additional inputs added to these Hugging Face pipelines will be passed as key/value pair arguments to the model's generate method. If an argument is not supplied, the model defaults to the values coded in the original Hugging Face model's source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_FEATURE_EXTRACTION | |

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_IMAGE_CLASSIFICATION | |

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_IMAGE_SEGMENTATION | |

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_IMAGE_TO_TEXT | |

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_OBJECT_DETECTION | |

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_QUESTION_ANSWERING | |

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG | |

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_SUMMARIZATION | |

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_TEXT_CLASSIFICATION | |

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but this may lead to worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but this may lead to worse performance
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_TRANSLATION | |

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION | |

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but this may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but this may result in slightly worse performance
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION | |

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION | |

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_SENTIMENT_ANALYSIS | Hugging Face Sentiment Analysis |

| Wallaroo Framework | Reference |
|---|---|
| Framework.HUGGING_FACE_TEXT_GENERATION | |

Any parameter that is not part of the required inputs list will be forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])

| Parameter | Description |
|---|---|
| Web Site | https://pytorch.org/ |
| Supported Libraries | torch==1.13.1, torchvision==0.14.1 |
| Framework | Framework.PYTORCH aka pytorch |
| Supported File Types | pt or pth in TorchScript format |
| Runtime | Containerized aka mlflow |

Sci-kit Learn aka SKLearn.

| Parameter | Description |
|---|---|
| Web Site | https://scikit-learn.org/stable/index.html |
| Supported Libraries | scikit-learn==1.2.2 |
| Framework | Framework.SKLEARN aka sklearn |
| Runtime | Containerized aka tensorflow / mlflow |

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this ensures that the data is submitted in the exact order the model is trained to accept.

Original DataFrame:

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |

Converted DataFrame:

| | inputs |
|---|---|
| 0 | [5.1, 3.5, 1.4, 0.2] |
| 1 | [4.9, 3.0, 1.4, 0.2] |
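
The conversion from labeled columns to a single inputs array can be sketched in plain Python as follows. This is an illustration only: the helper name pack_rows and the explicit column_order list are assumptions, not part of the Wallaroo SDK. Passing the column order explicitly is what guarantees the values land in the order the model was trained on, even when the incoming records list their keys differently.

```python
from typing import Dict, List, Sequence


def pack_rows(
    rows: List[Dict[str, float]], column_order: Sequence[str]
) -> List[Dict[str, List[float]]]:
    """Turn labeled rows into single-array records in a fixed column order."""
    return [{"inputs": [row[name] for name in column_order]} for row in rows]


columns = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]
rows = [
    {"sepal length (cm)": 5.1, "sepal width (cm)": 3.5,
     "petal length (cm)": 1.4, "petal width (cm)": 0.2},
    # Keys arrive in a different order here; the packed array is still ordered.
    {"petal width (cm)": 0.2, "sepal length (cm)": 4.9,
     "sepal width (cm)": 3.0, "petal length (cm)": 1.4},
]

for record in pack_rows(rows, columns):
    print(record)
# {'inputs': [5.1, 3.5, 1.4, 0.2]}
# {'inputs': [4.9, 3.0, 1.4, 0.2]}
```

With pandas, a comparable result can be obtained by selecting the columns in the trained order and converting each row to a list, for example with df[columns].values.tolist().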

SKLearn Schema Outputs

SKLearn outputs that represent predictions or probabilities are labeled in the output schema defined when the model is uploaded to Wallaroo. For example, a model that outputs either 1 or 0 would have the following output schema:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)

| | time | in.inputs | out.predictions | check_failures |
|---|---|---|---|---|
| 0 | 2023-07-05 15:11:29.776 | [5.1, 3.5, 1.4, 0.2] | 0 | 0 |
| 1 | 2023-07-05 15:11:29.776 | [4.9, 3.0, 1.4, 0.2] | 0 | 0 |

| Parameter | Description |
|---|---|
| Web Site | https://www.tensorflow.org/api_docs/python/tf/keras/Model |
| Supported Libraries | tensorflow==2.8.0, keras==1.1.0 |
| Framework | Framework.KERAS aka keras |
| Supported File Types | SavedModel format as .zip file and HDF5 format |
| Runtime | Containerized aka mlflow |

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 format for TensorFlow Keras models.

| Parameter | Description |
|---|---|
| Web Site | https://xgboost.ai/ |
| Supported Libraries | xgboost==1.7.4 |
| Framework | Framework.XGBOOST aka xgboost |
| Supported File Types | pickle (XGB files are not supported.) |
| Runtime | Containerized aka tensorflow / mlflow |

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this ensures that the data is submitted in the exact order the model is trained to accept.

Original DataFrame:

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |

Converted DataFrame:

| | inputs |
|---|---|
| 0 | [5.1, 3.5, 1.4, 0.2] |
| 1 | [4.9, 3.0, 1.4, 0.2] |

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])

pipeline.infer(dataframe)

| | time | in.inputs | out.output | check_failures |
|---|---|---|---|---|
| 0 | 2023-07-05 15:11:29.776 | [5.1, 3.5, 1.4, 0.2] | 0 | 0 |
| 1 | 2023-07-05 15:11:29.776 | [4.9, 3.0, 1.4, 0.2] | 0 | 0 |

| Parameter | Description |
|---|---|
| Web Site | https://www.python.org/ |
| Supported Libraries | python==3.8 |
| Framework | Framework.CUSTOM aka custom |
| Runtime | Containerized aka mlflow |

Arbitrary Python models, also known as Bring Your Own Predict (BYOP), allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, TensorFlow, etc.) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python libraries need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

| Artifact | Type | Description |
|---|---|---|
| Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder | Python Script | Extend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there are no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below. |
| requirements.txt | Python requirements file | This sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This ensures that the script and methods will function exactly the same as during the model creation process. |
| Other artifacts | Files | Other models, files, and other artifacts used in support of this model. |

For example, if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.
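
Before uploading, it can be useful to confirm that the archive holds the pieces listed in the table above. The sketch below is a hypothetical pre-upload check written with the standard library only; check_byop_archive is not a Wallaroo function. It looks for a requirements.txt and at least one .py file, and does not verify that the scripts actually extend the required classes.

```python
import zipfile
from typing import List


def check_byop_archive(zip_path: str) -> List[str]:
    """Return a list of problems found in a BYOP .zip; an empty list means none found."""
    with zipfile.ZipFile(zip_path) as archive:
        # Compare on file names only, so a top-level folder is tolerated.
        names = [name.rsplit("/", 1)[-1] for name in archive.namelist()]
    problems = []
    if "requirements.txt" not in names:
        problems.append("missing requirements.txt")
    if not any(name.endswith(".py") for name in names):
        problems.append("no .py script found")
    return problems


# Example call (assumes vgg_clustering.zip was created as described above):
# print(check_byop_archive("vgg_clustering.zip"))
```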

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects

| Object | Type | Description |
|---|---|---|
| model | Optional[Any] | An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs. |

mac.inference.Inference Methods

| Method | Returns | Description |
|---|---|---|
| expected_model_types (Required) | Set | Returns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects. |
| _predict (input_data: mac.types.InferenceData) (Required) | mac.types.InferenceData | The entry point for the Wallaroo inference, with input and output parameters that are defined when the model is uploaded. Input (mac.types.InferenceData): a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format. Output (mac.types.InferenceData): a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format. The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData. |
| raise_error_if_model_is_not_assigned | N/A | Error when expected_model_types is not set. |
| raise_error_if_model_is_wrong_type | N/A | Error when the model does not match the expected_model_types. |

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.
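The type check described above can be illustrated with plain Python stand-ins. This is only a sketch under stated assumptions: SketchInference, SketchBuilder, and DummyModel are hypothetical names that do not come from the mac library, and only expected_model_types mirrors a name used in this guide.

```python
from typing import Any, Optional, Set


class DummyModel:
    """Hypothetical stand-in for a loaded model object."""


class SketchInference:
    """Stand-in for an Inference subclass (not the real mac class)."""

    def __init__(self) -> None:
        self._model: Optional[Any] = None

    @property
    def expected_model_types(self) -> Set[type]:
        # Typically a set of one, as described above.
        return {DummyModel}

    @property
    def model(self) -> Optional[Any]:
        return self._model

    @model.setter
    def model(self, candidate: Any) -> None:
        # Reject any model whose type is not in the expected set.
        if not isinstance(candidate, tuple(self.expected_model_types)):
            raise TypeError(f"unexpected model type: {type(candidate).__name__}")
        self._model = candidate


class SketchBuilder:
    """Stand-in for an InferenceBuilder: takes one model and assigns it."""

    def create(self, model: Any) -> SketchInference:
        inference = SketchInference()
        inference.model = model  # the setter performs the type check
        return inference
```

In this sketch, SketchBuilder().create(DummyModel()) succeeds, while passing any other object raises a TypeError at assignment time.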

mac.inference.creation.InferenceBuilder Methods
Method | Returns | Description
create(config mac.config.inference.CustomInferenceConfig) (Required) | The custom Inference instance. | Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering/feature_extractor.h5 would be set with config.model_path / "feature_extractor.h5". The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inference | custom Inference instance. | Returns the instantiated custom Inference object created from the create method.
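Because config.model_path is described as a pathlib.Path, artifact paths are built with pathlib's "/" operator. The following minimal sketch shows only that join; the model_path value is a hypothetical stand-in for what Wallaroo would supply, and PurePosixPath is used here just to keep the printed separator the same on every platform.

```python
from pathlib import PurePosixPath

# Hypothetical value; in Wallaroo this would come from config.model_path.
model_path = PurePosixPath("vgg_clustering")

# The "/" operator joins path segments relative to the model folder.
artifact_path = model_path / "feature_extractor.h5"

print(artifact_path)  # vgg_clustering/feature_extractor.h5
```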

Arbitrary Python Runtime

Arbitrary Python models always run in the containerized model runtime.

Parameter | Description
Web Site | https://mlflow.org
Supported Libraries | mlflow==1.30.0
Runtime | Containerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the wallaroo.framework.Framework class. The following demonstrates listing all supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

Registry Services Roles

Registry service use in Wallaroo typically falls under the following roles.

Role | Recommended Actions | Description
DevOps Engineer | Create Model Registry | Create the model (AKA artifact) registry service.
DevOps Engineer | Retrieve Model Registry Tokens | Generate the model registry service credentials.
MLOps Engineer | Connect Model Registry to Wallaroo | Add the Registry Service URL and credentials into a Wallaroo instance for use by other users and scripts.
MLOps Engineer | Add Wallaroo Registry Service to Workspace | Add the registry service configuration to a Wallaroo workspace for use by workspace users.
Data Scientist | List Registries in a Workspace | List registries available from a workspace.
Data Scientist | List Models in Registry | List available models in a model registry.
Data Scientist | List Model Versions of Registered Model | List versions of a registry stored model.
Data Scientist | List Model Version Artifacts | Retrieve the artifacts (usually files) for a model stored in a model registry.
Data Scientist | Upload Model from Registry | Upload a model and artifacts stored in a model registry into a Wallaroo workspace.

Model Registry Operations

The following links to guides and information on setting up a model registry (also known as an artifact registry).

Create Model Registry

See Model serving with Azure Databricks for setting up a model registry service using Azure Databricks.

The following steps create an Access Token used to authenticate to an Azure Databricks Model Registry.

  1. Log into the Azure Databricks workspace.
  2. From the upper right corner access the User Settings.
  3. From the Access tokens, select Generate new token.
  4. Specify any token description and lifetime. Once complete, select Generate.
  5. Copy the token and store it in a secure place. Once the Generate New Token module is closed, the token will not be retrievable.

The MLflow Model Registry provides a method of setting up a model registry service. Full details can be found at the MLflow Registry Quick Start Guide.

A generic MLFlow model registry requires no token.

Wallaroo Registry Operations

  • Connect Model Registry to Wallaroo: This details the link and connection information to an existing MLFlow registry service. Note that this does not create an MLFlow registry service, but adds the connection and credentials to Wallaroo to allow that MLFlow registry service to be used by other entities in the Wallaroo instance.
  • Add a Registry to a Workspace: Add the created Wallaroo Model Registry to a workspace to make it available to other workspace members.
  • Remove a Registry from a Workspace: Remove the link between a Wallaroo Model Registry and a Wallaroo workspace.

Connect Model Registry to Wallaroo

MLFlow Registry connection information is added to a Wallaroo instance through the Wallaroo.Client.create_model_registry method.

Connect Model Registry to Wallaroo Parameters

Parameter | Type | Description
name | string (Required) | The name of the MLFlow Registry service.
token | string (Required) | The authentication token used to authenticate to the MLFlow Registry.
url | string (Required) | The URL of the MLFlow registry service.

Connect Model Registry to Wallaroo Return

The following is returned when an MLFlow Registry is successfully created.

Field | Type | Description
Name | string | The name of the MLFlow Registry service.
URL | string | The URL for connecting to the service.
Workspaces | List[string] | The name of all workspaces this registry was added to.
Created At | DateTime | When the registry was added to the Wallaroo instance.
Updated At | DateTime | When the registry was last updated.

Note that the token is not displayed for security reasons.

Connect Model Registry to Wallaroo Example

The following example creates a Wallaroo MLFlow Registry with the name ExampleNotebook stored in a sample Azure Databricks environment.

wl.create_model_registry(name="ExampleNotebook", 
                        token="abcdefg-3", 
                        url="https://abcd-123489.456.azuredatabricks.net")

Field | Value
Name | ExampleNotebook
URL | https://abcd-123489.456.azuredatabricks.net
Workspaces | sample.user@wallaroo.ai - Default Workspace
Created At | 2023-27-Jun 13:57:26
Updated At | 2023-27-Jun 13:57:26

Add Registry to Workspace

Registries are assigned to a Wallaroo workspace with the Wallaroo.registry.add_registry_to_workspace method. This allows members of the workspace to access the registry connection. A registry can be associated with one or more workspaces.

Add Registry to Workspace Parameters

Parameter | Type | Description
workspace_id | Integer (Required) | The numerical identifier of the workspace.

Add Registry to Workspace Returns

The following is returned when an MLFlow Registry is successfully added to a workspace.

Field | Type | Description
Name | string | The name of the MLFlow Registry service.
URL | string | The URL for connecting to the service.
Workspaces | List[string] | The name of all workspaces this registry was added to.
Created At | DateTime | When the registry was added to the Wallaroo instance.
Updated At | DateTime | When the registry was last updated.

Add Registry to Workspace Example

registry.add_registry_to_workspace(workspace_id=workspace_id)

Field | Value
Name | ExampleNotebook
URL | https://abcd-123489.456.azuredatabricks.net
Workspaces | sample.user@wallaroo.ai - Default Workspace
Created At | 2023-27-Jun 13:57:26
Updated At | 2023-27-Jun 13:57:26

Remove Registry from Workspace

Registries are removed from a Wallaroo workspace with the Registry remove_registry_from_workspace method.

Remove Registry from Workspace Parameters

Parameter | Type | Description
workspace_id | Integer (Required) | The numerical identifier of the workspace.

Remove Registry from Workspace Return

Field | Type | Description
Name | string | The name of the MLFlow Registry service.
URL | string | The URL for connecting to the service.
Workspaces | List[string] | A list of workspaces by name that still contain the registry.
Created At | DateTime | When the registry was added to the Wallaroo instance.
Updated At | DateTime | When the registry was last updated.

Remove Registry from Workspace Example

registry.remove_registry_from_workspace(workspace_id=workspace_id)

Field | Value
Name | JeffRegistry45
URL | https://sample.registry.azuredatabricks.net
Workspaces | john.hummel@wallaroo.ai - Default Workspace
Created At | 2023-17-Jul 17:56:52
Updated At | 2023-17-Jul 17:56:52

Wallaroo Registry Model Operations

  • List Registries in a Workspace: List the available registries in the current workspace.
  • List Models: List Models in a Registry
  • Upload Model: Upload a version of a ML Model from the Registry to a Wallaroo workspace.
  • List Model Versions: List the versions of a particular model.
  • Remove Registry from Workspace: Remove a specific Registry configuration from a specific workspace.

List Registries in a Workspace

Registries associated with a workspace are listed with the Wallaroo.Client.list_model_registries() method. This lists all registries associated with the current workspace.

List Registries in a Workspace Parameters

None

List Registries in a Workspace Returns

A List of Registries with the following fields.

Field | Type | Description
Name | string | The name of the MLFlow Registry service.
URL | string | The URL for connecting to the service.
Created At | DateTime | When the registry was added to the Wallaroo instance.
Updated At | DateTime | When the registry was last updated.

List Registries in a Workspace Example

wl.list_model_registries()

name | registry url | created at | updated at
gib | https://sampleregistry.wallaroo.ai | 2023-27-Jun 03:22:46 | 2023-27-Jun 03:22:46
ExampleNotebook | https://sampleregistry.wallaroo.ai | 2023-27-Jun 13:57:26 | 2023-27-Jun 13:57:26

List Models in a Registry

Models available to the Wallaroo instance through the MLFlow Registry are listed with the Wallaroo.Registry.list_models() method.

List Models in a Registry Parameters

None

List Models in a Registry Returns

A List of models with the following fields.

Field | Type | Description
Name | string | The name of the model.
Registry User | string | The user account that is tied to the registry service for this model.
Versions | int | The number of versions for the model, starting at 0.
Created At | DateTime | When the registry was added to the Wallaroo instance.
Updated At | DateTime | When the registry was last updated.

List Models in a Registry Example

registry.list_models()

Name | Registry User | Versions | Created At | Updated At
testmodel | sample.user@wallaroo.ai | 0 | 2023-16-Jun 14:38:42 | 2023-16-Jun 14:38:42
testmodel2 | sample.user@wallaroo.ai | 0 | 2023-16-Jun 14:41:04 | 2023-16-Jun 14:41:04
wine_quality | sample.user@wallaroo.ai | 2 | 2023-16-Jun 15:05:53 | 2023-16-Jun 15:09:57

Retrieve Specific Model Details from the Registry

Model details are retrieved by calling the Wallaroo.Registry.list_models() method and selecting an element from the returned list, which is then saved as a Registered Model object.

The following will return the most recent model added to the MLFlow Registry service.

mlflow_model = registry.list_models()[-1]
mlflow_model

Field | Type | Description
Name | string | The name of the model.
Registry User | string | The user account that is tied to the registry service for this model.
Versions | int | The number of versions for the model, starting at 0.
Created At | DateTime | When the registry was added to the Wallaroo instance.
Updated At | DateTime | When the registry was last updated.

List Model Versions of Registered Model

MLFlow registries can contain multiple versions of an ML Model. These are listed with the Registered Model versions attribute. The versions are listed in reverse order of insertion, with the most recent model version in position 0.

List Model Versions of Registered Model Parameters

None

List Model Versions of Registered Model Returns

A List of the Registered Model Versions with the following fields.

Field | Type | Description
Name | string | The name of the model.
Version | int | The version number. The higher numbers are the most recent.
Description | string | The registered model’s description from the MLFlow Registry service.

List Model Versions of Registered Model Example

The following will return the most recent model added to the MLFlow Registry service and list its versions.

mlflow_model = registry.list_models()[-1]
mlflow_model.versions

Name | Version | Description
wine_quality | 2 | None
wine_quality | 1 | None

List Model Version Artifacts

Artifacts belonging to an MLFlow registry model are listed with the Model Version list_artifacts() method. This returns all artifacts for the model.

List Model Version Artifacts Parameters

None

List Model Version Artifacts Returns

A List of artifacts with the following fields.

Field | Type | Description
file_name | string | The name assigned to the artifact.
file_size | string | The size of the artifact in bytes.
full_path | string | The path of the artifact. This will be used to upload the artifact to Wallaroo.

List Model Version Artifacts Example

The following will list the artifacts in a single registry model.

single_registry_model.versions[0].list_artifacts()

File Name | File Size | Full Path
MLmodel | 546B | https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/MLmodel
conda.yaml | 182B | https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/conda.yaml
model.pkl | 1429B | https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl
python_env.yaml | 122B | https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/python_env.yaml
requirements.txt | 73B | https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/requirements.txt
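Since the full_path of an artifact is what gets passed on for upload, it can help to pick one artifact out of the listing by file name. The sketch below assumes the listing has been copied into plain dictionaries keyed by the documented field names; how the real list_artifacts() result is accessed is not shown in this guide, and both find_artifact_path and the shortened placeholder paths are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical records using the documented field names; the paths are
# shortened placeholders, not real registry URLs.
artifacts: List[Dict[str, str]] = [
    {"file_name": "MLmodel", "file_size": "546B",
     "full_path": "registry://testmodel2/MLmodel"},
    {"file_name": "model.pkl", "file_size": "1429B",
     "full_path": "registry://testmodel2/model.pkl"},
]


def find_artifact_path(records: List[Dict[str, str]],
                       file_name: str) -> Optional[str]:
    """Return the full_path of the first record whose file_name matches."""
    for record in records:
        if record["file_name"] == file_name:
            return record["full_path"]
    return None  # no artifact with that name


model_path = find_artifact_path(artifacts, "model.pkl")
```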

Upload a Model from a Registry

Models are uploaded from an MLFlow Registry to the Wallaroo workspace with the Wallaroo.Registry.upload_model method.

Upload a Model from a Registry Parameters

Parameter | Type | Description
name | string (Required) | The name to assign the model once uploaded. Model names are unique within a workspace. Models assigned the same name as an existing model will be uploaded as a new model version.
path | string (Required) | The full path to the model artifact in the registry.
framework | string (Required) | The Wallaroo model Framework. See Model Uploads and Registrations Supported Frameworks.
input_schema | pyarrow.lib.Schema (Required for non-native runtimes) | The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema (Required for non-native runtimes) | The output schema in Apache Arrow schema format.

Upload a Model from a Registry Returns

The registry model details as follows.

Field | Type | Description
Name | string | The name of the model.
Version | string | The version registered in the Wallaroo instance in UUID format.
File Name | string | The file name associated with the ML Model in the Wallaroo instance.
SHA | string | The model's hash value.
Status | string | The status of the model from the following list.
  • pending_conversion: The model is uploaded to Wallaroo and is ready to convert.
  • converting: The model is being converted into a Wallaroo supported runtime.
  • ready: The model is ready and available for use.
  • error: The model conversion has failed. Check error messages and verify the model is the correct version and framework.
Image Path | string | The image used for the containerization of the model.
Updated At | DateTime | When the model was last updated.

Upload a Model from a Registry Example

The following uploads a model artifact from the registry into the current Wallaroo workspace with the SKLEARN framework.

import pyarrow as pa
from wallaroo.framework import Framework

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32(), list_size=4))
])

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

model = registry.upload_model(
  name="sklearnonnx", 
  path="https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl", 
  framework=Framework.SKLEARN,
  input_schema=input_schema,
  output_schema=output_schema)
  
Name | sklearnonnx
Version | 63bd932d-320d-4084-b972-0cfe1a943f5a
File Name | model.pkl
SHA | 970da8c178e85dfcbb69fab7bad0fb58cd0c2378d27b0b12cc03a288655aa28d
Status | pending_conversion
Image Path | None
Updated At | 2023-05-Jul 19:14:49

Retrieve Model Status

The model status is retrieved with the Model status() method.

Retrieve Model Status Parameters

None

Retrieve Model Status Returns

Field | Type | Description
status | string | The current status of the uploaded model.
  • pending_conversion: The model is uploaded to Wallaroo and is ready to convert.
  • converting: The model is being converted into a Wallaroo supported runtime.
  • ready: The model is ready and available for use.
  • error: The model conversion has failed. Check error messages and verify the model is the correct version and framework.

Retrieve Model Status Example

The following demonstrates checking the status in a while loop until the model shows either ready or error.

import time
while model.status() != "ready" and model.status() != "error":
    print(model.status())
    time.sleep(3)
print(model.status())

converting
converting
ready
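A polling loop like the one above runs indefinitely if the model never reaches ready or error. The following sketch adds a timeout; wait_for_model is a hypothetical helper (not part of the Wallaroo SDK) that accepts any zero-argument callable returning the status string, so model.status could be passed in directly.

```python
import time
from typing import Callable


def wait_for_model(get_status: Callable[[], str],
                   timeout_s: float = 600.0,
                   interval_s: float = 3.0) -> str:
    """Poll get_status until it returns 'ready' or 'error', or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()  # one status call per iteration
        if status in ("ready", "error"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"model still '{status}' after {timeout_s} seconds"
            )
        time.sleep(interval_s)
```

Usage under the same assumption would look like final_status = wait_for_model(model.status). The 600 second default simply mirrors the "up to 10 minutes" conversion time mentioned elsewhere in this guide.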

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime Display | Model Runtime Space | Pipeline Configuration
tensorflow | Native | Native Runtime Configuration Methods
onnx | Native | Native Runtime Configuration Methods
python | Native | Native Runtime Configuration Methods
mlflow | Containerized | Containerized Runtime Deployment
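The table above can be expressed as a small lookup when a script needs to decide which kind of deployment configuration to build. This is a sketch only: runtime_space is a hypothetical helper, and it assumes the string returned by config().runtime() is one of the four values listed in the table.

```python
# Runtime display values taken from the table above.
NATIVE_RUNTIMES = {"tensorflow", "onnx", "python"}
CONTAINERIZED_RUNTIMES = {"mlflow"}


def runtime_space(runtime: str) -> str:
    """Map a runtime display value to its model runtime space."""
    if runtime in NATIVE_RUNTIMES:
        return "native"
    if runtime in CONTAINERIZED_RUNTIMES:
        return "containerized"
    # Anything else is outside the table, so fail loudly.
    raise ValueError(f"unknown runtime: {runtime!r}")
```

A script could then branch on runtime_space(model.config().runtime()) to choose between the native and containerized configuration examples.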

For example, uploading an ONNX model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

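from wallaroo.deployment_config import DeploymentConfigBuilder

# Parentheses allow the chained builder calls to span multiple lines.
deployment_config = (DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .build())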

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

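from wallaroo.deployment_config import DeploymentConfigBuilder

# sm_model is the containerized model that receives the resources.
deployment_config = (DeploymentConfigBuilder()
                     .sidekick_cpus(sm_model, 0.25)
                     .sidekick_memory(sm_model, '1Gi')
                     .sidekick_env(sm_model,
                         {"GUNICORN_CMD_ARGS":
                          "--timeout=188 --workers=1"}
                     )
                     .build())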

2.4.5 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Python Models

How to upload and use Python Models as Wallaroo Pipeline Steps

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.
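A model name can be checked against this rule before uploading. The sketch below tests only what is stated above (ASCII letters, digits, and dashes); is_valid_model_name is a hypothetical helper, and Kubernetes naming rules can be stricter (for example on case and length), which is not checked here because this guide does not spell those limits out.

```python
import re

# One or more ASCII letters, digits, or dashes; nothing else.
_MODEL_NAME = re.compile(r"[A-Za-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True if name uses only ASCII alphanumerics and dashes."""
    return _MODEL_NAME.fullmatch(name) is not None


print(is_valid_model_name("python-step"))   # True
print(is_valid_model_name("python_step"))   # False: underscore
print(is_valid_model_name("model.v1"))      # False: period
```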

Python scripts are uploaded to Wallaroo and treated like ML Models in Pipeline steps. These will be referred to as Python steps.

Python steps can include:

  • Preprocessing steps to prepare the data received to be handed to an ML Model deployed as another Pipeline step.
  • Postprocessing steps to take data output by an ML Model as part of a Pipeline step, and prepare the data to be received by some other data store or entity.
  • A model contained within a Python script.

In all of these, the requirements for uploading a Python step as a ML Model in Wallaroo are the same.

Parameter | Description
Web Site | https://www.python.org/
Supported Libraries | python==3.8
Framework | Framework.PYTHON aka python
Runtime | Native aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP), which allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python modules need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from an ML Model whose output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns that element as the value of the field output.

import pandas as pd

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, the output field appears as the out.output field in the Wallaroo inference result.

 | time | in.tensor | out.output | check_failures
0 | 2023-06-20 20:23:28.395 | [0.6878518042, 0.1760734021, -0.869514083, 0.3.. | [12.886651039123535] | 0

Upload Python Models

Python step models are uploaded to Wallaroo through the Wallaroo Client upload_model(name, path, framework).configure(options).

Upload Python Model Parameters

Parameter | Type | Description
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path | string (Required) | The path to the model file being uploaded.
framework | string (Required) | Set as Framework.PYTHON.
input_schema | pyarrow.lib.Schema (Optional) | The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema (Optional) | The output schema in Apache Arrow schema format.
convert_wait | bool (Optional) (Default: True) | Not required for native runtimes.
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Upload Python Models Example

The following example is of uploading a Python step ML Model to a Wallaroo instance.

import pyarrow as pa
input_schema = pa.schema([
    pa.field('dense_2', pa.list_(pa.float64()))
])
output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float64()))
])

from wallaroo.framework import Framework

# Parentheses allow the chained configure() call to start on a new line.
step = (client.upload_model("python-step",
                            "./step.py",
                            framework=Framework.PYTHON)
        .configure('python',
                   input_schema=input_schema,
                   output_schema=output_schema))

2.4.6 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: PyTorch

How to upload and use PyTorch ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo supports PyTorch models by containerizing the model and running as an image.

Parameter | Description
Web Site | https://pytorch.org/
Supported Libraries |
  • torch==1.13.1
  • torchvision==0.14.1
Framework | Framework.PYTORCH aka pytorch
Supported File Types | pt or pth in TorchScript format
Runtime | Containerized aka mlflow

Uploading PyTorch Models

PyTorch models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload PyTorch Model Parameters

The following parameters are required for PyTorch models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a PyTorch model to Wallaroo.

Parameter | Type | Description
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path | string (Required) | The path to the model file being uploaded.
framework | string (Upload Method Optional, PyTorch model Required) | Set as Framework.PYTORCH.
input_schema | pyarrow.lib.Schema (Upload Method Optional, PyTorch model Required) | The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema (Upload Method Optional, PyTorch model Required) | The output schema in Apache Arrow schema format.
convert_wait | bool (Upload Method Optional, PyTorch model Optional) (Default: True) |
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload PyTorch Model Return

The following is returned with a successful model upload and conversion.

Field | Type | Description
name | string | The name of the model.
version | string | The model version as a unique UUID.
file_name | string | The file name of the model as stored in Wallaroo.
image_path | string | The image used to deploy the model in the Wallaroo engine.
last_update_time | DateTime | When the model was last updated.

Upload PyTorch Model Example

The following example is of uploading a PyTorch ML Model to a Wallaroo instance.

import pyarrow as pa
from wallaroo.framework import Framework

input_schema = pa.schema(
    [
        pa.field('input', pa.list_(pa.float64(), list_size=10))
    ]
)

output_schema = pa.schema(
[
    pa.field('output', pa.list_(pa.float64(), list_size=1))
]
)

model = wl.upload_model('pt-single-io-model', 
                        "./models/model-auto-conversion_pytorch_single_io_model.pt", 
                        framework=Framework.PYTORCH, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting...........Ready.
{
    'name': 'pt-single-io-model', 
    'version': '8f91dee1-79e0-449b-9a59-0e93ba4a1ba9', 
    'file_name': 'model-auto-conversion_pytorch_single_io_model.pt', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3397', 
    'last_update_time': datetime.datetime(2023, 6, 23, 2, 8, 56, 669565, tzinfo=tzutc())
}

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime Display | Model Runtime Space | Pipeline Configuration
tensorflow | Native | Native Runtime Configuration Methods
onnx | Native | Native Runtime Configuration Methods
python | Native | Native Runtime Configuration Methods
mlflow | Containerized | Containerized Runtime Deployment

For example, uploading an ONNX model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

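from wallaroo.deployment_config import DeploymentConfigBuilder

# Parentheses allow the chained builder calls to span multiple lines.
deployment_config = (DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .build())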

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

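from wallaroo.deployment_config import DeploymentConfigBuilder

# sm_model is the containerized model that receives the resources.
deployment_config = (DeploymentConfigBuilder()
                     .sidekick_cpus(sm_model, 0.25)
                     .sidekick_memory(sm_model, '1Gi')
                     .sidekick_env(sm_model,
                         {"GUNICORN_CMD_ARGS":
                          "--timeout=188 --workers=1"}
                     )
                     .build())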

2.4.7 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: SKLearn

How to upload and use SKLearn ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names may contain only ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.
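
The naming rule above can be checked locally before an upload is attempted. The following is a minimal sketch, assuming the rule is exactly as stated (ASCII letters, digits, and dashes); the function names is_valid_model_name and suggest_model_name are hypothetical helpers and are not part of the Wallaroo SDK, whose own validation may be stricter.

```python
import re

# Assumption: "DNS compliant" is read here as the rule stated in this guide
# (ASCII letters, digits, and dashes only). Wallaroo's own check may differ.
_MODEL_NAME_RE = re.compile(r"[A-Za-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True if `name` uses only ASCII alphanumerics and dashes."""
    return bool(_MODEL_NAME_RE.fullmatch(name))


def suggest_model_name(name: str) -> str:
    """Replace every disallowed character (such as '.' or '_') with a dash."""
    return re.sub(r"[^A-Za-z0-9-]", "-", name)
```

For instance, is_valid_model_name('sklearn_linreg.v1') is False, and suggest_model_name turns that string into 'sklearn-linreg-v1'.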

Wallaroo supports SKLearn models by containerizing the model and running as an image.

Scikit-learn, also known as SKLearn.

Parameter | Description
Web Site | https://scikit-learn.org/stable/index.html
Supported Libraries | scikit-learn==1.2.2
Framework | Framework.SKLEARN aka sklearn
Runtime | Containerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitted for inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why it is submitted as a single array rather than as JSON-labeled columns: this ensures that the data arrives in the exact order the model was trained to accept.

Original DataFrame:

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

Converted DataFrame:

  | inputs
0 | [5.1, 3.5, 1.4, 0.2]
1 | [4.9, 3.0, 1.4, 0.2]
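
The conversion shown above can be sketched in plain Python. This is only an illustration under stated assumptions: rows are represented as dictionaries rather than a pandas DataFrame, and rows_to_inputs is a hypothetical helper, not a Wallaroo SDK function. The point it demonstrates is that the column order is fixed explicitly, so the resulting array does not depend on how the labeled row happens to be ordered.

```python
from typing import Dict, List, Sequence


def rows_to_inputs(rows: Sequence[Dict[str, float]],
                   column_order: Sequence[str]) -> List[Dict[str, List[float]]]:
    """Collapse each labeled row into one 'inputs' array in training order."""
    return [{"inputs": [float(row[col]) for col in column_order]} for row in rows]


# Column order the model was trained on (taken from the example above).
columns = ["sepal length (cm)", "sepal width (cm)",
           "petal length (cm)", "petal width (cm)"]

rows = [
    {"sepal length (cm)": 5.1, "sepal width (cm)": 3.5,
     "petal length (cm)": 1.4, "petal width (cm)": 0.2},
    {"sepal length (cm)": 4.9, "sepal width (cm)": 3.0,
     "petal length (cm)": 1.4, "petal width (cm)": 0.2},
]

converted = rows_to_inputs(rows, columns)
```

Each entry of converted holds a single inputs array with 4 float values, matching the list_size=4 input schema described earlier.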

SKLearn Schema Outputs

Outputs from SKLearn models, such as predictions or probabilities, are labeled in the output schema supplied when the model is uploaded to Wallaroo. For example, a model that outputs either 1 or 0 would have the following output schema:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)

  | time | in.inputs | out.predictions | check_failures
0 | 2023-07-05 15:11:29.776 | [5.1, 3.5, 1.4, 0.2] | 0 | 0
1 | 2023-07-05 15:11:29.776 | [4.9, 3.0, 1.4, 0.2] | 0 | 0

Uploading SKLearn Models

SKLearn models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload SKLearn Model Parameters

The following parameters are required for SKLearn models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a SKLearn model to Wallaroo.

Parameter | Type | Description
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path | string (Required) | The path to the model file being uploaded.
framework | string (Upload Method Optional, SKLearn model Required) | Set as Framework.SKLEARN.
input_schema | pyarrow.lib.Schema (Upload Method Optional, SKLearn model Required) | The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema (Upload Method Optional, SKLearn model Required) | The output schema in Apache Arrow schema format.
convert_wait | bool (Upload Method Optional, SKLearn model Optional) (Default: True) |
  • True: Waits in the script for the model conversion to complete.
  • False: Proceeds with the script without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload SKLearn Model Return

The following is returned with a successful model upload and conversion.

Field | Type | Description
name | string | The name of the model.
version | string | The model version as a unique UUID.
file_name | string | The file name of the model as stored in Wallaroo.
image_path | string | The image used to deploy the model in the Wallaroo engine.
last_update_time | DateTime | When the model was last updated.

Upload SKLearn Model Example

The following example uploads a pickled SKLearn ML model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=10))
])

output_schema = pa.schema([
    pa.field('predictions', pa.float64())
])

model = wl.upload_model(
                        'sklearn-linear-regression', 
                        'models/model-auto-conversion_sklearn_linreg_diabetes.pkl', 
                        framework=Framework.SKLEARN, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                        )
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting.Pending conversion..Converting.......Ready.

{
    'name': 'sklearn-linear-regression', 
    'version': 'e84809eb-0992-457e-95cb-cc3d20c792db', 
    'file_name': 'model-auto-conversion_sklearn_linreg_diabetes.pkl', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3443', 
    'last_update_time': datetime.datetime(2023, 6, 27, 22, 17, 52, 967054, tzinfo=tzutc())
}

  | age | sex | bmi | bp | s1 | s2 | s3 | s4 | s5 | s6
0 | 0.038076 | 0.050680 | 0.061696 | 0.021872 | -0.044223 | -0.034821 | -0.043401 | -0.002592 | 0.019907 | -0.017646
1 | -0.001882 | -0.044642 | -0.051474 | -0.026328 | -0.008449 | -0.019163 | 0.074412 | -0.039493 | -0.068332 | -0.092204

  | inputs
0 | [0.0380759064, 0.0506801187, 0.0616962065, 0.0…
1 | [-0.0018820165, -0.0446416365, -0.051474061200…

  | time | in.inputs | out.predictions | check_failures
0 | 2023-07-05 15:43:33.065 | [0.0380759064, 0.0506801187, 0.0616962065, 0.0… | 206.116677 | 0
1 | 2023-07-05 15:43:33.065 | [-0.0018820165, -0.0446416365, -0.0514740612, … | 68.071033 | 0

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime Display | Model Runtime Space | Pipeline Configuration
tensorflow | Native | Native Runtime Configuration Methods
onnx | Native | Native Runtime Configuration Methods
python | Native | Native Runtime Configuration Methods
mlflow | Containerized | Containerized Runtime Deployment
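
The table above can be expressed as a small lookup that decides which style of deployment configuration applies to a model. This is a sketch only: the dictionary is copied from the table, and the helper names runtime_space and uses_sidekick_settings are hypothetical, not part of the Wallaroo SDK. The input is assumed to be the string returned by config().runtime().

```python
# Mapping copied from the table above: runtime display -> runtime space.
RUNTIME_SPACE = {
    "tensorflow": "native",
    "onnx": "native",
    "python": "native",
    "mlflow": "containerized",
}


def runtime_space(runtime: str) -> str:
    """Return 'native' or 'containerized' for a config().runtime() value."""
    try:
        return RUNTIME_SPACE[runtime]
    except KeyError:
        raise ValueError(f"Unrecognized runtime: {runtime!r}") from None


def uses_sidekick_settings(runtime: str) -> bool:
    """True when per-model (sidekick_*) resource settings apply."""
    return runtime_space(runtime) == "containerized"
```

A model reporting 'mlflow' would therefore be configured with the per-model sidekick settings, while 'onnx', 'tensorflow', and 'python' models share the native runtime's cpus and memory settings.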

For example, uploading an ONNX model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following model is allocated to the containerized runtime after conversion:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = (DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build())

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with environment variables for that model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = (DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "--timeout=188 --workers=1"}
                    )
                    .build())

2.4.8 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Hugging Face

How to upload and use Hugging Face ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names may contain only ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.

Wallaroo supports Hugging Face models by containerizing the model and running as an image.

Parameter | Description
Web Site | https://huggingface.co/models
Supported Libraries |
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
Frameworks | The following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
Runtime | Containerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING_FACE_IMAGE_TO_TEXT
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION
  • Framework.HUGGING_FACE_SUMMARIZATION
  • Framework.HUGGING_FACE_TRANSLATION

Additional inputs added to these Hugging Face pipelines are passed as key/value pair arguments to the model's generate method. If an argument is not supplied, the model defaults to the value coded in the original Hugging Face model's source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.
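
The rule about additional inputs can be sketched as a pre-submission check. This is an illustration under assumptions: the set below contains only the four pipelines listed above, identified by their aka names, and unexpected_inputs is a hypothetical helper rather than anything provided by Wallaroo. For the forwarding pipelines the check returns nothing, although the underlying model may still reject an argument it does not support.

```python
# Pipelines that, per the list above, forward extra inputs to generate().
FORWARDS_EXTRA_INPUTS = {
    "hugging-face-image-to-text",
    "hugging-face-text-classification",
    "hugging-face-summarization",
    "hugging-face-translation",
}


def unexpected_inputs(framework: str, documented: set, submitted: set) -> set:
    """Return submitted field names that would be rejected for `framework`."""
    extras = set(submitted) - set(documented)
    if framework in FORWARDS_EXTRA_INPUTS:
        # Extras are passed through as key/value arguments instead of rejected.
        return set()
    return extras
```

For example, adding a field named foo to a question answering request would be flagged, while an extra field on a summarization request would be passed through.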

Wallaroo Framework: Framework.HUGGING_FACE_FEATURE_EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])

Wallaroo Framework: Framework.HUGGING_FACE_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
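
Because the image input above is declared as fixed-size nested lists (100 x 100 x 3 integers), it can help to confirm the shape of an image before submitting it. The following is a minimal sketch using plain Python lists; has_shape is a hypothetical helper and the all-zero image is placeholder data, not a real sample.

```python
from typing import Tuple


def has_shape(value, shape: Tuple[int, ...]) -> bool:
    """Check that nested lists match `shape` exactly, e.g. (100, 100, 3)."""
    if not shape:
        # Innermost level: expect a plain integer (bool is excluded).
        return isinstance(value, int) and not isinstance(value, bool)
    if not isinstance(value, (list, tuple)) or len(value) != shape[0]:
        return False
    return all(has_shape(item, shape[1:]) for item in value)


# Placeholder 100 x 100 image with 3 integer channels per pixel.
image = [[[0, 0, 0] for _ in range(100)] for _ in range(100)]
image_matches_schema = has_shape(image, (100, 100, 3))
```

An image of any other size, or one holding float pixel values, fails the check and would need to be resized or cast first.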

Wallaroo Framework: Framework.HUGGING_FACE_IMAGE_SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])

Wallaroo Framework: Framework.HUGGING_FACE_IMAGE_TO_TEXT

Any parameter that is not part of the required inputs list is forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error is returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])

Wallaroo Framework: Framework.HUGGING_FACE_OBJECT_DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])

Wallaroo Framework: Framework.HUGGING_FACE_QUESTION_ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])

Wallaroo Framework: Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])

Wallaroo Framework: Framework.HUGGING_FACE_SUMMARIZATION

Any parameter that is not part of the required inputs list is forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error is returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])

Wallaroo Framework: Framework.HUGGING_FACE_TEXT_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with the same number of items as top_k; list_size can be skipped but may lead to worse performance
])

Wallaroo Framework: Framework.HUGGING_FACE_TRANSLATION

Any parameter that is not part of the required inputs list is forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error is returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])

Wallaroo Framework: Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])

Wallaroo Framework: Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])

Wallaroo Framework: Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])

Wallaroo Framework: Framework.HUGGING_FACE_SENTIMENT_ANALYSIS
Reference: Hugging Face Sentiment Analysis

Wallaroo Framework: Framework.HUGGING_FACE_TEXT_GENERATION

Any parameter that is not part of the required inputs list is forwarded as a key/value pair to the underlying model's generate method. If the additional input is not supported by the model, an error is returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])

Uploading Hugging Face Models

Hugging Face models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload Hugging Face Model Parameters

The following parameters are required for Hugging Face models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a Hugging Face model to Wallaroo.

Parameter | Type | Description
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path | string (Required) | The path to the model file being uploaded.
framework | string (Upload Method Optional, Hugging Face model Required) | Set as the framework. See the list above for all supported Hugging Face frameworks.
input_schema | pyarrow.lib.Schema (Upload Method Optional, Hugging Face model Required) | The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema (Upload Method Optional, Hugging Face model Required) | The output schema in Apache Arrow schema format.
convert_wait | bool (Upload Method Optional, Hugging Face model Optional) (Default: True) |
  • True: Waits in the script for the model conversion to complete.
  • False: Proceeds with the script without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload Hugging Face Model Return

The following is returned with a successful model upload and conversion.

Field | Type | Description
name | string | The name of the model.
version | string | The model version as a unique UUID.
file_name | string | The file name of the model as stored in Wallaroo.
image_path | string | The image used to deploy the model in the Wallaroo engine.
last_update_time | DateTime | When the model was last updated.

Upload Hugging Face Model Example

The following example uploads a Hugging Face Zero Shot Classification ML model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels; list_size can be skipped but may result in slightly worse performance
])

model = wl.upload_model("hugging-face-zero-model",
                        './models/model-auto-conversion_hugging-face_dummy-pipelines_zero-shot-classification-pipeline.zip', 
                        framework=Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION, 
                        input_schema=input_schema,
                        output_schema=output_schema)

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime Display | Model Runtime Space | Pipeline Configuration
tensorflow | Native | Native Runtime Configuration Methods
onnx | Native | Native Runtime Configuration Methods
python | Native | Native Runtime Configuration Methods
mlflow | Containerized | Containerized Runtime Deployment

For example, uploading an ONNX model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following model is allocated to the containerized runtime after conversion:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = (DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build())

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with environment variables for that model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = (DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "--timeout=188 --workers=1"}
                    )
                    .build())

2.4.9 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: TensorFlow

How to upload and use TensorFlow ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names may contain only ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.

Wallaroo supports TensorFlow models in the SavedModel format, which run in the Wallaroo Native runtime.

Parameter | Description
Web Site | https://www.tensorflow.org/
Supported Libraries | tensorflow==2.9.1
Framework | Framework.TENSORFLOW aka tensorflow
Runtime | Native aka tensorflow
Supported File Types | SavedModel format as .zip file

TensorFlow File Format

TensorFlow models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/
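
Where the zip command is not available, the same archive can be produced from Python with the standard library. This is a minimal sketch under stated assumptions: zip_saved_model is a hypothetical helper, and it assumes the directory contains saved_model.pb at its top level, as in the listing above. It keeps the directory itself as the top-level entry in the archive, mirroring zip -r alohacnnlstm.zip alohacnnlstm/.

```python
import shutil
from pathlib import Path


def zip_saved_model(model_dir: str) -> str:
    """Zip `model_dir` so the archive keeps the directory as its top-level entry."""
    path = Path(model_dir).resolve()
    if not (path / "saved_model.pb").is_file():
        raise FileNotFoundError(f"{path} does not contain saved_model.pb")
    # root_dir is the parent and base_dir the folder name, so entries are
    # stored as "<folder>/saved_model.pb", "<folder>/variables/...", etc.
    return shutil.make_archive(
        str(path), "zip", root_dir=str(path.parent), base_dir=path.name
    )
```

Calling zip_saved_model('alohacnnlstm') writes alohacnnlstm.zip next to the directory and returns the path of the archive.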

ML models in the TensorFlow SavedModel format run in the Wallaroo Native runtime by default.

See the SavedModel guide for full details.

Uploading TensorFlow Models

TensorFlow models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload TensorFlow Model Parameters

The following parameters are required for TensorFlow models. TensorFlow models run in the Wallaroo Native runtime, so the input_schema and output_schema parameters are optional.

Parameter | Type | Description
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path | string (Required) | The path to the model file being uploaded.
framework | string (Required) | Set as Framework.TENSORFLOW.
input_schema | pyarrow.lib.Schema (Optional) | The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema (Optional) | The output schema in Apache Arrow schema format.
convert_wait | bool (Optional) (Default: True) | Not required for native runtimes.
  • True: Waits in the script for the model conversion to complete.
  • False: Proceeds with the script without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload TensorFlow Model Example

The following example uploads a TensorFlow ML model to a Wallaroo instance.

from wallaroo.framework import Framework
model = wl.upload_model(model_name, 
                        model_file_name,
                        framework=Framework.TENSORFLOW
                        )

Pipeline Deployment Configurations

Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.

This model will always run in the native runtime space.

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = (DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build())

2.4.10 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: TensorFlow Keras

How to upload and use TensorFlow Keras ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names may contain only ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.

Wallaroo supports TensorFlow/Keras models by containerizing the model and running as an image.

Parameter | Description
Web Site | https://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries | tensorflow==2.8.0, keras==1.1.0
Framework | Framework.KERAS aka keras
Supported File Types | SavedModel format as .zip file and HDF5 format
Runtime | Containerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are uploaded as a .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 (HDF5) format for TensorFlow Keras models.
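
Since two file types are accepted (a zipped SavedModel or an H5 file), a quick local check of which one a file actually is can catch a mislabeled upload. The following is a sketch only: keras_file_kind is a hypothetical helper, and it assumes the HDF5 signature sits at the very start of the file, which is the common case but not guaranteed by the HDF5 format.

```python
import zipfile

# First 8 bytes of an HDF5 file (assumed here to be at offset 0).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def keras_file_kind(path: str) -> str:
    """Return 'h5', 'zip', or 'unknown' based on the file's contents."""
    with open(path, "rb") as handle:
        if handle.read(8) == HDF5_SIGNATURE:
            return "h5"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"
```

The check looks at file contents rather than the extension, so a renamed file is still identified correctly.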

Uploading TensorFlow Keras Models

TensorFlow Keras models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload TensorFlow Keras Model Parameters

The following parameters are required for TensorFlow Keras models. Note that while some fields are considered optional for the upload_model method, they are required for proper uploading of a TensorFlow Keras model to Wallaroo.

Parameter | Type | Description
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path | string (Required) | The path to the model file being uploaded.
framework | string (Upload Method Optional, TensorFlow Keras model Required) | Set as Framework.KERAS.
input_schema | pyarrow.lib.Schema (Upload Method Optional, TensorFlow Keras model Required) | The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema (Upload Method Optional, TensorFlow Keras model Required) | The output schema in Apache Arrow schema format.
convert_wait | bool (Upload Method Optional, TensorFlow Keras model Optional) (Default: True) |
  • True: Waits in the script for the model conversion to complete.
  • False: Proceeds with the script without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload TensorFlow Model Return

The following example uploads a TensorFlow Keras model in H5 format to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('input', pa.list_(pa.float64(), list_size=10))
])

output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float64(), list_size=32))
])

model = wl.upload_model('mac-keras-single-io-example', 
                        './models/single_io_keras_sequential_model.h5',
                        framework=Framework.KERAS, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                        )

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime Display | Model Runtime Space | Pipeline Configuration
tensorflow | Native | Native Runtime Configuration Methods
onnx | Native | Native Runtime Configuration Methods
python | Native | Native Runtime Configuration Methods
mlflow | Containerized | Containerized Runtime Deployment
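
As a minimal sketch, the table above can be expressed as a lookup that turns the string returned by config().runtime() into its runtime space. The names RUNTIME_SPACE and runtime_space are hypothetical helpers for illustration and are not part of the Wallaroo SDK.

```python
# Mapping copied from the table above: runtime display -> runtime space.
# RUNTIME_SPACE and runtime_space are hypothetical names, not SDK members.
RUNTIME_SPACE = {
    "tensorflow": "Native",
    "onnx": "Native",
    "python": "Native",
    "mlflow": "Containerized",
}


def runtime_space(runtime: str) -> str:
    """Look up the runtime space for a value returned by config().runtime()."""
    try:
        return RUNTIME_SPACE[runtime]
    except KeyError:
        raise ValueError(f"Unrecognized runtime display: {runtime!r}") from None


print(runtime_space("onnx"))    # Native
print(runtime_space("mlflow"))  # Containerized
```

A lookup like this can help decide whether to build a native or a containerized deployment configuration before deploying the pipeline.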

For example, uploading a native runtime model (ONNX in this case) to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = (DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .build())

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = (DeploymentConfigBuilder()
                     .sidekick_cpus(sm_model, 0.25)
                     .sidekick_memory(sm_model, '1Gi')
                     .sidekick_env(sm_model,
                         {"GUNICORN_CMD_ARGS":
                          "--timeout=188 --workers=1"}
                     )
                     .build())

2.4.11 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: XGBoost

How to upload and use XGBoost ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.
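
The naming rule above can be checked before an upload. The following is a minimal sketch that encodes only the rule as stated here (ASCII alphanumeric characters or dashes); the function name is_valid_model_name is a hypothetical helper, not an SDK call, and the Wallaroo instance may enforce additional constraints.

```python
import re

# Rule as stated in the text: ASCII letters, digits, or dash only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True when the name uses only ASCII alphanumerics or dashes."""
    return bool(_NAME_PATTERN.fullmatch(name))


print(is_valid_model_name("xgb-ranker"))  # True
print(is_valid_model_name("xgb_ranker"))  # False: underscore is not allowed
print(is_valid_model_name("xgb.ranker"))  # False: period is not allowed
```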

Wallaroo supports XGBoost models by containerizing the model and running as an image.

Parameter | Description
Web Site | https://xgboost.ai/
Supported Libraries | xgboost==1.7.4
Framework | Framework.XGBOOST aka xgboost
Supported File Types | pickle (XGB files are not supported.)
Runtime | Containerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting an inference request, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than as JSON labeled columns: this ensures that the data is submitted in the exact order the model was trained to accept.

Original DataFrame:

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

Converted DataFrame:

  | inputs
0 | [5.1, 3.5, 1.4, 0.2]
1 | [4.9, 3.0, 1.4, 0.2]
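
The conversion can be sketched in plain Python, without pandas, to make the ordering step explicit. The names FEATURE_ORDER and to_input_rows are hypothetical; the column names and values come from the example DataFrame above.

```python
# Column order the model was trained on (taken from the example DataFrame).
FEATURE_ORDER = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]


def to_input_rows(records, feature_order=FEATURE_ORDER):
    """Turn labeled records into rows holding one ordered 'inputs' array."""
    return [
        {"inputs": [float(record[name]) for name in feature_order]}
        for record in records
    ]


# The first record lists its keys out of order on purpose.
records = [
    {"petal width (cm)": 0.2, "sepal length (cm)": 5.1,
     "sepal width (cm)": 3.5, "petal length (cm)": 1.4},
    {"sepal length (cm)": 4.9, "sepal width (cm)": 3.0,
     "petal length (cm)": 1.4, "petal width (cm)": 0.2},
]
print(to_input_rows(records))
# [{'inputs': [5.1, 3.5, 1.4, 0.2]}, {'inputs': [4.9, 3.0, 1.4, 0.2]}]
```

Because each array is built from an explicit feature list, the order of keys in the source records does not affect the order of values sent to the model.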

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])

pipeline.infer(dataframe)

  | time | in.inputs | out.output | check_failures
0 | 2023-07-05 15:11:29.776 | [5.1, 3.5, 1.4, 0.2] | 0 | 0
1 | 2023-07-05 15:11:29.776 | [4.9, 3.0, 1.4, 0.2] | 0 | 0

Uploading XGBoost Models

XGBoost models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload XGBoost Model Parameters

The following parameters are required for XGBoost models. Note that while some fields are optional for the upload_model method, they are required to upload an XGBoost model to Wallaroo properly.

Parameter | Type | Description
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
path | string (Required) | The path to the model file being uploaded.
framework | string (Upload Method Optional, XGBoost model Required) | Set as Framework.XGBOOST.
input_schema | pyarrow.lib.Schema (Upload Method Optional, XGBoost model Required) | The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema (Upload Method Optional, XGBoost model Required) | The output schema in Apache Arrow schema format.
convert_wait | bool (Upload Method Optional, XGBoost model Optional) (Default: True) | True: Waits in the script for the model conversion to complete; False: Proceeds with the script without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload XGBoost Model Return

The following is returned with a successful model upload and conversion.

Field | Type | Description
name | string | The name of the model.
version | string | The model version as a unique UUID.
file_name | string | The file name of the model as stored in Wallaroo.
image_path | string | The image used to deploy the model in the Wallaroo engine.
last_update_time | DateTime | When the model was last updated.

Upload XGBoost Model Example

The following example uploads an XGBoost model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

output_schema = pa.schema([
    pa.field('output', pa.float64())
])

model = wl.upload_model(f"{prefix}", 
                        'models/model-auto-conversion_xgboost_xgb_ranker_model.pkl', 
                        framework=Framework.XGBOOST, 
                        input_schema=input_schema, output_schema=output_schema
                        )
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion...Converting..Pending conversion.Converting.........Ready.

{
    'name': 'xgb-ranker', 
    'version': 'c53c6a84-9f56-41c6-bb2f-049ef6b067e8', 
    'file_name': 'model-auto-conversion_xgboost_xgb_ranker_model.pkl', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3367', 
    'last_update_time': datetime.datetime(2023, 6, 16, 18, 51, 15, 27969, tzinfo=tzutc())
}

data = pd.read_json('data/test-xgboost-classification-data.json')
display(data)

dataframe = pd.DataFrame({"inputs": data[:2].values.tolist()})
display(dataframe)

results = pipeline.infer(dataframe)
display(results)

  | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
0 | 5.1 | 3.5 | 1.4 | 0.2
1 | 4.9 | 3.0 | 1.4 | 0.2

  | inputs
0 | [5.1, 3.5, 1.4, 0.2]
1 | [4.9, 3.0, 1.4, 0.2]

  | time | in.inputs | out.output | check_failures
0 | 2023-07-05 16:15:55.802 | [5.1, 3.5, 1.4, 0.2] | 0.0 | 0
1 | 2023-07-05 16:15:55.802 | [4.9, 3.0, 1.4, 0.2] | 0.0 | 0

Model Status

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime Display | Model Runtime Space | Pipeline Configuration
tensorflow | Native | Native Runtime Configuration Methods
onnx | Native | Native Runtime Configuration Methods
python | Native | Native Runtime Configuration Methods
mlflow | Containerized | Containerized Runtime Deployment

For example, uploading a native runtime model (ONNX in this case) to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = (DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .build())

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = (DeploymentConfigBuilder()
                     .sidekick_cpus(sm_model, 0.25)
                     .sidekick_memory(sm_model, '1Gi')
                     .sidekick_env(sm_model,
                         {"GUNICORN_CMD_ARGS":
                          "--timeout=188 --workers=1"}
                     )
                     .build())

2.5 - Wallaroo SDK Essentials Guide: User Management

How to create and manage Wallaroo Users through the Wallaroo SDK

Managing Workspace Users

Users are managed via their email address, and can be assigned to a workspace as either the owner or a user.

List All Users

list_users() returns an array of all users registered in the connected Wallaroo platform with the following values:

Parameter | Type | Description
id | String | The unique identifier for the user.
email | String | The unique email identifier for the user.
username | String | The unique username of the user.

For example, listing all users in the Wallaroo connection returns the following:

wl.list_users()
[User({"id": "7b8b4f7d-de27-420f-9cd0-892546cb0f82", "email": "test@test.com", "username": "admin"),
 User({"id": "45e6b641-fe57-4fb2-83d2-2c2bd201efe8", "email": "steve@ex.co", "username": "steve")]

Get User by Email

The get_user_by_email({email}) command finds the user whose email address matches the submitted {email} field. If no email address matches, the return is null.

For example, the user steve with the email address steve@ex.co returns the following:

wl.get_user_by_email("steve@ex.co")

User({"id": "45e6b641-fe57-4fb2-83d2-2c2bd201efe8", "email": "steve@ex.co", "username": "steve")

Activate and Deactivate Users

To remove a user’s access to the Wallaroo instance, use the Wallaroo Client deactivate_user("{User Email Address}") method, replacing {User Email Address} with the email address of the user to deactivate.

To activate a user, use the Wallaroo Client activate_user("{User Email Address}") method, replacing {User Email Address} with the email address of the user to activate.

This feature impacts Wallaroo Community’s license count. Wallaroo Community only allows a total of 5 users per Wallaroo Community instance. Deactivated users do not count toward this total. This allows organizations to add users, then activate and deactivate them as needed to stay under the licensed user count.

Wallaroo Enterprise has no limits on the number of users who can be added or active in a Wallaroo instance.

In this example, the user testuser@wallaroo.ai will be deactivated then reactivated.

wl.list_users()

[User({"id": "0528f34c-2725-489f-b97b-da0cde02cbd9", "email": "testuser@wallaroo.ai", "username": "testuser@wallaroo.ai"),
 User({"id": "3927b9d3-c279-442c-a3ac-78ba1d2b14d8", "email": "john.hummel+signuptest@wallaroo.ai", "username": "john.hummel+signuptest@wallaroo.ai")]

wl.deactivate_user("testuser@wallaroo.ai")

wl.activate_user("testuser@wallaroo.ai")

Troubleshooting

When a new user logs in for the first time, they get an error when uploading a model or issues when they attempt to log in. How do I correct that?

When a new registered user attempts to upload a model, they may see the following error:

TransportQueryError: 
{'extensions': 
    {'path': 
        '$.selectionSet.insert_workspace_one.args.object[0]', 'code': 'not-supported'
    }, 
    'message': 
        'cannot proceed to insert array relations since insert to table "workspace" affects zero rows'

Or if they log into the Wallaroo Dashboard, they may see a Page not found error.

This is caused when a user has been registered without an appropriate email address. See the user guides on inviting a user, or the Wallaroo Enterprise User Management guide for how to log into the Keycloak service and update users. Verify that the username and email address are the same, and that they are a valid, confirmed email address for the user.

2.6 - Wallaroo SDK Essentials Guide: Pipelines

The classes and methods for managing Wallaroo pipelines and configurations.

2.6.1 - Wallaroo SDK Essentials Guide: Pipeline Management

How to create and manage Wallaroo Pipelines through the Wallaroo SDK

Pipelines are the method of submitting data and processing that data through the models. Each pipeline can have one or more steps that submit the data from the previous step to the next one. Information can be submitted to a pipeline as a file, or through the pipeline’s URL.

A pipeline’s metrics can be viewed through the Wallaroo Dashboard Pipeline Details and Metrics page.

Pipeline Naming Requirements

Pipeline names map onto Kubernetes objects, and must be DNS compliant. Pipeline names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Create a Pipeline

New pipelines are created in the current workspace.

To create a new pipeline, use the Wallaroo Client build_pipeline("{Pipeline Name}") command.

The following example creates a new pipeline imdb-pipeline through a Wallaroo Client connection wl:

imdb_pipeline = wl.build_pipeline("imdb-pipeline")

imdb_pipeline.status()
{'status': 'Pipeline imdb-pipeline is not deployed'}

List All Pipelines

The Wallaroo Client method list_pipelines() lists all pipelines in a Wallaroo Instance.

The following example lists all pipelines in the wl Wallaroo Client connection:

wl.list_pipelines()

[{'name': 'ccfraud-pipeline', 'create_time': datetime.datetime(2022, 4, 12, 17, 55, 41, 944976, tzinfo=tzutc()), 'definition': '[]'}]

Select an Existing Pipeline

Rather than creating a new pipeline each time, an existing pipeline can be selected by using the list_pipelines() command and assigning one of the array members to a variable.

The following example sets the pipeline ccfraud-pipeline to the variable current_pipeline:

wl.list_pipelines()

[{'name': 'ccfraud-pipeline', 'create_time': datetime.datetime(2022, 4, 12, 17, 55, 41, 944976, tzinfo=tzutc()), 'definition': '[]'}]

current_pipeline = wl.list_pipelines()[0]

current_pipeline.status()

{'status': 'Running',
 'details': None,
 'engines': [{'ip': '10.244.5.4',
   'name': 'engine-7fcc7df596-hvlxb',
   'status': 'Running',
   'reason': None,
   'pipeline_statuses': {'pipelines': [{'id': 'ccfraud-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'ccfraud-model',
      'version': '4624e8a8-1414-4408-8b40-e03da4b5cb68',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.1.24',
   'name': 'engine-lb-85846c64f8-mtq9p',
   'status': 'Running',
   'reason': None}]}

Pipeline Steps

Once a pipeline has been created, or during its creation process, a pipeline step can be added. The pipeline step refers to the model that will perform an inference off of the data submitted to it. Each time a step is added, it is added to the pipeline’s models array.

Pipeline steps are not saved until the pipeline is deployed. Until then, they are stored in local memory as a potential pipeline configuration.

Add a Step to a Pipeline

A pipeline step is added through the pipeline add_model_step({Model}) command.

In the following example, two models uploaded to the workspace are added as pipeline steps:

imdb_pipeline.add_model_step(embedder)
imdb_pipeline.add_model_step(smodel)

imdb_pipeline.status()

{'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}

Replace a Pipeline Step

The model specified in a pipeline step can be replaced with the pipeline method replace_with_model_step(index, model).

The following parameters are used for replacing a pipeline step:

Parameter | Default Value | Purpose
index | null | The pipeline step to be replaced. Pipeline steps follow array numbering, where the first step is 0, etc.
model | null | The new model to be used in the pipeline step.

In the following example, a deployed pipeline will have the initial model step replaced with a new one. A status of the pipeline will be displayed after deployment and after the pipeline swap to show the model has been replaced from ccfraudoriginal to ccfraudreplacement, each with their own versions.

pipeline.deploy()

pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.2.145',
   'name': 'engine-75bfd7dc9d-7p9qk',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'hotswappipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'ccfraudoriginal',
      'version': '3a03dc94-716e-46bb-84c8-91bc99ceb2c3',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.2.144',
   'name': 'engine-lb-55dcdff64c-vf74s',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}

pipeline.replace_with_model_step(0, replacement_model).deploy()

pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.2.153',
   'name': 'engine-96486c95d-zfchr',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'hotswappipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'ccfraudreplacement',
      'version': '714efd19-5c83-42a8-aece-24b4ba530925',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.2.154',
   'name': 'engine-lb-55dcdff64c-9np9k',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}

Pre and Post Processing Steps

A pipeline step can be more than a model: it can also be a pre-processing or post-processing step. For example, the Demand Curve Tutorial adds both a pre-processing and a post-processing step to the pipeline. The pre-processing step uses the following code:

import numpy
import pandas

import json

# add interaction terms for the model
def actual_preprocess(pdata):
    pd = pdata.copy()
    # convert boolean cust_known to 0/1
    pd.cust_known = numpy.where(pd.cust_known, 1, 0)
    # interact UnitPrice and cust_known
    pd['UnitPriceXcust_known'] = pd.UnitPrice * pd.cust_known
    return pd.loc[:, ['UnitPrice', 'cust_known', 'UnitPriceXcust_known']]

# If the data is a JSON string, call this wrapper instead.
# Expected input:
# a JSON-encoded dictionary with the fields 'colnames' and 'query'
def wallaroo_json(data):
    obj = json.loads(data)
    pdata = pandas.DataFrame(obj['query'],
                             columns=obj['colnames'])
    pprocessed = actual_preprocess(pdata)

    # return a dictionary with the fields the model expects
    return {
        'tensor_fields': ['model_input'],
        'model_input': pprocessed.to_numpy().tolist()
    }

It is added as a Python module by uploading it as a model:

# load the preprocess module
module_pre = wl.upload_model("preprocess", "./preprocess.py").configure('python')

And then added to the pipeline as a step:

# now make a pipeline
demandcurve_pipeline = (wl.build_pipeline("demand-curve-pipeline")
                        .add_model_step(module_pre)
                        .add_model_step(demand_curve_model)
                        .add_model_step(module_post))
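
To illustrate what the wallaroo_json wrapper in the preprocessing module reads, the following sketch builds a JSON payload with the two keys it accesses, 'query' and 'colnames'. The row values are made-up sample data; the column names are the ones actual_preprocess uses.

```python
import json

# Hypothetical sample values. The keys 'query' and 'colnames' are the ones
# read by wallaroo_json; the columns are those used by actual_preprocess.
payload = json.dumps({
    "colnames": ["UnitPrice", "cust_known"],
    "query": [
        [33.125, True],
        [21.5, False],
    ],
})

decoded = json.loads(payload)
print(decoded["colnames"])    # ['UnitPrice', 'cust_known']
print(len(decoded["query"]))  # 2
```

Each entry in 'query' is one row, with values in the same order as 'colnames', which is how the wrapper builds its DataFrame.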

Remove a Pipeline Step

To remove a step from the pipeline, use the Pipeline remove_step(index) command, where the index is the array index for the pipeline’s steps.

In the following example the pipeline imdb_pipeline will have the step with the model smodel-o removed.

imdb_pipeline.status

<bound method Pipeline.status of {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}>

imdb_pipeline.remove_step(1)
{'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}]"}

Clear All Pipeline Steps

The Pipeline clear() method removes all pipeline steps from a pipeline. Note that pipeline steps are not saved until the pipeline is deployed.
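
The index semantics of the step methods can be pictured with a toy model. The StepList class below is a hypothetical stand-in that only mimics the method names described in this section on a plain Python list; it is not the Wallaroo Pipeline class and does not contact a Wallaroo instance. The model name embedder-v2 is invented for the example.

```python
class StepList:
    """Toy stand-in for a pipeline's local step list (not the SDK class)."""

    def __init__(self):
        self.steps = []

    def add_model_step(self, model):
        self.steps.append(model)       # new steps go to the end
        return self

    def replace_with_model_step(self, index, model):
        self.steps[index] = model      # index 0 is the first step
        return self

    def remove_step(self, index):
        del self.steps[index]
        return self

    def clear(self):
        self.steps.clear()             # drop every step
        return self


p = StepList().add_model_step("embedder-o").add_model_step("smodel-o")
p.replace_with_model_step(0, "embedder-v2")
print(p.steps)  # ['embedder-v2', 'smodel-o']
p.remove_step(1)
print(p.steps)  # ['embedder-v2']
p.clear()
print(p.steps)  # []
```

As in the real pipeline, these changes exist only locally in the sketch; in Wallaroo they take effect once the pipeline is deployed.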

Manage Pipeline Deployment Configuration

For full details on pipeline deployment configurations, see Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration.

Deploy a Pipeline

When a pipeline step is added or removed, the pipeline must be deployed through the pipeline deploy() method. This allocates resources to the pipeline from the Kubernetes environment and makes it available to receive data and perform inferences. This process typically takes 45 seconds.

Once complete, the pipeline status() command will show 'status':'Running'.

Pipeline deployments can be modified to enable auto-scaling, which allows pipelines to allocate more or fewer resources based on need. This is applied by specifying the optional deployment_config parameter when the pipeline is deployed. If this optional parameter is not passed, then the deployment will defer to default values. For more information, see Manage Pipeline Deployment Configuration.

In the following example, the pipeline imdb-pipeline that contains two steps will be deployed with default deployment configuration:

imdb_pipeline.status

<bound method Pipeline.status of {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}>

imdb_pipeline.deploy()
Waiting for deployment - this will take up to 45s ...... ok

imdb_pipeline.status()

{'status': 'Running',
 'details': None,
 'engines': [{'ip': '10.12.1.65',
   'name': 'engine-778b65459-f9mt5',
   'status': 'Running',
   'reason': None,
   'pipeline_statuses': {'pipelines': [{'id': 'imdb-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'embedder-o',
      'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d',
      'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4',
      'status': 'Running'},
     {'name': 'smodel-o',
      'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19',
      'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.12.1.66',
   'name': 'engine-lb-85846c64f8-ggg2t',
   'status': 'Running',
   'reason': None}]}

Troubleshooting Pipeline Deployment

If you deploy more pipelines than your environment can handle, or if you deploy more pipelines than your license allows, you may see an error like the following:


LimitError: You have reached a license limit in your Wallaroo instance. In order to add additional resources, you can remove some of your existing resources. If you have any questions contact us at community@wallaroo.ai: MAX_PIPELINES_LIMIT_EXCEEDED

Undeploy any unnecessary pipelines either through the SDK or through the Wallaroo Pipeline Dashboard, then attempt to redeploy the pipeline in question again.

Undeploy a Pipeline

When a pipeline is not currently needed, it can be undeployed and its resources returned to the Kubernetes environment. To undeploy a pipeline, use the pipeline undeploy() command.

In this example, the aloha_pipeline will be undeployed:

aloha_pipeline.undeploy()

{'name': 'aloha-test-demo', 'create_time': datetime.datetime(2022, 3, 29, 20, 34, 3, 960957, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'aloha-2', 'version': 'a8e8abdc-c22f-416c-a13c-5fe162357430', 'sha': 'fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520'}]}}]"}

Get Pipeline Status

The pipeline status() command shows the current status, models, and other information on a pipeline.

The following example shows the pipeline imdb_pipeline status before and after it is deployed:

imdb_pipeline.status

<bound method Pipeline.status of {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}>

imdb_pipeline.deploy()
Waiting for deployment - this will take up to 45s ...... ok

imdb_pipeline.status()

{'status': 'Running',
 'details': None,
 'engines': [{'ip': '10.12.1.65',
   'name': 'engine-778b65459-f9mt5',
   'status': 'Running',
   'reason': None,
   'pipeline_statuses': {'pipelines': [{'id': 'imdb-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'embedder-o',
      'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d',
      'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4',
      'status': 'Running'},
     {'name': 'smodel-o',
      'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19',
      'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.12.1.66',
   'name': 'engine-lb-85846c64f8-ggg2t',
   'status': 'Running',
   'reason': None}]}
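
The status dictionary can be inspected programmatically. The following sketch assumes the dictionary shape shown in the output above; model_statuses is a hypothetical helper name, and the sample dictionary is a trimmed copy of that output.

```python
def model_statuses(status):
    """Collect (model name, status) pairs from every engine in a status dict."""
    pairs = []
    for engine in status.get("engines", []):
        models = engine.get("model_statuses", {}).get("models", [])
        for model in models:
            pairs.append((model["name"], model["status"]))
    return pairs


# Trimmed copy of the status output shown above.
status = {
    "status": "Running",
    "engines": [{
        "name": "engine-778b65459-f9mt5",
        "status": "Running",
        "model_statuses": {"models": [
            {"name": "embedder-o", "status": "Running"},
            {"name": "smodel-o", "status": "Running"},
        ]},
    }],
}
print(model_statuses(status))
# [('embedder-o', 'Running'), ('smodel-o', 'Running')]
```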

Anomaly Testing

Anomaly detection allows organizations to set validation parameters. A validation is added to a pipeline to test data based on a specific expression. If the expression is returned as False, this is detected as an anomaly and added to the InferenceResult object’s check_failures array and the pipeline logs.

Anomaly detection consists of the following steps:

  • Set a validation: Add a validation to a pipeline that, when returned False, adds an entry to the InferenceResult object’s check_failures attribute with the expression that caused the failure.
  • Display anomalies: Anomalies detected through a Pipeline’s validation attribute are displayed either through the InferenceResult object’s check_failures attribute, or through the pipeline’s logs.

Set A Validation

Validations are added to a pipeline through the wallaroo.pipeline add_validation method. The following parameters are required:

Parameter | Type | Description
name | String (Required) | The name of the validation.
validation | wallaroo.checks.Expression (Required) | The validation expression. When the expression evaluates to False, the result is added to the InferenceResult object’s check_failures attribute. The validation checks the expression against both the data value and the data type.

Validation expressions take the form of a wallaroo.checks.Expression that compares a value against a condition. For example, if the model housing_model is part of the pipeline steps, then a validation expression may be housing_model.outputs[0][0] < 100.0. If the output of the housing_model inference is less than 100, then the validation is True and no action is taken. For any value of 100 or more, the validation is False, which adds the anomaly to the InferenceResult object’s check_failures attribute.

Note that multiple validations can be created to allow for multiple anomalies detection.
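
The pass/fail logic can be sketched independently of the SDK. In the toy below, plain Python callables stand in for wallaroo.checks.Expression objects, and run_validations is a hypothetical helper; it only illustrates that a validation returning False is what gets reported as a failure.

```python
def run_validations(output_value, validations):
    """Return the names of validations whose expression evaluates to False."""
    return [name for name, check in validations.items()
            if not check(output_value)]


# A callable stands in for the expression housing_model.outputs[0][0] < 100.0.
validations = {"price too high": lambda value: value < 100.0}

print(run_validations(350.47, validations))  # ['price too high']
print(run_validations(42.0, validations))    # []
```

Adding more entries to the dictionary mirrors adding multiple validations to a pipeline: each one is evaluated on its own, and every failing name is reported.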

In the following example, a validation is added to the pipeline that passes for housing prices below 100 (representing $100 million) and flags any price at or above that level as an anomaly. When an inference is performed that triggers a validation failure, the results are displayed in the InferenceResult object’s check_failures attribute.

p = wl.build_pipeline('anomaly-housing-pipeline')
p = p.add_model_step(housing_model)
p = p.add_validation('price too high', housing_model.outputs[0][0] < 100.0)
pipeline = p.deploy()

test_input = {"dense_16_input":[[0.02675675, 0.0, 0.02677953, 0.0, 0.0010046, 0.00951931, 0.14795322, 0.0027145,  2, 0.98536841, 0.02988655, 0.04031725, 0.04298041]]}
response_trigger = pipeline.infer(test_input)
print("\n")
print(response_trigger)

[InferenceResult({'check_failures': [{'False': {'expr': 'anomaly-housing.outputs[0][0] < 100'}}],
 'elapsed': 15110549,
 'model_name': 'anomaly-housing',
 'model_version': 'c3cf1577-6666-48d3-b85c-5d4a6e6567ea',
 'original_data': {'dense_16_input': [[0.02675675,
                                       0.0,
                                       0.02677953,
                                       0.0,
                                       0.0010046,
                                       0.00951931,
                                       0.14795322,
                                       0.0027145,
                                       2,
                                       0.98536841,
                                       0.02988655,
                                       0.04031725,
                                       0.04298041]]},
 'outputs': [{'Float': {'data': [350.46990966796875], 'dim': [1, 1], 'v': 1}}],
 'pipeline_name': 'anomaly-housing-model',
 'time': 1651257043312})]
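
The validation behavior described above can be sketched in plain Python. This is a minimal illustration that does not use the Wallaroo SDK: the run_validations helper, the second validation, and the dictionary layout are hypothetical names chosen for this sketch. It only shows the rule that an expression evaluating to False is recorded as a failure, and that several validations can be checked against the same output.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for pipeline validations: name -> predicate on the model output.
Validation = Callable[[float], bool]


def run_validations(output: float, validations: Dict[str, Validation]) -> List[str]:
    """Return the names of every validation whose expression evaluated to False."""
    return [name for name, check in validations.items() if not check(output)]


validations = {
    "price too high": lambda price: price < 100.0,
    "price negative": lambda price: price >= 0.0,  # second check, added for illustration
}

print(run_validations(350.46990966796875, validations))  # ['price too high']
print(run_validations(42.0, validations))                # []
```

An empty list corresponds to an inference with no entries in check_failures.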

Display Anomalies

Anomalies detected through a Pipeline’s validation attribute are displayed either through the InferenceResult object’s check_failures attribute, or through the pipeline’s logs.

To display an anomaly through the InferenceResult object, display the check_failures attribute.

In the following example, an InferenceResult where the validation failed displays the failure in the check_failures attribute:

test_input = {"dense_16_input":[[0.02675675, 0.0, 0.02677953, 0.0, 0.0010046, 0.00951931, 0.14795322, 0.0027145,  2, 0.98536841, 0.02988655, 0.04031725, 0.04298041]]}
response_trigger = pipeline.infer(test_input)
print("\n")
print(response_trigger)

[InferenceResult({'check_failures': [{'False': {'expr': 'anomaly-housing-model.outputs[0][0] < '
                                       '100'}}],
 'elapsed': 12196540,
 'model_name': 'anomaly-housing-model',
 'model_version': 'a3b1c29f-c827-4aad-817d-485de464d59b',
 'original_data': {'dense_16_input': [[0.02675675,
                                       0.0,
                                       0.02677953,
                                       0.0,
                                       0.0010046,
                                       0.00951931,
                                       0.14795322,
                                       0.0027145,
                                       2,
                                       0.98536841,
                                       0.02988655,
                                       0.04031725,
                                       0.04298041]]},
 'outputs': [{'Float': {'data': [350.46990966796875], 'dim': [1, 1], 'v': 1}}],
 'pipeline_name': 'anomaly-housing-pipeline',
 'shadow_data': {},
 'time': 1667416852255})]

The other method is to use the pipeline.logs() method with the parameter valid=False, which isolates the logs where the validation returned False.

In this example, a set of logs where the validation returned as False will be displayed:

pipeline.logs(valid=False)
Timestamp | Output | Input | Anomalies
2022-02-Nov 19:20:52 | [array([[350.46990967]])] | [[0.02675675, 0.0, 0.02677953, 0.0, 0.0010046, 0.00951931, 0.14795322, 0.0027145, 2, 0.98536841, 0.02988655, 0.04031725, 0.04298041]] | 1

A/B Testing

A/B testing is a method that provides the ability to test competing ML models for performance, accuracy or other useful benchmarks. Different models are added to the same pipeline steps as follows:

  • Control or Champion model: The model currently used for inferences.
  • Challenger model(s): The model or set of models compared to the champion model.

A/B testing splits a portion of the inference requests between the champion model and the one or more challengers through the add_random_split method. This method splits the inferences submitted to the model through a randomly weighted step.

Each model receives inputs that are approximately proportional to the weight it is assigned. For example, with two models having weights 1 and 1, each will receive roughly equal amounts of inference inputs. If the weights were changed to 1 and 2, the models would receive roughly 33% and 67% respectively instead.

When choosing the model to use, a random number between 0.0 and 1.0 is generated. The weighted inputs are mapped to that range, and the random input is then used to select the model to use. For example, for the two-models equal-weight case, a random key of 0.4 would route to the first model, 0.6 would route to the second.
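
The selection rule above can be sketched as follows. This is a minimal illustration in plain Python, not the Wallaroo implementation: the pick_model function and the model names are hypothetical, and the choice that a key falling exactly on a boundary goes to the next model is an assumption.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import List, Optional, Tuple


def pick_model(weighted_models: List[Tuple[float, str]], key: Optional[float] = None) -> str:
    """Map a key in [0.0, 1.0] onto the cumulative, normalized weights."""
    if key is None:
        key = random.random()  # no key supplied: use a random number
    weights = [weight for weight, _ in weighted_models]
    total = sum(weights)
    # Upper boundary of each model's slice of the 0.0-1.0 range.
    boundaries = [running / total for running in accumulate(weights)]
    index = min(bisect_right(boundaries, key), len(weighted_models) - 1)
    return weighted_models[index][1]


equal = [(1, "first"), (1, "second")]
print(pick_model(equal, 0.4))  # first
print(pick_model(equal, 0.6))  # second
```

With weights of 2 and 1 the boundary moves to roughly 0.667, so a key of 0.6 would select the first model.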

Add Random Split

A random split step can be added to a pipeline through the add_random_split method.

The following parameters are used when adding a random split step to a pipeline:

Parameter | Type | Description
champion_weight | Float (Required) | The weight for the champion model.
champion_model | Wallaroo.Model (Required) | The uploaded champion model.
challenger_weight | Float (Required) | The weight of the challenger model.
challenger_model | Wallaroo.Model (Required) | The uploaded challenger model.
hash_key | String (Optional) | A key used instead of a random number for model selection. This must be between 0.0 and 1.0.

Note that multiple challenger models with different weights can be added to the random split step.

add_random_split([(champion_weight, champion_model), (challenger_weight, challenger_model),  (challenger_weight2, challenger_model2),...], hash_key)

In this example, a pipeline will be built with a 2:1 weighted ratio between the champion and a single challenger model.

experiment_pipeline = (wl.build_pipeline("randomsplitpipeline-demo")
                       .add_random_split([(2, control), (1, challenger)]))

The results for a series of single inferences are displayed to show the random weighted split between the two models in action:

results = []
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))

for result in results:
    print(result[0].model())
    print(result[0].data())

('aloha-control', 'ff81f634-8fb4-4a62-b873-93b02eb86ab4')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

('aloha-control', 'ff81f634-8fb4-4a62-b873-93b02eb86ab4')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

('aloha-challenger', '87fdfe08-170e-4231-a0b9-543728d6fc57')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

('aloha-challenger', '87fdfe08-170e-4231-a0b9-543728d6fc57')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

('aloha-challenger', '87fdfe08-170e-4231-a0b9-543728d6fc57')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

Replace With Random Split

If a pipeline already has steps, as detailed in Add a Step to a Pipeline, a step can be replaced with a random split through the replace_with_random_split method.

The following parameters are used when replacing a pipeline step with a random split step:

Parameter | Type | Description
index | Integer (Required) | The pipeline step being replaced.
champion_weight | Float (Required) | The weight for the champion model.
champion_model | Wallaroo.Model (Required) | The uploaded champion model.
challenger_weight | Float (Required) | The weight of the challenger model.
challenger_model | Wallaroo.Model (Required) | The uploaded challenger model.
hash_key | String (Optional) | A key used instead of a random number for model selection. This must be between 0.0 and 1.0.

Note that one or more challenger models can be added for the random split step:

replace_with_random_split(index, [(champion_weight, champion_model), (challenger_weight, challenger_model), (challenger_weight2, challenger_model2),...], hash_key)

A/B Testing Logs

A/B Testing log entries contain the model used for the inference in the column out._model_split.

logs = experiment_pipeline.logs(limit=5)
display(logs.loc[:,['time', 'out._model_split', 'out.main']])
 | time | out._model_split | out.main
0 | 2023-03-03 19:08:35.653 | [{"name":"aloha-control","version":"89389786-0c17-4214-938c-aa22dd28359f","sha":"fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520"}] | [0.9999754]
1 | 2023-03-03 19:08:35.702 | [{"name":"aloha-challenger","version":"3acd3835-be72-42c4-bcae-84368f416998","sha":"223d26869d24976942f53ccb40b432e8b7c39f9ffcf1f719f3929d7595bceaf3"}] | [0.9999727]
2 | 2023-03-03 19:08:35.753 | [{"name":"aloha-challenger","version":"3acd3835-be72-42c4-bcae-84368f416998","sha":"223d26869d24976942f53ccb40b432e8b7c39f9ffcf1f719f3929d7595bceaf3"}] | [0.6606688]
3 | 2023-03-03 19:08:35.799 | [{"name":"aloha-control","version":"89389786-0c17-4214-938c-aa22dd28359f","sha":"fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520"}] | [0.9998954]
4 | 2023-03-03 19:08:35.846 | [{"name":"aloha-control","version":"89389786-0c17-4214-938c-aa22dd28359f","sha":"fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520"}] | [0.99999803]
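
One way to check how the split behaved is to tally the model names recorded in out._model_split. The sketch below is a minimal example using only the standard library. It assumes each cell is a list holding one JSON string, as the sample rows above suggest. The count_models helper is a hypothetical name, and the sample values are trimmed copies of the rows above with the sha field omitted.

```python
import json
from collections import Counter
from typing import List

# Trimmed copies of the out._model_split values shown above (sha omitted for brevity).
model_split_column: List[List[str]] = [
    ['{"name":"aloha-control","version":"89389786-0c17-4214-938c-aa22dd28359f"}'],
    ['{"name":"aloha-challenger","version":"3acd3835-be72-42c4-bcae-84368f416998"}'],
    ['{"name":"aloha-challenger","version":"3acd3835-be72-42c4-bcae-84368f416998"}'],
    ['{"name":"aloha-control","version":"89389786-0c17-4214-938c-aa22dd28359f"}'],
    ['{"name":"aloha-control","version":"89389786-0c17-4214-938c-aa22dd28359f"}'],
]


def count_models(column: List[List[str]]) -> Counter:
    """Count how many inferences each model served."""
    counts: Counter = Counter()
    for cell in column:
        model_info = json.loads(cell[0])  # assumed: one JSON string per cell
        counts[model_info["name"]] += 1
    return counts


print(count_models(model_split_column))
```

With a real log DataFrame, the same helper could be applied to the column values, for example logs['out._model_split'].tolist(), provided the cells have the assumed shape.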

Pipeline Shadow Deployments

Wallaroo provides a method of testing the same data against two different models or sets of models at the same time through shadow deployments, otherwise known as parallel deployments or A/B tests. This allows data to be submitted to a pipeline with inferences running on several different sets of models. Typically this is performed with a model that is known to provide accurate results (the champion) and a model or set of models being tested to see whether it provides more accurate or faster responses, depending on the criteria (the challengers). Multiple challengers can be tested against a single champion to determine which is “better” based on the organization’s criteria.

As described in the Wallaroo blog post The What, Why, and How of Model A/B Testing:

In data science, A/B tests can also be used to choose between two models in production, by measuring which model performs better in the real world. In this formulation, the control is often an existing model that is currently in production, sometimes called the champion. The treatment is a new model being considered to replace the old one. This new model is sometimes called the challenger….
Keep in mind that in machine learning, the terms experiments and trials also often refer to the process of finding a training configuration that works best for the problem at hand (this is sometimes called hyperparameter optimization).

When a shadow deployment is created, only the inference from the champion is returned in the InferenceResult Object data, while the result data for the shadow deployments is stored in the InferenceResult Object shadow_data.

Create Shadow Deployment

Create a parallel or shadow deployment for a pipeline with the pipeline.add_shadow_deploy(champion, challengers[]) method, where the champion is a Wallaroo Model object, and challengers[] is one or more Wallaroo Model objects.

Each inference request sent to the pipeline is sent to all the models. The prediction from the champion is returned by the pipeline, while the predictions from the challengers are not part of the standard output; they are stored in the shadow_data attribute and in the logs for later comparison.

In this example, a shadow deployment is created with the champion versus two challenger models.

champion = wl.upload_model(champion_model_name, champion_model_file).configure()
model2 = wl.upload_model(shadow_model_01_name, shadow_model_01_file).configure()
model3 = wl.upload_model(shadow_model_02_name, shadow_model_02_file).configure()
   
pipeline.add_shadow_deploy(champion, [model2, model3])
pipeline.deploy()
  
name | cc-shadow
created | 2022-08-04 20:06:55.102203+00:00
last_updated | 2022-08-04 20:37:28.785947+00:00
deployed | True
tags |
steps | ccfraud-lstm

Shadow Deploy Outputs

Model outputs are listed by column based on the model’s outputs. The output data is set by the term out, followed by the name of the model. For the default model, this is out.{variable_name}, while the shadow deployed models are in the format out_{model name}.variable, where {model name} is the name of the shadow deployed model.

sample_data_file = './smoke_test.df.json'
response = pipeline.infer_from_file(sample_data_file)
 | time | in.tensor | out.dense_1 | check_failures | out_ccfraudrf.variable | out_ccfraudxgb.variable
0 | 2023-03-03 17:35:28.859 | [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756] | [0.0014974177] | 0 | [1.0] | [0.0005066991]

Retrieve Shadow Deployment Logs

Shadow deploy results are part of the Pipeline.logs() method. The output data is set by the term out, followed by the name of the model. For the default model, this is out.dense_1, while the shadow deployed models are in the format out_{model name}.variable, where {model name} is the name of the shadow deployed model.

logs = pipeline.logs()
display(logs)
 | time | in.tensor | out.dense_1 | check_failures | out_ccfraudrf.variable | out_ccfraudxgb.variable
0 | 2023-03-03 17:35:28.859 | [1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756] | [0.0014974177] | 0 | [1.0] | [0.0005066991]
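
A log row like the one above can be used to compare each challenger against the champion. The sketch below is a minimal example in plain Python. The row dictionary mirrors the output columns shown above, while the shadow_deltas helper and the choice of a simple difference as the comparison are assumptions made for illustration.

```python
from typing import Dict, List

# Output columns copied from the sample log row above.
row: Dict[str, List[float]] = {
    "out.dense_1": [0.0014974177],             # champion output
    "out_ccfraudrf.variable": [1.0],           # shadow model ccfraudrf
    "out_ccfraudxgb.variable": [0.0005066991], # shadow model ccfraudxgb
}


def shadow_deltas(row: Dict[str, List[float]], champion_column: str) -> Dict[str, float]:
    """Return challenger output minus champion output, keyed by shadow model name."""
    champion_value = row[champion_column][0]
    deltas: Dict[str, float] = {}
    for column, values in row.items():
        # Shadow outputs follow the out_{model name}.variable naming convention.
        if column.startswith("out_"):
            model_name = column[len("out_"):].split(".", 1)[0]
            deltas[model_name] = values[0] - champion_value
    return deltas


for model_name, delta in shadow_deltas(row, "out.dense_1").items():
    print(f"{model_name}: {delta:+.6f}")
```

In this sample the ccfraudrf challenger differs sharply from the champion, while ccfraudxgb is close to it.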

Get Pipeline URL Endpoint

The Pipeline URL Endpoint or the Pipeline Deploy URL is used to submit data to a pipeline to use for an inference. This is done through the pipeline _deployment._url() method.

In this example, the pipeline URL endpoint for the pipeline ccfraud_pipeline will be displayed:

ccfraud_pipeline._deployment._url()

'http://engine-lb.ccfraud-pipeline-1:29502/pipelines/ccfraud-pipeline'

2.6.2 - Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration

Details on pipeline configurations and settings

Pipeline deployment configurations allow tailoring of a pipeline’s resources to match an organization’s and model’s requirements. Pipelines may require more memory, CPU cores, or GPUs to run all of their steps efficiently. Pipeline deployment configurations also allow for multiple replicas of a model in a pipeline to provide scalability.

Create Pipeline Configuration

Setting a pipeline deployment configuration follows this process:

  1. Pipeline deployment configurations are created through the wallaroo deployment_config.DeploymentConfigBuilder() class (https://docs.wallaroo.ai/20230201/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-reference-guide/deployment_config/#DeploymentConfigBuilder).
  2. Once the configuration options are set, the pipeline deployment configuration is built with the deployment_config.build() method.
  3. The pipeline deployment configuration is then applied when the pipeline is deployed.

The following example shows a pipeline deployment configuration with 1 replica, 1 cpu, and 2Gi of memory set to be allocated to the pipeline.

deployment_config = (wallaroo.DeploymentConfigBuilder()
                     .replica_count(1)
                     .cpus(1)
                     .memory("2Gi")
                     .build())

pipeline.deploy(deployment_config = deployment_config)

Pipeline resources can be configured with autoscaling. Autoscaling allows the user to define how many engines a pipeline starts with, the minimum amount of engines a pipeline uses, and the maximum amount of engines a pipeline can scale to. The pipeline scales up and down based on the average CPU utilization across the engines in a given pipeline as the user’s workload increases and decreases.
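
To make the scaling behavior concrete, the sketch below computes a replica count from average CPU utilization. The source does not state the exact rule Wallaroo applies, so this example assumes the Kubernetes Horizontal Pod Autoscaler formula, where the desired count is the ceiling of the current replicas multiplied by the ratio of current to target utilization, clamped to the configured minimum and maximum. The desired_replicas function and its parameter names are hypothetical.

```python
import math


def desired_replicas(current: int, avg_cpu_percent: float, target_cpu_percent: float,
                     minimum: int, maximum: int) -> int:
    """Assumed HPA-style rule: scale in proportion to CPU load, then clamp."""
    raw = math.ceil(current * avg_cpu_percent / target_cpu_percent)
    return max(minimum, min(maximum, raw))


# Two replicas running at 90% against a 50% target scale up to four.
print(desired_replicas(current=2, avg_cpu_percent=90, target_cpu_percent=50, minimum=2, maximum=5))
# Four replicas running at 20% against a 50% target scale down to the minimum of two.
print(desired_replicas(current=4, avg_cpu_percent=20, target_cpu_percent=50, minimum=2, maximum=5))
```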

Pipeline Resource Configurations

Pipeline deployment configurations deal with two major components:

  • Native Runtimes: Models that are deployed “as is” with the Wallaroo engine (Onnx, etc).
  • Containerized Runtimes: Models that are packaged into a container then deployed as a container with the Wallaroo engine (MLFlow, etc).

These configurations can be mixed: both native runtimes and containerized runtimes can be deployed to the same pipeline, with resources allocated to each runtime in different configurations.

The following resources configurations are available through the wallaroo.deployment_config object.

GPU and CPU Allocation

CPUs are allocated in fractions of total CPU power similar to the Kubernetes CPU definitions. cpus(0.25), cpus(1.0), etc are valid values.

GPUs can only be allocated in whole integer units from the GPU enabled nodepools. gpus(1), gpus(2), etc are valid values, while gpus(0.25) is not.

Organizations should be aware of how many GPUs are allocated to the cluster. If all GPUs are already allocated to other pipelines, or if there are not enough GPUs to fulfill the request, the pipeline deployment will fail and return an error message.
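
The allocation rules above can be expressed as a small pre-deployment check. This is a minimal sketch in plain Python and is not part of the Wallaroo SDK: validate_cpus, validate_gpus, and gpus_fit are hypothetical helpers, and the cluster GPU count would need to come from the organization's own inventory.

```python
from typing import List, Union


def validate_cpus(value: Union[int, float]) -> float:
    """CPUs may be fractional, for example 0.25 or 1.0."""
    if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
        raise ValueError(f"invalid CPU allocation: {value!r}")
    return float(value)


def validate_gpus(value: int) -> int:
    """GPUs are allocated only in whole units."""
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError(f"invalid GPU allocation: {value!r}")
    return value


def gpus_fit(requested: List[int], available: int) -> bool:
    """True when the summed GPU requests do not exceed the GPUs still available."""
    return sum(validate_gpus(count) for count in requested) <= available


print(validate_cpus(0.25))              # 0.25
print(gpus_fit([1, 1], available=2))    # True
print(gpus_fit([1, 2], available=2))    # False
```

A check of this kind only predicts whether a request could be satisfied; the deployment itself still reports the authoritative error.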

GPU Support

Wallaroo 2023.2.1 and above supports Kubernetes nodepools with Nvidia Cuda GPUs.

See the Create GPU Nodepools for Kubernetes Clusters guide for instructions on adding GPU enabled nodepools to a Kubernetes cluster.

Native Runtime Configuration Methods

Method | Parameters | Description | Enterprise Only Feature
replica_count | (count: int) | The number of replicas of the pipeline to deploy. Deploying multiple replicas of the same models increases inference throughput through parallelization.
replica_autoscale_min_max | (maximum: int, minimum: int = 0) | Allows replicas to be scaled from the minimum (0 by default) up to the maximum number of replicas. This allows pipelines to spin up additional replicas as more resources are required, then spin them back down to save on resources and costs.
autoscale_cpu_utilization | (cpu_utilization_percentage: int) | Sets the average CPU percentage metric for when to load or unload another replica.
disable_autoscale | | Disables autoscaling in the deployment configuration.
cpus | (core_count: float) | Sets the number or fraction of CPUs to use for the pipeline, for example: 0.25, 1, 1.5, etc. The units are similar to the Kubernetes CPU definitions.
gpus | (core_count: int) | Sets the number of GPUs to allocate for native runtimes. GPUs are only allocated in whole units, not as fractions. Organizations should be aware of the total number of GPUs available to the cluster, and monitor which pipeline deployment configurations have GPUs allocated to ensure they do not run out. If there are not enough GPUs to allocate to a pipeline deployment configuration, an error message is returned when the pipeline is deployed. If gpus is called, then deployment_label must also be called and must match the GPU nodepool for the Kubernetes cluster hosting the Wallaroo instance.
memory | (memory_spec: str) | Sets the amount of RAM to allocate to the pipeline. The memory_spec string is in the format “{size as number}{unit value}”. The accepted unit values are:
  • KiB (for KiloBytes)
  • MiB (for MegaBytes)
  • GiB (for GigaBytes)
  • TiB (for TeraBytes)
The values are similar to the Kubernetes memory resource units format.
 
lb_cpus | (core_count: float) | Sets the number or fraction of CPUs to use for the pipeline’s load balancer, for example: 0.25, 1, 1.5, etc. The units are similar to the Kubernetes CPU definitions.
lb_memory | (memory_spec: str) | Sets the amount of RAM to allocate to the pipeline’s load balancer. The memory_spec string is in the format “{size as number}{unit value}”. The accepted unit values are:
  • KiB (for KiloBytes)
  • MiB (for MegaBytes)
  • GiB (for GigaBytes)
  • TiB (for TeraBytes)
The values are similar to the Kubernetes memory resource units format.
 
deployment_label | | Label used for Kubernetes labels. Required if gpus are set, and must match the GPU nodepool label.
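
The “{size as number}{unit value}” format used by memory and lb_memory can be illustrated with a small parser. This is a minimal sketch, not the SDK's own parsing logic. The table above lists units such as GiB while the examples in this guide pass values such as "2Gi", so the sketch accepts both spellings as an assumption. The memory_spec_to_bytes helper is a hypothetical name.

```python
import re

# Binary multipliers, following the Kubernetes memory unit convention.
_UNIT_BYTES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3, "Ti": 1024 ** 4}
# Assumption: an optional trailing "B" is tolerated, so "1Gi" and "1GiB" are equivalent.
_SPEC_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti)B?$")


def memory_spec_to_bytes(memory_spec: str) -> int:
    """Convert a spec such as '2Gi' into a byte count."""
    match = _SPEC_PATTERN.match(memory_spec.strip())
    if match is None:
        raise ValueError(f"unrecognized memory spec: {memory_spec!r}")
    size, unit = match.groups()
    return int(float(size) * _UNIT_BYTES[unit])


print(memory_spec_to_bytes("2Gi"))     # 2147483648
print(memory_spec_to_bytes("512MiB"))  # 536870912
```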

Containerized Runtime Configuration Methods

Method | Parameters | Description | Enterprise Only Feature
sidekick_cpus | (model: wallaroo.model.Model, core_count: float) | Sets the number of CPUs to be used for the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. The parameters are as follows:
  • Model model: The sidekick model to configure.
  • float core_count: Number of CPU cores to use in this sidekick.
 
sidekick_memory | (model: wallaroo.model.Model, memory_spec: str) | Sets the memory available for the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. The parameters are as follows:
  • Model model: The sidekick model to configure.
  • memory_spec: The amount of memory to allocate, given as memory unit values. The accepted unit values are:
    • KiB (for KiloBytes)
    • MiB (for MegaBytes)
    • GiB (for GigaBytes)
    • TiB (for TeraBytes)
    The values are similar to the Kubernetes memory resource units format.
 
sidekick_env | (model: wallaroo.model.Model, environment: Dict[str, str]) | Environment variables submitted to the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. These are used specifically for containerized models that have environment variables that affect their performance.
sidekick_gpus | (model: wallaroo.model.Model, core_count: int) | Sets the number of GPUs to allocate for containerized runtimes. GPUs are only allocated in whole units, not as fractions. Organizations should be aware of the total number of GPUs available to the cluster, and monitor which pipeline deployment configurations have GPUs allocated to ensure they do not run out. If there are not enough GPUs to allocate to a pipeline deployment configuration, an error message is returned when the pipeline is deployed. If called, then deployment_label must also be called and must match the GPU nodepool for the Kubernetes cluster hosting the Wallaroo instance.

Examples

Native Runtime Deployment

The following sets the native runtime deployment to one quarter of a CPU with 1 Gi of RAM:

deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25).memory('1Gi') \
    .build()

This example sets the replica count to 1, then sets autoscaling to vary between 2 and 5 replicas depending on need, with 1 CPU and 1 Gi of RAM allocated per replica.

deploy_config = (wallaroo.DeploymentConfigBuilder()
                        .replica_count(1)
                        .replica_autoscale_min_max(minimum=2, maximum=5)
                        .cpus(1)
                        .memory("1Gi")
                        .build()
                    )

The following configuration allocates 1 GPU to the pipeline for native runtimes.

deployment_config = (DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .gpus(1)
                     .deployment_label('doc-gpu-label:true')
                     .build())

Containerized Runtime Deployment

The following configuration allocates 0.25 CPU and 1Gi RAM to the containerized runtime sm_model, and passes that runtime environmental variables used for timeout settings.

deployment_config = (DeploymentConfigBuilder()
                     .sidekick_cpus(sm_model, 0.25)
                     .sidekick_memory(sm_model, '1Gi')
                     .sidekick_env(sm_model,
                         {"GUNICORN_CMD_ARGS":
                          "--timeout=188 --workers=1"}
                     )
                     .build())

This example shows allocating 1 GPU to the containerized runtime model sm_model.

deployment_config = (DeploymentConfigBuilder()
                     .sidekick_gpus(sm_model, 1)
                     .deployment_label('doc-gpu-label:true')
                     .sidekick_memory(sm_model, '1Gi')
                     .build())

Mixed Environments

The following configuration allocates 1 GPU to the pipeline for native runtimes and another GPU to the containerized runtime sm_model, for a total of 2 GPUs allocated to the pipeline.

deployment_config = (DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .gpus(1)
                     .sidekick_gpus(sm_model, 1)
                     .deployment_label('doc-gpu-label:true')
                     .build())

2.6.3 - Wallaroo SDK Essentials Guide: Pipeline Log Management

How to create and manage Wallaroo Pipelines through the Wallaroo SDK

Pipelines have their own set of log files that are retrieved and analyzed as needed through either:

  • The Pipeline logs method (returns either a DataFrame or Apache Arrow).
  • The Pipeline export_logs method (saves either a DataFrame file in JSON format, or an Apache Arrow file).

Get Pipeline Logs

Pipeline logs are retrieved through the Pipeline logs method. By default, logs are returned as a DataFrame in reverse chronological order of insertion, with the most recent records displayed first.

Pipeline logs are segmented by pipeline version. For example, adding a new model step to a pipeline or swapping out the model in a pipeline step generates a new pipeline version. Requests to the logs method return the logs for the pipeline version that matches the request parameters. To request logs of a specific pipeline version, specify the start_datetime and end_datetime parameters based on the pipeline version logs requested.

This command takes the following parameters.

Parameter | Type | Description
limit | Int (Optional) (Default: 100) | Limits how many log records to display. If there are more pipeline logs than are being displayed, the warning message Pipeline log record limit exceeded is displayed. For example, if 100 log records were requested and there are a total of 1,000, the warning message is displayed.
start_datetime and end_datetime | DateTime (Optional) | Limits logs to all logs between the start_datetime and end_datetime DateTime parameters. These comply with the Python datetime library for formats such as:
  • datetime.datetime.now()
  • datetime.datetime(2023, 3, 28, 14, 25, 51, 660058, tzinfo=tzutc()) (March 28, 2023 14:25:51:660058 UTC time zone)

Both parameters must be provided. Submitting a logs() request with only start_datetime or end_datetime will generate an exception.
If start_datetime and end_datetime are provided, even alongside other parameters, the records are returned in chronological order, with the oldest record displayed first.
dataset | List[String] (Optional) | The datasets to be returned. The datasets available are:
  • *: Default. This translates to ["time", "in", "out", "check_failures"].
  • time: The DateTime of the inference request.
  • in: All inputs listed as in_{variable_name}.
  • out: All outputs listed as out_variable_name.
  • check_failures: Flags whether an Anomaly or Validation Check was triggered. 0 indicates no checks were triggered, 1 or greater indicates a check was triggered.
  • meta: Returns metadata. IMPORTANT NOTE: See Metadata Requests Restrictions for specifications on how this dataset can be used with other datasets.
    • Returns in the metadata.elapsed field:
      • A list of time in nanoseconds for:
        • The time to serialize the input.
        • How long each step took.
    • Returns in the metadata.last_model field:
      • A dict with each Python step as:
        • model_name: The name of the model in the pipeline step.
        • model_sha : The sha hash of the model in the pipeline step.
    • Returns in the metadata.pipeline_version field:
      • The pipeline version as a UUID value.
  • metadata.elapsed: IMPORTANT NOTE: See Metadata Requests Restrictions for specifications on how this dataset can be used with other datasets.
    • Returns in the metadata.elapsed field:
      • A list of time in nanoseconds for:
        • The time to serialize the input.
        • How long each step took.
dataset_exclude | List[String] (Optional) | Exclude specified datasets.
dataset_separator | Sequence[[String], string] (Optional) | If set to “.”, the returned dataset will be flattened.
arrow | Boolean (Optional) (Default: False) | If arrow is set to True, then the logs are returned as an Apache Arrow table. If arrow=False, then the logs are returned as a pandas DataFrame.

All of the parameters can be used together, but start_datetime and end_datetime must be combined; if one is used, then so must the other. If start_datetime and end_datetime are used with any other parameter, then the log results are in chronological order of record insertion.
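
The pairing rule for start_datetime and end_datetime can be checked before a request is made. The sketch below is a minimal example using only the standard library. The log_window helper is a hypothetical name, and the additional check that the start precedes the end is an assumption rather than a documented SDK behavior.

```python
import datetime
from typing import Optional, Tuple


def log_window(start: Optional[datetime.datetime],
               end: Optional[datetime.datetime]
               ) -> Optional[Tuple[datetime.datetime, datetime.datetime]]:
    """Return the (start, end) pair, or None when no window is requested."""
    if (start is None) != (end is None):
        raise ValueError("start_datetime and end_datetime must be provided together")
    if start is None:
        return None
    if start > end:  # assumed sanity check, not taken from the SDK
        raise ValueError("start_datetime must not be later than end_datetime")
    return start, end


# Timezone-aware values covering the last hour.
date_end = datetime.datetime.now(datetime.timezone.utc)
date_start = date_end - datetime.timedelta(hours=1)
print(log_window(date_start, date_end))
```

The validated values could then be passed as the start_datetime and end_datetime parameters of the logs method.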

Log requests are limited to around 100k in size. For requests greater than 100k in size, use the Pipeline export_logs() method.

Logs include the following standard datasets:

Parameter | Type | Description
time | DateTime | The DateTime the inference request was made.
in.{variable} | | The input(s) for the inference request. Each input is listed as in.{variable_name}. For example, in.text_input, in.square_foot, in.number_of_rooms, etc.
out | | The output(s) for the inference request, based on the ML model’s outputs. Each output is listed as out.{variable_name}. For example, out.maximum_offer_price, out.minimum_asking_price, out.trade_in_value, etc.
check_failures | Int | How many validation checks were triggered by the inference. For more information, see Anomaly Testing.
out_{model_name}.{variable} | | Only returned when using Pipeline Shadow Deployments. For each model in the shadow deploy step, the output is listed in the format out_{model_name}.{variable}. For example, out_shadow_model_xgb.maximum_offer_price, out_shadow_model_xgb.minimum_asking_price, out_shadow_model_xgb.trade_in_value, etc.
out._model_split | | Only returned when using A/B Testing. Displays the model_name, model_version, and model_sha of the model used for the inference.
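
The column naming conventions in the table above can be used to sort log columns by role. The sketch below is a minimal example in plain Python. The classify_column helper and its category labels are hypothetical names chosen for illustration.

```python
def classify_column(name: str) -> str:
    """Classify a log column name using the naming conventions listed above."""
    if name in ("time", "check_failures"):
        return name
    if name == "out._model_split":
        return "ab_test_model"      # A/B testing: which model served the inference
    if name.startswith("in."):
        return "input"
    if name.startswith("out."):
        return "output"
    if name.startswith("out_") and "." in name:
        return "shadow_output"      # out_{model_name}.{variable}
    return "other"


columns = ["time", "in.tensor", "out.dense_1", "check_failures",
           "out_ccfraudrf.variable", "out._model_split"]
for column in columns:
    print(column, "->", classify_column(column))
```

The exact match for out._model_split is tested before the out. prefix so that it is not reported as a regular model output.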

In this example, the logs for the pipeline mainpipeline are requested between two sample dates. In this case, all of the time column fields are the same since the inference request was sent as a batch.

logs = mainpipeline.logs(start_datetime=date_start, end_datetime=date_end)

display(len(logs))
display(logs)

538
 | time | in.tensor | out.variable | check_failures
0 | 2023-04-24 18:09:33.970 | [4.0, 2.5, 2900.0, 5505.0, 2.0, 0.0, 0.0, 3.0, 8.0, 2900.0, 0.0, 47.6063, -122.02, 2970.0, 5251.0, 12.0, 0.0, 0.0] | [718013.75] | 0
1 | 2023-04-24 18:09:33.970 | [2.0, 2.5, 2170.0, 6361.0, 1.0, 0.0, 2.0, 3.0, 8.0, 2170.0, 0.0, 47.7109, -122.017, 2310.0, 7419.0, 6.0, 0.0, 0.0] | [615094.56] | 0
2 | 2023-04-24 18:09:33.970 | [3.0, 2.5, 1300.0, 812.0, 2.0, 0.0, 0.0, 3.0, 8.0, 880.0, 420.0, 47.5893, -122.317, 1300.0, 824.0, 6.0, 0.0, 0.0] | [448627.72] | 0
3 | 2023-04-24 18:09:33.970 | [4.0, 2.5, 2500.0, 8540.0, 2.0, 0.0, 0.0, 3.0, 9.0, 2500.0, 0.0, 47.5759, -121.994, 2560.0, 8475.0, 24.0, 0.0, 0.0] | [758714.2] | 0
4 | 2023-04-24 18:09:33.970 | [3.0, 1.75, 2200.0, 11520.0, 1.0, 0.0, 0.0, 4.0, 7.0, 2200.0, 0.0, 47.7659, -122.341, 1690.0, 8038.0, 62.0, 0.0, 0.0] | [513264.7] | 0
533 | 2023-04-24 18:09:33.970 | [3.0, 2.5, 1750.0, 7208.0, 2.0, 0.0, 0.0, 3.0, 8.0, 1750.0, 0.0, 47.4315, -122.192, 2050.0, 7524.0, 20.0, 0.0, 0.0] | [311909.6] | 0
534 | 2023-04-24 18:09:33.970 | [5.0, 1.75, 2330.0, 6450.0, 1.0, 0.0, 1.0, 3.0, 8.0, 1330.0, 1000.0, 47.4959, -122.367, 2330.0, 8258.0, 57.0, 0.0, 0.0] | [448720.28] | 0
535 | 2023-04-24 18:09:33.970 | [4.0, 3.5, 4460.0, 16271.0, 2.0, 0.0, 2.0, 3.0, 11.0, 4460.0, 0.0, 47.5862, -121.97, 4540.0, 17122.0, 13.0, 0.0, 0.0] | [1208638.0] | 0
536 | 2023-04-24 18:09:33.970 | [3.0, 2.75, 3010.0, 1842.0, 2.0, 0.0, 0.0, 3.0, 9.0, 3010.0, 0.0, 47.5836, -121.994, 2950.0, 4200.0, 3.0, 0.0, 0.0] | [795841.06] | 0
537 | 2023-04-24 18:09:33.970 | [2.0, 1.5, 1780.0, 4750.0, 1.0, 0.0, 0.0, 4.0, 7.0, 1080.0, 700.0, 47.6859, -122.395, 1690.0, 5962.0, 67.0, 0.0, 0.0] | [558463.3] | 0

538 rows × 4 columns
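
The rule that start_datetime and end_datetime must be supplied together can be checked before a log request is sent. The following is a minimal sketch in plain Python. The helper name log_window is hypothetical and not part of the Wallaroo SDK, and the ordering check is an added sanity test rather than documented SDK behavior.

```python
import datetime
from typing import Optional


def log_window(start: Optional[datetime.datetime] = None,
               end: Optional[datetime.datetime] = None) -> dict:
    """Build keyword arguments for a log request (hypothetical helper).

    Both bounds must be given together, mirroring the rule that
    start_datetime and end_datetime must be combined.
    """
    if (start is None) != (end is None):
        raise ValueError("start_datetime and end_datetime must be used together")
    if start is None:
        return {}
    if start > end:
        # Added sanity check, not a documented SDK rule.
        raise ValueError("start_datetime must not be later than end_datetime")
    return {"start_datetime": start, "end_datetime": end}


utc = datetime.timezone.utc
date_start = datetime.datetime(2023, 4, 24, 18, 0, 0, tzinfo=utc)
date_end = datetime.datetime(2023, 4, 24, 19, 0, 0, tzinfo=utc)
window = log_window(date_start, date_end)
# Assumed usage with a deployed pipeline: logs = mainpipeline.logs(**window)
```

Building the arguments as a dict keeps the pairing rule in one place, so a request is never sent with only one bound.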

Metadata Requests Restrictions

The following restrictions are in place when requesting the datasets metadata or metadata.elapsed.

Standard Pipeline Steps

For the following Pipeline steps, metadata or metadata.elapsed must be requested with the * parameter. For example:

result = mainpipeline.infer(normal_input, dataset=["*", "metadata.elapsed"])

Affected pipeline steps:

  • add_model_step
  • replace_with_model_step

Testing Pipeline Steps

For the following Pipeline steps, metadata or metadata.elapsed cannot be included with the * parameter. For example:

result = mainpipeline.infer(normal_input, dataset=["metadata.elapsed"])

Affected pipeline steps:

  • add_random_split
  • replace_with_random_split
  • add_shadow_deploy
  • replace_with_shadow_deploy
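
The two restrictions above can be expressed as a small lookup. This is a sketch only: the function name elapsed_dataset is hypothetical, and the step names are simply the pipeline methods listed above.

```python
# Step methods taken from the two lists above.
STANDARD_STEPS = {"add_model_step", "replace_with_model_step"}
TESTING_STEPS = {
    "add_random_split",
    "replace_with_random_split",
    "add_shadow_deploy",
    "replace_with_shadow_deploy",
}


def elapsed_dataset(step_method: str) -> list:
    """Return a dataset list that requests metadata.elapsed for a step type."""
    if step_method in STANDARD_STEPS:
        # Standard steps: metadata.elapsed must be requested together with "*".
        return ["*", "metadata.elapsed"]
    if step_method in TESTING_STEPS:
        # Testing steps: metadata.elapsed must be requested without "*".
        return ["metadata.elapsed"]
    raise ValueError(f"unknown pipeline step method: {step_method}")


standard = elapsed_dataset("add_model_step")
testing = elapsed_dataset("add_shadow_deploy")
# Assumed usage: result = mainpipeline.infer(normal_input, dataset=standard)
```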

Export Pipeline Logs as File

The Pipeline method export_logs saves the Pipeline records to files, by default as pandas records in Newline Delimited JSON (NDJSON) format, or optionally as Apache Arrow table files.

The output files are stored by default in the directory ./logs under the current working directory, with the pipeline name as the default prefix: {pipeline name}-1, {pipeline name}-2, etc.

The suffix by default will be json for pandas records in Newline Delimited JSON (NDJSON) format files. Logs are segmented by pipeline version across the limit, data_size_limit, or start_datetime and end_datetime parameters.

By default, logs are returned as a pandas record in NDJSON in reverse chronological order of insertion, with the most recent log insertions displayed first.

Pipeline logs are segmented by pipeline versions. For example, adding a new model step to a pipeline or swapping out the model in a pipeline step generates a new pipeline version.

This command takes the following parameters.

Parameter | Type | Description
directory | String (Optional) (Default: logs) | Logs are exported to files in directory, relative to the current working directory.
file_prefix | String (Optional) (Default: The name of the pipeline) | The prefix for the names of the exported files. By default, this is the name of the pipeline, and the files are segmented by pipeline version between the limits or the start and end period. For example: logpipeline-1.json, etc.
data_size_limit | String (Optional) (Default: 100MB) | The maximum size for the exported data in bytes. Note that file size is approximate to the request; a request of 10MiB may return 10.3MB of data. The fields are in the format “{size as number} {unit value}”, and can include a space so “10 MiB” and “10MiB” are the same. The accepted unit values are:
  • KiB (for KiloBytes)
  • MiB (for MegaBytes)
  • GiB (for GigaBytes)
  • TiB (for TeraBytes)
limit | Int (Optional) (Default: 100) | Limits how many log records to display. Defaults to 100. If there are more pipeline logs than are being displayed, the Warning message Pipeline log record limit exceeded will be displayed. For example, if 100 log files were requested and there are a total of 1,000, the warning message will be displayed.
start_datetime and end_datetime | DateTime (Optional) | Limits logs to all logs between the start_datetime and end_datetime DateTime parameters. These comply with the Python datetime library for formats such as:
  • datetime.datetime.now()
  • datetime.datetime(2023, 3, 28, 14, 25, 51, 660058, tzinfo=tzutc()) (March 28, 2023 14:25:51:660058 UTC time zone)

Both parameters must be provided. Submitting a logs() request with only start_datetime or end_datetime will generate an exception.
If start_datetime and end_datetime are provided, even alongside other parameters, then the records are returned in chronological order, with the oldest record displayed first.
filename | String (Required) | The file name to save the log file to. The requesting user must have write permission to the file location, and the target directory for the file must already exist. For example: If the file is set to /var/wallaroo/logs/pipeline.json, then the directory /var/wallaroo/logs must already exist. Otherwise file names are only limited by standard file naming rules for the target environment.
dataset | List (Optional) | The datasets to be returned. The datasets available are:
  • *: Default. This translates to ["time", "in", "out", "check_failures"].
  • time: The DateTime of the inference request.
  • in: All inputs listed as in.{variable_name}.
  • out: All outputs listed as out.{variable_name}.
  • check_failures: Flags whether an Anomaly or Validation Check was triggered. 0 indicates no checks were triggered, 1 or greater indicates a check was triggered.
  • metadata: Returns metadata. IMPORTANT NOTE: See Metadata Requests Restrictions for specifications on how this dataset can be used with other datasets.
    • Returns in the metadata.elapsed field:
      • A list of time in nanoseconds for:
        • The time to serialize the input.
        • How long each step took.
    • Returns in the metadata.last_model field:
      • A dict with each Python step as:
        • model_name: The name of the model in the pipeline step.
        • model_sha : The sha hash of the model in the pipeline step.
    • Returns in the metadata.pipeline_version field:
      • The pipeline version as a UUID value.
  • metadata.elapsed: IMPORTANT NOTE: See Metadata Requests Restrictions for specifications on how this dataset can be used with other datasets.
    • Returns in the metadata.elapsed field:
      • A list of time in nanoseconds for:
        • The time to serialize the input.
        • How long each step took.
dataset_exclude | List[String] (Optional) | Exclude specified datasets.
dataset_separator | Sequence[[String], string] (Optional) | If set to “.”, the returned dataset will be flattened.
arrow | Boolean (Optional) | Defaults to False. If arrow=True, then the logs are returned as an Apache Arrow table. If arrow=False, then the logs are returned as pandas records in NDJSON that can be imported into a pandas DataFrame.

All of the parameters can be used together, but start_datetime and end_datetime must be combined; if one is used, then so must the other. If start_datetime and end_datetime are used with any other parameter, then the log results are in chronological order of record insertion.

File sizes are limited to around 10 MB in size. If the requested log file is greater than 10 MB, a Warning will be displayed indicating the end date of the log file downloaded so the request can be adjusted to capture the requested log files.
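
The data_size_limit format described in the parameter table ("{size as number} {unit value}", with or without the space) can be illustrated with a small parser. This is a sketch under stated assumptions: the function size_limit_to_bytes is hypothetical, it treats the units as binary multiples (the standard meaning of KiB, MiB, GiB, and TiB), it only accepts the four listed units, and it does not claim to reproduce how the SDK itself parses the value.

```python
import re

# Binary multipliers for the four unit values listed for data_size_limit.
_UNITS = {"KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3, "TiB": 1024 ** 4}
_PATTERN = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(KiB|MiB|GiB|TiB)\s*")


def size_limit_to_bytes(limit: str) -> int:
    """Convert a size string such as '10 MiB' or '10MiB' to a byte count."""
    match = _PATTERN.fullmatch(limit)
    if match is None:
        raise ValueError(f"unrecognized size limit: {limit!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


with_space = size_limit_to_bytes("10 MiB")
without_space = size_limit_to_bytes("10MiB")
```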

In this example, the log files are saved as both pandas records in NDJSON and as Apache Arrow.

import os

# Save the pandas NDJSON version of the log file (the default)
mainpipeline.export_logs()
display(os.listdir('./logs'))

# Save the Apache Arrow version of the log file
mainpipeline.export_logs(arrow=True)
display(os.listdir('./logs'))

    Warning: There are more logs available. Please set a larger limit to export more data.
    

    ['pipeline-logs-1.json']

    Warning: There are more logs available. Please set a larger limit to export more data.
    

    ['pipeline-logs-1.arrow', 'pipeline-logs-1.json']
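
Once exported, an NDJSON file can be read back one record per line. The following is a minimal sketch using only the Python standard library. The helper read_ndjson is hypothetical, the two sample records are made-up and shortened for illustration, and the assumption is that each line of the exported file is one JSON object keyed by the log column names.

```python
import json
import os
import tempfile


def read_ndjson(path: str) -> list:
    """Read a Newline Delimited JSON file into a list of dicts, one per line."""
    records = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


# Illustrative records only, shaped like the log columns shown earlier.
sample = [
    {"time": "2023-04-24 18:09:33.970", "in.tensor": [4.0, 2.5],
     "out.variable": [718013.75], "check_failures": 0},
    {"time": "2023-04-24 18:09:33.970", "in.tensor": [2.0, 2.5],
     "out.variable": [615094.56], "check_failures": 0},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pipeline-logs-1.json")
    with open(path, "w", encoding="utf-8") as handle:
        for record in sample:
            handle.write(json.dumps(record) + "\n")
    records = read_ndjson(path)
```

When pandas is available, pandas.read_json(path, lines=True) loads the same file directly into a DataFrame.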

Pipeline Log Storage

Pipeline logs have a set allocation of storage space and data requirements.

Pipeline Log Storage Warnings

To prevent storage and performance issues, inference result data may be dropped from pipeline logs by the following standards:

  • Columns are progressively removed from the row starting with the largest input data size and working to the smallest, then the same for outputs.

For example, Computer Vision ML Models typically have large inputs and output values - a single pandas DataFrame inference request may be over 13 MB in size, and the inference results nearly as large. To prevent pipeline log storage issues, the input may be dropped from the pipeline logs, and if additional space is needed, the inference outputs would follow. The time column is preserved.

If a pipeline has dropped columns for space purposes, this will be displayed when a log request is made with the following warning, with {columns} replaced with the dropped columns.

The inference log is above the allowable limit and the following columns may have been suppressed for various rows in the logs: {columns}. To review the dropped columns for an individual inference’s suppressed data, include dataset=["metadata"] in the log request.

Review Dropped Columns

To review what columns are dropped from pipeline logs for storage reasons, include the dataset metadata in the request to view the column metadata.dropped. This metadata field displays a List of any columns dropped from the pipeline logs.

For example:

metadatalogs = mainpipeline.logs(dataset=["time", "metadata"])
 | time | metadata.dropped
0 | 2023-07-06 15:47:03.673 |
1 | 2023-07-06 15:47:03.673 |
2 | 2023-07-06 15:47:03.673 |
3 | 2023-07-06 15:47:03.673 |
4 | 2023-07-06 15:47:03.673 |
... | ... | ...
95 | 2023-07-06 15:47:03.673 |
96 | 2023-07-06 15:47:03.673 |
97 | 2023-07-06 15:47:03.673 |
98 | 2023-07-06 15:47:03.673 |
99 | 2023-07-06 15:47:03.673 |

Suppressed Data Elements

Data elements that do not fit the supported data types below, such as None or Null values, are not supported in pipeline logs. When present, undefined data will be written in the place of the null value, typically zeroes. Any null list values will present an empty list.

2.7 - Wallaroo SDK Essentials Guide: ML Workload Orchestration

How to create and manage ML Workload Orchestration through the Wallaroo SDK

Wallaroo provides ML Workload Orchestrations and Tasks to automate processes in a Wallaroo instance. For example:

  • Deploy a pipeline, retrieve data through a data connector, submit the data for inferences, undeploy the pipeline
  • Replace a model with a new version
  • Retrieve shadow deployed inference results and submit them to a database

Orchestration Flow

ML Workload Orchestration flow works within 3 tiers:

Tier | Description
ML Workload Orchestration | User created custom instructions that provide automated processes that follow the same steps every time without error. Orchestrations contain the instructions to be performed, uploaded as a .ZIP file with the instructions, requirements, and artifacts.
Task | Instructions on when to run an Orchestration as a scheduled Task. Tasks can be Run Once, which creates a single Task Run, or Run Scheduled, where a Task Run is created on a regular schedule based on the Kubernetes cronjob specifications. If a Task is Run Scheduled, it will create a new Task Run every time the schedule parameters are met until the Task is killed.
Task Run | The execution of a task. Task Runs validate that business operations are successful and identify any unsuccessful task runs. If the Task is Run Once, then only one Task Run is generated. If the Task is a Run Scheduled task, then a new Task Run will be created each time the schedule parameters are met, with each Task Run having its own results and logs.
Wallaroo Components

Consider the example of making donuts.

  • The ML Workload Orchestration is the recipe.
  • The Task is the order to make the donuts. It might be Run Once, so only one set of donuts is made, or Run Scheduled, so donuts are made every 2nd Sunday at 6 AM. If Run Scheduled, the donuts are made every time the schedule hits until the order is cancelled (aka killed).
  • The Task Runs are the donuts, each with its own receipt of creation (logs, etc).

Orchestration Requirements

Orchestrations are uploaded to the Wallaroo instance as a ZIP file with the following requirements:

Parameter | Type | Description
User Code | (Required) Python script as .py files | If main.py exists, then that will be used as the task entrypoint. Otherwise, the first main.py found in any subdirectory will be used as the entrypoint. If no main.py is found, the orchestration will not be accepted.
Python Library Requirements | (Optional) requirements.txt file in the requirements file format. | A standard Python requirements.txt for any dependencies to be provided in the task environment. The Wallaroo SDK will already be present and should not be included in the requirements.txt. Multiple requirements.txt files are not allowed.
Other artifacts | | Other artifacts such as files, data, or code to support the orchestration.

Zip Instructions

In a terminal with the zip command, assemble artifacts as above and then create the archive. The zip command is included by default with the Wallaroo JupyterHub service.

zip commands take the following format, with {zipfilename}.zip as the zip file to save the artifacts to, and each file thereafter as the files to add to the archive.

zip {zipfilename}.zip file1 file2 file3 ...

For example, the following command will add the files main.py and requirements.txt into the file hello.zip.

$ zip hello.zip main.py requirements.txt 
  adding: main.py (deflated 47%)
  adding: requirements.txt (deflated 52%)
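
Where the zip command is not available, the same archive can be assembled with Python's standard zipfile module. The following is a sketch, not an SDK feature: the helper check_orchestration_names is hypothetical and only mirrors two rules from the requirements table (a main.py must exist, and multiple requirements.txt files are not allowed).

```python
import os
import posixpath
import tempfile
import zipfile


def check_orchestration_names(names) -> bool:
    """Check archive member names against the requirements listed above."""
    base_names = [posixpath.basename(name) for name in names]
    if "main.py" not in base_names:
        raise ValueError("no main.py entrypoint found")
    if base_names.count("requirements.txt") > 1:
        raise ValueError("multiple requirements.txt files are not allowed")
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Create the two source files that the zip command above would add.
    sources = {"main.py": "print('hello')\n",
               "requirements.txt": "dbt-core==1.4.5\n"}
    for file_name, text in sources.items():
        with open(os.path.join(tmp, file_name), "w", encoding="utf-8") as handle:
            handle.write(text)

    zip_path = os.path.join(tmp, "hello.zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_name in sources:
            # arcname keeps each file at the root of the archive.
            archive.write(os.path.join(tmp, file_name), arcname=file_name)

    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()

is_valid = check_orchestration_names(names)
```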

Example requirements.txt file

dbt-bigquery==1.4.3
dbt-core==1.4.5
dbt-extractor==0.4.1
dbt-postgres==1.4.5
google-api-core==2.8.2
google-auth==2.11.0
google-auth-oauthlib==0.4.6
google-cloud-bigquery==3.3.2
google-cloud-bigquery-storage==2.15.0
google-cloud-core==2.3.2
google-cloud-storage==2.5.0
google-crc32c==1.5.0
google-pasta==0.2.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.4

Orchestration Recommendations

The following recommendations will make using Wallaroo orchestrations easier.

  • The version of Python used should match the same version as in the Wallaroo JupyterHub service.
  • The version of the Wallaroo SDK should match the server. For a 2023.2.1 Wallaroo instance, use the Wallaroo SDK version 2023.2.1.
  • Specify the version of pip dependencies.
  • The wallaroo.Client constructor auth_type argument is ignored. Using wallaroo.Client() is sufficient.
  • The following methods will assist with orchestrations:
    • wallaroo.in_task() : Returns True if the code is running within an orchestration task.
    • wallaroo.task_args(): Returns a Dict of invocation-specific arguments passed to the run_ calls.
  • Orchestrations run in the same environment as the Wallaroo JupyterHub service, with the same versions of Python libraries (unless specifically overridden by the requirements.txt setting, which is not recommended), and in the virtualized directory /home/jovyan/.

Orchestration Code Samples

The following demonstrates using the wallaroo.in_task() and wallaroo.task_args() methods within an Orchestration. This sample code uses wallaroo.in_task() to verify whether or not the script is running as a Wallaroo Task. If True, it will gather the wallaroo.task_args() and use them to set the workspace and pipeline. If False, then it sets the pipeline and workspace manually.

import wallaroo

# connect to the Wallaroo instance
wl = wallaroo.Client()

# if True, get the arguments passed to the task
if wl.in_task():
  arguments = wl.task_args()
  
  # arguments is a dict of key/value pairs; set the workspace and pipeline name
  workspace_name = arguments['workspace_name']
  pipeline_name = arguments['pipeline_name']
  
# False: We're not in a Task, so set the pipeline manually
else:
  workspace_name="bigqueryworkspace"
  pipeline_name="bigquerypipeline"
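
The sample above raises a KeyError if a task is started without one of the expected arguments. One possible variation, sketched here, falls back to a default for any missing key. The helper resolve_names is hypothetical, and the boolean and dict are passed in directly so the sketch runs without a Wallaroo instance.

```python
def resolve_names(in_task: bool, task_args: dict, defaults: dict) -> dict:
    """Pick each setting from the task arguments, falling back to defaults."""
    source = task_args if in_task else {}
    return {key: source.get(key, default) for key, default in defaults.items()}


defaults = {"workspace_name": "bigqueryworkspace",
            "pipeline_name": "bigquerypipeline"}

# Inside a task that only supplied a workspace name (illustrative value).
inside = resolve_names(True, {"workspace_name": "ws1"}, defaults)
# Outside a task: the defaults are used as-is.
outside = resolve_names(False, {}, defaults)

# Assumed usage inside an orchestration script:
#   args = wl.task_args() if wl.in_task() else {}
#   names = resolve_names(wl.in_task(), args, defaults)
```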

Orchestration Methods

The following methods are provided for creating and listing orchestrations.

Create Orchestration

An orchestration is created through the Wallaroo Client upload_orchestration(path) with the following parameters.

For the uploads, either the path to the .zip file is required, or bytes_buffer with name are required. path can not be used with bytes_buffer and name, and vice versa.

Parameter | Type | Description
path | String (Optional) | The path to the .zip file that contains the orchestration package. Cannot be used with bytes_buffer and name.
file_name | String (Optional) | The file name to give to the zip file when uploaded.
bytes_buffer | [bytes] (Optional) | The .zip file object to be uploaded. Cannot be used with path. Note that if the zip file is uploaded from the bytes_buffer parameter and file_name is not included, then the file name in the Wallaroo orchestrations list will be -.
name | String (Optional) | Sets the name of the byte uploaded zip file.
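
For the bytes_buffer option, the zip archive can be built entirely in memory. The following sketch uses the standard io and zipfile modules. The helper zip_to_bytes is hypothetical, and the commented upload call is an assumption based only on the parameter names in the table above.

```python
import io
import zipfile


def zip_to_bytes(files: dict) -> bytes:
    """Build a zip archive in memory from {archive name: text content}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


payload = zip_to_bytes({"main.py": "print('hello')\n"})

# Assumed usage, following the parameter table above:
#   orchestration = wl.upload_orchestration(bytes_buffer=payload,
#                                           name="hello orchestration",
#                                           file_name="hello.zip")
```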

List Orchestrations

All orchestrations for a Wallaroo instance are listed via the Wallaroo Client list_orchestrations() method. It returns an array with the following.

Parameter | Type | Description
id | String | The UUID identifier for the orchestration.
status | String | The last reported status of the orchestration. Valid values are:
  • packaging: The orchestration has been uploaded and is being prepared.
  • ready: The orchestration is available to be used as a task.
sha | String | The sha value of the uploaded orchestration.
name | String | The name of the orchestration.
filename | String | The name of the uploaded orchestration file.
created at | DateTime | The date and time the orchestration was uploaded to the Wallaroo instance.
updated at | DateTime | The date and time a new version of the orchestration was uploaded.

wl.list_orchestrations()

id | name | status | filename | sha | created at | updated at
0f90e606-09f8-409b-a306-cb04ec4c011a | comprehensive sample | ready | remote_inference.zip | b88e93...2396fb | 2023-22-May 19:55:15 | 2023-22-May 19:56:09

Task Methods

Tasks are the implementation of an orchestration. Think of the orchestration as the instructions to follow, and the Task is the unit actually doing it.

Tasks are set at the workspace level.

Create Tasks

Tasks are created from an orchestration through the following methods.

Task Type | Description
run_once | Run the task once.
run_scheduled | Run on a schedule; the task repeats every time the schedule period is met until it is killed.

Tasks have the following parameters.

Parameter | Type | Description
id | String | The UUID identifier for the task.
last run status | String | The last reported status of the task. Values are:
  • unknown: The task has not been started or is being prepared.
  • ready: The task is scheduled to execute.
  • running: The task has started.
  • failure: The task failed.
  • success: The task completed.
type | String | The type of the task. Values are:
  • Temporary Run: The task runs once then stops.
  • Scheduled Run: The task repeats on a cron-like schedule.
  • Service Run: The task runs as a service and executes when its service port is activated.
active | Boolean | True: The task is scheduled or running. False: The task has completed or has been issued the kill command.
schedule | String | The cron-style schedule for the task. If the task is not a scheduled one, then the schedule will be -.
created at | DateTime | The date and time the task was started.
updated at | DateTime | The date and time the task was updated.

Run Task Once

Temporary Run tasks are created from the Orchestration run_once(name, json_args, timeout) with the following parameters.

Parameter | Type | Description
name | String (Required) | The designated name of the task.
json_args | Dict (Required) | Arguments for the orchestration, such as { "dogs": 3.9, "cats": 8.1}
timeout | int (Optional) | Timeout period in seconds.

task = orchestration.run_once(name="house price run once 2",
                              json_args={"workspace_name": workspace_name,
                                         "pipeline_name": pipeline_name,
                                         "connection_name": connection_name})
task

Field | Value
ID | f0e27d6a-6a98-4d26-b240-266f08560c48
Name | house price run once 2
Last Run Status | unknown
Type | Temporary Run
Active | True
Schedule | -
Created At | 2023-22-May 19:58:32
Updated At | 2023-22-May 19:58:32

Run Task Scheduled

A task can be scheduled via the Orchestration run_scheduled method.

Scheduled tasks are run every time the schedule period is met. This uses the same settings as the cron utility.

Scheduled tasks include the following parameters.

Parameter | Type | Description
name | String (Required) | The name of the task.
schedule | String (Required) | Schedule in the cron format of: minute, hour, day_of_month, month, day_of_week.
timeout | int (Optional) | Timeout period in seconds.
json_args | Dict (Required) | Arguments for the task, such as { "dogs": 3.9, "cats": 8.1}

The schedule uses the same method as the cron service. For example, the following schedule:

schedule='42 * * * *'

Runs on the 42nd minute of every hour. The following schedule:

schedule='00 1 * * 0'

Indicates “At 1:00 AM on Sunday.”

For a shortcut in creating cron formatted schedules, see sites such as the Cron expression generator by Cronhub.
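
To make the five cron fields explicit, the following sketch splits a schedule string into named parts using the standard cron field order (minute, hour, day of month, month, day of week). The helper split_cron is hypothetical; it only checks the field count and does not validate the values themselves.

```python
# Standard cron field order.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(schedule: str) -> dict:
    """Split a five-field cron string into a dict of named fields."""
    parts = schedule.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} cron fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


hourly = split_cron("42 * * * *")   # the 42nd minute of every hour
sunday = split_cron("00 1 * * 0")   # 1:00 AM on Sunday
```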

task_scheduled = orchestration.run_scheduled(name="schedule example",
                                             timeout=600,
                                             schedule=schedule,
                                             json_args={"workspace_name": workspace_name,
                                                        "pipeline_name": pipeline_name,
                                                        "connection_name": connection_name})
task_scheduled

Field | Value
ID | 4af57c61-dfa9-43eb-944e-559135495df4
Name | schedule example
Last Run Status | unknown
Type | Scheduled Run
Active | True
Schedule | */5 * * * *
Created At | 2023-22-May 20:08:25
Updated At | 2023-22-May 20:08:25

List Tasks

The list of tasks in the Wallaroo instance is retrieved through the Wallaroo Client list_tasks() method, which accepts the following parameters.

Parameter | Type | Description
killed | Boolean (Optional) (Default: False) | Returns tasks depending on whether they have been issued the kill command. False returns all tasks whether killed or not. True only returns killed tasks.

This returns an array list of the following in reverse chronological order from updated at.

Parameter | Type | Description
id | String | The UUID identifier for the task.
last run status | String | The last reported status of the task. Values are:
  • unknown: The task has not been started or is being prepared.
  • ready: The task is scheduled to execute.
  • running: The task has started.
  • failure: The task failed.
  • success: The task completed.
type | String | The type of the task. Values are:
  • Temporary Run: The task runs once then stops.
  • Scheduled Run: The task repeats on a cron-like schedule.
  • Service Run: The task runs as a service and executes when its service port is activated.
active | Boolean | True: The task is scheduled or running. False: The task has completed or has been issued the kill command.
schedule | String | The cron-style schedule for the task. If the task is not a scheduled one, then the schedule will be -.
created at | DateTime | The date and time the task was started.
updated at | DateTime | The date and time the task was updated.

For example:

wl.list_tasks()

id | name | last run status | type | active | schedule | created at | updated at
f0e27d6a-6a98-4d26-b240-266f08560c48 | house price run once 2 | running | Temporary Run | True | - | 2023-22-May 19:58:32 | 2023-22-May 19:58:38
36509ef8-98da-42a0-913f-e6e929dedb15 | house price run once | success | Temporary Run | True | - | 2023-22-May 19:56:37 | 2023-22-May 19:56:48

An individual task can be retrieved through the list_tasks() by specifying the task from the array returned. In this example, the first task listed from the list_tasks() method will be assigned to the task variable.

task = wl.list_tasks()[0]

Get Task Status

The status of a task is retrieved through the Task status() method and returns the following.

Parameter | Type | Description
status | String | The current status of the task. Values are:
  1. pending: The task has not been started or is being prepared.
  2. started: The task has started to execute.

display(task2.status())

'started'
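
Because a task moves from pending to started over time, a script may want to wait for a status rather than check it once. The following is a generic polling sketch. The helper wait_for_status is hypothetical and not part of the Wallaroo SDK; it takes any zero-argument callable, so a stand-in is used here in place of a real task's status method.

```python
import time
from typing import Callable, Iterable


def wait_for_status(get_status: Callable[[], str],
                    targets: Iterable[str],
                    timeout: float = 60.0,
                    interval: float = 1.0) -> str:
    """Poll get_status until it returns one of targets or the timeout passes."""
    wanted = set(targets)
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in wanted:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status still {status!r} after {timeout} seconds")
        time.sleep(interval)


# Stand-in for a task's status method so the sketch runs on its own.
_fake_statuses = iter(["pending", "pending", "started"])
result = wait_for_status(lambda: next(_fake_statuses), {"started"},
                         timeout=5, interval=0.01)

# Assumed usage with a real task: wait_for_status(task.status, {"started"})
```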

Kill a Task

Killing a task removes the schedule or removes it from a service. Tasks are killed with the Task kill() method, which returns a message with the status of the kill procedure.

Note that a Task set to Run Scheduled will generate a new Task Run each time the schedule parameters are met until the Task is killed. A Task set to Run Once will generate only one Task Run, so does not need to be killed.

task2.kill()

<ArbexStatus.PENDING_KILL: 'pending_kill'>

Task Runs

Task Runs are generated from a Task. If the Task is Run Once, then only one Task Run is generated. If the Task is a Run Scheduled task, then a new Task Run will be created each time the schedule parameters are met, with each Task Run having its own results and logs.

Task Last Runs History

The history of a task is retrieved with the Task last_runs method; each execution of the task is known as a task run. The method takes the following arguments.

Parameter | Type | Description
status | String (Optional) (Default: all) | Filters the task history by the status. If all, returns all statuses. Status values are:
  • running: The task has started.
  • failure: The task failed.
  • success: The task completed.
limit | Integer (Optional) | Limits the number of task runs returned.

This returns the following in reverse chronological order by updated at.

Parameter | Type | Description
task id | String | Task id in UUID format.
pod id | String | Pod id in UUID format.
status | String | Status of the task. Status values are:
  • running: The task has started.
  • failure: The task failed.
  • success: The task completed.
created at | DateTime | Date and time the task was created.
updated at | DateTime | Date and time the task was updated.

task.last_runs()

task id | pod id | status | created at | updated at
f0e27d6a-6a98-4d26-b240-266f08560c48 | 7d9d73d5-df11-44ed-90c1-db0e64c7f9b8 | success | 2023-22-May 19:58:35 | 2023-22-May 19:58:35

Task Run Logs

The output of a task is displayed with the Task Run logs() method that takes the following parameters.

Parameter | Type | Description
limit | Integer (Optional) | Limits the lines returned from the task run log. The limit parameter is based on the log tail - starting from the last line of the log file, then working up until the limit of lines is reached. This is useful for viewing final outputs, exceptions, etc.

The Task Run logs() returns the log entries as a string list, with each entry as an item in the list.

  • IMPORTANT NOTE: It may take around a minute for task run logs to be integrated into the Wallaroo log database.

import time

# give time for the task to complete and the log entries to be recorded
time.sleep(60)
recent_run = task.last_runs()[0]
display(recent_run.logs())
2023-22-May 19:59:29 Getting the pipeline orchestrationpipelinetgiq
2023-22-May 19:59:29 Getting arrow table file
2023-22-May 19:59:29 Inference time.  Displaying results after.
2023-22-May 19:59:29 pyarrow.Table
2023-22-May 19:59:29 time: timestamp[ms]
2023-22-May 19:59:29 in.tensor: list<item: float> not null
2023-22-May 19:59:29   child 0, item: float
2023-22-May 19:59:29 out.variable: list<inner: float not null> not null
2023-22-May 19:59:29 check_failures: int8
2023-22-May 19:59:29   child 0, inner: float not null
2023-22-May 19:59:29 ----
2023-22-May 19:59:29 time: [[2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,...,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767]]
2023-22-May 19:59:29 in.tensor: [[[4,2.5,2900,5505,2,...,2970,5251,12,0,0],[2,2.5,2170,6361,1,...,2310,7419,6,0,0],...,[3,1.75,2910,37461,1,...,2520,18295,47,0,0],[3,2,2005,7000,1,...,1750,4500,34,0,0]]]
2023-22-May 19:59:29 check_failures: [[0,0,0,0,0,...,0,0,0,0,0]]
2023-22-May 19:59:29 out.variable: [[[718013.75],[615094.56],...,[706823.56],[581003]]]
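
Each entry in the output above starts with a timestamp in year-day-month order followed by the message. The following sketch separates the two parts. The helper split_log_entry is hypothetical, and the format string is inferred from the sample output rather than taken from SDK documentation.

```python
import datetime

# Inferred from entries such as "2023-22-May 19:59:29 Getting arrow table file".
LOG_TIME_FORMAT = "%Y-%d-%b %H:%M:%S"


def split_log_entry(entry: str):
    """Split a task run log entry into (timestamp, message)."""
    parts = entry.split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"entry has no timestamp: {entry!r}")
    stamp = datetime.datetime.strptime(" ".join(parts[:2]), LOG_TIME_FORMAT)
    message = parts[2] if len(parts) > 2 else ""
    return stamp, message


when, message = split_log_entry("2023-22-May 19:59:29 Getting arrow table file")
```

Note that %b matches abbreviated month names in the current locale, so this assumes an English or default C locale.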

2.8 - Wallaroo SDK Essentials Guide: Tag Management

How to create and manage Wallaroo Tags through the Wallaroo SDK

Wallaroo SDK Tag Management

Tags are applied to either model versions or pipelines. This allows organizations to track different versions of models, and search for what pipelines have been used for specific purposes such as testing versus production use.

Create Tag

Tags are created with the Wallaroo client command create_tag(String tagname). This creates the tag and makes it available for use.

The tag will be saved to the variable currentTag to be used in the rest of these examples.

# Now we create our tag
currentTag = wl.create_tag("My Great Tag")

List Tags

Tags are listed with the Wallaroo client command list_tags(), which shows all tags and what models and pipelines they have been assigned to.

# List all tags

wl.list_tags()

id | tag | models | pipelines
1 | My Great Tag | [('tagtestmodel', ['70169e97-fb7e-4922-82ba-4f5d37e75253'])] | []

Wallaroo Pipeline Tag Management

Tags are used with pipelines to track different pipelines that are built or deployed with different features or functions.

Add Tag to Pipeline

Tags are added to a pipeline through the Wallaroo Tag add_to_pipeline(pipeline_id) method, where pipeline_id is the pipeline’s integer id.

For this example, we will add currentTag to tagtest_pipeline, then verify it has been added through the list_tags command and list_pipelines command.

# add this tag to the pipeline
currentTag.add_to_pipeline(tagtest_pipeline.id())
{'pipeline_pk_id': 1, 'tag_pk_id': 1}

Search Pipelines by Tag

Pipelines can be searched through the Wallaroo Client search_pipelines(search_term) method, where search_term is a string value for tags assigned to the pipelines.

In this example, the text “My Great Tag” that corresponds to currentTag will be searched for and displayed.

wl.search_pipelines('My Great Tag')

name | version | creation_time | last_updated_time | deployed | tags | steps
tagtestpipeline | 5a4ff3c7-1a2d-4b0a-ad9f-78941e6f5677 | 2022-29-Nov 17:15:21 | 2022-29-Nov 17:15:21 | (unknown) | My Great Tag |

Remove Tag from Pipeline

Tags are removed from a pipeline with the Wallaroo Tag remove_from_pipeline(pipeline_id) command, where pipeline_id is the integer value of the pipeline’s id.

For this example, currentTag will be removed from tagtest_pipeline. This will be verified through the list_tags and search_pipelines command.

## remove from pipeline
currentTag.remove_from_pipeline(tagtest_pipeline.id())
{'pipeline_pk_id': 1, 'tag_pk_id': 1}

Wallaroo Model Tag Management

Tags are used with models to track differences in model versions.

Assign Tag to a Model

Tags are assigned to a model through the Wallaroo Tag add_to_model(model_id) command, where model_id is the model’s numerical ID number. The tag is applied to the most current version of the model.

For this example, the currentTag will be applied to the tagtest_model. All tags will then be listed to show it has been assigned to this model.

# add tag to model

currentTag.add_to_model(tagtest_model.id())
{'model_id': 1, 'tag_id': 1}

Search Models by Tag

Model versions can be searched via tags using the Wallaroo Client method search_models(search_term), where search_term is a string value. All model versions containing the tag will be displayed. In this example, we will be using the text from our tag to list all models that have the text from currentTag in them.

# Search models by tag

wl.search_models('My Great Tag')

name | version | file_name | image_path | last_update_time
tagtestmodel | 70169e97-fb7e-4922-82ba-4f5d37e75253 | ccfraud.onnx | None | 2022-11-29 17:15:21.703465+00:00

Remove Tag from Model

Tags are removed from models using the Wallaroo Tag remove_from_model(model_id) command.

In this example, the currentTag will be removed from tagtest_model. A list of all tags will be shown with the list_tags command, followed by searching the models for the tag to verify it has been removed.

### remove tag from model

currentTag.remove_from_model(tagtest_model.id())
{'model_id': 1, 'tag_id': 1}

2.9 - Wallaroo SDK Essentials Guide: Assays Management

How to create and manage Wallaroo Assays through the Wallaroo SDK

Model Insights and Interactive Analysis Introduction

Wallaroo provides the ability to perform interactive analysis so organizations can explore the data from a pipeline and learn how the data is behaving. With this information and the knowledge of your particular business use case, you can then choose appropriate thresholds for persistent automatic assays as desired.

  • IMPORTANT NOTE

    Model insights operates over time and is difficult to demo in a notebook without pre-canned data. We assume you have an active pipeline that has been running and making predictions over time and show you the code you may use to analyze your pipeline.

Monitoring tasks called assays monitor a model’s predictions, or the data coming into the model, against an established baseline. Changes in the distribution of this data can be an indication of model drift, or of a change in the environment that the model was trained for. This can indicate whether a model needs to be retrained or whether the environment data should be analyzed for accuracy or other needs.

Assay Details

Assays contain the following attributes:

Attribute       Default                 Description
Name                                    The name of the assay. Assay names must be unique.
Baseline Data                           Data that is known to be “typical” (typically distributed) and can be used to determine whether the distribution of new data has changed.
Schedule        Every 24 hours at 1 AM  New assays are configured to run a new analysis for every 24 hours starting at the end of the baseline period. This period can be configured through the SDK.
Group Results   Daily                   Groups assay results into groups based on either Daily (the default), Weekly, or Monthly.
Metric          PSI                     Population Stability Index (PSI) is an entropy-based measure of the difference between distributions. Maximum Difference of Bins measures the maximum difference between the baseline and current distributions (as estimated using the bins). Sum of the difference of bins sums up the difference of occurrences in each bin between the baseline and current distributions.
Threshold       0.1                     The threshold for deciding whether the difference between distributions, as evaluated by the above metric, is large (the distributions are different) or small (the distributions are similar). The default of 0.1 is generally a good threshold when using PSI as the metric.
Number of Bins  5                       Sets the number of bins that will be used to partition the baseline data for comparison against how future data falls into these bins. By default, the binning scheme is percentile (quantile) based. The binning scheme can be configured (see Bin Mode, below). Note that the total number of bins will include the set number plus the left_outlier and the right_outlier, so the total number of bins will be the total set + 2.
Bin Mode        Quantile                Set the binning scheme. Quantile binning defines the bins using percentile ranges (each bin holds the same percentage of the baseline data). Equal binning defines the bins using equally spaced data value ranges, like a histogram. Custom allows users to set the range of values for each bin, with the Left Outlier always starting at Min (below the minimum values detected from the baseline) and the Right Outlier always ending at Max (above the maximum values detected from the baseline).
Bin Weight      Equally Weighted        The bin weights can be either set to Equally Weighted (the default) where each bin is weighted equally, or Custom where the bin weights can be adjusted depending on which are considered more important for detecting model drift.

Manage Assays via the Wallaroo SDK

List Assays

Assays are listed through the Wallaroo Client list_assays method.

wl.list_assays()
name       active  status   warning_threshold  alert_threshold  pipeline_name
api_assay  True    created  0.0                0.1              housepricepipe

Interactive Baseline Runs

We can do an interactive run of just the baseline part to see how the baseline data will be put into bins. This assay uses quintiles so all 5 bins (not counting the outlier bins) have 20% of the predictions. We can see the bin boundaries along the x-axis.

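assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end)
baseline_run = assay_builder.build().interactive_baseline_run()
baseline_run.chart()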
baseline mean = 12.940910643273655
baseline median = 12.884286880493164
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False

We can also get a dataframe with the bin/edge information.

baseline_run.baseline_bins()
    b_edges  b_edge_names   b_aggregated_values  b_aggregation
0   12.00    left_outlier   0.00                 Density
1   12.55    q_20           0.20                 Density
2   12.81    q_40           0.20                 Density
3   12.98    q_60           0.20                 Density
4   13.33    q_80           0.20                 Density
5   14.97    q_100          0.20                 Density
6   inf      right_outlier  0.00                 Density

The previous assay used quintiles so all of the bins had the same percentage/count of samples. To get bins that are divided equally along the range of values we can use BinMode.EQUAL.

equal_bin_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end)
equal_bin_builder.summarizer_builder.add_bin_mode(BinMode.EQUAL)
equal_baseline = equal_bin_builder.build().interactive_baseline_run()
equal_baseline.chart()
baseline mean = 12.940910643273655
baseline median = 12.884286880493164
bin_mode = Equal
aggregation = Density
metric = PSI
weighted = False

We now see very different bin edges and sample percentages per bin.

equal_baseline.baseline_bins()
    b_edges  b_edge_names   b_aggregated_values  b_aggregation
0   12.00    left_outlier   0.00                 Density
1   12.60    p_1.26e1       0.24                 Density
2   13.19    p_1.32e1       0.49                 Density
3   13.78    p_1.38e1       0.22                 Density
4   14.38    p_1.44e1       0.04                 Density
5   14.97    p_1.50e1       0.01                 Density
6   inf      right_outlier  0.00                 Density
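
The difference between the two bin modes can be illustrated without the SDK. The following is a minimal sketch, assuming right-hand bin edges and a made-up sample; the helper names quantile_edges and equal_edges are hypothetical and are not part of the Wallaroo SDK.

```python
import statistics

def quantile_edges(values, num_bins=5):
    """Right-hand bin edges so that each bin holds roughly the same share of values."""
    cuts = statistics.quantiles(values, n=num_bins, method="inclusive")
    return cuts + [max(values)]

def equal_edges(values, num_bins=5):
    """Right-hand bin edges spaced evenly between the smallest and largest value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    return [lo + width * (i + 1) for i in range(num_bins)]

# Hypothetical, right-skewed sample standing in for baseline predictions.
sample = [12.0, 12.1, 12.2, 12.3, 12.4, 12.5, 12.6, 12.8, 13.0, 13.2, 13.5, 15.0]

print("quantile:", [round(e, 2) for e in quantile_edges(sample)])
print("equal:   ", [round(e, 2) for e in equal_edges(sample)])
```

With skewed data, the quantile edges crowd together where most values lie, while the equal edges stay evenly spaced and leave the upper bins sparsely populated.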

Interactive Assay Runs

By default, the assay builder creates an assay with some good starting parameters. In particular, the assay is configured to run a new analysis every 24 hours, starting at the end of the baseline period. It also sets the number of bins to 5, which creates quintiles, and sets the target iopath to "outputs 0 0", which means we monitor the first column of the first output/prediction.

Now let’s do an interactive run of the first assay as it is configured. Interactive runs don’t save the assay to the database (so they won’t be scheduled in the future), nor do they save the assay results. Instead, the results are returned after a short while for further analysis.

We run the assay with interactive_run and convert the results to a dataframe for easy analysis with to_dataframe.

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end)
assay_config = assay_builder.add_run_until(last_day).build()
assay_results = assay_config.interactive_run()
assay_df = assay_results.to_dataframe()
assay_df.loc[:, ~assay_df.columns.isin(['assay_id', 'iopath', 'name', 'warning_threshold'])]
    score  start                      min    max    mean   median  std    alert_threshold  status
0   0.00   2023-01-02T00:00:00+00:00  12.05  14.71  12.97  12.90   0.48   0.25             Ok
1   0.09   2023-01-03T00:00:00+00:00  12.04  14.65  12.96  12.93   0.41   0.25             Ok
2   0.04   2023-01-04T00:00:00+00:00  11.87  14.02  12.98  12.95   0.46   0.25             Ok
3   0.06   2023-01-05T00:00:00+00:00  11.92  14.46  12.93  12.87   0.46   0.25             Ok
4   0.02   2023-01-06T00:00:00+00:00  12.02  14.15  12.95  12.90   0.43   0.25             Ok
5   0.03   2023-01-07T00:00:00+00:00  12.18  14.58  12.96  12.93   0.44   0.25             Ok
6   0.02   2023-01-08T00:00:00+00:00  12.01  14.60  12.92  12.90   0.46   0.25             Ok
7   0.04   2023-01-09T00:00:00+00:00  12.01  14.40  13.00  12.97   0.45   0.25             Ok
8   0.06   2023-01-10T00:00:00+00:00  11.99  14.79  12.94  12.91   0.46   0.25             Ok
9   0.02   2023-01-11T00:00:00+00:00  11.90  14.66  12.91  12.88   0.45   0.25             Ok
10  0.02   2023-01-12T00:00:00+00:00  11.96  14.82  12.94  12.90   0.46   0.25             Ok
11  0.03   2023-01-13T00:00:00+00:00  12.07  14.61  12.96  12.93   0.47   0.25             Ok
12  0.15   2023-01-14T00:00:00+00:00  12.00  14.20  13.06  13.03   0.43   0.25             Ok
13  2.92   2023-01-15T00:00:00+00:00  12.74  15.62  14.00  14.01   0.57   0.25             Alert
14  7.89   2023-01-16T00:00:00+00:00  14.64  17.19  15.91  15.87   0.63   0.25             Alert
15  8.87   2023-01-17T00:00:00+00:00  16.60  19.23  17.94  17.94   0.63   0.25             Alert
16  8.87   2023-01-18T00:00:00+00:00  18.67  21.29  20.01  20.04   0.64   0.25             Alert
17  8.87   2023-01-19T00:00:00+00:00  20.72  23.57  22.17  22.18   0.65   0.25             Alert
18  8.87   2023-01-20T00:00:00+00:00  23.04  25.72  24.32  24.33   0.66   0.25             Alert
19  8.87   2023-01-21T00:00:00+00:00  25.06  27.67  26.48  26.49   0.63   0.25             Alert
20  8.87   2023-01-22T00:00:00+00:00  27.21  29.89  28.63  28.58   0.65   0.25             Alert
21  8.87   2023-01-23T00:00:00+00:00  29.36  32.18  30.82  30.80   0.67   0.25             Alert
22  8.87   2023-01-24T00:00:00+00:00  31.56  34.35  32.98  32.98   0.65   0.25             Alert
23  8.87   2023-01-25T00:00:00+00:00  33.68  36.44  35.14  35.14   0.66   0.25             Alert
24  8.87   2023-01-26T00:00:00+00:00  35.93  38.51  37.31  37.33   0.65   0.25             Alert
25  3.69   2023-01-27T00:00:00+00:00  12.06  39.91  29.29  38.65   12.66  0.25             Alert
26  0.05   2023-01-28T00:00:00+00:00  11.87  13.88  12.92  12.90   0.38   0.25             Ok
27  0.10   2023-01-29T00:00:00+00:00  12.02  14.36  12.98  12.96   0.38   0.25             Ok
28  0.11   2023-01-30T00:00:00+00:00  11.99  14.44  12.89  12.88   0.37   0.25             Ok
29  0.01   2023-01-31T00:00:00+00:00  12.00  14.64  12.92  12.89   0.40   0.25             Ok

Basic functionality for creating quick charts is included.

assay_results.chart_scores()

We see that the difference scores are low for a while and then jump up to indicate there is an issue. We can examine that particular window to help us decide if that threshold is set correctly or not.
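
One way to reason about the threshold is to replay the scores against a few candidate values. The sketch below uses the rounded scores from the first 16 rows of the results table; the helpers status and first_alert are hypothetical, and treating a score strictly above the threshold as an alert is an assumption rather than documented SDK behavior.

```python
# Daily scores copied (rounded) from the first 16 rows of the results table.
scores = [0.00, 0.09, 0.04, 0.06, 0.02, 0.03, 0.02, 0.04,
          0.06, 0.02, 0.02, 0.03, 0.15, 2.92, 7.89, 8.87]

def status(score, alert_threshold):
    """Label a window; assumes an alert is raised when the score exceeds the threshold."""
    return "Alert" if score > alert_threshold else "Ok"

def first_alert(scores, alert_threshold):
    """Index of the first window whose score exceeds the threshold, or None."""
    for index, score in enumerate(scores):
        if score > alert_threshold:
            return index
    return None

for threshold in (0.1, 0.25, 0.5):
    alerts = sum(status(s, threshold) == "Alert" for s in scores)
    print(f"threshold={threshold}: {alerts} alerts, first at window {first_alert(scores, threshold)}")
```

A lower threshold flags window 12 one day before the large jump, at the cost of being more sensitive to ordinary day-to-day variation.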

We can generate a quick chart of the results. This chart shows the 5 quantile bins (quintiles) derived from the baseline data plus one for left outliers and one for right outliers. We also see that the data from the window falls within the baseline quintiles but in a different proportion and is skewing higher. Whether this is an issue or not is specific to your use case.

First let’s examine a day that is only slightly different from the baseline. Some values fall outside the range of the baseline values (the left and right outliers), and the bin values are different but similar.

assay_results[0].chart()
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 0.0029273068646199748
scores = [0.0, 0.000514261205558409, 0.0002139202456922972, 0.0012617897456473992, 0.0002139202456922972, 0.0007234154220295724, 0.0]
index = None

Other days, however, are significantly different.

assay_results[12].chart()
baseline mean = 12.940910643273655
window mean = 13.06380216891949
baseline median = 12.884286880493164
window median = 13.027600288391112
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 0.15060511096978788
scores = [4.6637149189075455e-05, 0.05969428191167242, 0.00806617426854112, 0.008316273402678306, 0.07090885609902021, 0.003572888138686759, 0.0]
index = None
assay_results[13].chart()
baseline mean = 12.940910643273655
window mean = 14.004728427908038
baseline median = 12.884286880493164
window median = 14.009637832641602
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 2.9220486095961196
scores = [0.0, 0.7090936334784107, 0.7130482300184766, 0.33500731896676245, 0.12171058214520876, 0.9038825518183468, 0.1393062931689142]
index = None

If we want to investigate further, we can run interactive assays on each of the inputs to see if any of them show anything abnormal. In this example we’ll provide the feature labels to create more understandable titles.

The current assay expects continuous data. Sometimes categorical data is encoded as 1 or 0 in a feature, and sometimes as a limited number of values such as 1, 2, 3. If one value has a high percentage, the analysis emits a warning so that we know the scores for that feature may not behave as we expect.

labels = ['bedrooms', 'bathrooms', 'lat', 'long', 'waterfront', 'sqft_living', 'sqft_lot', 'floors', 'view', 'condition', 'grade', 'sqft_above', 'sqft_basement', 'yr_built', 'yr_renovated', 'sqft_living15', 'sqft_lot15']

topic = wl.get_topic_name(pipeline.id())

all_inferences = wl.get_raw_pipeline_inference_logs(topic, baseline_start, last_day, model_name, limit=1_000_000)

assay_builder = wl.build_assay("Input Assay", pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.window_builder().add_width(hours=4)
assay_config = assay_builder.build()
assay_results = assay_config.interactive_input_run(all_inferences, labels)
iadf = assay_results.to_dataframe()
display(iadf.loc[:, ~iadf.columns.isin(['assay_id', 'iopath', 'name', 'warning_threshold'])])
column distinct_vals label           largest_pct
     0            17 bedrooms        0.4244 
     1            44 bathrooms       0.2398 
     2          3281 lat             0.0014 
     3           959 long            0.0066 
     4             4 waterfront      0.9156 *** May not be continuous feature
     5          3901 sqft_living     0.0032 
     6          3487 sqft_lot        0.0173 
     7            11 floors          0.4567 
     8            10 view            0.8337 
     9             9 condition       0.5915 
    10            19 grade           0.3943 
    11           745 sqft_above      0.0096 
    12           309 sqft_basement   0.5582 
    13           224 yr_built        0.0239 
    14            77 yr_renovated    0.8889 
    15           649 sqft_living15   0.0093 
    16          3280 sqft_lot15      0.0199 
      score  start                      min    max   mean   median  std   alert_threshold  status
0     0.19   2023-01-02T00:00:00+00:00  -2.54  1.75  0.21   0.68    0.99  0.25             Ok
1     0.03   2023-01-02T04:00:00+00:00  -1.47  2.82  0.21   -0.40   0.95  0.25             Ok
2     0.09   2023-01-02T08:00:00+00:00  -2.54  3.89  -0.04  -0.40   1.22  0.25             Ok
3     0.05   2023-01-02T12:00:00+00:00  -1.47  2.82  -0.12  -0.40   0.94  0.25             Ok
4     0.08   2023-01-02T16:00:00+00:00  -1.47  1.75  -0.00  -0.40   0.76  0.25             Ok
...   ...    ...                        ...    ...   ...    ...     ...   ...              ...
3055  0.08   2023-01-31T04:00:00+00:00  -0.42  4.87  0.25   -0.17   1.13  0.25             Ok
3056  0.58   2023-01-31T08:00:00+00:00  -0.43  2.01  -0.04  -0.21   0.48  0.25             Alert
3057  0.13   2023-01-31T12:00:00+00:00  -0.32  7.75  0.30   -0.20   1.57  0.25             Ok
3058  0.26   2023-01-31T16:00:00+00:00  -0.43  5.88  0.19   -0.18   1.17  0.25             Alert
3059  0.84   2023-01-31T20:00:00+00:00  -0.40  0.52  -0.17  -0.25   0.18  0.25             Alert

3060 rows × 9 columns
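
The distinct_vals and largest_pct columns in the feature summary can be reproduced for any column with a simple count. This is a rough sketch: the function summarize_feature, the sample columns, and the 0.9 cutoff for the warning are all assumptions made for illustration, not values taken from the SDK.

```python
from collections import Counter

def summarize_feature(values, warn_pct=0.9):
    """Return (distinct_vals, largest_pct, warn) for one input column.

    warn_pct is a hypothetical cutoff chosen for this sketch.
    """
    counts = Counter(values)
    largest_pct = max(counts.values()) / len(values)
    return len(counts), largest_pct, largest_pct >= warn_pct

# Hypothetical columns: a 0/1 flag dominated by one value, and a continuous measurement.
waterfront = [0] * 95 + [1] * 5
sqft_living = [900 + 13 * i for i in range(100)]

for label, column in (("waterfront", waterfront), ("sqft_living", sqft_living)):
    distinct, pct, warn = summarize_feature(column)
    note = " *** May not be continuous feature" if warn else ""
    print(f"{label:12} {distinct:5d} {pct:.4f}{note}")
```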

We can chart each of the iopaths and do a visual inspection. From the charts we can see whether any of the input features had significant differences in the first two days, which we can then choose to inspect further. Here we show only 3 charts to save space in this notebook.

assay_results.chart_iopaths(labels=labels, selected_labels=['bedrooms', 'lat', 'sqft_living'])

When we are comfortable with what the alert threshold should be for our specific purposes, we can create and save an assay that will run automatically on a daily basis.

In this example, we create an assay that runs every day against the baseline and has an alert threshold of 0.5.

Once we upload it, it will be saved and scheduled for future data, as well as run against past data.

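import random
import string

# Alert when a window's score exceeds this value.
alert_threshold = 0.5

# Random 4-letter prefix so that the assay name is unique on each run.
prefix = ''.join(random.choice(string.ascii_lowercase) for _ in range(4))
assay_name = f"{prefix}example assay"

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_alert_threshold(alert_threshold)
# upload() saves the assay so that it runs on schedule.
assay_id = assay_builder.upload()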

After a short while, we can get the assay results for further analysis.

When we get the assay results, we see that the assay’s analysis is similar to the interactive run we started with, though the analysis for the third day does not exceed the new alert threshold we set. And since we called upload instead of interactive_run, the assay was saved to the system and will continue to run automatically on schedule from now on.

Scheduling Assays

By default assays are scheduled to run every 24 hours starting immediately after the baseline period ends.

However, you can control the start time by setting start and the frequency by setting interval on the window.

So to recap:

  • The window width is the size of the window. The default is 24 hours.
  • The interval is how often the analysis is run, that is, how far the window slides forward from the previous run. The default is the window width.
  • The window start is when the analysis should start. The default is the end of the baseline period.

For example, to run an analysis every 12 hours on the previous 24 hours of data, set the window width to 24 hours (the default) and the interval to 12 hours.

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end)
assay_builder = assay_builder.add_run_until(last_day)

assay_builder.window_builder().add_width(hours=24).add_interval(hours=12)

assay_config = assay_builder.build()

assay_results = assay_config.interactive_run()
print(f"Generated {len(assay_results)} analyses")
Generated 59 analyses
assay_results.chart_scores()
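
The count of 59 analyses follows from the window arithmetic. The sketch below is a simplified model of that arithmetic, not the SDK’s scheduler: analysis_windows is a hypothetical helper, only complete windows are counted, and the dates assume the example configuration used in this guide (baseline ending 2023-01-02, run until 2023-02-01).

```python
from datetime import datetime, timedelta, timezone

def analysis_windows(start, run_until, width, interval=None):
    """List (window_start, window_end) pairs for every complete window.

    If no interval is given, it falls back to the window width.
    """
    interval = interval or width
    windows = []
    window_start = start
    while window_start + width <= run_until:
        windows.append((window_start, window_start + width))
        window_start += interval
    return windows

# Assumed dates: the baseline ends on 2023-01-02 and the run stops on 2023-02-01.
start = datetime(2023, 1, 2, tzinfo=timezone.utc)
run_until = datetime(2023, 2, 1, tzinfo=timezone.utc)

daily = analysis_windows(start, run_until, timedelta(hours=24))
overlapping = analysis_windows(start, run_until, timedelta(hours=24), timedelta(hours=12))
print(len(daily), len(overlapping))  # 30 59
```

Halving the interval roughly doubles the number of analyses, because each 24-hour window now overlaps the previous one by 12 hours.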

To start a weekly analysis of the previous week on a specific day, set the start date (taking care to specify the desired timezone) and set both the width and the interval to 1 week. An analysis won’t be generated until a window is complete.

import datetime

report_start = datetime.datetime.fromisoformat('2022-01-03T00:00:00+00:00')

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end)
assay_builder = assay_builder.add_run_until(last_day)

assay_builder.window_builder().add_width(weeks=1).add_interval(weeks=1).add_start(report_start)

assay_config = assay_builder.build()

assay_results = assay_config.interactive_run()
print(f"Generated {len(assay_results)} analyses")
Generated 5 analyses
assay_results.chart_scores()

Advanced Configuration

The assay can be configured in a variety of ways to help customize it to your particular needs. Specifically you can:

  • change the BinMode to evenly spaced, quantile or user provided
  • change the number of bins to use
  • provide weights to use when scoring the bins
  • calculate the score using the sum of differences, maximum difference or population stability index
  • change the value aggregation for the bins to density, cumulative or edges

Let’s take a look at these in turn.

Default configuration

First let’s look at the default configuration. This is a lot of information, but it is useful to know where each setting can be found.

We see that the assay is broken up into 4 sections: a top-level metadata section, a section for the baseline specification, a section for the window specification, and a section that specifies the summarization configuration.

In the meta section we see the name of the assay, that it runs on the first column of the first output "outputs 0 0" and that there is a default threshold of 0.25.

The summarizer section shows us the defaults of Quantile, Density and PSI on 5 bins.

The baseline section shows us that it is configured as a fixed baseline with the specified start and end date times.

And the window tells us what model in the pipeline we are analyzing and how often.

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
print(assay_builder.build().to_json())
{
    "name": "onmyexample assay",
    "pipeline_id": 1,
    "pipeline_name": "housepricepipe",
    "active": true,
    "status": "created",
    "iopath": "output dense_2 0",
    "baseline": {
        "Fixed": {
            "pipeline": "housepricepipe",
            "model": "housepricemodel",
            "start_at": "2023-01-01T00:00:00+00:00",
            "end_at": "2023-01-02T00:00:00+00:00"
        }
    },
    "window": {
        "pipeline": "housepricepipe",
        "model": "housepricemodel",
        "width": "24 hours",
        "start": null,
        "interval": null
    },
    "summarizer": {
        "type": "UnivariateContinuous",
        "bin_mode": "Quantile",
        "aggregation": "Density",
        "metric": "PSI",
        "num_bins": 5,
        "bin_weights": null,
        "bin_width": null,
        "provided_edges": null,
        "add_outlier_edges": true
    },
    "warning_threshold": null,
    "alert_threshold": 0.25,
    "run_until": "2023-02-01T00:00:00+00:00",
    "workspace_id": 5
}
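
Because to_json returns plain JSON, the configuration can be inspected with the standard library. The following sketch parses a trimmed copy of the output above. Tying the two extra outlier bins to the add_outlier_edges flag is an assumption made for illustration.

```python
import json

# A trimmed copy of the configuration printed above.
config_json = """
{
    "name": "onmyexample assay",
    "iopath": "output dense_2 0",
    "window": {"width": "24 hours", "start": null, "interval": null},
    "summarizer": {"bin_mode": "Quantile", "aggregation": "Density",
                   "metric": "PSI", "num_bins": 5, "add_outlier_edges": true},
    "alert_threshold": 0.25
}
"""

config = json.loads(config_json)
summarizer = config["summarizer"]

# Assumption: outlier edges add one bin on each side of the configured bins.
total_bins = summarizer["num_bins"] + (2 if summarizer["add_outlier_edges"] else 0)

print(config["iopath"], config["alert_threshold"])
print(summarizer["bin_mode"], summarizer["metric"], total_bins)
```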

Defaults

We can run the assay interactively and review the first analysis. The method compare_basic_stats gives us a dataframe with basic stats for the baseline and window data.

assay_results = assay_builder.build().interactive_run()
ar = assay_results[0]

ar.compare_basic_stats()
        Baseline                   Window                     diff   pct_diff
count   182.00                     181.00                     -1.00  -0.55
min     12.00                      12.05                      0.04   0.36
max     14.97                      14.71                      -0.26  -1.71
mean    12.94                      12.97                      0.03   0.22
median  12.88                      12.90                      0.01   0.12
std     0.45                       0.48                       0.03   5.68
start   2023-01-01T00:00:00+00:00  2023-01-02T00:00:00+00:00  NaN    NaN
end     2023-01-02T00:00:00+00:00  2023-01-03T00:00:00+00:00  NaN    NaN
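
The diff and pct_diff columns are consistent with a plain difference and a percentage relative to the baseline (for example, a count change of -1 from 182 is about -0.55%). A minimal sketch of that comparison follows. The helper names and sample values are hypothetical, and using the sample standard deviation is an assumption.

```python
import statistics

def basic_stats(values):
    """Summary statistics for one set of predictions."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumed)
    }

def compare_stats(baseline, window):
    """Per-statistic (baseline, window, diff, pct_diff), with pct_diff relative to the baseline."""
    b, w = basic_stats(baseline), basic_stats(window)
    rows = {}
    for name in b:
        diff = w[name] - b[name]
        rows[name] = (b[name], w[name], diff, 100 * diff / b[name])
    return rows

# Hypothetical values standing in for baseline and window predictions.
baseline = [12.0, 12.5, 12.9, 13.3, 15.0]
window = [12.1, 12.6, 13.0, 14.7]

for name, (b, w, diff, pct) in compare_stats(baseline, window).items():
    print(f"{name:7} {b:8.2f} {w:8.2f} {diff:8.2f} {pct:8.2f}")
```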

The method compare_bins gives us a dataframe with the bin information, such as the number of bins, the right edges, suggested bin/edge names, and the values for each bin in the baseline and the window.

assay_bins = ar.compare_bins()
display(assay_bins.loc[:, assay_bins.columns!='w_aggregation'])
    b_edges  b_edge_names   b_aggregated_values  b_aggregation  w_edges  w_edge_names   w_aggregated_values  diff_in_pcts
0   12.00    left_outlier   0.00                 Density        12.00    left_outlier   0.00                 0.00
1   12.55    q_20           0.20                 Density        12.55    e_1.26e1       0.19                 -0.01
2   12.81    q_40           0.20                 Density        12.81    e_1.28e1       0.21                 0.01
3   12.98    q_60           0.20                 Density        12.98    e_1.30e1       0.18                 -0.02
4   13.33    q_80           0.20                 Density        13.33    e_1.33e1       0.21                 0.01
5   14.97    q_100          0.20                 Density        14.97    e_1.50e1       0.21                 0.01
6   NaN      right_outlier  0.00                 Density        NaN      right_outlier  0.00                 0.00

We can also plot the chart to visualize the values of the bins.

ar.chart()
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 0.0029273068646199748
scores = [0.0, 0.000514261205558409, 0.0002139202456922972, 0.0012617897456473992, 0.0002139202456922972, 0.0007234154220295724, 0.0]
index = None

Binning Mode

We can change the bin mode algorithm to equal and see that the bins/edges are partitioned at different points and the bins have different values.

prefix= ''.join(random.choice(string.ascii_lowercase) for i in range(4))

assay_name = f"{prefix}example assay"

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_bin_mode(BinMode.EQUAL)
assay_results = assay_builder.build().interactive_run()
assay_results_df = assay_results[0].compare_bins()
display(assay_results_df.loc[:, ~assay_results_df.columns.isin(['b_aggregation', 'w_aggregation'])])
assay_results[0].chart()
    b_edges  b_edge_names   b_aggregated_values  w_edges  w_edge_names   w_aggregated_values  diff_in_pcts
0   12.00    left_outlier   0.00                 12.00    left_outlier   0.00                 0.00
1   12.60    p_1.26e1       0.24                 12.60    e_1.26e1       0.24                 0.00
2   13.19    p_1.32e1       0.49                 13.19    e_1.32e1       0.48                 -0.02
3   13.78    p_1.38e1       0.22                 13.78    e_1.38e1       0.22                 -0.00
4   14.38    p_1.44e1       0.04                 14.38    e_1.44e1       0.06                 0.02
5   14.97    p_1.50e1       0.01                 14.97    e_1.50e1       0.01                 0.00
6   NaN      right_outlier  0.00                 NaN      right_outlier  0.00                 0.00
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Equal
aggregation = Density
metric = PSI
weighted = False
score = 0.011074287819376092
scores = [0.0, 7.3591419975306595e-06, 0.000773779195360713, 8.538514991838585e-05, 0.010207597078872246, 1.6725322721660374e-07, 0.0]
index = None

User Provided Bin Edges

The values in this dataset run from ~11.6 to ~15.81. Let’s say we had a business reason to use specific bin edges. We can specify them with BinMode.PROVIDED and a list of floats giving the right-hand (upper) edge of each bin and, optionally, the lower edge of the smallest bin. If the lowest edge is not specified, the threshold for left outliers is taken from the smallest value in the baseline dataset.

edges = [11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_bin_mode(BinMode.PROVIDED, edges)
assay_results = assay_builder.build().interactive_run()
assay_results_df = assay_results[0].compare_bins()
display(assay_results_df.loc[:, ~assay_results_df.columns.isin(['b_aggregation', 'w_aggregation'])])
assay_results[0].chart()
    b_edges  b_edge_names   b_aggregated_values  w_edges  w_edge_names   w_aggregated_values  diff_in_pcts
0   11.00    left_outlier   0.00                 11.00    left_outlier   0.00                 0.00
1   12.00    e_1.20e1       0.00                 12.00    e_1.20e1       0.00                 0.00
2   13.00    e_1.30e1       0.62                 13.00    e_1.30e1       0.59                 -0.03
3   14.00    e_1.40e1       0.36                 14.00    e_1.40e1       0.35                 -0.00
4   15.00    e_1.50e1       0.02                 15.00    e_1.50e1       0.06                 0.03
5   16.00    e_1.60e1       0.00                 16.00    e_1.60e1       0.00                 0.00
6   NaN      right_outlier  0.00                 NaN      right_outlier  0.00                 0.00
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Provided
aggregation = Density
metric = PSI
weighted = False
score = 0.0321620386600679
scores = [0.0, 0.0, 0.0014576920813015586, 3.549754401142936e-05, 0.030668849034754912, 0.0, 0.0]
index = None
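
To make the role of the provided edges concrete, the sketch below assigns values to bins for the same edge list. It is an illustration only: bin_densities is a hypothetical helper, the sample values are made up, and treating each right-hand edge as inclusive is an assumption.

```python
from bisect import bisect_left

def bin_densities(values, edges):
    """Share of values per bin for a list of edges.

    edges[0] is the lower bound; each later edge is the right-hand edge of a bin.
    Index 0 collects left outliers and index len(edges) collects right outliers.
    Treating the right-hand edge as inclusive is an assumption of this sketch.
    """
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[bisect_left(edges, value)] += 1
    return [count / len(values) for count in counts]

edges = [11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
# Hypothetical predictions, including one value on each side of the provided range.
values = [10.5, 12.2, 12.4, 12.9, 13.0, 13.1, 13.6, 14.2, 15.5, 16.3]

names = ["left_outlier", *(str(edge) for edge in edges[1:]), "right_outlier"]
for name, density in zip(names, bin_densities(values, edges)):
    print(f"{name:14} {density:.2f}")
```

Six provided edges yield five regular bins plus the two outlier bins, matching the seven rows in the table above.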

Number of Bins

We could also choose a different number of bins, let’s say 10, which can be evenly spaced or based on the quantiles (deciles).

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_bin_mode(BinMode.QUANTILE).add_num_bins(10)
assay_results = assay_builder.build().interactive_run()
assay_results_df = assay_results[1].compare_bins()
display(assay_results_df.loc[:, ~assay_results_df.columns.isin(['b_aggregation', 'w_aggregation'])])
assay_results[1].chart()
    b_edges  b_edge_names   b_aggregated_values  w_edges  w_edge_names   w_aggregated_values  diff_in_pcts
0   12.00    left_outlier   0.00                 12.00    left_outlier   0.00                 0.00
1   12.41    q_10           0.10                 12.41    e_1.24e1       0.09                 -0.00
2   12.55    q_20           0.10                 12.55    e_1.26e1       0.04                 -0.05
3   12.72    q_30           0.10                 12.72    e_1.27e1       0.14                 0.03
4   12.81    q_40           0.10                 12.81    e_1.28e1       0.05                 -0.05
5   12.88    q_50           0.10                 12.88    e_1.29e1       0.12                 0.02
6   12.98    q_60           0.10                 12.98    e_1.30e1       0.09                 -0.01
7   13.15    q_70           0.10                 13.15    e_1.32e1       0.18                 0.08
8   13.33    q_80           0.10                 13.33    e_1.33e1       0.14                 0.03
9   13.47    q_90           0.10                 13.47    e_1.35e1       0.07                 -0.03
10  14.97    q_100          0.10                 14.97    e_1.50e1       0.08                 -0.02
11  NaN      right_outlier  0.00                 NaN      right_outlier  0.00                 0.00
baseline mean = 12.940910643273655
window mean = 12.956829186961135
baseline median = 12.884286880493164
window median = 12.929338455200195
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 0.16591076620684958
scores = [0.0, 0.0002571306027792045, 0.044058279699182114, 0.009441459631493015, 0.03381618572319047, 0.0027335446937028877, 0.0011792419836838435, 0.051023062424253904, 0.009441459631493015, 0.008662563542113508, 0.0052978382749576496, 0.0]
index = None

Bin Weights

Now let’s say we only care about differences at the higher end of the range. We can use weights to specify that differences in the lower bins should not be counted in the score.

If we stick with 10 bins, we can provide a vector of 12 weights: one weight for each of the original bins, plus one at the front for the left outlier bin and one at the end for the right outlier bin.

Note that we still show the values for the bins, but the scores for the lower 5 bins and the left outlier are 0, so only the right half is counted and reflected in the score.

weights = [0] * 6
weights.extend([1] * 6)
print("Using weights: ", weights)
assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_bin_mode(BinMode.QUANTILE).add_num_bins(10).add_bin_weights(weights)
assay_results = assay_builder.build().interactive_run()
assay_results_df = assay_results[1].compare_bins()
display(assay_results_df.loc[:, ~assay_results_df.columns.isin(['b_aggregation', 'w_aggregation'])])
assay_results[1].chart()
Using weights:  [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
    b_edges  b_edge_names   b_aggregated_values  w_edges  w_edge_names   w_aggregated_values  diff_in_pcts
0   12.00    left_outlier   0.00                 12.00    left_outlier   0.00                 0.00
1   12.41    q_10           0.10                 12.41    e_1.24e1       0.09                 -0.00
2   12.55    q_20           0.10                 12.55    e_1.26e1       0.04                 -0.05
3   12.72    q_30           0.10                 12.72    e_1.27e1       0.14                 0.03
4   12.81    q_40           0.10                 12.81    e_1.28e1       0.05                 -0.05
5   12.88    q_50           0.10                 12.88    e_1.29e1       0.12                 0.02
6   12.98    q_60           0.10                 12.98    e_1.30e1       0.09                 -0.01
7   13.15    q_70           0.10                 13.15    e_1.32e1       0.18                 0.08
8   13.33    q_80           0.10                 13.33    e_1.33e1       0.14                 0.03
9   13.47    q_90           0.10                 13.47    e_1.35e1       0.07                 -0.03
10  14.97    q_100          0.10                 14.97    e_1.50e1       0.08                 -0.02
11  NaN      right_outlier  0.00                 NaN      right_outlier  0.00                 0.00
baseline mean = 12.940910643273655
window mean = 12.956829186961135
baseline median = 12.884286880493164
window median = 12.929338455200195
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = True
score = 0.012600694309416988
scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00019654033061397393, 0.00850384373737565, 0.0015735766052488358, 0.0014437605903522511, 0.000882973045826275, 0.0]
index = None
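
The weighted scores printed above match what you get by scaling each unweighted per-bin score from the earlier 10-bin run by its weight divided by the sum of all weights. The sketch below reproduces that arithmetic. The rule is inferred from the printed numbers rather than taken from the SDK’s source, and weighted_scores is a hypothetical helper.

```python
def weighted_scores(bin_scores, weights):
    """Scale each per-bin score by its share of the total weight (inferred rule)."""
    total_weight = sum(weights)
    return [score * weight / total_weight for score, weight in zip(bin_scores, weights)]

# Unweighted per-bin scores from the 10-bin run shown earlier.
bin_scores = [0.0, 0.0002571306027792045, 0.044058279699182114, 0.009441459631493015,
              0.03381618572319047, 0.0027335446937028877, 0.0011792419836838435,
              0.051023062424253904, 0.009441459631493015, 0.008662563542113508,
              0.0052978382749576496, 0.0]
weights = [0] * 6 + [1] * 6

scaled = weighted_scores(bin_scores, weights)
print(round(sum(scaled), 6))  # 0.012601
```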

Metrics

The score is a distance or dissimilarity measure. The larger it is, the less similar the two distributions are. We currently support summing the differences of each individual bin, taking the maximum difference, and a modified Population Stability Index (PSI).

The following three charts use each of the metrics. Note how the scores change. The best one will depend on your particular use case.

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 0.0029273068646199748
scores = [0.0, 0.000514261205558409, 0.0002139202456922972, 0.0012617897456473992, 0.0002139202456922972, 0.0007234154220295724, 0.0]
index = None
assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_metric(Metric.SUMDIFF)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = SumDiff
weighted = False
score = 0.025438649748041997
scores = [0.0, 0.009956893934794486, 0.006648048084512165, 0.01548175581324751, 0.006648048084512165, 0.012142553579017668, 0.0]
index = None
assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_metric(Metric.MAXDIFF)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = MaxDiff
weighted = False
score = 0.01548175581324751
scores = [0.0, 0.009956893934794486, 0.006648048084512165, 0.01548175581324751, 0.006648048084512165, 0.012142553579017668, 0.0]
index = 3
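
The three metrics can be sketched over per-bin densities as follows. These are the textbook forms, not the SDK’s implementation: the guide describes its PSI as modified, so skipping empty bins here is an assumption, and halving the sum of differences is inferred from the SumDiff output above, where the score is half the sum of the listed per-bin values.

```python
import math

def psi(baseline, window):
    """Population Stability Index: sum of (w - b) * ln(w / b) over bins.

    Bins that are empty on either side are skipped to avoid log(0);
    how the SDK handles them is not covered by this sketch.
    """
    return sum((w - b) * math.log(w / b)
               for b, w in zip(baseline, window) if b > 0 and w > 0)

def sum_diff(baseline, window):
    """Half the sum of absolute per-bin differences."""
    return 0.5 * sum(abs(w - b) for b, w in zip(baseline, window))

def max_diff(baseline, window):
    """Largest absolute per-bin difference and the index of that bin."""
    diffs = [abs(w - b) for b, w in zip(baseline, window)]
    score = max(diffs)
    return score, diffs.index(score)

# Hypothetical densities: left outlier, five bins, right outlier.
baseline = [0.0, 0.20, 0.20, 0.20, 0.20, 0.20, 0.0]
window = [0.0, 0.15, 0.20, 0.20, 0.20, 0.22, 0.03]

print("PSI:    ", round(psi(baseline, window), 4))
print("SumDiff:", round(sum_diff(baseline, window), 4))
print("MaxDiff:", max_diff(baseline, window))
```

MaxDiff also reports which bin moved the most, which corresponds to the index value shown in the MaxDiff output above.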

Aggregation Options

Bin aggregation can be done in histogram style with Aggregation.DENSITY (the default), where we count the number/percentage of values that fall in each bin, or in Empirical Cumulative Distribution Function style with Aggregation.CUMULATIVE, where we keep a cumulative count of the values/percentages that fall in each bin.

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_aggregation(Aggregation.DENSITY)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 0.0029273068646199748
scores = [0.0, 0.000514261205558409, 0.0002139202456922972, 0.0012617897456473992, 0.0002139202456922972, 0.0007234154220295724, 0.0]
index = None
assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_aggregation(Aggregation.CUMULATIVE)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()
baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Cumulative
metric = PSI
weighted = False
score = 0.04419889502762442
scores = [0.0, 0.009956893934794486, 0.0033088458502823492, 0.01879060166352986, 0.012142553579017725, 0.0, 0.0]
index = None
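
As a rough illustration of the difference between the two aggregation styles, the following sketch bins a handful of values and reports both the per-bin density and the running cumulative share. The helper name aggregate and the sample edges are hypothetical and are not part of the Wallaroo SDK; the SDK's own binning may differ in details such as how outlier bins are handled.

```python
from bisect import bisect_left
from itertools import accumulate

def aggregate(values, right_edges):
    """Return (density, cumulative) shares for bins defined by their right-hand edges.

    Values above the last edge are counted in one extra overflow bin.
    """
    counts = [0] * (len(right_edges) + 1)
    for v in values:
        # bisect_left finds the first bin whose right edge is >= v
        counts[bisect_left(right_edges, v)] += 1
    total = len(values)
    density = [c / total for c in counts]      # histogram style
    cumulative = list(accumulate(density))     # running total of the densities
    return density, cumulative

values = [1, 2, 2, 3, 5, 8, 8, 9, 12, 20]
density, cumulative = aggregate(values, right_edges=[2, 5, 10])
print(density)
print(cumulative)
```

The density list sums to 1, while the cumulative list rises to 1 in its final bin, which is why the same metric can produce different scores under the two styles.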

2.10 - Wallaroo SDK Essentials Guide: Inference Management

How to use Wallaroo SDK for inferences

Inferences are performed on deployed pipelines. This submits data to the pipeline, where it is processed through each of the pipeline’s steps with the output of the previous step providing the input for the next step. The final step will then output the result of all of the pipeline’s steps.
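
The chaining of steps described above can be pictured with a small sketch. This is not Wallaroo code: run_pipeline and the three step functions are hypothetical stand-ins that only show how each step's output becomes the next step's input.

```python
from functools import reduce

def run_pipeline(steps, data):
    """Feed data through each step in order; each step's output is the next step's input."""
    return reduce(lambda current, step: step(current), steps, data)

# Hypothetical stand-ins for pipeline steps
def preprocess(rows):
    return [[x / 10.0 for x in row] for row in rows]

def model(rows):
    return [sum(row) for row in rows]

def postprocess(scores):
    return [round(s, 2) for s in scores]

result = run_pipeline([preprocess, model, postprocess], [[1, 2, 3], [4, 5, 6]])
print(result)
```

Only the output of the final step (postprocess here) is returned to the caller.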

  • Inputs are sent as one of the following: a pandas DataFrame, an Apache Arrow table, or (when submitting a file) custom JSON.

Apache Arrow is the recommended method of data input for inferences. Wallaroo inference data is based on Apache Arrow, which on average returns faster inference results and requires less data transfer than JSON or DataFrame tables. Arrow tables also specify the data types used in their schema, ensuring that the data sent and received are exactly what is required. Using a pandas DataFrame requires inferring the data type, which may lead to data type mismatch issues.

For a complete example of using the Wallaroo SDK for inferencing, see the Wallaroo SDK Inference Tutorial.

Run Inference through Local Variable

The pipeline infer(data, timeout, dataset, dataset_exclude, dataset_separator) method performs an inference as defined by the pipeline steps and takes the following arguments:

  • data (REQUIRED): The data submitted to the pipeline for inference. The following data inputs are supported:
    • pandas.DataFrame: Data submitted as a pandas DataFrame is returned as a pandas DataFrame, with the output columns based on the model's outputs.
    • Apache Arrow (Preferred): Data submitted as an Apache Arrow table is returned as an Apache Arrow table.
  • timeout (OPTIONAL): A timeout in seconds before the inference throws an exception. The default is 15 seconds per call to accommodate large, complex models. Note that for a batch inference, this is per call: with 10 inference requests, each would have a default timeout of 15 seconds.
  • dataset (OPTIONAL): The datasets to be returned. The datasets available are:
    • *: Default. This translates to ["time", "in", "out", "check_failures"].
    • time: The DateTime of the inference request.
    • in: All inputs listed as in.{variable_name}.
    • out: All outputs listed as out.{variable_name}.
    • check_failures: Flags whether an Anomaly or Validation Check was triggered. 0 indicates no checks were triggered, 1 or greater indicates a check was triggered.
    • meta: IMPORTANT NOTE: See Metadata Requests Restrictions for specifications on how to use meta or metadata dataset requests in combination with other fields.
      • Returns in the metadata.elapsed field:
        • A list of time in nanoseconds for:
          • The time to serialize the input.
          • How long each step took.
      • Returns in the metadata.last_model field:
        • A dict with each Python step as:
          • model_name: The name of the model in the pipeline step.
          • model_sha: The sha hash of the model in the pipeline step.
      • Returns in the metadata.pipeline_version field:
        • The pipeline version as a UUID value.
    • metadata.elapsed: See Metadata Requests Restrictions for specifications on how to use meta or metadata dataset requests in combination with other fields.
      • Returns in the metadata.elapsed field:
        • A list of time in nanoseconds for:
          • The time to serialize the input.
          • How long each step took.
  • dataset_exclude (OPTIONAL): Allows users to exclude parts of the dataset.
  • dataset_separator (OPTIONAL): Allows other types of dataset separators to be used. If set to “.”, the returned dataset will be flattened.

Outputs of the inference are based on the model’s outputs as out.{model_output}. The model in the following example has only one output, dense_1, which is listed in the out.dense_1 column. If the model has multiple outputs, they would be listed as out.output1, out.output2, etc.

The following example is an inference request using an Apache Arrow table. The inference result is returned as an Apache Arrow table, which is then converted into a pandas DataFrame and a Polars DataFrame, each filtered to the rows where the output is greater than 0.75.

result = ccfraud_pipeline.infer(ccfraud_input_1k_arrow_table)

display(result)

pyarrow.Table
time: timestamp[ms]
in.tensor: list<item: float> not null
  child 0, item: float
out.dense_1: list<inner: float not null> not null
  child 0, inner: float not null
check_failures: int8
----
time: [[2023-03-20 18:55:09.562,2023-03-20 18:55:09.562,2023-03-20 18:55:09.562,2023-03-20 18:55:09.562,2023-03-20 18:55:09.562,...,2023-03-20 18:55:09.562,2023-03-20 18:55:09.562,2023-03-20 18:55:09.562,2023-03-20 18:55:09.562,2023-03-20 18:55:09.562]]
in.tensor: [[[-1.0603298,2.3544967,-3.5638788,5.138735,-1.2308457,...,0.038412016,1.0993439,1.2603409,-0.14662448,-1.4463212],[-1.0603298,2.3544967,-3.5638788,5.138735,-1.2308457,...,0.038412016,1.0993439,1.2603409,-0.14662448,-1.4463212],...,[0.49511018,-0.24993694,0.4553345,0.92427504,-0.36435103,...,1.1117147,-0.566654,0.12122019,0.06676402,0.6583282],[0.61188054,0.1726081,0.43105456,0.50321484,-0.27466634,...,0.30260187,0.081211455,-0.15578508,0.017189292,-0.7236631]]]
out.dense_1: [[[0.99300325],[0.99300325],...,[0.0008533001],[0.0012498498]]]
check_failures: [[0,0,0,0,0,...,0,0,0,0,0]]
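import pyarrow as pa

# Threshold applied to the model output
threshold = 0.75

# Convert the Arrow table to a pandas DataFrame
outputs = result.to_pandas()
# Keep only the rows whose first output value exceeds the threshold
mask = [elt[0] > threshold for elt in outputs['out.dense_1']]
outputs = outputs.loc[mask]
display(outputs)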
  time  in.tensor  out.dense_1  check_failures
02023-03-20 18:55:09.562[-1.0603298, 2.3544967, -3.5638788, 5.138735, -1.2308457, -0.76878244, -3.5881228, 1.8880838, -3.2789674, -3.9563255, 4.099344, -5.653918, -0.8775733, -9.131571, -0.6093538, -3.7480276, -5.0309124, -0.8748149, 1.9870535, 0.7005486, 0.9204423, -0.10414918, 0.32295644, -0.74181414, 0.038412016, 1.0993439, 1.2603409, -0.14662448, -1.4463212][0.99300325]0
12023-03-20 18:55:09.562[-1.0603298, 2.3544967, -3.5638788, 5.138735, -1.2308457, -0.76878244, -3.5881228, 1.8880838, -3.2789674, -3.9563255, 4.099344, -5.653918, -0.8775733, -9.131571, -0.6093538, -3.7480276, -5.0309124, -0.8748149, 1.9870535, 0.7005486, 0.9204423, -0.10414918, 0.32295644, -0.74181414, 0.038412016, 1.0993439, 1.2603409, -0.14662448, -1.4463212][0.99300325]0
22023-03-20 18:55:09.562[-1.0603298, 2.3544967, -3.5638788, 5.138735, -1.2308457, -0.76878244, -3.5881228, 1.8880838, -3.2789674, -3.9563255, 4.099344, -5.653918, -0.8775733, -9.131571, -0.6093538, -3.7480276, -5.0309124, -0.8748149, 1.9870535, 0.7005486, 0.9204423, -0.10414918, 0.32295644, -0.74181414, 0.038412016, 1.0993439, 1.2603409, -0.14662448, -1.4463212][0.99300325]0
32023-03-20 18:55:09.562[-1.0603298, 2.3544967, -3.5638788, 5.138735, -1.2308457, -0.76878244, -3.5881228, 1.8880838, -3.2789674, -3.9563255, 4.099344, -5.653918, -0.8775733, -9.131571, -0.6093538, -3.7480276, -5.0309124, -0.8748149, 1.9870535, 0.7005486, 0.9204423, -0.10414918, 0.32295644, -0.74181414, 0.038412016, 1.0993439, 1.2603409, -0.14662448, -1.4463212][0.99300325]0
1612023-03-20 18:55:09.562[-9.716793, 9.174981, -14.450761, 8.653825, -11.039951, 0.6602411, -22.825525, -9.919395, -8.064324, -16.737926, 4.852197, -12.563343, -1.0762653, -7.524591, -3.2938414, -9.62102, -15.6501045, -7.089741, 1.7687134, 5.044906, -11.365625, 4.5987034, 4.4777045, 0.31702697, -2.2731977, 0.07944675, -10.052058, -2.024108, -1.0611985][1.0]0
9412023-03-20 18:55:09.562[-0.50492376, 1.9348029, -3.4217603, 2.2165704, -0.6545315, -1.9004827, -1.6786858, 0.5380051, -2.7229102, -5.265194, 3.504164, -5.4661765, 0.68954825, -8.725291, 2.0267954, -5.4717045, -4.9123807, -1.6131229, 3.8021576, 1.3881834, 1.0676425, 0.28200775, -0.30759808, -0.48498034, 0.9507336, 1.5118006, 1.6385275, 1.072455, 0.7959132][0.9873102]0
import polars as pl

outputs =  pl.from_arrow(result)

display(outputs.filter(pl.col("out.dense_1").apply(lambda x: x[0]) > 0.75))
time                     in.tensor                           out.dense_1  check_failures
datetime[ms]             list[f32]                           list[f32]    i8
2023-03-20 18:55:09.562  [-1.06033, 2.354497, … -1.446321]   [0.993003]   0
2023-03-20 18:55:09.562  [-1.06033, 2.354497, … -1.446321]   [0.993003]   0
2023-03-20 18:55:09.562  [-1.06033, 2.354497, … -1.446321]   [0.993003]   0
2023-03-20 18:55:09.562  [-1.06033, 2.354497, … -1.446321]   [0.993003]   0
2023-03-20 18:55:09.562  [-9.716793, 9.174981, … -1.061198]  [1.0]        0
2023-03-20 18:55:09.562  [-0.504924, 1.934803, … 0.795913]   [0.98731]    0

Metadata Requests Restrictions

The following restrictions are in place when requesting the datasets metadata or metadata.elapsed.

Standard Pipeline Steps

For the following Pipeline steps, metadata or metadata.elapsed must be requested with the * parameter. For example:

result = mainpipeline.infer(normal_input, dataset=["*", "metadata.elapsed"])

Affected pipeline steps:

  • add_model_step
  • replace_with_model_step

Testing Pipeline Steps

For the following Pipeline steps, metadata or metadata.elapsed cannot be included with the * parameter. For example:

result = mainpipeline.infer(normal_input, dataset=["metadata.elapsed"])

Affected pipeline steps:

  • add_random_split
  • replace_with_random_split
  • add_shadow_deploy
  • replace_with_shadow_deploy

Numpy Arrays as Inputs

Numpy arrays can be submitted as an input by placing them within a DataFrame. In this example, the input column is tensor, but it can be whatever the model expects.

dataframedata = pd.DataFrame({"tensor":[npArray]})

This bypasses the need to convert the npArray to a List: the object itself can be embedded into the DataFrame table and submitted. In this example, a DataFrame with the column tensor that contains a numpy array is submitted for inference, and from the return only the column out.2519 is displayed.

infResults = pipeline.infer(dataframedata, dataset=["*", "metadata.elapsed"])
display(infResults.loc[0]["out.2519"])

[44,
 44,
 44,
 44,
 82,
 44,
 44,
 44,
 44,
 44,
 44,
 44,
 44,
 44,
 44,
 44,
 44,
 44,
 44,
 84,
 84,
 44,
 84,
 44,
 44,
 44,
 61,
 44,
 86,
 44,
 44]

Run Inference From A File

To submit a data file directly to a pipeline, use the pipeline infer_from_file(data, timeout, dataset, dataset_exclude, dataset_separator) method. This performs an inference as defined by the pipeline steps and takes the following arguments:

  • data (REQUIRED): The name of the file submitted to the pipeline for inference. The following data formats are supported:
    • pandas.DataFrame: Data submitted as a pandas DataFrame is returned as a pandas DataFrame, with the output columns based on the model's outputs.
    • Apache Arrow (Preferred): Data submitted as an Apache Arrow table is returned as an Apache Arrow table.
    • Custom JSON: Data formatted in a custom JSON format. This requires the use of the data_format="custom-json" parameter. IMPORTANT NOTE: Submitting JSON as input data can have performance repercussions compared to using either pandas DataFrame or Apache Arrow as the data input.
  • timeout (OPTIONAL): A timeout in seconds before the inference throws an exception. The default is 15 seconds per call to accommodate large, complex models. Note that for a batch inference, this is per call: with 10 inference requests, each would have a default timeout of 15 seconds. Inferences sent in a batch rather than as individual inference requests are processed faster.
  • dataset (OPTIONAL): The datasets to be returned. The datasets available are:
    • *: Default. This translates to ["time", "in", "out", "check_failures"].
    • time: The DateTime of the inference request.
    • in: All inputs listed as in.{variable_name}.
    • out: All outputs listed as out.{variable_name}.
    • check_failures: Flags whether an Anomaly or Validation Check was triggered. 0 indicates no checks were triggered, 1 or greater indicates a check was triggered.
    • meta:
      • Returns in the metadata.elapsed field:
        • A list of time in nanoseconds for:
          • The time to serialize the input.
          • How long each step took.
      • Returns in the metadata.last_model field:
        • A dict with each Python step as:
          • model_name: The name of the model in the pipeline step.
          • model_sha: The sha hash of the model in the pipeline step.
    • metadata.elapsed:
      • Returns in the metadata.elapsed field:
        • A list of time in nanoseconds for:
          • The time to serialize the input.
          • How long each step took.
  • data_format: If the input is custom JSON, then this parameter must be included as data_format="custom-json".
  • dataset_exclude (OPTIONAL): Allows users to exclude parts of the dataset.
  • dataset_separator (OPTIONAL): Allows other types of dataset separators to be used. If set to “.”, the returned dataset will be flattened.

In this example, an Apache Arrow file containing 50K inference inputs is submitted to a model trained on IMDB reviews, and the first 5 results are displayed.

results = imdb_pipeline.infer_from_file('./data/test_data_50K.arrow')
import polars as pl

outputs =  pl.from_arrow(results)
display(outputs.head(5))

shape: (5, 4)
time                     in.tensor              out.dense_1  check_failures
datetime[ms]             list[f32]              list[f32]    i8
2023-03-20 20:53:50.170  [11.0, 6.0, … 0.0]     [0.898019]   0
2023-03-20 20:53:50.170  [54.0, 548.0, … 20.0]  [0.056597]   0
2023-03-20 20:53:50.170  [1.0, 9259.0, … 1.0]   [0.92608]    0
2023-03-20 20:53:50.170  [10.0, 25.0, … 0.0]    [0.926919]   0
2023-03-20 20:53:50.170  [10.0, 37.0, … 0.0]    [0.661858]   0

In this example, an inference will be submitted to the ccfraud_pipeline with the file smoke_test.df.json, a DataFrame formatted JSON file.

result = ccfraud_pipeline.infer_from_file('./data/smoke_test.df.json')
  time  in.tensor  out.dense_1  check_failures
02023-02-15 23:07:07.497[1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756][0.0014974177]0

Parallel Inferences

Wallaroo pipelines allow for multiple replicas of the pipeline and models to be deployed. This allows for parallel inferences to increase the speed of multiple inference requests. Wallaroo does so by scaling multiple replicas of the deployed pipeline and models based on the pipeline configuration. See Pipeline Deployment Configuration.

Parallel Inference Use Cases

Parallel inferences are most useful when:

  • Inference request inputs are extremely large - for example, greater than 4 GB. Parallel inference requests allow that request to be split into more manageable sizes and submitted in one request, with each segment automatically handled as a separate inference request.
  • Inference inputs come from different data sources. This allows organizations to query data from different sources, add each query result to the list, then submit the entire list as one request and receive the results fast.
  • Image processing, where the image is of such an extreme size and resolution that submitting it whole requires large amounts of memory and bandwidth. The image can be split into separate pieces, then all the pieces submitted in one request to allow parallelization to examine each individual piece and return the results faster than analyzing the entire large image.

It is highly recommended that the data elements included in the parallel inference List are all of the same data type. For example: all of the elements of the list should be a pandas DataFrame OR all an Apache Arrow table. This makes processing the returned information easier rather than trying to parse what type of data is received.

For example, the parallel inference input list should be in the format:

   Data Type
0  DataFrame
1  DataFrame
2  DataFrame
3  DataFrame

And not:

   Data Type
0  DataFrame
1  Apache Arrow
2  DataFrame
3  Apache Arrow

Parallel Inferences Method

The pipeline parallel_infer(tensor_list, timeout, num_parallel, retries) asynchronous method performs an inference as defined by the pipeline steps and takes the following arguments:

  • tensor_list (REQUIRED List): The data submitted to the pipeline for inference as a List of the supported data types:
    • pandas.DataFrame: Data submitted as a pandas DataFrame is returned as a pandas DataFrame, with the output columns based on the model's outputs.
    • Apache Arrow (Preferred): Data submitted as an Apache Arrow table is returned as an Apache Arrow table.
  • timeout (OPTIONAL int): A timeout in seconds before the inference throws an exception. The default is 15 seconds per call to accommodate large, complex models. Note that for a batch inference, this is per list item: with 10 inference requests, each would have a default timeout of 15 seconds.
  • num_parallel (OPTIONAL int): The number of parallel threads used for the submission. This should be no more than four times the number of pipeline replicas.
  • retries (OPTIONAL int): The number of retries per inference request submitted.

parallel_infer is an asynchronous method that returns the Python callback list of tasks. It should be called with the await keyword to retrieve the callback results.

For example, the following will split a single pandas DataFrame table into rows, with each row converted into a separate DataFrame table. Once complete, the list of tables is submitted via parallel_infer, and the results are collected together as a new List. For this example, there are 4 replicas set in the pipeline deployment configuration.

dataset = []
for index, row in test_data.head(200).iterrows():
    dataset.append(row.to_frame('text_input').reset_index())

# we have a list of 200 dataframes - run as an inference
parallel_results = await pipeline.parallel_infer(dataset, timeout=10, num_parallel=8, retries=1)

Parallel Inference Returns

The await pipeline.parallel_infer method asynchronously returns a List of inference results. Each result matches the type of its input: pandas DataFrame inputs return pandas DataFrame, and Apache Arrow inputs return Apache Arrow objects. For example, a parallel inference request with 3 DataFrame tables in the list will return a list with 3 DataFrame tables.

Inference failures are tied to the object in the List that caused the failure. For example, given a List with [dataframe1, dataframe2, dataframe3] where dataframe2 is malformed, the List returned from await pipeline.parallel_infer would be [some inference result, error inference result, some inference result]. Results are returned in the same order as the data submitted.
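
The ordering and error behavior described here can be sketched with plain asyncio. This is only an illustration of the pattern, not the SDK's implementation: fake_infer and parallel_sketch are hypothetical names, and the real parallel_infer may handle retries and timeouts differently.

```python
import asyncio

async def fake_infer(item):
    """Hypothetical stand-in for a single inference request."""
    await asyncio.sleep(0.01)
    if item is None:
        raise ValueError("malformed input")
    return item * 2

async def parallel_sketch(items, num_parallel=4, retries=1, timeout=10):
    semaphore = asyncio.Semaphore(num_parallel)  # cap the number of concurrent requests

    async def one(item):
        async with semaphore:
            for _ in range(retries + 1):
                try:
                    return await asyncio.wait_for(fake_infer(item), timeout)
                except Exception as err:
                    last_error = err
            return last_error  # the error stays tied to this list position

    # gather returns results in the same order as the submitted items
    return await asyncio.gather(*(one(item) for item in items))

results = asyncio.run(parallel_sketch([1, None, 3]))
print(results)
```

In a Jupyter notebook, where an event loop is already running, the last call would be written as results = await parallel_sketch([1, None, 3]) instead of using asyncio.run.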

Output Formats

DataFrame and Arrow

Output formats are based on the input types: pandas DataFrame inputs return pandas DataFrame, and Apache Arrow inputs return Apache Arrow objects.

The default columns returned are:

  • time: The DateTime of the inference request.
  • in: The input data.
  • out: The output data. Outputs of the inference are based on the model’s outputs as out.{model_output}. For example, a model with the single output dense_1 lists it in the out.dense_1 column. If the model has multiple outputs, they would be listed as out.{outputname1}, out.{outputname2}, etc.
  • check_failures: Whether any Pipeline validation parameters were triggered.

Columns returned are controlled by the dataset_exclude array parameter, which specifies which output columns to ignore. For example, if a model outputs the columns out.rambo, out.main, out.glibnar, using the parameter dataset_exclude=["out.rambo", "out.glibnar"] will exclude those columns from the output.
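
The effect of dataset_exclude on the returned columns can be pictured with the following sketch. The helper apply_exclude is hypothetical and runs purely on column names; treating an entry such as "in" as a prefix for every "in.*" column is an assumption made for illustration, not confirmed SDK behavior.

```python
def apply_exclude(columns, dataset_exclude):
    """Drop excluded columns; an entry such as "in" also drops every "in.*" column (assumed)."""
    def excluded(col):
        return any(col == name or col.startswith(name + ".") for name in dataset_exclude)
    return [col for col in columns if not excluded(col)]

columns = ["time", "in.tensor", "out.rambo", "out.main", "out.glibnar", "check_failures"]
kept = apply_exclude(columns, ["out.rambo", "out.glibnar"])
print(kept)
```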

Custom JSON

When submitting custom JSON as an input, JSON is returned as an output based on the model’s output parameters.

Using Apache Arrow is highly encouraged over custom JSON or pandas DataFrame for its inference speed, lower data transmission, and use of specific data types as defined in the Arrow table schemas.

In this example, a pipeline with a statsmodels model accepts custom JSON inputs and returns JSON as the output.

results = pipeline.infer_from_file('bike_day_eval.json', data_format="custom-json")
display(results)

[{'forecast': [1882.378455403016,
   2130.6079157429585,
   2340.840053800859,
   2895.754978555364,
   2163.6575155637433,
   1509.1792126514365,
   2431.183892393437]}]

3 - Wallaroo SDK Reference Guide

Wallaroo SDK Reference Guide

3.1 - wallaroo.assay

An Assay represents a record in the database. An assay contains some high level attributes such as name, status, active, etc. as well as the sub objects Baseline, Window and Summarizer which specify how the Baseline is derived, how the Windows should be created and how the analysis should be conducted.

Assay(client: Optional[wallaroo.client.Client], data: Dict[str, Any])

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def turn_on(self):

Sets the Assay to active causing it to run and backfill any missing analysis.

def turn_off(self):

Disables the Assay. No further analysis will be conducted until the assay is enabled.

def set_alert_threshold(self, threshold: float):

Sets the alert threshold at the specified level. The status in the AssayAnalysis will show if this level is exceeded; however, alerting/notifications are not currently implemented.

def set_warning_threshold(self, threshold: float):

Sets the warning threshold at the specified level. The status in the AssayAnalysis will show if this level is exceeded; however, alerting/notifications are not currently implemented.
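
As a minimal sketch of how two thresholds might translate a score into a status, consider the following. The function status_for, the status labels, and the default alert value are all hypothetical; the source only states that the status shows whether a threshold is exceeded.

```python
def status_for(score, warning_threshold=None, alert_threshold=0.25):
    """Classify a score against an optional warning threshold and an alert threshold."""
    if score > alert_threshold:
        return "Alert"
    if warning_threshold is not None and score > warning_threshold:
        return "Warning"
    return "Ok"

for score in (0.0029, 0.15, 0.30):
    print(score, status_for(score, warning_threshold=0.1, alert_threshold=0.25))
```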

def meta_df(assay_result: Dict, index_name) -> pandas.core.frame.DataFrame:

Creates a dataframe for the meta data in the baseline or window excluding the edge information.

Parameters
  • assay_result: The dict of the raw assay result
def edge_df(window_or_baseline: Dict) -> pandas.core.frame.DataFrame:

Creates a dataframe specifically for the edge information in the baseline or window.

Parameters
  • window_or_baseline: The dict from the assay result of either the window or baseline
class AssayAnalysis:

The AssayAnalysis class helps handle the assay analysis logs from the Plateau logs. These logs are a json document with meta information on the assay and analysis as well as summary information on the baseline and window and information on the comparison between them.

AssayAnalysis(raw: Dict[str, Any])
def compare_basic_stats(self) -> pandas.core.frame.DataFrame:

Creates a simple dataframe making it easy to compare a baseline and window.

def baseline_stats(self) -> pandas.core.frame.DataFrame:

Creates a simple dataframe with the basic stats data for a baseline.

def compare_bins(self) -> pandas.core.frame.DataFrame:

Creates a simple dataframe to compare the bin/edge information of baseline and window.

def baseline_bins(self) -> pandas.core.frame.DataFrame:

Creates a simple dataframe with the edge/bin data for a baseline.

def chart(self, show_scores=True):

Quickly create a chart showing the bins, values and scores of an assay analysis. show_scores will also label each bin with its final weighted (if specified) score.

class AssayAnalysisList:

Helper class primarily to easily create a dataframe from a list of AssayAnalysis objects.

AssayAnalysisList(raw: List[wallaroo.assay.AssayAnalysis])
def to_dataframe(self) -> pandas.core.frame.DataFrame:

Creates and returns a summary dataframe from the assay results.

def to_full_dataframe(self) -> pandas.core.frame.DataFrame:

Creates and returns a dataframe with all values including inputs and outputs from the assay results.

def chart_df( self, df: Union[pandas.core.frame.DataFrame, pandas.core.series.Series], title: str, nth_x_tick=None):

Creates a basic chart of the scores from a dataframe created from an assay analysis list

def chart_scores(self, title: Optional[str] = None, nth_x_tick=4):

Creates a basic chart of the scores from an AssayAnalysisList

def chart_iopaths( self, labels: Optional[List[str]] = None, selected_labels: Optional[List[str]] = None, nth_x_tick=None):

Creates basic charts of the scores for each unique iopath of an AssayAnalysisList

class Assays(typing.List[wallaroo.assay.Assay]):

Wraps a list of assays for display in an HTML display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.2 - wallaroo.assay_config

def unwrap(v: Optional[~T]) -> ~T:

Simple function to placate pylance

class BaselineConfig:

Abstract base class for Baseline config objects. Currently only FixedBaseline is implemented though SlidingBaseline and others are planned.

BaselineConfig()
def to_json(self) -> str:
class FixedBaseline(BaselineConfig):

The FixedBaseline is calculated from the inferences from a specific time window.

FixedBaseline( pipeline_name: str, model_name: str, start: datetime.datetime, end: datetime.datetime)
class BaselineBuilder(abc.ABC):

Helper class that provides a standard way to create an ABC using inheritance.

@abstractmethod
def build(self) -> wallaroo.assay_config.BaselineConfig:
def to_json(self) -> str:
def ensure_tz(d: datetime.datetime) -> datetime.datetime:

Ensure the date is tz aware. If naive, assume it is in UTC.
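
A minimal sketch of this behavior using only the standard library is shown below. The name ensure_tz_sketch is hypothetical, and the SDK's own ensure_tz may be implemented differently.

```python
from datetime import datetime, timezone

def ensure_tz_sketch(d: datetime) -> datetime:
    """Return d unchanged if it is tz aware; otherwise tag it as UTC."""
    if d.tzinfo is None or d.tzinfo.utcoffset(d) is None:
        # Naive datetime: assume the wall-clock value is already in UTC
        return d.replace(tzinfo=timezone.utc)
    return d

naive = datetime(2023, 1, 1, 12, 0)
aware = ensure_tz_sketch(naive)
print(aware.isoformat())
```

Note that replace only attaches the timezone; it does not shift the clock time, which matches the "assume it is in UTC" wording.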

class FixedBaselineBuilder(BaselineBuilder):

Helps to easily create the config object for a FixedBaseline.

FixedBaselineBuilder(pipeline_name: str)
def add_model_name(self, model_name: str):

Specify the model to use in the baseline

def add_start(self, start: datetime.datetime):

Specify the start of the window for the baseline

def add_end(self, end: datetime.datetime):

Specify the end of the window for the baseline

Create the FixedBaseline object.

class SummarizerConfig:

The summarizer specifies how the bins of the baseline and window should be compared.

SummarizerConfig()
def to_json(self) -> str:
class BinMode(builtins.str, enum.Enum):

How the bins are calculated:

  • NONE - no bins. Only useful if we only care about the mean, median, etc.
  • EQUAL - evenly spaced bins, each of width (max - min) / num_bins.
  • QUANTILE - based on percentages. If num_bins is 5 then quintiles are used, so bins are created at the 20%, 40%, 60%, 80% and 100% points.
  • PROVIDED - the user provides the edge points for the bins.

NONE = <BinMode.NONE: 'None'>
EQUAL = <BinMode.EQUAL: 'Equal'>
QUANTILE = <BinMode.QUANTILE: 'Quantile'>
PROVIDED = <BinMode.PROVIDED: 'Provided'>
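
To make the difference between EQUAL and QUANTILE concrete, the sketch below computes right-hand bin edges both ways for a small data set with one large outlier. The helper names equal_edges and quantile_edges are hypothetical, and the interpolation method used for the quantiles is an assumption; the SDK may compute its edges differently.

```python
from statistics import quantiles

def equal_edges(values, num_bins):
    """Right-hand edges of evenly spaced bins of width (max - min) / num_bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    return [lo + width * i for i in range(1, num_bins + 1)]

def quantile_edges(values, num_bins):
    """Right-hand edges at the 1/num_bins, 2/num_bins, ... and 100% points."""
    cuts = quantiles(values, n=num_bins, method="inclusive")
    return cuts + [max(values)]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(equal_edges(data, 5))
print(quantile_edges(data, 5))
```

With the outlier at 100, the equal-width bins leave most bins empty, while the quantile edges keep roughly the same number of values in each bin.
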
Inherited Members
enum.Enum
name
value
builtins.str
encode
replace
split
rsplit
join
capitalize
casefold
title
center
count
expandtabs
find
partition
index
ljust
lower
lstrip
rfind
rindex
rjust
rstrip
rpartition
splitlines
strip
swapcase
translate
upper
startswith
endswith
removeprefix
removesuffix
isascii
islower
isupper
istitle
isspace
isdecimal
isdigit
isnumeric
isalpha
isalnum
isidentifier
isprintable
zfill
format
format_map
maketrans
class Aggregation(builtins.str, enum.Enum):

What we use to calculate the score:

  • EDGES - distances between the edges.
  • DENSITY - percentage of values that fall in each bin.
  • CUMULATIVE - cumulative percentage of values that fall in the bins.

EDGES = <Aggregation.EDGES: 'Edges'>
DENSITY = <Aggregation.DENSITY: 'Density'>
CUMULATIVE = <Aggregation.CUMULATIVE: 'Cumulative'>
Inherited Members
enum.Enum
name
value
builtins.str
encode
replace
split
rsplit
join
capitalize
casefold
title
center
count
expandtabs
find
partition
index
ljust
lower
lstrip
rfind
rindex
rjust
rstrip
rpartition
splitlines
strip
swapcase
translate
upper
startswith
endswith
removeprefix
removesuffix
isascii
islower
isupper
istitle
isspace
isdecimal
isdigit
isnumeric
isalpha
isalnum
isidentifier
isprintable
zfill
format
format_map
maketrans
class Metric(builtins.str, enum.Enum):

How we calculate the score:

  • MAXDIFF - maximum difference between corresponding bins.
  • SUMDIFF - sum of differences between corresponding bins.
  • PSI - Population Stability Index.

MAXDIFF = <Metric.MAXDIFF: 'MaxDiff'>
SUMDIFF = <Metric.SUMDIFF: 'SumDiff'>
PSI = <Metric.PSI: 'PSI'>
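
The three metrics can be sketched from their textbook definitions as follows. These functions are illustrative only and are not the SDK's implementation: the names, the eps guard for empty bins, and the absence of any scaling or weighting are assumptions. (In the SumDiff sample output earlier in this guide, for instance, the reported score is half the sum of the per-bin scores, so the SDK evidently applies additional scaling.)

```python
import math

def max_diff(baseline, window):
    """Largest absolute difference between corresponding bins."""
    return max(abs(w - b) for b, w in zip(baseline, window))

def sum_diff(baseline, window):
    """Sum of absolute differences between corresponding bins."""
    return sum(abs(w - b) for b, w in zip(baseline, window))

def psi(baseline, window, eps=1e-6):
    """Population Stability Index: sum of (w - b) * ln(w / b) over bins."""
    total = 0.0
    for b, w in zip(baseline, window):
        b, w = max(b, eps), max(w, eps)  # guard against empty bins
        total += (w - b) * math.log(w / b)
    return total

baseline = [0.2, 0.2, 0.2, 0.2, 0.2]
window = [0.1, 0.2, 0.3, 0.2, 0.2]
print(max_diff(baseline, window), sum_diff(baseline, window), psi(baseline, window))
```
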
Inherited Members
enum.Enum
name
value
builtins.str
encode
replace
split
rsplit
join
capitalize
casefold
title
center
count
expandtabs
find
partition
index
ljust
lower
lstrip
rfind
rindex
rjust
rstrip
rpartition
splitlines
strip
swapcase
translate
upper
startswith
endswith
removeprefix
removesuffix
isascii
islower
isupper
istitle
isspace
isdecimal
isdigit
isnumeric
isalpha
isalnum
isidentifier
isprintable
zfill
format
format_map
maketrans
class UnivariateContinousSummarizerConfig(SummarizerConfig):

The UnivariateContinousSummarizer analyzes one input or output feature (Univariate) at a time. Expects the values to be continuous or at least numerous enough to fall in various/all the bins.

UnivariateContinousSummarizerConfig( bin_mode: wallaroo.assay_config.BinMode, aggregation: wallaroo.assay_config.Aggregation, metric: wallaroo.assay_config.Metric, num_bins: int, bin_weights: Optional[List[float]] = None, bin_width: Optional[float] = None, provided_edges: Optional[List[float]] = None, add_outlier_edges: bool = True)
class SummarizerBuilder(abc.ABC):

Helper class that provides a standard way to create an ABC using inheritance.

@abstractmethod
def build(self) -> wallaroo.assay_config.SummarizerConfig:
class UnivariateContinousSummarizerBuilder(SummarizerBuilder):

Builds the UnivariateSummarizer

UnivariateContinousSummarizerBuilder()
def add_bin_mode( self, bin_mode: wallaroo.assay_config.BinMode, edges: Optional[List[float]] = None):

Sets the binning mode. If BinMode.PROVIDED is specified a list of edges is also required.

def add_num_bins(self, num_bins: int):

Sets the number of bins. If weights have been previously set they must be set to none to allow changing the number of bins.

def add_bin_weights(self, weights: Optional[List[float]]):

Specifies the weighting to be given to the bins. The number of weights must be 2 larger than the number of bins to accommodate outliers smaller and outliers larger than values seen in the baseline. The passed in values can be whole or real numbers and do not need to add up to 1 or any other specific value as they will be normalized during the score calculation phase. The weights passed in can be none to remove previously specified weights and to allow changing of the number of bins.
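
The following sketch shows one way the weight count and normalization could work. The helper weighted_scores and the way weights are multiplied into per-bin scores are assumptions for illustration; only the "num_bins + 2 weights, normalized later" rule comes from the description above.

```python
def weighted_scores(bin_scores, weights):
    """Scale per-bin scores by weights normalized to sum to 1."""
    if len(weights) != len(bin_scores):
        raise ValueError("expected one weight per bin, including both outlier bins")
    total = sum(weights)
    return [s * w / total for s, w in zip(bin_scores, weights)]

num_bins = 5
# 5 bins plus a left and a right outlier bin -> 7 weights
weights = [0, 1, 1, 2, 1, 1, 0]
bin_scores = [0.0, 0.01, 0.02, 0.04, 0.02, 0.01, 0.0]
print(weighted_scores(bin_scores, weights))
```

Because the weights are normalized, [0, 1, 1, 2, 1, 1, 0] and [0, 10, 10, 20, 10, 10, 0] give the same result.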

def add_metric(self, metric: wallaroo.assay_config.Metric):

Sets the metric mode.

def add_aggregation(self, aggregation: wallaroo.assay_config.Aggregation):

Sets the aggregation style.

def add_bin_edges(self, edges: Optional[List[float]]):

Specifies the right hand side (max value) of the bins. The number of edges must be equal to or one more than the number of bins. When equal to the number of bins, the edge for the left outlier bin is calculated from the baseline. When an additional edge is provided (one more than the number of bins), the first (lowest) value is used as the max value for the left outlier bin. The max value for the right hand outlier bin is always Float MAX.
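
The edge rules above can be sketched as follows. The helpers build_right_edges and bin_index are hypothetical, and using the baseline minimum as the derived left-outlier edge is an assumption; the source only says that edge is "calculated from the baseline".

```python
import sys
from bisect import bisect_left

def build_right_edges(edges, num_bins, baseline_min):
    """Return right-hand edges for: left outlier bin, num_bins bins, right outlier bin."""
    if len(edges) == num_bins:
        # No left-outlier edge supplied: derive it from the baseline (assumed: its minimum)
        edges = [baseline_min] + list(edges)
    elif len(edges) != num_bins + 1:
        raise ValueError("edges must have num_bins or num_bins + 1 entries")
    # The right outlier bin always extends to the largest float
    return list(edges) + [sys.float_info.max]

def bin_index(value, right_edges):
    """Index of the first bin whose right-hand edge is >= value."""
    return bisect_left(right_edges, value)

right_edges = build_right_edges([10.0, 20.0, 30.0], num_bins=3, baseline_min=0.0)
print(right_edges)
print(bin_index(-5.0, right_edges), bin_index(15.0, right_edges), bin_index(1e9, right_edges))
```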

class WindowConfig:

Configures a window to be compared against the baseline.

WindowConfig( pipeline_name: str, model_name: str, width: str, start: Optional[datetime.datetime] = None, interval: Optional[str] = None)
def to_json(self) -> str:
class WindowBuilder:

Helps build a WindowConfig. model and width are required but there are no good default values for them because they depend on the baseline. We leave it up to the assay builder to configure the window correctly after it is created.

WindowBuilder(pipeline_name: str)
def add_model_name(self, model_name: str):

The model name (model_id) that the window should analyze.

def add_width(self, **kwargs: int):

The width of the window to use when collecting data for analysis.

def add_interval(self, **kwargs: int):

The interval between the starts of successive windows, that is, how often the analysis is run.

def add_start(self, start: datetime.datetime):
def ConfigEncoder(o):

Used to format datetimes as we need when encoding to JSON

class AssayConfig:

Configuration for an Assay record.

AssayConfig( client: Optional[wallaroo.client.Client], name: str, pipeline_id: int, pipeline_name: str, active: bool, status: str, iopath: str, baseline: wallaroo.assay_config.BaselineConfig, window: wallaroo.assay_config.WindowConfig, summarizer: wallaroo.assay_config.SummarizerConfig, warning_threshold: Optional[float], alert_threshold: float, run_until: Optional[datetime.datetime], workspace_id: Optional[int])
def to_json(self) -> str:
def interactive_run(self) -> wallaroo.assay.AssayAnalysisList:

Runs this assay interactively. The assay is not saved to the database, nor are analysis records saved to a Plateau topic. Useful for exploring pipeline inference data and experimenting with thresholds.

def interactive_baseline_run(self) -> Optional[wallaroo.assay.AssayAnalysis]:
def interactive_input_run_arrow( self, inferences: pandas.core.frame.DataFrame, labels: Optional[List[str]]) -> wallaroo.assay.AssayAnalysisList:
def interactive_input_run_legacy( self, inferences: List[Dict], labels: Optional[List[str]]) -> wallaroo.assay.AssayAnalysisList:
def interactive_input_run( self, inferences: Union[List[Dict], pandas.core.frame.DataFrame], labels: Optional[List[str]]) -> wallaroo.assay.AssayAnalysisList:

Analyzes the given inputs to create an interactive run for each feature column. The assay is not saved to the database, nor are analysis records saved to a Plateau topic. Useful for exploring inputs for possible causes when a difference is detected in the output.

class AssayBuilder:

Helps build an AssayConfig

AssayBuilder( client: Optional[wallaroo.client.Client], name: str, pipeline_id: int, pipeline_name: str, model_name: str, baseline_start: datetime.datetime, baseline_end: datetime.datetime, iopath: str)
def baseline_dataframe(self):
def baseline_histogram( self, bins: Union[int, str, NoneType] = None, log_scale: bool = False):
def baseline_kde(self, log_scale: bool = False):
def baseline_ecdf(self, log_scale: bool = False):
def upload(self) -> int:
def add_name(self, name: str):

Specify the assay name

def add_active(self, active: bool):

Specify if the assay is active or not

def add_iopath(self, iopath: str):

Specify what the assay should analyze. Should start with input or output and have indexes (zero based) into row and column: For example 'input 0 1' specifies the second column of the first input.
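The iopath format described above can be illustrated with a small parser. IoPath and parse_iopath are hypothetical names used only for this sketch; the SDK may validate or interpret the string differently.

```python
from typing import NamedTuple


class IoPath(NamedTuple):
    """Hypothetical parsed form of an iopath string."""
    direction: str  # "input" or "output"
    row: int        # zero-based row index
    column: int     # zero-based column index


def parse_iopath(iopath: str) -> IoPath:
    parts = iopath.split()
    if len(parts) != 3 or parts[0] not in ("input", "output"):
        raise ValueError(f"unrecognized iopath: {iopath!r}")
    return IoPath(parts[0], int(parts[1]), int(parts[2]))


# 'input 0 1' is the second column (index 1) of the first input (index 0).
path = parse_iopath("input 0 1")
```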

def fixed_baseline_builder(self):

Creates a fixed baseline builder for this assay builder.

def add_baseline(self, baseline: wallaroo.assay_config.BaselineConfig):

Adds a specific baseline created elsewhere.

def window_builder(self):

Returns this assay builder's window builder.

def add_window(self, window: wallaroo.assay_config.WindowConfig):

Adds a window created elsewhere.

def univariate_continuous_summarizer(self) -> wallaroo.assay_config.UnivariateContinousSummarizerBuilder:

Creates a univariate continuous summarizer (UCS) builder and adds it to this assay builder.

def add_summarizer(self, summarizer: wallaroo.assay_config.SummarizerConfig):

Adds the summarizer created elsewhere to this builder.

def add_warning_threshold(self, warning_threshold: float):

Specify the warning threshold for this assay.

def add_alert_threshold(self, alert_threshold: float):

Specify the alert threshold for this assay.

def add_run_until(self, run_until: datetime.datetime):

Specifies how long this assay should run. Primarily useful for interactive runs, to limit the number of analyses.

def calc_bins(num_samples: int, bins: Union[int, str, NoneType]) -> Union[str, int]:

If the user specifies a number of bins or a strategy for calculating it, use that. Otherwise, use the smaller of the square root of the number of samples and 50.
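The rule above can be sketched as a few lines of Python. calc_bins_sketch is a hypothetical re-implementation written from the description only; in particular, truncating the square root to an integer is an assumption.

```python
import math
from typing import Optional, Union


def calc_bins_sketch(num_samples: int, bins: Optional[Union[int, str]]) -> Union[str, int]:
    # An explicit bin count or a named strategy is passed through unchanged.
    if bins is not None:
        return bins
    # Otherwise fall back to min(sqrt(num_samples), 50).
    # Truncating to int is an assumption of this sketch.
    return min(int(math.sqrt(num_samples)), 50)


small = calc_bins_sketch(100, None)     # sqrt(100) = 10
large = calc_bins_sketch(10_000, None)  # sqrt is 100, capped at 50
named = calc_bins_sketch(100, "auto")   # strategy string is kept as-is
```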

3.3 - wallaroo.auth

Handles authentication to the Wallaroo platform.

Performs a "device code"-style OAuth login flow.

The code is organized as follows:

  • Auth objects returned by create() should be placed on each request to platform APIs. Currently, we have the following types:

    • NoAuth: Does not modify requests
    • PlatformAuth: Places Authorization: Bearer XXX headers on each outgoing request
  • Objects derived from TokenFetcher know how to obtain an AccessToken from a particular provider:

    • KeycloakTokenFetcher: Fetches a token from Keycloak using a device code login flow
    • CachedTokenFetcher: Wraps another TokenFetcher and caches the value to a JSON file to reduce the number of user logins needed.
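The division of labor described above (a fetcher that obtains a token, a caching wrapper around it, and an auth object that adds a Bearer header) can be sketched with the standard library. CachedFetcherSketch, add_bearer_header, and fake_login are hypothetical stand-ins, not the module's classes, and the sketch ignores token expiry and refresh.

```python
import json
import os
import tempfile
from typing import Callable, Dict


class CachedFetcherSketch:
    """Hypothetical wrapper: caches a fetched token in a JSON file."""

    def __init__(self, fetch: Callable[[], str], cache_path: str) -> None:
        self._fetch = fetch
        self._cache_path = cache_path

    def token(self) -> str:
        # Reuse the cached value when the file exists.
        if os.path.exists(self._cache_path):
            with open(self._cache_path) as f:
                return json.load(f)["token"]
        # Otherwise call the wrapped fetcher (e.g. a login flow) once.
        value = self._fetch()
        with open(self._cache_path, "w") as f:
            json.dump({"token": value}, f)
        return value


def add_bearer_header(headers: Dict[str, str], token: str) -> Dict[str, str]:
    """Return a copy of `headers` with an Authorization: Bearer entry."""
    return {**headers, "Authorization": f"Bearer {token}"}


calls = []


def fake_login() -> str:
    # Stand-in for an interactive login; records how often it is called.
    calls.append(1)
    return "XXX"


cache_file = os.path.join(tempfile.mkdtemp(), "token.json")
fetcher = CachedFetcherSketch(fake_login, cache_file)
first = fetcher.token()
second = fetcher.token()  # served from the JSON cache, no second login
headers = add_bearer_header({"Accept": "application/json"}, second)
```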
class AuthType(enum.Enum):

Defines all the supported auth types.

Handles conversions from string names to enum values.

NONE = <AuthType.NONE: 'none'>
SSO = <AuthType.SSO: 'sso'>
USER_PASSWORD = <AuthType.USER_PASSWORD: 'user_password'>
TEST_AUTH = <AuthType.TEST_AUTH: 'test_auth'>
TOKEN = <AuthType.TOKEN: 'token'>
ORCH = <AuthType.ORCH: 'orch'>
Inherited Members
enum.Enum
name
value
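As an illustration of the string-to-enum conversion mentioned above, the following sketch mirrors the listed members in a stand-alone enum. AuthTypeSketch is a hypothetical copy for demonstration; it only shows standard enum.Enum value lookup, which may differ from how the SDK performs the conversion.

```python
import enum


class AuthTypeSketch(enum.Enum):
    # Member names and values copied from the listing above.
    NONE = "none"
    SSO = "sso"
    USER_PASSWORD = "user_password"
    TEST_AUTH = "test_auth"
    TOKEN = "token"
    ORCH = "orch"


# Calling the enum with a value returns the matching member.
chosen = AuthTypeSketch("sso")

# An unknown string raises ValueError in plain enum.Enum.
try:
    AuthTypeSketch("kerberos")
    unknown_ok = True
except ValueError:
    unknown_ok = False
```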
class TokenData(typing.NamedTuple):

TokenData(token, user_email, user_id)

TokenData(token: str, user_email: str, user_id: str)

Create new instance of TokenData(token, user_email, user_id)

token: str

Alias for field number 0

user_email: str

Alias for field number 1

user_id: str

Alias for field number 2

def to_dict(self) -> Dict[str, str]:
Inherited Members
builtins.tuple
index
count
def create( keycloak_addr: str, auth_type: Union[wallaroo.auth.AuthType, str, NoneType]) -> wallaroo.auth._WallarooAuth:

Returns an auth object of the corresponding type.

Parameters
  • str keycloak_addr: Address of the Keycloak instance to auth against
  • AuthType or str auth_type: Type of authentication to use
Returns

Auth object that can be passed to all requests calls

Raises
  • NotImplementedError: if auth_type is not recognized
def logout():

Removes cached values for all third-party auth providers.

This will not invalidate auth objects already created with create().

class AuthError(builtins.Exception):

Base type for all errors in this module.

AuthError(message: str, code: Optional[int] = None)
Inherited Members
builtins.BaseException
with_traceback
args
class TokenFetchError(AuthError):

Errors encountered while performing a login.

Inherited Members
builtins.BaseException
with_traceback
args
class TokenRefreshError(AuthError):

Errors encountered while refreshing an AccessToken.

Inherited Members
builtins.BaseException
with_traceback
args

3.4 - wallaroo.checks

class Expression:

Root base class for all model-checker expressions. Provides pythonic magic-method sugar for expression definitions.

Expression()
def model_names(self):
def as_json(self):
def one_of(self, *values):
@classmethod
def from_py(cls, value):

Creates an Expression from a given Python value.

def top_json(self) -> Dict[str, object]:

Creates a top-level expression that can be passed to the model checker runtime.
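The "magic-method sugar" idea can be shown with a small stand-alone sketch: comparison and arithmetic operators build an expression tree instead of computing a result, and the tree is then serialized. All class names below are hypothetical, and the JSON layout is invented for illustration; it is not the format the model checker expects.

```python
from typing import Any, Dict


class ExprSketch:
    """Hypothetical base node; operators return new nodes, not values."""

    def __lt__(self, other: Any) -> "BinOpSketch":
        return BinOpSketch("<", self, to_expr(other))

    def __add__(self, other: Any) -> "BinOpSketch":
        return BinOpSketch("+", self, to_expr(other))

    def as_json(self) -> Dict[str, Any]:
        raise NotImplementedError


class ValueSketch(ExprSketch):
    def __init__(self, value: Any) -> None:
        self.value = value

    def as_json(self) -> Dict[str, Any]:
        return {"value": self.value}


class VariableSketch(ExprSketch):
    def __init__(self, model_name: str, position: str, index: int) -> None:
        self.model_name, self.position, self.index = model_name, position, index

    def as_json(self) -> Dict[str, Any]:
        return {"model": self.model_name, "position": self.position, "index": self.index}


class BinOpSketch(ExprSketch):
    def __init__(self, op: str, left: ExprSketch, right: ExprSketch) -> None:
        self.op, self.left, self.right = op, left, right

    def as_json(self) -> Dict[str, Any]:
        return {"op": self.op, "left": self.left.as_json(), "right": self.right.as_json()}


def to_expr(value: Any) -> ExprSketch:
    # Plain Python values are wrapped so they can sit in the tree.
    return value if isinstance(value, ExprSketch) else ValueSketch(value)


# Reads like a comparison, but builds a tree describing the check.
check = VariableSketch("my-model", "output", 0) < 0.9
tree = check.as_json()
```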

class Function(Expression):

Root base class for all model-checker expressions. Provides pythonic magic-method sugar for expression definitions.

Function(op, args)
def model_names(self):
def as_json(self):
class BinOp(Expression):

Root base class for all model-checker expressions. Provides pythonic magic-method sugar for expression definitions.

BinOp(op, left, right)
def model_names(self):
def as_json(self):
class Variable(Expression):

Declares a model variable that can be used as an Expression in the model checker. Variables are identified by their model_name, a position of either "input" or "output", and the tensor index.

Variable(model_name, position, index)
def model_names(self):
def as_json(self):
def value_to_node(value):
class Value(Expression):

Root base class for all model-checker expressions. Provides pythonic magic-method sugar for expression definitions.

Value(value)
def model_names(self):
def as_json(self):
def is_prom_primitive(v):
class Aggregate:
Aggregate( name: str, promql_agg: str, inner_expression: wallaroo.checks.Expression, duration: datetime.timedelta, bucket_size: Optional[datetime.timedelta])
def expression(self):
def promql(self, gauge_name):
class Alert:
Alert(op, left, right)
def promql(self, gauge_name):
class DefinedFunction:
DefinedFunction(name)
class DefinedAggregate:
DefinedAggregate(name: str, promql_agg)
class Variables:
Variables(model, position)
def instrument( values: Dict[str, wallaroo.checks.Expression], gauges: List[str], validations: List[str]):
def dns_compliant(name: str):

Returns true if a string complies with the DNS label name requirements, so that it can be part of a full DNS host name.

def require_dns_compliance(name: str):

Validates that 'name' complies with DNS naming requirements or raises an exception
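For orientation, a DNS label check along the lines of RFC 1123 can be sketched as below. is_dns_label is a hypothetical helper; the exact rule applied by dns_compliant is not stated in this reference, so details such as rejecting uppercase letters are assumptions.

```python
import re

# RFC 1123 style label: lowercase letters, digits and hyphens, at most
# 63 characters, starting and ending with a letter or digit.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?")


def is_dns_label(name: str) -> bool:
    return _DNS_LABEL.fullmatch(name) is not None


good = is_dns_label("my-pipeline-1")
bad_chars = is_dns_label("My_Pipeline")
bad_start = is_dns_label("-leading-dash")
```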

3.5 - wallaroo.client

class Client:

Client handle to a Wallaroo platform instance.

Objects of this class serve as the entrypoint to Wallaroo platform functionality.

Client( api_endpoint: str = 'http://api-lb:8080', auth_endpoint: str = '', request_timeout: Optional[int] = None, auth_type: Optional[str] = None, gql_client: Optional[gql.client.Client] = None, interactive: Optional[bool] = None, time_format: str = '%Y-%d-%b %H:%M:%S')

Create a Client handle.

Parameters
  • str api_endpoint: Host/port of the platform API endpoint
  • str auth_endpoint: Host/port of the platform Keycloak instance
  • int request_timeout: Max timeout of web requests, in seconds
  • str auth_type: Authentication type to use. Can be one of: "none", "sso", "user_password".
  • bool interactive: If provided and True, some calls will print additional human information, or won't when False. If not provided, interactive defaults to True if running inside Jupyter and False otherwise.
  • str time_format: Preferred strftime format string for displaying timestamps in a human context.
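The default time_format from the signature above places the day before the abbreviated month name, which is easy to misread. The short sketch below applies it with the standard datetime.strftime; the alternative format shown is only an example of what a caller might pass instead.

```python
from datetime import datetime

# Default taken from the Client signature above.
DEFAULT_TIME_FORMAT = "%Y-%d-%b %H:%M:%S"

stamp = datetime(2022, 4, 19, 13, 17, 59)

# Year, day of month, abbreviated month name (locale dependent),
# e.g. 2022-19-Apr 13:17:59 in an English locale.
shown = stamp.strftime(DEFAULT_TIME_FORMAT)

# A caller who prefers year-month-day could pass this format instead.
iso_like = stamp.strftime("%Y-%m-%d %H:%M:%S")
```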
@staticmethod
def get_urls( auth_type: Optional[str], api_endpoint: str, auth_endpoint: str) -> Tuple[Optional[str], str, str]:

Method to calculate the auth values specified as defaults, as params or in ENV vars. Made static to be testable without reaching out to SSO, etc.

def list_tags(self) -> wallaroo.tag.Tags:

List all tags on the platform.

Returns

A list of all tags on the platform.

def list_models(self) -> wallaroo.models.ModelsList:

List all models on the platform.

Returns

A list of all models on the platform.

def list_deployments(self) -> List[wallaroo.deployment.Deployment]:

List all deployments (active or not) on the platform.

Returns

A list of all deployments on the platform.

def search_pipelines( self, search_term: Optional[str] = None, deployed: Optional[bool] = None, created_start: Optional[Datetime] = None, created_end: Optional[Datetime] = None, updated_start: Optional[Datetime] = None, updated_end: Optional[Datetime] = None) -> wallaroo.pipeline_variant.PipelineVariants:

Search for pipelines. All parameters are optional, in which case the result is the same as list_pipelines(). All times are strings to be parsed by datetime.isoformat. Example:

 myclient.search_pipelines(created_end='2022-04-19 13:17:59+00:00', search_term="foo")
Parameters
  • str search_term: Will be matched against tags and model names. Example: "footag123".
  • bool deployed: Pipeline was deployed or not
  • str created_start: Pipeline was created at or after this time
  • str created_end: Pipeline was created at or before this time
  • str updated_start: Pipeline was updated at or after this time
  • str updated_end: Pipeline was updated at or before this time
Returns

A list of all pipelines on the platform.
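The time strings in the example above are in ISO 8601 form. A small sketch of producing and checking such a string with the standard library follows; it does not call the SDK, and the variable names are illustrative only.

```python
from datetime import datetime, timezone

# Build a timezone-aware timestamp and render it as an ISO string
# with a space separator, matching the example above.
created_end = datetime(2022, 4, 19, 13, 17, 59, tzinfo=timezone.utc)
as_text = created_end.isoformat(sep=" ")

# Parsing the text back yields the same instant.
round_trip = datetime.fromisoformat(as_text)
```

The resulting as_text value could then be passed as created_end in a search_pipelines call like the one shown above.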

def search_my_models( self, search_term: Optional[str] = None, uploaded_time_start: Optional[Datetime] = None, uploaded_time_end: Optional[Datetime] = None) -> wallaroo.model.ModelVersions:

Search models owned by you.

Parameters
  • search_term: Searches the following metadata: names, shas, versions, file names, and tags
  • uploaded_time_start: Inclusive time of upload
  • uploaded_time_end: Inclusive time of upload

def search_models( self, search_term: Optional[str] = None, uploaded_time_start: Optional[Datetime] = None, uploaded_time_end: Optional[Datetime] = None) -> wallaroo.model.ModelVersions:

Search all models you have access to.

Parameters
  • search_term: Searches the following metadata: names, shas, versions, file names, and tags
  • uploaded_time_start: Inclusive time of upload
  • uploaded_time_end: Inclusive time of upload

def get_user_by_email(self, email: str) -> Optional[wallaroo.user.User]:

Find a user by email

def deactivate_user(self, email: str) -> None:

Deactivates an existing user of the platform

Deactivated users cannot log into the platform. Deactivated users do not count towards the number of allotted user seats from the license.

The Models and Pipelines owned by the deactivated user are not removed from the platform.

Parameters
  • str email: The email address of the user to deactivate.
Returns

None

def activate_user(self, email: str) -> None:

Activates an existing user of the platform that had been previously deactivated.

Activated users can log into the platform.

Parameters
  • str email: The email address of the user to activate.
Returns

None

def list_users(self) -> List[wallaroo.user.User]:

List of all Users on the platform

Returns

A list of all Users on the platform.

def upload_model( self, name: str, path: Union[str, pathlib.Path], framework: Optional[wallaroo.framework.Framework] = None, input_schema: Optional[pyarrow.lib.Schema] = None, output_schema: Optional[pyarrow.lib.Schema] = None, convert_wait: Optional[bool] = True) -> wallaroo.model.Model:

Upload a model defined by a file as a new model variant.

Parameters
  • name: str The name of the model of which this is a variant. Names must be ASCII alpha-numeric characters or dash (-) only.
  • path: Union[str, pathlib.Path] Path of the model file to upload.
  • framework: Optional[Framework] Supported model frameworks. Use models from Framework Enum. Example: Framework.PYTORCH, Framework.TENSORFLOW
  • input_schema: Optional pa.Schema Input schema, required for flavors other than ONNX, Tensorflow, and Python
  • output_schema: Optional pa.Schema Output schema, required for flavors other than ONNX, Tensorflow, and Python
  • convert_wait: Optional bool Defaults to True. Specifies if method should return when conversion is over or not.
Returns

The created Model.

def register_model_image(self, name: str, image: str) -> wallaroo.model.Model:

Registers an MLFlow model as a new model.

Parameters
  • str name: The name of the model of which this is a variant. Names must be ASCII alpha-numeric characters or dash (-) only.
  • str image: Image name of the MLFlow model to register.
Returns

The created Model.

def model_by_name(self, model_class: str, model_name: str) -> wallaroo.model.Model:

Fetch a Model by name.

Parameters
  • str model_class: Name of the model class.
  • str model_name: Name of the variant within the specified model class.
Returns

The Model with the corresponding model and variant name.

def deployment_by_name(self, deployment_name: str) -> wallaroo.deployment.Deployment:

Fetch a Deployment by name.

Parameters
  • str deployment_name: Name of the deployment.
Returns

The Deployment with the corresponding name.

def pipelines_by_name(self, pipeline_name: str) -> List[wallaroo.pipeline.Pipeline]:

Fetch Pipelines by name.

Parameters
  • str pipeline_name: Name of the pipeline.
Returns

A list of the Pipelines with the corresponding name.

def list_pipelines(self) -> List[wallaroo.pipeline.Pipeline]:

List all pipelines on the platform.

Returns

A list of all pipelines on the platform.

def build_pipeline(self, pipeline_name: str) -> wallaroo.pipeline.Pipeline:

Starts building a pipeline with the given pipeline_name, returning a PipelineConfigBuilder.

When completed, the pipeline can be uploaded with .upload()

Parameters
  • pipeline_name string: Name of the pipeline, must be composed of ASCII alpha-numeric characters plus dash (-).
def create_value_split_experiment( self, name: str, meta_key: str, default_model: wallaroo.model_config.ModelConfig, challenger_models: List[Tuple[Any, wallaroo.model_config.ModelConfig]]) -> wallaroo.pipeline.Pipeline:

Creates a new PipelineVariant of a "value-split experiment" type.

Parameters
  • str name: Name of the Pipeline
  • meta_key str: Inference input key on which to redirect inputs to experiment models.
  • default_model ModelConfig: Model to send inferences by default.
  • challenger_models List[Tuple[Any, ModelConfig]]: A list of meta_key values -> Models to send inferences. If the inference data referred to by meta_key is equal to one of the keys in this tuple, that inference is redirected to the corresponding model instead of the default model.
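The routing rule described for challenger_models can be sketched in plain Python. pick_model is a hypothetical helper and the models are plain strings standing in for ModelConfig objects; the real routing happens inside the pipeline, not in client code.

```python
from typing import Any, Dict, List, Tuple


def pick_model(inference: Dict[str, Any], meta_key: str, default_model: str,
               challenger_models: List[Tuple[Any, str]]) -> str:
    """Hypothetical helper: choose a model for one inference input."""
    value = inference.get(meta_key)
    for key, model in challenger_models:
        # A matching meta_key value redirects to the challenger model.
        if value == key:
            return model
    # Anything else goes to the default model.
    return default_model


challengers = [("gold", "model-gold"), ("platinum", "model-platinum")]
routed = pick_model({"card_type": "gold", "amount": 12.5}, "card_type",
                    "model-default", challengers)
fallback = pick_model({"card_type": "silver"}, "card_type",
                      "model-default", challengers)
```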
@staticmethod
def cleanup_arrow_data_for_display(arrow_data: pyarrow.lib.Table) -> pyarrow.lib.Table:

Cleans up the inference result and log data from engine / plateau for display (ux) purposes.

def get_logs( self, topic: str, limit: Optional[int] = None, start_datetime: Optional[datetime.datetime] = None, end_datetime: Optional[datetime.datetime] = None, dataset: Optional[List[str]] = None, dataset_exclude: Optional[List[str]] = None, dataset_separator: Optional[str] = None, directory: Optional[str] = None, file_prefix: Optional[str] = None, data_size_limit: Optional[str] = None, arrow: Optional[bool] = False) -> Tuple[Union[pyarrow.lib.Table, pandas.core.frame.DataFrame, wallaroo.logs.LogEntries, NoneType], Optional[str]]:

Get logs for the given topic.

Parameters
  • topic: str The topic to get logs for.
  • limit: Optional[int] The maximum number of logs to return.
  • start_datetime: Optional[datetime] The start time to get logs for.
  • end_datetime: Optional[datetime] The end time to get logs for.
  • dataset: Optional[List[str]] By default this is set to ["*"] which returns, ["time", "in", "out", "check_failures"]. Other available options - ["metadata"]
  • dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
  • dataset_separator: Optional[Union[Sequence[str], str]] If set to ".", return dataset will be flattened.
  • directory: Optional[str] If set, logs will be exported to a file in the given directory.
  • file_prefix: Optional[str] Prefix to name the exported file. Required if directory is set.
  • data_size_limit: Optional[str] The maximum size of the exported data in MB. Size includes all files within the provided directory. By default, the data_size_limit will be set to 100MB.
  • arrow: Optional[bool] If set to True, return logs as an Arrow Table. Else, returns Pandas DataFrame.
Returns

Tuple[Union[pa.Table, pd.DataFrame, LogEntries], str] The logs and status.

def security_logs(self, limit: int) -> List[dict]:

This function is not available in this release

def get_raw_logs( self, topic: str, start: Optional[datetime.datetime] = None, end: Optional[datetime.datetime] = None, limit: int = 100000, parse: bool = False, dataset: Optional[List[str]] = None, dataset_exclude: Optional[List[str]] = None, dataset_separator: Optional[str] = None, verbose: bool = False) -> Union[List[Dict[str, Any]], pandas.core.frame.DataFrame]:

Gets logs from Plateau for a particular time window without attempting to convert them to Inference LogEntries. Logs can be returned as strings or the json parsed into lists and dicts.

Parameters
  • topic str: The name of the topic to query
  • start Optional[datetime]: The start of the time window
  • end Optional[datetime]: The end of the time window
  • limit int: The number of records to retrieve. Note retrieving many records may be a performance bottleneck.
  • parse bool: Whether to attempt to parse the string as a JSON object.
  • verbose bool: Prints out info to help diagnose issues.
def get_raw_pipeline_inference_logs( self, topic: str, start: datetime.datetime, end: datetime.datetime, model_name: Optional[str] = None, limit: int = 100000, verbose: bool = False) -> List[Union[Dict[str, Any], pandas.core.frame.DataFrame]]:

Gets logs from Plateau for a particular time window and filters them for the model specified.

Parameters
  • pipeline_name str: The name/pipeline_id of the pipeline to query
  • topic str: The name of the topic to query
  • start Optional[datetime]: The start of the time window
  • end Optional[datetime]: The end of the time window
  • model_id: The name of the specific model to filter if any
  • limit int: The number of records to retrieve. Note retrieving many records may be a performance bottleneck.
  • verbose bool: Prints out info to help diagnose issues.
def get_pipeline_inference_dataframe( self, topic: str, start: datetime.datetime, end: datetime.datetime, model_name: Optional[str] = None, limit: int = 100000, verbose=False) -> pandas.core.frame.DataFrame:
def get_assay_results( self, assay_id: int, start: datetime.datetime, end: datetime.datetime) -> wallaroo.assay.AssayAnalysisList:

Gets the assay results for a particular time window, parses them, and returns an AssayAnalysisList of AssayAnalysis.

Parameters
  • assay_id int: The id of the assay we are looking for.
  • start datetime: The start of the time window
  • end datetime: The end of the time window
def build_assay( self, assay_name: str, pipeline: wallaroo.pipeline.Pipeline, model_name: str, baseline_start: datetime.datetime, baseline_end: datetime.datetime) -> wallaroo.assay_config.AssayBuilder:

Creates an AssayBuilder that can be used to configure and create Assays.

Parameters
  • assay_name str: Human friendly name for the assay
  • pipeline Pipeline: The pipeline this assay will work on
  • model_name str: The model that this assay will monitor
  • baseline_start datetime: The start time for the inferences to use as the baseline
  • baseline_end datetime: The end time of the baseline window. Windows start immediately after the baseline window and are run at regular intervals continuously until the assay is deactivated or deleted.
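The window schedule described above can be pictured with a short sketch. window_ranges is a hypothetical helper; stepping window starts by a fixed interval from the end of the baseline is an assumption made for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def window_ranges(baseline_end: datetime, width: timedelta,
                  interval: timedelta, count: int) -> List[Tuple[datetime, datetime]]:
    """Hypothetical helper: (start, end) pairs for the first `count` windows."""
    ranges = []
    for i in range(count):
        # The first window starts right after the baseline ends.
        start = baseline_end + i * interval
        ranges.append((start, start + width))
    return ranges


# Baseline ends on 2 January; daily windows, each 24 hours wide.
windows = window_ranges(datetime(2023, 1, 2), timedelta(hours=24),
                        timedelta(hours=24), 3)
```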
def upload_assay(self, config: wallaroo.assay_config.AssayConfig) -> int:

Creates an assay in the database.

Parameters
  • config AssayConfig: The configuration for the assay to create.
Returns

The identifier (an int) for the assay that was created.

def list_assays(self) -> List[wallaroo.assay.Assay]:

List all assays on the platform.

Returns

A list of all assays on the platform.

def create_tag(self, tag_text: str) -> wallaroo.tag.Tag:

Create a new tag with the given text.

def create_workspace(self, workspace_name: str) -> wallaroo.workspace.Workspace:

Create a new workspace with the current user as its first owner.

Parameters
  • str workspace_name: Name of the workspace, must be composed of ASCII alpha-numeric characters plus dash (-)
def list_workspaces(self) -> List[wallaroo.workspace.Workspace]:

List all workspaces on the platform which this user has permission to see.

Returns

A list of all workspaces on the platform.

def set_current_workspace( self, workspace: wallaroo.workspace.Workspace) -> wallaroo.workspace.Workspace:

Any calls involving pipelines or models will use the given workspace from then on.

def get_current_workspace(self) -> wallaroo.workspace.Workspace:

Return the current workspace. See also set_current_workspace.

def invite_user(self, email, password=None):
def get_topic_name(self, pipeline_pk_id: int) -> str:
def shim_token(self, token_data: wallaroo.auth.TokenData):

Given an inbound source model, a model type (xgboost, keras, sklearn), and conversion arguments, converts the model to ONNX and adds it to the available models for a pipeline.

Parameters
  • Union[str, pathlib.Path] path: The path to the model to convert, i.e. the source model.
  • ModelConversionSource source: The origin model type i.e. keras, sklearn or xgboost.
  • ModelConversionArguments conversion_arguments: A structure representing the arguments for converting a specific model type.
Returns

An instance of the Model being converted to Onnx.

Raises
  • ModelConversionGenericException: On a generic failure, please contact our support for further assistance.
  • ModelConversionFailure: Failure in converting the model type.
  • ModelConversionUnsupportedType: Raised when the source type passed is not supported.
  • ModelConversionSourceFileNotPresent: Raised when the passed source file does not exist.
def list_orchestrations(self):

List all Orchestrations in the current workspace.

Returns

A List containing all Orchestrations in the current workspace.

def upload_orchestration( self, bytes_buffer: Optional[bytes] = None, path: Optional[str] = None, name: Optional[str] = None, file_name: Optional[str] = None):

Upload a file to be packaged and used as an Orchestration.

The uploaded artifact must be a ZIP file which contains:

  • User code. If main.py exists, then that will be used as the task entrypoint. Otherwise, the first main.py found in any subdirectory will be used as the entrypoint.
  • Optional: A standard Python requirements.txt for any dependencies to be provided in the task environment. The Wallaroo SDK will already be present and should not be mentioned. Multiple requirements.txt files are not allowed.
  • Optional: Any other artifacts desired for runtime, including data or code.
Parameters
  • Optional[str] path: The path to the file on your filesystem that will be uploaded as an Orchestration.
  • Optional[bytes] bytes_buffer: The raw bytes to upload and use as the Orchestration. Cannot be used with the path param.
  • Optional[str] name: An optional descriptive name for this Orchestration.
  • Optional[str] file_name: An optional filename to describe your Orchestration when using the bytes_buffer param. Ignored when path is used.
Returns

The Orchestration that was uploaded.

Raises
  • OrchestrationUploadFailed: If a server-side error prevented the upload from succeeding.
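A ZIP artifact with the layout listed above can be assembled in memory with the standard library. The file contents and the dependency named in requirements.txt are placeholders chosen for this sketch, and the upload call itself is shown only as a comment because it needs a connected Client (called wl here as an assumption).

```python
import io
import zipfile


def build_orchestration_zip() -> bytes:
    """Build a minimal ZIP with an entrypoint and a requirements file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # main.py at the top level is used as the task entrypoint.
        archive.writestr("main.py", "print('hello from the task')\n")
        # Optional dependencies; the Wallaroo SDK itself is not listed.
        archive.writestr("requirements.txt", "requests\n")
    return buffer.getvalue()


payload = build_orchestration_zip()
names = zipfile.ZipFile(io.BytesIO(payload)).namelist()

# With a connected client, the bytes could be passed along the lines of:
# wl.upload_orchestration(bytes_buffer=payload, name="demo", file_name="demo.zip")
```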

def list_tasks(self, killed: bool = False):

List all Tasks in the current Workspace.

Returns

A List containing Task objects.

def get_task_by_id(self, task_id: str):

Retrieve a Task by its ID.

Parameters
  • str task_id: The ID of the Task to retrieve.
Returns

A Task object.

def in_task(self) -> bool:

Determines if this code is inside an orchestration task.

Returns

True if running in a task.

def task_args(self) -> Dict[Any, Any]:

When running inside a task (see in_task()), obtain arguments passed to the task.

Returns

Dict of the arguments

def list_connections(self) -> wallaroo.connection.ConnectionList:

List all Connections defined in the platform.

Returns

List of Connections in the whole platform.

def get_connection(self, name=<class 'str'>) -> wallaroo.connection.Connection:

Retrieves a Connection by its name.

Returns

Connection to an external data source.

def create_connection( self, name=<class 'str'>, connection_type=<class 'str'>, details=typing.Dict[str, typing.Any]) -> wallaroo.connection.Connection:

Creates a Connection with the given name, type, and type-specific details.

Returns

Connection to an external data source.

def mlops(self) -> wallaroo.wallaroo_ml_ops_api_client.client.AuthenticatedClient:

3.6 - wallaroo.comment

class Comment(wallaroo.object.Object):

Comment that may be attached to models and pipelines.

Comment( client: Optional[wallaroo.client.Client], data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def id(self) -> int:
def user_id(*args, **kwargs):
def message(*args, **kwargs):
def updated_at(*args, **kwargs):
def list_models(self) -> List[wallaroo.model.Model]:

Lists the models this comment is on.

def list_pipelines(self) -> List[wallaroo.pipeline.Pipeline]:

Lists the pipelines this comment is on.

def add_to_model(self, model_pk_id: int):
def remove_from_model(self, model_id: int):
def add_to_pipeline(self, pipeline_id: int):
def remove_from_pipeline(self, pipeline_pk_id: int):

3.7 - wallaroo.connection

class Connection(wallaroo.object.Object):

Connection to an external data source or destination.

Connection( client: wallaroo.client.Client, data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
@staticmethod
def list_connections( client: wallaroo.client.Client, workspace_id: Optional[int] = None) -> wallaroo.connection.ConnectionList:
@staticmethod
def get_connection( client: wallaroo.client.Client, name: str) -> wallaroo.connection.Connection:
@staticmethod
def create_connection( client: wallaroo.client.Client, name: str, connection_type: str, details: Dict[str, Any]) -> wallaroo.connection.Connection:
def delete_connection(self):
def add_connection_to_workspace(self, workspace_id: int):
def remove_connection_from_workspace(self, workspace_id: int):
def id(self):
def name(*args, **kwargs):
def connection_type(*args, **kwargs):
def details(*args, **kwargs):
def created_at(*args, **kwargs):
def workspace_names(*args, **kwargs):
class ConnectionList(typing.List[wallaroo.connection.Connection]):

Wraps a list of connections for display in a display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.8 - wallaroo.datasizeunit

class DataSizeUnit(enum.Enum):

Data size limits for exported pipeline log files

KiB = <DataSizeUnit.KiB: 'KiB'>
MiB = <DataSizeUnit.MiB: 'MiB'>
GiB = <DataSizeUnit.GiB: 'GiB'>
TiB = <DataSizeUnit.TiB: 'TiB'>
@staticmethod
def from_string(unit_str):
def calculate_bytes(self, size):
Inherited Members
enum.Enum
name
value
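The role of from_string and calculate_bytes can be illustrated with a stand-alone sketch. DataSizeUnitSketch is a hypothetical mirror of the enum above; the behavior shown (binary multiples of 1024) follows from the unit names but is an assumption about the SDK's implementation.

```python
import enum


class DataSizeUnitSketch(enum.Enum):
    KiB = "KiB"
    MiB = "MiB"
    GiB = "GiB"
    TiB = "TiB"

    @staticmethod
    def from_string(unit_str: str) -> "DataSizeUnitSketch":
        # Value lookup; raises ValueError for unknown units.
        return DataSizeUnitSketch(unit_str)

    def calculate_bytes(self, size: int) -> int:
        # KiB is 1024**1, MiB is 1024**2, and so on.
        exponent = ["KiB", "MiB", "GiB", "TiB"].index(self.value) + 1
        return size * 1024 ** exponent


limit_bytes = DataSizeUnitSketch.from_string("MiB").calculate_bytes(100)
```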

3.9 - wallaroo.deployment

class WaitForError(builtins.Exception):

Common base class for all non-exit exceptions.

WaitForError(message: str, status: Optional[Dict[str, Any]])
Inherited Members
builtins.BaseException
with_traceback
args
class WaitForDeployError(builtins.RuntimeError):

Unspecified run-time error.

WaitForDeployError(message: str)
Inherited Members
builtins.BaseException
with_traceback
args
def hack_pandas_dataframe_order(df):
class Deployment(wallaroo.object.Object):

Base class for all backend GraphQL API objects.

This class serves as a framework for API objects to be constructed based on a partially-complete JSON response, and to fill in their remaining members dynamically if needed.

Deployment(client: Optional[wallaroo.client.Client], data: Dict[str, Any])

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def id(self) -> int:
def name(*args, **kwargs):
def deployed(*args, **kwargs):
def model_configs(*args, **kwargs):
def pipeline_variants(*args, **kwargs):
def pipeline_name(*args, **kwargs):
def deploy(self) -> wallaroo.deployment.Deployment:

Deploys this deployment, if it is not already deployed.

If the deployment is already deployed, this is a no-op.

def undeploy(self) -> wallaroo.deployment.Deployment:

Shuts down this deployment, if it is deployed.

If the deployment is already undeployed, this is a no-op.

def status(self) -> Dict[str, Any]:

Returns a dict of deployment status useful for determining if a deployment has succeeded.

Returns

Dict of deployment internal state information.

def check_limit_status(self):
def wait_for_running(self, timeout: Optional[int] = None) -> wallaroo.deployment.Deployment:

Waits for the deployment status to enter the "Running" state.

Waits up to "timeout_request" seconds for the deployment to enter that state. This value is set in the "Client" object constructor. Raises various exceptions on failure.

Returns

The deployment, for chaining.
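
The wait-until-running behavior can be pictured as a polling loop with a deadline. The sketch below is a generic illustration, not the SDK's code: the helper name `wait_for_state`, the `"status"` key in the returned dict, and the use of `TimeoutError` are all assumptions.

```python
import time


def wait_for_state(get_status, target="Running", timeout=10.0, interval=0.5):
    """Poll get_status() until its "status" field equals target or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.get("status") == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"state {target!r} not reached within {timeout}s")
        time.sleep(interval)


# Stand-in for Deployment.status(): reports "Starting" twice, then "Running".
_states = iter(["Starting", "Starting", "Running"])
result = wait_for_state(lambda: {"status": next(_states)}, interval=0.01)
```

Using a monotonic clock for the deadline keeps the timeout unaffected by system clock adjustments.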

def wait_for_undeployed(self) -> wallaroo.deployment.Deployment:

Waits for the deployment to end.

Waits up to "timeout_request" seconds for the deployment to enter that state. This value is set in the "Client" object constructor. Raises various exceptions on failure.

Returns

The deployment, for chaining.

def infer( self, tensor: Union[Dict[str, Any], List[Any], pandas.core.frame.DataFrame, pyarrow.lib.Table], timeout: Union[int, float, NoneType] = None, dataset: Optional[List[str]] = None, dataset_exclude: Optional[List[str]] = None, dataset_separator: Optional[str] = None):

Returns an inference result on this deployment, given a tensor.

Parameters
  • tensor: Union[Dict[str, Any], List[Any], pd.DataFrame, pa.Table]. The tensor to be sent to run inference on.
  • timeout: Optional[Union[int, float]] Inference requests time out once the given number of seconds is exceeded. Defaults to 15 seconds.
  • dataset: Optional[List[str]] By default this is set to ["*"], which returns ["time", "in", "out", "check_failures"]. Other available options: ["metadata"].
  • dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
  • dataset_separator: Optional[str] If set to ".", returned dataset will be flattened.
Returns

InferenceResult in dictionary, dataframe or arrow format.
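
A short sketch of calling `infer` with the keyword arguments listed above. The helper `infer_with_metadata` and the `FakeDeployment` class are hypothetical; the fake only echoes its arguments so the example runs without a Wallaroo cluster, and it says nothing about the real return value.

```python
def infer_with_metadata(deployment, rows, timeout=30):
    """Call infer() on any object that provides the documented signature."""
    return deployment.infer(
        rows,
        timeout=timeout,
        # "*" is the documented default; "metadata" is the documented extra option.
        dataset=["*", "metadata"],
    )


class FakeDeployment:
    """Hypothetical stand-in for a deployed Deployment."""

    def infer(self, tensor, timeout=None, dataset=None,
              dataset_exclude=None, dataset_separator=None):
        # Echo the arguments back instead of contacting an engine.
        return {"rows": len(tensor), "timeout": timeout, "dataset": dataset}


result = infer_with_metadata(FakeDeployment(), [{"x": 1.0}, {"x": 2.0}])
```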

def infer_from_file( self, filename: Union[str, pathlib.Path], data_format: Optional[str] = None, timeout: Union[int, float, NoneType] = None, dataset: Optional[List[str]] = None, dataset_exclude: Optional[List[str]] = None, dataset_separator: Optional[str] = None) -> Union[List[wallaroo.inference_result.InferenceResult], pandas.core.frame.DataFrame, pyarrow.lib.Table]:

Runs inference on this deployment using the data in a file. The file can be an .arrow file, or a .json file containing data in either the pandas records format or the Wallaroo custom JSON format.

Parameters
  • filename: Union[str, pathlib.Path]. The file to be sent to run inference on.
  • data_format: Optional[str]. The format of the data in the file. If not provided, the format will be inferred from the file extension.
  • timeout: Optional[Union[int, float]] Inference requests time out once the given number of seconds is exceeded. Defaults to 15 seconds.
  • dataset: Optional[List[str]] By default this is set to ["*"], which returns ["time", "in", "out", "check_failures"]. Other available options: ["metadata"].
  • dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
  • dataset_separator: Optional[str] If set to ".", returned dataset will be flattened.
Returns

InferenceResult in dictionary, dataframe or arrow format.

def replace_model(self, model: wallaroo.model.Model) -> wallaroo.deployment.Deployment:

Replaces the current model with a default-configured Model.

Parameters
  • Model model: Model variant to replace current model with
def replace_configured_model( self, model_config: wallaroo.model_config.ModelConfig) -> wallaroo.deployment.Deployment:

Replaces the current model with a configured variant.

Parameters
  • ModelConfig model_config: Configured model to replace current model with
def internal_url(self) -> str:

Returns the internal inference URL that is only reachable from inside of the Wallaroo cluster by SDK instances deployed in the cluster.

If both pipelines and models are configured on the Deployment, this gives preference to pipelines. The returned URL is always for the first configured pipeline or model.

def url(self) -> str:

Returns the inference URL.

If both pipelines and models are configured on the Deployment, this gives preference to pipelines. The returned URL is always for the first configured pipeline or model.

def logs( self, limit: int = 100, valid: Optional[bool] = None) -> wallaroo.logs.LogEntries:

Deployment.logs() has been removed. Please use pipeline.logs() instead.

3.10 - wallaroo.deployment_config

class DeploymentConfig(typing.Dict):
def guarantee_workspace_id( self, workspace_id: Optional[int]) -> wallaroo.deployment_config.DeploymentConfig:
Inherited Members
builtins.dict
get
setdefault
pop
popitem
keys
items
values
update
fromkeys
clear
copy
class DeploymentConfigBuilder:
DeploymentConfigBuilder(workspace_id: Optional[int] = None)
def replica_count(self, count: int) -> wallaroo.deployment_config.DeploymentConfigBuilder:
def replica_autoscale_min_max(self, maximum: int, minimum: int = 0):

Configures the minimum and maximum for autoscaling

def autoscale_cpu_utilization(self, cpu_utilization_percentage: int):

Sets the average CPU utilization, as a percentage, on which to scale.

def disable_autoscale(self):

Disables autoscaling in the deployment configuration

def cpus( self, core_count: int) -> wallaroo.deployment_config.DeploymentConfigBuilder:
def memory( self, memory_spec: str) -> wallaroo.deployment_config.DeploymentConfigBuilder:
def lb_cpus( self, core_count: int) -> wallaroo.deployment_config.DeploymentConfigBuilder:
def lb_memory( self, memory_spec: int) -> wallaroo.deployment_config.DeploymentConfigBuilder:
def python_load_timeout_secs( self, timeout_secs: int) -> wallaroo.deployment_config.DeploymentConfigBuilder:
def sidekick_cpus( self, model: wallaroo.model.Model, core_count: int) -> wallaroo.deployment_config.DeploymentConfigBuilder:

Sets the number of CPUs to be used for the model's sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment.

Parameters
  • Model model: The sidekick model to configure.
  • int core_count: Number of CPU cores to use in this sidekick.
Returns

This DeploymentConfigBuilder instance for chaining.

def sidekick_memory( self, model: wallaroo.model.Model, memory_spec: str) -> wallaroo.deployment_config.DeploymentConfigBuilder:

Sets the memory to be used for the model's sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment.

Parameters
  • Model model: The sidekick model to configure.
  • str memory_spec: Specification of amount of memory (e.g., "2Gi", "500Mi") to use in this sidekick.
Returns

This DeploymentConfigBuilder instance for chaining.

def sidekick_env( self, model: wallaroo.model.Model, environment: Dict[str, str]) -> wallaroo.deployment_config.DeploymentConfigBuilder:

Sets the environment variables to be set for the model's sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment.

Parameters
  • Model model: The sidekick model to configure.
  • Dict[str, str] environment: Dictionary of environment variables names and their corresponding values to be set in the sidekick container.
Returns

This DeploymentConfigBuilder instance for chaining.
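
Because the methods above are documented as returning the builder, calls can be chained. The sketch below only uses method names and argument orders from the signatures in this section; the helper `configure_builder`, the `RecordingBuilder` stand-in, the resource values, and the `LOG_LEVEL` variable are hypothetical.

```python
def configure_builder(builder, model):
    """Apply a few documented settings to a DeploymentConfigBuilder-like object."""
    return (
        builder.replica_count(2)
        .cpus(1)
        .memory("1Gi")
        .sidekick_cpus(model, 1)
        .sidekick_memory(model, "2Gi")
        .sidekick_env(model, {"LOG_LEVEL": "info"})  # hypothetical variable
    )


class RecordingBuilder:
    """Hypothetical stand-in that records calls instead of building a config."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any unknown attribute becomes a method that records and returns self.
        def method(*args):
            self.calls.append((name, args))
            return self

        return method


recorded = configure_builder(RecordingBuilder(), "my-model")
```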

3.11 - wallaroo.engine_config

class EngineConfig:

Wraps an engine config.

EngineConfig( cpus: int, inference_channel_size: Optional[int] = None, model_concurrency: Optional[int] = None, pipeline_config_directory: Optional[str] = None, model_config_directory: Optional[str] = None, model_directory: Optional[str] = None, audit_logging: bool = False, standalone: bool = False)
@staticmethod
def as_standalone( cpus: int, inference_channel_size: Optional[int] = None, model_concurrency: Optional[int] = None, pipeline_config_directory: Optional[str] = None, model_config_directory: Optional[str] = None, model_directory: Optional[str] = None) -> wallaroo.engine_config.EngineConfig:

Creates an EngineConfig for use with standalone mode

def to_json(self) -> str:

Returns a json representation of this object

def to_yaml(self) -> str:

Returns a yaml representation of this object for use with standalone mode

3.12 - wallaroo.expression

3.13 - wallaroo.framework

class Framework(builtins.str, enum.Enum):

An Enum to represent the supported frameworks.

ONNX = <Framework.ONNX: 'onnx'>
TENSORFLOW = <Framework.TENSORFLOW: 'tensorflow'>
PYTHON = <Framework.PYTHON: 'python'>
KERAS = <Framework.KERAS: 'keras'>
SKLEARN = <Framework.SKLEARN: 'sklearn'>
PYTORCH = <Framework.PYTORCH: 'pytorch'>
XGBOOST = <Framework.XGBOOST: 'xgboost'>
HUGGING_FACE_FEATURE_EXTRACTION = <Framework.HUGGING_FACE_FEATURE_EXTRACTION: 'hugging-face-feature-extraction'>
HUGGING_FACE_IMAGE_CLASSIFICATION = <Framework.HUGGING_FACE_IMAGE_CLASSIFICATION: 'hugging-face-image-classification'>
HUGGING_FACE_IMAGE_SEGMENTATION = <Framework.HUGGING_FACE_IMAGE_SEGMENTATION: 'hugging-face-image-segmentation'>
HUGGING_FACE_IMAGE_TO_TEXT = <Framework.HUGGING_FACE_IMAGE_TO_TEXT: 'hugging-face-image-to-text'>
HUGGING_FACE_OBJECT_DETECTION = <Framework.HUGGING_FACE_OBJECT_DETECTION: 'hugging-face-object-detection'>
HUGGING_FACE_QUESTION_ANSWERING = <Framework.HUGGING_FACE_QUESTION_ANSWERING: 'hugging-face-question-answering'>
HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG = <Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG: 'hugging-face-stable-diffusion-text-2-img'>
HUGGING_FACE_SUMMARIZATION = <Framework.HUGGING_FACE_SUMMARIZATION: 'hugging-face-summarization'>
HUGGING_FACE_TEXT_CLASSIFICATION = <Framework.HUGGING_FACE_TEXT_CLASSIFICATION: 'hugging-face-text-classification'>
HUGGING_FACE_TRANSLATION = <Framework.HUGGING_FACE_TRANSLATION: 'hugging-face-translation'>
HUGGING_FACE_ZERO_SHOT_CLASSIFICATION = <Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION: 'hugging-face-zero-shot-classification'>
HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION = <Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION: 'hugging-face-zero-shot-image-classification'>
HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION = <Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION: 'hugging-face-zero-shot-object-detection'>
HUGGING_FACE_SENTIMENT_ANALYSIS = <Framework.HUGGING_FACE_SENTIMENT_ANALYSIS: 'hugging-face-sentiment-analysis'>
HUGGING_FACE_TEXT_GENERATION = <Framework.HUGGING_FACE_TEXT_GENERATION: 'hugging-face-text-generation'>
CUSTOM = <Framework.CUSTOM: 'custom'>
Inherited Members
enum.Enum
name
value
builtins.str
encode
replace
split
rsplit
join
capitalize
casefold
title
center
count
expandtabs
find
partition
index
ljust
lower
lstrip
rfind
rindex
rjust
rstrip
rpartition
splitlines
strip
swapcase
translate
upper
startswith
endswith
removeprefix
removesuffix
isascii
islower
isupper
istitle
isspace
isdecimal
isdigit
isnumeric
isalpha
isalnum
isidentifier
isprintable
zfill
format
format_map
maketrans
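
Since `Framework` derives from both `str` and `enum.Enum`, its members behave like plain strings as well as enum members. The two-member `MiniFramework` below is a hypothetical stand-in that illustrates the pattern without importing the SDK.

```python
import enum


class MiniFramework(str, enum.Enum):
    """Two-member stand-in that mirrors the str + Enum pattern above."""

    ONNX = "onnx"
    TENSORFLOW = "tensorflow"


# Lookup by value returns the existing member.
member = MiniFramework("onnx")
# Members are also str instances, so they compare equal to plain strings.
is_equal = member == "onnx"
# Inherited str methods operate on the member's value.
upper = member.upper()
```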

3.14 - wallaroo.functions

3.15 - wallaroo.inference_decode

def convert_to_np_dtype(dtype):
def to_nd_array_list(outputs: List[Dict[str, Any]]) -> List[numpy.ndarray]:
def decode_inference_result(entry: Dict[str, Any]) -> List[Dict[str, Any]]:

Decode inference results. Since they have a potentially rich structure, this could become a substantial effort in the future.

TODO: Support multiple outputs
TODO: Support multiple data types

def flatten_tensor(prefix: str, numeric_list: list) -> Dict[str, numbers.Number]:

Converts a possibly multidimensional list of numbers into a dict where each item in the list is represented by a key-value pair in the dict. Does not maintain dimensions since dataframes are 2D. Does not maintain or manage types since it should work for any type supported by numpy.

For example [1,2,3] => {prefix_0: 1, prefix_1: 2, prefix_2: 3}. [[1,2],[3,4]] => {prefix_0_0: 1, prefix_0_1: 2, prefix_1_0: 3, prefix_1_1: 4}
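
The flattening described above can be reproduced with a short recursive function. This is an illustrative re-implementation under the name `flatten_tensor_sketch`, written to match the two examples given; it is not the SDK's source.

```python
def flatten_tensor_sketch(prefix, values):
    """Flatten a nested list of numbers into {prefix_i_j...: number}."""
    flat = {}
    for index, item in enumerate(values):
        key = f"{prefix}_{index}"
        if isinstance(item, list):
            # Recurse, extending the key with the inner index.
            flat.update(flatten_tensor_sketch(key, item))
        else:
            flat[key] = item
    return flat


one_d = flatten_tensor_sketch("prefix", [1, 2, 3])
two_d = flatten_tensor_sketch("prefix", [[1, 2], [3, 4]])
```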

def flatten_dict(prefix: str, input_dict: Dict) -> Dict[str, Any]:

Recursively flattens the input dict, setting the values on the output dict. Assumes simple value types (str, numbers, dicts, and lists). If a value is a dict it is flattened recursively. If a value is a list each item is set as a new k, v pair.
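
A rough sketch of the recursive flattening described above. The key naming is an assumption: the text does not say how nested keys are joined, so this version joins with underscores and suffixes list items with their index. The function name `flatten_dict_sketch` is hypothetical.

```python
def flatten_dict_sketch(prefix, input_dict):
    """Flatten nested dicts and lists into a single-level dict."""
    flat = {}
    for key, value in input_dict.items():
        # Assumption: keys are joined to the prefix with an underscore.
        name = f"{prefix}_{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten_dict_sketch(name, value))
        elif isinstance(value, list):
            # Each list item becomes its own key-value pair.
            for index, item in enumerate(value):
                flat[f"{name}_{index}"] = item
        else:
            flat[name] = value
    return flat


flat = flatten_dict_sketch("log", {"model": {"name": "m1"}, "out": [0.1, 0.9], "ok": True})
```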

def inference_logs_to_dataframe(logs: List[Dict[str, Any]]) -> pandas.core.frame.DataFrame:

Very similar to dict_list_to_dataframe but specific to inference logs, since they have hierarchical input and output fields/structures that must be treated in particular ways.

def nested_df_to_flattened_df(orig: pandas.core.frame.DataFrame) -> pandas.core.frame.DataFrame:
def dict_list_to_dataframe(assay_results: List[Dict[str, Any]]) -> pandas.core.frame.DataFrame:

Primarily for assay result lists but can be used for any list of simple dicts.

3.16 - wallaroo.inference_result

class InferenceResult:
InferenceResult(gql_client: Optional[gql.client.Client], data: Dict[str, Any])

Initializes an InferenceResult.

Parameters
  • gql.Client gql_client: GQL client that this can pass to created objects.
  • Dict[str, Any] data: Response parsed from JSON inference result body.
def data(self) -> List[numpy.ndarray]:

Returns the inference result data.

def model(self) -> Tuple[str, str]:

Returns the model this inference was generated by.

def time_elapsed(self) -> datetime.timedelta:

Returns the total time taken by the engine.

def timestamp(self) -> datetime.datetime:

Returns the time at which this inference occurred.

def input_data(self) -> Dict[str, Any]:

Returns the input data for this inference result.

def shadow_data(self) -> Optional[Dict[str, numpy.ndarray]]:

3.17 - wallaroo.logs

def fetch_plateau_logs(server: str, topic: str, limit: int = 100):
class LogEntry:

Wraps a single log entry.

This class is highly experimental, is unsupported/untested, and may change/disappear in the near future.

LogEntry(entry: Dict[str, Any])
class LogEntries(typing.List[wallaroo.logs.LogEntry]):

Wraps a list of log entries.

This class is highly experimental, is unsupported/untested, and may change/disappear in the near future.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort
class LogEntriesShadowDeploy(typing.List[wallaroo.logs.LogEntry]):

Wraps a list of log entries.

This class is highly experimental, is unsupported/untested, and may change/disappear in the near future.

LogEntriesShadowDeploy(logs: wallaroo.logs.LogEntries)
Inherited Members
builtins.list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.18 - wallaroo.model

class Model(wallaroo.object.Object):

Wraps a backend Model object.

Model( client: Optional[wallaroo.client.Client], data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
@staticmethod
def as_standalone(name: str, version: str, file_name: str) -> wallaroo.model.Model:

Creates a Model intended for use in generating standalone configurations

def id(self) -> int:
def uid(self) -> str:
def name(*args, **kwargs):
def version(*args, **kwargs):
def models_pk_id(*args, **kwargs):
def sha(*args, **kwargs):
def status(*args, **kwargs):
def file_name(*args, **kwargs):
def image_path(*args, **kwargs):
def last_update_time(*args, **kwargs):
inputs
outputs
def tags(*args, **kwargs):
def rehydrate_config(*args, **kwargs):
def configure( self, runtime: Optional[str] = None, tensor_fields: List[str] = None, filter_threshold: float = None, input_schema: Optional[pyarrow.lib.Schema] = None, output_schema: Optional[pyarrow.lib.Schema] = None, batch_config: Optional[str] = None) -> wallaroo.model.Model:
def logs( self, limit: int = 100, valid: Optional[bool] = None, arrow: Optional[bool] = False) -> Tuple[Any, Optional[str]]:
def deploy( self, pipeline_name: str, deployment_config: Optional[wallaroo.deployment_config.DeploymentConfig] = None) -> wallaroo.pipeline.Pipeline:

Convenience function to quickly deploy a Model. It will configure the model, create a pipeline with a single model step, deploy it, and return the pipeline.

Typically, the configure() method is used to configure a model prior to deploying it. However, if a default configuration is sufficient, this function can be used to quickly deploy with said default configuration.

The filename this Model was generated from needs to have a recognizable file extension so that the runtime can be inferred. Currently, this is:

  • .onnx -> ONNX runtime
Parameters
  • str pipeline_name: Name of the pipeline to create and deploy. Must be unique across all deployments. Names must be ASCII alphanumeric characters plus dash (-) only.
class ModelVersions(typing.List[wallaroo.model.Model]):

Wraps a list of Models for display in a display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.19 - wallaroo.model_config

class ModelConfig(wallaroo.object.Object):

Wraps a backend ModelConfig object.

ModelConfig( client: Optional[wallaroo.client.Client], data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
@staticmethod
def as_standalone( model: wallaroo.model.Model, filter_threshold: Optional[float] = None, runtime: Optional[str] = None, tensor_fields: Optional[List[str]] = None) -> wallaroo.model_config.ModelConfig:

Creates a ModelConfig intended for use in generating standalone configurations

inputs
outputs
def to_yaml(self):

Generates a yaml file for standalone engines

def to_k8s_yaml(self):
def id(self) -> int:
def filter_threshold(*args, **kwargs):
def model(*args, **kwargs):
def runtime(*args, **kwargs):
def tensor_fields(*args, **kwargs):

3.20 - wallaroo.ModelConversion

class ModelConversionInputType(enum.Enum):

An enumeration.

Float16 = <ModelConversionInputType.Float16: 'float16'>
Float32 = <ModelConversionInputType.Float32: 'float32'>
Float64 = <ModelConversionInputType.Float64: 'float64'>
UInt16 = <ModelConversionInputType.UInt16: 'uint16'>
UInt32 = <ModelConversionInputType.UInt32: 'uint32'>
UInt64 = <ModelConversionInputType.UInt64: 'uint64'>
Double = <ModelConversionInputType.Double: 'double'>
Inherited Members
enum.Enum
name
value
class ConvertKerasArguments(typing.NamedTuple):

ConvertKerasArguments(name, comment, input_type, dimensions)

ConvertKerasArguments( name: str, comment: Optional[str], input_type: wallaroo.ModelConversion.ModelConversionInputType, dimensions: List[Union[NoneType, int, float]])

Create new instance of ConvertKerasArguments(name, comment, input_type, dimensions)

name: str

Alias for field number 0

comment: Optional[str]

Alias for field number 1

input_type: wallaroo.ModelConversion.ModelConversionInputType

Alias for field number 2

dimensions: List[Union[NoneType, int, float]]

Alias for field number 3

def to_dict(self) -> Dict[str, Any]:
Inherited Members
builtins.tuple
index
count
class ConvertSKLearnArguments(typing.NamedTuple):

ConvertSKLearnArguments(name, number_of_columns, input_type, comment)

ConvertSKLearnArguments( name: str, number_of_columns: int, input_type: wallaroo.ModelConversion.ModelConversionInputType, comment: Optional[str])

Create new instance of ConvertSKLearnArguments(name, number_of_columns, input_type, comment)

name: str

Alias for field number 0

number_of_columns: int

Alias for field number 1

input_type: wallaroo.ModelConversion.ModelConversionInputType

Alias for field number 2

comment: Optional[str]

Alias for field number 3

def to_dict(self) -> Dict[str, Any]:
Inherited Members
builtins.tuple
index
count
class ConvertXGBoostArgs(typing.NamedTuple):

ConvertXGBoostArgs(name, number_of_columns, input_type, comment)

ConvertXGBoostArgs( name: str, number_of_columns: int, input_type: wallaroo.ModelConversion.ModelConversionInputType, comment: Optional[str])

Create new instance of ConvertXGBoostArgs(name, number_of_columns, input_type, comment)

name: str

Alias for field number 0

number_of_columns: int

Alias for field number 1

input_type: wallaroo.ModelConversion.ModelConversionInputType

Alias for field number 2

comment: Optional[str]

Alias for field number 3

def to_dict(self) -> Dict[str, Any]:
Inherited Members
builtins.tuple
index
count
class ModelConversionSource(enum.Enum):

An enumeration.

KERAS = <ModelConversionSource.KERAS: 'keras'>
XGBOOST = <ModelConversionSource.XGBOOST: 'xgboost'>
SKLEARN = <ModelConversionSource.SKLEARN: 'sklearn'>
Inherited Members
enum.Enum
name
value
class ModelConversionGenericException(builtins.Exception):

Common base class for all non-exit exceptions.

Inherited Members
builtins.Exception
Exception
builtins.BaseException
with_traceback
args
class ModelConversionFailure(builtins.Exception):

Common base class for all non-exit exceptions.

Inherited Members
builtins.Exception
Exception
builtins.BaseException
with_traceback
args
class ModelConversionUnsupportedType(builtins.Exception):

Common base class for all non-exit exceptions.

Inherited Members
builtins.Exception
Exception
builtins.BaseException
with_traceback
args
class ModelConversionSourceFileNotPresent(builtins.Exception):

Common base class for all non-exit exceptions.

Inherited Members
builtins.Exception
Exception
builtins.BaseException
with_traceback
args

3.21 - wallaroo.models

class Models(wallaroo.object.Object):

A Wallaroo Model object. Models may have multiple versions, accessed via .versions()

Models( client: Optional[wallaroo.client.Client], data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def id(self) -> int:
def name(*args, **kwargs):
def owner_id(*args, **kwargs):
def last_update_time(*args, **kwargs):
def created_at(*args, **kwargs):
def versions(*args, **kwargs):
class ModelsList(typing.List[wallaroo.models.Models]):

Wraps a list of Models for display in a display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.22 - wallaroo.notify

class Notification:
Notification()
def to_json(self):
class Email(Notification):
Email(to)
def to_json(self):
@classmethod
def from_json(cls, json):
def to_json(notifications):
def from_json(json):
class AlertConfiguration:
AlertConfiguration(name, expression, notifications)
@classmethod
def from_json(Cls, json):
def to_json(self):

3.23 - wallaroo.object

class DehydratedValue:

Represents a not-set sentinel value.

Attributes that are null in the database will be returned as None in Python, and we want them to be set as such, so None cannot be used as a sentinel value signaling that an optional attribute is not yet set. Objects of this class fill that role instead.

DehydratedValue()
def rehydrate(attr):

Decorator that rehydrates the named attribute if needed.

This should decorate getter calls for an attribute:

@rehydrate("_foo_attr")
def foo_attr(self):
    return self._foo_attr

This will cause the API object to "rehydrate" (perform a query to fetch and fill in all attributes from the database) if the named attribute is not set.
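
The lazy-loading pattern described above can be sketched as a decorator factory that checks for a sentinel before the getter runs. All names below (`NotSet`, `rehydrate_sketch`, `_fill`, `Thing`) are hypothetical stand-ins; the SDK's internals may differ.

```python
import functools


class NotSet:
    """Hypothetical sentinel playing the role of DehydratedValue."""


def rehydrate_sketch(attr):
    """Decorator: load the object before reading `attr` if it is still unset."""

    def decorator(getter):
        @functools.wraps(getter)
        def wrapper(self, *args, **kwargs):
            if isinstance(getattr(self, attr), NotSet):
                self._fill()  # hypothetical method that fetches all attributes
            return getter(self, *args, **kwargs)

        return wrapper

    return decorator


class Thing:
    def __init__(self):
        self._name = NotSet()
        self.fetches = 0

    def _fill(self):
        # Stands in for a backend query that fills in every attribute.
        self.fetches += 1
        self._name = "example"

    @rehydrate_sketch("_name")
    def name(self):
        return self._name


thing = Thing()
first = thing.name()
second = thing.name()
```

The second call finds the attribute already set, so no further fetch is made.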

def value_if_present( data: Dict[str, Any], path: str) -> Union[Any, wallaroo.object.DehydratedValue]:

Returns a value in a nested dictionary, or DehydratedValue.

Parameters
  • str path: Dot-delimited path within a nested dictionary; e.g. foo.bar.baz
Returns

The requested value inside the dictionary, or DehydratedValue if it doesn't exist.
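
A minimal sketch of a dot-delimited lookup that returns a sentinel rather than `None` when the path is absent. The names `Missing`, `MISSING`, and `value_at_path` are hypothetical; only the behavior described above is taken from the text.

```python
from typing import Any, Dict


class Missing:
    """Hypothetical sentinel playing the role of DehydratedValue."""


MISSING = Missing()


def value_at_path(data: Dict[str, Any], path: str) -> Any:
    """Walk a dot-delimited path through nested dicts."""
    current: Any = data
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return MISSING
        current = current[part]
    return current


nested = {"foo": {"bar": {"baz": None}}}
present_but_null = value_at_path(nested, "foo.bar.baz")
absent = value_at_path(nested, "foo.qux")
```

The two results differ, which is why `None` alone cannot serve as the not-set marker.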

class RequiredAttributeMissing(builtins.Exception):

Raised when an API object is initialized without a required attribute.

RequiredAttributeMissing(class_name: str, attribute_name: str)
Inherited Members
builtins.BaseException
with_traceback
args
class ModelUploadError(builtins.Exception):

Raised when a model file fails to upload.

ModelUploadError(e)
Inherited Members
builtins.BaseException
with_traceback
args
class ModelConversionError(builtins.Exception):

Raised when a model file fails to convert.

ModelConversionError(e)
Inherited Members
builtins.BaseException
with_traceback
args
class ModelConversionTimeoutError(builtins.Exception):

Raised when a model conversion takes longer than 10 minutes.

ModelConversionTimeoutError(e)
Inherited Members
builtins.BaseException
with_traceback
args
class EntityNotFoundError(builtins.Exception):

Raised when a query for a specific API object returns no results.

This is specifically for queries by unique identifiers that are expected to return exactly one result; queries that can return 0 to many results should return empty list instead of raising this exception.

EntityNotFoundError(entity_type: str, params: Dict[str, str])
Inherited Members
builtins.BaseException
with_traceback
args
class LimitError(builtins.Exception):

Raised when deployment fails.

LimitError(e)
Inherited Members
builtins.BaseException
with_traceback
args
class UserLimitError(builtins.Exception):

Raised when a community instance has hit the user limit

UserLimitError()
Inherited Members
builtins.BaseException
with_traceback
args
class DeploymentError(builtins.Exception):

Raised when deployment fails.

DeploymentError(e)
Inherited Members
builtins.BaseException
with_traceback
args
class InferenceError(builtins.Exception):

Raised when inference fails.

InferenceError(error: Dict[str, str])
Inherited Members
builtins.BaseException
with_traceback
args
class InvalidNameError(builtins.Exception):

Raised when an entity's name does not meet the expected criteria.

Parameters
  • str name: the name string that is invalid
  • str req: a string description of the requirement
InvalidNameError(name: str, req: str)
Inherited Members
builtins.BaseException
with_traceback
args
class CommunicationError(builtins.Exception):

Raised when some component cannot be contacted. There is a networking, configuration or installation problem.

CommunicationError(e)
Inherited Members
builtins.BaseException
with_traceback
args
class Object(abc.ABC):

Base class for all backend GraphQL API objects.

This class serves as a framework for API objects to be constructed based on a partially-complete JSON response, and to fill in their remaining members dynamically if needed.

Object( gql_client: Optional[gql.client.Client], data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.

3.24 - wallaroo.orchestration

class Orchestration(wallaroo.object.Object):

An Orchestration object that represents some user-defined code that has been packaged into a container and can be deployed.

Orchestration( client: wallaroo.client.Client, data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
@staticmethod
def list_orchestrations( client: wallaroo.client.Client) -> List[wallaroo.orchestration.Orchestration]:
@staticmethod
def upload( client: wallaroo.client.Client, name: Optional[str] = None, bytes_buffer: Optional[bytes] = None, path: Optional[str] = None, file_name: Optional[str] = None) -> wallaroo.orchestration.Orchestration:
def run_once( self, name: Optional[str], json_args: Dict[Any, Any] = {}, timeout: Optional[int] = None, debug: Optional[bool] = False):

Runs this Orchestration once.

Parameters
  • name str A descriptive identifier for this run.
  • json_args Dict[Any, Any] A JSON object containing deploy-specific arguments.
  • timeout Optional[int] A timeout in seconds. Any instance of this orchestration that is running for longer than this specified time will be automatically killed.
  • debug Optional[bool] Produce extra debugging output about the run
Returns

A metadata object associated with the deploy Task

def run_scheduled( self, name: str, schedule: str, json_args: Dict[Any, Any] = {}, timeout: Optional[int] = None, debug: Optional[bool] = False):

Runs this Orchestration on a cron schedule.

Parameters
  • name str A descriptive identifier for this run.
  • schedule str A cron-style scheduling string, e.g. "* * * * *" or "*/15 * * * *"
  • json_args Dict[Any, Any] A JSON object containing deploy-specific arguments.
  • timeout Optional[int] A timeout in seconds. Any single instance of this orchestration that is running for longer than this specified time will be automatically killed. Future runs will still be scheduled.
  • debug Optional[bool] Produce extra debugging output about the run
Returns

A metadata object associated with the deploy Task
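
A brief sketch of scheduling a run with the keyword arguments listed above. The helper `schedule_every_15_minutes`, the `FakeOrchestration` stand-in, the run name, and the argument values are hypothetical; the fake simply echoes what it receives so the example runs offline.

```python
def schedule_every_15_minutes(orchestration, run_name, args, timeout=600):
    """Schedule an orchestration using a standard five-field cron expression."""
    schedule = "*/15 * * * *"  # minute hour day-of-month month day-of-week
    return orchestration.run_scheduled(
        name=run_name,
        schedule=schedule,
        json_args=args,
        timeout=timeout,
    )


class FakeOrchestration:
    """Hypothetical stand-in for an uploaded Orchestration."""

    def run_scheduled(self, name, schedule, json_args=None, timeout=None, debug=False):
        # Echo the arguments back instead of creating a task.
        return {"name": name, "schedule": schedule,
                "json_args": json_args or {}, "timeout": timeout}


task = schedule_every_15_minutes(FakeOrchestration(), "nightly-check", {"limit": 10})
```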

def id(*args, **kwargs):
def name(*args, **kwargs):
def file_name(*args, **kwargs):
def sha(*args, **kwargs):
def status(*args, **kwargs):
def created_at(*args, **kwargs):
def updated_at(*args, **kwargs):
def workspace_id(*args, **kwargs):
def task(*args, **kwargs):
def list_tasks(self):
class OrchestrationList(typing.List[wallaroo.orchestration.Orchestration]):

Wraps a list of orchestrations for display in a display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort
class OrchestrationUploadFailed(builtins.Exception):

Raised when uploading an Orchestration fails due to a backend issue.

OrchestrationUploadFailed(e)
Inherited Members
builtins.BaseException
with_traceback
args
class OrchestrationMissingFile(builtins.Exception):

Raised when uploading an Orchestration without providing a file-like object.

OrchestrationMissingFile()
Inherited Members
builtins.BaseException
with_traceback
args
class OrchestrationDeployOneshotFailed(builtins.Exception):

Raised when deploying an Orchestration fails due to a backend issue.

OrchestrationDeployOneshotFailed(e)
Inherited Members
builtins.BaseException
with_traceback
args

3.25 - wallaroo.pipeline

def update_timestamp(f):
class Pipeline(wallaroo.object.Object):

A pipeline is an execution context for models. Pipelines contain Steps, which are often Models. Pipelines can be deployed or un-deployed.

Pipeline(client: Optional[wallaroo.client.Client], data: Dict[str, Any])

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def id(self) -> int:
def owner_id(*args, **kwargs):
def create_time(*args, **kwargs):
def last_update_time(*args, **kwargs):
def name(*args, **kwargs):
def variants(*args, **kwargs):
def tags(*args, **kwargs):
def get_pipeline_configuration(self, version: Optional[str] = None) -> Dict[str, Any]:

Get a pipeline configuration for a specific version.

Parameters
  • version: str Version of the pipeline.
Returns

Dict[str, Any] Pipeline configuration.

def logs( self, limit: Optional[int] = None, start_datetime: Optional[datetime.datetime] = None, end_datetime: Optional[datetime.datetime] = None, valid: Optional[bool] = None, dataset: Optional[List[str]] = None, dataset_exclude: Optional[List[str]] = None, dataset_separator: Optional[str] = None, arrow: Optional[bool] = False) -> Union[wallaroo.logs.LogEntries, pyarrow.lib.Table, pandas.core.frame.DataFrame]:

Get inference logs for this pipeline.

Parameters
  • limit: Optional[int]: Maximum number of logs to return.
  • start_datetime: Optional[datetime.datetime]: Start time for logs.
  • end_datetime: Optional[datetime.datetime]: End time for logs.
  • valid: Optional[bool]: If set to False, will include logs for failed inferences.
  • dataset: Optional[List[str]] By default this is set to ["*"], which returns ["time", "in", "out", "check_failures"]. Other available options: ["metadata"].
  • dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
  • dataset_separator: Optional[Union[Sequence[str], str]] If set to ".", return dataset will be flattened.
  • arrow: Optional[bool] If set to True, returns logs as an Arrow Table. Otherwise, returns a Pandas DataFrame.
Returns

Union[LogEntries, pa.Table, pd.DataFrame]
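
As a rough usage sketch, the keyword arguments listed above can be combined to request a bounded window of logs. The helper name `fetch_recent_logs`, the 24-hour default, and the use of UTC timestamps are illustrative assumptions, not part of the SDK; `pipeline` is assumed to be an existing `wallaroo.pipeline.Pipeline`, and only parameters from the `logs` signature are passed.

```python
import datetime


def fetch_recent_logs(pipeline, hours=24, limit=100):
    """Hypothetical helper: request Arrow-format logs for the last `hours` hours."""
    # Assumption: timezone-aware UTC datetimes are acceptable to the backend.
    end = datetime.datetime.now(datetime.timezone.utc)
    start = end - datetime.timedelta(hours=hours)
    # Only keyword arguments documented in the logs() signature are used.
    return pipeline.logs(
        limit=limit,
        start_datetime=start,
        end_datetime=end,
        arrow=True,
    )
```

With `arrow=True` the documented return type is an Arrow Table; dropping that argument would return a Pandas DataFrame instead.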

def export_logs( self, directory: Optional[str] = None, file_prefix: Optional[str] = None, data_size_limit: Optional[str] = None, limit: Optional[int] = None, start_datetime: Optional[datetime.datetime] = None, end_datetime: Optional[datetime.datetime] = None, valid: Optional[bool] = None, dataset: Optional[List[str]] = None, dataset_exclude: Optional[List[str]] = None, dataset_separator: Optional[str] = None, arrow: Optional[bool] = False) -> None:

Export logs to a user-provided local file.

Parameters
  • directory: Optional[str] Logs will be exported to a file in the given directory. By default, logs will be exported to a new "logs" subdirectory in the current working directory.
  • file_prefix: Optional[str] Prefix to name the exported file. By default, the file_prefix will be set to the pipeline name.
  • data_size_limit: Optional[str] The maximum size of the exported data in bytes. Size includes all files within the provided directory. By default, the data_size_limit will be set to 100MB.
  • limit: Optional[int] The maximum number of logs to return.
  • start_datetime: Optional[datetime.datetime] The start time to filter logs by.
  • end_datetime: Optional[datetime.datetime] The end time to filter logs by.
  • valid: Optional[bool] If set to False, will return logs for failed inferences.
  • dataset: Optional[List[str]] By default this is set to ["*"], which returns ["time", "in", "out", "check_failures"]. Other available options: ["metadata"].
  • dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
  • dataset_separator: Optional[Union[Sequence[str], str]] If set to ".", return dataset will be flattened.
  • arrow: Optional[bool] If set to True, return logs as an Arrow Table. Else, returns Pandas DataFrame.
Returns

None

def logs_shadow_deploy(self):
def url(self) -> str:

Returns the inference URL for this pipeline.

def deploy( self, pipeline_name: Optional[str] = None, deployment_config: Optional[wallaroo.deployment_config.DeploymentConfig] = None) -> wallaroo.pipeline.Pipeline:

Deploy pipeline. pipeline_name is optional if deploy was called previously. When specified, pipeline_name must contain only ASCII alphanumeric characters and dashes (-).
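
The naming rule above can be checked locally before calling `deploy`. The following is a minimal sketch, assuming the rule maps to the character class `[A-Za-z0-9-]`; the function name `is_valid_pipeline_name` is hypothetical, and the SDK may apply additional constraints (such as length limits) that are not stated here.

```python
import re

# Assumption: "ASCII alphanumeric characters and dashes" means exactly this class.
_PIPELINE_NAME_RE = re.compile(r"[A-Za-z0-9-]+")


def is_valid_pipeline_name(name: str) -> bool:
    """Return True if `name` contains only ASCII letters, digits, and dashes."""
    # fullmatch requires the whole string to match, so an empty name is rejected.
    return _PIPELINE_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_pipeline_name("fraud-detector-2")` is true, while a name containing an underscore or a non-ASCII letter is rejected.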

def definition(self) -> str:

Get the current definition of the pipeline as a string

def get_topic_name(self) -> str:
def undeploy(self) -> wallaroo.pipeline.Pipeline:
def infer(self, *args, **kwargs):

Returns an inference result on this deployment, given a tensor.

Parameters
  • tensor: Union[Dict[str, Any], pd.DataFrame, pa.Table] Inference data. Should be a dictionary. Future improvement: will be a pandas dataframe or arrow table.
  • timeout: Optional[Union[int, float]] Inference requests time out once the given number of seconds is exceeded. Defaults to 15 seconds.
  • dataset: Optional[List[str]] By default this is set to ["*"], which returns ["time", "in", "out", "check_failures"]. Other available options: ["metadata"].
  • dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
  • dataset_separator: Optional[Union[Sequence[str], str]] If set to ".", return dataset will be flattened.
Returns

InferenceResult in dictionary, dataframe or arrow format.
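
A small usage sketch of the parameters above follows. The wrapper name `run_inference` and the input field name `"tensor"` are placeholders chosen for illustration; a real model defines its own input fields. Passing the data positionally and `timeout` as a keyword is also an assumption, since the documented signature is `infer(self, *args, **kwargs)`.

```python
def run_inference(pipeline, features, timeout_secs=30):
    """Hypothetical wrapper around pipeline.infer() with an explicit timeout."""
    # "tensor" is a placeholder input name, not a field required by the SDK.
    payload = {"tensor": [features]}
    # timeout overrides the documented 15 second default.
    return pipeline.infer(payload, timeout=timeout_secs)
```

Raising the timeout this way may be useful for large batches or slow models, where the default could be exceeded.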

def infer_from_file(self, *args, **kwargs):

Returns an inference result on this deployment, given tensors in a file.

def status(self) -> Dict[str, Any]:

Status of pipeline

def steps(self) -> List[wallaroo.pipeline_config.Step]:

Returns a list of the steps of a pipeline. Not exactly a shim

def model_configs(self) -> List[wallaroo.model_config.ModelConfig]:

Returns a list of the model configs of a pipeline. Not exactly a shim

def remove_step(self, index: int) -> wallaroo.pipeline.Pipeline:

Remove a step at a given index

def add_model_step(self, model: wallaroo.model.Model) -> wallaroo.pipeline.Pipeline:

Perform inference with a single model.

def replace_with_model_step( self, index: int, model: wallaroo.model.Model) -> wallaroo.pipeline.Pipeline:

Replaces the step at the given index with a model step

def add_multi_model_step( self, models: Iterable[wallaroo.model.Model]) -> wallaroo.pipeline.Pipeline:

Perform inference on the same input data for any number of models.

def replace_with_multi_model_step( self, index: int, models: Iterable[wallaroo.model.Model]) -> wallaroo.pipeline.Pipeline:

Replaces the step at the index with a multi model step

def add_audit(self, slice) -> wallaroo.pipeline.Pipeline:

Run audit logging on a specified slice of model outputs.

The slice must be in python-like format. start:, start:end, and :end are supported.
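
To make the three supported forms concrete, here is a sketch of how such a string could be parsed into a start and an optional end. This is not the SDK's parser; the function name `parse_audit_slice` is hypothetical, and treating a missing start as 0 is an assumption borrowed from ordinary Python slicing.

```python
from typing import Optional, Tuple


def parse_audit_slice(audit_slice: str) -> Tuple[int, Optional[int]]:
    """Parse 'start:', 'start:end', or ':end' into a (start, end) pair."""
    start_text, sep, end_text = audit_slice.partition(":")
    if not sep:
        raise ValueError(f"expected a ':' in slice {audit_slice!r}")
    if not start_text and not end_text:
        raise ValueError("a bare ':' is not one of the supported forms")
    # Assumption: a missing start means 0, as in ordinary Python slicing.
    start = int(start_text) if start_text else 0
    # A missing end means "through the last output".
    end = int(end_text) if end_text else None
    return start, end
```

The resulting pair can be applied to a list of outputs with `outputs[slice(*parse_audit_slice("1:3"))]`, which selects the outputs at indexes 1 and 2.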

def replace_with_audit(self, index: int, audit_slice: str) -> wallaroo.pipeline.Pipeline:

Replaces the step at the index with an audit step

def add_select(self, index: int) -> wallaroo.pipeline.Pipeline:

Select only the model output with the given index from an array of outputs.

def replace_with_select(self, step_index: int, select_index: int) -> wallaroo.pipeline.Pipeline:

Replaces the step at the index with a select step

def add_key_split( self, default: wallaroo.model.Model, meta_key: str, options: Dict[str, wallaroo.model.Model]) -> wallaroo.pipeline.Pipeline:

Split traffic based on the value at a given meta_key in the input data, routing to the appropriate model.

If the resulting value is a key in options, the corresponding model is used. Otherwise, the default model is used for inference.
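
The routing rule described above can be sketched in plain Python. This illustrates the decision only and is not the SDK's implementation: models are represented by their names, and the function name `route_by_meta_key` and the example key `card_type` are hypothetical.

```python
from typing import Any, Dict


def route_by_meta_key(row: Dict[str, Any], meta_key: str,
                      options: Dict[str, str], default: str) -> str:
    """Return the name of the model that would handle `row` under a key split."""
    value = row.get(meta_key)
    # Use the matching option when the value is a known key.
    if isinstance(value, str) and value in options:
        return options[value]
    # Otherwise (key absent or value not listed), fall back to the default model.
    return default
```

For instance, with `options={"gold": "gold-model"}` and `default="standard-model"`, a row with `card_type` equal to `"gold"` goes to `gold-model`, and every other row goes to `standard-model`.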

def replace_with_key_split( self, index: int, default: wallaroo.model.Model, meta_key: str, options: Dict[str, wallaroo.model.Model]) -> wallaroo.pipeline.Pipeline:

Replace the step at the index with a key split step

def add_random_split( self, weighted: Iterable[Tuple[float, wallaroo.model.Model]], hash_key: Optional[str] = None) -> wallaroo.pipeline.Pipeline:

Routes inputs to a single model, randomly chosen from the list of weighted options.

Each model receives a share of inputs approximately proportional to the weight it is assigned. For example, with two models having weights 1 and 1, each will receive roughly equal amounts of inference inputs. If the weights were changed to 1 and 2, the models would receive roughly 33% and 67% respectively instead.

When choosing the model to use, a random number between 0.0 and 1.0 is generated. The weights are mapped onto that range, and the random number is then used to select the model. For example, in the two-model equal-weight case, a random value of 0.4 would route to the first model and 0.6 would route to the second.

To support consistent assignment to a model, a hash_key can be specified. The value at this key must be between 0.0 and 1.0 and, when present in the input data, is used instead of a random number for model selection.
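
The selection rule can be sketched as a cumulative-weight lookup. This is an illustration of the behaviour described above rather than the SDK's actual code: models are represented by names, the function name `pick_model` is hypothetical, and the `key` argument stands in for the value found at `hash_key` in the input data.

```python
import random
from typing import Optional, Sequence, Tuple


def pick_model(weighted: Sequence[Tuple[float, str]],
               key: Optional[float] = None) -> str:
    """Choose a model name by mapping cumulative weights onto the range 0.0 to 1.0."""
    total = sum(weight for weight, _ in weighted)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Without a caller-supplied key, fall back to a random draw.
    if key is None:
        key = random.random()
    cumulative = 0.0
    for weight, model in weighted:
        cumulative += weight / total
        if key < cumulative:
            return model
    # Guard against rounding when key is at or very near 1.0.
    return weighted[-1][1]
```

With weights 1 and 1 the boundary sits at 0.5, so a key of 0.4 selects the first model and 0.6 the second, matching the example in the text. Supplying the same key for the same entity always yields the same model, which is the purpose of `hash_key`.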

def replace_with_random_split( self, index: int, weighted: Iterable[Tuple[float, wallaroo.model.Model]], hash_key: Optional[str] = None) -> wallaroo.pipeline.Pipeline:

Replace the step at the index with a random split step

def add_shadow_deploy( self, champion: wallaroo.model.Model, challengers: Iterable[wallaroo.model.Model]) -> wallaroo.pipeline.Pipeline:

Create a "shadow deployment" experiment pipeline. The champion model and all challengers are run for each input. The result data for all models is logged, but the output of the champion is the only result returned.

This is particularly useful for "burn-in" testing a new model with real world data without displacing the currently proven model.

This is currently implemented as three steps: a multi model step, an audit step, and a select step. To remove or replace this step, you need to remove or replace all three. You can remove steps using pipeline.remove_step.
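
The three-step structure can be illustrated with a self-contained sketch in which models are plain Python callables. None of this is SDK code: the function name `shadow_infer`, the `(name, callable)` pairs, and the in-memory `audit_log` list are assumptions made to show the data flow.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[float], float]


def shadow_infer(value: float,
                 champion: Tuple[str, Model],
                 challengers: List[Tuple[str, Model]],
                 audit_log: List[Dict[str, float]]) -> float:
    """Run every model on `value`, log all results, return only the champion's."""
    champion_name, champion_model = champion
    # Multi model step: the champion and all challengers see the same input.
    results = {champion_name: champion_model(value)}
    for name, model in challengers:
        results[name] = model(value)
    # Audit step: record the output of every model.
    audit_log.append(results)
    # Select step: only the champion's output is returned to the caller.
    return results[champion_name]
```

Because the caller only ever receives the champion's result, challengers can be compared offline from the logged outputs without affecting production responses.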

def replace_with_shadow_deploy( self, index: int, champion: wallaroo.model.Model, challengers: Iterable[wallaroo.model.Model]) -> wallaroo.pipeline.Pipeline:

Replace a given step with a shadow deployment

def add_validation( self, name: str, validation: wallaroo.checks.Expression) -> wallaroo.pipeline.Pipeline:

Add a validation with the given name. All validations are run on all outputs, and all failures are logged.

def add_alert( self, name: str, alert: wallaroo.checks.Alert, notifications: List[wallaroo.notify.Notification]) -> wallaroo.pipeline.Pipeline:
def replace_with_alert( self, index: int, name: str, alert: wallaroo.checks.Alert, notifications: List[wallaroo.notify.Notification]) -> wallaroo.pipeline.Pipeline:

Replace the step at the given index with the specified alert

def clear(self) -> wallaroo.pipeline.Pipeline:

Remove all steps from the pipeline. This might be desirable if replacing models, for example.

def list_explainability_configs(self) -> List[wallaroo.explainability.ExplainabilityConfig]:

List the explainability configs we've created.

def get_explainability_config( self, expr: Union[str, wallaroo.explainability.ExplainabilityConfig]) -> wallaroo.explainability.ExplainabilityConfig:

Get the details of an explainability config.

def create_explainability_config(self, feature_names: Sequence[str], num_points=10):

Create a SHAP config to be used later for reference and ad hoc requests.

class Pipelines(typing.List[wallaroo.pipeline.Pipeline]):

Wraps a list of pipelines for display in a display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.26 - wallaroo.pipeline_config

class ValidDataType(builtins.str, enum.Enum):

An enumeration.

f32 = <ValidDataType.f32: 'f32'>
f64 = <ValidDataType.f64: 'f64'>
i8 = <ValidDataType.i8: 'i8'>
u8 = <ValidDataType.u8: 'u8'>
i16 = <ValidDataType.i16: 'i16'>
u16 = <ValidDataType.u16: 'u16'>
i32 = <ValidDataType.i32: 'i32'>
u32 = <ValidDataType.u32: 'u32'>
i64 = <ValidDataType.i64: 'i64'>
u64 = <ValidDataType.u64: 'u64'>
Inherited Members
enum.Enum
name
value
builtins.str
encode
replace
split
rsplit
join
capitalize
casefold
title
center
count
expandtabs
find
partition
index
ljust
lower
lstrip
rfind
rindex
rjust
rstrip
rpartition
splitlines
strip
swapcase
translate
upper
startswith
endswith
removeprefix
removesuffix
isascii
islower
isupper
istitle
isspace
isdecimal
isdigit
isnumeric
isalpha
isalnum
isidentifier
isprintable
zfill
format
format_map
maketrans
class ModelConfigsForStep:
ModelConfigsForStep(model_configs: List[wallaroo.model_config.ModelConfig])
class ModelForStep:
ModelForStep(name, version, sha)
def to_json(self):
@classmethod
def from_json(cls, json_dict: Dict[str, str]):
@classmethod
def from_model(cls, model: wallaroo.model.Model):
class ModelWeight:
ModelWeight(weight: float, model: wallaroo.pipeline_config.ModelForStep)
def to_json(self):
@classmethod
def from_json(cls, json_dict: Dict[str, Any]):
@classmethod
def from_tuple(cls, tup: Tuple[float, wallaroo.model.Model]):
class RowToModel:
RowToModel(row_index: int, model: wallaroo.pipeline_config.ModelForStep)
def to_json(self):
@classmethod
def from_json(cls, json_dict: Dict[str, Any]):
class Step:
Step()
def to_json(self):
def is_inference_step(self):
@staticmethod
def from_json(json_dict: Dict):
class Average(Step):
Average()
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
Inherited Members
class AuditResults(Step):
AuditResults(start: int, end: Optional[int] = None)
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
Inherited Members
class Check(Step):
Check(tree: str)
def to_json(self):
@classmethod
def from_name_and_validation( cls, name: str, validation: wallaroo.checks.Expression, gauges: List[str] = []):
@staticmethod
def from_json(json_dict: Dict):
Inherited Members
class ColumnsSelect(Step):
ColumnsSelect(columns: List[int])
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
Inherited Members
class ColumnsToRows(Step):
ColumnsToRows()
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
Inherited Members
class InputDataToType(Step):
InputDataToType(data_type: wallaroo.pipeline_config.ValidDataType)
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
Inherited Members
class ModelInference(Step):
ModelInference(models: List[wallaroo.pipeline_config.ModelForStep])
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
def is_inference_step(self):
class RowsToModels(Step):
RowsToModels(rows_to_models: List[wallaroo.pipeline_config.RowToModel])
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
def is_inference_step(self):
class Nth(Step):
Nth(index: int)
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
Inherited Members
class MultiOut(Step):
MultiOut()
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
Inherited Members
class MetaValueSplit(Step):
MetaValueSplit( split_key: str, control: wallaroo.pipeline_config.ModelForStep, routes: Dict[str, wallaroo.pipeline_config.ModelForStep])
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
def is_inference_step(self):
class RandomSplit(Step):
RandomSplit( weights: List[wallaroo.pipeline_config.ModelWeight], hash_key: Optional[str] = None)
def to_json(self):
@staticmethod
def from_json(json_dict: Dict):
def is_inference_step(self):
class PipelineConfig:
PipelineConfig( pipeline_name: str, steps: Iterable[wallaroo.pipeline_config.Step], alert_configurations: Iterable[wallaroo.notify.AlertConfiguration])
@classmethod
def from_json(Klass, json):
def to_json(self):
def to_yaml(self):
class PipelineConfigBuilder:
PipelineConfigBuilder( client: Optional[wallaroo.client.Client], pipeline_name: str, standalone=False)
@staticmethod
def as_standalone(pipeline_name: str):
def upload(self) -> wallaroo.pipeline.Pipeline:
def remove_step(self, index: int):

Remove a step at a given index

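def add_model_step(self, model: wallaroo.model.Model) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Perform inference with a single model.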

def replace_with_model_step( self, index: int, model: wallaroo.model.Model) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Replaces the step at the given index with a model step

def add_multi_model_step( self, models: Iterable[wallaroo.model.Model]) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Perform inference on the same input data for any number of models.

def replace_with_multi_model_step( self, index: int, models: Iterable[wallaroo.model.Model]) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Replaces the step at the index with a multi model step

def add_audit(self, audit_slice: str) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Run audit logging on a specified slice of model outputs.

The slice must be in python-like format. start:, start:end, and :end are supported.

def replace_with_audit( self, index: int, audit_slice: str) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Replaces the step at the index with an audit step

def add_select(self, index: int) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Select only the model output with the given index from an array of outputs.

def add_multi_out(self):
def replace_with_select( self, step_index: int, select_index: int) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Replaces the step at the index with a select step

def add_key_split( self, default: wallaroo.model.Model, meta_key: str, options: Dict[str, wallaroo.model.Model]) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Split traffic based on the value at a given meta_key in the input data, routing to the appropriate model.

If the resulting value is a key in options, the corresponding model is used. Otherwise, the default model is used for inference.

def replace_with_key_split( self, index: int, default: wallaroo.model.Model, meta_key: str, options: Dict[str, wallaroo.model.Model]) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Replace the step at the index with a key split step

def add_random_split( self, weighted: Iterable[Tuple[float, wallaroo.model.Model]], hash_key: Optional[str] = None) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Routes inputs to a single model, randomly chosen from the list of weighted options.

Each model receives a share of inputs approximately proportional to the weight it is assigned. For example, with two models having weights 1 and 1, each will receive roughly equal amounts of inference inputs. If the weights were changed to 1 and 2, the models would receive roughly 33% and 67% respectively instead.

When choosing the model to use, a random number between 0.0 and 1.0 is generated. The weights are mapped onto that range, and the random number is then used to select the model. For example, in the two-model equal-weight case, a random value of 0.4 would route to the first model and 0.6 would route to the second.

To support consistent assignment to a model, a hash_key can be specified. The value at this key must be between 0.0 and 1.0 and, when present in the input data, is used instead of a random number for model selection.

def replace_with_random_split( self, index: int, weighted: Iterable[Tuple[float, wallaroo.model.Model]], hash_key: Optional[str] = None) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Replace the step at the index with a random split step

def add_shadow_deploy( self, champion: wallaroo.model.Model, challengers: Iterable[wallaroo.model.Model]) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Create a "shadow deployment" experiment pipeline. The champion model and all challengers are run for each input. The result data for all models is logged, but the output of the champion is the only result returned.

This is particularly useful for "burn-in" testing a new model with real world data without displacing the currently proven model.

This is currently implemented as three steps: a multi model step, an audit step, and a select step. To remove or replace this step, you need to remove or replace all three. You can remove steps using pipeline.remove_step.

def replace_with_shadow_deploy( self, index: int, champion: wallaroo.model.Model, challengers: Iterable[wallaroo.model.Model]) -> wallaroo.pipeline_config.PipelineConfigBuilder:
def add_validation( self, name: str, validation: wallaroo.checks.Expression) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Add a validation with the given name. All validations are run on all outputs, and all failures are logged.

def replace_with_validation( self, index: int, name: str, validation: wallaroo.checks.Expression) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Replace the step at the given index with a validation step

def add_alert( self, name: str, alert: wallaroo.checks.Alert, notifications: List[wallaroo.notify.Notification]) -> wallaroo.pipeline_config.PipelineConfigBuilder:
def replace_with_alert( self, index, name: str, alert: wallaroo.checks.Alert, notifications: List[wallaroo.notify.Notification]) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Replace the step at the given index with the specified alert

def clear(self) -> wallaroo.pipeline_config.PipelineConfigBuilder:

Remove all steps from the pipeline. This might be desirable if replacing models, for example.

3.27 - wallaroo.pipeline_variant

class PipelineVariant(wallaroo.object.Object):

Base class for all backend GraphQL API objects.

This class serves as a framework for API objects to be constructed based on a partially-complete JSON response, and to fill in their remaining members dynamically if needed.

PipelineVariant(client: Optional[wallaroo.client.Client], data: Dict[str, Any])

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def id(self) -> int:
def create_time(*args, **kwargs):
def last_update_time(*args, **kwargs):
def name(*args, **kwargs):
def definition(*args, **kwargs):
def pipeline(*args, **kwargs):
def deployments(*args, **kwargs):
def model_configs(*args, **kwargs):
def deploy( self, deployment_name: str, model_configs: List[wallaroo.model_config.ModelConfig], config: Optional[wallaroo.deployment_config.DeploymentConfig] = None) -> wallaroo.deployment.Deployment:

Deploys this PipelineVariant.

Parameters
  • str deployment_name: Name of the new Deployment. Must be unique across all deployments.
  • List[ModelConfig] model_configs: List of the configured models to use. These must be the same ModelConfigs used when creating the Pipeline.
  • Optional[DeploymentConfig] config: Deployment configuration to use.
Returns

A Deployment object for the resulting deployment.

class PipelineVariants(typing.List[wallaroo.pipeline_variant.PipelineVariant]):

Wraps a list of pipeline variants for display in a display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.28 - wallaroo.tag

class Tag(wallaroo.object.Object):

Tags that may be attached to models and pipelines.

Tag( client: Optional[wallaroo.client.Client], data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def id(self) -> int:
def tag(*args, **kwargs):
def model_tags(*args, **kwargs):
def pipeline_tags(*args, **kwargs):
def list_models(self) -> List[wallaroo.model.Model]:

Lists the models this tag is on.

def add_to_model(self, model_id: int):
def remove_from_model(self, model_id: int):
def list_pipelines(self) -> List[wallaroo.pipeline.Pipeline]:

Lists the pipelines this tag is on.

def add_to_pipeline(self, pipeline_id: int):
def remove_from_pipeline(self, pipeline_id: int):
class Tags(typing.List[wallaroo.tag.Tag]):

Wraps a list of tags for display in a display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.29 - wallaroo.task

class Task(wallaroo.object.Object):

Base class for all backend GraphQL API objects.

This class serves as a framework for API objects to be constructed based on a partially-complete JSON response, and to fill in their remaining members dynamically if needed.

Task( client: wallaroo.client.Client, data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def kill(self):

Kill this Task.

@staticmethod
def list_tasks( client: wallaroo.client.Client, workspace_id: int, killed: bool = False):
@staticmethod
def get_task_by_id(client: wallaroo.client.Client, task_id: str):
def id(*args, **kwargs):
def name(*args, **kwargs):
def workspace_id(*args, **kwargs):
def input_data(*args, **kwargs):
def status(*args, **kwargs):
def task_type(*args, **kwargs):
def created_at(*args, **kwargs):
def updated_at(*args, **kwargs):
def last_runs( self, limit: Optional[int] = None, status: Optional[str] = None) -> wallaroo.task_run.TaskRunList:

Return the runs associated with this task.

Parameters
  • limit int The number of runs to return
  • status str Return only runs with the matching status. One of "success", "failure", "running", "all"
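
As a brief usage sketch, the `status` filter can be validated locally before the call is made. The helper name `runs_with_status` and the local check are assumptions added for illustration; `task` is assumed to be an existing `wallaroo.task.Task`, and the allowed values are taken from the parameter description above.

```python
# Status values listed in the last_runs parameter description.
ALLOWED_STATUSES = {"success", "failure", "running", "all"}


def runs_with_status(task, status="all", limit=10):
    """Hypothetical helper: validate `status` locally, then call task.last_runs()."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(
            f"status must be one of {sorted(ALLOWED_STATUSES)}, got {status!r}"
        )
    return task.last_runs(limit=limit, status=status)
```

Checking the value up front surfaces a typo such as `"failed"` immediately instead of relying on the backend's response.
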
class TaskList(typing.List[wallaroo.task.Task]):

Wraps a list of tasks for display in a display-aware environment like Jupyter.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.30 - wallaroo.task_run

class TaskRun(wallaroo.object.Object):

Base class for all backend GraphQL API objects.

This class serves as a framework for API objects to be constructed based on a partially-complete JSON response, and to fill in their remaining members dynamically if needed.

TaskRun( client: wallaroo.client.Client, data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def logs(self, limit: Optional[int] = None) -> wallaroo.task_run.TaskRunLogs:

Returns the application logs for the given Task Run. These may be print output or Exception logs produced while running your Orchestration.

Note: The default retention policy for Orchestration logs is 30 days.

Parameters
  • limit int Limits the number of lines of logs returned. Starts from the most recent logs.
Returns

A List of str. Each str represents a newline-separated entry from the Task's log.

class TaskRunList(typing.List[wallaroo.task_run.TaskRun]):

Built-in mutable sequence.

If no argument is given, the constructor creates a new empty list. The argument must be an iterable if specified.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort
class TaskRunLogs(typing.List[str]):

This is a list of logs associated with a Task run.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort

3.31 - wallaroo.user

class User:

A platform User.

User(client, data: Dict[str, Any], standalone=False)
def id(self) -> str:
def email(self) -> str:
def username(self) -> str:
def enabled(self) -> bool:
@staticmethod
def list_users( auth, api_endpoint: str = 'http://api-lb:8080', auth_endpoint: str = 'http://api-lb:8080'):
@staticmethod
def invite_user( email, password, auth, api_endpoint: str = 'http://api-lb:8080', auth_endpoint: str = 'http://api-lb:8080'):

3.32 - wallaroo.workspace

class Workspace(wallaroo.object.Object):

Workspace provides a user and visibility context for access to models and pipelines.

Workspace( client: Optional[wallaroo.client.Client], data: Dict[str, Any], standalone=False)

Base constructor.

Each object requires:

  • a GraphQL client - in order to fill its missing members dynamically
  • an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
def to_json(self):
def id(self) -> int:
def name(*args, **kwargs):
def archived(*args, **kwargs):
def created_at(*args, **kwargs):
def created_by(*args, **kwargs):
def models(*args, **kwargs):

Returns a List of Model objects that have been created in this workspace.

def pipelines(*args, **kwargs):
def users(*args, **kwargs):
def add_user(self, user_email: str) -> wallaroo.workspace.Workspace:

Add a user to workspace as participant

def add_owner(self, user_email: str) -> wallaroo.workspace.Workspace:

Add a user to workspace as owner

def remove_user(self, user_email: str):
def list_connections(self) -> wallaroo.connection.ConnectionList:

Return a list of Connections available in this Workspace.

Returns

List of Connections in this Workspace.

def add_connection(self, name: str):

Adds an existing Connection with the given name to this Workspace.

def remove_connection(self, name: str):

Removes a Connection with the given name from this Workspace.

class Workspaces(typing.List[wallaroo.workspace.Workspace]):

Wraps a list of workspaces for display.

Inherited Members
builtins.list
list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort